Towards A Model-Driven Design Tool for Big Data Architectures

Michele Guerriero, Saeed Tajfar, Damian A. Tamburri, Elisabetta Di Nitto
Politecnico di Milano, Dipartimento di Elettronica, Informazione e Bioingegneria
Via Golgi 42, 20133 Milano, Italy
[michele.guerriero,damianandrew.tamburri,elisabetta.dinitto]@polimi.it

ABSTRACT
Big Data technologies are rapidly becoming a key enabler for modern industries. However, the entry costs inherent to “going Big” are considerable, ranging from the learning curve to renting or buying infrastructure. A key component of these costs is the time spent learning about, and designing with, the many big data frameworks on the market (e.g., Spark, Storm, Hadoop MapReduce). To reduce said costs while decreasing time-to-market, we advocate the use of Model-Driven Engineering (MDE), i.e., software engineering by means of models and their automated manipulation. This paper outlines a tool architecture to support MDE for big data applications and illustrates it with a case study.

CCS Concepts
• Software Engineering → Big Data; Model-Driven Development

Keywords
Big Data Applications Design; MDE; meta-models; model transformation; architecture framework; design tool

1. INTRODUCTION

Big Data technologies have rapidly achieved widespread adoption for many reasons, e.g., thanks to the versatility with which they foster innovative products through the direct analysis of various user contents (e.g., tweets, blog posts, likes, pictures). However, designing and developing Big Data applications is still a considerable problem since: (a) it involves many side costs, such as the learning curve for the desired technological frameworks; (b) it requires balancing infrastructural and corporate governance costs [17] against (non-trivial) development and deployment costs; (c) it most likely entails additional costs for the various trial-and-error experiments needed to match the desired performance. We argue that a relevant part of said costs can be saved by tackling the design, development and deployment of Big Data applications with Model-Driven Engineering (MDE) [15].

MDE essentially advocates the use of models as a means to quickly and flexibly develop code. MDE takes place mainly by means of meta-modelling (i.e., devising a “model of a model”) and model transformation (i.e., manipulating models in an automated way). By means of MDE, a considerable part of the effort required to design, develop and deploy Big Data applications would be reduced to modelling Big Data jobs using ad-hoc meta-models (e.g., for the technological frameworks to be considered) and manipulating them with model transformations; subsequently, by refining these standard models, designers can elaborate a complete deployable application image based on the desired technological specifications (e.g., Hadoop/MapReduce, Storm, Spark) by means of model-to-text transformation (e.g., technologies such as Xtext, http://www.eclipse.org/Xtext/, or JET, https://eclipse.org/modeling/m2t/?project=jet).

This paper makes an essential step towards providing model-driven design facilities for Big Data by offering three novel contributions: (a) an architecture in support of model-driven Big Data design (see Sec. 2); (b) the introduction and discussion of a series of meta-models supporting said design activity (see Sec. 3); (c) an evaluation of the above using an illustrative case study (see Sec. 4).

2. MODEL-DRIVEN BIG DATA DESIGN: AN ARCHITECTURE

This section elaborates on the architectural details behind our proposed solution. The preliminary research idea and conceptual foundations for said idea stem from the state of the art [5]. More in particular, in [5] Casale et al. articulate a preliminary exploration of a model-driven architecture solution as part of the formulation of the DICE EU project (http://www.dice-h2020.eu/), for the purpose of elaborating and further refining a data-intensive application by means of three abstraction layers, consistently with the Model-Driven Architecture framework [7]. Quoting from [5]: “Models in DICE shall be formulated at three levels of abstraction, called DPIM (DICE Platform Independent Model), DTSM (DICE Technology Specific Model), DDSM (DICE Deployment Specific Model): (a) the DPIM model corresponds to the OMG MDA PIM layer and describes the behaviour of the application as a graph that expresses the dependencies between computations and data; (b) a DTSM consists of a refinement of the DPIM and includes technology-specific concepts and frameworks, both for computational logic and data storage, but that are still independent of the deployment; (c) the DDSM is a specialisation of the DTSM model which adds information about the dimensions of the technology in use and other application deployment characteristics”.

Figure 1: An architecture for model-driven big data design.

In this paper we offer the architectural elements (see Fig. 1) that elaborate and support the layers and relations defined above. The architecture envisions four components.

DPIM Component: this component allows the assembly of a big data application through established component-based design practices [16]. Essentially, by means of this component, the designer states the structural view of the application by specifying which components, or nodes, the application is composed of. For example, a simple batch-processing application counting words from a web source is composed of at least three nodes, i.e., a source node where the data is generated, a computation node where the data is processed, and a storage node where the data is stored after processing. The DPIM Component shall only allow designers to express requirements and/or required properties for the big data architecture to be designed. Therefore, this component shall contain the most generic series of elements possible, so as to encompass all possible big data domains.

DTSM Component: this component allows evaluating multiple possible architecture layouts and alternatives, e.g., resulting from the possible combinations of similar big data technologies such as Spark and Storm, which equally process streaming data. Essentially, by means of this component, the designer refines the structural view previously sketched using the DPIM component with technology-specific details. As part of this component, our architecture provides a library of technological packages (meta-models, in our case) from which the designer may choose whatever technology fits her peculiar scenario or requirements. Said technological packages shall be defined to contain:

(1) Application Logic: the technological package shall allow direct instantiation of all constructs (with the appropriate relations) needed to develop and implement an application with the chosen framework. For example, in the case of the Hadoop Map Reduce framework, the related package shall contain all constructs necessary to structure a map-reduce application, that is, Mappers, Reducers, Joiners, etc.;

(2) Framework Logic: the technological package shall also contain the framework-related classes (e.g., by inheritance or association) that need to be imported in the application logic so that the application can work by means of the framework. For example, a job that needs to run via the Hadoop Map Reduce framework needs to import classes in the “[...]hadoop.mapred.*” packages, among others;

(3) Framework Configuration: the technological package shall also contain the framework configuration classes and envision the possibility to override key framework defaults that may modify or affect desired Quality of Service properties. For example, within the Hadoop Map Reduce framework, the necessary configurations may need to be provided to fine-tune the behaviour of the Job- and Task-Trackers.

DDSM Component: this component allows producing a deployable map for the implementable view of the big data application design realised and refined within the DTSM component. Said map essentially relies on core constructs that are common to any cloud-based application (of which big data applications are a subset). Similarly to the related DTSM Component, the DDSM component comes with ad-hoc deployment configuration packages which are specific to every technology in the DTSM component library. Designers that are satisfied with their DTSM model may use this component to evaluate several deployment alternatives, e.g., matching ad-hoc infrastructure needs. For example, the MapReduce framework typically consists of a single master JobTracker and one slave TaskTracker per cluster node.

Besides configuring the details needed to actually deploy the MapReduce job, designers may change the default operational configurations behind the MapReduce framework. Also, the designer and the infrastructure engineers may define how additional Hadoop Map Reduce components, such as YARN, may actively affect the deployment. Said deployment-specific technological packages shall be defined to contain: (1) Deployed Component Dependencies: for example, the deployment of Spark applications also requires a dependency on deploying Hadoop Map Reduce; (2) Deployable Component Recipes: for example, configurable Chef recipes need to be provided for deploying both Spark and Map Reduce; (3) Deployable Topology Configuration: for example, key details concerning the infrastructure that needs to be provided, expressed using a standard notation such as the “Topology and Orchestration Specification for Cloud Applications” (TOSCA) standard [4].

Model Transformations: given the above components, the following model transformations shall be supported, in order to automate the entire modelling process.

DPIM2DTSM::Refinement: this model transformation covers the scenario in which designers are satisfied with the contents and details specified through the DPIM component and need to refine those contents with technological aspects. At this stage, the Refinement transformation shall scan the DPIM model and shall attempt to find a technological solution that matches the specified requirements. For example, if the DPIM model contains computation nodes that perform batch processing using Hadoop MR, the model transformation shall instantiate the related technological package in the output DTSM model.

DPIM2DTSM::Tradeoff: this model transformation, shown in Figure 2, covers the scenario in which designers have specified the DPIM architectural application structure and are evaluating framework trade-offs (e.g., as part of the architecture trade-off analysis method [6]) across multiple possible configurations and overlaps. At this stage, the Tradeoff transformation shall scan the DPIM model and compute all possible consistent combinations of technologies currently available in the DTSM library (the technological packages) that match the requirements specified in the DPIM model. For example, if the DPIM model contains a computation node for stream processing, a computation node for batch processing and a storage node, the Tradeoff transformation shall suggest, assuming that the DTSM package library contains Cassandra, HadoopMR, Storm, Spark Streaming and HDFS, the following possible model configurations:

{ComputeNode1=HadoopMR, ComputeNode2=Storm, StorageNode1=HDFS};
{ComputeNode1=HadoopMR, ComputeNode2=Storm, StorageNode1=Cassandra};
{ComputeNode1=HadoopMR, ComputeNode2=Spark, StorageNode1=HDFS};
{ComputeNode1=HadoopMR, ComputeNode2=Spark, StorageNode1=Cassandra}.

Figure 2: Tradeoff transformation scenario.
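To make the combinatorial core of the Tradeoff transformation concrete, the following is a minimal, illustrative Java sketch (the actual transformation is expressed in ATL over EMF models, see Section 3). It enumerates every consistent technology assignment for the example above; the node names, node kinds and library contents are assumptions taken from the running example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative core of the Tradeoff transformation: enumerate every consistent
// assignment of a library technology to each DPIM node, given the node's kind.
public class TradeoffSketch {

  // Cartesian product of the candidate technologies, node by node.
  static List<Map<String, String>> enumerate(Map<String, String> nodeKinds,
                                             Map<String, List<String>> library) {
    List<Map<String, String>> configs = new ArrayList<>();
    configs.add(new LinkedHashMap<>());
    for (Map.Entry<String, String> node : nodeKinds.entrySet()) {
      List<Map<String, String>> extended = new ArrayList<>();
      for (Map<String, String> partial : configs) {
        for (String tech : library.get(node.getValue())) {
          Map<String, String> next = new LinkedHashMap<>(partial);
          next.put(node.getKey(), tech);
          extended.add(next);
        }
      }
      configs = extended;
    }
    return configs;
  }

  public static void main(String[] args) {
    // Which technologies in the DTSM package library can realise each kind of node.
    Map<String, List<String>> library = Map.of(
        "batch", List.of("HadoopMR"),
        "stream", List.of("Storm", "Spark"),
        "storage", List.of("HDFS", "Cassandra"));

    // The DPIM nodes of the running example and their required processing kind.
    Map<String, String> nodeKinds = new LinkedHashMap<>();
    nodeKinds.put("ComputeNode1", "batch");
    nodeKinds.put("ComputeNode2", "stream");
    nodeKinds.put("StorageNode1", "storage");

    // Prints the four configurations listed above (1 x 2 x 2 alternatives).
    enumerate(nodeKinds, library).forEach(System.out::println);
  }
}
```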

DTSM2DTSM::In-Place-Refactoring: this model transformation covers the scenario in which designers generate and edit a DTSM model, but then re-edit the DPIM model and re-generate (using the DPIM2DTSM::Refinement transformation) a new DTSM that lacks the previous edits. As a consequence, the consistency between the two layers is broken and work may be wasted. The In-Place-Refactoring transformation makes sure that the new DTSM contains the edits that have already been applied: it calculates a Δ-model to capture the differences between the two input models and merges them into the output DTSM model. This scenario is captured in Fig. 4.
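The following is a small, illustrative Java sketch of the Δ-and-merge idea behind In-Place-Refactoring. It abstracts models as element-to-value maps; the actual transformation operates on EMF models, so the representation and names here are assumptions for illustration only.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative core of In-Place-Refactoring: compute the delta between the
// previously edited DTSM and the freshly regenerated one, then merge so that
// the designer's manual edits survive regeneration.
public class InPlaceRefactoringSketch {

  static Map<String, String> merge(Map<String, String> regenerated,
                                   Map<String, String> previousGenerated,
                                   Map<String, String> previousEdited) {
    Map<String, String> result = new LinkedHashMap<>(regenerated);
    // Delta = what the designer changed or added on top of the old generated model.
    for (Map.Entry<String, String> e : previousEdited.entrySet()) {
      String old = previousGenerated.get(e.getKey());
      boolean manuallyEdited = old == null || !old.equals(e.getValue());
      if (manuallyEdited) {
        result.put(e.getKey(), e.getValue()); // re-apply the designer's edit
      }
    }
    return result;
  }
}
```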

Figure 4: In-place refactoring transformation scenario.

DTSM2DDSM::Roll-out: this model transformation covers the scenario in which the designers are satisfied with their DTSM objectives and need deployment assistance. This transformation shall create a deployable TOSCA blueprint by matching the frameworks and technologies used in the DTSM model with the actual deployment of their runtime platforms. Figure 3 shows that the refinement and roll-out transformations can be used in sequence to get from a high-level architectural model down to the corresponding deployable blueprint in TOSCA.

Figure 3: Refinement and roll-out transformation scenario.

As a first attempt to provide an implementation of the above architecture, we operated as follows.

First, we studied a selected number of key big data technologies and, through manual coding and card-sorting (i.e., manual conceptual clustering), we isolated the component types and relations that are needed to structure a big data application within the DPIM specification. We encoded said types and relations in the DPIM meta-model (see Sect. 3).

Second, for each technological framework to be included in the DTSM component we: (a) installed and tested the framework; (b) elicited through reverse-engineering a model of said framework and of a series of applications using the framework; (c) analysed these reverse-engineered models to prepare: (1) a single core meta-model containing the elements we found and observed across all the technological frameworks under investigation - this became the core meta-model behind the DTSM component; (2) a single meta-model package per technological framework under investigation, matching the specifications provided previously for the “DTSM Component” architecture element (see the mid component in Fig. 1).

Third, we inherited and adapted the MODACloudsML [2] deployment technology to reflect a deployment blueprint based on TOSCA [4], i.e., the “Topology and Orchestration Specification for Cloud Applications”. To do so, we reverse-engineered TOSCA (an image of our reverse-engineered and commented TOSCA meta-model is available online: http://tinyurl.com/jcr9zwx) and applied a systematic mapping procedure to encode in a series of ad-hoc meta-models all constructs needed for: (a) TOSCA node types specific to big data technologies (e.g., HDFS NameNode and DataNode, Hadoop JobTracker, Storm Nimbus, etc.); (b) big data node deployment semantics and relationships coded in the technological packages (e.g., node dependencies and configuration parameters); (c) the node configuration needs specific to both (a) and (b), using ad-hoc Chef (https://www.chef.io/chef/) recipes. More details on the preliminary results of this specification process are contained in Section 3.

3. MODEL-DRIVEN BIG DATA DESIGN: NECESSARY META-MODELS

This section shows our preliminary results, which consist of the meta-model definitions for each of the abstraction layers previously introduced. All results reported in this section are freely available as open-source, Apache-2-licensed material at https://github.com/dice-project/DICE-Models/. Figure 5 shows an extract of the DPIM meta-model, and Table 1 describes each DPIM meta-element. At this layer the high-level components of an application are specified, along with the associated Quality of Service (QoS) requirements and input/output data components. Components at the DPIM layer are essentially black boxes performing a specific type of processing or generating/storing data. Moreover, for each component a desired target technology (if known at this layer) can optionally be specified. From these premises, a DPIM model can be automatically transformed into an equivalent DTSM core model. In fact, the DTSM component is composed of a core meta-model and multiple technological packages. The DTSM core meta-model looks conceptually quite similar to the DPIM meta-model (SimpleElement, CompositeElement, ComputeNode, etc.), but its main responsibility is to act as a bridge between the DPIM model and its further refinements by means of the DTSM component.

Table 1: DPIM entities description.

DIAElement: an element of a Data Intensive Application. It can be a SimpleElement or a CompositeElement.
SimpleElement: an atomic element of the Data Intensive Application, i.e., one that directly performs actions on the application data. It can be a SourceNode, a ComputationNode or a StorageNode.
CompositeElement: an element of the Data Intensive Application that has a high-level logical role but is implemented by a composition of other elements.
ComputationNode: an element of the application whose goal is to perform some computation on the application data.
SourceNode: an element of the Data Intensive Application acting as a data source, i.e., generating application data.
StorageNode: an element of the application whose goal is to store the application data (e.g., a database or a file system).
QoSRequiredProperty: the QoS constraints associated with an element of the Data Intensive Application.
DataSpecification: data characteristics such as the model and the format.
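To give a concrete flavour of what these meta-models look like in the chosen tooling (Eclipse EMF, see below), the following is a minimal, illustrative Java sketch that builds a small fragment of a DPIM-like meta-model with the Ecore runtime API. The class and feature names follow Table 1, but the fragment is an assumption for illustration only; the actual meta-models are maintained as Ecore files in the repository linked above.

```java
import org.eclipse.emf.ecore.EAttribute;
import org.eclipse.emf.ecore.EClass;
import org.eclipse.emf.ecore.EPackage;
import org.eclipse.emf.ecore.EReference;
import org.eclipse.emf.ecore.EcoreFactory;
import org.eclipse.emf.ecore.EcorePackage;

public class DpimMetamodelSketch {

  public static EPackage buildDpimPackage() {
    EcoreFactory f = EcoreFactory.eINSTANCE;

    // Root package of the DPIM meta-model fragment.
    EPackage dpim = f.createEPackage();
    dpim.setName("dpim");
    dpim.setNsPrefix("dpim");
    dpim.setNsURI("http://example.org/dpim"); // illustrative namespace URI

    // Abstract DIAElement with a name attribute.
    EClass diaElement = f.createEClass();
    diaElement.setName("DIAElement");
    diaElement.setAbstract(true);
    EAttribute name = f.createEAttribute();
    name.setName("name");
    name.setEType(EcorePackage.Literals.ESTRING);
    diaElement.getEStructuralFeatures().add(name);

    // ComputationNode specialises DIAElement.
    EClass computationNode = f.createEClass();
    computationNode.setName("ComputationNode");
    computationNode.getESuperTypes().add(diaElement);

    // QoSRequiredProperty attached to a DIAElement via a containment reference.
    EClass qosProperty = f.createEClass();
    qosProperty.setName("QoSRequiredProperty");
    EReference qosRef = f.createEReference();
    qosRef.setName("qosRequirements");
    qosRef.setEType(qosProperty);
    qosRef.setContainment(true);
    qosRef.setUpperBound(-1); // 0..* requirements per element
    diaElement.getEStructuralFeatures().add(qosRef);

    dpim.getEClassifiers().add(diaElement);
    dpim.getEClassifiers().add(computationNode);
    dpim.getEClassifiers().add(qosProperty);
    return dpim;
  }
}
```

In the actual tool chain such definitions are edited as .ecore files rather than built programmatically; the sketch only shows how DPIM concepts map onto Ecore classes, attributes and containment references.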

On the one hand, the DTSM core allows capturing the possible analyses that can be run on a given DPIM model, or on the same DPIM model expanded with technological assumptions. For example, designers might want to verify the throughput of a DPIM, or the throughput of a DPIM that uses both Storm and Spark together. As another example, consider the scenario in which a component of the DPIM model has an associated QoS requirement, such as safety or high performance, which has to be respected: the DPIM2DTSM::Refinement transformation will instantiate all the Property elements required to quantify such a requirement, so that it can then be verified at design time, for example by means of a suitable model-checking tool. Figure 6 reports an extract of the DTSM core meta-model which focuses on the abstraction behind this last scenario: what was a QoSRequirement at the DPIM level is now expressed by means of a Property (which can be Simple or Composed) and can ultimately be quantified with a specific Metric.

Figure 5: An extract of the DPIM meta-model.

Figure 6: An extract of the DTSM meta-model.
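As a plain-Java illustration of the Property/Metric abstraction sketched in Figure 6 (the type names follow the figure, everything else is an assumption for illustration):

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative rendering of the DTSM-core abstraction: a QoS requirement becomes
// a Property, possibly composed of sub-properties, and is quantified by a Metric.
abstract class Property {
  final String name;
  Property(String name) { this.name = name; }
}

class Metric {
  final String name;
  final String unit; // e.g., "requests/s" for throughput
  Metric(String name, String unit) { this.name = name; this.unit = unit; }
}

class SimpleProperty extends Property {
  final Metric metric;     // the metric that quantifies this property
  final double threshold;  // the required bound, e.g., a minimum throughput
  SimpleProperty(String name, Metric metric, double threshold) {
    super(name);
    this.metric = metric;
    this.threshold = threshold;
  }
}

class ComposedProperty extends Property {
  final List<Property> parts = new ArrayList<>(); // composite of sub-properties
  ComposedProperty(String name) { super(name); }
}
```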

On the other hand, the DTSM core also acts as a means to capture all those technological aspects that can be inferred from DPIM models. For example, if a designer has specified a technological requirement on a DIAElement (i.e., it has to be implemented with HadoopMR), the main entity (and the connected classes) from the corresponding technological package is automatically instantiated in the generated DTSM core model. The developer can then detail the generated DTSM model, describing the logic of each component by leveraging the available technological packages. Coming back to the previous example, if the root entity of the HadoopMR package has been instantiated for a given component according to the DPIM-level technological requirements, the developer can start from it and further elaborate the logic of such a component using the concepts defined in the HadoopMR package itself.

Figure 7: An extract of the Hadoop Map Reduce DTSM meta-model.

Figure 7 shows an extract of the HadoopMR meta-model, reporting the main elements that are part of the framework. The HadoopMRDIAMain represents the main entry point of a Hadoop MR application. A Hadoop MR job typically consists of a JobConf, which contains the whole configuration of the job itself, such as the Mapper used, which splits the input into multiple KeyValuePairs, and the Reducer, which applies the reduce function over sets of KeyValuePairs. The JobConf also allows specifying the InputFormat and OutputFormat, representing the specification of the input dataset and of the output results. The application also has a JobClient, which is responsible for tracking and managing the job execution.
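To relate these meta-model entities to actual framework code, the following is a minimal, illustrative driver for a Hadoop MR job using the classic org.apache.hadoop.mapred API: the driver class plays the role of HadoopMRDIAMain, while JobConf, InputFormat/OutputFormat and JobClient appear as in the meta-model. The mapper and reducer class names refer to the Word Count classes sketched in Section 4 and are otherwise assumptions.

```java
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.TextInputFormat;
import org.apache.hadoop.mapred.TextOutputFormat;

// Driver ("main") of a Hadoop MR application: the counterpart of HadoopMRDIAMain.
public class HadoopMRDriver {
  public static void main(String[] args) throws Exception {
    // JobConf holds the whole job configuration (framework configuration).
    JobConf conf = new JobConf(HadoopMRDriver.class);
    conf.setJobName("word-count");

    // Application logic: which Mapper and Reducer implement the job
    // (illustrative classes, see the Word Count sketch in Section 4).
    conf.setMapperClass(WordCountMapper.class);
    conf.setReducerClass(WordCountReducer.class);
    conf.setOutputKeyClass(Text.class);
    conf.setOutputValueClass(IntWritable.class);

    // InputFormat / OutputFormat: how the input dataset and results are (de)serialised.
    conf.setInputFormat(TextInputFormat.class);
    conf.setOutputFormat(TextOutputFormat.class);
    FileInputFormat.setInputPaths(conf, new Path(args[0]));
    FileOutputFormat.setOutputPath(conf, new Path(args[1]));

    // Framework defaults can be overridden here to tune QoS-relevant behaviour.
    conf.setNumReduceTasks(4);

    // JobClient submits the job and tracks its execution.
    JobClient.runJob(conf);
  }
}
```

Overriding defaults on the JobConf (here, the number of reduce tasks) is exactly the kind of framework configuration that the technological packages of Section 2 expose for QoS tuning.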

In the last modelling phase, the DDSM layer allows designers to define the deployment characteristics of the application. Since Big Data applications are a subset of the so-called Cloud-based application family, the DDSM meta-model has to provide all the concepts required to specify the deployment of a Cloud application. To this aim we decided to adopt the MODACloudsML [8] domain modelling language from the MODAClouds project [2], which provides a language for describing the deployment of multi-cloud, component-based applications. MODACloudsML offers abstractions to capture the runtime architecture and the infrastructure needs of most, if not all, cloud applications. We inherit those concepts as a basis to be configured with big-data-specific constructs. As previously specified, these specific constructs are realised in the form of technological packages themselves. We also mapped these specific constructs to the related node types and templates within our own TOSCA meta-model. This allows us to build complete TOSCA blueprints using a model-to-model transformation (from MODACloudsML to TOSCA) and, subsequently, to serialise the resulting model into a well-formed TOSCA-YAML blueprint via Xtext. All the above meta-models are implemented in Eclipse EMF [14], while the model transformations are expressed in Eclipse ATL (https://eclipse.org/atl/).

4. THE WORD-COUNT CASE-STUDY

In this section we illustrate our framework with a simple case study, i.e., the Word Count application implemented with Hadoop Map Reduce. The Word Count application is the equivalent of “Hello World” in the context of parallel processing, since it can be easily implemented according to the MapReduce paradigm. It basically takes a textual dataset as input and counts word frequencies within the dataset. In order to do this in parallel, the dataset is divided into multiple splits, which are then processed in parallel by a cluster of instances of the Map function. An execution of the Map function takes an input split, further divides it into tokens separated by whitespace (words) and emits key-value pairs of type ⟨word, 1⟩. The output key-value pairs are grouped by key and each group is scheduled to an instance of the Reduce function, which simply sums up the values of each pair and stores the final result, i.e., a word count.
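For concreteness, a minimal, illustrative implementation of the Word Count map and reduce functions with the classic org.apache.hadoop.mapred API follows. The paper models these elements (Mapper, Reducer, KeyValuePairs) at the DTSM level rather than prescribing code, so the class names are assumptions; they are the ones referenced by the driver sketch in Section 3.

```java
import java.io.IOException;
import java.util.Iterator;
import java.util.StringTokenizer;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;

// Map function: splits each input line into words and emits <word, 1> pairs.
class WordCountMapper extends MapReduceBase
    implements Mapper<LongWritable, Text, Text, IntWritable> {

  private static final IntWritable ONE = new IntWritable(1);
  private final Text word = new Text();

  public void map(LongWritable key, Text value,
                  OutputCollector<Text, IntWritable> output, Reporter reporter)
      throws IOException {
    StringTokenizer tokens = new StringTokenizer(value.toString());
    while (tokens.hasMoreTokens()) {
      word.set(tokens.nextToken());
      output.collect(word, ONE); // emit <word, 1>
    }
  }
}

// Reduce function: sums the counts of each word group.
class WordCountReducer extends MapReduceBase
    implements Reducer<Text, IntWritable, Text, IntWritable> {

  public void reduce(Text key, Iterator<IntWritable> values,
                     OutputCollector<Text, IntWritable> output, Reporter reporter)
      throws IOException {
    int sum = 0;
    while (values.hasNext()) {
      sum += values.next().get();
    }
    output.collect(key, new IntWritable(sum)); // final <word, count>
  }
}
```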

Given the architectural simplicity of the Word Count application, its DPIM model results in a very simple component diagram made of three elements: a SourceNode providing the input dataset, a ComputationNode performing the calculation, and a StorageNode storing the results. Figure 8 shows the resulting DPIM model, assuming that the source node has been constrained to be an HDFS instance, while HadoopMR has been selected as the target technology implementing the computational task.

Figure 8: The WordCount Application DPIM model.

Let us now make the simple assumption that HDFS is the only supported storage solution. In this case the DPIM2DTSM::Refinement transformation selects HDFS for the StorageNode as well, leading to the model in Fig. 9, in which objects instantiated from the DTSM core meta-model are marked in green, while objects from the HadoopMR DTSM package are marked in red.

Figure 9: The DTSM core model of the Word Count application.

At this point the developer can instantiate objects from the HadoopMR package in order to model technology-specific aspects. As we said, the Word Count application can be realised with a single Mapper and a single Reducer. The Mapper retrieves each split of the input dataset by means of an InputReader and uses a RecordReader to generate KeyValuePairs. The Reducer has a reduce() function and uses a RecordWriter to store the output results. Moreover, the designer can detail the framework logic and its configuration. Figure 10 reports the output of this refinement phase (covered manually, in this case) on the DTSM model.

Figure 10: The final DTSM model of the Word Count application.

For the sake of space, we do not show the final modelling step that leads to the DDSM model. Simply put, the Roll-out transformation will automatically generate a DDSM model representing the basic deployment of the modelled application. This model contains the required infrastructural resources, along with instructions on how to install, configure and run each employed system, such as HDFS and HadoopMR in the case of the Word Count case study. The deployment engineer can elaborate the generated model by adding or modifying deployment information, such as the number of instances in a cluster or their size. Once the editing phase ends, the TOSCA blueprint for the Word Count application can be automatically generated.

Let us now assume that at a given point the software developer changes a technological requirement at the DPIM layer; for example, she selects Apache Cassandra as the technology implementing the StorageNode. By following the path of automatic transformations shown in Figure 4, the developer can obtain a new DTSM which keeps all the elaboration shown in Figure 10, but in which the StorageNode is now a Cassandra instance. Of course Cassandra is set up with a default configuration applied by the transformations, but the DTSM is overall still consistent, and with minimal tuning the switch to Cassandra is completed. This change in the technological requirements, which would take a lot of time to implement in a traditional way, can be carried out in a few minutes and a few clicks.

5. RELATED WORK

The model-driven approach is well known and has been widely exploited in many areas of software engineering, from web and mobile application development, as in the case of WebRatio (http://www.webratio.com/site/content/it/home), to, more recently, the development of multi-cloud applications within the MODAClouds EU FP7 project [2]. Recently, some works have been proposed in the literature which attempt to take advantage of Model-Driven Development (MDD) concepts and technologies in the context of Big Data applications. In [12] an interesting approach is proposed with the aim of allowing MDD of Hadoop MR applications through a meta-model which can also be used for automatic code-scaffold generation. Similarly, StormGen [13] aims to provide a DSL for defining Storm topologies. In that work the common Ecore format is used for building the meta-model, and Xtext is exploited for generating the grammar of the language. StormGen also provides automatic code generation using the Xtend language, a dialect of Java. Also in this case the user has to specify the desired implementation for each element of the topology (Bolts and Spouts), since the main focus is on designing the topology. The authors plan to couple the textual DSL with a graphical one based on the Eclipse GMF (Graphical Modeling Framework).

In our work, we aim at tackling the co-existence of multiple big data technologies at the same time, e.g., Hadoop Map Reduce for batch processing on one side of the application and Storm for stream processing on the other. In so doing, our research solution aims at supporting the phase in which designers decide how to combine multiple technologies in view of their cooperation for solving ad-hoc Big Data problems. To this day, none of the technologies present in the state of the art is able to offer such support. Also, none of said technologies offers mechanisms to extend the internal framework logic to welcome technological additions. Given the rapid, almost furious expansion of the big data architecture landscape, this is a serious shortcoming that our model-driven solution plans to tackle head-on.

6. CONCLUSIONS AND FUTURE WORK

Big data applications are rapidly gaining attention for their immense potential. To encourage and ease big data adoption, this paper offers a preliminary but essential step in supporting the model-driven design of big data applications. We presented an architecture in support of said model-driven design activity. We also provided the foundational means to realise said architecture, i.e., by presenting and elaborating on a series of technological meta-models already implemented in EMF - these may be used as a starting point to further elaborate tool support for model-driven big data design. Finally, we evaluated our proposed architecture and meta-models using a simple case study.

We conclude by observing the great potential of model-driven engineering in assisting the design and development of big data applications. In the future we plan to further elaborate on our meta-models and implement them in practice using model-driven tool-support development technology such as Eclipse/GEF, the GME framework or related technologies. We also plan to evaluate extensively which technologies should be an essential part of our model-driven tool support, for example by considering the inclusion of additional frameworks such as Flink or, from a data-specific perspective, technologies such as HBase or HDFS2.

7. ACKNOWLEDGMENT

Some of the authors' work is partially supported by the European Commission grant no. 610531 (FP7 ICT Call 10), SeaClouds, and the European Commission grant no. 644869 (H2020 Call 1), DICE.

8. REFERENCES

[1] Formal Concept Analysis, Foundations and Applications, volume 3626. Springer, 2005.
[2] D. Ardagna, E. Di Nitto, G. Casale, D. Petcu, et al. MODAClouds: A model-driven approach for the design and execution of applications on multiple clouds. In Proceedings of the 4th International Workshop on Modeling in Software Engineering (MiSE '12), pages 50–56, Piscataway, NJ, USA, 2012. IEEE Press.
[3] A. Behm, V. R. Borkar, M. J. Carey, R. Grover, C. Li, N. Onose, R. Vernica, A. Deutsch, Y. Papakonstantinou, and V. J. Tsotras. ASTERIX: towards a scalable, semistructured data platform for evolving-world models. Distributed and Parallel Databases, 29(3):185–216, 2011.
[4] T. Binz, G. Breiter, F. Leymann, and T. Spatzier. Portable cloud services using TOSCA. IEEE Internet Computing, 16(3):80–85, 2012.
[5] G. Casale, D. Ardagna, M. Artac, F. Barbier, E. Di Nitto, et al. DICE: Quality-driven development of data-intensive cloud applications. In Proceedings of the 7th International Workshop on Modelling in Software Engineering (MiSE), May 2015.
[6] P. Clements, R. Kazman, and M. Klein. Evaluating Software Architectures: Methods and Case Studies. Addison-Wesley, 2001.
[7] D. Frankel. Model Driven Architecture: Applying MDA to Enterprise Computing. John Wiley & Sons, Inc., New York, NY, USA, 2002.
[8] G. E. Gonçalves, P. T. Endo, M. A. Santos, et al. CloudML: An integrated language for resource, service and request description for D-Clouds. In C. Lambrinoudakis, P. Rizomiliotis, and T. W. Wlodarczyk, editors, CloudCom, pages 399–406. IEEE Computer Society, 2011.
[9] N. Marz and J. Warren. Big Data: Principles and Best Practices of Scalable Realtime Data Systems. Manning Publications, 2015.
[10] Object Management Group. Common Warehouse Metamodel. http://www.omg.org/cwm, seen in April 2008.
[11] R. Qasha, J. Cala, and P. Watson. Towards automated workflow deployment in the cloud using TOSCA. In Cloud Computing (CLOUD), 2015 IEEE 8th International Conference on, pages 1037–1040, June 2015.
[12] A. Rajbhoj, V. Kulkarni, and N. Bellarykar. Early experience with model-driven development of MapReduce based big data application. In Software Engineering Conference (APSEC), 2014 21st Asia-Pacific, volume 1, pages 94–97, Dec 2014.
[13] S. Santurkar, A. Arora, and K. Chandrasekaran. StormGen - a domain specific language to create ad-hoc Storm topologies. In Computer Science and Information Systems (FedCSIS), 2014 Federated Conference on, pages 1621–1628, Sept 2014.
[14] M. Scheidgen and A. Zubow. Map/Reduce on EMF models. In Proceedings of the 1st International Workshop on Model-Driven Engineering for High Performance and Cloud Computing (MDHPCL '12), pages 7:1–7:5, New York, NY, USA, 2012. ACM.
[15] D. C. Schmidt. Model-Driven Engineering. IEEE Computer, 39(2):25–31, Feb. 2006.
[16] C. Szyperski. Component Software. Addison-Wesley, Reading, MA, 1998.
[17] P. P. Tallon. Corporate governance of big data: Perspectives on value, risk, and cost. IEEE Computer, 46(6):32–38, 2013.
