A workflow modeling system for capturing data provenance




Computers and Chemical Engineering 67 (2014) 148–158


Girish S. Joglekar*, Arun Giridhar, Gintaras Reklaitis
School of Chemical Engineering, Purdue University, West Lafayette, IN 47907, USA

Article history: Received 6 January 2014; Received in revised form 1 April 2014; Accepted 9 April 2014; Available online 18 April 2014

Keywords: Workflow modeling; Knowledge management; Recipe management; Data provenance

Abstract

A workflow is an abstraction of the steps associated with the underlying work process and is typically modeled as a directed graph. The workflow concept, under its various manifestations, has been used to model applications in diverse areas, including project planning, manufacturing, scientific experiments, execution of computer software, and publishing. While the Open Provenance Model Core Specification has laid the foundation for defining the key concepts in a workflow, a simplified, high level graphical representation of a workflow that is widely applicable is not available. In this paper we describe a novel general framework for building workflows and implementing the associated actions, which will facilitate understanding of work processes across multiple disciplines. Most work processes are organized hierarchically with well defined control and management responsibilities, and this framework will facilitate integration and coordination of activities across the associated domains. Additionally, it will act as a template for referring to the associated metadata as well as a reference for accessing the instance data from archives of completed workflow cases. When a specific case is in progress, a finite state machine guides the user through the steps and provides up to date information about the current state. We describe the main building blocks in the framework and their functionalities, and illustrate the integration of an experimental workflow with a scientific workflow. © 2014 Elsevier Ltd. All rights reserved.

1. Introduction

In the past decade there has been an unprecedented growth in the amount of information being generated and managed, both within the technical domain and in the broader business setting. Due to the ever accelerating advances in the applications of information technologies to all aspects of running an enterprise, the problems of managing data, creating knowledge, making better decisions and doing so in real time will continue to become more complex and acute. In order to derive maximum benefit, it is imperative that the information be captured and stored in a structured way, and that it be machine accessible. Only then can such vast amounts of information be processed by computer assisted methods to provide effective and timely decision support. The proverb 'prevention is better than cure' applies fully to the state of information management. If the information is not stored in a structured, semantically rich fashion to begin with, then it becomes very expensive, and sometimes impossible, to retrieve the desired items of information later. This is evident, for example, from the plant data historians or the electronic lab notebooks that are in use today.

Even with today's very efficient search engines, brute force searches are highly inefficient and time consuming. The current solution for such situations is to write custom computer software for every special requirement, linking multiple information repositories with disparate data identifiers and creating specific search protocols to mine for data related to the issue at hand. Similarly, there are software companies that specialize in annotating, using natural language processing techniques, information and reports that were created using word processors or spreadsheets. Therefore, moving forward, in order to avoid such case-by-case solutions, it is important to ensure that all information is captured in a semantically rich format. Aside from defining the meaning of each data item that is stored in a repository, its metadata, it is also important to define the context of the information. The context principally defines the various steps executed in creating the information and the conditions associated with each step. The steps in essence constitute the workflow, alternatively called the provenance of the information. The interest in collecting provenance is growing because it is necessary for a variety of functions, such as checking the validity and quality of information, facilitating reproducibility, and supporting analysis and the creation of new knowledge.

2. Workflows


The concept of workflows has been in use in a wide variety of domains such as business processes, manufacturing, scientific research, computing and medicine (Schwartz, 2006). The most common technique for modeling a workflow is to create a high level graphical representation consisting of a directed graph with associated nodes and edges. The graph defines the sequence of, and interactions between, the various steps associated with a workflow. Simple directed graphs provide adequate expressive power for some domains, such as business processes and scientific computing, where the action(s) associated with a node influence only the node itself and its immediate neighbors. However, in most manufacturing situations, particularly in chemical manufacturing, an action pertaining to a node may require relationships with a set of nodes beyond the immediate neighbors. Moreover, unlike simple directed graphs where edges typically represent information exchange or signals, in manufacturing some edges may represent continuous transfer of material or exchange of a discrete quantity of material or some entity. Whenever material is exchanged, the control of execution of the associated steps is not necessarily the same as that represented by a directed edge, namely that the 'from node' controls the execution. As a result, additional descriptors are necessary to specifically identify the node that controls the execution of a series of steps. The Open Provenance Model (OPM) core specifications (Moreau et al., 2011), which are general and applicable to workflows, do not adequately address the special situations arising in manufacturing systems. The S88 standards (International Society for Measurement and Control, 1995) for batch processes define the concepts and terminology for manufacturing recipes and use a network description for batch recipes at several levels of specificity (general, site, master). The S88 representation of a recipe is thus the equivalent of a workflow. Under S88 at the process level, execution of the batch recipe is managed by means of a procedural control system. The workflow system described in this paper provides a similar mechanism for step by step execution of a workflow using an engine which is also applicable to non-manufacturing applications. For developing a knowledge framework for bioprocesses, a workflow based approach has been strongly recommended (Junker et al., 2011).

Fig. 2. An example of a scientific workflow.

2.1. Workflow types

There are four main categories of work procedures, or workflow types: business workflows, scientific workflows, experimental procedures and manufacturing recipes. Individually they represent different domains. A business workflow is mainly concerned with the modeling of business rules, policies, and project management, and therefore is often control- and activity-oriented. Typically a business workflow has one or more specific deliverables associated with it. Those deliverables could be concrete decisions, information that will support a decision, or publishable information that becomes part of a knowledge base. An example of a business workflow is shown in Fig. 1.

Fig. 1. Example of a business workflow for processing a material requirement request.

A comparison of the main approaches to modeling business processes is given in Borger (2012). A scientific workflow models the execution of computational or data manipulation steps in a scientific application (see Fig. 2). Typically the nodes alternate between data nodes and program execution nodes. The data nodes represent either the data input to or the data generated by a computational node. An experimental workflow models the steps executed while conducting an experiment at the laboratory, pilot plant or test-bed level. The main use of an experimental workflow is to record the conditions defining a given experimental run and the values of the observed variables. This information may include specific protocols and calibration procedures as well as on- and off-line analyses. Typically a set of experiments is performed based on a design of experiments defined to meet certain objectives. The selection of variables and their ranges is typically a result of a scientific workflow. A manufacturing workflow, or recipe, models the steps executed during the manufacture of a product in a manufacturing facility. The recipe typically defines the preferred values of operating variables and the preferred operating sequence in order to make a given product. Manufacturing recipes instruct the operators and/or provide the information for a plant-wide procedural control system. The recipe may lead to the production of discrete entities or bulk product in a batch or continuous mode. For certain classes of experiments, the manufacturing recipe and the experimental workflow may be identical at the conceptual level, the only difference arising in the scale of manufacture or the identity of the equipment used.


An example of a manufacturing recipe will be given in the following section.

2.2. Existing workflow modeling approaches

Several workflow management systems have been developed for scientific workflows; these include Taverna (Oinn et al., 2006), Kepler (Kepler Web Site, 2013), and Pegasus (Deelman et al., 2005). Typically, a scientific workflow is modeled as a directed acyclic graph, with nodes representing executable programs connected to nodes representing data. The data node(s) upstream of a program node represent data input to the program, while the data node(s) downstream of a program node represent data created by the program. An example of a scientific workflow for estimating an individualized dosage regimen for gabapentin (Laínez et al., 2011) is shown in Fig. 2. The user enters the data required by the two tools (the Bayesian parameter estimation and Dosage regimen individualization nodes), which are represented by the 'GUI Input' node. The entered data is written out in an xml file. Each tool interprets the xml file and uses the data relevant to itself. First, the 'Bayesian parameter estimation' tool is executed, which estimates the posterior distributions of the parameters of a first order one compartment model. This tool employs a Markov chain Monte Carlo (MCMC) approach for the Bayesian estimation of the posterior distribution. The 'Dosage regimen individualization' tool, which is executed next, uses the posterior distribution generated by the first tool to determine the optimal dosage regimen for an individual. The inputs are the plasma concentration–time data for an individual, the size of the parameter sample, the tuning parameter for the MCMC, an initial guess of the parameters, the confidence level for the estimation of the dose, and the preferred interval of administration. The outputs include the marginal probability distribution of the parameters, the dose range that achieves the selected confidence level, and the confidence region for the concentration given the individualized dose. These types of systems for managing scientific workflows are generally not well suited for other types of workflows, which are resource and procedure centric. The Smart Manufacturing Leadership Coalition (Davis et al., 2012) offers a platform for intelligent manufacturing that has the ability to orchestrate workflows that integrate information and decision making processes. The workflows are modeled as scientific workflows using the Kepler system (SMLC Workshop, 2013). A survey of workflow management tools used for massive data analysis performed in Grid computing environments is presented by Senthil and Santhosh (2012).

Graph-based representations of activity networks are also used in Petri net models and in discrete event simulators, including specific batch process simulation systems such as Batch Process Developer (BatchProcess Developer, 2013) and Batches (Batches Users Manual, 2003). Petri nets are principally used to model logical networks representing discrete decisions and do not have a direct link to mechanisms for data generation, retrieval and storage. Discrete event simulators (e.g., ExtendSim) use graphical representations of activity networks, which typically can be represented by a series of steps whose termination is controlled by state and time events and whose initiation and/or alternative routing is controlled by dispatching rules. Additionally, the typical semicontinuous steps in a chemical process, which span more than two stages, are difficult to model with Petri nets and activity networks. Building Petri nets is non-trivial, and the resulting models can become too large to generate all states of the system and difficult to analyze (FAA Human Factor Tools, 2013). Finally, workflows can also be used to drive a process simulation model consisting of sets of differential/algebraic equations. For example, one way to use the gPROMS™ simulator to model the recipe shown in Fig. 3 would be to first create a flowsheet consisting of all the unit operations equipment in the process.

Fig. 3. Workflow for manufacture of product P1.

Then one would select a model from the gPROMS™ library to represent each subtask and create a 'workflow' to sequence the subtasks of the recipe in the required order. The workflow can then be used in multiple ways to drive the execution of the gPROMS model. The simplest way would be to create the sequencing information from the workflow shown in Fig. 3. The sequencing is clearly defined by the workflow, and a simple interpreter can be developed to generate that information in the format required by gPROMS. A more sophisticated interface could be created for generating a complete gPROMS model from the workflow by allowing the selection of a model for each subtask and using the process parameters already defined in the workflow. This would eliminate duplication of information and reduce the time required to build a model. A detailed description of such an approach is beyond the scope of this paper.

In this paper we describe a new, general purpose workflow model that provides a generalization covering the variety of workflows and can be used effectively to capture the provenance of data generated from the various domains and forms described above. This workflow based system consists of four components: a graphical editor to build a workflow for the given application, the core building blocks in the workflow model, a library of functions that perform certain actions, and an engine that manages the execution of a workflow. A graphically oriented system has several benefits. First, during the early stages of workflow development, when the underlying steps are not yet finalized, the graphical builder allows models to be created and modified very easily, thereby allowing free exchange and convergence of ideas among the collaborators. This facilitates the creation of a workflow that is robust and meets the approval of all. Typically a workflow template is used repeatedly to create various instances of its use in specific cases. In that role, a workflow serves as a framework for storing the parameter values specific to each instance and the results or data output associated with that instance. Additionally, the associated workflow execution engine provides up to date status information about the instance in progress and assists the end user in managing the details of the current step as well as in progressing through the sequence of steps comprising the workflow. Finally, in its role as a reference to all associated information, a workflow is useful in guiding the user to the required information or in allowing computers to systematically access the required information by traversing the underlying structure. Thus, all the information stored in a repository becomes fully machine accessible, a crucial requirement for data analysis and data mining for new knowledge creation.

3. Workflow model

The general purpose workflow modeling framework advanced in this paper consists of the following building blocks:

a) workflow
b) task
c) subtask
d) material flow descriptors: source and sink nodes, subtask input/output ports, and transfer lines
e) information flow descriptors: input/output ports, information flows and data nodes.

The functions of the building blocks are described in the next few subsections.

3.1. Workflow and task

A workflow is the sum total of all of the building blocks that describe the details of a procedure. It is in essence a container that defines the scope of the activities, which in turn are decided by the user. By default, a workflow consists of at least one task. A task represents a series of steps performed on or by the assigned resource. In the case of experimental or manufacturing workflows, a task is performed on a piece of equipment or an instrument assigned to it. In the case of a business workflow, a designated person or team is responsible for implementing the assigned task. In the case of scientific workflows, the resource is typically the hardware and/or software by means of which a computational task is performed. Whenever a resource is shared by multiple tasks, the allocation of that resource to perform a given task is dictated by its availability, its suitability and the priority of the task to be performed. The onset and execution of a task are delayed until the required resource is assigned to it. A task is similar to the concept of unit process defined in the S88 standards. There is no equivalent concept in OPM. A subtask is a basic elementary step that is used in constructing a task. A series of subtasks constitutes a task and defines its scope. When a task is initiated, its execution begins with its first subtask and continues until the last subtask in its series is completed.

3.2. Material and information flows

Depending on the nature of the workflow network of tasks, its execution will entail transfer of actual physical entities or materials, of information, or both as these are used and generated. The material flow (in a generic sense) requires definition of material source and sink nodes, the locations of ports in specific subtasks which serve as inputs/outputs for that material, and transfer lines indicating the movement of material between specific subtasks. The transfer lines define the logistical network for the advancement of material through the workflow. For purposes of graphical depiction in the following discussion, material transfers are represented by solid lines, originating either from a material source node (represented by a yellow pentagon) or a subtask output shown as a triangle on the right vertical edge of a subtask rectangle, and terminating either in a sink node (a yellow triangle) or a subtask input shown as a triangle on the left vertical edge of a subtask rectangle. The information flow within a workflow likewise requires the definition of data sources and sinks, the locations of ports in specific subtasks which serve as inputs/outputs for that information, and arcs, or transfer lines, indicating the movement of information between specific subtasks. As in the case of material flows, information flows track the development of information and data as the workflow is executed. For graphical depiction purposes, information transfers are designated by dotted lines, with a solid circle designating an output port of a subtask and a hollow circle an input port on a designated subtask.
The concepts of material and information flows are certainly not new: they have been used in the representation of computational strategies for the execution of steady state process flowsheet simulation models (Westerberg et al., 1979). Corresponding to the notion of material source and sink nodes, the proposed representation also uses the construct of a data node.
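To make the roles of these building blocks concrete, the sketch below shows one way they could be represented in code. This is a minimal illustration in Python; all class and field names are our own choices and are not part of the system described in the paper.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class Polarity(Enum):
    """Solid triangle/circle = active (pulling input, pushing output);
    hollow = passive (accepts a push or allows a pull)."""
    ACTIVE = "active"
    PASSIVE = "passive"


@dataclass
class Port:
    name: str
    kind: str               # "material" or "information"
    direction: str          # "input" or "output"
    polarity: Polarity


@dataclass
class Subtask:
    """Basic elementary step; a series of subtasks constitutes a task."""
    name: str
    ports: List[Port] = field(default_factory=list)
    parameters: dict = field(default_factory=dict)      # keyword-value pairs


@dataclass
class Task:
    """Series of steps performed on or by an assigned resource."""
    name: str
    subtasks: List[Subtask] = field(default_factory=list)
    resource: Optional[str] = None      # equipment, person/team, or hardware/software


@dataclass
class Flow:
    """Material transfer line or information arc connecting a source, sink or
    data node, or a subtask port, to another port or node."""
    kind: str            # "material" or "information"
    source: str          # e.g. "A" (source node) or "PrepA.Empty.out"
    destination: str     # e.g. "React.FillA.in" or "TXData" (a data node)


@dataclass
class Workflow:
    """Container that defines the scope of the activities."""
    name: str
    tasks: List[Task] = field(default_factory=list)
    flows: List[Flow] = field(default_factory=list)
```

A fragment of the recipe of Fig. 3 would then be expressed as, for example, Task("PrepA", subtasks=[Subtask("FillA"), Subtask("Mix"), Subtask("Empty")]).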


3.3. Data nodes

A data node serves two main purposes: first, it contains the metadata for the information which is required as input to the workflow or which is generated as output from it and, secondly, it contains pointers to the location where the data itself is stored. A workflow typically contains at least one data node, which is a child of that workflow. A data node can be connected to a subtask only by an information flow. For purposes of graphical representation, a data node is shown as a red bordered rectangle with a suitable name. A data node can either be a data creation node or a data specification node.

3.3.1. Data creation node

A data creation node completely defines the structure, or metadata, of the data created during the execution of the associated subtask. A data structure may comprise a set of values, a table, or a combination thereof. The metadata of a set of values is simply a set of terms from a predefined vocabulary. The vocabulary provides the interpretation of each term when necessary. A vocabulary in turn may be defined as an ontology or simply via a table. The metadata of a table of data consists of a fixed set of columns, each with a name, an index and a measuring unit. Alternatively, the name and measuring unit may be replaced by a term from a vocabulary. Each instance of a data creation node also contains the path or a pointer to the physical location where the data is stored.

3.3.2. Data specification node

A data specification node identifies the specific data item(s) to be retrieved from a data repository. A data item in a data specification node is a reference to a specific metadata item from any of the data creation nodes defined in the library of workflows in a repository. Thus, an item in a data specification node may refer to an item in a data creation node, to a column in a table in a data creation node, or to a parameter associated with a task or subtask.

3.4. Example

As an example, the batch manufacturing workflow for making product P1 by reacting two chemicals A and B is shown in Fig. 3. (The complete set of workflow symbols used in the graphical representation in Fig. 3 and subsequent figures is summarized in Appendix A.) The production recipe consists of four tasks shown as green rectangles, PrepA, React, Filter and Store, each performed in the specific piece of equipment required by it. Each task, in turn, is modeled as a series of subtasks that correspond to the individual steps performed when that task is carried out. The PrepA task starts with filling the required amount of raw material A (yellow pentagon) during subtask FillA (white rectangle), mixing it for the specified amount of time (subtask Mix) and then emptying the contents (subtask Empty) into the downstream reactor when that unit is ready to receive this material during operation of subtask FillA of task React. The React task starts with filling raw material B during subtask FillB and then transferring material A from the upstream equipment (subtask FillA). The content is heated (subtask Heat) until a certain temperature is reached and allowed to react (subtask React) until certain yields are achieved. During the reaction the temperature and composition are continuously monitored and recorded (data node TXData). After the reaction, the content is continuously filtered (task Filter, subtask Filter), resulting in a product P1 stream (sink P1, a yellow triangle) and a mixture of unreacted A and B that is stored in a storage tank during subtask FillAB.


In this example, the information flow between the subtask React and the data node TXData is shown in Fig. 3 as a dotted line. In general there may be additional data nodes associated with the on- or off-line analysis of the product or of the reactants. Although this application is batch manufacturing oriented, the set of building blocks shown in Fig. 3 provides the necessary functionality to model all the workflow types mentioned above, at the level of detail required by a given application. Moreover, the graphical representation of a workflow provides a very concise view of the logistics and controls employed in implementing that workflow. In principle, since any organized activity can be cast as a workflow, the proposed workflow model can facilitate the development of a common framework for managing related activities in various domains. Such a framework will facilitate common understanding and communication of concepts horizontally and vertically within an organization.
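As an illustration of the data node concepts of Section 3.3, the following sketch (Python, with hypothetical field names) shows how the metadata of the TXData data creation node of Fig. 3 might be declared, together with a data specification node that refers back to one of its columns. The column names and units are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Column:
    """One column of a tabular data structure in a data creation node."""
    name: str          # or a term from a predefined vocabulary
    index: int
    unit: str


@dataclass
class DataCreationNode:
    """Defines the metadata of the data created by the associated subtask and
    a pointer to where each instance of that data is stored."""
    name: str
    values: List[str] = field(default_factory=list)     # terms from a vocabulary
    table: List[Column] = field(default_factory=list)
    storage_path: Optional[str] = None                  # filled in per workflow instance


@dataclass
class DataSpecificationNode:
    """Identifies a specific data item to be retrieved from the repository,
    by reference to a metadata item of some data creation node."""
    name: str
    source_node: str     # e.g. "WorkflowP1.TXData"
    item: str            # e.g. a column name, or a task/subtask parameter


# Metadata for the TXData node of Fig. 3: temperature and composition
# recorded against time during the React subtask (units are illustrative).
tx_data = DataCreationNode(
    name="TXData",
    table=[
        Column("Time", 0, "min"),
        Column("Temperature", 1, "degC"),
        Column("Composition", 2, "mol fraction"),
    ],
)

# A downstream workflow could then refer to the recorded temperature profile:
temp_profile = DataSpecificationNode(
    name="ReactorTemperature",
    source_node="WorkflowP1.TXData",
    item="Temperature",
)
```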

4. Workflow control

The logistics and control of the execution of a workflow require the concepts of independent/dependent tasks and active/passive subtasks, which are described in this section.

4.1. Task control

In general, a task defines all the actions performed using some set of assigned resources. A task can be independent or dependent. An independent task is initiated by the application or the user at a specific point in time, subject to the availability of a set of suitable resources. A dependent task is initiated by an upstream subtask that interacts with the first subtask of that task via either material transfer or information transfer, subject to the availability of the required resources. In Fig. 3, PrepA and React are examples of independent tasks, and Filter and Store are examples of dependent tasks. Since the first subtasks of PrepA and React fill raw materials, they are independent, whereas task Filter is triggered when the React task is ready to empty its contents, and similarly the Store task is triggered in order to receive material from the filter. If an application is concerned with just-in-time operation, the initiation of React would be tied to the initiation of PrepA via the durations of subtasks FillA and Mix of task PrepA, and FillB of task React. The task type, independent or dependent, can be deduced from the workflow diagram created by the user. The task type plays a key role in influencing the assignment of resources and the logistics of task initiation during the execution of a workflow. By definition, a task contains at least one subtask.

The systems available for modeling scientific and business workflows, such as Pegasus or Taverna, do not employ the concept of tasks, and thus a node in those workflows would be equivalent to a subtask. Of course, in those systems the arcs represent only information flows. The concept of task provides an extra dimension to the workflow representation, which is very useful in modeling complexities introduced by the hardware. For example, when multiple pieces of equipment or instruments are suitable for a task, such as parallel pieces of equipment or parallel CPUs, then those simply become parameters of the task without affecting the subtask level representation. Similarly, if the same workflow is to be implemented using a different physical set up, then again only the task information needs to change while the rest of the workflow stays intact. In manufacturing situations, such as the workflow in Fig. 3, where several subtasks are performed in the same piece of equipment, the concept of task is central to creating a workflow model. Without the concept of task, with just nodes and edges, it would be very difficult to show accurately the three important characteristics of workflows: multiple suitable pieces of hardware, implementation of the sequence of subtasks on the assigned piece of hardware, and creation of workflows that can be easily adapted to different sets of hardware.

The additional complexities due to the physical layout of hardware can also be managed easily by the concept of task. In the example discussed above, suppose there are two vessels available for task PrepA, P1 and P2, and two reactors for task React, R1 and R2, and P1 can be connected only to R1 and P2 only to R2. Such hardware related constraints should not affect a workflow model. The concept of task allows such constraints to be accommodated without affecting the subtask level representation of a workflow. The key step in executing a task is to assign a resource to it. Once the resource is assigned, the subtasks are executed in the order they are defined in the workflow model, beginning with the first. When the last subtask is executed, the task is completed and the assigned resource is released.

4.2. Subtask control

Given that the communication between tasks occurs at the subtask level, the control of information and material flows between subtasks has to be performed at that level. To capture that level of control, it is convenient to introduce four subtask types: Master, Slave, Chaining and Decoupling. For simple depiction of the subtask type in the workflow diagram, we introduce a suitable graphical coding. A material input on a subtask is represented by a triangle, hollow or solid, on the left vertical edge of the subtask, thus pointing toward the subtask. A material input that pulls material from upstream is depicted as a solid triangle, a pulling input. If a material input accepts material pushed by an upstream output, it is depicted as a hollow triangle, a passive input. A material output on a subtask is represented by a triangle, hollow or solid, on the right vertical edge of the subtask, thus pointing away from the subtask. A material output that pushes material downstream is depicted as a solid triangle, a pushing output. If a material output allows a downstream input to pull material, it is depicted as a hollow triangle, a passive output. Thus, the output and input on a material transfer line must be of opposite 'polarity': either the output pushes material (solid triangle) and the input receives material (hollow triangle), or the input pulls material (solid triangle) and the output allows material withdrawal (hollow triangle). The type of a subtask can be determined by its depiction in the associated task. The rules for determining the subtask type are given below.

4.2.1. Master subtask

A subtask is a master subtask if it has i) no material inputs or outputs and no information inputs or outputs, or ii) only pulling input(s) and no material output(s), or iii) only pushing output(s) and no material input(s), or iv) pulling input(s) and pushing output(s), or v) only information output(s). The various depictions that make a subtask a master subtask are shown in Fig. 4(a).

4.2.2. Slave subtask

A subtask is a slave subtask if it has a passive input and no material output, or a passive output and no material input, or an information input and no material input or output.


Fig. 4. Possible subtask depictions and the associated subtask types.

The various depictions that make a subtask a slave subtask are shown in Fig. 4(b).

4.2.3. Chaining subtask

A subtask is a chaining subtask if it has a passive input and a pushing output, or a passive output and a pulling input. In the first case, the upstream subtask pushes material into the chaining subtask, and in turn the chaining subtask pushes material downstream. In the second case, the downstream subtask pulls material from the chaining subtask, and in turn the chaining subtask pulls material from upstream. The various depictions that make a subtask a chaining subtask are shown in Fig. 4(c).

4.2.4. Decoupling subtask

A subtask is a decoupling subtask if it has a passive input and a passive output. A decoupling subtask allows the upstream subtask to push material into it and allows the downstream subtask to pull material from it asynchronously. The depictions that make a subtask a decoupling subtask are shown in Fig. 4(d).

A subtask's type influences the actions taken while executing that subtask. Whenever there is a transfer of material between subtasks, that transfer can be implemented only when the tasks to which the subtasks belong are in the right state. For the workflow shown in Fig. 3, if task PrepA advances to the Empty subtask, it must wait for the React task to advance to the FillA subtask. Similarly, if the React task advances to the FillA subtask, it must wait for the PrepA task to advance to the Empty subtask. In this particular transfer of material, FillA is the master subtask, as can be inferred from the diagram. The same is true if the material transfer spans more than two subtasks. Again for the workflow shown in Fig. 3, the Empty subtask of the React task, the Filter subtask of the Filter task and the FillAB subtask of the Store task form a semicontinuous chain of material flow. The chain of subtasks can operate only when the corresponding tasks are in the right subtasks. In this particular semicontinuous chain, the Empty subtask is a master subtask, Filter is a chaining subtask and FillAB is a slave subtask.

When a task advances into a subtask, the main steps in executing that subtask are: wait until all the necessary conditions to start the subtask are satisfied, start all the subtasks controlled by the master subtask, implement the actions associated with the subtask as specified by the user, detect when the conditions for ending the subtask are satisfied, end the subtask, inform all other interacting subtasks and advance the task to the next subtask. When the last subtask of a task is completed, the task is ended and the resource assigned to it is released. A specific set of actions is associated with each subtask. The actions are carried out when the subtask is 'active' and depend on the application for which the associated workflow is being used. For example, if the workflow in Fig. 3 is used in manufacturing, the subtasks may provide information to an operator such as rpm, temperature, duration and so on. On the other hand, if the workflow is used to drive a simulator, the dynamic model specified with a subtask is implemented. When used as a structure for referencing information, the building blocks in the workflow diagram provide links to access the associated information.
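The typing rules of Sections 4.2.1–4.2.4 can be stated compactly in code. The sketch below (Python; the function name and port-count arguments are our own, not part of the paper) classifies a subtask from the polarity of its material ports and the presence of information ports, mirroring the rules above.

```python
from enum import Enum


class SubtaskType(Enum):
    MASTER = "master"
    SLAVE = "slave"
    CHAINING = "chaining"
    DECOUPLING = "decoupling"


def classify_subtask(pulling_inputs: int, passive_inputs: int,
                     pushing_outputs: int, passive_outputs: int,
                     info_inputs: int = 0, info_outputs: int = 0) -> SubtaskType:
    """Apply the rules of Sections 4.2.1-4.2.4 to the counts of each port kind."""
    has_mat_in = pulling_inputs + passive_inputs > 0
    has_mat_out = pushing_outputs + passive_outputs > 0

    # 4.2.4 Decoupling: passive input and passive output (asynchronous buffer).
    if passive_inputs and passive_outputs:
        return SubtaskType.DECOUPLING

    # 4.2.3 Chaining: passive input with pushing output, or
    #                 passive output with pulling input.
    if (passive_inputs and pushing_outputs) or (passive_outputs and pulling_inputs):
        return SubtaskType.CHAINING

    # 4.2.2 Slave: passive input and no material output, passive output and no
    #              material input, or an information input with no material ports.
    if (passive_inputs and not has_mat_out) or \
       (passive_outputs and not has_mat_in) or \
       (info_inputs and not has_mat_in and not has_mat_out):
        return SubtaskType.SLAVE

    # 4.2.1 Master: everything else (no ports at all, only pulling inputs,
    #               only pushing outputs, both, or only information outputs).
    return SubtaskType.MASTER


# The FillA subtask of task React in Fig. 3 pulls material from PrepA.Empty:
assert classify_subtask(pulling_inputs=1, passive_inputs=0,
                        pushing_outputs=0, passive_outputs=0) is SubtaskType.MASTER
```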

5. Workflow as information aggregator

The graphical model of a workflow provides a well-defined structure and reference for the associated information. The information is typically specified as pairs of keywords and values. While the graphical model highlights the relationships between the various building blocks and the control of their execution as described above, the parameter values are meaningful to a specific application. This feature can be used in a variety of ways to store information efficiently. For example, the parameters that are common to all applications could be stored as one set, while application dependent parameters can be stored as separate sets, all sets using the same workflow model. The same applies to storing the results or outputs of an application. For example, suppose the workflow in Fig. 3 is to be used for comparing the performance of two different manufacturing sites, 1 and 2, using a simulation tool. The main difference between the two sites may just be the pieces of equipment suitable for each task, all other information being identical for both sites. Then the information can be structured such that one set of data contains all the information that is identical for both sites, one set describes the equipment at site 1 and one describes the equipment at site 2. To run a simulation model for site 1, the application will use the common information and the set of data for site 1, and so on. The workflow may be used to store information used for manufacturing as a different set altogether.

In the case of an experimental workflow application, the data nodes of the workflow provide the structure for recording and accessing experimental parameters and results. By way of example, consider an experimental workflow for the execution of a series of continuous blending experiments with a set of different APIs and excipients. The apparatus consists of several feeders, a continuous blender and a NIR instrument to measure API content. The study seeks to determine the impact of component material properties, component feed ratios and blender design and operating parameters (e.g., type of blender, impeller angle set, blender rpm) on the root mean squared deviation of the API composition of the blend that is produced. The workflow for the experiment is shown in Fig. 5. (The legend of workflow symbols used in this figure is given in Appendix A.) Note that there are 6 data nodes and up to 4 input material streams, each with potentially different material properties. To record an experiment, the Enter Data link shown in Fig. 5(a) is opened. This link provides access to forms for entering the parameter values specific to that experiment. The following five groups of parameters are associated with each experiment: General Information, Input Materials, Feeder, NIR and Mixer. In addition, the time series data associated with each feeder and the raw NIR data extracted at the end of each experiment are uploaded into the database. The extracted data can be stored in various forms, such as Excel files. Thus, each experiment instance provides access to the input parameters as well as the results through a relational database. The form to access data associated with experiments is shown in Fig. 5(b). There are links to specific tables in the database or a link to a general information table, which in turn has links to the other tables. The General Information data table is shown in Fig. 6. There is one row for each experiment in this table. The first five columns show the general information entered by the user.
The next five columns are links to the associated tables. A Yes in the column indicates that data exists in the associated table for that category for that experiment. The Load Cell table stores links to the Excel files, which contain the time series data for each feeder used during the experiment. Each time series data file contains two columns, Time and Current Flowrate in kg/h. When a link is opened, the user is given the option to browse or download the Excel file.


Fig. 5. Data entry and viewing for the blending experiments.

Once the campaign of experiments is completed, the data tables become a knowledge source, which can be consulted, for instance, to select operating conditions for blends similar to those for which data is already available, by a simple search on the key constituents. Likewise, the data set may serve as input to a scientific workflow involving the fitting of a correlation for predicting RSD from key blend constituent properties and RPM. In the case of a workflow developed for purposes of representing a manufacturing process, such as that shown in Fig. 3, the workflow provides the common structure for retaining all of the operating variables, supporting materials data and preparation steps as well as the interface to a DCS system. Specific data nodes in the workflow can serve to provide the links to specific time windows in the data historian to facilitate access to such information. Although the DCS system historian, along with the chemometric tools linked to the DCS system, does serve as an archive for the raw data, which may satisfy regulatory requirements, these resources do not provide the integrated context for capturing all of the manufacturing information describing the run.

From the perspective of data management, the workflow thus defines not only the procedure followed but also all of the associated input, process variable and output information generated, in a manner that both assures that all required data is recorded and minimizes the duplication of information. A workflow acts as a common lens through which the data generated can be viewed and understood in the context in which it was generated. It is this context that is essential for building understanding and developing deeper knowledge about the activities that the workflows represent.
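To indicate how such workflow-structured records might be queried once a campaign is complete, the sketch below uses Python's built-in sqlite3 module with an illustrative schema; the table and column names are assumptions made for this example (the actual KMS described in Section 8 uses MySQL under HUBzero).

```python
import sqlite3

# Illustrative schema: one row per experiment instance of the blending workflow,
# plus a child table pointing to the per-feeder time series files.
conn = sqlite3.connect("blending_experiments.db")
conn.executescript("""
CREATE TABLE IF NOT EXISTS general_information (
    expt_id     TEXT PRIMARY KEY,
    operator    TEXT,
    expt_date   TEXT,
    api_name    TEXT,
    blender_rpm REAL,
    rsd         REAL            -- root mean squared deviation of API composition
);
CREATE TABLE IF NOT EXISTS load_cell (
    expt_id   TEXT REFERENCES general_information(expt_id),
    feeder    TEXT,
    file_path TEXT              -- Excel file with time vs. flow rate (kg/h)
);
""")

# 'Simple search on the key constituents': find prior runs with the same API
# in a given RPM window, to suggest operating conditions for a similar blend.
rows = conn.execute(
    """
    SELECT expt_id, blender_rpm, rsd
    FROM general_information
    WHERE api_name = ? AND blender_rpm BETWEEN ? AND ?
    ORDER BY rsd ASC
    """,
    ("acetaminophen", 150, 250),
).fetchall()

for expt_id, rpm, rsd in rows:
    print(f"{expt_id}: RPM={rpm}, RSD={rsd}")

conn.close()
```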

Fig. 6. Example of the main table having a row for each experiment.

6. Workflow execution

The end uses of a workflow based system may cover a wide range of applications, such as experimental data management, information archival and retrieval for data mining, creating inputs for commercial simulators, and operator interfaces in plant operation. The software functionalities necessary for executing a workflow can be broadly divided into two sets. One set of functionalities, the core execution engine, implements the steps in a workflow according to the graphical model constructed by the user. The second set of functionalities implements the actions associated with the tasks and subtasks. The core execution engine is independent of the application for which a workflow is used, while the second set of functionalities is tied to the application for which a workflow is being used.

6.1. Core execution engine

The execution of a workflow typically involves one pass, or one cycle, through all the activities modeled in that workflow. During a cycle, a workflow starts from an initial state and transitions through all or some of the finite states possible for each of its building blocks. The trajectory of the finite states may differ from cycle to cycle and is governed by several factors, such as the values of various parameters, the end application, the decision logic in the execution engine, and user input at specific points in the workflow. The core execution engine (CEE) implements the logic built into or implied by the workflow diagram and executes one cycle of the workflow. The engine expects the user to initiate a task when prompted and waits until the user manually triggers the Start Task event. It also prompts the user to manually trigger events at the subtask level; the main subtask events are Start Subtask and End Subtask on each master subtask. Thus, the workflow execution progresses based on the events triggered by the user.

The execution logic of the CEE is briefly summarized as follows. When the workflow execution starts, the status of all independent tasks is changed to 'Active'. When a resource is assigned to a task, determined by a user-triggered action, the task is advanced to its first subtask. At the beginning of workflow execution all subtasks are in the default state. During its execution, a subtask can be in any of five states: default, waiting, ready, active, completed. The specific subtask level actions which are undertaken depend on the nature of the subtask (master, slave, chaining, decoupling) and the state of its up- or downstream subtasks. The logic of proceeding from the initial default state to the ready state consists of five cases:

1. If a subtask is a master subtask and has a material input, a material output and/or an output signal, it waits for all the interacting subtasks to advance into 'ready' status, setting its own status to waiting. If a master subtask has no interaction with any other subtask, it sets its own status to 'ready'.
2. If a subtask is a slave subtask, it immediately advances into the ready state.
3. If a subtask is a chaining subtask and pushes material down through its output, then if its downstream subtask is not ready, it sets its status to waiting. If its downstream subtask is ready, it sets its status to ready and informs the upstream subtask that it is ready.
4. If a chaining subtask pulls material through its input and its upstream subtask is not ready, it sets its status to waiting. If its upstream subtask is ready, it sets its status to ready and informs the downstream subtask that it is ready.
5. If a subtask is a decoupling subtask, it sets its status to ready and informs the subtasks it interacts with.

The logic associated with moving subtasks from the ready to the active or completed state is covered by the following four cases, depending on the nature of the subtask:

1. For a master subtask, when all the subtasks upstream and downstream of it have advanced to ready status, the master subtask advances to ready status. A user-triggered event is then required to advance the master subtask from ready to active status. A subsequent event ends a master subtask, advancing it to the completed status and moving the task to the next subtask. In addition, when a master subtask is completed, it informs all its immediate subtask neighbors.
2. A slave subtask changes its status to complete when the subtask it interacts with is completed, and advances to the next subtask in its task.
3. A chaining subtask changes its status to complete, informs the other subtasks it interacts with, and advances to the next subtask in its task.
4. In the case of a decoupling subtask, when an upstream or downstream subtask of the decoupling subtask is completed and none of its other neighbors are waiting or active, it sets its state to ready. The user may change the status of a decoupling subtask to complete when all the upstream interactions are completed.

The CEE displays the current status of all tasks and subtasks when a workflow is in progress. For example, a snapshot of the workflow for product P1 is shown in Fig. 7. As indicated by the legend, all subtasks of task PrepA have been completed, subtask Heat of task React is currently being executed, and the subtasks and tasks in their default states have not yet been executed.

Fig. 7. Snapshot of task and subtask statuses for a workflow in progress.

Schematics of the workflow status diagram at two other times are shown in Fig. 8. In Fig. 8(a), the React subtask is in progress and data are being collected by the TXData node. In Fig. 8(b) the Empty subtask is in progress. Because they form a semicontinuous chain with the Empty subtask, subtasks Filter and FillAB are also in progress at the same time. The Filter subtask is the first subtask of the Filter task. Since the Empty subtask pushes material downstream (it is a master subtask), when the React task advances to the Empty subtask, the CEE creates a request to initiate the Filter task and changes its status to 'Active'. After the task is started by a user event, it advances to its first subtask, namely Filter. Since Filter is a chaining subtask, the CEE checks if its outputs are satisfied. One of its outputs goes to sink P1, and therefore it is satisfied. However, one output is to subtask FillAB, which is the first subtask of task Store. Again, the core execution engine creates a request to initiate the Store task and changes its status to 'Active'. After that task is started by a user event, it advances to the FillAB subtask. That is when the semicontinuous chain is initiated. When the Empty subtask is ended by a user-triggered event, the Filter and FillAB subtasks are ended. Since they all are the last subtasks of their corresponding tasks, the associated tasks are completed when these subtasks end. That marks the completion of one cycle of the entire workflow.

Fig. 8. Snapshots when the React and Empty subtasks are in progress.
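A minimal sketch of the state progression that the CEE enforces for a push-type semicontinuous chain is given below (Python). The class, method and event names are our own, the user-triggered Start Subtask and End Subtask events are reduced to direct method calls, and entering the chain downstream-first stands in for the engine's re-checking of waiting subtasks; the full engine also handles resource assignment, pull chains and decoupling buffers.

```python
from enum import Enum


class State(Enum):
    DEFAULT = "default"
    WAITING = "waiting"
    READY = "ready"
    ACTIVE = "active"
    COMPLETED = "completed"


class SubtaskFSM:
    """One subtask's slice of the CEE logic for a chain in which material is
    pushed downstream (master -> chaining -> ... -> slave)."""

    def __init__(self, name, subtask_type, downstream=None):
        self.name = name
        self.type = subtask_type      # "master", "slave", "chaining", "decoupling"
        self.downstream = downstream  # next subtask in the material chain, if any
        self.state = State.DEFAULT

    def enter(self):
        """Default -> waiting/ready, following cases 1-5 of the CEE logic."""
        if self.type in ("slave", "decoupling"):
            self.state = State.READY                 # cases 2 and 5
        elif self.downstream is None:
            self.state = State.READY                 # master with no interactions
        elif self.downstream.state is State.READY:
            self.state = State.READY                 # cases 1 and 3
        else:
            self.state = State.WAITING               # wait and re-check later

    def start(self):
        """User-triggered 'Start Subtask' on the master activates the chain."""
        node = self
        while node is not None and node.state is State.READY:
            node.state = State.ACTIVE
            node = node.downstream

    def end(self):
        """User-triggered 'End Subtask' on the master completes the chain."""
        node = self
        while node is not None and node.state is State.ACTIVE:
            node.state = State.COMPLETED
            node = node.downstream


# Empty (master) -> Filter (chaining) -> FillAB (slave): the semicontinuous
# chain of Fig. 3. Entering downstream-first mirrors the readiness checks.
fill_ab = SubtaskFSM("FillAB", "slave")
filt = SubtaskFSM("Filter", "chaining", downstream=fill_ab)
empty = SubtaskFSM("Empty", "master", downstream=filt)

for s in (fill_ab, filt, empty):
    s.enter()
empty.start()    # the whole chain becomes active together
empty.end()      # ending the master ends the chain
print([s.state.value for s in (empty, filt, fill_ab)])
# -> ['completed', 'completed', 'completed']
```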

Fig. 9. Example of core execution and application engine interactions.

6.2. Application engine

While the overall execution of a workflow is controlled by the core execution engine, an application engine implements the actions related to the end use. As an example, suppose the workflow shown in Fig. 3 is being used for running a process, and the associated application provides values of key operating parameters and supplementary information to the operator for all processing steps. The interplay between the CEE and the application engine that performs this function is shown in Fig. 9. When a subtask advances to the 'Active' status, the actions defined by a certain group of user specified parameters of that subtask are implemented. The CEE invokes the application engine for that subtask. The application engine in turn may access any information with reference to the subtask passed by the CEE. Suppose the process advances into the Heat subtask of the React task in the workflow shown in Fig. 3. The CEE updates the status and displays it using a color code on the operator console. The current status of all subtasks is stored in the data repository and can be accessed by any application through structured queries. When the operator clicks on the Heat subtask to get the instructions on how to operate it, the application engine retrieves the appropriate information from the repository and creates a window as shown on the right hand side of Fig. 9. This window contains the key process parameters, any special instructions such as a checklist or precautions, and a pair of buttons to mark the start and end of the subtask. Of course, these buttons are triggered by the operator. In this particular case, the application simply provides the information to the operator and expects the operator to interpret it. Alternatively, a more sophisticated application may interpret these parameters as set points to be passed to controllers on the associated equipment. The buttons invoke the CEE and execute the corresponding actions. As described earlier, the CEE manages the current state of the associated workflow instance. Until the operator clicks the 'End Subtask' button, the React task stays active in that subtask. It should be noted that the application engines do not directly interact with the CEE, but instead use the current status of the various elements in the workflow to influence the actions. The workflow provides the references to all data in the repository.
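The division of labor just described can be summarized in a short sketch (Python; the function name, parameter store and parameter values are all hypothetical): the CEE owns the state transitions and merely hands the identity of the active subtask to the application engine, which looks up that subtask's parameter set and presents it to the operator.

```python
# Hypothetical parameter store keyed by (workflow, task, subtask); in the KMS
# this information lives in the relational database behind the workflow template.
SUBTASK_PARAMETERS = {
    ("WorkflowP1", "React", "Heat"): {
        "set point temperature": "80 degC (illustrative value)",
        "heating medium": "steam, 2 bar (illustrative value)",
        "instructions": "Verify agitator is running before opening steam valve.",
    },
}


def application_engine(workflow: str, task: str, subtask: str) -> None:
    """Invoked by the CEE when a subtask becomes 'Active'; presents the key
    operating parameters and special instructions for that subtask."""
    params = SUBTASK_PARAMETERS.get((workflow, task, subtask), {})
    print(f"--- {workflow} / {task} / {subtask} ---")
    for key, value in params.items():
        print(f"{key}: {value}")
    # The 'Start Subtask' / 'End Subtask' buttons shown alongside this window
    # call back into the CEE, which records the corresponding events.


application_engine("WorkflowP1", "React", "Heat")
```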

Fig. 10. Workflow for a project to determine optimum T and RPM.

7. Workflow hierarchy

A workflow allows the capture of all the details, that is, the entire procedure, associated with creating any specific data entry. The knowledge of these details is often as important as the resulting data. Data creation is almost always done with a purpose, and each data creation activity is part of an overarching project. Typically, a project is driven by well-defined objectives and carried out according to a plan. At the very least, a typical project consists of three sets of activities: project roadmap development, generation of data, and data analysis. Each activity can be modeled as a workflow, and the relationship between the activities can be shown as a hierarchy of workflows. For example, suppose the workflow shown in Fig. 3 models the operation of a pilot plant, belonging to a process development organization, used to make product P1. Furthermore, suppose that a process development group would like to use the pilot plant to identify empirically the best temperature and stirrer RPM combination to maximize the yield of P1. Broadly speaking, such a project would start with identifying the main tasks to be accomplished, as shown by the workflow in Fig. 10. The first task is to set up a design of experiments for the independent variables, two in this case, temperature and stirrer RPM. The next task is to perform the runs on the pilot plant. The last task is to analyze the results from the pilot plant runs.

Suppose that the DOE activity is performed by a specific support group within the company. In that case, the DOE activity can be modeled as a separate workflow executed by that group. The workflow shown in Fig. 11 is an example of a scientific workflow for full-factorial design of experiments, where the DOE uses the independent variables/factors and the levels per factor as input, implements the program during subtask DOE and creates the output file RunConds. The output is a table, where each row has the values of the independent variables for that particular production run.

Fig. 11. Workflow for full factorial design of experiments.

Fig. 12. Workflow for finding the optimum temperature and RPM.

The FullFactorialDesign workflow is invoked by the DoDOE subtask in Fig. 10. Upon completion of the FullFactorialDesign workflow, the DoDOE task in Fig. 10 advances to the StartExpts subtask, which in turn triggers the DoRuns subtask. The run conditions generated by the experimental design constitute another input to the DoRuns subtask. The DoRuns subtask implements certain types of actions. For each row in the RunConds file, it invokes the workflow shown in Fig. 3 for the manufacture of P1. Only after all the runs are completed will the DoRuns subtask be completed and the task progression advanced to the StartAnalysis subtask, which in turn will trigger the Optimize task.

Suppose the optimization activity is also performed by a specific support group within the company. In that case the optimization activity is modeled as a separate workflow executed by that group. The workflow shown in Fig. 12 is an example of the steps performed for finding the optimum. For example, suppose the MATLAB package is used for determining the optimum. As a first step, an M-file must be created to perform the calculations. The M-file will include code to extract and preprocess data, invoke the suitable optimization function of MATLAB, and write the optimum values in a predefined format. Once the M-file is created, it is executed in the RunMatlab subtask. In the ExtractData data node, the user identifies the specific data that is to be extracted from the pilot plant runs.

Thus the entire project can be fully scoped out prior to its initiation, with all the individual steps, their relationships and precedence order, the resource requirements, the metadata of all information created and consumed, and so on completely outlined. The complete ensemble of workflows is shown in Fig. 13. Fully defining a project in this fashion has several benefits during project implementation. The CEE and the application engines provide the up-to-date status of the tasks and subtasks associated with all workflows. Therefore, the person in charge of the overall project as well as the persons responsible for individual tasks can monitor the overall progress very readily. Additionally, since the path and structure of all data created are fully defined, data can be easily shared, retrieved, analyzed and validated at any point in time.
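The DOE subtask of the FullFactorialDesign workflow amounts to enumerating all combinations of the factor levels, and the DoRuns subtask then invokes the P1 workflow once per row of RunConds. A compact sketch of these two steps is given below (Python); the factor levels and the run_workflow_p1 stub are illustrative assumptions.

```python
from itertools import product


def full_factorial(factors: dict) -> list:
    """DOE subtask of Fig. 11: one run per combination of factor levels."""
    names = list(factors)
    return [dict(zip(names, combo)) for combo in product(*factors.values())]


def run_workflow_p1(conditions: dict) -> None:
    """Stand-in for one cycle of the WorkflowP1 recipe of Fig. 3 executed by
    the DoRuns subtask; in the actual system the CEE drives the pilot plant run."""
    print(f"Running pilot plant with T={conditions['T_degC']} degC, "
          f"RPM={conditions['RPM']}")


# Illustrative levels for the two independent variables of the project in Fig. 10.
run_conds = full_factorial({"T_degC": [60, 70, 80], "RPM": [100, 200, 300]})

for row in run_conds:           # DoRuns: one P1 manufacturing cycle per row
    run_workflow_p1(row)
```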

Fig. 13. Ensemble of workflows associated with the project.

8. Workflow implementation

A knowledge management system (KMS), which uses workflows to model all information generating processes, has been developed at Purdue University. The three key components of the KMS are shown in Fig. 14. The KMS is implemented on the HUBzero® (McLennan and Kennell, 2010) middleware developed by Information Technology at Purdue (ITaP). HUBzero® is an open source software platform for building Web sites that support scientific discovery, learning, and collaboration, and is based on the Joomla (Joomla, 2013) content management system. It also provides the MySQL server for the relational database functions and the Apache web server. One of the attractive features of the HUBzero implementation is that the KMS applications are web-accessible, with suitable security controls to enable all members of the team involved in the execution of any specific workflow to browse, search, use and modify the workflow itself or any of its elements based on a variety of levels of authorization. In implementing the KMS, the server side scripting was done using PHP and the graphics are rendered using SVG. The workflow builder is an active graphics based tool that allows a user to build the graphical model template of a workflow and to define the parameters associated with each workflow component. The web interface allows a user to create instances of templates for creating new information. It has functionality to access existing information and view it through the associated template, or to extract information as structured data for use in further processing. It can also create x-y plots for pairwise data extracted from the data repository. Thus, the information recorded in the KMS is fully machine accessible, with all relationships defined by the graphical model of the associated workflow. The information can be utilized for a wide variety of end uses, such as creating reports, accessing data for analysis, developing workflows for new applications and evolving existing workflows to broaden their scope.

Fig. 14. Components of the workflow based knowledge management system.

9. Conclusions

A typical knowledge repository holds a wide range of interrelated information. Associated with every information item is a process that provides an explicit model of the various activities and relationships associated with its creation, namely, its provenance. The provenance is as important as the information itself because it facilitates the full understanding, sharing and use of the information. A workflow based framework has been developed to model the information creation processes. The graphical model of a workflow proposed here is very intuitive and easy to understand and share. Additionally, the set of constructs proposed in this paper for modeling the material and information flows represents graphically the execution controls required for implementing the associated process. The data nodes provide direct links to information stored in the repository, as well as to the metadata for the associated information, while the workflow fully defines the context for the information.

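The pattern of structured access described above can be sketched with a small, self-contained example. The actual KMS stores instance data in MySQL and serves it through PHP with SVG graphics; the table layout, column names and the use of SQLite and matplotlib below are hypothetical stand-ins intended only to show how pairwise instance data tied to workflow cases might be extracted and plotted.

```python
# Hypothetical sketch of extracting pairwise instance data recorded against
# workflow data nodes and plotting it. The schema and names are illustrative;
# they are not the KMS repository layout.
import sqlite3

import matplotlib.pyplot as plt

# Stand-in repository: one row per value produced by a data node in a
# completed workflow case.
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE instance_data (
           workflow_case TEXT, data_node TEXT, value REAL)"""
)
conn.executemany(
    "INSERT INTO instance_data VALUES (?, ?, ?)",
    [("run1", "Temperature", 60.0), ("run1", "Yield", 0.71),
     ("run2", "Temperature", 70.0), ("run2", "Yield", 0.78),
     ("run3", "Temperature", 80.0), ("run3", "Yield", 0.74)],
)

# Extract an x-y pair of variables across all completed cases by joining the
# repository on the workflow case identifier.
rows = conn.execute(
    """SELECT x.value, y.value
         FROM instance_data AS x JOIN instance_data AS y
           ON x.workflow_case = y.workflow_case
        WHERE x.data_node = 'Temperature' AND y.data_node = 'Yield'
        ORDER BY x.value"""
).fetchall()

xs, ys = zip(*rows)
plt.plot(xs, ys, marker="o")
plt.xlabel("Temperature")
plt.ylabel("Yield")
plt.title("Pairwise data extracted from the repository")
plt.show()
```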

9. Conclusions

A typical knowledge repository holds a wide range of interrelated information. Associated with every information item is a process that provides an explicit model of the various activities and relationships associated with its creation, namely, its provenance. The provenance is as important as the information itself because it facilitates the full understanding, sharing and use of the information. A workflow based framework has been developed to model these information creation processes. The graphical model of a workflow proposed here is very intuitive and easy to understand and share. Additionally, the set of constructs proposed in this paper for modeling the material and information flows represents graphically the execution controls required for implementing the associated process. The data nodes provide direct links to the information stored in the repository, as well as to the metadata for the associated information, while the workflow fully defines the context for the information. The explicit relationships defined by the workflow are machine interpretable, and as such can be used for structured access to every information item, thereby facilitating querying, extraction and analysis of the data in the associated data repository. By organizing any decision making process as a collection of interrelated subprocesses, a hierarchy of workflows can be developed to model these interconnected subprocesses. A workflow based knowledge management system has tremendous potential for providing a uniform structure for linking disparate data, as well as for encompassing multiple domains of discipline and authority.

Acknowledgments

The authors gratefully acknowledge the support of the US National Science Foundation through the Engineering Research Center for Structured Organic Particulate Systems under grant EEC-0540855. Support for the HUB implementation of the workflow builder was provided under NSF grant CBET-0941302. The contributions of ITaP staff members Michael McLennan, Michael Zentner and George Howlett were important to the implementation and are very much appreciated.

Appendix A. Workflow symbols

Graphical symbols are defined for the following workflow elements and connections:

Workflow
Task
Subtask
Material source
Material sink
Data node
Material flow
Information flow
Pulling material input on a subtask
Passive material input on a subtask
Pushing material output on a subtask
Passive material output on a subtask
Information input on a subtask or a data node
Information output on a subtask or a data node

References

Batches Users Manual. West Lafayette, IN, USA: Batch Process Technologies; 2003.
Batch Process Developer. Aspen Technology, Inc.; 2013. http://www.aspentech.com/products/aspen-batch-plus.aspx [accessed 01.03.14].
Börger E. Approaches to modeling business processes: a critical analysis of BPMN, workflow patterns and YAWL. Softw Syst Model 2012;11:305–18.
Davis J, Edgar TF, Porter J, Bernaden J, Sarli M. Smart manufacturing, manufacturing intelligence and demand-dynamic performance. Comput Chem Eng 2012;47:145–56.
Deelman E, Singh G, Su M, Blythe J, Gil Y, Kesselman C, Mehta G, Vahi K, Berriman GB, Good J, Laity A, Jacob JC, Katz DS. Pegasus: a framework for mapping complex scientific workflows onto distributed systems. Sci Program 2005;13:219–37.
ExtendSim 7 User Guide. San Jose, CA: Imagine That Inc.
FAA Human Factor Tools, Mathematical Modeling. http://www.hf.faa.gov/workbenchtools/default.aspx?rPage=Tooldetails&subCatId=31&toolID=200; 2013 [accessed 01.04.14].
gPROMS. www.psenterprise.com/gproms.htm [accessed 01.04.14].
International Society for Measurement and Control. Batch control. Part 1: Models and terminology. International Society for Measurement and Control; 1995.
Joomla content management system. www.joomla.org; 2013 [accessed 01.04.14].
Junker B, Maheshwari G, Ranheim T, Altaras N, Stankevicz M, Harmon L, Rios S, D'Anjou M. Design-for-six-sigma to develop a bioprocess knowledge management framework. PDA J Pharm Sci Technol 2011;65:140–65.
Kepler web site. www.kepler-project.org; 2013 [accessed 01.04.14].
Laínez JM, McLennan M, Mockus L, Reklaitis GV. Linking simulation tools using the PharmaHUB workflow management functionality. In: AIChE Annual Meeting; 2011.
McLennan M, Kennell R. HUBzero: a platform for dissemination and collaboration in computational science and engineering. Comput Sci Eng 2010;12(2):48–52.
Moreau L, Clifford B, Freire J, Futrelle J, Gil Y, Groth P, Kwasnikowska N, Miles S, Missier P, Myers J, Plale B, Simmhan Y, Stephan E, Van den Bussche J. The Open Provenance Model core specification (v1.1). Future Gener Comput Syst 2011;27(6):743–56.
Oinn T, Greenwood M, Addis M, Alpdemir MN, Ferris J, Glover K, Goble C, Goderis A, Hull D, Marvin D, Li P, Lord P, Pocock MR, Senger M, Stevens R, Wipat A, Wroe C. Taverna: lessons in creating a workflow environment for the life sciences. Concurr Comput Pract Exp 2006;18(10):1067–100.
Schwartz DG, editor. Encyclopedia of knowledge management. Idea Group Inc (IGI); 2006.
Senthil MB, Santhosh KV. A survey of workflow management tools for Grid platform. Adv Inform Technol Manage 2012;1(1):1–3.
SMLC Workshop web site. https://smartmanufacturingcoalition.org/readingmaterials/presentations-workshop-materials; 2013 [accessed 01.03.14].
Westerberg AW, Hutchison HP, Motard RL, Winter P. Process flowsheeting. Cambridge University Press; 1979.
