r2 - 12 Jun 2007 - 16:15:03 - DanieleTuri

LogBook DataLineage

DataLineage is a visualisation component added to LogBook in version 1.2.4. It displays:

  • the data derivation graph of a workflow run
  • the actual data values
  • similar data

Launching

The data lineage visualisation of a workflow run is launched by pressing the DataLineage button in the GUI:
datalineage-start.png

A new window then opens with a graph on the left. The current version uses a very basic graph layout, so the graph will probably look messy, and unbearably so if it is very large:
datalineage-messy.png

You can, however, manually arrange the graph using your mouse:
datalineage-arranged.png

The nodes of the graph represent data; an arrow from a node A to a node B represents the fact that A is (directly) derived from B. We distinguish between intermediate data and workflow inputs and outputs.
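
To make the arrow direction concrete, here is a minimal Python sketch of such a derivation graph. It is not LogBook code: the class, its method names and the "url" input are hypothetical, and the two edges are made up for illustration rather than taken from a real workflow run.

```python
from collections import defaultdict


class DerivationGraph:
    """Toy derivation graph: an edge A -> B records that A is directly derived from B."""

    def __init__(self):
        # Maps each data name to the set of data it is directly derived from.
        self._derived_from = defaultdict(set)

    def add_derivation(self, derived, source):
        """Record an arrow from `derived` to `source`."""
        self._derived_from[derived].add(source)
        # Register the source as a node even if it has no outgoing arrows.
        self._derived_from.setdefault(source, set())

    def direct_sources(self, name):
        """Data that `name` is directly derived from (one arrow away)."""
        return set(self._derived_from.get(name, ()))

    def lineage(self, name):
        """All data that `name` is directly or indirectly derived from."""
        seen = set()
        stack = [name]
        while stack:
            current = stack.pop()
            for source in self._derived_from.get(current, ()):
                if source not in seen:
                    seen.add(source)
                    stack.append(source)
        return seen


# Hypothetical edges, for illustration only.
graph = DerivationGraph()
graph.add_derivation("todaysDilbert", "getImageLinks:imageLinks")
graph.add_derivation("getImageLinks:imageLinks", "url")
print(sorted(graph.lineage("todaysDilbert")))
```

Following the arrows from a node therefore walks backwards through the run, from a result towards the data it came from.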

Intermediate Data

Intermediate data is shown in yellow for single data items and in green for data collections (lists). Each name consists of the processor name, followed by a colon, followed by the corresponding output port name, for example getImageLinks:imageLinks.

datalineage-list.png

Workflow Inputs and Outputs

The names of workflow inputs and outputs are just the port names, for example todaysDilbert. Workflow outputs are light blue with rounded edges, and workflow inputs are darker blue with even more rounded edges:
datalineage-input.png
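
The two naming conventions (processor:port for intermediate data, a bare port name for workflow inputs and outputs) can be told apart mechanically. The Python sketch below is illustrative only: DataName and parse_data_name are hypothetical names, and it assumes that processor names never contain a colon.

```python
from typing import NamedTuple, Optional


class DataName(NamedTuple):
    processor: Optional[str]  # None for workflow inputs and outputs
    port: str


def parse_data_name(name: str) -> DataName:
    """Split a node name into its processor and port parts."""
    # Split on the first colon only (assumption: processor names have no colon).
    processor, sep, port = name.partition(":")
    if not sep:
        # No colon: the name is a bare port name, i.e. a workflow input or output.
        return DataName(processor=None, port=name)
    return DataName(processor=processor, port=port)


print(parse_data_name("getImageLinks:imageLinks"))
print(parse_data_name("todaysDilbert"))
```

Note that the name alone does not say whether a bare port is an input or an output; in the graph that distinction is carried by the node colour and shape.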

Every time a node is selected, the corresponding data is displayed on the right-hand side of the window:
datalineage-dilbert.png

Again, you might need to adjust the window in order to display the data properly.

Failures

Data associated with failed processor runs is marked in red:
datalineage-failure.png

Similar Data

Clicking the Similar Data button gives a list (in reverse chronological order) of all data similar to the selected one. At the moment the notion of similarity is purely syntactic, i.e. we look for data with exactly the same (complete) name:
datalineage-similar.png
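
The name-based notion of similarity can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the LogBook implementation: DataRecord, its fields and similar_data are hypothetical, each record is assumed to carry a creation timestamp, and the ordering is read as most recent first.

```python
from datetime import datetime
from typing import List, NamedTuple


class DataRecord(NamedTuple):
    name: str          # complete name, e.g. "getImageLinks:imageLinks"
    workflow_run: str  # hypothetical identifier of the run that produced the data
    created: datetime  # hypothetical creation timestamp


def similar_data(name: str, records: List[DataRecord]) -> List[DataRecord]:
    """Return the records whose complete name equals `name`, most recent first."""
    matches = [record for record in records if record.name == name]
    return sorted(matches, key=lambda record: record.created, reverse=True)


# Made-up records, for illustration only.
records = [
    DataRecord("getImageLinks:imageLinks", "run-1", datetime(2007, 6, 10, 9, 0)),
    DataRecord("todaysDilbert", "run-1", datetime(2007, 6, 10, 9, 1)),
    DataRecord("getImageLinks:imageLinks", "run-2", datetime(2007, 6, 12, 16, 0)),
]
for record in similar_data("getImageLinks:imageLinks", records):
    print(record.workflow_run, record.created.isoformat())
```

Because the comparison is on the complete name, data from a port with the same name on a different processor does not count as similar.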

Double-clicking an item in the list opens a new data lineage visualisation window for the selected data, together with its workflow run:
datalineage-similar-messy.png

You can then arrange this window as before and compare it with the previous one:
datalineage-comparison.png

Source Code

Anonymous CVS:

  • Host: cvs.mygrid.info
  • Path: /usr/local/cvs/mygrid
  • Module: datalineage

Author

Acknowledgements

Thanks to the other members of the myGrid team.

Code debugged using the YourKit Java Profiler.
