Taverna 2

DataflowOutputPortImpl is not thread safe

Details

  • Description:
    WARN  2010-02-01 11:14:06,032 (net.sf.taverna.t2.workflowmodel.processor.dispatch.layers.Invoke:196) - Failed (INVOCATION) invoking Local service org.embl.ebi.escience.scuflworkers.java.StringConcat for job DispatchJobEvent facade1071:Workflow1:Workflow19:invocation84429:facade2133:Workflow19:Concatenate_two_strings[]: Uncaught exception while invoking Local service org.embl.ebi.escience.scuflworkers.java.StringConcat
    java.lang.NullPointerException
            at net.sf.taverna.t2.workflowmodel.impl.DataflowOutputPortImpl$1.receiveEvent(DataflowOutputPortImpl.java:65)
            at net.sf.taverna.t2.workflowmodel.impl.BasicEventForwardingOutputPort.sendEvent(BasicEventForwardingOutputPort.java:70)
            at net.sf.taverna.t2.workflowmodel.impl.ProcessorOutputPortImpl.receiveEvent(ProcessorOutputPortImpl.java:54)
            at net.sf.taverna.t2.workflowmodel.impl.ProcessorCrystalizerImpl.jobCreated(ProcessorCrystalizerImpl.java:66)
            at net.sf.taverna.t2.workflowmodel.impl.AbstractCrystalizer.receiveEvent(AbstractCrystalizer.java:88)
            at net.sf.taverna.t2.workflowmodel.impl.ProcessorImpl$2.pushEvent(ProcessorImpl.java:143)
            at net.sf.taverna.t2.workflowmodel.processor.dispatch.impl.DispatchStackImpl$TopLayer.receiveResult(DispatchStackImpl.java:277)
            at net.sf.taverna.t2.workflowmodel.processor.dispatch.layers.Parallelize.receiveResult(Parallelize.java:165)
            at net.sf.taverna.t2.workflowmodel.processor.dispatch.AbstractDispatchLayer.receiveResult(AbstractDispatchLayer.java:84)
            at net.sf.taverna.t2.workflowmodel.processor.dispatch.AbstractErrorHandlerLayer.receiveResult(AbstractErrorHandlerLayer.java:136)
            at net.sf.taverna.t2.workflowmodel.processor.dispatch.AbstractErrorHandlerLayer.receiveResult(AbstractErrorHandlerLayer.java:136)
            at net.sf.taverna.t2.workflowmodel.processor.dispatch.layers.Invoke$InvokeCallBack.receiveResult(Invoke.java:300)
            at net.sf.taverna.t2.activities.beanshell.BeanshellActivity$1.run(BeanshellActivity.java:181)
            at java.lang.Thread.run(Unknown Source)

    due to: DataflowActivity doing:

    facade.addResultListener(new ResultListener() {
        int outputPortCount = dataflow.getOutputPorts().size();

        Map<String, T2Reference> outputData = new HashMap<String, T2Reference>();

        public void resultTokenProduced(WorkflowDataToken dataToken, String port) {
            if (dataToken.getIndex().length == 0) {
                outputData.put(port, dataToken.getData());
                synchronized (this) {
                    if (--outputPortCount == 0) {
                        callback.receiveResult(outputData, dataToken.getIndex());
                        facade.removeResultListener(this);
                    }
                }
            }
        }
    })

    Notice that removeResultListener is called after receiveResult, meaning that it runs after the output has been pushed up the stack, by which time the next iteration of the nested workflow might already have started.

    As net.sf.taverna.t2.workflowmodel.impl.DataflowOutputPortImpl does not synchronize access to its resultListeners field, it does not help that WorkflowInstanceFacadeImpl synchronizes when calling addResultListener or removeResultListener: those calls go down to add/remove on the dataflow port, which also internally does:

    WorkflowDataToken newToken = token.popOwningProcess();
    sendEvent(newToken);
    for (ResultListener listener : resultListeners.toArray(new ResultListener[] {})) {
        listener.resultTokenProduced(newToken, this.getName());
    }

    This can occur with iterations over nested workflows, for instance when running the workflow in T2-1124 with the T2-1124 fix applied.
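
    One plausible reading of the NullPointerException, assuming resultListeners is a plain ArrayList (the report does not say): toArray() runs in one thread while another thread removes a listener, so the copied array can end up with a null slot. The following is a minimal sketch of a port whose listener list is synchronized. It is not the actual DataflowOutputPortImpl code; SketchResultListener, SketchOutputPort, receiveToken and listenerCount are hypothetical names, and the token is reduced to a String.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for Taverna's ResultListener; the token is just a String here.
interface SketchResultListener {
    void resultTokenProduced(String token, String portName);
}

// Hypothetical, simplified output port; not the real DataflowOutputPortImpl.
class SketchOutputPort {
    private final String name;

    // add, remove, size and toArray on this wrapper all lock the same mutex.
    private final List<SketchResultListener> resultListeners =
            Collections.synchronizedList(new ArrayList<SketchResultListener>());

    SketchOutputPort(String name) {
        this.name = name;
    }

    void addResultListener(SketchResultListener listener) {
        resultListeners.add(listener);
    }

    void removeResultListener(SketchResultListener listener) {
        resultListeners.remove(listener);
    }

    int listenerCount() {
        return resultListeners.size();
    }

    void receiveToken(String token) {
        // The snapshot is taken under the list's lock, so it cannot observe a
        // half-finished add or remove. Listeners are then called without the
        // lock held, which lets a listener remove itself during the callback.
        SketchResultListener[] snapshot =
                resultListeners.toArray(new SketchResultListener[0]);
        for (SketchResultListener listener : snapshot) {
            listener.resultTokenProduced(token, name);
        }
    }
}
```

    A java.util.concurrent.CopyOnWriteArrayList would be another option under the same assumptions, since iterating it never sees concurrent modifications.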

  1. as.t2flow
    (45 kB)
    Stian Soiland-Reyes
    2010-02-01 11:28

Issue Links

Activity

Stian Soiland-Reyes added a comment - 2010-02-01 11:28

Attached workflow never made it past iteration 1070 in the nested workflow.

Stian Soiland-Reyes added a comment - 2010-02-01 14:06

Temporarily fixed together with T2-1124 for the 2.1.1 patched release (in DataflowActivity).

Real fix to be done in DataflowOutputPortImpl.

Stian Soiland-Reyes added a comment - 2010-02-01 16:10 - edited

The DataflowActivity patch for 2.1.1 (on branch dataflow-activity-1.0.1-T2-1124) avoids the issue by not removing the result listener (which will add another T2-1135 leak, but that's OK as we're not fixing T2-1135 for 2.1.1 anyway).

The real fix is in WorkflowOutputportImpl (for 2.1.2), where the listener list is now synchronized.

Hudson Daemon added a comment - 2010-02-04 16:08

Integrated in net.sf.taverna.t2.security #423
T2-698 T2-T2-1094 T2-1124 T2-1127 T2-1129 T2-1133 T2-1134 T2-1143 T2-1145 Updated version numbers to 1.0.1 instead of 1.0.1-SNAPSHOT
T2-698 T2-T2-1094 T2-1124 T2-1127 T2-1129 T2-1133 T2-1134 T2-1143 T2-1145 patches for 2.1.1

Alan Williams added a comment - 2010-03-22 09:58

Needs to be checked for 2.1.2.

Alan Williams added a comment - 2010-03-22 09:58

Needs to be checked


People

Dates

  • Created:
    2010-02-01 11:22
    Updated:
    2010-03-23 15:10
    Resolved:
    2010-03-22 09:58