
Note: This is the myGrid team's understanding of, and thoughts about, a caGrid use case involving secure services with Taverna and GT4. This is not "The t2 security solution", but a viewpoint and a concrete example of what the solution should be able to do.

From Ravi Madduri to taverna-hackers on 2008-03-19:

This is very interesting discussion for us trying to use Taverna in caBIG land where we need taverna's remote execution service to be able to invoke a Grid Service as a user who started the workflow and not as the user who the taverna remote execution service is running as. We want to accomplish this independent of myProxy and at the same time not send user's private key or password on wire to the services.  The following is a rough outline of our thinking:

Scenario 1:
The Taverna UI (which is running on user's machine) has a security agent that holds user's delegated proxy credentials. Users create a workflow and the corresponding grid service processor puts the appropriate WS-Security things from the delegated proxy into the soap message and invokes the secure service as the user. The secure service authorizes and authenticates the user.
Scenario 2:
This gets a bit more complicated, as it involves offloading the workflow execution to a remote execution service running on a different machine. All the Taverna users for a VO connect to it and submit their workflows to the execution service, which runs as a general user. This way the user can shut down his laptop and reconnect to the execution service later to find the status of his 24-hour-long workflow, without having to keep his machine up all the time, and can also offload computationally intensive workflows to a dedicated machine. Security becomes a bit tricky here. The user agent (in this case Taverna) delegates the user's credentials (proxy) to a delegation service on the same host as Taverna's remote execution service. The Taverna processor is authorized to get the credential from the service and then invoke the services in the workflow on behalf of the user who submitted the workflow (using the user's credentials along with all the attributes etc). The user agent can be notified of the status of the workflow by the remote execution service, and the user can also access intermediate data etc. by authorizing himself to the remote service using his credentials.

Ideally, we want both of these scenarios supported in Taverna and are willing to work with anybody who is interested in these things

We discussed this specific scenario with Mike Jones on 2008-03-28, and in the end came up with this diagram:
 
 
We identified the key features:

  • Taverna workflows can take a long time to run (more than 12 hours) and should be run remotely through the Taverna service
  • The Taverna workbench should be able to submit a job and then detach (as of today with the Taverna service)
  • As there might be many of these workflows, they should be executed on the grid
  • These grid jobs (executing the workflow) should be run as the user who submitted them, for accounting and security reasons
  • The workflow itself needs to be able to access various secured GT4 services, using the credentials of the user who submitted the workflow for execution

Essentially the idea here is that we have the Taverna Workbench GUI that can design the workflow and submit it for execution to a modified version of the Taverna Remote Execution Service (T-REX). This is done using the existing REST interface, but over HTTPS, and authenticated using the user's original X.509 certificate issued by the relevant CA (step 1 on the diagram above). This step contains 3 sub-steps:

  • 1.a The user requests authentication for job submission from the Taverna Service
  • 1.b The Taverna Service asks the user for a proxy certificate L1-P1 (level 1 proxy certificate 1) by sending back a CSR
  • 1.c The user sends back the "L1-P1" proxy generated from their original X.509 certificate, and the Taverna Service stores it together with the submitted workflow and job description. At this point, the Taverna Workbench can disconnect (the user can turn off the laptop) and come back later to ask the Taverna Service about the job's progress.

This proxy certificate would be generated and delegated by the standard challenge-response mechanism.

While the Taverna Service can have many jobs in its queue at once, it will associate the proxy certificate L1-P1 with that particular workflow job. It will use this proxy certificate to submit a grid job to GRAM (Grid Resource Allocation and Management) over HTTPS (step 2 in the diagram). The job in question is the actual execution of the workflow: basically the "worker" process of the Taverna Service, which itself only launches the job rather than executing it. To do so, the Taverna Service will have to create a derived level 2 proxy certificate L2-P1 from L1-P1. It was also suggested to use the GT2 GRAM service rather than GT4 WS-GRAM.

The worker, when running as a grid job, will have access to proxy L2-P1 in a magic file, located through GT4 environment variables. The worker, i.e. the Taverna runtime, will add this to its keystore and will be able to use this proxy certificate to access secured grid web services WS1, WS2, etc. from caGrid.
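As a rough illustration of the "magic file" lookup, the sketch below resolves the proxy file path from the environment. It assumes the usual Globus convention of an X509_USER_PROXY variable with a fallback of /tmp/x509up_u<uid>; whether the GRAM job environment in this use case exposes exactly that variable is an assumption, and the function name find_proxy_file is made up for this example.

```python
from pathlib import Path
from typing import Mapping


def find_proxy_file(env: Mapping[str, str], uid: int) -> Path:
    """Return the path where the worker would expect to find its proxy.

    Assumption: the job environment follows the common Globus convention
    (X509_USER_PROXY, else /tmp/x509up_u<uid>). This only resolves the
    path; it does not read or validate the certificate.
    """
    explicit = env.get("X509_USER_PROXY")
    if explicit:
        return Path(explicit)
    return Path("/tmp") / f"x509up_u{uid}"
```

On a Unix worker this might be called as find_proxy_file(os.environ, os.getuid()); loading the file into the runtime's keystore is a separate step not shown here.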

Additionally, the worker will use the proxy L2-P1 to authenticate itself back to the Taverna Service using REST over HTTPS (step 3 in the diagram), for example to fetch the workflow and to send back progress reports and results.

If the workflow accesses a service that itself needs to access other services, such as WS3 in the diagram, which will internally communicate with WS4, then WS3 will in turn send a CSR back to the worker, which will then create a level 3 proxy certificate L3-P1 based on L2-P1.
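The delegation chain described so far (L1-P1 held by the Taverna Service, L2-P1 by the worker, L3-P1 by WS3) can be modelled with a small toy sketch. It contains no real cryptography: the Proxy class and its derive method are hypothetical names for this example. It also assumes that a derived proxy is not useful beyond its issuer's expiry (the whole chain has to be valid when it is checked), so the child's lifetime is capped at the parent's.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass(frozen=True)
class Proxy:
    """Toy stand-in for a proxy certificate; holds no key material."""
    name: str
    level: int
    not_after: datetime
    issuer: Optional["Proxy"] = None

    def derive(self, name: str, lifetime: timedelta, now: datetime) -> "Proxy":
        """Issue a child proxy one level down, capped at this proxy's expiry."""
        if now >= self.not_after:
            raise ValueError(f"{self.name} has expired and cannot delegate")
        child_expiry = min(now + lifetime, self.not_after)
        return Proxy(name, self.level + 1, child_expiry, issuer=self)

    def chain(self) -> List[str]:
        """Names from this proxy back up to the level 1 proxy."""
        names: List[str] = []
        node: Optional["Proxy"] = self
        while node is not None:
            names.append(node.name)
            node = node.issuer
        return names


now = datetime(2008, 3, 28, 9, 0)
l1 = Proxy("L1-P1", 1, now + timedelta(hours=12))   # held by the Taverna Service
l2 = l1.derive("L2-P1", timedelta(hours=24), now)   # handed to the worker via GRAM
l3 = l2.derive("L3-P1", timedelta(hours=2), now)    # issued to WS3 on request
print(l3.chain())
```

In this model L2-P1 asks for 24 hours but only gets the 12 hours that L1-P1 has left, which is the situation that leads to the expiry problem discussed next.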

Problems may arise if the workflow takes a long time to execute: by then L2-P1 (inside the GRAM job execution environment) might have expired, so the worker needs to request a new proxy certificate. The request goes upstream to the Taverna Service, and the workflow is paused in the meantime. If the Taverna Service's level 1 proxy certificate L1-P1 is still valid, and will remain valid for long enough, a new level 2 proxy L2-P2 can be created immediately. If L1-P1 has expired, or is about to expire, we will have to wait for the Taverna Workbench to come back. The Taverna Workbench, when next connected, would see that there is a certificate request, and would then use the user's original credential to issue another level 1 proxy L1-P2 and pass it back to the Taverna Service, which will generate another level 2 proxy L2-P2 so that the worker can resume the job. The worker picks up this new L2-P2 proxy and will be able to create a level 3 proxy L3-P2 for WS3 to complete the workflow.
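The renewal decision in the paragraph above can be summarised as a small sketch. The names plan_renewal and Renewal, the needed parameter (how much remaining L1 lifetime counts as "long enough") and the 10-minute margin are all assumptions made for illustration; the discussion did not fix any of these values.

```python
from datetime import datetime, timedelta
from enum import Enum


class Renewal(Enum):
    NOT_NEEDED = "L2 proxy is still valid"
    DERIVE_FROM_L1 = "Taverna Service derives a new L2 proxy from L1"
    WAIT_FOR_WORKBENCH = "pause until the Workbench issues a new L1 proxy"


def plan_renewal(
    now: datetime,
    l2_expiry: datetime,
    l1_expiry: datetime,
    needed: timedelta,
    margin: timedelta = timedelta(minutes=10),  # hypothetical safety margin
) -> Renewal:
    """Decide how the worker's level 2 proxy should be renewed."""
    if l2_expiry - now > margin:
        return Renewal.NOT_NEEDED
    # L2 has expired or is about to: can L1 still cover what the job needs?
    if l1_expiry - now >= needed:
        return Renewal.DERIVE_FROM_L1
    return Renewal.WAIT_FOR_WORKBENCH
```

The last branch is the expensive one, since the workflow stays paused until the user's Workbench (or another security agent) reconnects.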

Alternatively, or later if you want, we assume we have The Magic P2P FrameWork (tm), which allows multiple security agents to be associated with each other in a peer group. The security agent of the Taverna Service running on the grid will send out a request for a new proxy certificate to the peer group. The security agent of the user who launched the Taverna Workbench, if connected, is in the peer group and will be able to see the request and provide the new proxy certificate. So the agent will have the additional capability of generating proxy certificates on request.

There could then be other security agents in play as well, for instance on a mobile phone. The mobile phone would also detect the request from the peer group, and then beep the user and ask him if he's OK with giving out a new proxy certificate based on the client certificate stored "safely" (assuming bluetooth is turned off!) on the phone. The new proxy certificate is sent back using the (presumably secure) P2P framework.

Now in these settings of course we would need to ensure that the request does actually come from the Taverna Service we think it is, and not just some evil intruder that managed to infiltrate the peer group. We assume that The P2P FrameWork will handle all of this, but additional security checks might eventually be needed.

Once this P2P thingie is working, Tom's initial plan can be used for those web services that use WS-Security. In this case we can instead take the SOAP message that is to be sent and announce it to the P2P group, asking for someone to sign it for us. The mobile phone or Taverna Workbench will then sign the SOAP message and send it back to the Taverna Service, which can then send it to the web service.

This makes it possible to have an enactor that does not require proxy certificates, for maximum paranoia. The message signing would not be sufficient in the WS3->WS4 case, but the enactor could in that case forward the proxy certificate request directly from WS3 to the peer group. As the private key for the CSR is kept inside WS3, the enactor would not be able to "abuse" that proxy certificate. The client signing the CSR would also be able to verify that the request comes from the expected web service (or at least the expected container!).

Remarks

With the t2 security agents and a secure p2p framework there is no inherent need to enact the workflow on the grid itself. It might in fact be a waste of resources to do so. The workflow could be enacted anywhere, with no privileges except network access. However, the workflow needs to run somewhere, and in this use case, as far as we understand, caGrid wants to run the workflow as the same user as the one submitting the workflow job. The easiest way to do this is through the existing grid job submission system.

myProxy would be able to avoid the expiration problems. It is not considered here because (as far as we know) it is not currently a part of caGrid's GT4 stack.

The use of "magic files" is mentioned here as one way to retrieve the proxy certificate. Although it would be GT4-specific, it could be implemented as part of a GT4 add-on.
This does not mean that it is the only way to do it; in fact it would not be the recommended solution. Our ideal solution is the security agents over a secure p2p framework.

Security agents over a secure p2p framework do of course come with their own challenges. For instance, the "secure" bit should not be any weaker than what is required by the grid in question (otherwise the system would not be allowed by the grid administrator). Additionally, there are ideal web services (where one could do message signing in the security agent, and the message itself is neither sensitive nor too large to be sent to the agent) and more demanding web services, such as WS3 in this picture, where the security agent would also have to be able to participate in a proxy certificate delegation.

Stephen Langella to taverna-hackers on 2008-04-07:

I think we should consider scenario 2. I think the important things that the UI will need to be able to handle are:

1) Allow user to login via Authentication Service / Dorian
2) Delegate credential to the caGrid's Credential Delegation Service (CDS). It is important to note that the CDS does not need to be running in the same container.
3) Submit the workflow to the remote execution service.
4) Monitor the status of the workflow.

The remote execution service will need to support the ability to obtain a user's credential from the CDS and use it to invoke the services in the workflow.



