Abstract

Data Transfer Service (DTS) is an open-source project that is developing a document-centric message model for describing a bulk data transfer activity, with an accompanying set of loosely coupled and platform-independent components for brokering the transfer of data between a wide range of (potentially incompatible) storage resources as scheduled, fault-tolerant batch jobs. The architecture scales from small embedded deployments on a single computer to large distributed deployments through an expandable ‘worker-node pool’ controlled through message-orientated middleware. Data access and transfer efficiency are maximized through the strategic placement of worker nodes at or between particular data sources/sinks. The design is inherently asynchronous, and, when third-party transfer is not available, it side-steps the bandwidth, concurrency and scalability limitations associated with buffering bytes directly through intermediary client applications. It aims to address geographical–topological deployment concerns by allowing service hosting to be either centralized (as part of a shared service) or confined to a single institution or domain. Established design patterns and open-source components are coupled with a proposal for a document-centric and open-standards-based messaging protocol. As part of the development of the message protocol, a bulk data copy activity document is proposed for the first time.

1. Introduction

Data management is an increasingly pressing e-research challenge, especially in distributed and heterogeneous environments where different data sources and sinks do not provide compatible interfaces for third-party transfers. Data access, transport, management and repository services are also now widely recognized as key infrastructure requirements for supporting and enabling research. Reflecting this, the Australian National Collaborative Research Infrastructure Strategy (http://ncris.innovation.gov.au/Pages/default.aspx), through the Australian Research Collaboration Service (ARCS; http://www.arcs.org.au/), and OMII-UK (http://www.omii.ac.uk) are providing funding support for a collaborative project (i) to contribute towards the standardization of a bulk data copy activity document for describing the transfer of data between multiple (and different) data sources and sinks and (ii) to develop an implementation based on a set of loosely coupled, platform-independent components that aim to provide an efficient data transfer capability that can be scaled according to circumstance. Fundamentally, these components enable data access and transfer between a variety of different and incompatible storage resources through their strategic deployment as remotely callable third-party agents. The software is intended to be deployed and re-used in a variety of ways, from large-scale (distributed) deployments to small-scale (embedded) deployments for locally enabling applications. Usage is to include data copying onto and off a researcher’s desktop, onto and off the grid and into and out of different repositories. While the initial end-user focus of the project (http://code.google.com/p/dtsproject/) is the major facility user communities, the service is intended to be of benefit more generally to the wider research community in Australia, the UK and beyond.

In this paper, we present the progress made to date in developing a prototype service implementation. A core project principle is to re-use recognized patterns and established technologies. The design adopts a document-centric message-orientated middleware (MOM) model. We aim to base the underlying messaging protocol on open standards, and an assessment of the relevant open standards leads to a proposal for a new bulk data transfer activity document.

2. Requirements

A common dilemma in distributed and heterogeneous environments is how to remotely invoke and schedule the copying of data from the point of generation to different resources for further processing, analysis and archiving. To do this effectively across different domains, the data transfer activity has to be described in a generic and extensible message that is agnostic of the underlying data resource interface(s). Often, the protocols supported by different data sources and sinks, such as GridFTP (Allcock et al. 2001), SRB/iRODS (Rajasekar et al. 2003), SRM (Sim & Shoshani 2007), SFTP, FTP, WebDAV, HTTP(S) and local files, are not the same or mutually compatible. While some facilities and research institutions will have connectivity and ready access to grid infrastructure and services, many will not. Copying potentially large quantities of data in a scheduled and scalable manner between different types of resources remains a key challenge. As a motivating example, instrument data generated and stored on a local file system as part of an experiment at the Australian Synchrotron Facility or the UK Diamond facility will probably need to be moved or copied to a remote compute resource for analysis, or for safe-keeping within a repository at (or serving) the researcher’s home institution.

Currently, a small number of applications can traverse differing domains and broker data between different sources and sinks by providing an intermediary bit-pipe between file input/output streams (figure 1) or by executing get/put sequences. The transfer process is usually performed synchronously and is manually orchestrated. We discuss examples of these applications in §3. This approach suffers from several disadvantages. Firstly, the client requires access to both the source and sink, which may not always be possible if the client is remote. Secondly, the intermediary must be available and constantly ‘online’ throughout the duration of the transfer process. Thirdly, the buffering of large quantities of data introduces bandwidth and concurrency concerns for clients residing on networks with limited capability (e.g. wireless connectivity) relative to the direct connectivity between the source and the sink.

Figure 1.

Data copying between heterogeneous resources when third-party transfer is not available. An intermediary is required to broker/buffer bytes between incompatible data sources and sinks.

In scenarios such as experiments undertaken at synchrotron facilities, with datasets ranging up to several hundred gigabytes and a multitude of experiments generating data on local file servers, synchronously buffering data through a client application may not be possible and is not scalable or efficient. A better approach would be to asynchronously invoke an intermediary with a ‘fire and forget’ (batch) data copy activity and strategically place the intermediary wherever it can best access the data, for example within a particular network or close to a particular source/sink. This would follow the general best practice of processing data as close as possible to where it resides. Being able to schedule this activity as an ‘off-peak’ low-priority task would be beneficial, for instance, where there may be competing network services. In addition, being able to invoke the transfer activity asynchronously would enable automatic (pseudo-) real-time transfers from a data generator to a repository, as the data are being produced.

3. Existing technologies and services

Existing technologies do go some way to providing the sought capabilities. Software libraries exist that are able to broker data across different and incompatible protocols, such as the platform-independent (Java) Apache Commons Virtual File System (VFS; http://commons.apache.org/vfs/), which abstracts clients for a range of different resources behind a common API. This open-source library provides support for a growing number of transfer protocols and file systems, including local files, FTP, SFTP, HTTP(S) and WebDAV. Cross-protocol data transfer is performed using a memory buffer that translates between file input and output streams, or as a third-party transfer when possible (e.g. between two GridFTP servers). Being extensible, individual protocol support can be developed and integrated as required. A recent collaboration between ARCS, STFC and OMII-UK has driven the development of VFS plugins for GridFTP, SRB and iRODS. The open-source Commons VFS Grid project (http://sourceforge.net/projects/commonsvfsgrid/) provides these as a single implementation.

The VFS library is used within a number of applications such as Hermes, a desktop GUI tool for ‘drag-n-drop’ data transfer (http://sourceforge.net/projects/commonsvfsgrid/, http://wiki.arcs.org.au/bin/view/Main/HermeS), the National Grid Service (http://www.ngs.ac.uk) applications portal (http://portal.ngs.ac.uk) and GridSAM (McGough et al. 2008). Similarly, the VBrowser (Glatard et al. 2008) data management tool, being developed as part of the Virtual Laboratory for e-Science project (http://www.vl-e.org/), provides an explorer-like interface to GridFTP, SSH-FTP, SRM, LFC (LHC File Catalogue) and SRB resources using a similar virtual file system approach. The Java Universal Explorer (JUX; http://forge.in2p3.fr/wiki/jux) is another desktop application providing an explorer-like interface for drag-n-drop data transfer and is underpinned by JSAGA (Reynaud 2010). Developed by IN2P3, JUX supports access to local file, GridFTP, SSH, SRB, iRODS, SRM, HTTP and ZIP resources.

When third-party transfer is not available, applications such as these broker/buffer bytes via the client application itself when copying between remote data resources. These applications, therefore, are still subject to the same bandwidth, concurrency and scalability limitations described earlier, when buffering bytes directly through a single intermediary.

Services, rather than client applications, provide a more scalable and efficient approach for copying data between distributed and heterogeneous resources. Globus is currently developing Software as a Service (SaaS) infrastructure with an initial focus on file transfer (http://www.globus.org/service/). The service is being built as a ‘scale-out’ Web application to be hosted by Globus, using Amazon Web Services and taking a Web 2.0 approach. Replacing the Globus Reliable File Transfer (RFT) service (Madduri et al. 2002), the new service will initially support ‘fire and forget’ transfers between HTTP, SCP and GridFTP resources.

The gLite FTS (Kunst et al. 2006) is built on the layered architecture of gLite, providing a capability for transferring large amounts of data sustainably over extended periods of time in a production environment (primarily using GridFTP). The key components are a scheduler, a service queue with an associated schema, service agents and channels defined by the source and sink endpoints. The service queue is implemented as a set of tables in a relational database.

Stork (Kosar & Livny 2004; Kosar & Balman 2009) works alongside Condor (Thain et al. 2005) to provide a data placement scheduler for bulk data transfer jobs, which are treated as fault-tolerant batch processes that can be queued, scheduled, monitored, managed and check-pointed. Stork supports transfers between local file systems, GridFTP, FTP, HTTP, SRB, NeST and SRM. The C-based scheduler uses the ClassAd job description language to represent data placement jobs. Users can add support for their preferred storage system, protocol or middleware. Stork offers a comprehensive service that scales for large data volumes.

Services such as Stork, gLite FTS and the Globus SaaS project define the leading edge in the development of software for efficient and schedulable data transfer in distributed and heterogeneous environments. Nevertheless, as stated by Kosar & Balman (2009), these applications and their successors represent initial steps towards data-intensive computing and data-aware batch scheduling. This is a newly emerging paradigm in which many challenges remain, including the emergence of standards for describing data transfer/copying activities. Our goal is to complement these developments by providing a set of loosely coupled, lightweight and platform-independent (Java) components that can be composed and deployed in a wide variety of service settings. The Data Transfer Service (DTS) architecture, described below, is designed to scale from self-contained ‘data transfer service in a box’ deployments (i.e. by packaging and abstracting a range of different platform-independent clients) through to ‘heavy-lifting’ enterprise deployments utilizing distributed resources. Recognizing the benefits of service interoperability, we seek to base the underlying messaging protocol on open standards through the design of a bulk data copy/transfer activity document.

4. Open-standards review

Open standards offer service interoperability and longevity benefits (Cerf 2008). At the time of writing, we are aware of two in-scope Open Grid Forum (OGF) standards (http://www.ogf.org). A review of these standards undertaken with the participation of representatives from each indicates that neither sufficiently meets the requirements for a bulk data copy activity. This is not surprising, nor is it a criticism, as both have been specifically designed to address the requirements of their particular contexts, which only partially overlap with the context of the work being described here.

(a) JSDL data staging and the HPC file staging profile

The Job Submission Description Language (JSDL; Anjomshoaa et al. 2005) standard provides a middleware-agnostic schema for describing a single compute activity that includes data staging. Data are ‘staged in’ from a source location to a compute resource prior to starting a job and are ‘staged out’ to a target following job completion. All staging operations require the definition of a file ‘staging point’ that resides on the local file system of the compute resource. In JSDL, this is specified using the mandatory <FileName/> and optional <FilesystemName/> elements that are nested within each <DataStaging/> element. Using the staging model, it is possible to represent a copy operation between a source and a target by defining two separate <DataStaging/> elements, each with identical <FileName/> elements, as shown in figure 2.

Figure 2.

Example JSDL data staging element (a source data staging element for fileA and a corresponding target element for staging out of the same file). By specifying that the input file is deleted after the job has executed, this example simulates the effect of a data copy from one location to another through the staging host.
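To make the pairing concrete, a minimal sketch is given below (file names, URIs and namespace prefixes are illustrative only). The first <DataStaging/> element stages fileA in from the source and requests its deletion from the staging point when the job terminates; the second stages the same file out to the target.

    <jsdl:DataStaging>
      <jsdl:FileName>fileA</jsdl:FileName>
      <jsdl:CreationFlag>overwrite</jsdl:CreationFlag>
      <jsdl:DeleteOnTermination>true</jsdl:DeleteOnTermination>
      <jsdl:Source>
        <jsdl:URI>gsiftp://source.example.org/data/fileA</jsdl:URI>
      </jsdl:Source>
    </jsdl:DataStaging>
    <jsdl:DataStaging>
      <jsdl:FileName>fileA</jsdl:FileName>
      <jsdl:CreationFlag>overwrite</jsdl:CreationFlag>
      <jsdl:Target>
        <jsdl:URI>sftp://sink.example.org/archive/fileA</jsdl:URI>
      </jsdl:Target>
    </jsdl:DataStaging>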

This approach, however, is not without its limitations. Importantly, no explicit identifier is provided that correlates a source with a sink. Without proprietary extensions, the stage-in/stage-out model does not natively provide a means for a client to ‘drill down’ and monitor the status of separate staging operations. This is intentional, since JSDL is designed to describe a single activity that is atomic from the perspective of an external user.

In the context of bulk data copying, the intermediate file staging point is also redundant and adds unnecessary complexity, since each <Source/> needs to be linked to a corresponding <Target/> through a common <FileName/> (this can be problematic when copying many files). The file staging point would be a hidden implementation detail in bulk data copying.

One approach to remove the intermediary storage constraint, and to associate the source with the target, is to define both the source and the target within the same <DataStaging/> element, which is permitted in JSDL. In doing this, a proprietary ID attribute can be added to explicitly identify each staging operation. However, the HPC File Staging Profile (Wasson & Humphrey 2008), which is an extension to JSDL, limits the use of credentials to a single credential definition within a data staging element. Often, different credentials will be required for the source and the target. Alternatively, credentials could be nested within the <Source/> and <Target/> elements rather than within the <DataStaging/> element, but, at the time of writing, this possibility lacks an OGF profile.
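A rough sketch combining these ideas is given below; the id attribute, the ext: prefix and the nesting of credentials inside <Source/> and <Target/> are non-standard extensions shown purely for illustration.

    <jsdl:DataStaging id="copy-0001">
      <jsdl:FileName>fileA</jsdl:FileName>
      <jsdl:CreationFlag>overwrite</jsdl:CreationFlag>
      <jsdl:Source>
        <jsdl:URI>gsiftp://source.example.org/data/fileA</jsdl:URI>
        <ext:Credential><!-- credential for the source --></ext:Credential>
      </jsdl:Source>
      <jsdl:Target>
        <jsdl:URI>sftp://sink.example.org/archive/fileA</jsdl:URI>
        <ext:Credential><!-- credential for the target --></ext:Credential>
      </jsdl:Target>
    </jsdl:DataStaging>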

A further consideration is that there are some resource types for which additional properties will need to be specified to enable access. For example, ‘McatZone’ and ‘MdasDomain’ properties need to be specified in order to access files on an SRB resource (Rajasekar et al. 2003). Providing this kind of supplementary information requires additional element definitions. Further examples include elements to specify URI properties, file pattern selectors for copying only files/directories that have matched names or file extensions and other elements for specifying additional transfer requirements.

So while JSDL and its extensions go some way to satisfying the requirements of a bulk data copy activity, more is required to account for multiple files/directories from multiple locations, while avoiding the restriction of intermediary storage. Some of the additional elements and constructs sought are provided for in the OGSA Data Movement Interface (DMI) specification.

(b) OGSA DMI

The OGSA DMI (Antonioletti et al. 2008) defines a number of XML constructs for describing and interacting with a data transfer activity. The data source and destination are each described separately with a Data Endpoint Reference (DEPR), which is a specialized form of a WS-Addressing endpoint reference (Box et al. 2004). An example DEPR is provided in figure 3. DEPRs are intended to encapsulate all the information relating to how the source and the sink can be accessed, including the nesting of user credentials. In contrast to JSDL data staging, a DEPR facilitates the definition of one or more <Data/> elements within a <DataLocations/> element. This is used to define alternative locations for the data source and/or sink. An implementation is then free to select between its supported protocols and retry different source/sink combinations from the available list, which improves resilience and the likelihood of performing a successful data transfer. DMI also provides a number of other relevant and useful constructs. These include retry and error strategies, an information model for advertising supported protocols, a state model for interacting with and controlling the data transfer and a standard set of fault types.

Figure 3.

A pseudo-schema representation of a DMI data endpoint reference type (includes some newly proposed extension points).
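By way of illustration, a loose sketch of a source DEPR along these lines is shown below; the attribute names, URIs and placement of <Credentials/> are approximate and should not be read as the normative DMI schema.

    <wsa:EndpointReference>
      <wsa:Address>...</wsa:Address>
      <wsa:Metadata>
        <dmi:DataLocations>
          <dmi:Data ProtocolUri="..." DataUrl="gsiftp://source.example.org/data/fileA"/>
          <dmi:Data ProtocolUri="..." DataUrl="http://source.example.org/data/fileA"/>
        </dmi:DataLocations>
        <dmi:Credentials><!-- e.g. a WS-Security username token or proxy credential --></dmi:Credentials>
      </wsa:Metadata>
    </wsa:EndpointReference>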

There are limitations, however. Additional extension points are required within the DMI schema for the purpose of nesting supplementary information such as the URI properties elements and file pattern selectors discussed earlier (figure 3). More significantly, DMI is intended to describe only a single data transfer operation between one source and one sink. In order to accommodate several transfers, for example when copying several different files that reside in different directories or file systems, multiple invocations of a DMI service factory would be required to create multiple DMI service instances. Each instance is then responsible for managing a separate transfer. This is by design; DMI is intended to be a low-level protocol that would normally sit beneath higher level (user facing) services that would aggregate multiple transfers into a ‘bulk’ data copy activity.

5. Towards a new bulk data copy activity document

Our preference is for a single Document Message that describes multiple data transfer operations; herein referred to as a bulk data copy activity. A Document Message differs from a Command Message, which tells a receiver to invoke certain behaviour, and from an Event Message, which notifies a receiver of the occurrence of state changes. In defining a single (atomic) document message, the message packet can be ‘delivery transacted’ through a pipe and filter architecture (Hohpe & Woolf 2003). At the time of writing, a standard format for a document message that describes a bulk data copy activity appears not to have been adequately defined.

As neither JSDL nor DMI fully satisfies our requirements for a document message, enhancements to both are currently being discussed with representatives of the two standards. Recognizing that the evolution of standards is invariably a slow process, we have constructed a proprietary document message format and a control/event messaging protocol as an interim measure to allow the development of the DTS prototype. Described in more detail elsewhere (http://code.google.com/p/dtsproject/), the document message is currently a JSDL–DMI hybrid that adopts JSDL as the generic activity description language, with additional elements modified from DMI coupled with some newly proposed elements. In this section, we outline some design considerations for a document message that builds upon the JSDL–DMI hybrid approach. These are not intended to be comprehensive examples; rather, they are suggestive possibilities for consideration in future standardization efforts.

(a) Design considerations

The proposed document message format introduces a root <BulkDataCopy/> element that wraps one or more child <DataCopy/> elements. Each <DataCopy/> element specifies a single data copy operation by wrapping paired source and sink DEPR-type constructs modified from DMI that represent the data source and the destination. The <BulkDataCopy/> and <DataCopy/> elements specify user-defined xsd:ID attributes that uniquely identify each sub-copy operation within the document. This is required in order to report and control each sub-copy in subsequent control and event messages. A pseudo-schema example of the proposed data structure is given in figure 4.

Figure 4.

A pseudo-schema representation of a possible format for a new bulk data copy activity document (plus symbol defines one or more; question mark defines optional; asterisk defines zero or more).
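Using the same pseudo-schema conventions as the figure captions, the intended structure can be sketched roughly as follows; the <Source/>, <Sink/> and <TransferRequirements/> element names are placeholders for the modified DMI constructs rather than settled names.

    <BulkDataCopy id="xsd:ID">
      <DataCopy id="xsd:ID"> +
        <Source><!-- DEPR-type construct modified from DMI --></Source>
        <Sink><!-- DEPR-type construct modified from DMI --></Sink>
        <TransferRequirements> ... </TransferRequirements> ?
      </DataCopy>
    </BulkDataCopy>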

The OGF specifications cited thus far (JSDL and DMI) both apply an in-line approach to constructing XML documents. For example, each DEPR or JSDL data staging element specifies its own in-line <Credentials/>. This design choice partly reflects a consideration to simplify subsequent software development when marshalling, un-marshalling and validating XML. However, given the potential for element repetition, which is especially apparent in a bulk data copy activity, element references should be considered (even though this would diverge from the in-line approach currently adopted in these specifications). Common elements such as <Credentials/>, <TransferRequirements/> and source/sink definitions could optionally be listed as children of the <BulkDataCopy/> element, perhaps within a <Resources/> element as in JSDL. These resources could then be referenced from within <DataCopy/> and DEPR-type constructs using element ID references. In doing this, the amount of XML can be significantly reduced, which has positive implications for messaging and readability. A pseudo-schema example is provided in figure 5.

Figure 5.

Element referencing in a bulk data copy activity document rather than nesting elements ‘in-line’ (as used in JSDL and DMI). Common element repetition is eliminated.
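A rough sketch of this referencing style is given below; the <Resources/> grouping and the idref-style attributes are illustrative, and the precise referencing mechanism remains an open design question.

    <BulkDataCopy id="bulk-0001">
      <Resources>
        <Credentials id="cred-myproxy"> ... </Credentials>
        <TransferRequirements id="req-default"> ... </TransferRequirements>
      </Resources>
      <DataCopy id="copy-0001">
        <Source credentialsRef="cred-myproxy"> ... </Source>
        <Sink credentialsRef="cred-myproxy"> ... </Sink>
        <TransferRequirementsRef idref="req-default"/>
      </DataCopy>
      <DataCopy id="copy-0002">
        <!-- further copy operations referencing the same shared resources -->
      </DataCopy>
    </BulkDataCopy>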

At this point, it is important to distinguish between data copying and transfer; the latter requires deletion of the source data following successful completion of the copy. At present, we have only considered copying as per JSDL staging and DMI. Nevertheless, given the (potential) requirement for transfer, a <BulkDataTransfer/> root element would be more appropriate than the proposed <BulkDataCopy/> element. When copying data to the target, we currently integrate the JSDL <CreationFlag/> element within our modified DMI <TransferRequirements/> element to include ‘overwrite’, ‘dontOverwrite’ and ‘append’ options.
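For example (a sketch only; the comment indicates the kind of DMI-derived requirements that might sit alongside the flag):

    <TransferRequirements>
      <jsdl:CreationFlag>overwrite</jsdl:CreationFlag>
      <!-- other requirements, e.g. retry limits or scheduling windows derived from DMI -->
    </TransferRequirements>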

In figure 5, <DataCopy/> elements are un-ordered and define parallel copy tasks. If more elaborate task orchestration and sequencing become necessary, a number of further extensions should be considered. These could include defining optional attributes to assign task ordering, as suggested in JSDL (Anjomshoaa et al. 2005), or defining an extensible XML DAG-type construct within a bulk copy document to model more elaborate task execution; for example, copy, delete and other (e.g. unforeseen) tasks such as performing intermediary file transformations or memory allocation at the sink (Kosar & Balman 2009). These design considerations remain under review, as they considerably increase complexity and compelling use cases are still required.

6. Possible utilization of the proposed bulk data copy activity document

So far, we have excluded the wider scope of life cycle events and control messages. In this section, we briefly outline some suggestions for integrating the bulk copy document into the DMI execution life cycle and for nesting it within JSDL for use with OGSA BES. Again, these are intended as suggestive possibilities for consideration in future standardization efforts.

(a) OGSA DMI

For DMI, an extension to the Data Transfer Factory interface is suggested whereby a bulk data copy document is accepted as input (figure 6). Subsequent event and control messages would then need to be extended to account for each of the sub-copies, e.g. by correlating the identifiers of each <DataCopy/> element given in the initial request (message correlation). An example DMI state message with possible extensions is given in figure 7.

Figure 6.

Proposal for extending the DMI data transfer factory interface with a newly proposed method that accepts a bulk data copy document as a parameter.

Figure 7.

Proposal for extending the <dmi:State/> message to report the status of sub-copies (as part of a bulk data copy). The status of sub-copies is optionally given in the existing <dmi:Detail/> element.
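Such an extension might look roughly as follows; the representation of the DMI state value is simplified here, and the <DataCopyState/> element and its dataCopyRef attribute are hypothetical additions that correlate back to the <DataCopy/> identifiers in the original request.

    <dmi:State value="Transferring">
      <dmi:Detail>
        <DataCopyState dataCopyRef="copy-0001" value="Done"/>
        <DataCopyState dataCopyRef="copy-0002" value="Transferring"/>
        <DataCopyState dataCopyRef="copy-0003" value="Scheduled"/>
      </dmi:Detail>
    </dmi:State>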

(b) JSDL and OGSA BES

The OGSA Basic Execution Service (BES) profile (Foster et al. 2008) adopts JSDL as its generic activity description language. A JSDL document describes a single activity that is atomic from the perspective of an external user. At most one activity may be defined in a JSDL document, represented by the <jsdl:Application/> element. This element serves as a high-level generic container that is intended to hold more specific application definitions. By referencing or integrating a bulk data copy activity document from within JSDL, the extensible state and information models provided by BES may be exploited. Indeed, this may encourage the adoption of a bulk data copy activity document, given existing BES implementations such as OMII-UK’s GridSAM service (McGough et al. 2008), UNICORE (Rambadt et al. 2007), ARC (Ellert et al. 2007) and BES++ (Ruiz-Alvarez et al. 2008).

Viable approaches to delivering a bulk data copy document include embedding the document within the JSDL itself at an <xsd:any/> extension point (such as within the <jsdl:Application/> element), or staging the bulk data copy document in to the consuming system as a regular stage-in file. Both of these suggestions are illustrated in figure 8.

Figure 8.

Options for integrating the proposed <BulkDataCopy/> document within JSDL; (a) nesting as a new application type within the <jsdl:Application/> element or (b) staging-in of a <BulkDataCopy/> document as input for the named executable.
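Sketched loosely, the two options might take the following forms; namespace prefixes, file names and the argument convention for the copy agent are illustrative only.

    <!-- Option (a): nest the bulk copy document as a new application type -->
    <jsdl:Application>
      <BulkDataCopy id="bulk-0001">
        <DataCopy id="copy-0001"> ... </DataCopy>
      </BulkDataCopy>
    </jsdl:Application>

    <!-- Option (b): stage the document in as input for a named copy-agent executable -->
    <jsdl:Application>
      <jsdl-posix:POSIXApplication>
        <jsdl-posix:Executable>copyagent.sh</jsdl-posix:Executable>
        <jsdl-posix:Argument>bulkcopy.xml</jsdl-posix:Argument>
      </jsdl-posix:POSIXApplication>
    </jsdl:Application>
    <jsdl:DataStaging>
      <jsdl:FileName>bulkcopy.xml</jsdl:FileName>
      <jsdl:CreationFlag>overwrite</jsdl:CreationFlag>
      <jsdl:Source>
        <jsdl:URI>http://client.example.org/requests/bulkcopy.xml</jsdl:URI>
      </jsdl:Source>
    </jsdl:DataStaging>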

The BES state model is extensible and designed to support new sub-state specializations (Foster et al. 2008). A minimal implementation is therefore free to ignore the sub-states and recognize only the main BES states. The BES state model could therefore be exploited to account for the progress of a data copy activity by integrating the DMI life cycle states (akin to the JSDL HPC File Staging Profile; Wasson & Humphrey 2008). A first-pass bulk data copy specialization profile for BES, which integrates the main DMI life cycle states, is shown in figure 9; see Antonioletti et al. (2008) for a full description of the DMI life cycle.

Figure 9.

Example Bulk Data Copy Specialization Profile for OGSA BES which integrates DMI life cycle events as BES sub-states. Data copy activities can be held from progressing to their next state until a client resume request is issued. The main BES states are underlined while the DMI state specializations are in italics.
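For instance, a BES activity status for a running bulk copy might carry a DMI-derived sub-state as a child of the status element, along the following non-normative lines; the dts: prefix and element name are hypothetical.

    <bes:ActivityStatus state="Running">
      <dts:Transferring/><!-- DMI life cycle state surfaced as a BES sub-state -->
    </bes:ActivityStatus>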

7. DTS architecture

The DTS prototype is designed around MOM, and incorporates selected enterprise integration patterns described in Hohpe & Woolf (2003). Asynchronous messaging is known to be a powerful strategy for the integration of distributed and loosely coupled systems. It provides better encapsulation and more efficient and resilient remote communication than other application integration options such as file transfer, shared database and synchronous RPC (Hohpe & Woolf 2003). Loose coupling enables ready scalability and facilitates DTS usage in a variety of settings, from large-scale (distributed) deployments to small-scale (embedded) deployments on a single computer. We envisage three deployment scenarios: (i) a lightweight and stand-alone local worker–agent installation, (ii) a distributed deployment for a single worker pool, and (iii) a distributed deployment for multiple worker pools.

Scenario i. In the simplest scenario, illustrated in figure 10, our platform-independent worker–agent (implemented using Java, VFS and Spring Batch; see §7a) may be deployed locally on a single computer for direct application integration through a public API or command line. The worker–agent requires a valid bulk data copy document as input and generates event messages. In this scenario, the worker–agent is subject to the same bandwidth and concurrency limitations associated with streaming data through a single node (discussed earlier). This option is nonetheless attractive in providing a fault-tolerant and asynchronous batch process for streaming data between a range of different (and incompatible) resources (i.e. our ‘service-in-a-box’ scenario).

Figure 10.

Lightweight (local) worker–agent deployment. The worker–agent is invoked by a script or is directly integrated into an existing application via its API. P, the message producer; s, submit message (bulk copy activity document); c, control message; e, event message.

Scenario ii. To scale out, multiple workers can be replicated in a pool and subscribed to a job submission queue to distribute queued tasks (the Competing Consumers pattern). This is illustrated in figure 11. One worker pool would typically serve a single project or facility. We use point-to-point (JMS) message queues that provide durable and transactional channels for routing requests and subsequent event/control messages between the client and workers. Service demand can be evaluated by monitoring queue depth. New workers may be added and removed without affecting each other, which improves service resilience. Importantly, workers may selectively consume requests that match their selection criteria, using the Specifying Producer and Selective Consumer patterns. Through the coordinated use of message selectors, it is possible to strategically target particular workers (e.g. workers may be placed to serve a particular data source and/or sink, such as a specific beam-line server within a facility). A message scheduler is also introduced, whose role is to hold and forward messages according to the scheduling parameters defined within the request. Expired messages are forwarded to a dead message queue (the Dead Letter Channel pattern). A document-literal Web service interface and an associated job-store are also introduced so that a DTS can be shared by many (remote) clients.

Figure 11.

Single worker pool deployment. Message selectors are represented by the question marks. P, the message producer; C, message consumer (service activator); s, submit message (bulk copy activity document); c, control message; e, event message.
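As a purely illustrative sketch of selective consumption (the bean names, queue name and selector expression are hypothetical and do not reflect the actual prototype configuration), a worker’s listener container could be wired in Spring along the following lines:

    <!-- A worker consumes only submit messages targeted at its pool,
         using a JMS message selector on a client-set message property. -->
    <bean id="submitQueueListener"
          class="org.springframework.jms.listener.DefaultMessageListenerContainer">
      <property name="connectionFactory" ref="jmsConnectionFactory"/>
      <property name="destinationName" value="dts.submit.queue"/>
      <property name="messageSelector" value="workerPool = 'synchrotron-beamline-1'"/>
      <property name="messageListener" ref="workerAgentMessageListener"/>
    </bean>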

Scenario iii. A distributed deployment can be further extended through the use of a message router, which adds a higher-level routing capability to partition workers into multiple groups, e.g. serving different institutions or projects. This is shown in figure 12. Routing decisions are based on the existence of client-defined properties added to the message header (the Message Header Router pattern). Importantly, this passes routing control to the client, rather than delegating control solely to the worker as in the Selective Consumer pattern.

Figure 12.

A multiple worker pool deployment with message routers. Message selectors are represented by the question marks. P, the message producer; C, message consumer; s, submit message (bulk copy activity document); c, control message; e, event message.

The processing of a request is handled as follows:

  • A client posts a data transfer request and receives a handle for subsequent polling for status updates. The service validates the message against the schema.

  • User credentials, which are required for the worker nodes to authenticate to the data source and sink, are serialized within the request message (all communications require encryption and no tokens are persisted).

  • The Web service records the request (excluding authorization tokens), publishes the message to the job submission queue and records the queuing status.

  • The message is optionally scheduled and, at the requested time, routed to the first available worker node in the required pool.

  • The VFS-based worker node authenticates to the data source and sink, validates and scopes the job, and performs the data transfer on behalf of the client.

  • Status events originating from the worker (e.g. completion, step failures) are published to the event queue and are received and recorded by the DTS Web service.

  • Control messages from the client can be routed directly, via the control queue, to the recipient worker that is handling the job.

(a) DTS prototype: Java, JMS, VFS and Spring

Our prototype makes use of Spring Batch, Spring Integration and Spring Web Services (http://www.springsource.org/). Spring Batch provides the framework for our worker–agent, supplying functions that are essential in long-running batch processing, such as process splitting, monitoring and re-merging, logging/tracing, transaction management, job pause and restart, skip, retry and check-pointing. The worker–agent uses VFS to span differing resource protocols. Spring Integration provides a range of polling and message-driven channel adaptors to facilitate varying messaging styles and transport protocols between the worker node and the client. Spring Integration is attractive as it supports the majority of the enterprise integration patterns described in Hohpe & Woolf (2003). The DTS Web service uses Spring Web Services to facilitate a ‘contract-first’ service implementation, with marshalling/un-marshalling provided by Apache XMLBeans (http://xmlbeans.apache.org).
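As an indicative sketch only (the tasklet class and bean names are hypothetical, and the real worker–agent wiring is more involved), a restartable Spring Batch job wrapping a VFS-based copy step could be declared as follows:

    <!-- One restartable batch job whose single step delegates to a tasklet
         that uses Commons VFS to stream data from source to sink. -->
    <batch:job id="bulkDataCopyJob" restartable="true">
      <batch:step id="copyStep">
        <batch:tasklet ref="vfsCopyTasklet"/>
      </batch:step>
    </batch:job>

    <!-- Hypothetical tasklet implementing
         org.springframework.batch.core.step.tasklet.Tasklet; internally it would
         resolve source and sink FileObjects and call FileObject.copyFrom(...). -->
    <bean id="vfsCopyTasklet" class="org.dts.example.VfsCopyTasklet"/>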

The prototype currently relies on transport security (SSL) rather than message-level security. Grid Security Infrastructure support is provided via the MyProxy Credential Management Service (http://grid.ncsa.illinois.edu/myproxy).

(b) Realization in an existing implementation

While our proposed bulk data copy document and architecture form the basis for our prototype, the level of abstraction does not constrain us to a specific service implementation. For example, we could potentially apply the OGSA-BES/JSDL approach to existing infrastructure such as GridSAM or GRISU (http://www.arcs.org.au/products-services/systems-services/grisu).

GridSAM is an open-source computational job manager that provides an open-standards-based interface to a computational resource manager, such as Condor, PBS, LSF or Sun Grid Engine. It supports, among others, the HPC Basic Profile (i.e. OGSA-BES and JSDL; Dillaway et al. 2007) and the HPC File Staging Profile. GridSAM’s extensibility allows new functionality to be readily added, for example support for additional resource managers through its Distributed Resource Manager (DRM) mechanism. Bulk data copying could be achieved by simply supporting the proposed option (b) in figure 8, which would enable support across all existing DRMs and would only require support for the copyagent.sh executable. Alternatively, or in a future development, a more finely grained approach could be supported. Individual DRMs are composed of a number of stages to support each step of a job execution, from staging in data, to running the job, to staging out the resultant output data. Since each of these steps can be directed to encapsulated code extensions, micro-extensions to each existing DRM could be developed to support the DTS back-end infrastructure for the data staging aspects. This does not compromise existing functionality, and enables full support for the BES state model extensions. We plan to explore these possibilities in the future.

8. Conclusion

DTS is designed to scale from self-contained ‘data transfer service in a box’ deployments (by packaging and abstracting a number of platform-independent clients for a range of different resources), through to ‘heavy-lifting’ service deployments. In the simplest scenario, the worker–agent may be deployed as a locally available fault-tolerant batch tool for performing asynchronous data copying between different (potentially incompatible) resources. For heavy lifting, workers may be replicated and strategically located as remote third-party agents for scheduled execution.

We have proposed a new bulk data copy activity document that currently builds upon the JSDL data staging model and OGSA DMI, and have outlined some design features for further refinement and consideration in future standardization efforts. We have also considered life cycle and state events by suggesting potential integration options for use within JSDL, OGSA BES and DMI. We believe the definition of an open standard describing a bulk data copying activity will provide a key foundation in addressing the challenges of data management.

The DTS is intended to serve the needs of a wide variety of data-centric communities in the sciences and humanities, and in doing so bridge the gap between grid-focused solutions and desktop applications for data transfer.

Acknowledgements

The support of the Australian National Collaborative Research Infrastructure Strategy (NCRIS), the Australian Research Collaboration Service (ARCS), the ARC Molecular and Materials Structure Network, OMII-UK and the UK National Grid Service is gratefully acknowledged. Contributions made by Alex Arana and discussions with Mario Antonioletti are also gratefully acknowledged.
