Royal Society Publishing

Building an infrastructure for scientific Grid computing: status and goals of the EGEE project

Fabrizio Gagliardi, Bob Jones, François Grey, Marc-Elian Bégin, Matti Heikkurinen

Abstract

The state of computer and networking technology today makes the seamless sharing of computing resources on an international or even global scale conceivable. Scientific computing Grids that integrate large, geographically distributed computer clusters and data storage facilities are being developed in several major projects around the world. This article reviews the status of one of these projects, Enabling Grids for E-SciencE, describing the scientific opportunities that such a Grid can provide, while illustrating the scale and complexity of the challenge involved in establishing a scientific infrastructure of this kind.

1. Introduction

The term ‘Grid computing’ was first popularized in the scientific community through the publication in 1998 of The Grid: blueprint for a new computing infrastructure by Foster & Kesselman (1999). Since then, the term has been adopted for a wide variety of purposes, some of which have little to do with the original intentions. In this article, Grid computing refers to distributed computing performed across multiple administrative domains. More specifically, our focus is on applications requiring high-throughput computing, using clusters or farms of PCs and associated data storage facilities in major research establishments around the world, which are connected via high-speed networks, since this is the type of computing Grid infrastructure that the Enabling Grids for E-SciencE (EGEE) project is developing.

The aim of such a computing Grid is to provide a seamless and high quality of service to multiple scientific communities through the use of appropriate middleware technology. The benefits that are anticipated to follow from such a computing Grid include a large increase in both the peak capacity and the total computing power delivered to various scientific projects, in a secure environment. Scientific communities should also benefit from new ways of sharing and analysing very large datasets. These benefits should translate into a rise of both the quality and quantity of scientific output in a broad spectrum of compute-intensive fields, ranging from bioinformatics and climate simulation to the nanoscale design of new materials and the integration of large engineering projects involving many partners. The term ‘e-science’, like Grids, has been attributed multiple meanings, but in this article, it simply refers to the broad spectrum of scientific applications that stand to benefit greatly from Grid computing.

While this vision of Grid computing was clearly articulated by Foster and Kesselman in 1998, it is fair to say that reality is still lagging behind the vision. This is because the benefits of Grid computing come at a significant cost. In practice, such a Grid requires the establishment of a comprehensive infrastructure to provide simple, reliable round-the-clock access to the underlying computing resources. This includes everything from performance-monitoring tools to call centres to user training programmes. While many of the articles in this issue focus on the scientific applications of computing Grids, this article provides an overview of how such a scientific infrastructure is being established in practice, on the basis that it is important for the scientific community to appreciate the magnitude and complexity of the task, in order to avoid unrealistic expectations.

Two provisos need to be expressed concerning the contents of this article. Firstly, the article is based closely on EGEE project planning documents, which, in turn, reflect the complex mix of technological, economic and political considerations that are the reality of a major international effort of this kind. EGEE is not a research and development (R&D) project concerned with obtaining new technological results. Rather, it is an engineering project concerned with the effective deployment of a complex infrastructure and its successful uptake by a diverse scientific community. Secondly, the case of Grids of supercomputing facilities, which tend to rely on specialized hardware rather than commodity PCs, and address high-performance computing applications such as virtual reality and weather forecasting, is not dealt with directly in this article, although references to projects addressing this sort of Grid are given. It is worth noting that efforts are underway to combine these two types of computing Grid, so that hopefully, one day the difference will be largely immaterial to the scientific end-user. However, readers should not entertain unrealistic expectations on this front.

2. Background

Based on pioneering work in the US and in Europe, software toolkits for distributed computing—such as Globus, Condor and Unicore—are currently available. As a result, a number of projects have explored and demonstrated various aspects of computer Grids. Europe has achieved a prominent position in this field, particularly with its success in establishing a functional Grid testbed comprising more than 20 centres in Europe, in the context of the European DataGrid (EDG) project.1 Individual countries, such as the UK, France and Italy, have developed comprehensive ‘e-science’ programmes that rely on emerging national computing Grids to deliver unprecedented computing resources to science. However, to date, there are no real production-quality Grids that can offer continuous, reliable Grid services to a range of scientific communities. This is the motivation for the EGEE project.

3. Project scope and vision

The scope of the EGEE project is to integrate current national, regional and thematic Grid efforts, to provide a seamless Grid infrastructure for the support of the European Research Area. This infrastructure builds on the EU research network GEANT and exploits Grid expertise that has its roots in projects such as the EDG project, other EU-supported Grid projects and national Grid initiatives such as UK e-Science (Hey & Trefethen 2002), INFN Grid (Grid-it)2 and NorduGrid (Eerola et al. 2004). EGEE originally stood for Enabling Grids for E-science in Europe, but now simply stands for Enabling Grids for E-SciencE, in recognition of the increasingly global nature of this endeavour, which involves the US and Russia as partners.

The vision of EGEE is for this Grid infrastructure to provide European researchers in academia and industry with a common pool of computing resources, independent of geographical location, enabling round-the-clock access to major computing resources. This infrastructure will support distributed research communities, which share common Grid computing needs and are prepared to integrate their own distributed computing infrastructures and agree on common access policies. The resulting infrastructure will surpass the capabilities of local clusters and individual supercomputing centres in many respects, providing a unique tool for collaborative compute-intensive science (e-science) in the European Research Area. New Grid applications also bring new requirements for the Grid infrastructure. Arguably, the most challenging new requirement is security, which EGEE is actively engaged in fulfilling. Finally, the infrastructure will provide interoperability with other Grids around the globe, including the US National Science Foundation (NSF) Cyberinfrastructure3, contributing to efforts to establish a worldwide Grid infrastructure. Figure 1 illustrates the scope and the planned evolution of the EGEE Grid infrastructure.

Figure 1

Schema of the evolution of the EGEE Grid infrastructure.

4. The EGEE structure

The EGEE project was originally proposed by experts in Grid technology from the leading Grid activities in Europe. The project now includes more than 70 project partners organized in twelve partner regions or ‘federations’, as shown in figure 2. Furthermore, with the deployment of the EGEE project structure, several of these partners have begun integrating regional Grid efforts in order to provide coordinated resources to the EGEE project.

Figure 2

The EGEE federation map.

Enabling Grids for E-SciencE is a 2 year project that forms part of a 4 year programme. Major implementation milestones after 2 years will provide the basis for assessing subsequent objectives and funding needs. Given the service-oriented nature of this project, two pilot application areas were selected to guide the implementation and to certify the performance and functionality of the evolving European Grid infrastructure. One of these pilot initiatives is the Large Hadron Collider Computing Grid (LCG; Bird et al. 2004), which relies on a Grid infrastructure in order to store and analyse petabytes of real and simulated data from high-energy physics experiments at CERN. The other is biomedical Grids, where several communities are facing equally daunting challenges to cope with the flood of bioinformatics and healthcare data.

Given the rapidly growing scientific demand for a Grid infrastructure, it was deemed essential for the EGEE project to ‘hit the ground running’ by deploying basic services and initiating joint research and networking activities before the formal start of the project. The LCG project has provided basic resources and infrastructure since 2003. Biomedical Grid applications have been deployed on the current EGEE production service (i.e. LCG-2). The available resources and user groups are rapidly expanding and this trend is anticipated to continue throughout the lifetime of the project.

It is worthwhile distinguishing the scope of the Grid infrastructure being developed by EGEE from LCG, which is currently providing the production environment for EGEE. LCG is a thematic Grid, which is dedicated to providing a Grid production service to the four LHC experiments. In contrast, EGEE has a wider scope in opening up the Grid to a wide range of sciences. In addition, EGEE includes human networking; that is, dissemination and outreach as well as training and application support to new emerging applications and user communities (e.g. biomedical).

The requirement of supporting several applications and user communities has necessitated developing a new, scalable operation and user support structure. The centralized, hierarchical model of LCG may be optimal for a single application domain in a relatively homogeneous Grid. However, in the case of a large-scale, multidisciplinary Grid like EGEE, a federated approach in which more responsibility is delegated to the regional support organizations will enable faster response times and will also leverage local knowledge about the specifics of the resources.

5. The EGEE mission

In order to achieve the vision outlined above, EGEE is currently working on a threefold mission:

  1. Services. The essential elements of the Grid services are manageability, robustness and resilience to failure. In addition, a consistent security model will be provided, as well as the scalability needed to rapidly absorb new resources as these become available, while ensuring the long-term viability of the infrastructure.

  2. Middleware re-engineering. This activity will support and continuously upgrade a suite of software tools capable of providing production-level Grid services to a base of users, which is anticipated to rapidly grow and diversify.

  3. Networking. These activities can proactively market Grid services to new research communities in academia and industry, capture new e-science requirements for the middleware and service activities, and provide the necessary education to enable new users to benefit from the Grid infrastructure.

Reflecting this threefold mission, EGEE is structured in three main areas of activity—services, middleware re-engineering and networking. The key aspects of each of these areas are summarized in separate sections below. It is essential to the success of EGEE that the three areas of activity maintain a tightly integrated ‘virtuous cycle’, illustrated in figure 3. In this way, the project as a whole can ensure rapid yet well-managed growth of the computing resources available to the Grid infrastructure as well as the number of scientific communities that use it. As a rule, new communities will contribute new resources to the Grid infrastructure. This feedback loop is supplemented by an underlying cyclical review process covering overall strategy, middleware architecture, quality assurance and security status, and ensuring a careful filtering of requirements, a coordinated prioritization of efforts and maintenance of production-quality standards.

Figure 3

The ‘virtuous cycle’ for EGEE development.

As illustrated in figure 3, a new scientific community first makes contact with EGEE through outreach events organized by the networking activities. Follow-up meetings with application specialists may lead to the definition of new requirements for the infrastructure. If approved, the requirements are implemented by the middleware activities. After integration and testing, the new middleware is deployed by the service activities. The networking activities then provide appropriate training to the community in question, so that it becomes an established user. Peer communication and dissemination events featuring established users then attract new communities.

6. The stakeholder perspective

In order to convey the scope and ambition of the project, this section presents the expected benefits for EGEE stakeholders and an outline of the procedure for the stakeholder to participate in EGEE. The key types of EGEE stakeholders are users, resource providers and industrial partners.

(a) EGEE users

Once the EGEE infrastructure is fully operational, users perceive it as one unified large-scale computational resource. From the user perspective, the complexity of the service organization and the underlying computational fabric remains invisible. The benefits of EGEE from the user perspective include:

Today, most users have accounts on numerous computer systems at several computer centres. The resource allocation procedures vary between the centres and are in most cases based on applications submitted to each centre or application area management. The overhead involved for a user in managing the different accounts and application procedures is significant. EGEE reduces this overhead by providing the means for users to join virtual organizations with access to a Grid containing all the operational resources needed. This is also known as ‘single log-in’.

A key focus in the EGEE re-engineering activity is security. Enhanced security will allow a wide range of new applications, such as biomedical ones, to use the Grid in complete confidence.

By allocating resources efficiently, the Grid promises greatly reduced waiting times for access to resources.

The infrastructure is accessible from any geographical location with good network connectivity, thus giving regions with limited computer resources access to large resources on an as-needed basis.

Through coordination of resources and user groups, EGEE can provide application areas with access to resources of a scale that no single computer centre can provide. This enables European researchers to address previously intractable problems in strategic application areas.

By providing a unified computational fabric, EGEE allows widespread user communities to share software and databases in a transparent way. EGEE acts as the enabling tool for collaborations, building and supporting new virtual application organizations.

By making use of the expertise from all partners, EGEE is able to provide a support infrastructure that includes in-depth support for all key applications and round-the-clock technical systems support for Grid services.

Potential user communities typically come into contact with EGEE through one of the many outreach events supported by the Dissemination and Outreach activity, and are able to express their specific user needs via the EGEE Generic Applications Advisory Panel. The resources that the new community requires, together with the resources the community can contribute to the Grid infrastructure, are then reviewed by a technical panel. The consolidated requirements are finally converted into resource policies that the service activity implements for each corresponding virtual organization (VO) on the production service. Finally, the new community receives training from the user training and induction activity. From the user perspective, the success of the EGEE infrastructure is measured in the scientific output that is generated by the user communities it is supporting.

The common theme among new communities joining the Grid is security. In the past, most particle physicists have had rather loose security requirements. In contrast, biomedical and pharmaceutical communities have severe security requirements, dictated by legal and privacy issues, as well as business competitiveness. Other important requirements come from the wide range of communities the Grid has to support. These communities are grouped on the Grid by entities called VOs; therefore, our middleware and deployment strategies need to cater for a more flexible management (e.g. creation, revocation) of the rapidly increasing number of VOs. New applications also build on top of commercial off-the-shelf packages, with the corresponding issues of deployment and licensing of these on a large number of resources. Finally, several new users, and their applications, require outbound connectivity and communication between jobs running in parallel on the same site (or in close network proximity).

Security requirements are also being stretched in both directions by medical applications. The need to transmit sensitive data, such as patient data, requires the introduction of data encryption services, relying on certificates. In this case, a pseudonymity service is also being devised to hide the identity of the Grid user. At the same time, given the current administrative process required for each user to acquire a Grid certificate, it is unreasonable to expect every occasional user to acquire such a certificate. In the meantime, ‘simple’ authentication is performed by public web portals, outside the Grid certificate infrastructure.

Further, medical imaging as well as physics applications require interactive response, in contrast to the batch system usage that has prevailed to date. Providing interactive response requires a reduced delay between job submission and execution; this delay has been alleviated in some cases by the introduction of application-level schedulers. Interactivity often requires outbound connectivity from worker nodes at the Grid sites, but a number of resource centres consider this a security breach and disable this functionality by closing ports in firewalls.

(b) Resource providers

Enabling Grids for E-SciencE resources include national Grid initiatives, computer centres supporting one specific application area or general computer centres supporting all fields of science in a region. The motivation for providing resources to the EGEE infrastructure reflects the funding situation for each resource provider. EGEE has developed, and is actively improving, policies that are tailored to the needs of different kinds of partners. The most important benefits for resource providers are:

Through EGEE, a coordinated large-scale operational system is created. This leads to significant cost savings and at the same time an improved level of service provided by each participating resource partner. Through EGEE, the critical mass needed for many support actions can be reached by all participating partners.

By distributing service tasks among the partners, EGEE makes use of leading specialists to build and support the infrastructure. In this sense, the Grid connects distributed competence just as it connects distributed computational resources. Each participating centre and its users thus have access to experts in a wide variety of application and support fields.

The EGEE distributed support model allows for regional adaptation and close contacts with regional user communities. The existence of regional support is of fundamental importance when introducing new users and user communities with limited experience of computational techniques and Grid technology. A resource partner in EGEE becomes much more attractive as a collaboration partner on the regional level by representing the large-scale EGEE infrastructure.

Several partners within the EGEE framework are already forming collaborations and launching development and support actions not included in the original proposal. This leads to cost sharing of R&D efforts among partners and in the longer perspective allows for specialization and profiling of participating partners to form globally leading centres of excellence within EGEE. These benefits motivate the many partners that support EGEE, representing aggregated resources of over 10 000 CPUs provided by more than 100 sites.

(c) Industrial partners

Currently, the main EGEE players are the scientific applications, and the partners represent publicly funded research institutions and computer resource providers. Nevertheless, it is envisaged that industry will benefit from EGEE in several ways:

Through collaboration with individual EGEE partners, industry participates in specific activities where relevant skills and manpower are available, thereby increasing know-how on Grid technologies.

As part of the networking activities, specific industrial sectors will be targeted as potential users of the installed Grid infrastructure, for R&D applications. The pervasive nature of the Grid infrastructure should be particularly attractive to high-tech small and medium-sized enterprises, because it brings major computing resources (once only accessible to large corporations) within reach.

Building a production quality Grid will require industry involvement for long-term maintenance of established Grid services, such as call centres, support centres and computing resource provider centres. The EGEE vision also has inspiring long-term implications for the IT industry. By pioneering the sort of comprehensive production Grid services envisaged by experts, but which are currently beyond the scope of national Grid initiatives, EGEE will have to develop solutions to issues such as scalability and cost models that go substantially beyond current Grid R&D projects. This process will trigger innovative IT technologies, which will have benefits for industry, commerce and society going well beyond scientific computing. Major initiatives launched by several IT industry leaders in the area of Grids and utility computing emphasize the economic potential of this emerging field.

Industry typically comes into contact with EGEE via the industry forum organized by the application identification and support activity, as well as more general dissemination events run by the dissemination and outreach activity. Interested companies are able to consult about potential participation in the project with the project director and with regional representatives on the EGEE project management board. As the scope of Grid services expands during the second 2 year phase of the programme, it is envisaged that established core services will be taken over by industrial providers with proven service capacity. These services would be provided on commercial terms, with providers selected by competitive tender.

As more Grid applications leverage commercial software, we also need to work more closely with industry to agree on a common licensing strategy for this commercial software, consistent with the Grid model.

7. Service activities

The service activities deploy, operate, support and manage an international production-quality Grid infrastructure, including resources from many resource centres, primarily across Europe but now extending around the globe. This service is made accessible to user communities and virtual organizations in a consistent way according to agreed access management policies and service level agreements. These activities build on current national and regional initiatives such as the UK e-Science Grid, INFN Grid and NorduGrid, as well as infrastructures established by specific user communities, such as LCG.

By the end of October 2004, more than 30 000 CPU hours had been provided by the EGEE Grid infrastructure, excluding the HEP LHC experiments, which remain the principal infrastructure customers. While most of these CPU hours and their corresponding 20 000 jobs come from active users of the biomedical virtual organizations (the second pilot application besides HEP), other applications are now starting to use the production infrastructure.

At present, the world map of Grid infrastructure is fractured (e.g. EGEE/LCG; Grid3 (Foster et al. 2004); NorduGrid (Eerola et al. 2004); TeraGrid;4 DEISA;5 GridPP (Britton et al. 2004)), because, too often, interoperability is not trivial, if it is possible at all. While standardization will definitely help to foster interoperability, we believe that the first hurdle we are currently facing is the lack of a common security infrastructure. Message-level security following standards like WS-security might be a medium-term solution; however, transport-level security is more widely used. We are therefore putting a significant effort into harmonizing the security infrastructure among the different Grid infrastructures and services. In the longer term, a recognized set of service-oriented architecture (SOA) Grid interfaces, with the help of WSRF (Web Services Resource Framework)6, will hopefully smooth the interoperability among Grids and provide a harmonized service, similar to the telecoms and networking services of today.

Prior to the deployment of a new release of the EGEE middleware to the production service, the service activities run a battery of tests on a dedicated testbed: the certification testbed. Only after these rigorous tests are passed successfully can new releases be deployed. A lesson learned from EDG, the precursor to EGEE, is the need for an intermediate step between the certification of a new release and the production service. Because EGEE is working in parallel on middleware re-engineering and novel Grid applications, the risk of perturbations on the production-quality Grid service increases. Further, as the production service grows in size, and thus in complexity, deployment corrections have to be minimized. The solution is the introduction of a pre-production service, which will serve the double purpose of testing new Grid middleware features in a multi-site context and providing a realistic environment for testing new applications and VO policies. This intermediate service currently extends to eight sites across Europe.
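The staged release path described above (certification testbed, then pre-production service, then production service) can be sketched in a few lines of Python. This is purely illustrative and not EGEE tooling: the function name, the stage labels and the boolean test results are assumptions made for the example.

```python
# Hypothetical stage labels, in the order a release must clear them.
STAGES = ["certification", "pre-production", "production"]


def furthest_stage(results):
    """Return the furthest stage a middleware release reaches.

    `results` maps a gating stage name to True (all tests passed) or
    False. A stage with no recorded result is treated as not passed,
    so a release never skips a gate.
    """
    for stage in STAGES[:-1]:
        if not results.get(stage, False):
            # The release stays at this stage until its problems are fixed.
            return stage
    # Both gating stages passed: the release may go to production.
    return STAGES[-1]
```

Under these assumptions, a release that passes certification but has not yet been exercised on the pre-production service is held at the pre-production stage rather than being deployed.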

The structure of the Grid services comprises EGEE operations management at CERN; EGEE core infrastructure centres in the UK, France, Italy and at CERN, responsible for managing the overall Grid infrastructure; and regional operations centres, responsible for coordinating regional resources, regional deployment and support of services. The basic services that are offered are middleware deployment and installation; a software and documentation repository; Grid monitoring and problem tracking; bug reporting and a knowledge database; VO services; and Grid management services. Continuous, stable Grid operation represents the most ambitious objective of EGEE, which is reflected by this activity being assigned nearly 50% of the EGEE budget.

8. Middleware re-engineering activities

Up to now, the state of the art in Grid computing has been dominated by research Grid projects that aimed at delivering test Grid infrastructures providing proofs of concept and opening opportunities for new ideas, developments and further research. Only recently has there been an effort to agree on a unified open Grid services architecture (OGSA)7 and an initial set of specifications constituting the WSRF that set some of the standards in defining and accessing Grid services and fundamental exchange mechanisms. Furthermore, Web services (WS) have brought a paradigm shift in widely distributed systems, which the re-engineered EGEE middleware is leveraging.

The middleware activities in EGEE focus primarily on re-engineering existing middleware functionality, leveraging the considerable experience of the partners with the current generation of middleware and feedback from user groups and operations staff. Based on experience, geographical collocation of development staff is essential, and therefore these activities are based on tight-knit teams concentrated in a few major centres with proven track records and expertise. The EGEE re-engineered middleware is branded under a new name ‘gLite’ (pronounced ‘gee-lite’).8

gLite builds on the best-practice experience and middleware produced by the leading Grid middleware projects of the current generation: Condor, Globus, VDT, AliEn, EDT, DataTAG, and so on. EGEE participates actively in the most relevant standards bodies such as GGF (Global Grid Forum)9 and OASIS (Organization for the Advancement of Structured Information Standards).

Platform independence is also a goal for gLite. To reach this goal, while sometimes heavily reusing existing software, the gLite source code is regularly built on several flavours of Red Hat Enterprise Linux and on Windows XP. Although today's deployment reality of Grid middleware is mostly Linux based, the future might bring other types of operating systems such as Mac OS X, Windows and other Unix variants.

There are a number of shortcomings in the LCG-2 software currently deployed on the LCG and EGEE production service. The existing LCG-2 data management tools do not offer the level of performance and reliability required by applications such as the LHC physics experiments. For example, the file catalogue that runs for each VO represents a single point of failure. This is being addressed by gLite, which provides a distributed file catalogue service using third party reliable messaging software to synchronize local catalogues as well as offering bulk registration of datasets.

Further, the workload management system of LCG-2 uses a resource broker to match the requirements of a submitted job against the sites that could execute it, and to rank those sites. The current architecture involves significant overheads in the time between job submission and job execution. There are also limitations in the site-specific policies that can be taken into account during the ranking step, resulting in inefficiencies, which can cause submitted jobs to be lost. The workload management system in gLite is being modified to introduce a mixed pull and push model to address these problems. It will also reduce the reliance on the accuracy of the information service, which should increase the overall efficiency of the Grid. It is expected that gLite will be released at the end of March 2005.
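To make the difference between the push and pull models more concrete, the following Python sketch shows a toy match-and-rank broker alongside a pull-style alternative. It is not LCG-2 or gLite code: the Site and Job records, their fields and the ranking rule (shortest queue first, then most free CPUs) are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Site:
    name: str
    free_cpus: int
    supported_vos: frozenset
    queued_jobs: int


@dataclass(frozen=True)
class Job:
    vo: str
    cpus_needed: int


def match(job, sites):
    """Matching step: keep only the sites that could execute the job."""
    return [s for s in sites
            if job.vo in s.supported_vos and s.free_cpus >= job.cpus_needed]


def rank(candidates):
    """Ranking step (assumed rule): shortest queue first, then most free CPUs."""
    return sorted(candidates, key=lambda s: (s.queued_jobs, -s.free_cpus))


def broker_push(job, sites):
    """Push model: the broker chooses a site from its own view of the Grid.

    That view comes from an information service and may be out of date.
    """
    ranked = rank(match(job, sites))
    return ranked[0].name if ranked else None


def site_pull(site, task_queue):
    """Pull model: a site with free capacity takes the first job it can run.

    The site uses its own current state, so no global snapshot is needed.
    """
    for i, job in enumerate(task_queue):
        if job.vo in site.supported_vos and site.free_cpus >= job.cpus_needed:
            return task_queue.pop(i)
    return None
```

In the push function the decision depends on how accurate the broker's snapshot of every site is, whereas in the pull function each site decides from its own state. This is one way to read the statement that a mixed model reduces reliance on the accuracy of the information service.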

Building a European Grid infrastructure based on robust components is thus becoming feasible. However, this will still take a considerable integration effort in terms of making the existing components adhere to the new standards, adapting them to evolution in these standards, and deploying them in a production Grid environment.

Security is another key domain where re-engineering of current Grid services is required. Retrofitting security to an implementation that was not designed with this requirement in mind can be difficult. This is an area where the right balance has to be struck between re-engineering of current services through wrapping techniques and more invasive work. Meanwhile, the world of Web services has seen a lot of activity around a new specification called WS-security (OASIS)10, which appears promising in the medium term.

As for all other IT infrastructures (e.g. telecoms, telephony, networking), the Grid requires a rich set of stable and well-adopted standards to guide service developers and providers, and to ensure interoperability between Grid services. EGEE is heavily involved in several standardization bodies on a wide range of topics. Meanwhile, complex and promising standards such as WS-* (e.g. WS-security, WS-addressing, WSRF) require well-performing, mature tooling in the right range of languages and platforms. Because we do not believe the right level of maturity has been reached yet, gLite tries, wherever possible, to be consistent with the ‘spirit’ of these standards and recommendations, while sometimes implementing a temporary working custom solution. The most important new paradigm in Grid development, however, is the widely shared vision that Grids need to be built according to an SOA; gLite reflects this.

9. Networking activities

The networking activities in EGEE aim to facilitate the induction of new users, new scientific communities and new virtual organizations into the EGEE community. EGEE develops and disseminates appropriate information to these groups proactively, and takes into account their emerging Grid infrastructure needs. The goal is to ensure that all users of the EGEE infrastructure are well supported and to provide input to the requirements and planning activities of the project.

Specific (human) networking activities in EGEE are:

  1. dissemination and outreach;

  2. user training and induction;

  3. application identification and support;

  4. policy and international cooperation.

10. Application identification and support activity

The application identification and support activity has three components: two pilot application domains, high-energy physics (HEP) and biomedical Grids, complemented by a more generic component dealing with the longer-term recruitment of emerging communities and their applications.

The success of current user communities such as the HEP community lies in their mature and highly organized nature. Other communities have to learn from this success, and the mission of the application identification and support activity is to provide them with all the support and tools needed to accelerate their transition to the Grid.

An important tool that the activity possesses is a dedicated testbed called GILDA. Heavily used during the numerous training events organized by the training activity of EGEE, the testbed also supports early investigation of feasibility for new applications to use the Grid production service.

At the time of writing, apart from HEP applications, the following applications are at different stages of integration on to the EGEE Grid:

  1. GATE: Monte Carlo simulation for radiotherapy planning;

  2. GPS@: web portal for bioinformatics;

  3. CDSS: Clinical Decision Support System: expert system for medicine;

  4. SiMRI3D: magnetic resonance images parallel simulator;

  5. xmipp_MLrefine: macromolecular three-dimensional structure analysis;

  6. gPTM3D: radiological data interactive segmentation and analysis;

  7. Other domains: earth observation, geophysics, chemistry, hydrology, ESA, astrophysics, digital libraries, and so on.

The applications mentioned above come mainly from the bioinformatics and biomedical domains, while the other domains listed in the last item will further broaden the range of disciplines using the Grid. For example, the EGEE project has attracted its first application deployed by an industrial partner, the French Compagnie Générale de Géophysique (CGG).

11. Conclusions and outlook

The first phase of the EGEE project aims to deliver, by 2006, a production quality Grid infrastructure for the European research area and the international scientific community in general. In this time frame, it aims to extend support to new applications from at least five different scientific communities, running on the EGEE re-engineered middleware, gLite. Meanwhile, the project management is preparing plans for the continuation of the project. In this second phase, it is planned to extend both the geographical coverage of the Grid infrastructure and the number of supported end-user international scientific communities, while beginning the process of transferring some of the established Grid services to industrial providers.

In broader terms, the challenge for Grid infrastructures such as EGEE is to integrate them seamlessly into the continuously evolving Internet and Telecom infrastructure. The common theme in the development of all three of these fields is the distribution of storage and processing ‘to the edges of the network’. Undoubtedly, the real impact of Grid computing on society will emerge when access to Grid services is more tightly integrated into everyday Internet-based tools and applications. Integrating single sign-on, security and resource sharing of the Grid (e.g. applications, runtime environments, user support) with systems that already have a large user base can create new business opportunities and, in turn, make the Grid infrastructure self-sustaining from an economic point of view.

Acknowledgments

Enabling Grids for E-SciencE is a project funded by the European Union under contract INFSO-RI-508833.

