The Virtual Physiological Human ToolKit

Jonathan Cooper, Frederic Cervenansky, Gianni De Fabritiis, John Fenner, Denis Friboulet, Toni Giorgino, Steven Manos, Yves Martelli, Jordi Villà-Freixa, Stefan Zasada, Sharon Lloyd, Keith McCormack, Peter V. Coveney

Abstract

The Virtual Physiological Human (VPH) is a major European e-Science initiative intended to support the development of patient-specific computer models and their application in personalized and predictive healthcare. The VPH Network of Excellence (VPH-NoE) project is tasked with facilitating interaction between the various VPH projects and addressing issues of common concern. A key deliverable is the ‘VPH ToolKit’—a collection of tools, methodologies and services to support and enable VPH research, integrating and extending existing work across Europe towards greater interoperability and sustainability.

Owing to the diverse nature of the field, a single monolithic ‘toolkit’ is incapable of addressing the needs of the VPH. Rather, the VPH ToolKit should be considered more as a ‘toolbox’ of relevant technologies, interacting around a common set of standards. The latter apply to the information used by tools, including any data and the VPH models themselves, and also to the naming and categorizing of entities and concepts involved.

Furthermore, the technologies and methodologies available need to be widely disseminated, and relevant tools and services easily found by researchers. The VPH-NoE has thus created an online resource for the VPH community to meet this need. It consists of a database of tools, methods and services for VPH research, with a Web front-end. This has facilities for searching the database, for adding or updating entries, and for providing user feedback on entries. Anyone is welcome to contribute.

1. Introduction

The Virtual Physiological Human (VPH) initiative aims to ‘enable collaborative investigation of the human body as a single complex system. [The VPH will provide] the European research infrastructure that will make it possible for biomedical researchers to complement their conventional reductionist approach with what we call an integrative approach, where biological processes are described from a systems point of view’ (STEP Consortium 2007, p. 2). It recognizes that the current partitioning of health science endeavour along traditional lines (e.g. scientific discipline, anatomical subsystem and temporal or dimensional scale) is artificial and inefficient with respect to such an all-embracing description of human biology (Fenner et al. 2008). A key characteristic of the VPH is its clinical focus—developing our understanding of the biological complexity involved in maintaining life and moving us from our current reactive medicine to ‘4P medicine’: predictive, personalized, preventive and participatory (Hood 2008).

The VPH initiative (VPH-I) constitutes an integral part of the international Physiome Project (Bassingthwaighte 2000), and is a core target of the European Commission’s Seventh Framework Programme. It was launched in 2008 with a €72 M call funding 12 research projects, two coordination and support actions, and one Network of Excellence (VPH-NoE). The VPH-NoE has been tasked with facilitating interaction between these and future VPH projects, with a view to addressing issues of common concern, including training, career development and community building. The initial VPH projects have now been active for just over a year, a suitable point at which to review progress.

A key deliverable from the VPH-NoE is a ‘VPH ToolKit’—a technical and methodological framework that will support and enable VPH research through creation, accumulation and curation of VPH research-related ‘capacities’. The primary aim is the integration of existing work across Europe and its further development towards greater interoperability. It is only by enhancing cross-disciplinary research in this way that the ambitious goals of the VPH will be achieved.

In this paper, we discuss our vision for the ToolKit (§2), what has been done so far in its development (§4) and where effort is most needed now (§5). In particular, effort to date has been directed at establishing the VPH ToolKit portal as a resource for the VPH community (§3). Firstly, however, the concepts underlying the ToolKit, and hence the approach to achieving its aim, are explained.

2. The VPH ToolKit concept

Initial work within the ToolKit work package of the VPH-NoE consisted of an extensive requirements and technology assessment exercise (RTAE; VPH-NoE Consortium 2008), to clarify both the scientific needs and existing technological solutions within the biomedical research community. It is clear that the ToolKit must build on the extensive work already being undertaken, usually for specific scientific problems, bringing much of this to wider use and understanding. After the RTAE, many tasks of the VPH-NoE have focused on foundation work for the first release of a ToolKit, in July 2009, upon which future releases can build.

In this, the VPH-NoE has to work with exemplar projects and the wider projects of the VPH-I to ensure that solutions developed are fit for purpose and that technologies developed to meet individual project goals can fit within this framework. To provide opportunities for projects, rather than to preclude new ideas, it is desirable to build the system in such a way that value is added to those tools that use it, by producing a basic infrastructure that can be exploited in various ways. Figure 1 visualizes this approach, showing standards underpinning the ToolKit edifice, with layers of ‘infrastructure’ tools on top (libraries, services, middleware and application frameworks) and specific applications built on these leading towards clinical relevance as the end point. The applications both showcase the relevance of the ToolKit for end users and also drive development of the lower levels towards achieving tangible benefits.

Figure 1. Diagrammatic representation of the VPH ToolKit.

As seen in the RTAE, VPH research covers the full range of human physiology, in both normal and pathological states, using a wide range of mathematical formalisms and simulation approaches. Sauro et al. (2003) noted that it is impracticable for one tool to cover all of systems biology; this applies equally to the many and varied needs of the VPH, and the topics are indeed closely related (Kohl & Noble 2009). A better approach is to develop a set of interoperable components based on common standards. Standardization plays a central role in facilitating the exchange and interpretation of the outcomes of scientific research and, in particular, of computational modelling (Klipp et al. 2007). Standards are necessary in order for data and models to be integrated, leading to greater knowledge and understanding. In many areas of VPH research, the required standards either do not exist or need further development to meet the needs of the community. A major contribution of the VPH-NoE will be to their development and, most importantly, subsequent adoption. Emerging standards need to be trialled and refined in an iterative, open process. This can take much effort, exploration and time, both technically in defining the standards themselves and also socially in securing the backing of the user community; however, it is essential to long-term success. This topic is explored further in §4a.

The layers built on these standards (see figure 1) provide common functionality to VPH applications. This may be in the form of application programming interfaces (APIs) for facilitating working with VPH standards, and associated libraries providing common processing tasks. Libraries or services can also provide core computational routines, with application frameworks filling a similar role at a higher level, allowing rapid development of user-facing tools. Middleware is important for enabling the easy and transparent use of large-scale computational and storage resources. Such resources are a key requirement for the VPH and an essential part of virtually any VPH-I project, but their provision and maintenance are beyond the scope of the VPH-NoE itself (VPH-NoE Consortium 2009d). The role of the VPH-NoE will be primarily in enabling access to services and working with providers to ensure that facilities are interoperable.

It should be noted, though, that the VPH ToolKit is not limited to technology and data, but also includes exposing the ‘silent knowledge’ buried in VPH projects. It is only by the sharing of experiences and discussion of best practice (e.g. development and deployment of modelling tools, data mining from images, dealing with the legal and ethical constraints of working with patient data, or using standards) that others will learn and adoption will become commonplace.

3. The VPH ToolKit portal

To encourage the sharing of knowledge and wider dissemination of relevant tools and services, a community website has been developed at http://toolkit.vph-noe.eu. Underlying the website is a database of technologies and services of relevance to the VPH community. Many of the entries listed were not developed specifically for the VPH research effort, but have wider applications. Some articles on relevant topics are also included. Most importantly, the website is interactive, enabling researchers to contribute to and benefit from the evolving resource. Researchers are invited to submit tools they have developed, or found useful, for inclusion in the database. There is also the facility to provide feedback from the experiences of users of the technologies listed, and so share knowledge with the community. Currently, submissions are moderated by the VPH-NoE, but we are looking to move to a community-centric model to ensure sustainability of this resource beyond the lifetime of the project (which ends in November 2012).

Each entry submitted to the website can be placed in up to five of a selection of categories. These provide a structure (itself subject to refinement as needed) that can help to place individual activities in their correct context within the VPH as a whole. Many categories also list one or two key contacts within the VPH-NoE who have particular expertise in that area. The three primary divisions are tools, methods and services, and slightly different information can be provided for entries within each division. The ‘tools’ section contains software tools for supporting and enabling VPH research, divided into many subcategories, such as model editors, simulation software, data conversion tools, imaging software and collaborative tools. While tools are crucial for performing research, of equal importance are the techniques and best practices for doing so in an effective and integrative fashion. The ‘methods’ section aims to disseminate knowledge in this arena. It covers topics such as those discussed in §4a: markup languages for VPH models, data standards and the use of ontologies. Finally, the ‘services’ section is for the many online resources and services available to the VPH community; for instance, databases, model repositories and access to high-performance computing (HPC).

To provide a rich knowledge resource, each entry can include far more than just the technology name, brief description and URL. Submitters are also asked to provide, if appropriate, such information as a version number, maturity status (e.g. beta release), target user group, licence, requirements for and constraints on use, links to available documentation and support, indications of future plans and cross-links to related entries. The website includes an extensive search facility, either through a simple textual search anywhere within an entry or for values of particular fields. Keywords can be provided for entries to assist with searching.
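The entry structure and the two search modes described above can be illustrated with a minimal sketch. It is not the portal's actual schema or implementation: the class `Entry`, its field names, and the functions `text_search` and `field_search` are hypothetical, chosen only to mirror the information listed in the text (including the limit of five categories per entry).

```python
from dataclasses import dataclass, field

MAX_CATEGORIES = 5  # the text states that an entry can be placed in up to five categories


@dataclass
class Entry:
    """One hypothetical database record; field names are illustrative only."""
    name: str
    description: str
    url: str
    division: str  # "tools", "methods" or "services"
    categories: list = field(default_factory=list)
    keywords: list = field(default_factory=list)
    version: str = ""
    maturity: str = ""
    licence: str = ""

    def __post_init__(self):
        if len(self.categories) > MAX_CATEGORIES:
            raise ValueError("an entry may have at most five categories")


def text_search(entries, term):
    """Simple case-insensitive textual search anywhere within an entry."""
    needle = term.lower()
    hits = []
    for e in entries:
        haystack = " ".join(
            [e.name, e.description, e.url, e.division, e.version,
             e.maturity, e.licence, *e.categories, *e.keywords]
        ).lower()
        if needle in haystack:
            hits.append(e)
    return hits


def field_search(entries, **criteria):
    """Search for exact values of particular fields, e.g. division="services"."""
    return [e for e in entries
            if all(getattr(e, k) == v for k, v in criteria.items())]
```

In this sketch keywords are simply folded into the text that is searched, which is one plausible way for keywords to ‘assist with searching’; a real deployment would more likely delegate both search modes to the underlying database.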

Current development on the portal itself is extending the information that can be provided about entries, to include details on adherence to ‘VPH conformance criteria’, for example, the use of VPH standards, and quality assurance metrics. The aim is to move beyond just a collection of independent tools, methods and services, towards an interoperable suite of VPH technologies—‘putting the pieces together’. We will thus be providing indications of which tools can be integrated, whether through the use of common standards or compatible licences. Examples of integration to tackle particular scientific problems will also be included.

4. Areas of ToolKit work

The VPH ToolKit encompasses many elements, reflecting the multi-faceted arena of VPH research. It is not possible to cover everything in detail within this paper, so in this section we discuss some of the main developments and activities during the first year of the VPH-NoE. Firstly, work on the selection and development of standards, including training and support for their use, is detailed in §4a. Access to medical image data is a key requirement for clinical applications, and so much community effort involves imaging tools, as described in §4b. Finally, simulating large multi-scale models of human physiology requires substantial compute resources, a topic addressed in §4c.

(a) Standards: models, data, ontologies and infrastructure interoperability

A central theme of the VPH, and indeed of the Physiome Project, is that models and data should be accessible and re-usable. To achieve this goal, they need to be made available in software-independent formats which also incorporate the semantic information needed to understand and use the content.

A standards working group (VPH-SWG) has been established, which is primarily coordinated by VPH-NoE stakeholders, and works in consultation with the broader VPH research community (academic, industrial and clinical). The clinical focus of the VPH mandates cooperation with other standards bodies (e.g. HL7, http://www.hl7.org/), in order to, for example, interface with the medical device and health industries. Of course, the process of standardization is a long one and will take longer than the lifetime of the current EC-funded VPH-NoE project. Mechanisms for sustaining this activity are being investigated (VPH-NoE Consortium 2009d).

There are a number of activities inherent within this process, including the analysis of current and evolving standards, resulting in recommendations for new standards. Four subareas (corresponding to four subgroups) have been identified that cover standards in the VPH projects: ontologies, data, modelling and infrastructure interoperability.

(i) Ontology standards

The use of ontologies is crucial for linking VPH resources and for linking with other communities (e.g. the molecular and systems biology communities). They are now the accepted means for annotating models and data with semantic metadata, for example, to identify the biological entities and processes involved. However, a plethora of ontologies has been developed (e.g. Smith et al. 2007; Burger et al. 2008), making interoperability challenging, and the choice of a suitable ontology bewildering. We are surveying the use of ontologies within the VPH community, to recommend as a first step convergence on a common standard for referring to anatomy (in collaboration with relevant groups such as the Foundational Model of Anatomy (FMA; Rosse & Mejino 2007), Open Biomedical Ontologies (OBO; Smith et al. 2007), and the National Center for Biomedical Ontology (NCBO; Rubin et al. 2006)). This includes the development of a pipeline to support the requested addition of uncatalogued terms to standard ontologies such as the FMA. This work is at an early stage, but has the potential to be one of the most significant contributions of the VPH-NoE. Ontologies are being promoted throughout the VPH sphere via training activities at the European Bioinformatics Institute (EBI), and several meetings on this topic are planned during the next year. A new VPH project, RICORDO, is also focused on this area.
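As a rough illustration of how shared ontology terms can link otherwise unrelated resources, the sketch below annotates elements of models and datasets with term identifiers and then finds resources that refer to the same term. The resource names and the `EX:` identifiers are invented placeholders, not real FMA or OBO terms, and the functions are hypothetical.

```python
from collections import defaultdict

# Placeholder identifiers: the "EX:" prefix and numbers are invented for
# illustration and are not real FMA or OBO term IDs.
annotations = {
    "cardiac_cell_model": {"membrane_voltage": "EX:0001", "tissue": "EX:0100"},
    "heart_mesh_dataset": {"geometry": "EX:0100"},
    "kidney_flow_model": {"tissue": "EX:0200"},
}


def index_by_term(annotations):
    """Invert resource -> {element: term} into term -> set of resources."""
    index = defaultdict(set)
    for resource, elements in annotations.items():
        for term in elements.values():
            index[term].add(resource)
    return index


def linked_resources(annotations, resource):
    """Resources sharing at least one ontology term with the given resource."""
    index = index_by_term(annotations)
    linked = set()
    for term in annotations[resource].values():
        linked |= index[term]
    linked.discard(resource)
    return linked
```

The link between the cell model and the mesh dataset exists only because both use the same identifier for the same anatomical entity, which is why convergence on a common anatomy standard matters.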

(ii) Data standards

Many kinds of clinical and experimental data are used in VPH modelling work, for tasks such as determining model parameters, or for model validation by comparison with simulation output (i.e. predicted behaviours). Part of the work of the VPH-NoE is concerned with access to experimental data from published papers, and involves both text mining to extract parameters (Grau et al. 2009), and the development of online databases (Thomas et al. 2006). Development of standards for biological signal data is an example of the efforts needed to support such initiatives (http://www.embs.org/techcomm/tc-cbap/biosignal.html). Future work will involve the use of Web services and ontologies to facilitate searches across multiple data resources. Well-defined and standardized metadata is especially important in this context.

These activities are complementary to the clinical focus of the VPH, which has an unshakeable commitment to pave the way towards the inclusion of clinical data and patient-specific information into VPH modelling work, ultimately for patient benefit. This involves a developing strategy for the wide release of clinical data, looking at what is needed to enable this to happen, to inform future project proposals. Technically, there are many requirements, which will be addressed in collaboration with future VPH projects, including the infrastructure necessary to share data, the tools to provide appropriate anonymization, authentication and authorization, and the standardization of data formats to allow released data to be used by VPH software. Arguably the most difficult aspect of data provision, however, is sociological rather than technological, and therefore the VPH-NoE will also seek to offer guidance on, for example, the ethical issues associated with clinical data use, and lobby for consistency and improvement of legislation across Europe.

(iii) Modelling standards

Published models should be reproducible and testable by others, especially when a model is to be incorporated within a more comprehensive model, or clinical use is considered. This requires details in addition to the original publication: the reference description of a model must include model and data files to allow automated reproduction of simulations. The need is particularly acute in VPH research due to the size of the systems considered, their multi-scale aspects (both temporally and spatially) and the rapid proliferation of models. Work is required at three levels: minimal reporting standards for what information is required in a reference description; standards for the (computer-readable) syntax of models, data and simulations; and ontologies for annotating the semantic meaning of terms in the data, models or simulation experiments. VPH-NoE project members are involved in further development of such modelling languages as CellML (Lloyd et al. 2004) and FieldML (Christie et al. 2009), and reporting standards such as MIASE (http://www.ebi.ac.uk/compneur-srv/miase/) for simulation experiments. Work is being carried out to produce a first version of a force field markup language for molecular simulations that allows sharing of force fields in terms of topology, parameters and equations through a MathML interface. We also have close links with the SBML (Hucka et al. 2004) community, and are seeking to develop links with other groups such as NeuroML (Crook et al. 2007), in order to share expertise and to identify areas where concerted action can be of benefit to all.

Access to reference descriptions of models also requires infrastructure to store them. As well as work on model repositories, centralized locations containing curated models, the ToolKit includes tools to control access to simulation resources shared via peer-to-peer networks, as built in the ADUN molecular simulation package (Johnston & Villà-Freixa 2007). Other VPH-NoE activities include training on the use of markup languages, extending the support for such techniques in existing tools and assisting projects in marking up models.

(iv) Infrastructure interoperability standards

Appropriate accessibility of resources is key to the research and clinical uptake of simulations that use grid-based HPC resources, networks and data repositories. Various such resources are used for VPH research, as discussed in §4c. Significant progress has already been made on the interoperability of these grids, and work is continuing to promote standards-based interoperation at the protocol level—for example, through Open Grid Forum (OGF) specifications such as JSDL and OGSA-BES—and the application level, where needed. The needs of coupled multi-scale physiological models impose particular requirements on computational resources, and so the VPH-SWG will work with the relevant OGF working groups to feed back our needs and use cases, to help in the evolution of standards in areas essential for the successful realization of VPH models in clinical workflows.
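To give a flavour of what protocol-level standardization means in practice, the following sketch builds a minimal JSDL job description using only the Python standard library. The namespace URIs and element names are those we understand the JSDL 1.0 specification to define and should be checked against the OGF document; the function `build_job` and its parameters are hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace URIs as understood from the JSDL 1.0 specification; verify them
# against the OGF document before relying on this sketch.
JSDL = "http://schemas.ggf.org/jsdl/2005/11/jsdl"
POSIX = "http://schemas.ggf.org/jsdl/2005/11/jsdl-posix"


def build_job(name, executable, arguments, stdout):
    """Return a minimal JSDL document (as a string) for a POSIX application."""
    ET.register_namespace("jsdl", JSDL)
    ET.register_namespace("jsdl-posix", POSIX)
    job = ET.Element(f"{{{JSDL}}}JobDefinition")
    desc = ET.SubElement(job, f"{{{JSDL}}}JobDescription")
    ident = ET.SubElement(desc, f"{{{JSDL}}}JobIdentification")
    ET.SubElement(ident, f"{{{JSDL}}}JobName").text = name
    app = ET.SubElement(desc, f"{{{JSDL}}}Application")
    posix = ET.SubElement(app, f"{{{POSIX}}}POSIXApplication")
    ET.SubElement(posix, f"{{{POSIX}}}Executable").text = executable
    for arg in arguments:
        ET.SubElement(posix, f"{{{POSIX}}}Argument").text = arg
    ET.SubElement(posix, f"{{{POSIX}}}Output").text = stdout
    return ET.tostring(job, encoding="unicode")
```

Because the description is middleware-neutral, the same document could in principle be handed to any execution service that accepts JSDL, which is the sense in which such specifications support interoperation between grids.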

(b) Imaging tools

Access to medical images, mainly in DICOM (Digital Imaging and Communications in Medicine) format, and data (signals, meshes) is a key requirement for clinical applications. Many open-source DICOM readers are available and widely used, but the community lacks an open-source and cross-platform mid-level DICOM management layer including an abstract graphical user interface (GUI) and advanced features, e.g. image indexing, picture archiving and communication system/hospital information system (PACS/HIS) access, that could be plugged directly into existing software. The imaging subgroup within the VPH-NoE performed a survey (VPH-NoE Consortium 2009b) to identify the existing tools and their potential, using criteria in accordance with the work of Nagy (2007):

  • cross-platform compatibility to ensure a large diffusion;

  • use of C++ as the main programming language, ensuring compatibility with the major graphic and readers libraries, such as VTK (Gobbi & Peters 2003) and gdcm (http://www.creatis.insa-lyon.fr/software/public/Gdcm/);

  • exhaustiveness and accessibility of the documentation;

  • existence of a living forum ensuring reliability and long-term development;

  • open storage configuration (i.e. the ability to create its own representation, not only the usual DICOM tree);

  • concordance with new DICOM standards proposals (e.g. DICOM Standards Committee, Working Group 23 2009); and

  • anonymization functionality (protecting private data is a major concern and challenge for the VPH community) and the inverse process.

Measured against these criteria, the survey shows that none of the existing tools satisfies them all. Even the popular solutions provided by Conquest (http://www.xs4all.nl/~ingenium/dicom.html) and OsiriX (http://www.osirix-viewer.com/) fail on some criteria: Conquest offers no open configuration and no de-anonymization mechanism, and OsiriX supports only Mac OS X. We have thus proposed the development of a new API based on these requirements (VPH-NoE Consortium 2009a).
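The anonymization criterion, together with its inverse process, can be sketched as reversible pseudonymization: identifying values are replaced by random tokens, and a separately held table allows authorized users to restore them. This is only an illustration on plain dictionaries, not the proposed API; the list of identifying fields is an assumption and is far from a complete confidentiality profile, and the function names are hypothetical.

```python
import secrets

# Header fields treated as identifying; this short list is an assumption,
# not a complete DICOM confidentiality profile.
IDENTIFYING_FIELDS = ("PatientName", "PatientID", "PatientBirthDate")


def anonymize(header, key_table):
    """Return a copy of header with identifying values replaced by tokens.

    key_table maps each token back to the original value. It must be stored
    separately under access control; it is what makes the inverse possible.
    """
    clean = dict(header)
    for name in IDENTIFYING_FIELDS:
        if name in clean:
            token = "ANON-" + secrets.token_hex(8)
            key_table[token] = clean[name]
            clean[name] = token
    return clean


def deanonymize(header, key_table):
    """Inverse process: restore original values for holders of key_table."""
    restored = dict(header)
    for name in IDENTIFYING_FIELDS:
        value = restored.get(name)
        if value in key_table:
            restored[name] = key_table[value]
    return restored
```

Keeping the key table apart from the released data is the design point: whoever receives only the anonymized headers cannot reverse the process, while the originating site can.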

The second objective of the imaging subgroup is to develop an online help tool called GUIDE (GUidelines for Image Development Environment), which will be part of the VPH ToolKit portal. The purpose of this tool is to guide users—developers, researchers and clinicians—in choosing the proper biomedical image analysis tools for their work (software, libraries, etc.), and to provide support enabling their sharing and open use. Giving a long list of tools, as, for example, the IdoImaging.com website does, is not enough because image analysis is very problem-dependent. GUIDE will therefore also expose use cases and analysis scenarios to clarify the relevance of the suggested tools. The requirements for GUIDE have been established (VPH-NoE Consortium 2009c). Each problem submission in GUIDE, by query or by a decision tree, will expose the relations between tools and use cases. The categorization of use cases with an ontology will be studied. GUIDE will also allow users to leave comments or questions about use cases, which will increase their relevance and their application domain definition. In this framework, an expected side effect of GUIDE is the creation of a community around use cases and associated discussions, a synergy between use case authors and GUIDE users.

(c) High-performance computing

Vast amounts of computing power are necessary to bridge the scale gap between the possible levels of description of biological systems from molecules to organisms (Kohl & Noble 2009). However, the multi-scale nature of the VPH also means that a multiplicity of resources are required, from the desktop to the most powerful supercomputers in the EU and beyond.

Within the EU, available computational infrastructures include EGEE (http://www.eu-egee.org/), providing low-end clusters, and DEISA (http://www.deisa.eu/), providing supercomputer-class resources. The VPH-NoE has obtained access to both of these infrastructures for VPH-I researchers: to EGEE through the EGEE Biomedical Virtual Organization and to DEISA through a ‘Virtual Community’ allocation. Two million CPU hours have been awarded for 2009, and are being used by more than half of the current VPH-I projects. These allocations are being managed by the VPH-NoE through a community model, whereby each VPH-I project that wants to use an allocation nominates a ‘community champion’ who acts as a contact point between their project users, the VPH-NoE and the grid resource provider. This model removes the need for individual VPH-I projects to apply for and manage their own individual allocations on an annual basis.

To facilitate access to these infrastructures in a way that is transparent to the end user, the ToolKit also includes the Application Hosting Environment (AHE; Zasada & Coveney 2009). Its use of standards-compliant submission mechanisms means that it provides a single interface to grids running a multitude of different middleware stacks, without requiring users to learn specific details about how to access each HPC resource that they want to use. The AHE is designed to allow scientists quickly and easily to run unmodified, legacy applications on grid resources, manage the transfer of files to and from the grid resource, and monitor the status of the application. An expert user installs the application and configures the AHE server, so that all participating users (including clinicians) can share the same application.

VPH members can also leverage the power of two new-generation computing technologies, exemplified by GPUGRID (Buch et al. 2010). This uses accelerated processors such as graphics processing units (GPUs) for large-scale molecular dynamics (MD) simulations (Giupponi et al. 2008). Doing so requires specially developed computer code designed to run on the highly parallel GPU architectures; in the case of GPUGRID, the ACEMD program (Harvey et al. 2009) is used. GPUGRID is also a distributed computing project, which builds on the contribution of thousands of volunteers worldwide. Volunteers are often attracted by the possibility of contributing concretely to the achievement of a scientific goal, such as a better understanding of one specific disease. Dissemination strategies and communication of the projected impact therefore become vital to the visibility of such scientific projects and contribute to the public perception of the societal role of science (Rinaldi 2009). Among other uses, GPUGRID is employed within a VPH-NoE exemplar project to quantify potentially adverse side effects of drugs on the heart.

The VPH-NoE is also deploying workflow tools for use by VPH researchers and making exemplar workflows available through the VPH ToolKit. Workflows are essential to the VPH-I effort: the goal of VPH research is to integrate simulations at different levels, which is essentially a workflow scenario. There are many such workflow tools available, but the one that most closely meets the needs of the current VPH-I projects is GSENGINE (Malawski et al. 2008), developed in the EU FP6 ViroLab project. It is coupled with a workflow repository system, meaning that workflows can be developed by expert researchers and then executed by clinicians and other researchers for given patients.
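The idea that integrating simulations at different levels is ‘essentially a workflow scenario’ can be illustrated with a minimal sketch in which each step consumes the outputs of the previous one. This is not how GSENGINE works and does not use its API; the step functions, their names and the arithmetic inside them are arbitrary placeholders with no physiological meaning.

```python
def run_workflow(steps, inputs):
    """Run steps in order; each step reads the shared state and returns updates."""
    state = dict(inputs)
    for step in steps:
        state.update(step(state))
    return state


# Toy stand-ins for simulations at different scales. The formulas are
# arbitrary placeholders chosen only to show data flowing between steps.
def molecular_step(state):
    return {"molecular_output": state["patient_parameter"] * 0.5}


def cell_step(state):
    return {"cell_output": 300.0 + state["molecular_output"]}


def organ_step(state):
    return {"organ_output": state["cell_output"] * 2.0}


# A workflow authored once by an expert can then be re-run for each patient
# simply by supplying different inputs.
WORKFLOW = [molecular_step, cell_step, organ_step]
```

Separating the workflow definition (`WORKFLOW`) from its inputs mirrors the division of labour described above, in which expert researchers develop workflows and clinicians execute them for given patients.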

5. Discussion and conclusion

The VPH initiative is a powerful vision that, if achieved, will have a tremendous impact on the life of our citizens, with great potential for providing real human benefit. The field is vibrant, but the challenges of multi-scale multi-system research are huge. Central to this is the need for integration. Application of biomedical research outputs to clinical practice and the healthcare industries requires integrated data, information, knowledge and wisdom.

Within and beyond the VPH-NoE, significant effort has already been spent in developing technologies for the support of VPH research. To date, existing work has focused on specific research goals, but its adaptation for wider use promises greater benefits. However, such an approach in a wide context presents a significant challenge. In particular, there is a need to identify and promote relevant architectures, ontologies, data models, interfaces and procedure designs, to decrease the cost of the development of new advanced tools profiting from legacy software.

In this paper, we have argued that VPH ToolKit development must be organized around the community development of core standards, and this has been, and continues to be, its focus within the VPH. Sharing knowledge and expertise is vital, and to this end the ToolKit portal website is anticipated to be a key resource for the community. The first step towards an integrated environment for VPH collaboration has been rewarding but demanding. Equally, it is clear that many more challenging steps remain.

Acknowledgements

The VPH-NoE was supported by the European Commission DG Information Society through the Seventh Framework Programme of Information and Communication Technologies, under grant number 223920. The authors thank the other members of the VPH-NoE consortium for their contributions to the ToolKit.
