iViS - in Vivo - in Silico:

The Virtual Worm, Weed and Bug

Breathing Life into the Biological DataMountain

A GRAND CHALLENGE FOR COMPUTATIONAL SYSTEMS BIOLOGY

Date: 14th May 2004
Moderator: Ronan Sleep
School of Computing Sciences, University of East Anglia, Norwich NR4 7TJ, mrs@cmp.uea.ac.uk

We routinely use massively powerful computer simulations and visualisations to design aeroplanes, build bridges and predict the weather. With computer power and biological knowledge increasing daily, perhaps we can apply advanced computer simulation techniques to realise computer embodiments of living systems. This is the futuristic proposition of a research challenge proposed by UK computer scientists. The project, called in Vivo - in Silico (iViS), aims to realise fully detailed, accurate and predictive computer embodiments of plants, animals and unicellular organisms. The iViS challenge developed from a proposal to UKCRC by David Harel to model a complete multi-cellular organism, for example the nematode worm C. elegans [1, 2], and from Ronan Sleep's proposal to attempt computational modelling of gastrulation.

Initially the aims will be restricted to simple and much studied life-forms such as the nematode worm, the humble weed Arabidopsis, and single cell organisms such as Streptomyces and baker's yeast: hence the subtitle 'the virtual worm, weed and bug'. Rather remarkably, human genes can replace about 35% of the worm's, so there is reason to hope that many of the complex processes involved in a worm developing both form and function from a single cell may help us understand much about the workings of higher life-forms.

Potential benefits of iViS include an understanding of regeneration processes in plants and animals, with potentially dramatic implications for disease and accident victims. But iViS may also lead to revolutionary ways of realising complex systems: instead of designing and programming in excruciating detail, perhaps we can just grow them in a suitable medium. We know it's possible, because that's exactly what nature does with the worm, the weed and the bug.

The Vision

iViS offers a powerful vision in which future life scientists can take virtual fly-through tours of a plant, animal or colony of cells, studying what is happening at scales ranging from the whole life-form down to the inside of an individual cell, and stretching or compressing the passage of time. Filters control what is seen by the observer, allowing concentration on specific aspects such as cell division, motility or chemical potential.

iViS is an attractive way of browsing all that we know about a life-form. But it may offer more: with sufficient effort, it might be possible to raise the faithfulness of the underlying model to the point where it becomes predictive as well as descriptive. If this happens, it will become possible to perform meaningful observations and experiments in Silico. Specifically, an iViS model should faithfully exhibit the following phenomena: development from an initial fertilized cell to a full adult, cell function and interaction, motility and sensory behaviour, including possible interactions with other life-forms. Virtual experiments (e.g. moving a virtual cell during development) should lead to the same outcomes as real life.

iViS and the Life Science Data Mountain

Computers are vital to the Life Sciences: they record, process, visualise and automatically distribute data, and even design and run experiments. They analyze the results in terms of large statistical and other mathematical models of biological processes.

But what do all the numbers, graphs, and spaghetti-like diagrams that emerge from the latest experiments mean, and what can we do with this knowledge? Making it all fit together into a coherent and useful picture presents a major challenge - many biologists would say the major challenge - facing the life sciences. It takes years of experience to gain familiarity with even one sort of biological data, let alone the many new forms emerging. We urgently need integrating frameworks for organising the data.

This problem is now so pressing that the UK's Biotechnology and Biological Sciences Research Council (BBSRC) is establishing a number of Centres for Integrative Systems Biology in partnership with relevant universities. These Centres will need the vision, breadth of intellectual leadership and research resources to integrate traditionally separate disciplines in a programme of international quality research in quantitative and predictive systems biology. iViS offers a challenging focus of attention for such centres.

iViS as a Driver for Global Knowledge Integration

Part of the answer to the data mountain may lie in the way in which the world wide web is revolutionising our approach to knowledge organisation. Before the Web (BW), the internet was used mainly to exchange emails and files of raw data. The web revolutionised the potential of the internet by allowing data owners to advertise the availability of their data together with information about how to display it in a browser window - this is what HTML does. Today's search engines mean that we can browse and access knowledge from a truly global repository within seconds.

The web is already a vital window on the world for scientists wishing to remain up to date. Groups of laboratories that previously worked at arm's length, communicating infrequently and only via journals and the conference circuit, now converse via the web within seconds, swapping massive datasets to compare results. Scientists have begun to exploit the web by establishing global Virtual Knowledge Repositories (VKR) that share data, theories and models.

An example of a VKR is PHYSIOME (http://www.physiome.org/), which supports "the databasing of physiological, pharmacological, and pathological information on humans and other organisms and integration through computational modelling. 'Models' include everything from diagrammatic schema, suggesting relationships among elements composing a system, to fully quantitative, computational models describing the behaviour of physiological systems and an organism's response to environmental change. Each mathematical model is an internally self-consistent summary of available information, and thereby defines a 'working hypothesis' about how a system operates. Predictions from such models are subject to test, with new results leading to new models".

Virtual Knowledge Repositories like Physiome will help reduce the proliferation of models and theories that explain parts of the global mountain of life science data. But this in turn will create a deeper challenge: instead of fitting raw data pieces together, we will be faced with the problem of making the models fit into a consistent larger model. Sometimes this will be easy, for example when there is a simple input-output relationship between subsystems. More often - perhaps the rule rather than the exception - combining two models will show unexpected interactions inconsistent with in vivo data.

Part of the problem is that mathematical models deal in numbers, devoid of meaning. The latest evolution of web technology - the semantic web - may help fix this. There is now provision for the web to enhance raw data with additional information called metadata. This can tell the recipient what the data means, how it is represented, and the way in which it was generated. Models, which often come in the form of a computer program, can be tagged with metadata which is effectively an inbuilt instruction manual.
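As a rough illustration of why metadata matters when combining data from different sources, the Python sketch below attaches a meaning, a unit and a provenance note to each number. The Measurement class, its field names and the sample values are hypothetical and are not drawn from any real ontology or semantic web standard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Measurement:
    """A number plus the metadata needed to interpret it (hypothetical schema)."""
    value: float
    quantity: str  # what the number means, e.g. "temperature"
    unit: str      # how it is represented, e.g. "degC" or "K"
    method: str    # how it was generated


def to_kelvin(m: Measurement) -> float:
    """Convert a temperature measurement to kelvin, using its metadata."""
    if m.quantity != "temperature":
        raise ValueError(f"not a temperature: {m.quantity}")
    if m.unit == "K":
        return m.value
    if m.unit == "degC":
        return m.value + 273.15
    raise ValueError(f"unknown unit: {m.unit}")


# Two invented records that look unrelated as raw numbers (20.0 versus 293.15).
lab_record = Measurement(20.0, "temperature", "degC", "hypothetical incubator log")
model_input = Measurement(293.15, "temperature", "K", "hypothetical simulation parameter")

# With metadata, a program can tell that the two records agree.
records_agree = abs(to_kelvin(lab_record) - to_kelvin(model_input)) < 1e-9
print(records_agree)
```

Without the quantity and unit fields, a program receiving only 20.0 and 293.15 would have no basis for deciding whether the two sources agree.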

Work has begun in earnest on establishing metadata dictionaries (called ontologies) for the life sciences. Already there are over 40 of them: see http://obo.sourceforge.net/ for a list. So the drive and energy to create bio-ontologies is already very active. But there is not the same drive to draw them together into a unified whole. The iViS challenge provides just such a drive, because the in Silico modelling of a complete life-form will require harmonious working across all relevant ontology boundaries.

Even if we can build a simulation of a life-form that successfully integrates all known data and models, it is not at all clear that genuinely interesting new predictions will emerge. The process may suffer from something called overfitting: yes, we can build a model consistent with known data, but it may lack any predictive power. For example, we can always find a polynomial of degree at most N-1 that fits N data points (with distinct x-values) exactly, yet such a polynomial may say little about points it has not seen. Avoiding such overfitting is a very strong reason for complementing data-driven modelling work on iViS with more abstract top-down approaches.
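The polynomial example can be made concrete with a short Python sketch. It builds the interpolating polynomial through six invented, slightly noisy observations of a roughly linear trend, then asks it for a value outside the observed range. The data values and the function name lagrange_interpolate are illustrative assumptions, not part of the iViS proposal.

```python
def lagrange_interpolate(points, x):
    """Evaluate at x the unique polynomial of degree <= len(points) - 1
    that passes through every (xi, yi) pair in points (distinct xi)."""
    total = 0.0
    for i, (xi, yi) in enumerate(points):
        term = yi
        for j, (xj, _) in enumerate(points):
            if i != j:
                term *= (x - xj) / (xi - xj)
        total += term
    return total


# Invented noisy observations of an underlying trend y = x.
observations = [(0, 0.0), (1, 1.2), (2, 1.8), (3, 3.3), (4, 3.9), (5, 5.2)]

# The degree-5 polynomial reproduces every observation exactly...
worst_fit_error = max(
    abs(lagrange_interpolate(observations, xi) - yi) for xi, yi in observations
)

# ...but its value just outside the data is far from the trend value of 7.
prediction = lagrange_interpolate(observations, 7)

print("largest error on known data:", worst_fit_error)
print("prediction at x = 7:", prediction)
```

The model is perfectly consistent with the known data and still misleading away from it, which is the sense in which an all-inclusive data-driven model may lack predictive power.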

But even with a careful mix of approaches, can iViS ever realise its controversial predictive aims? Computer simulation may work for aeroplanes and bridges, but maybe living systems are just too complex for this to work. One answer is to accept that there will be aspects of some biological domains which remain well outside the scope of even the most sophisticated computer modelling for many decades - perhaps for ever. But there will be at least some domains which succumb to iViS's whole life-form modelling approach: developmental biology looks a good bet.

Meeting the Challenge: iViS research themes

The obvious targets for iViS models are the organisms selected for special attention by biologists for over a century. These range from single cell life-forms such as yeast or Streptomyces, through model plants such as Arabidopsis and maize, to creatures such as the nematode worm, the fruit fly, and the squid.

But how can we 'breathe life into data' via computer simulation? This is not simply a question of computing speed or memory, but how to represent the mass of known data as a set of interacting computational processes. We can get computers to simulate massively complex aircraft or bridges, but getting them to 'grow a worm, weed or bug' is significantly beyond the current state of the art.

Nevertheless, it may not be impossible if we build determinedly on the considerable body of work underway to explore ways of organising life science data. One example is the Edinburgh Mouse Atlas Project (http://genex.hgu.mrc.ac.uk/):

"The emap Atlas is a digital Atlas of mouse embryonic development. It is based on the definitive books of mouse embryonic development, yet extends these studies by creating a series of interactive three-dimensional computer models of mouse embryos at successive stages of development with defined anatomical domains linked to a stage-by-stage ontology of anatomical names."

It can be expected that growing numbers of life science virtual knowledge centres will follow EMAP in adopting some form of spatio-temporal framework. The role of iViS is to expand this vision to a dynamic 3-D working model, initially targeting much simpler life-forms.

There are a number of research strands in the Computing Sciences needed to support the aspirations of iViS. We might bundle them under the heading: Computational Models and Scaleable Architectures for in Silico Life Sciences. Some strands will work bottom-up, paying great attention to biological data. Other strands will work top-down, studying minimal abstractions capable of generating the phenomena exhibited in vivo. Many will work 'middle-out', balancing the desire to be simple, elegant and general with the desire to be faithful to the data.

Key to success will be the development of a new breed of computer languages for representing and manipulating biological data in a meaningful way, and for using it to drive a realistic, highly detailed simulation which can be explored using advanced interfaces.

Groups of Computer Scientists are already exploring new languages, architectures, and system design and analysis tools for the life sciences. See for example Luca Cardelli of Microsoft ( http://www.luca.demon.co.uk/ ) for a good picture of this work. Cardelli and others are tackling the complexities of life science head on, developing industrial-quality models aimed at handling the masses of detail in a living system.

These efforts are complemented by those looking to discover simpler, more abstract computational systems from which various life-like properties emerge. Such abstract models are particularly useful when viewing some particular aspect of a plant, animal or cell. For example Prusinkiewicz (http://www.cpsc.ucalgary.ca/Research/bmv) has almost created a new art form for constructing good-looking pictures of plant growth, using remarkably simple abstract models called L-systems. Of course these models capture perhaps only a tiny part of the truth, but iViS may need such simplifying frameworks to structure the great mass of detail.
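To give a feel for how little machinery an L-system needs, here is a minimal Python sketch of its core step: every symbol in a string is rewritten in parallel according to a fixed rule table. The function name expand is an assumption made for illustration. The first rule set is Lindenmayer's well-known algae example; the bracketed rule is a typical branching pattern of the kind used for plant pictures, shown here only as a string and without the drawing step.

```python
def expand(axiom, rules, generations):
    """Apply the rewriting rules to every symbol in parallel, repeatedly.
    Symbols without a rule are copied unchanged."""
    current = axiom
    for _ in range(generations):
        current = "".join(rules.get(symbol, symbol) for symbol in current)
    return current


# Lindenmayer's algae model: A -> AB, B -> A.
ALGAE_RULES = {"A": "AB", "B": "A"}

# Illustrative branching rule: F is a stem segment, [ and ] open and close
# a side branch, + and - turn left and right when the string is drawn.
BRANCH_RULES = {"F": "F[+F]F[-F]F"}

for generation in range(5):
    print(generation, expand("A", ALGAE_RULES, generation))

print(len(expand("F", BRANCH_RULES, 3)))
```

A renderer would interpret the branching string as drawing commands. The point of the sketch is only that a one-line rule table, applied repeatedly, generates structure that grows quickly in size and detail.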

What about raw computer power and advances in haptic and other interface technologies? Of course they will be needed: but some will emerge anyway from the IT industry without special prompting. The critical problem is to get the underlying computational models right: once we have these, we can begin - as it were - serious work on designing the building and sorting out exactly which of the contemporary technologies we should use to actually construct a real iViS prototype.

Demonstrators and Outline Roadmap

iViS progress will surface as a number of key demonstrators, culminating in whole life-form models covering a wide range of phenomena. Intermediate demonstrators will cover a narrower range. Modelling the emergence of form during development is one example, motivated by the following quote:

Perhaps no area of embryology is so poorly understood, yet so fascinating, as how the embryo develops form. Certainly the efforts in understanding gene regulation have occupied embryologists, and it has always been an assumption that once we understand what building blocks are made, we will be able to attack the question of how they are used. Mutations and gene manipulations have given insight into what components are employed for morphogenesis, but surely this is one example where we need to use dynamic imaging to assess how cells behave, and what components are interacting to drive cell movements and shape changes [1]

A number of groups are already involved in developmental modelling - see the iViS website (???). A rather speculative timeline is:

Within 5 years: early results on e.g. developmental phenomena in plants and animals, and first unicellular demonstrations.

Within 10 years: first prediction of a textbook result from an assembly of component models; models of Arabidopsis meristem growth; models of simple animal development.

Within 13 years (2017): the D'Arcy milestone (100 years after the publication of D'Arcy Thompson's book "On Growth and Form"): first demonstration of iViS (e.g. for Arabidopsis).

Within 20 years: iViS models form the core of many Virtual Knowledge Centres for the life sciences.

References

[1] D. Harel, "A Turing-Like Test for Biological Modeling", Nature Biotechnology 23 (2005), 495-496.
[2] D. Harel, "A Grand Challenge for Computing: Full Reactive Modeling of a Multi-Cellular Animal", Bulletin of the EATCS, European Association for Theoretical Computer Science, no. 81, 2003, pp. 226-253. (Reprinted in Current Trends in Theoretical Computer Science: "The Challenge of the New Century", Algorithms and Complexity, Vol.1 (Paun, Rozenberg and Salomaa, eds.), World Scientific, pp. 559-568, 2004).