Speaking through Graphs
(August 25th, 2016) Biologists often use pathway diagrams to communicate results, but methods for drawing diagrams are about as plentiful as biologists. Researchers at the University of Edinburgh have developed a scheme that combines graphical notation with pathway simulation.
One of the greatest challenges in biology today is making sense of the vast amounts of data obtained through genome-scale analyses. The advent of sequencing technology has made gathering data on molecule interactions much easier, but biologists use many different methods and standards for communicating their results.
They often draw diagrams, but in most cases these serve only to enhance an accompanying text; viewed independently, they make only vague sense. There is no universal graphical language that makes biological pathway diagrams comprehensible on their own.
Geneticist Tom Freeman and colleagues at the Roslin Institute and Royal (Dick) School of Veterinary Studies at the University of Edinburgh have developed a graphical language, termed the modified Edinburgh Pathway Notation (mEPN), which they believe will help biologists consolidate and communicate their data via diagramming. The system they developed was outlined recently in a paper in PLoS Biology, and examples of their work are available on their website. A tool for performing pathway simulations is also described.
“In general, there isn’t really any formalised way that we draw diagrams today, so every person who wants to make a review or show how their results reflect the known literature will draw them the way they feel that they want to,” says Freeman. “The problem with this is that none of the drawings they make are comparable because the way in which they represent things is totally different.”
In the past few years, there has been a movement to formalise the way in which biologists depict the proteins, complexes, and biochemicals that make up a biological system. According to Freeman, the Systems Biology Graphical Notation (SBGN) project, established in 2009, was one of the first to propose a set of standard ways to draw diagrams. However, SBGN is not without its problems. “It doesn’t allow you to represent a lot of things that you might want to represent, and the way that it’s composed isn’t particularly biologist-friendly,” says Freeman. The underlying language, for example, is rooted primarily in computer science.
Although mEPN borrows concepts from SBGN, it is more user-friendly and includes key modifications, including the ability to use diagrams to simulate a pathway’s activity, which Freeman believes will make it much more valuable for biologists. “I think the great beauty of combining the graphical modelling with activity modelling is that we can not only describe what we think we know, we can test whether the system could work as represented. This then allows us to better design the next experiment and explain the results once obtained,” he says.
In developing mEPN, Freeman and colleagues aimed to include as much detail in the system as possible while maintaining readability. The project has been in the works for nearly 10 years. In addition to collaborating with scientists, Freeman has been working with undergraduate students. He gives them the tools to draw a given system and then sees where they run into trouble, using their feedback to modify the language so that it better suits its purpose.
The graphical representation system consists of a series of nodes and edges. The nodes represent either pathway entities, such as biomolecules, or processes: events such as binding or phosphorylation that occur among the entities. Each pathway entity is represented by a glyph of a particular shape, while each process is shown as a circle with a two- to three-character label. The edges are lines joining the nodes; they show the type of action (catalysis or inhibition, for example) and the direction in which it occurs.
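The node-and-edge structure described above can be sketched as a small data structure. This is only an illustrative sketch, not the mEPN specification: the class names, the "BND" process label, the "flow" action and the hormone/receptor example are all assumptions made up for this example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Node:
    name: str
    kind: str          # "entity" (e.g. a biomolecule) or "process" (e.g. binding)
    label: str = ""    # short process label; "BND" below is a made-up placeholder


@dataclass(frozen=True)
class Edge:
    source: str
    target: str
    action: str = "flow"   # hypothetical action types: "flow", "catalysis", "inhibition"


@dataclass
class PathwayDiagram:
    nodes: dict = field(default_factory=dict)
    edges: list = field(default_factory=list)

    def add_node(self, name, kind, label=""):
        if kind not in ("entity", "process"):
            raise ValueError(f"unknown node kind: {kind}")
        self.nodes[name] = Node(name, kind, label)

    def add_edge(self, source, target, action="flow"):
        # Edges are directed and may only join nodes that already exist.
        for end in (source, target):
            if end not in self.nodes:
                raise KeyError(end)
        self.edges.append(Edge(source, target, action))

    def inputs_of(self, process):
        return [e.source for e in self.edges
                if e.target == process and e.action == "flow"]

    def outputs_of(self, process):
        return [e.target for e in self.edges if e.source == process]


def build_example():
    # Hypothetical pathway fragment: a hormone binds its receptor.
    d = PathwayDiagram()
    d.add_node("hormone", "entity")
    d.add_node("receptor", "entity")
    d.add_node("hormone:receptor", "entity")
    d.add_node("binding", "process", label="BND")
    d.add_edge("hormone", "binding")
    d.add_edge("receptor", "binding")
    d.add_edge("binding", "hormone:receptor")
    return d
```

Keeping entities and processes as two kinds of node, rather than drawing entity-to-entity arrows, means every interaction passes through an explicit process node that can carry its own label and typed incoming edges.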
“We’ve gotten to the point where I think we have a graphical language that works quite well, that is hopefully quite intuitive to biologists,” says Freeman. “The next step was looking at how we might use these models for simulating the activity of pathways.”
To simulate this activity, Freeman and colleagues used a mathematical modelling concept known as a Petri net, a formalised method of representing systems as varied as business processes and software design. The researchers employed a stochastic algorithm that allows them to simulate flow through a system, and then modified that algorithm to run in their software.
“The actual way of combining that graphical representation with a computational modelling system that allows you to not only just draw what it is that you know but to run simulations through them allows a different level of complexity to be understood,” says Freeman.
“So if you start with, let’s say, a hormone, and it binds to its receptor, you can then follow the flow of that information through the receptor, down through the signalling pathway, maybe leading on to transcription,” says Freeman. The simulation capability also allows users to create virtual knock-outs and to adjust starting conditions to determine the impact of a particular entity.
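The idea of flow through a Petri net, including virtual knock-outs and adjusted starting conditions, can be illustrated with a toy example. This is a minimal sketch under stated assumptions: the article does not name the specific stochastic algorithm the researchers used, so the random choice among enabled transitions below is a stand-in, and the place names, transition names and token counts are hypothetical. As a simplification, every process consumes its inputs, including the transcription factor.

```python
import random

# Each transition (process) consumes one token from every input place
# and produces one token in every output place. All names are hypothetical.
TRANSITIONS = {
    "binding": (["hormone", "receptor"], ["complex"]),
    "signalling": (["complex"], ["active_tf"]),
    "transcription": (["active_tf"], ["mrna"]),
}


def enabled(marking, transitions):
    """Return the transitions whose input places all hold at least one token."""
    return [name for name, (inputs, _) in transitions.items()
            if all(marking.get(place, 0) > 0 for place in inputs)]


def simulate(initial, transitions=TRANSITIONS, knockouts=(), steps=100, seed=0):
    """Fire randomly chosen enabled transitions until none remain or steps run out."""
    rng = random.Random(seed)
    marking = dict(initial)
    for place in knockouts:
        marking[place] = 0            # virtual knock-out: remove the entity
    for _ in range(steps):
        ready = enabled(marking, transitions)
        if not ready:
            break
        inputs, outputs = transitions[rng.choice(ready)]
        for place in inputs:
            marking[place] -= 1
        for place in outputs:
            if place not in knockouts:    # a knocked-out entity is never produced
                marking[place] = marking.get(place, 0) + 1
    return marking
```

A possible usage: `simulate({"hormone": 5, "receptor": 3})` runs the unperturbed pathway, `simulate({"hormone": 5, "receptor": 3}, knockouts=["receptor"])` runs the same pathway with the receptor knocked out, and changing the token counts in the initial marking adjusts the starting conditions. Comparing the final `mrna` counts shows the impact of the perturbed entity.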
Freeman hopes that making mEPN available will encourage more biologists to adopt pathway modelling. He says he has found it very useful for formalising information and unearthing inconsistencies. “You can read as many reviews as you want, but your understanding of how the system works is in your head,” he says. Drawing a diagram using a set of rules allows biologists to explore the limits of their knowledge.
Upon diagramming, he says, researchers will often find that a vital piece of information is missing, or see that a system can’t work the way they thought it did. “Part of the advantage of doing this is the modelling process itself - the whole process of formally describing how a system works is incredibly informative,” he says. “In so doing, you realise not only what you do know, but what you don’t know.”
The researchers wanted the system to be approachable to junior biologists, since in many cases they would be the ones responsible for making up the diagrams. Besides involving his undergraduate students in its development, Freeman finds it an effective teaching tool: “It’s a nice way in which to get any student to sit down and mine the literature, and to formalise what they understand from the literature as a diagram.”
Freeman acknowledges that drawing diagrams is not intuitive for most people and can initially be a challenge: many biologists aren’t accustomed to making diagrams using a particular set of rules, and there is a certain artistry involved. Still, he believes it can be learned by most within a matter of weeks.
“I wouldn’t suggest that what we have proposed is particularly easy to adopt,” he says. “It’s not something that people have been trained to do, and therefore it’s a challenge.” Converting a system as it’s described in the literature to a graphical model can be tricky conceptually, and key bits of information will often be missing. It can also be tempting to start wandering off into different systems.
But Freeman maintains that the benefits of diagramming for formalisation and collaboration far outweigh the effort required to adopt the system. “If it wasn’t challenging, this would have been done years ago,” says Freeman. “But I think it’s vital that we do this, because actually the rewards for doing it are quite high.”