Research Interests Summary
My current research focuses on the use of informatics techniques and tools to integrate and interpret life science data across a variety of domains. In particular, I am interested in connecting biological models at multiple scales: the biochemical, molecular, cellular, organismal and ecological.
Genome sequencing is getting faster and cheaper, yet we still lack a full picture of how genes and other DNA elements interact with the environment to specify the development and functioning of a complex organism. Even for well-characterized model systems, the bulk of our biological knowledge is represented in forms opaque to computational processing (journal articles, reviews, etc.). The Gene Ontology (GO) project was established to systematize this knowledge to allow for automated computational inference, for example, predicting the function of human genes based on phylogeny, or interpreting the expression patterns of genes regulated in diseases. I currently manage the GO software group, which produces software, resources and standards such as AmiGO, the GO database, OBO-Edit, TermGenie, the GO Galaxy Environment, and the obo-format specification.
I also contribute to the development of the ontology itself, in particular through the use of logic-based techniques to enhance it, improve automated inference and integrate the GO with other resources.
Disease Phenotype Informatics
Our group is interested in the use of data mining and ontology-based inference to help elucidate the molecular basis of disease phenotypes. Model systems provide a powerful way to understand how genome mutations can give rise to phenotypes in humans, yet making use of this complex, heterogeneous data remains a challenge.
I am the creator of the OWLSim algorithm, which allows for the computation of the similarity of two organisms in phenotype space. I also developed the OBD database and reasoning system which is geared towards phenotype-based search. The methods are described in the paper Linking Human Diseases to Animal Models using Ontology-based Phenotype Annotation, and OWLSim analyses are used in tools such as mousefinder.
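To give a flavour of what comparing organisms in "phenotype space" can mean, here is a minimal Python sketch. It is not the OWLSim algorithm itself: it assumes a simple Jaccard measure over ontology ancestor closures, and the toy ontology, term names and function names are all hypothetical, chosen only for illustration.

```python
from typing import Dict, Set

# Hypothetical toy ontology: each term maps to its direct is_a parents.
PARENTS: Dict[str, Set[str]] = {
    "phenotype": set(),
    "eye phenotype": {"phenotype"},
    "small eye": {"eye phenotype"},
    "absent eye": {"eye phenotype"},
    "heart phenotype": {"phenotype"},
    "enlarged heart": {"heart phenotype"},
}


def ancestors(term: str) -> Set[str]:
    """Return the term itself plus all of its transitive is_a ancestors."""
    seen: Set[str] = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(PARENTS[current])
    return seen


def closure(profile: Set[str]) -> Set[str]:
    """Expand a phenotype profile to every term it implies in the ontology."""
    expanded: Set[str] = set()
    for term in profile:
        expanded |= ancestors(term)
    return expanded


def jaccard_similarity(profile_a: Set[str], profile_b: Set[str]) -> float:
    """Shared implied terms divided by all implied terms (assumed measure)."""
    closure_a, closure_b = closure(profile_a), closure(profile_b)
    union = closure_a | closure_b
    return len(closure_a & closure_b) / len(union) if union else 0.0


human_disease = {"small eye"}
mouse_model = {"absent eye"}
unrelated_model = {"enlarged heart"}

print(jaccard_similarity(human_disease, mouse_model))      # 0.5
print(jaccard_similarity(human_disease, unrelated_model))  # 0.2
```

The point of the sketch is that two profiles with no annotated term in common can still score as similar, because the ontology supplies the shared, more general terms.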
In a collaboration with the Neurosciences Information Framework (NIF) project, I developed a Phenotype Knowledge Base (PKB) system that allows neurodegenerative diseases to be automatically matched to model systems based on phenotypes in common. The system integrates phenotypes that are manifest at different scales, from the molecular up through the cellular to the level of gross neuroanatomy.
Our group is currently extending these techniques to determine the individual contributions of genes to phenotypes in Copy Number Variation (CNV) diseases.
Anatomics, Evolution and Development
I am the co-developer of Uberon, an integrative metazoan anatomy ontology that unifies multiple independent species-centric ontologies and terminologies. I have also developed a number of extension ontologies, such as SPONGEBO (the sponge basic ontology).
I collaborate with the Phenoscape project on the development of ontology-based tools to integrate evolutionary systematics data with genome-phenotype data from model organisms. One of the projects I am working on is a formalization of homology that operates at both the gene level and organ level.
A unified computable representation of biology
One of my long-term research goals is to render the bulk of biological and medical knowledge computable, allowing for a new generation of intelligent bioinformatics tools. Ontologies represent a modest first step along this path, but existing ontologies provide only shallow, disconnected fragments of biology. One of the first challenges is coordinating the development of multiple ontology-based projects to ensure that the sum total provides as accurate, consistent and non-redundant a picture of biology as possible. I am a co-founder and coordinating editor of the Open Bio-Ontologies Foundry, which was initiated to further this goal.
My current efforts focus on integrating:
- the Gene Ontology; see Mungall et al. (JBI)
- the Cell Ontology; see Meehan et al.
- Phenotype ontologies (human, mouse, worm, zebrafish, fly); see Mungall et al. (Genome Biology)
- Environmental and ecological ontologies (ENVO, PO)
Logic-based modeling and programming
Whilst the majority of useful biological inferences are fuzzy or probabilistic, I believe that a firm foundation in logic is necessary for informatics applications and biological modeling.
The Web Ontology Language (OWL) provides a restricted subset of first-order logic that is extremely useful for ontology development. I am currently working on opening up the capabilities of OWL to a number of different ontologies (for example, through Oort). I am particularly interested in the EL subset and rule-based extensions.
I am a proponent of logic-based languages such as Prolog, which allow for simple integration of multiple paradigms, including relational databases, ontologies, natural language processing and procedural programming. I developed a Prolog grammar-based system called Obol (and its successor Shoge), which has proved essential for parsing ontology terms to determine their inherent semantics.
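The idea of parsing a term label to recover its semantics can be sketched roughly as follows. This is not Obol and does not use a Prolog grammar: it is a small Python approximation using regular expressions, and the patterns, genus names and relation names are hypothetical, chosen only to illustrate the idea.

```python
import re
from typing import Dict, Optional

# Hypothetical label patterns: (regex, genus, relation). Order matters, since
# the more specific "negative/positive regulation" patterns must be tried first.
PATTERNS = [
    (re.compile(r"^negative regulation of (.+)$"), "biological regulation", "negatively_regulates"),
    (re.compile(r"^positive regulation of (.+)$"), "biological regulation", "positively_regulates"),
    (re.compile(r"^regulation of (.+)$"), "biological regulation", "regulates"),
    (re.compile(r"^(.+) differentiation$"), "cell differentiation", "results_in_acquisition_of_features_of"),
]


def parse_term(label: str) -> Optional[Dict[str, str]]:
    """Decompose a term label into a genus, a relation and a filler term.

    Returns None when no pattern applies, i.e. the label is treated as atomic.
    """
    for pattern, genus, relation in PATTERNS:
        match = pattern.match(label)
        if match:
            return {"genus": genus, "relation": relation, "filler": match.group(1)}
    return None


print(parse_term("negative regulation of cell growth"))
print(parse_term("neuron differentiation"))
print(parse_term("mitochondrion"))  # None: no compositional structure found
```

A grammar-based system can go further than this sketch, for instance by parsing the filler recursively ("positive regulation of neuron differentiation") and checking each part against existing ontology terms.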
Like many bioinformaticians of my era, I have done my fair share of Perl hacking. I was previously a BioPerl developer, and have written a handful of modules on CPAN. I have contributed software to the Generic Model Organism Database project, and am the co-designer of the Chado schema. I am to blame for go-perl and its successor go-moose, as well as obo-scripts.
Mungall, C. J., Torniai, C., Gkoutos, G. V., Lewis, S. E., and Haendel, M. A. (2012). Uberon, an integrative multi-species anatomy ontology. Genome Biology 13, R5.
Mungall, C., Gkoutos, G., Smith, C., Haendel, M., Lewis, S., and Ashburner, M. (2010). Integrating phenotype ontologies across multiple species. Genome Biology 11(1), R2.
Washington, N. L., Haendel, M. A., Mungall, C. J., Ashburner, M., Westerfield, M., and Lewis, S. E. (2009). Linking Human Diseases to Animal Models using Ontology-based Phenotype Annotation. PLoS Biology 7(11).
Mungall, C. J., Bada, M., Berardini, T. Z., Deegan, J., Ireland, A., Harris, M. A., Hill, D. P., et al. (2010). Cross-Product Extensions of the Gene Ontology. Journal of Biomedical Informatics 44(1), 80-86.
Deegan, J., Dimmer, E., and Mungall, C. J. (2010). Formalization of taxon-based constraints to detect inconsistencies in annotation and ontology development. BMC Bioinformatics 11, 530.
Google profile: http://profiles.google.com/cmungall
EOL profile: http://eol.org/users/72083
Mendeley profile: http://www.mendeley.com/profiles/chris-mungall
LinkedIn: http://www.linkedin.com/in/chrismungall
OWL hacking blog: http://douroucouli.wordpress.com/
Logic programming blog: http://blipkit.wordpress.com