Computation Institute Lunch Presentation
"High-throughput Reconstruction and Optimization of 140 New Genome-scale Metabolic Models"
DATE: April 9, 2009
TIME: 12:00 PM - 1:00 PM
SPEAKER: Christopher Henry, Mathematics and Computer Science Division, Argonne National Laboratory
LOCATION: Room A134, Bldg. 221, Argonne National Laboratory; RI405, 5640 S. Ellis Ave., University of Chicago
Description:
One of the desired end products of the genome annotation process is the development of a genome-scale metabolic model of the organism being annotated [1]. These models provide a means of predicting organism growth conditions, gene essentiality, phenotypic capabilities, response to genetic mutations, and metabolic engineering strategies [2]. Despite the great demand for genome-scale metabolic models, the rate of model development has lagged far behind the rate of genome sequencing. To improve the rate of development of new models, we have integrated a biochemical database, a set of thermodynamic estimations [3], a gap filling algorithm [4], and a model optimization algorithm [5] with the SEED framework for updating, correcting, and propagating annotations across hundreds of genomes simultaneously [6]. Within this framework we have implemented an automated pipeline for the high-throughput reconstruction and optimization of genome-scale metabolic models of prokaryotes [7], and we have applied this pipeline to produce functioning models for a diverse set of 140 organisms across 14 bacterial subdivisions. On average, these models comprise 960 reactions associated with 624 genes covering 20% of the organism's genome. Application of the gap filling algorithm resulted in the addition of an average of 83 reactions with no known corresponding genes. Whenever gene essentiality or phenotyping data were available, the model optimization algorithm was applied, producing models with prediction accuracies that exceed 90%.
Analysis of the reactions added by the gap filling process resulted in the following key discoveries: (i) by identifying the biomass components causing a reaction to be added by the gap filling, we were able to refine our biomass definitions for every organism modeled; (ii) by identifying reactions added to many different models by the gap filling algorithm, we were able to determine portions of the metabolism of various genera for which additional annotation, curation, and experimental work is required; and (iii) by applying the completed models to the prediction of essential gene sets, we were able to identify the metabolic functions that were consistently essential in every organism as well as those functions that were essential in only a small subset of organisms.
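As a rough illustration of point (ii), the tally of reactions that gap filling adds to many different models could be sketched as follows. This is a minimal sketch and not the pipeline used in the work: the function name, the `min_fraction` cutoff, and the model and reaction identifiers are all hypothetical, and the toy data is invented purely for demonstration.

```python
from collections import Counter


def recurrent_gapfilled_reactions(gapfilled_by_model, min_fraction=0.5):
    """Return reactions that gap filling added to at least `min_fraction`
    of the models, most frequent first (ties broken alphabetically).

    `gapfilled_by_model` maps a model name to the reactions added to that
    model by gap filling. Both the data layout and the cutoff are
    assumptions made for this illustration.
    """
    counts = Counter()
    for reactions in gapfilled_by_model.values():
        # Count each reaction at most once per model.
        counts.update(set(reactions))
    threshold = min_fraction * len(gapfilled_by_model)
    return sorted(
        (rxn for rxn, n in counts.items() if n >= threshold),
        key=lambda rxn: (-counts[rxn], rxn),
    )


# Hypothetical toy data: three models and the reactions gap filling added.
example = {
    "model_A": ["rxn_1", "rxn_2", "rxn_3"],
    "model_B": ["rxn_2", "rxn_3"],
    "model_C": ["rxn_3", "rxn_4"],
}

print(recurrent_gapfilled_reactions(example, min_fraction=0.6))
```

Reactions that recur across many models in such a tally would be candidates for the additional annotation and curation described above.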
1. Feist AM, Herrgard MJ, Thiele I, Reed JL, Palsson BO: Reconstruction of biochemical networks in microbial organisms. Nat Rev Microbiol 2009, 7(2):129-143.
2. Feist AM, Palsson BO: The growing scope of applications of genome-scale metabolic reconstructions using Escherichia coli. Nat Biotechnol 2008, 26(6):659-667.
3. Jankowski MD, Henry CS, Broadbelt LJ, Hatzimanikatis V: Group contribution method for thermodynamic analysis of complex metabolic networks. Biophys J 2008, 95(3):1487-1499.
4. Satish Kumar V, Dasika MS, Maranas CD: Optimization based automated curation of metabolic reconstructions. BMC Bioinformatics 2007, 8:212.
5. Henry CS, Zinner J, Cohoon M, Stevens R: iBsu1103: an improved genome scale metabolic model of B. subtilis based on SEED annotations. Genome Biol 2009:submitted.
6. Overbeek R, Disz T, Stevens R: The SEED: a peer-to-peer environment for genome annotation. Communications of the ACM 2004, 47(11):46-51.
7. DeJongh M, Formsma K, Boillot P, Gould J, Rycenga M, Best A: Toward the automated generation of genome-scale metabolic networks in the SEED. BMC Bioinformatics 2007, 8:-.
Authors: Christopher Henry, Matt DeJongh, Aaron Best, Paul Fryberger, and Rick Stevens
More Information:
Lunch will be provided at both locations.