
Biological Response Pathways

PNNL is developing a Bayesian statistics framework that integrates disparate data sets, providing a more robust approach to identifying patterns of biological response and discovering relevant biomarkers of exposure or response.


The majority of research on biological data integration has focused on either extremely targeted problems, such as protein-protein interactions, or diagnostics (e.g., cancerous or not). Data integration from multiple technologies promises to be a more robust approach to uncovering hidden patterns in the underlying biological response, which may be as complicated as shifts in entire populations or as simple as a set of peaks on a spectrum. This data integration requires developing and using powerful statistical approaches to manage the differences in resolution among instruments.


Improving classification accuracy using the integrated approach.

PNNL's statistical framework uses a powerful property of Bayesian statistics: the ability to introduce prior knowledge into the model. This is challenging because all experimental data must be represented as probability models. Building the framework requires four core technical developments: (1) an interface for access to disparate sources of data; (2) probability mappings of the data to perform Bayesian integration; (3) diagnostic models of exposure or response; and (4) derivation of biological models for biomarker discovery. In FY 2008, PNNL applied the integration approach to two biological problems on vastly different scales: pathogen exposure in a mouse model (described in the figure here) and uranium exposure in a periphyton community (described under environmental sustainability). In addition, we developed a visualization tool for communicating results.
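To make the idea of "probability mappings" and Bayesian integration concrete, here is a minimal sketch. It is not PNNL's implementation. It assumes each data source has already been mapped to a probability for each exposure class, and that the sources are conditionally independent given the class (a naive Bayes style simplification). The function name, class labels, and numbers are hypothetical and chosen only for illustration.

```python
import math

def integrate_posteriors(prior, source_posteriors, eps=1e-12):
    """Combine per-source class probabilities into one posterior.

    Assumes the sources are conditionally independent given the class, so
        P(c | x_1..x_n)  is proportional to  P(c) * prod_i [ P(c | x_i) / P(c) ].

    prior             : dict mapping class label -> prior probability
    source_posteriors : list of dicts, one per data source, each mapping
                        class label -> P(class | that source's data)
    eps               : floor applied to probabilities to avoid log(0)
    """
    log_scores = {}
    for label, p_prior in prior.items():
        log_prior = math.log(max(p_prior, eps))
        score = log_prior
        for posterior in source_posteriors:
            # Each source contributes its evidence relative to the prior.
            score += math.log(max(posterior[label], eps)) - log_prior
        log_scores[label] = score

    # Normalize in log space for numerical stability.
    top = max(log_scores.values())
    unnormalized = {c: math.exp(s - top) for c, s in log_scores.items()}
    total = sum(unnormalized.values())
    return {c: v / total for c, v in unnormalized.items()}


# Hypothetical example: three exposure classes and two data sources.
prior = {"A": 1 / 3, "B": 1 / 3, "C": 1 / 3}
mass_spec = {"A": 0.6, "B": 0.3, "C": 0.1}  # made-up values
nmr = {"A": 0.5, "B": 0.2, "C": 0.3}        # made-up values

combined = integrate_posteriors(prior, [mass_spec, nmr])
predicted = max(combined, key=combined.get)
```

In this toy case both sources weakly favor class "A", and the combined posterior favors it more strongly than either source does alone. That is the intuition behind integrating several data sets. A source that carries no information (one that simply returns the prior) leaves the result unchanged.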

The figure shows results of pathogen exposure in mouse models, where mice were exposed to one of three organisms: Francisella tularensis subsp. novicida, an avirulent mutant of Francisella novicida, or Pseudomonas aeruginosa. The mice were evaluated at 4- and 24-hour time points using four approaches to determine whether markers of exposure were present at a pre-symptomatic state: (1) proteomics by high-resolution mass spectrometry, (2) proteomics via Matrix-Assisted Laser Desorption/Ionization (MALDI), (3) metabolomics via Nuclear Magnetic Resonance (NMR), and (4) cell count data traditionally collected in most immunology laboratories. We demonstrated that the integrative approach could improve the accuracy of classifying the mice into exposure groups to 89%, compared with 83% for any single high-throughput proteomic or metabolomic dataset. Adding the cell count data improves the classification accuracy further to 94%, even though the cell count data have poor discriminating power on their own, as seen in the figure.

Publication

Webb-Robertson BJM, LA McCue, N Beagley, JE McDermott, DS Wunschel, SM Varnum, JZ Hu, NG Isern, GW Buchko, K McAteer, JG Pounds, SJ Skerrett, and CW Frevert. 2009. "A Bayesian Integration Model of High-Throughput Proteomics and Metabolomics Data for Improved Early Detection of Microbial Infections." In Pacific Symposium on Biocomputing 14:451-463.
