|
Systems biology is a comprehensive quantitative analysis of how all
the components of a biological unit interact functionally over time.
The joint behavior of a set of genes or phenomena within a system
makes it possible to identify significant changes even where the
individual variations are not significantly different. A pattern
becomes evident only in the coherent behavior of lower-level
components associated with a higher-level entity.
In our lab, in collaboration with the Columbia Genome Center
(MAGNet - Multiscale Analysis of Genomic and Cellular Networks), we
are currently trying to understand the functioning of the human body
in health and disease by applying the systems biology approach.
Genome-wide array technology is being adopted rapidly by the medical
community. After a medical problem has been identified by the methods
of outcomes research, translational research, including gene
expression methods, may help to solve it. Excellent reviews have been
published in recent years, and we have linked them below so they can
be easily retrieved.
The first step is an appropriate study design with an adequate number
of biological replicates (different 'biological' cases) and technical
replicates (the same case repeated one or more times). A minimum of 5
biological cases per experiment seems to be adequate for experiments
comparing two different groups. Of great importance is minimizing the
systematic 'methodological' error, which is closely related to
differences in how the samples are processed, for example on
different days or by different operators.
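One simple way to limit the day-to-day error described above is to spread
the cases of each group evenly over the processing days, so that "day" is
not confounded with "group". The sketch below is only an illustration under
that assumption; the function name `assign_processing_days`, the sample
labels and the two-day schedule are hypothetical and do not come from any
particular protocol.

```python
import random
from collections import Counter


def assign_processing_days(samples, n_days, seed=0):
    """Spread each group's samples as evenly as possible over processing days.

    samples: list of (sample_id, group) tuples.
    Returns a dict mapping day index -> list of (sample_id, group).
    """
    rng = random.Random(seed)  # fixed seed so the schedule is reproducible
    by_group = {}
    for sample_id, group in samples:
        by_group.setdefault(group, []).append((sample_id, group))

    schedule = {day: [] for day in range(n_days)}
    slot = 0
    for group in sorted(by_group):
        members = by_group[group]
        rng.shuffle(members)  # randomize the order within each group
        for member in members:
            # Round-robin over days; the counter continues across groups
            # so that the total load per day also stays balanced.
            schedule[slot % n_days].append(member)
            slot += 1
    return schedule


# Hypothetical example: 5 control and 5 disease cases processed over 2 days.
samples = [(f"control_{i}", "control") for i in range(1, 6)]
samples += [(f"disease_{i}", "disease") for i in range(1, 6)]
schedule = assign_processing_days(samples, n_days=2)
for day, members in schedule.items():
    print(day, Counter(group for _, group in members))
```

The same idea extends to operators or reagent batches: any processing
factor that cannot be held constant should be balanced across the groups
being compared.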
Evaluation of the quality of the arrays by an experienced researcher
is important in the preprocessing phase of the experiment, which
includes image analysis, normalization and transformation of the
data.
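To make the transformation and normalization steps concrete, the sketch
below applies a log2 transformation and then quantile normalization, one
commonly used normalization scheme. It is an illustration only: the text
above does not say which method is used, and the function names, the offset
of 1 and the toy intensity values are assumptions.

```python
import math


def log2_transform(matrix, offset=1.0):
    """Log2-transform raw intensities; `offset` avoids log(0) (assumed choice)."""
    return [[math.log2(value + offset) for value in row] for row in matrix]


def quantile_normalize(matrix):
    """Give every array (column) the same distribution of values.

    matrix: list of rows (genes), each a list with one value per array.
    Ties are broken by row order, which is a simplification.
    """
    n_genes = len(matrix)
    n_arrays = len(matrix[0])
    columns = [[row[j] for row in matrix] for j in range(n_arrays)]
    # Sort each array, then average across arrays at each rank.
    sorted_columns = [sorted(col) for col in columns]
    rank_means = [
        sum(col[rank] for col in sorted_columns) / n_arrays
        for rank in range(n_genes)
    ]
    normalized = [[0.0] * n_arrays for _ in range(n_genes)]
    for j, col in enumerate(columns):
        order = sorted(range(n_genes), key=lambda i: col[i])
        for rank, gene_index in enumerate(order):
            # Each value is replaced by the mean of all values at its rank.
            normalized[gene_index][j] = rank_means[rank]
    return normalized


# Toy data: 4 genes (rows) measured on 3 arrays (columns).
raw = [
    [5.0, 4.0, 3.0],
    [2.0, 1.0, 4.0],
    [3.0, 4.0, 6.0],
    [4.0, 2.0, 8.0],
]
normalized = quantile_normalize(raw)
logged = log2_transform(raw)
```

After quantile normalization every array contains the same set of values,
so remaining differences between arrays reflect the rank of each gene
rather than the overall brightness of the array.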
Once the data are prepared, different analytical algorithms that have
been implemented in "user-friendly" interfaces can help to draw
conclusions. One of the challenges is to identify which genes become
differentially regulated in relation to a disease or condition and to
understand the meaning of the findings, as the researcher will
probably be faced with hundreds or thousands of differentially
regulated genes. For this purpose, tools are continuously being
developed by several expert multidisciplinary groups.
As elegantly described by Eisen et al., hierarchical clustering is a
method to represent complex gene expression data by statistical
organization and graphical display. With this approach, relationships
among objects (genes) are represented by a tree whose branch lengths
reflect the degree of similarity between the objects. The computed
trees can be used to order genes so that genes or groups of genes
with similar expression patterns are adjacent and can then be
displayed graphically. Genes with similar expression patterns are
likely to be involved in similar biological processes.
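The tree-building idea described above can be sketched in a few lines. The
example below uses average linkage with a distance of 1 minus the Pearson
correlation, which is one common configuration and not necessarily the
exact one used in any given software package. The gene names and expression
values are made up, and the code assumes that no expression profile is
constant (a constant profile would make the correlation undefined).

```python
import math


def pearson(x, y):
    """Pearson correlation between two equally long profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def average_linkage(profiles):
    """Agglomerative clustering of genes.

    profiles: dict mapping gene name -> list of expression values.
    Returns (tree, leaf_order); tree is nested (left, right, distance)
    tuples with gene names as leaves.
    """
    names = list(profiles)
    dist = {
        (a, b): 1.0 - pearson(profiles[a], profiles[b])
        for a in names for b in names if a != b
    }
    # Each cluster is stored as (subtree, list of member genes).
    clusters = [(name, [name]) for name in names]

    def cluster_distance(c1, c2):
        pairs = [(a, b) for a in c1[1] for b in c2[1]]
        return sum(dist[p] for p in pairs) / len(pairs)

    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = cluster_distance(clusters[i], clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merged = (
            (clusters[i][0], clusters[j][0], d),
            clusters[i][1] + clusters[j][1],
        )
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters[0]


# Made-up profiles: geneA/geneB rise together, geneC/geneD fall together.
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.5],
    "geneC": [4.0, 3.0, 2.0, 1.0],
    "geneD": [8.0, 6.0, 4.0, 2.5],
}
tree, leaf_order = average_linkage(profiles)
print(leaf_order)
```

The leaf order places co-expressed genes next to each other, which is the
ordering used when the expression matrix is drawn as a heat map beside the
tree.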
Widely accepted is "SAM" ('Significance Analysis of Microarrays'), a
software package developed at Stanford University. Detailed
information can be found on the SAM-Stanford website. Of high
interest for the development of gene classifiers is the "PAM"
('Prediction Analysis of Microarrays') algorithm, a powerful way to
identify a small set of genes that are highly correlated with certain
biological or pathological processes and to use these genes to
develop tools that allow screening for and prevention of a given
disease.
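As a rough illustration of the two ideas above, the sketch below computes a
SAM-style "relative difference" for one gene, following the form given by
Tusher et al. (difference of group means divided by a pooled standard error
plus a small constant s0), and the soft-thresholding step used by
shrunken-centroid classifiers such as the one described by Tibshirani et
al. It is not the SAM or PAM software: the permutation-based false
discovery estimate, the choice of s0 and the full classifier are omitted,
and the function names and numbers are assumptions.

```python
import math


def relative_difference(group1, group2, s0=0.0):
    """SAM-style statistic d = (mean1 - mean2) / (s + s0) for one gene.

    s is the pooled standard error of the difference; s0 is a small
    positive constant that damps genes with very low variance.
    """
    n1, n2 = len(group1), len(group2)
    mean1, mean2 = sum(group1) / n1, sum(group2) / n2
    sum_sq = sum((x - mean1) ** 2 for x in group1)
    sum_sq += sum((x - mean2) ** 2 for x in group2)
    s = math.sqrt((1.0 / n1 + 1.0 / n2) * sum_sq / (n1 + n2 - 2))
    return (mean1 - mean2) / (s + s0)


def soft_threshold(value, delta):
    """Shrink a value toward zero by delta; small values become zero."""
    return math.copysign(max(abs(value) - delta, 0.0), value)


# Made-up log expression values for one gene in two groups.
d_plain = relative_difference([3.0, 5.0], [1.0, 3.0])
d_damped = relative_difference([3.0, 5.0], [1.0, 3.0], s0=0.5)
print(d_plain, d_damped)
```

Soft thresholding is what lets a shrunken-centroid classifier keep only a
small set of genes: genes whose standardized difference from the overall
centroid falls below the threshold are shrunk to zero and drop out of the
classifier.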
Understanding the biology, the real meaning of the findings, is a
major challenge. The Gene Ontology (GO) project addressed this
problem and developed three structured, controlled vocabularies
(ontologies) that describe gene products in terms of their associated
biological processes, cellular components and molecular functions in
a species-independent manner. Genes can thus be grouped into
well-defined GO terms, but it again becomes difficult to draw
conclusions when dealing with thousands of genes at the same time.
The Genomics and Bioinformatics Group at the NIH took a leading role
in developing tools that facilitate the extraction of accurate
information on a batch-processing scale. The High-Throughput GoMiner
website can be accessed from our links page and the reference below.
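A typical question asked of GO categories is whether a category contains
more differentially regulated genes than expected by chance. A common way
to answer it is a one-sided Fisher's exact test, which is equivalent to the
upper tail of the hypergeometric distribution. The sketch below shows that
calculation only; it is not claimed to reproduce the exact computation of
High-Throughput GoMiner, and the function name and gene counts are
hypothetical.

```python
from math import comb


def enrichment_p_value(n_total, n_in_category, n_changed, n_changed_in_category):
    """Probability of seeing at least this many changed genes in a category.

    n_total: genes on the array
    n_in_category: genes on the array annotated to the GO category
    n_changed: differentially regulated genes
    n_changed_in_category: changed genes that belong to the category
    """
    upper = min(n_in_category, n_changed)
    tail = sum(
        comb(n_in_category, k) * comb(n_total - n_in_category, n_changed - k)
        for k in range(n_changed_in_category, upper + 1)
    )
    return tail / comb(n_total, n_changed)


# Hypothetical counts: 10 genes in total, 4 in the category,
# 3 changed genes, 2 of which are in the category.
p = enrichment_p_value(10, 4, 3, 2)
print(p)
```

When many GO categories are tested at once, the resulting p-values need a
multiple-testing correction before any category is called enriched.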
Gene Set Enrichment Analysis (GSEA) is a computational method that
determines whether a predefined set of genes shows statistically
significant, concordant differences between two biological states or
phenotypes. The method focuses on gene sets, that is, groups of genes
that share a common biological function, chromosomal location or
regulation.
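The core of the method is a running-sum statistic over a list of genes
ranked by their difference between the two phenotypes. The sketch below
implements the simplest, unweighted form of that statistic (each gene in
the set counts equally), which is an assumption made for brevity: the
published method weights the steps by each gene's correlation with the
phenotype and assesses significance by permutation, and neither is shown
here. The gene names are placeholders.

```python
def enrichment_score(ranked_genes, gene_set):
    """Unweighted running-sum enrichment score.

    ranked_genes: genes ordered from most up-regulated to most
                  down-regulated in the comparison of interest.
    gene_set: set of gene names sharing a function, location or regulation.
    Returns the running-sum value with the largest absolute deviation
    from zero.
    """
    hits = [gene in gene_set for gene in ranked_genes]
    n_hits = sum(hits)
    n_misses = len(ranked_genes) - n_hits
    if n_hits == 0 or n_misses == 0:
        raise ValueError("gene set must contain some, but not all, ranked genes")

    running, best = 0.0, 0.0
    for is_hit in hits:
        # Step up for genes in the set, down for all other genes.
        running += 1.0 / n_hits if is_hit else -1.0 / n_misses
        if abs(running) > abs(best):
            best = running
    return best


ranked = ["g1", "g2", "g3", "g4", "g5", "g6"]
es_top = enrichment_score(ranked, {"g1", "g2"})     # set at the top of the list
es_bottom = enrichment_score(ranked, {"g5", "g6"})  # set at the bottom
print(es_top, es_bottom)
```

A score near +1 or -1 means the gene set is concentrated at one end of the
ranked list, while a score near zero means its members are scattered
throughout.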
The reconstruction of cellular networks using reverse engineering
algorithms is a constantly evolving field. The Columbia Genome Center
(MAGNet - Multiscale Analysis of Genomic and Cellular Networks) is
taking a leading role in the development of this field under the
direction of Dr. Andrea Califano. Dr. Califano and his group
published an Algorithm for the Reconstruction of Accurate Cellular
Networks (ARACNE). We believe that applying this systems biology
approach to the understanding of complex biological systems in health
and disease is highly relevant and promising for the development of a
predictive, preventative and personalized approach in heart
transplantation medicine.
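The network reconstruction algorithm mentioned above, described by Basso et
al., rests on two ideas: scoring gene pairs by mutual information, and
removing likely indirect interactions with the data processing inequality
(in every fully connected gene triplet, the weakest edge is dropped). The
sketch below illustrates both on toy data. It is a simplification and not
the published implementation: mutual information is computed here from
already discretized values rather than with a kernel estimator, no
significance threshold is applied, and the function names, gene labels and
scores are assumptions.

```python
import math
from collections import Counter
from itertools import combinations


def mutual_information(x, y):
    """Mutual information (in bits) between two discrete sequences."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    return sum(
        (count / n) * math.log2(count * n / (px[a] * py[b]))
        for (a, b), count in pxy.items()
    )


def prune_with_dpi(mi, tolerance=0.0):
    """Remove the weakest edge of every fully connected gene triplet.

    mi: dict mapping frozenset({gene_a, gene_b}) -> mutual information.
    tolerance: fraction by which the weakest edge must fall below the
               other two before it is removed.
    All triplets are judged against the original scores.
    """
    genes = sorted({gene for pair in mi for gene in pair})
    removed = set()
    for a, b, c in combinations(genes, 3):
        edges = [frozenset((a, b)), frozenset((a, c)), frozenset((b, c))]
        if not all(edge in mi for edge in edges):
            continue
        weakest = min(edges, key=lambda edge: mi[edge])
        others = [edge for edge in edges if edge != weakest]
        if mi[weakest] < min(mi[edge] for edge in others) * (1.0 - tolerance):
            removed.add(weakest)
    return {edge: score for edge, score in mi.items() if edge not in removed}


# Toy scores: A-B and B-C are strong, A-C is weak and probably indirect.
scores = {
    frozenset(("A", "B")): 0.9,
    frozenset(("B", "C")): 0.8,
    frozenset(("A", "C")): 0.3,
}
network = prune_with_dpi(scores)
print(sorted(tuple(sorted(edge)) for edge in network))
```

In this toy case the A-C edge is removed because its score is lower than
both A-B and B-C, which is consistent with A and C being related only
through B.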
GeneWays is a system developed by Dr. Andrey Rzhetsky at Columbia
University for automatically extracting, analyzing, visualizing and
integrating molecular pathway data from the research literature,
focusing on interactions between molecular substances and actions
that can be graphically displayed.
Links and bibliography of interest
Reviews
Allison DB, Cui X, Page GP, Sabripour M.
Microarray data analysis: from disarray to
consolidation and consensus. Nat Rev Genet. 2006 Jan;7(1):55-65.
Review. Erratum in: Nat Rev Genet. 2006 May;7(5):406.
Segal E, Friedman N, Kaminski N, Regev A, Koller D.
From signatures to models: understanding
cancer using microarrays. Nat Genet. 2005 Jun;37 Suppl:S38-45.
Barabasi AL, Oltvai ZN.
Network biology: understanding the cell's
functional organization. Nat Rev Genet. 2004 Feb;5(2):101-13.
Holloway AJ, van Laar RK, Tothill RW, Bowtell DD.
Options available--from start to finish--for
obtaining data from DNA microarrays II. Nat Genet. 2002 Dec;32
Suppl:481-9. Review.
Churchill GA.
Fundamentals of experimental design for cDNA microarrays. Nat Genet.
2002 Dec;32 Suppl:490-5.
Quackenbush J. Microarray
data normalization and transformation.
Nat Genet. 2002 Dec;32 Suppl:496-501.
Slonim DK. From patterns
to pathways: gene expression data analysis comes of age.
Nat Genet. 2002 Dec;32 Suppl:502-8.
Yang YH, Speed T.
Design issues for cDNA microarray experiments. Nat Rev Genet. 2002
Aug;3(8):579-88.
Lockhart DJ, Winzeler EA. Genomics, gene expression and DNA
arrays. Nature. 2000 Jun 15;405(6788):827-36.
Methods
Zeeberg BR, et al.
High-Throughput
GoMiner, an 'industrial-strength' integrative gene ontology tool for
interpretation of multiple-microarray experiments, with application
to studies of Common Variable Immune Deficiency (CVID). BMC
Bioinformatics. 2005 Jul 5;6:168.
Basso K, Margolin AA, Stolovitzky G, Klein U, Dalla-Favera R,
Califano A. Reverse
engineering of regulatory networks in human B cells. Nat Genet. 2005
Apr;37(4):382-90.
Rzhetsky A, et al.
GeneWays: a system for extracting, analyzing, visualizing, and
integrating molecular pathway data. J Biomed Inform. 2004
Feb;37(1):43-53.
Tibshirani R, Hastie T, Narasimhan B, Chu G.
Diagnosis of multiple cancer types by
shrunken centroids of gene expression. Proc Natl Acad Sci U S A.
2002 May 14;99(10):6567-72.
Tusher VG, Tibshirani R, Chu G.
Significance analysis of microarrays applied to
the ionizing radiation response. Proc Natl Acad Sci U S A. 2001 Apr
24;98(9):5116-21.
Alter O, Brown PO, Botstein D.
Singular value decomposition for
genome-wide expression data processing and modeling. Proc Natl Acad
Sci U S A. 2000 Aug 29;97(18):10101-6.
Golub TR, et al. Molecular
classification of cancer: class discovery and class prediction by
gene expression monitoring. Science. 1999 Oct 15;286(5439):531-7.
Eisen MB, Spellman PT, Brown PO, Botstein D.
Cluster analysis and display of genome-wide
expression patterns. Proc Natl Acad Sci U S A. 1998 Dec
8;95(25):14863-8.
Weinstein JN, et al.
An information-intensive approach to the
molecular pharmacology of cancer. Science. 1997 Jan
17;275(5298):343-9.
Open source platforms and software for academic use
GeWorkbench (genomics Workbench) is a Java-based open-source
platform for integrated genomics. GeWorkbench is the Bioinformatics
platform of
MAGNet, the National Center for the Multi-scale Analysis of
Genomic and Cellular Networks.
Genesis (Institute for Genomics and Bioinformatics) was developed by
Alexander Sturn at Graz University of Technology and is available
free of charge to academic, government, and other nonprofit
institutions for noncommercial, nonprofit internal research purposes.
SAM
'Significance Analysis of Microarrays' (Stanford University)
PAM
'Prediction Analysis of Microarrays' (Stanford University)
HTGM
'High-Throughput GoMiner' web interface (Genomics and Bioinformatics
Group, National Cancer Institute)
GSEA
'Gene Set Enrichment Analysis' (Broad Institute, Massachusetts
Institute of Technology)
**The information on this page is only intended to provide easy
access to information for the inexperienced reader. Some papers and
software of similar quality or characteristics may have been
unintentionally omitted. We welcome your feedback and suggestions.
|