Network Component Analysis



Liao, J.C.; Boscolo, R.; Yang, Y.L.; Tran, L.M.; Sabatti, C.; and Roychowdhury, V. (2003) Network component analysis: Reconstruction of regulatory signals in biological systems, Proc. Natl. Acad. Sci. USA, 100: 15522–15527.


High-dimensional data sets generated by high-throughput technologies, such as DNA microarrays, are often the outputs of complex networked systems driven by hidden regulatory signals. Traditional statistical methods for computing low-dimensional or hidden representations of these data sets, such as principal component analysis and independent component analysis, ignore the underlying network structures and provide decompositions based purely on a priori statistical constraints on the computed component signals. The resulting decomposition thus provides a phenomenological model for the observed data and does not necessarily contain physically or biologically meaningful signals. Here we develop a method, called network component analysis, for uncovering hidden regulatory signals from outputs of networked systems when only partial knowledge of the underlying network topology is available. The a priori network structure information is first tested for compliance with a set of identifiability criteria. For networks that satisfy the criteria, the signals from the regulatory nodes and their strengths of influence on each output node can be faithfully reconstructed. This method is first validated experimentally using the absorbance spectra of a network of various hemoglobin species. The method is then applied to microarray data generated from the yeast Saccharomyces cerevisiae, and the activities of various transcription factors during the cell cycle are reconstructed by using recently discovered connectivity information for the underlying transcriptional regulatory networks.
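
The decomposition described in this abstract can be illustrated with a small numerical sketch. It assumes the data matrix E (output nodes by conditions) is approximated by a product A P, where A holds the influence strengths and is allowed to be nonzero only where the known topology says a regulator connects to an output, and P holds the hidden regulatory signals. The symbols E, A, P, the boolean mask Z, the function name nca_als, and the synthetic data are all choices made here for illustration. The fitting loop is a plain alternating least-squares scheme, which is one common way to fit such a model and not necessarily the authors' implementation; the identifiability test mentioned above is not performed.

```python
import numpy as np

def nca_als(E, Z, n_iter=200, seed=0):
    """Fit E ~ A @ P with A restricted to the support given by Z.

    E: (N outputs x M conditions) data matrix.
    Z: (N x L) boolean mask; A[i, j] may be nonzero only where Z[i, j] is True.
    Illustrative sketch only (hypothetical helper, not a published API).
    """
    rng = np.random.default_rng(seed)
    N, M = E.shape
    L = Z.shape[1]
    # Random start that already respects the zero pattern.
    A = np.where(Z, rng.standard_normal((N, L)), 0.0)
    for _ in range(n_iter):
        # Step 1: hold A fixed, solve for the signals P by least squares.
        P, *_ = np.linalg.lstsq(A, E, rcond=None)
        # Step 2: hold P fixed, refit each row of A using only its allowed regulators.
        for i in range(N):
            idx = np.flatnonzero(Z[i])
            if idx.size:
                coef, *_ = np.linalg.lstsq(P[idx].T, E[i], rcond=None)
                A[i, idx] = coef
    return A, P

# Synthetic example: 8 outputs, 3 regulators, 10 conditions (made-up data).
Z = np.array([
    [1, 0, 0],
    [1, 0, 0],
    [0, 1, 0],
    [0, 1, 0],
    [0, 0, 1],
    [0, 0, 1],
    [1, 1, 0],
    [0, 1, 1],
], dtype=bool)
rng = np.random.default_rng(1)
A_true = Z * rng.uniform(0.5, 2.0, size=Z.shape)
P_true = rng.standard_normal((3, 10))
E = A_true @ P_true

A_hat, P_hat = nca_als(E, Z)
print("relative residual:", np.linalg.norm(E - A_hat @ P_hat) / np.linalg.norm(E))
```

Even when the fit is good, A_hat and P_hat can match the generating matrices only up to a scale factor per regulator, since rescaling a column of A and the corresponding row of P leaves the product unchanged. Some normalization is therefore needed before comparing recovered signals across regulators.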

Kao, K.C.; Yang, Y.; Boscolo, R.; Sabatti, C.; Roychowdhury, V.; and Liao, J.C. (2004) Transcriptome-based determination of multiple transcription regulator activities in Escherichia coli using network component analysis, Proc. Natl. Acad. Sci. USA, 101: 641–646.

Cells adjust gene expression profiles in response to environmental and physiological changes through a series of signal transduction pathways. Upon activation or deactivation, the terminal regulators bind to or dissociate from DNA, respectively, and modulate transcriptional activities on particular promoters. Traditionally, individual reporter genes have been used to detect the activity of the transcription factors. This approach works well for simple, non-overlapping transcription pathways. For complex transcriptional networks, more sophisticated tools are required to deconvolute the contribution of each regulator. Here we demonstrate the utility of Network Component Analysis in determining multiple transcription factor activities based on transcriptome profiles and available information regarding network connectivity. We used the Escherichia coli carbon source transition from glucose to acetate as a model system. Key results from this analysis were either consistent with physiology or verified using independent measurements.

Generalized Network Component Analysis
Tran, L.M.; Brynildsen, M.P.; Kao, K.C.; Suen, J.K.; and Liao, J.C. (2005) gNCA: A framework for determining transcription factor activity based on transcriptome: Identifiability and numerical implementation, Metabolic Engineering, 7: 128–141.

Network Component Analysis (NCA) is a network structure-driven framework for deducing regulatory signal dynamics. In contrast to classical approaches such as principal component analysis or independent component analysis, NCA makes use of the connectivity structure from transcriptional regulatory networks to restrict the decomposition to a unique solution. However, the existing version of NCA cannot incorporate information beyond the network topology, such as information obtained from regulatory gene knockouts that constrain the dynamics of regulatory signals. The ability to incorporate such information enables a more accurate and self-consistent analysis over different experiments and extends NCA to systems that may not satisfy the identifiability criteria of NCA. In this paper, we derive a generalized form of NCA, gNCA, which significantly expands the capability of transcription network analysis by incorporating regulatory signal constraints arising from genetic knockouts. The theoretical bases, including criteria for uniqueness of solution and distinguishability between networks, are derived. In addition, numerical techniques for robust decomposition are discussed. gNCA is then demonstrated using an Escherichia coli wild-type strain and an isogenic arcA deletion mutant during a carbon source transition.
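
To make the idea of a regulatory signal constraint concrete, the sketch below extends an alternating least-squares fit of E ~ A P with a second mask on P. It assumes, for illustration only, that a knockout is encoded by forcing the deleted regulator's signal to zero in the mutant conditions; the abstract does not spell out the constraint form, and the names gnca_als, Z_A, Z_P, and the synthetic wild-type and mutant data are hypothetical. It is not the numerical method of the paper.

```python
import numpy as np

def gnca_als(E, Z_A, Z_P, n_iter=200, seed=0):
    """Fit E ~ A @ P with zero patterns imposed on both A and P.

    Z_A: (N x L) boolean mask from the network topology.
    Z_P: (L x M) boolean mask; False where a regulator's signal is forced
         to zero (assumed here to represent a knockout in that condition).
    Illustrative sketch only (hypothetical helper, not a published API).
    """
    rng = np.random.default_rng(seed)
    N, M = E.shape
    L = Z_A.shape[1]
    A = np.where(Z_A, rng.standard_normal((N, L)), 0.0)
    P = np.zeros((L, M))
    for _ in range(n_iter):
        # Step 1: per condition, solve for the signals of the regulators present.
        for m in range(M):
            idx = np.flatnonzero(Z_P[:, m])
            P[:, m] = 0.0
            if idx.size:
                coef, *_ = np.linalg.lstsq(A[:, idx], E[:, m], rcond=None)
                P[idx, m] = coef
        # Step 2: per output, refit influence strengths of its allowed regulators.
        for i in range(N):
            idx = np.flatnonzero(Z_A[i])
            if idx.size:
                coef, *_ = np.linalg.lstsq(P[idx].T, E[i], rcond=None)
                A[i, idx] = coef
    return A, P

# Made-up example: 6 outputs, 2 regulators, 4 wild-type + 4 mutant conditions.
Z_A = np.array([[1, 0], [1, 0], [0, 1], [0, 1], [1, 1], [1, 1]], dtype=bool)
Z_P = np.ones((2, 8), dtype=bool)
Z_P[0, 4:] = False  # regulator 0 is deleted in the mutant conditions

rng = np.random.default_rng(2)
A_true = Z_A * rng.uniform(0.5, 2.0, size=Z_A.shape)
P_true = Z_P * rng.standard_normal((2, 8))
E = A_true @ P_true

A_hat, P_hat = gnca_als(E, Z_A, Z_P)
print("relative residual:", np.linalg.norm(E - A_hat @ P_hat) / np.linalg.norm(E))
```

Fitting wild-type and mutant conditions together with a shared A is what lets the knockout information constrain the whole decomposition, rather than analyzing each experiment separately.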