Debopriya Das1, Nilanjana Banerjee1,2 and Michael Q. Zhang1
1Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724
2George Mason University, School of Computational Sciences, 10900 University Boulevard, Manassas, VA 20110
PNAS 101, 16234-16239 (2004)
Cooperativity between transcription factors is critical to gene regulation, yet current computational methods do not adequately account for it. To address this issue, we present a new computational method based on MARS (Multivariate Adaptive Regression Splines) that correlates the occurrences of transcription factor binding motifs in promoter DNA, and the interactions between those motifs, with the logarithm of the ratio of gene expression levels. This allows us to discover both the individual motifs and the synergistic pairs of motifs that are most likely to be functional, and to enumerate their relative contributions at any time point for which mRNA expression data are available. We present results of simulations and focus specifically on the yeast cell-cycle data. Including synergistic interactions can increase prediction accuracy over linear regression by as much as 1.5- to 3.5-fold. Significant motifs and combinations of motifs are appropriately predicted at each stage of the cell cycle. We believe our MARS-based approach will become more significant when applied to higher eukaryotes, especially mammals, where cooperative control of gene regulation is essential.
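The following toy sketch illustrates why a motif-pair term can improve on a purely additive fit. It is not the adaptive MARS procedure described above: there is no forward or backward selection of knots, and the single knot at zero is fixed by hand. The motif counts, the coefficient 1.5, the noise level, and the names `hinge` and `fit_r2` are all assumptions made for illustration, and the data are simulated rather than taken from the yeast cell-cycle set.

```python
import numpy as np

def hinge(x, knot=0.0):
    """MARS-style hinge basis function max(0, x - knot)."""
    return np.maximum(0.0, x - knot)

def fit_r2(design, y):
    """Ordinary least squares with an intercept; returns the fraction of variance explained."""
    X = np.column_stack([np.ones(len(y)), design])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return 1.0 - resid.var() / y.var()

rng = np.random.default_rng(0)
n_genes = 500

# Hypothetical motif counts per promoter (toy data, not from the paper).
m1 = rng.poisson(1.0, n_genes).astype(float)
m2 = rng.poisson(1.0, n_genes).astype(float)

# Simulated log expression ratio: only promoters carrying both motifs respond.
y = 1.5 * hinge(m1) * hinge(m2) + rng.normal(0.0, 0.5, n_genes)

# Additive model: each motif contributes independently.
r2_linear = fit_r2(np.column_stack([m1, m2]), y)

# Same model plus a product of hinge functions for the motif pair.
pair_design = np.column_stack([hinge(m1), hinge(m2), hinge(m1) * hinge(m2)])
r2_pair = fit_r2(pair_design, y)

print(f"additive: {r2_linear:.2f}, with pair term: {r2_pair:.2f}")
```

Because the counts are non-negative, the hinge at knot zero reduces to the count itself, so the pair model nests the additive one and any gain in explained variance comes from the product term alone.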