Saturday, November 24, 2007

CDM over yeast data

To generate chains over the yeast data, we are considering the following sources of information. Each yeast gene is identified by its Ynumber. The mapping from Ynumbers to other ids was downloaded from SGD.

1) Gene - gene
I am using the yeast transcriptional network published by Harbison et al., Nature 431: 99-104, 2004. This data is split according to different stress conditions and covers 6229 genes. Under each stress condition, we test which genes particular TFs have an effect on. So for each condition we have a 2D matrix with genes along the rows and the TFs of interest along the columns. I have 14 such conditions.
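A minimal sketch of how the per-condition gene x TF matrices might be assembled. The record layout (condition, TF, gene), the function name build_matrices and all ids below are placeholders of mine, not the format of the published data.

```python
# Hypothetical input: one (condition, TF, target gene) record per reported effect.
# All ids here are placeholders, not real yeast identifiers.
bindings = [
    ("cond_1", "TF_A", "G1"),
    ("cond_1", "TF_A", "G2"),
    ("cond_1", "TF_B", "G2"),
    ("cond_2", "TF_A", "G3"),
]
genes = ["G1", "G2", "G3"]  # fixed row order shared by every condition


def build_matrices(bindings, genes):
    """Return {condition: (tfs, matrix)} with genes as rows and TFs as columns."""
    by_condition = {}
    for cond, tf, gene in bindings:
        by_condition.setdefault(cond, set()).add((tf, gene))

    row = {g: i for i, g in enumerate(genes)}
    result = {}
    for cond, pairs in by_condition.items():
        tfs = sorted({tf for tf, _ in pairs})      # column order for this condition
        col = {tf: j for j, tf in enumerate(tfs)}
        matrix = [[0] * len(tfs) for _ in genes]   # 0 = no reported effect
        for tf, gene in pairs:
            matrix[row[gene]][col[tf]] = 1         # 1 = TF affects this gene
        result[cond] = (tfs, matrix)
    return result


matrices = build_matrices(bindings, genes)
```

Keeping the same gene order in every matrix makes it easy to compare rows across conditions.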

2) Gene - protein
I downloaded protein-protein interaction data for yeast from CYGD, SGD and DIP. I extracted the interacting gene ids and mapped them to their respective Ynumbers. All three sets of interactions were then combined. In total, I have 5799 genes and 76125 interactions.
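Combining the three interaction sets could look roughly like the sketch below. It assumes interactions are undirected, that duplicates across sources are counted once, and that ids without a Ynumber mapping are dropped; the post does not state any of these, and merge_interactions and the ids are hypothetical.

```python
def merge_interactions(sources, id_to_ynumber):
    """Merge several lists of (id_a, id_b) pairs into one undirected edge set.

    Assumptions (mine, not from the data sources): edges are undirected,
    duplicates are collapsed, and pairs with an unmapped id are skipped.
    """
    edges = set()
    for pairs in sources:
        for a, b in pairs:
            ya, yb = id_to_ynumber.get(a), id_to_ynumber.get(b)
            if ya is None or yb is None:
                continue  # no Ynumber available for one of the partners
            edges.add(frozenset((ya, yb)))  # frozenset ignores the order of the pair
    genes = set().union(*edges) if edges else set()
    return genes, edges


# Placeholder ids and mapping, for illustration only.
mapping = {"a": "Y1", "b": "Y2", "c": "Y3", "d": "Y4"}
cygd = [("a", "b")]
sgd = [("b", "a"), ("b", "c")]
dip = [("c", "d"), ("x", "a")]  # "x" has no mapping and is skipped

genes, edges = merge_interactions([cygd, sgd, dip], mapping)
print(len(genes), "genes,", len(edges), "interactions")
```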

3) Gene - metabolite
I used the iND750 yeast metabolic network model published by Duarte NC et al., Genome Res. 2004 Jul;14(7):1298-309. We considered the following ways to connect two genes sharing a metabolite:
a) connect 2 genes acting on the same metabolite in the same pathway
b) connect all genes in one pathway
c)
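Options a) and b) can be sketched as two grouping rules over the same records. The (gene, metabolite, pathway) layout, the function names and the ids are my own placeholders, not the structure of the iND750 files.

```python
from itertools import combinations


def _pairs_within(groups):
    """Connect every pair of genes that fall into the same group."""
    edges = set()
    for members in groups.values():
        edges.update(frozenset(p) for p in combinations(sorted(members), 2))
    return edges


def link_same_metabolite_same_pathway(records):
    """Option a): genes acting on the same metabolite in the same pathway."""
    groups = {}
    for gene, metabolite, pathway in records:
        groups.setdefault((metabolite, pathway), set()).add(gene)
    return _pairs_within(groups)


def link_same_pathway(records):
    """Option b): all genes in one pathway, regardless of metabolite."""
    groups = {}
    for gene, _metabolite, pathway in records:
        groups.setdefault(pathway, set()).add(gene)
    return _pairs_within(groups)


# Placeholder records, for illustration only.
records = [
    ("G1", "m1", "p1"),
    ("G2", "m1", "p1"),
    ("G3", "m2", "p1"),
    ("G4", "m1", "p2"),
]
edges_a = link_same_metabolite_same_pathway(records)
edges_b = link_same_pathway(records)
```

Option b) always yields a superset of option a), since two genes sharing a metabolite within a pathway also share that pathway.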

4) Gene - biochemical pathway
This data was downloaded from SGD and was also biclustered.

5) Gene - GO annotations
The data was downloaded from the GO website. The file was deposited on 10/12/2007.
I separated the molecular function, biological process and cellular component relations.
The three relations were then biclustered.
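Separating the three relations might look like this sketch. It assumes each annotation has already been reduced to a (gene, GO term, aspect) triple with the one-letter aspect codes F, P and C used in GO annotation files; split_by_aspect and the ids are placeholders, and the biclustering step itself is not shown.

```python
def split_by_aspect(annotations):
    """Split (gene, term, aspect) triples into one gene -> terms relation per aspect.

    Aspect codes are assumed to be F (molecular function),
    P (biological process) and C (cellular component).
    """
    relations = {"F": {}, "P": {}, "C": {}}
    for gene, term, aspect in annotations:
        relations[aspect].setdefault(gene, set()).add(term)
    return relations


# Placeholder annotations, for illustration only.
annotations = [
    ("G1", "term_f1", "F"),
    ("G1", "term_p1", "P"),
    ("G2", "term_p1", "P"),
    ("G2", "term_c1", "C"),
]
relations = split_by_aspect(annotations)
```

Each of the three dictionaries can then be turned into its own gene x term matrix before biclustering.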

