2004 Fiscal Year Final Research Report Summary
Construction of Estimation System of Gene Expression Value by Analyzing Microarray Image
Project/Area Number |
15500202
|
Research Category |
Grant-in-Aid for Scientific Research (C)
|
Allocation Type | Single-year Grants |
Section | General |
Research Field |
Bioinformatics/Life informatics
|
Research Institution | National Institute of Agrobiological Sciences |
Principal Investigator |
TAKEYA Masaru National Institute of Agrobiological Sciences, Genetic Diversity Department, Senior Researcher, Genetic Resources Research Group, Senior Researcher (00355728)
|
Co-Investigator (Kenkyū-buntansha) |
IWAMOTO Masao National Institute of Agrobiological Sciences, Plant Physiology Department, Senior Researcher, Physiological Function Research Group, Senior Researcher (90370642)
TSUMURA Norimichi Chiba University, Department of Information and Image Science, Associate Professor, Faculty of Engineering, Associate Professor (00272344)
MIYAKE Yoichi Chiba University, Department of Information and Image Science, Professor, Faculty of Engineering, Professor (70027895)
|
Project Period (FY) |
2003 – 2004
|
Keywords | Microarray / Mixture Distribution Model / Gene Expression Value Estimation / Image Analysis |
Research Abstract |
1. Gene expression profiles are usually calculated from the intensity value of each spot on a microarray. This numerical transformation discards other potentially useful information in the microarray image. We developed a tool that detects several statistical characteristics of the pixels in each spot, and compiled a data table containing the gene expression value and these statistical characteristics for each spot.
2. Microarray experiments involve a large number of error-prone steps that lead to a high level of noise in the resulting data. This noise reduces the accuracy of gene expression analysis. Scatter plots of double-spotted signals obtained from cDNA microarrays form two-dimensional, rather than linear, distributions. With multiple noise sources, deviations from ideal linear behavior essentially reflect random fluctuations. We proposed a technique for estimating gene expression values from duplicated data on cDNA microarrays. For this estimation, the distribution of all duplicated data on one slide is modeled as a probability distribution. In the scatter plot, this distribution is constructed as a mixture of two-dimensional normal distributions, which represent fluctuations in gene expression values due to noise. An EM algorithm is used to estimate the model parameters. The probability that a duplicated data point has been shifted by noise is calculated using Bayesian estimation. Total RNA extracted from rice leaves at 4-hour intervals on the same day was used for 6 rice cDNA microarray assays. These 6 data sets were used to test the proposed technique. Genes in the data sets were clustered based on the probability of the true value. Clustering successfully identified candidate genes regulated by circadian rhythms in rice.
|
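The abstract describes fitting a mixture of two-dimensional normal distributions to the scatter plot of duplicated spots with an EM algorithm, then applying Bayes' rule to obtain per-spot probabilities. The following is a minimal, generic sketch of that idea in Python with NumPy, not the project's actual model. The function names (`gaussian_pdf`, `posterior`, `fit_mixture_em`), the number of components `k`, the intensity-ordered initialization, and the small covariance regularization term are all assumptions introduced here for illustration.

```python
import numpy as np

def gaussian_pdf(x, mean, cov):
    """Density of a 2-D normal distribution at each row of x."""
    d = x - mean
    inv = np.linalg.inv(cov)
    norm = 1.0 / (2.0 * np.pi * np.sqrt(np.linalg.det(cov)))
    # Squared Mahalanobis distance of every point to the mean.
    maha = np.einsum("ni,ij,nj->n", d, inv, d)
    return norm * np.exp(-0.5 * maha)

def posterior(x, weights, means, covs):
    """Bayes' rule: probability that each point belongs to each component."""
    dens = np.stack(
        [w * gaussian_pdf(x, m, c) for w, m, c in zip(weights, means, covs)],
        axis=1,
    )
    total = np.maximum(dens.sum(axis=1, keepdims=True), 1e-300)
    return dens / total

def fit_mixture_em(x, k=2, n_iter=100, reg=1e-6):
    """Fit a k-component 2-D Gaussian mixture to x (shape (n, 2)) with EM.

    Assumption: components are initialized by sorting points on the sum of
    their two replicate values and splitting them into k equal groups.
    """
    n, dim = x.shape
    order = np.argsort(x.sum(axis=1))
    groups = np.array_split(order, k)
    means = np.array([x[g].mean(axis=0) for g in groups])
    covs = np.array([np.cov(x[g].T) + reg * np.eye(dim) for g in groups])
    weights = np.full(k, 1.0 / k)

    for _ in range(n_iter):
        # E-step: responsibilities under the current parameters.
        resp = posterior(x, weights, means, covs)
        # M-step: re-estimate weights, means, and covariances.
        nk = resp.sum(axis=0)
        weights = nk / n
        means = (resp.T @ x) / nk[:, None]
        for j in range(k):
            d = x - means[j]
            covs[j] = (resp[:, j, None] * d).T @ d / nk[j] + reg * np.eye(dim)
    return weights, means, covs
```

A possible use, with each row of `spots` holding the two replicate values of one gene (hypothetical data layout): call `weights, means, covs = fit_mixture_em(spots, k=2)` and then `posterior(spots, weights, means, covs)` to get a per-gene probability for each component, which could serve as input to a later clustering step.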