Construction of Estimation System of Gene Expression Value by Analyzing Microarray Image

Research Project

Project/Area Number	15500202
Research Category	Grant-in-Aid for Scientific Research (C)
Allocation Type	Single-year Grants
Section	General
Research Field	Bioinformatics/Life informatics
Research Institution	National Institute of Agrobiological Sciences
Principal Investigator	TAKEYA Masaru National Institute of Agrobiological Sciences, Genetic Diversity Department, Senior Researcher, Genetic Resources Research Group, Senior Researcher (00355728)
Co-Investigator(Kenkyū-buntansha)	IWAMOTO Masao National Institute of Agrobiological Sciences, Plant Physiology Department, Senior Researcher, Physiological Function Research Group, Senior Researcher (90370642); TSUMURA Norimichi Chiba University, Department of Information and Image Science, Associate Professor, Faculty of Engineering, Associate Professor (00272344); MIYAKE Yoichi Chiba University, Department of Information and Image Science, Professor, Faculty of Engineering, Professor (70027895)
Project Period (FY)	2003 – 2004
Project Status	Completed (Fiscal Year 2004)
Budget Amount	¥3,700,000 (Direct Cost: ¥3,700,000) Fiscal Year 2004: ¥1,700,000 (Direct Cost: ¥1,700,000) Fiscal Year 2003: ¥2,000,000 (Direct Cost: ¥2,000,000)
Keywords	Microarray / Mixture Distribution Model / Gene Expression Value Estimation / Image Analysis
Research Abstract	1. Gene expression profiles are usually calculated from the intensity values of spots on a microarray. This reduction to a single number discards other potentially useful information in the microarray image. We developed a tool that extracts several statistical characteristics of the pixels in each spot, and compiled a data table that records gene expression together with these statistical characteristics for each spot. 2. Microarray experiments involve a large number of error-prone steps, which lead to a high level of noise in the resulting data. This noise reduces the accuracy of gene expression analysis. In scatter plots of double-spotted signals obtained from cDNA microarrays, the points form a two-dimensional distribution rather than a line. With multiple noise sources, deviations from the ideal linear behavior essentially reflect random fluctuations. We proposed a technique for estimating gene expression values from duplicated data on cDNA microarrays. For this estimation, the distribution of all duplicated data on one slide is modeled as a probability distribution. In the scatter plot, this distribution is constructed as a mixture of two-dimensional normal distributions, which represent fluctuations in gene expression values due to noise. An EM algorithm is used to estimate the model parameters. The probability that a duplicated data point has been shifted by noise is calculated using Bayesian estimation. Total RNA extracted from rice leaves at 4-hour intervals on the same day was used for 6 rice cDNA microarray assays. These 6 data sets were used to test the proposed technique. Genes in the data sets were clustered based on the probability of the true value. Clustering successfully identified candidate genes regulated by circadian rhythms in rice.
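The mixture-model step in the abstract can be illustrated with a minimal sketch. This is not the project's implementation: the function names (`fit_mixture_em`, `posterior`, `log_gaussian`), the choice of two components, the random initialisation, and the small covariance regulariser are all assumptions made for illustration. Each row of `pairs` is assumed to hold the two (for example log-transformed) signals of one duplicated spot. EM fits a mixture of two-dimensional normal distributions, and Bayes' rule then gives, for each gene, the posterior probability of belonging to each component.

```python
import numpy as np

def log_gaussian(points, mean, cov):
    """Log density of a multivariate normal, evaluated at each row of points."""
    d = points.shape[1]
    diff = points - mean
    _, logdet = np.linalg.slogdet(cov)
    # Row-wise Mahalanobis distance: diff_i^T cov^{-1} diff_i
    maha = np.einsum("ij,ij->i", diff @ np.linalg.inv(cov), diff)
    return -0.5 * (d * np.log(2.0 * np.pi) + logdet + maha)

def posterior(points, weights, means, covs):
    """Bayes' rule: P(component k | point) for every point (rows sum to 1)."""
    log_p = np.stack(
        [np.log(weights[k]) + log_gaussian(points, means[k], covs[k])
         for k in range(len(weights))],
        axis=1,
    )
    log_p -= log_p.max(axis=1, keepdims=True)  # guard against underflow
    p = np.exp(log_p)
    return p / p.sum(axis=1, keepdims=True)

def fit_mixture_em(points, n_components=2, n_iter=100, seed=0):
    """Fit a mixture of normal distributions to (n, 2) duplicate-spot pairs by EM."""
    rng = np.random.default_rng(seed)
    n, d = points.shape
    # Assumed initialisation: random data points as means, global covariance.
    means = points[rng.choice(n, n_components, replace=False)]
    covs = np.array([np.cov(points.T) + 1e-6 * np.eye(d)] * n_components)
    weights = np.full(n_components, 1.0 / n_components)
    for _ in range(n_iter):
        # E-step: responsibilities under the current parameters.
        resp = posterior(points, weights, means, covs)
        # M-step: re-estimate weights, means, and covariances.
        nk = resp.sum(axis=0)
        weights = nk / n
        means = (resp.T @ points) / nk[:, None]
        for k in range(n_components):
            diff = points - means[k]
            covs[k] = (resp[:, k, None] * diff).T @ diff / nk[k] + 1e-6 * np.eye(d)
    return weights, means, covs
```

Calling `fit_mixture_em(pairs)` and then `posterior(pairs, weights, means, covs)` yields one probability vector per gene. How a component is interpreted (for instance, narrow spread as "little noise" and broad spread as "shifted by noise") is a modelling decision the abstract does not specify.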

Report

(3 results)

2004 Annual Research Report Final Research Report Summary
2003 Annual Research Report