2016 Fiscal Year Annual Research Report
Prediction of cancer-driving mutations in gene expression regulatory regions by multi-omics analysis
Project/Area Number | 15F15385 |
Research Institution | Tokyo Medical and Dental University |
Principal Investigator | TSUNODA Tatsuhiko (角田 達彦), Tokyo Medical and Dental University, Medical Research Institute, Professor (10273468) |
Co-Investigator (Kenkyū-buntansha) | LOPEZ ALVAREZ YOSVANY, Tokyo Medical and Dental University, Medical Research Institute, International Research Fellow |
Project Period (FY) | 2015-11-09 – 2018-03-31 |
Keywords | mutations / transcription factors / promoter regions / amino acids / succinylation prediction |
Outline of Annual Research Achievements |
This year we surveyed transcription factors with documented functions in different human cancers. We downloaded the motif matrices of more than 2,000 factors and determined their binding site preferences throughout the human genome. We then retrieved a collection of annotated somatic mutations in promoter regions from the COSMIC repository. After assigning the mutations to downstream genes, we identified the most highly mutated factors in each cancer type. In addition, we sought to incorporate information on a recently characterized post-translational modification of specific factors. This modification, succinylation, is known to be involved in cancer. We retrieved a list of proteins with annotated succinylated lysines and calculated structural features of their amino acids to support accurate prediction. To compute three structural properties (accessible surface area, torsion angles and secondary structure) we used the recently published tool SPIDER2, which returned numerical vectors containing the structural information of each protein. Each lysine residue was then described as a single vector built from its 15 upstream and 15 downstream amino acids, yielding a large set of positive and negative lysines. This training set was preprocessed to reduce class imbalance and used to train a pruned decision tree. The resulting approach outperformed current predictors in the literature. It was then used to identify succinylated lysines whose information could be combined with genomic data for predicting cancer phenotypes.
|
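The windowing step described above can be illustrated with a minimal sketch. It assumes the per-residue structural values are already available as one list of numbers per residue; the function name `lysine_windows`, the `features` layout and the zero-padding near the protein termini are illustrative assumptions, not the project's actual code or the SPIDER2 output format.

```python
from typing import List, Sequence, Tuple

FLANK = 15  # residues taken on each side of the lysine, as stated in the report


def lysine_windows(sequence: str,
                   features: Sequence[Sequence[float]],
                   flank: int = FLANK) -> List[Tuple[int, List[float]]]:
    """Return (position, vector) pairs, one per lysine (K) in `sequence`.

    `features[i]` holds the structural values of residue i (hypothetical
    layout). Window positions that fall outside the protein are filled
    with zeros; this padding rule is an assumption.
    """
    if len(sequence) != len(features):
        raise ValueError("sequence and features must have the same length")
    n_feat = len(features[0]) if features else 0
    pad = [0.0] * n_feat
    windows = []
    for i, residue in enumerate(sequence):
        if residue != "K":
            continue
        vector: List[float] = []
        # Concatenate the features of the flanking residues and the lysine itself.
        for j in range(i - flank, i + flank + 1):
            vector.extend(features[j] if 0 <= j < len(sequence) else pad)
        windows.append((i, vector))
    return windows
```

With `flank=15` and, say, a handful of structural values per residue, each lysine yields one fixed-length vector covering 31 residues, which is the shape a standard classifier such as a decision tree expects.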
Current Status of Research Progress |
1: Research has progressed more than originally planned.
Reason
We consider our progress over the past year to have been excellent. During this period we analyzed somatic mutations affecting transcription factors in different types of human cancer, including liver, pancreas and intestine, among others. Based on this information, we identified genomic intersections within promoter regions that could be important in the development of cancer cells. Because we also wanted to incorporate protein-level information, we began research on a recently characterized post-translational modification called succinylation. In this area we developed new approaches to accurately predict succinylation sites in regulatory proteins involved in cancer development. Part of this work was summarized in a scientific paper recently published in the journal Analytical Biochemistry. This first paper used a set of structural features of amino acids to predict succinylated lysine residues: accessible surface area, backbone torsion angles and secondary structure. In addition, we continued to develop better prediction methods that outperformed current benchmark predictors in the literature, making an important contribution to this new research field. Beyond the published work referred to here, two more manuscripts are currently under review. In the near future, this study of post-translational modifications is intended to be integrated with genomic and mutation information related to human cancers.
|
Strategy for Future Research Activity |
Our future research will continue to focus on deciphering somatic mutations in promoter regions that are involved in cancer development. Our final goal is a computational model able to combine different omics data. Promoter regions have been consistently reported to contain a significant number of somatic mutations, which affect the binding of transcription factors and therefore the expression of downstream genes. Many studies have also reported the importance of regulatory proteins in the development of cancer. Part of the future plan is to write and publish three more papers on protein succinylation. In addition, these new approaches will be integrated into a software package, for which an application note will be submitted for publication. Information on succinylated transcription factors will be integrated with genomic data to predict cancer phenotypes. Another topic of vital importance is determining the tissue of origin of metastatic cancers. We plan to identify the tissue of origin by using somatic mutations in both gene bodies and promoter regions. For this, we will download human mutations from the COSMIC repository. Samples that are not genome-wide, or that come from the same tumor as another sample, will be removed. Redundant variations (those at the same genomic position) will be eliminated and the gene names re-annotated. During the annotation step, the genomic coordinates of each gene will be retrieved and each mutation will be assigned to the corresponding Ensembl genes.
|
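The planned preprocessing (removing redundant variations and assigning mutations to genes by coordinates) could look roughly like the following sketch. The record types, the 1,000 bp promoter length and the rule that two records are redundant when they share sample, chromosome and position are all assumptions made for illustration; the actual COSMIC and Ensembl file formats are not modelled here.

```python
from typing import Dict, Iterable, List, NamedTuple


class Gene(NamedTuple):
    gene_id: str  # hypothetical identifier, e.g. an Ensembl gene ID
    chrom: str
    start: int    # 1-based, inclusive
    end: int
    strand: str   # "+" or "-"


class Mutation(NamedTuple):
    sample: str
    chrom: str
    pos: int


def deduplicate(mutations: Iterable[Mutation]) -> List[Mutation]:
    """Keep the first record per (sample, chrom, pos); assumed redundancy rule."""
    seen = set()
    unique = []
    for m in mutations:
        if m not in seen:
            seen.add(m)
            unique.append(m)
    return unique


def assign_to_genes(mutations: Iterable[Mutation],
                    genes: Iterable[Gene],
                    promoter_len: int = 1000) -> Dict[Mutation, List[str]]:
    """Map each mutation to genes whose body or upstream promoter contains it.

    `promoter_len` (bp upstream of the gene start on its strand) is an
    illustrative default, not a value taken from the report.
    """
    by_chrom: Dict[str, List[Gene]] = {}
    for g in genes:
        by_chrom.setdefault(g.chrom, []).append(g)
    assigned: Dict[Mutation, List[str]] = {}
    for m in mutations:
        hits = []
        for g in by_chrom.get(m.chrom, []):
            # Extend the interval upstream, which depends on the strand.
            lo = g.start - promoter_len if g.strand == "+" else g.start
            hi = g.end if g.strand == "+" else g.end + promoter_len
            if lo <= m.pos <= hi:
                hits.append(g.gene_id)
        assigned[m] = hits
    return assigned
```

The linear scan per chromosome keeps the sketch short; for genome-wide data an interval index would be the more practical choice.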
Research Products
(7 results)