Strategy for Future Research Activity |
The next step is to discover new biological pathways that contain the identified essential and specific proteins. Treating each core gene as one dimension of a feature vector, we will adopt a similarity-based learning approach (i.e., a nearest-neighbor algorithm) to group the sequence windows that contain these genes. The category of each potential pathway-associated window will then be predicted from its similarity (distance in feature space) to known aromatic compound degradation pathways, which allows us to identify candidate sequence windows for putative aromatic compound degradation pathways. In preliminary results, xenobiotic-related genes consistently showed distinct phylogenetic behavior (tight clustering and confinement to specific habitats) compared with genes associated with the degradation of natural aromatic compounds. Using this trend, we can further differentiate pathways related to xenobiotics from those related to natural compounds. To strengthen the connection between machine learning and pathway prediction, we will use thermodynamics to estimate the feasibility and directionality of the predicted biochemical reactions.
|
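The nearest-neighbor step described above could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the project's actual implementation: the gene names (`gene_A` to `gene_D`), the reference windows, the labels, the choice of Jaccard distance, and `k=3` are all hypothetical placeholders chosen for the example.

```python
from collections import Counter

# Hypothetical core genes; each one is a dimension of the feature vector.
CORE_GENES = ["gene_A", "gene_B", "gene_C", "gene_D"]


def to_vector(genes_in_window):
    """Encode a sequence window as a binary presence/absence vector."""
    present = set(genes_in_window)
    return [1 if gene in present else 0 for gene in CORE_GENES]


def jaccard_distance(u, v):
    """Return 1 - |intersection| / |union| over genes present in either window."""
    both = sum(1 for a, b in zip(u, v) if a and b)
    either = sum(1 for a, b in zip(u, v) if a or b)
    return 0.0 if either == 0 else 1.0 - both / either


def predict(window_genes, labeled_windows, k=3):
    """Return the majority label among the k nearest labeled windows."""
    query = to_vector(window_genes)
    ranked = sorted(
        labeled_windows,
        key=lambda item: jaccard_distance(query, to_vector(item[0])),
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Made-up reference windows, for illustration only (assumed labels).
REFERENCE = [
    (["gene_A", "gene_B", "gene_C"], "aromatic_degradation"),
    (["gene_A", "gene_B"], "aromatic_degradation"),
    (["gene_D"], "other"),
    (["gene_C", "gene_D"], "other"),
]

# Classify an unlabeled window by its similarity to the reference windows.
print(predict(["gene_A", "gene_B", "gene_D"], REFERENCE, k=3))
```

Jaccard distance is used here only because it is a common choice for presence/absence data; another metric (for example, Euclidean distance on gene counts) could be substituted without changing the overall structure.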