Summary of Research Achievements |
The past year was dedicated to identifying enhancer regions across the complete human genome using recurrent neural networks. By taking the whole genome into consideration, rather than a few limited and specific regions, we hoped to reach a more comprehensive understanding of these regions and to identify some as-yet-unknown enhancers. The results, while significantly better than random, were not as good as hoped: state-of-the-art enhancer identification reaches a success rate of more than 90%, whereas our results hovered around 60%. Although this problem could be tackled in the future, the available dataset is small relative to the size of the genome, which made the task too difficult for current machine learning techniques, since they require both strong ground truth and a large dataset. The available data turned out to be insufficient. The research has since evolved in a slightly different direction, aiming to identify which enhancers are active in which cell lines. We believe that our experience in studying the motif-finding problem with genetic algorithms will be effective in this direction, and we will explore this possibility during the remaining term.
|