Summary of Research Achievements |
Since the grant application was submitted, protein structure prediction methods have advanced very significantly, as shown by the most recent CASP experiment (14th Community Wide Experiment on the Critical Assessment of Techniques for Protein Structure Prediction; CASP14), held in 2020. One machine learning-based method in particular, AlphaFold2, greatly outperformed all other approaches. We therefore decided to concentrate our project on this approach, which is unfortunately not yet accessible. In the meantime, we have successfully applied machine learning methods to other parts of the NMR structure determination pipeline, namely peak picking and deconvolution, as well as the refinement of chemical shift assignments. Our method takes only the protein sequence and the NMR spectra as input and produces as output: (a) peak lists for each spectrum, (b) a chemical shift list, (c) upper distance limit restraints, and (d) a protein structure in PDB format. The structure determination process requires no human intervention and takes about 5 hours, making it possible to obtain a high-quality protein structure shortly after the NMR measurements are completed. Using this approach, we automatically solved 100 protein structures of 35-175 residues with a median backbone RMSD of 1.27 Å to the PDB reference structures. Moreover, the method correctly assigned 96.3% of backbone and 85.5% of side-chain chemical shifts (median accuracy) compared to the BMRB depositions.
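To make the accuracy figures above concrete, the following is a minimal sketch of how a per-protein assignment accuracy and its median over a benchmark set could be computed. It is not the evaluation code used in the project: the function names, the dictionary layout keyed by (residue number, atom name), the single tolerance value, and the choice to count unassigned reference shifts as incorrect are all assumptions made for illustration.

```python
from statistics import median


def assignment_accuracy(assigned, reference, tolerance):
    """Fraction of reference shifts reproduced within `tolerance` (ppm).

    Both arguments map (residue_number, atom_name) -> chemical shift in ppm.
    Assumption: a reference atom with no assigned shift counts as incorrect.
    """
    if not reference:
        raise ValueError("reference shift list is empty")
    correct = sum(
        1
        for atom, ref_shift in reference.items()
        if atom in assigned and abs(assigned[atom] - ref_shift) <= tolerance
    )
    return correct / len(reference)


def median_accuracy(per_protein_accuracies):
    """Median of the per-protein accuracies over a benchmark set."""
    return median(per_protein_accuracies)


# Hypothetical toy data, not taken from the benchmark described above.
assigned = {(10, "CA"): 56.2, (10, "N"): 120.1, (11, "CA"): 60.0}
reference = {(10, "CA"): 56.3, (10, "N"): 120.3, (11, "CA"): 58.0, (11, "N"): 118.0}

accuracy = assignment_accuracy(assigned, reference, tolerance=0.4)
overall = median_accuracy([0.9, accuracy, 1.0])
print(accuracy, overall)
```

In practice one would probably use a tighter tolerance for 1H than for 13C and 15N shifts; a single value is used here only to keep the sketch short.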
|
Future Research Plans |
We plan to adopt AlphaFold2, or a similar distance prediction approach, to generate additional distance restraints for CYANA structure calculations with sparse NMR data. We will also improve the machine learning methods for peak picking, peak deconvolution, and the extension of chemical shift assignments. Finally, we plan to apply the structure determination pipeline to data sets obtained by in-cell NMR.
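As an illustration of the first point, predicted inter-residue distances could be turned into upper distance limits by keeping only confident predictions and adding a safety margin. The sketch below shows that idea under stated assumptions: the function name, the input layout, the confidence threshold, the margin, and the cutoff are all hypothetical, and no particular predictor output or CYANA file format is implied.

```python
def predicted_to_upper_limits(predictions, min_confidence=0.8,
                              margin=1.0, max_limit=8.0):
    """Convert predicted distances into upper distance limit restraints.

    `predictions` maps (residue_i, residue_j) -> (distance_in_angstrom,
    confidence between 0 and 1). All thresholds are illustrative
    assumptions, not values used in the project.
    """
    restraints = []
    for (res_i, res_j), (distance, confidence) in sorted(predictions.items()):
        if confidence < min_confidence:
            continue  # skip predictions the model is unsure about
        upper = distance + margin  # loosen the limit to tolerate prediction error
        if upper > max_limit:
            continue  # very long limits add little structural information
        restraints.append((res_i, res_j, round(upper, 2)))
    return restraints


# Hypothetical predictions for three residue pairs.
predictions = {
    (3, 7): (5.2, 0.95),    # confident and short: kept
    (3, 40): (6.8, 0.60),   # low confidence: dropped
    (12, 30): (7.5, 0.90),  # exceeds max_limit after adding the margin: dropped
}
upper_limits = predicted_to_upper_limits(predictions)
print(upper_limits)
```

The resulting list would still have to be written out in whatever restraint format the structure calculation expects, alongside the NMR-derived restraints.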
|