Summary of Research Achievements |
Proteins are among the most important components of living organisms. They carry out a multitude of functions that control a myriad of pathways, and their malfunction is a major cause of various diseases. The intricate function of a protein is determined by its equally sophisticated 3D structure. The ability to design proteins with a specified structure, and thereby confer on them a desired function, would have a tremendous impact on our ability to develop new therapeutics, diagnostics and biosensors. Our objective is to develop a novel computational method for the de novo design of proteins. Structure-based computational protein design (CPD) plays a critical role in advancing the field of protein engineering. Using an all-atom energy function, CPD tries to identify amino acid sequences that fold into a target structure and ultimately perform a desired function. Energy functions remain imperfect, however, and injecting relevant information from known structures into the design process should lead to improved designs. We propose to use a library of naturally occurring sequence segments that are known to fold into a given structural fragment to dramatically reduce the sequence space that has to be searched. We then use an evolutionary approach, such as an estimation of distribution algorithm, to search the sequence space efficiently by learning from previous populations.
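To make the search strategy concrete, the following is a minimal sketch of a univariate estimation of distribution algorithm over a restricted sequence space. It is not the project's implementation: the function name `eda_design`, the per-position residue lists in `allowed` (standing in for what a fragment library would permit), and the toy mismatch score (standing in for an all-atom energy function) are all assumptions made for illustration.

```python
import random
from collections import Counter


def eda_design(allowed, score, pop_size=200, elite_frac=0.2,
               generations=30, pseudocount=0.01, seed=0):
    """Univariate EDA sketch (hypothetical helper, not the SHADES code).

    allowed[i] lists the residues permitted at position i, e.g. as derived
    from a fragment library. `score` maps a sequence to a number; lower is
    better, as with an energy.
    """
    rng = random.Random(seed)
    # Start from a uniform distribution over the allowed residues.
    probs = [{aa: 1.0 / len(opts) for aa in opts} for opts in allowed]
    n_elite = max(1, int(pop_size * elite_frac))
    best_score, best_seq = float("inf"), None

    for _ in range(generations):
        # Sample a population from the current per-position distributions.
        population = [
            "".join(rng.choices(list(p), weights=list(p.values()))[0]
                    for p in probs)
            for _ in range(pop_size)
        ]
        scored = sorted((score(s), s) for s in population)
        if scored[0][0] < best_score:
            best_score, best_seq = scored[0]

        # Learn from the best individuals: re-estimate each distribution
        # from the elite set, with a pseudocount to keep some diversity.
        elite = [s for _, s in scored[:n_elite]]
        probs = []
        for i, opts in enumerate(allowed):
            counts = Counter(s[i] for s in elite)
            total = len(elite) + pseudocount * len(opts)
            probs.append({aa: (counts[aa] + pseudocount) / total
                          for aa in opts})
    return best_seq, best_score


# Toy usage: the "energy" is the number of mismatches to a made-up target.
toy_target = "MKTAYIAK"
toy_allowed = [sorted({aa, "G", "L"}) for aa in toy_target]
toy_score = lambda s: sum(a != b for a, b in zip(s, toy_target))
print(eda_design(toy_allowed, toy_score))
```

Restricting each position to a few residues shrinks the search space from 20^L to the product of the list sizes, which is the effect the fragment library is intended to have; a practical method would likely also model dependencies between positions rather than treating them independently as done here.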
|
Progress to Date (Category) |
2: Progressing largely as planned
Reason
We have developed SHADES, a CPD method with backbone flexibility. SHADES is a data-driven method that exploits local structural environments in known protein structures, together with energy, to guide sequence design. It is based on customized libraries of non-contiguous, in-contact amino acid residue motifs. We have tested SHADES on a public benchmark of 40 proteins selected from different protein families. When homologous proteins were excluded, SHADES achieved, on average, a protein sequence recovery of 30% and a protein sequence similarity of 46% compared with the PFAM protein family of the target protein. When homologous structures were added, the wild-type sequence recovery rate reached 93%. WD40 proteins are a subfamily of propeller proteins with a pseudo-symmetrical fold made up of subdomains called blades. By computationally reverse-engineering the duplication, fusion and diversification events in the evolutionary history of a WD40 protein, a perfectly symmetrical homolog called Tako8 was made. We have used SHADES to redesign Tako8 to create Ika8, a four-fold symmetrical protein in which neighbouring blades carry compensating charges. Ika2 and Ika4, carrying two and four blades per subunit, respectively, were found to assemble spontaneously into a complete eight-bladed ring in solution. These artificial eight-bladed rings may find applications in bionanotechnology and as models to study the folding and evolution of WD40 proteins.
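For readers unfamiliar with the two benchmark metrics, the sketch below shows one common way to compute sequence recovery and sequence similarity for a pair of aligned sequences. The report does not give the exact definitions used, so this is an assumption: recovery is taken as the fraction of identical positions, and similarity as the fraction of positions whose residues fall into the same class. The residue classes in `GROUPS` and both function names are illustrative only.

```python
# Illustrative residue classes (an assumption, not the benchmark's definition).
GROUPS = ["AVLIMC", "FWY", "ST", "NQ", "DE", "KRH", "G", "P"]
CLASS_OF = {aa: i for i, group in enumerate(GROUPS) for aa in group}


def _check(designed, reference):
    # Both sequences must already be aligned position by position.
    if len(designed) != len(reference) or not designed:
        raise ValueError("sequences must be non-empty and of equal length")


def sequence_recovery(designed, reference):
    """Fraction of positions where the designed residue is identical."""
    _check(designed, reference)
    same = sum(a == b for a, b in zip(designed, reference))
    return same / len(designed)


def sequence_similarity(designed, reference):
    """Fraction of positions where both residues share a residue class."""
    _check(designed, reference)
    similar = sum(CLASS_OF.get(a) == CLASS_OF.get(b) and a in CLASS_OF
                  for a, b in zip(designed, reference))
    return similar / len(designed)


print(sequence_recovery("ACDE", "AVDD"))    # identical at 2 of 4 positions
print(sequence_similarity("ACDE", "AVDD"))  # same class at all 4 positions
```

Similarity is always at least as high as recovery under these definitions, which is consistent with the 46% versus 30% figures reported above.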
|
Strategy for Future Research |
RNA polymerases are ancient proteins found in all kingdoms of life and have a large, complex structure consisting of multiple domains. How these large, complex proteins evolved, presumably from much simpler primitive proteins, remains unclear. We hypothesize that the double-psi beta-barrel (DPBB) domain at the core of RNA polymerase is the ancestor of modern RNA polymerases. We plan to use SHADES to design a standalone, stably folded and functional DPBB protein that is capable of binding Mg2+, which is required for the enzymatic activity of RNA polymerases. To further reduce the complexity, we plan to design a two-fold symmetric protein with the DPBB fold, consisting of a duplicated half-DPBB sequence. This design would support the notion that modern complex proteins could have evolved from simpler ancestral proteins through gene duplication, fusion and diversification events. We also plan to improve our scoring function by considering not only sequence preferences but also the dihedral angle distributions observed in known structures. This improved scoring function should further enhance our ability to design proteins with novel folds and novel functions.
|