Project/Area Number |
25440023
|
Research Category |
Grant-in-Aid for Scientific Research (C)
|
Research Institution | Kobe University |
Principal Investigator |
TOKMAKOV A・A Kobe University, University Joint-Use Facilities, Researcher (20301278)
|
Project Period (FY) |
2013-04-01 – 2016-03-31
|
Keywords | protein folding / cell-free synthesis / bioinformatics |
Outline of Research |
An approach for identifying the physicochemical, structural, and functional properties of amino acid sequences, including the sites of multiple post-translational modifications (PTMs), that are associated with soluble expression of eukaryotic proteins in bacterial cell-free extracts has been finalized and published [1]. The method is intended for analyzing the output of a cell-free protein production pipeline. It comprises: 1) categorical assessment of expression data; 2) calculation and prediction of multiple properties of the expressed sequences; 3) correlation of individual properties with the expression scores; and 4) evaluation of the statistical significance of the observed correlations.
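The four steps can be illustrated with a minimal sketch. It is not the published method [1]: the ordinal encoding of the expression categories A, C, and N (soluble; expressed but insoluble; not expressed), the single property `charged_fraction`, the use of Pearson correlation, and the permutation test are all assumptions chosen only to keep the example self-contained.

```python
import random
from statistics import mean

# Step 1 (assumed encoding): map expression categories to ordinal scores.
SCORE = {"A": 2, "C": 1, "N": 0}

def charged_fraction(seq: str) -> float:
    """Step 2 (illustrative property): fraction of charged residues D, E, K, R."""
    return sum(seq.count(r) for r in "DEKR") / len(seq)

def pearson(xs, ys) -> float:
    """Step 3: Pearson correlation; returns 0.0 if either input is constant."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5

def permutation_p_value(xs, ys, n_perm=2000, seed=0) -> float:
    """Step 4: two-sided permutation test on the absolute correlation."""
    rng = random.Random(seed)
    observed = abs(pearson(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(pearson(xs, shuffled)) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)

def analyse(records, prop=charged_fraction):
    """records: iterable of (sequence, category) pairs; returns (r, p)."""
    xs = [prop(seq) for seq, _ in records]
    ys = [SCORE[cat] for _, cat in records]
    return pearson(xs, ys), permutation_p_value(xs, ys)
```

In practice the same loop would be run once per property, in which case the resulting p-values would need a multiple-testing correction before any property is reported as significant.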
|
Current Status of Research Progress |
2: Progressing generally smoothly
Reason
The method for identifying the physicochemical, structural, and functional properties of amino acid sequences, including the sites of multiple PTMs, that are associated with soluble expression of eukaryotic proteins in bacterial cell-free extracts has been published [1]. A prototype prediction algorithm for assessing protein amenability to cell-free expression is now being evaluated on its ability to correctly classify new targets absent from the initial training set. The developed discriminant function is applied to a test set of expressed proteins, and prediction accuracy is estimated on that set.
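The evaluation described here, applying a fitted discriminant function to proteins held out from training, can be sketched as follows. The record does not state the form of the discriminant function; a linear function with a zero decision threshold is assumed, and the names `discriminant`, `predict_soluble`, and `accuracy`, as well as the weights in the usage lines, are hypothetical.

```python
def discriminant(features, weights, bias):
    """Assumed linear discriminant: weighted sum of sequence features plus a bias."""
    return sum(w * x for w, x in zip(weights, features)) + bias

def predict_soluble(features, weights, bias):
    """Classify a target as soluble when the discriminant value is positive."""
    return discriminant(features, weights, bias) > 0

def accuracy(test_set, weights, bias):
    """Fraction of held-out (features, is_soluble) pairs classified correctly."""
    correct = sum(
        predict_soluble(features, weights, bias) == is_soluble
        for features, is_soluble in test_set
    )
    return correct / len(test_set)

# Toy usage with made-up weights and three held-out targets.
weights, bias = [1.0, -1.0], 0.0
test_set = [([2, 1], True), ([1, 2], False), ([3, 1], False)]
held_out_accuracy = accuracy(test_set, weights, bias)
```

The essential point is that the weights are fixed before the test set is seen, so the accuracy estimates performance on new targets rather than fit to the training data.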
|
Strategy for Future Research Activity |
During the second year of the study (2014), recombinant proteins will be expressed in a eukaryotic cell-free system. The complete dataset of cell-free expressed proteins will comprise a number of amino acid sequences that are non-redundant at 90% identity. Proteins of different functional and structural classes will be represented in the dataset, with target selection unbiased in this regard. All proteins will be expressed in the extracts under a single uniform set of conditions to minimize the influence of sequence-independent factors. Each experimentally expressed protein will be assigned one of three scores: A (soluble), C (expressed but insoluble), or N (not expressed). Multiple features of the expressed sequences will be predicted using available bioinformatics tools.
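The A/C/N scoring rule can be written down directly. In this minimal sketch the boolean inputs `expressed` and `soluble` are assumed to come from the experimental readout; how those calls are made in the laboratory is not specified in this record, and the function names are hypothetical.

```python
from collections import Counter

def expression_score(expressed: bool, soluble: bool) -> str:
    """Return A (soluble), C (expressed but insoluble), or N (not expressed)."""
    if not expressed:
        return "N"
    return "A" if soluble else "C"

def score_distribution(results):
    """Tally scores over an iterable of (expressed, soluble) pairs."""
    return Counter(expression_score(e, s) for e, s in results)

# Toy usage: four hypothetical targets.
tally = score_distribution([(True, True), (True, False), (False, False), (True, True)])
```

Checking the tally before any correlation analysis shows whether the three categories are populated well enough to support statistical comparison.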
|