Extracting Biologically Interesting Metadata from Full-Text Papers
Project/Area Number |
26330343
|
Research Category |
Grant-in-Aid for Scientific Research (C)
|
Allocation Type | Multi-year Fund |
Section | General |
Research Field |
Life / Health / Medical informatics
|
Research Institution | Inter-University Research Institute Corporation Research Organization of Information and Systems (Headquarters Facilities, etc.) |
Principal Investigator |
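Yamamoto Yasunori Inter-University Research Institute Corporation Research Organization of Information and Systems (Headquarters Facilities, etc.), Joint Support-Center for Data Science Research, Project Associate Professor (50470076)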
|
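Co-Investigator (Kenkyū-buntansha) |
Kawashima Shuichi Inter-University Research Institute Corporation Research Organization of Information and Systems (Headquarters Facilities, etc.), Joint Support-Center for Data Science Research, Project Assistant Professor (50314274)
Katayama Toshiaki Inter-University Research Institute Corporation Research Organization of Information and Systems (Headquarters Facilities, etc.), Joint Support-Center for Data Science Research, Project Assistant Professor (60396869)
Okamoto Shinobu Inter-University Research Institute Corporation Research Organization of Information and Systems (Headquarters Facilities, etc.), Joint Support-Center for Data Science Research, Project Associate Professor (90623893)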
|
Project Period (FY) |
2014-04-01 – 2018-03-31
|
Project Status |
Completed (Fiscal Year 2017)
|
Budget Amount |
¥4,680,000 (Direct Cost: ¥3,600,000, Indirect Cost: ¥1,080,000)
Fiscal Year 2016: ¥1,300,000 (Direct Cost: ¥1,000,000, Indirect Cost: ¥300,000)
Fiscal Year 2015: ¥1,690,000 (Direct Cost: ¥1,300,000, Indirect Cost: ¥390,000)
Fiscal Year 2014: ¥1,690,000 (Direct Cost: ¥1,300,000, Indirect Cost: ¥390,000)
|
Keywords | text mining / microbial genome / Semantic Web / manual annotation / full-text papers / open access / full-text literature / microbes / genome / NER / bacteria |
Outline of Final Research Achievements |
With the rapid increase in the number of genome papers, a system that automatically acquires biological knowledge from full-text papers is needed. Moreover, relevant datasets are scattered across multiple institutions, and biologically interesting analyses need to use them in an integrated manner. For microbial research, habitat environments and sampling locations are among the relevant data to be extracted. In addition, Semantic Web technologies have been adopted by major biological institutions such as the National Center for Biotechnology Information (NCBI) and the European Bioinformatics Institute (EBI). Against this background, we built a manually annotated corpus and an automatic extraction system based on text mining technologies. We express the extracted knowledge in the Resource Description Framework (RDF) and are publishing it as Linked Open Data (LOD) so that it can be used efficiently and effectively together with other relevant datasets.
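As an illustration of the last step, the following is a minimal sketch of how one extracted record might be serialized as RDF in N-Triples syntax. It is not the project's actual schema or code: the `http://example.org/microbe/` namespace, the predicate names (`name`, `habitat`, `sampling_location`, `source_paper`), the function names, and the record values are all hypothetical placeholders chosen for this example.

```python
# Hypothetical namespace; the project's real vocabulary is not given in the source.
EX = "http://example.org/microbe/"


def uri(suffix):
    """Wrap a name in the hypothetical namespace as an N-Triples IRI."""
    return f"<{EX}{suffix}>"


def literal(text):
    """Escape a string as a plain N-Triples literal."""
    escaped = (
        text.replace("\\", "\\\\")  # escape backslashes first
        .replace('"', '\\"')
        .replace("\n", "\\n")
    )
    return f'"{escaped}"'


def to_ntriples(record):
    """Turn one extracted record (a dict) into a list of N-Triples lines."""
    subject = uri(f"organism/{record['organism_id']}")
    lines = [f"{subject} {uri('name')} {literal(record['name'])} ."]
    # Habitat and sampling location are optional: emit a triple only if present.
    for field in ("habitat", "sampling_location"):
        value = record.get(field)
        if value:
            lines.append(f"{subject} {uri(field)} {literal(value)} .")
    lines.append(f"{subject} {uri('source_paper')} {literal(record['paper'])} .")
    return lines


# Placeholder record; these values are invented for illustration, not extracted data.
record = {
    "organism_id": "org1",
    "name": "Example bacterium",
    "habitat": "hot spring",
    "sampling_location": "example site",
    "paper": "PMC0000000",
}

if __name__ == "__main__":
    print("\n".join(to_ntriples(record)))
```

Because every statement shares the same subject IRI, triples of this kind can be loaded into a triple store alongside other datasets and joined on that IRI, which is the sense in which RDF supports integrated use.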
|
Report
(5 results)
Research Products
(3 results)