Development of a System for Collecting Context Data for Large-Scale Inverse Reinforcement Learning
Project/Area Number |
17K00295
|
Research Category |
Grant-in-Aid for Scientific Research (C)
|
Allocation Type | Multi-year Fund |
Section | General |
Research Field |
Intelligent informatics
|
Research Institution | Hokkaido University |
Principal Investigator |
Rzepka Rafal Hokkaido University, Faculty of Information Science and Technology, Assistant Professor (80396316)
|
Project Period (FY) |
2017-04-01 – 2022-03-31
|
Project Status |
Completed (Fiscal Year 2021)
|
Budget Amount |
¥4,290,000 (Direct Cost: ¥3,300,000, Indirect Cost: ¥990,000)
Fiscal Year 2020: ¥1,170,000 (Direct Cost: ¥900,000, Indirect Cost: ¥270,000)
Fiscal Year 2019: ¥910,000 (Direct Cost: ¥700,000, Indirect Cost: ¥210,000)
Fiscal Year 2018: ¥910,000 (Direct Cost: ¥700,000, Indirect Cost: ¥210,000)
Fiscal Year 2017: ¥1,300,000 (Direct Cost: ¥1,000,000, Indirect Cost: ¥300,000)
|
Keywords | knowledge acquisition / causal relations / commonsense / context processing / story processing / corpus creation / story generation / danger detection / common sense / story understanding / crowdsourcing / context understanding / knowledge completion / natural language / language models / implicit knowledge / semantic chains / text generation / data collection / data analysis / knowledge testing / artificial intelligence |
Outline of Final Research Achievements |
This project aimed to create data that could serve machine learning as an equivalent of a human supervisor, going deeper than the plain word statistics over corpora that are commonly used today. Commonsense knowledge is usually not expressed explicitly, and semantic graphs such as ConceptNet have been proposed to enrich context processing. However, such graphs are of limited help for inferring causal chains of events. To fill this gap, I proposed algorithms for the automatic augmentation of events and examined how small changes of context alter the results of a sentiment recognition task in the setting of potential danger detection. Although some results were promising, the automatic approach suffered from unnatural completions, and evaluation was problematic because no comparable data existed for Japanese. I therefore created two datasets: a) 21,592 sentences for recognizing the influence of contextual changes and b) 8,800 five-sentence stories for story understanding experiments.
|
Academic Significance and Societal Importance of the Research Achievements |
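The algorithms resulting from this research showed that the output of language models can be partially controlled by using knowledge bases and corpus text. This approach is still not widely used today; uncontrolled deep learning can lead to dangerous decisions, which is one of the reasons society does not trust artificial intelligence research. Of the datasets created in this research, one is the first in the world concerning the influence of contextual changes on danger detection, and the other is the first large-scale story database created in Japanese. Releasing both datasets may open up new tasks in which artificial intelligence verifies whether actions in everyday life are safe.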
|
Report
(6 results)
Research Products
(33 results)