Construction of mathematical optimization methods for discrete data useful in machine learning algorithms.
Project/Area Number | 17K19973 |
Research Category | Grant-in-Aid for Challenging Research (Exploratory) |
Allocation Type | Multi-year Fund |
Research Field | Information science, computer engineering, and related fields |
Research Institution | Kyoto University |
Principal Investigator |
|
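Co-Investigator (Kenkyū-buntansha) |
Masaaki Nishino (西野 正彬), Nippon Telegraph and Telephone Corporation, NTT Communication Science Laboratories, Innovative Communication Laboratory, Distinguished Researcher (90794529)
|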
Project Period (FY) | 2017-06-30 – 2021-03-31 |
Project Status | Completed (Fiscal Year 2020) |
Budget Amount |
¥6,370,000 (Direct Cost: ¥4,900,000, Indirect Cost: ¥1,470,000)
Fiscal Year 2019: ¥1,950,000 (Direct Cost: ¥1,500,000, Indirect Cost: ¥450,000)
Fiscal Year 2018: ¥1,950,000 (Direct Cost: ¥1,500,000, Indirect Cost: ¥450,000)
Fiscal Year 2017: ¥2,470,000 (Direct Cost: ¥1,900,000, Indirect Cost: ¥570,000)
|
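Keywords | machine learning / context-free grammar / tree structure / first-order predicate logic / inductive logic programming / least generalization / context-free language / tree-structured data / string data / distance computation / discrete structure / parse tree / pq-gram distance / BDD / discrete optimization / string structure |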
Outline of Final Research Achievements |
Machine learning is now a fundamental technology for processing natural language data. If natural language sentences are converted into numerical vectors and the latest machine learning techniques, such as deep learning, are then applied, the learning results are difficult to interpret. Moreover, there is no guarantee that the natural structure of a sentence is adequately represented by vectors, whose structure is flat. In this study, we developed optimization mathematics and machine learning algorithms for three kinds of data: parse trees in context-free languages, which are mathematical models of natural language data; sentences in first-order predicate logic; and patterns, which are direct algebraic representations of word sequences.
|
Academic Significance and Societal Importance of the Research Achievements |
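Machine learning has become a fundamental technology for processing natural language data. In particular, methods that convert natural language data into vectors of natural numbers and then apply the latest machine learning techniques, such as deep learning, are producing significant results. However, the results of deep learning are difficult to interpret, and there is no guarantee that the natural structure of a sentence can be adequately represented by a flat structure such as a vector. The machine learning algorithms treated in this study work directly on natural language data as sequences of words, or on the parse trees extracted from them, and are therefore expected to output results that represent interpretable structures.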
|
Report
(5 results)
Research Products
(9 results)