Large-scale text data analysis using hashing techniques

Research Project

Project/Area Number	26730126
Research Category	Grant-in-Aid for Young Scientists (B)
Allocation Type	Multi-year Fund
Research Field	Intelligent informatics
Research Institution	NTT Communication Science Laboratories
Principal Investigator	Hayashi Katsuhiko, Nippon Telegraph and Telephone Corporation, NTT Communication Science Laboratories, Innovative Communication Laboratory, Researcher (50725794)
Project Period (FY)	2014-04-01 – 2017-03-31
Project Status	Completed (Fiscal Year 2016)
Budget Amount	¥2,470,000 (Direct Cost: ¥1,900,000, Indirect Cost: ¥570,000); Fiscal Year 2016: ¥390,000 (Direct Cost: ¥300,000, Indirect Cost: ¥90,000); Fiscal Year 2015: ¥260,000 (Direct Cost: ¥200,000, Indirect Cost: ¥60,000); Fiscal Year 2014: ¥1,820,000 (Direct Cost: ¥1,400,000, Indirect Cost: ¥420,000)
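Keywords	discourse structure analysis / ellipsis completion / matrix factorization / hashing / low-rank approximation / branch and bound / linear-time language parsing / spoken language data analysis / natural language processing / discourse analysis / rhetorical structure analysis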
Outline of Final Research Achievements	I investigated hashing and matrix factorization techniques for efficiently analyzing large-scale text data across various domains. First, I proposed a fast and accurate parsing algorithm for discourse tree structure analysis of English newswire texts. I also presented a text summarization method based on discourse trees, which improved summarization accuracy. Second, I proposed a method that automatically detects and inserts missing elements in English and Japanese speech and newswire texts. Finally, I proposed a knowledge (word thesaurus) embedding method for fast word similarity computation. In the future, I will apply these methods to more advanced NLP applications such as machine translation and question answering.

Report

(4 results)

2016 Annual Research Report
2016 Final Research Report (PDF)
2015 Research-status Report
2014 Research-status Report

Research Products
(7 results)

Presentation (6 results; of which Int'l Joint Research: 4 results, Invited: 1 result), Patent (Industrial Property Rights) (1 result)

[Presentation] On the Equivalence of Holographic and Complex Embeddings for Link Prediction (2017)
- Author(s)
  Katsuhiko Hayashi, Masashi Shimbo
- Organizer
  The 55th Annual Meeting of the Association for Computational Linguistics
- Place of Presentation
  Vancouver
- Year and Date
  2017-07-31
- Related Report
  2016 Annual Research Report
- Int'l Joint Research
[Presentation] K-best Iterative Viterbi Parsing (2017)
- Author(s)
  Katsuhiko Hayashi, Masaaki Nagata
- Organizer
  The 15th Conference of the European Chapter of the Association for Computational Linguistics
- Place of Presentation
  Valencia
- Year and Date
  2017-04-05
- Related Report
  2016 Annual Research Report
- Int'l Joint Research
[Presentation] Knowledge Graph Embeddings and Their Applications (知識グラフの埋め込みとその応用) (2017)
- Author(s)
  Katsuhiko Hayashi
- Organizer
  Chiba Institute of Technology STAIR Lab Artificial Intelligence Seminar
- Place of Presentation
  Tokyo
- Related Report
  2016 Annual Research Report
- Invited
[Presentation] Empirical comparison of dependency conversions for RST discourse trees (2016)
- Author(s)
  Katsuhiko Hayashi, Tsutomu Hirao, Masaaki Nagata
- Organizer
  The 17th Annual Meeting of the Special Interest Group on Discourse and Dialogue
- Place of Presentation
  Los Angeles
- Year and Date
  2016-09-13
- Related Report
  2016 Annual Research Report
- Int'l Joint Research
[Presentation] Empty element recovery by spinal parser operations (2016)
- Author(s)
  Katsuhiko Hayashi
- Organizer
  The 54th Annual Meeting of the Association for Computational Linguistics
- Place of Presentation
  Berlin
- Year and Date
  2016-08-08
- Related Report
  2015 Research-status Report
- Int'l Joint Research
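[Presentation] On the Properties of Discourse Dependency Trees Automatically Converted from Rhetorical Structure Trees (修辞構造木から自動変換した談話依存構造木の性質について) (2015)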
- Author(s)
  Katsuhiko Hayashi
- Organizer
  Annual Meeting of the Association for Natural Language Processing
- Place of Presentation
  Kyoto University (Kyoto City, Kyoto Prefecture)
- Year and Date
  2015-03-16 – 2015-03-21
- Related Report
  2014 Research-status Report
[Patent(Industrial Property Rights)] Word Learning Device, Word Learning Method, and Word Learning Program (単語学習装置、単語学習方法及び単語学習プログラム) (2017)
- Inventor(s)
  Katsuhiko Hayashi, Masashi Shimbo, Masaaki Nagata
- Industrial Property Rights Holder
  Katsuhiko Hayashi, Masashi Shimbo, Masaaki Nagata
- Industrial Property Rights Type
  Patent
- Filing Date
  2017-03-02
- Related Report
  2016 Annual Research Report
