Abstractive Neural Multi-document Summarization Considering Cross Document Structure

Research Project

Project/Area Number	21H03495
Research Category	Grant-in-Aid for Scientific Research (B)
Allocation Type	Single-year Grants
Section	General
Review Section	Basic Section 61030:Intelligent informatics-related
Research Institution	Tokyo Institute of Technology
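Principal Investigator	Okumura Manabu  Tokyo Institute of Technology, Institute of Innovative Research, Professor (60214079)
Co-Investigator(Kenkyū-buntansha)	Kamigaito Hidetaka  Nara Institute of Science and Technology, Graduate School of Science and Technology, Associate Professor (40817649)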
Project Period (FY)	2021-04-01 – 2024-03-31
Project Status	Completed (Fiscal Year 2023)
Budget Amount	¥17,160,000 (Direct Cost: ¥13,200,000, Indirect Cost: ¥3,960,000); Fiscal Year 2023: ¥4,940,000 (Direct Cost: ¥3,800,000, Indirect Cost: ¥1,140,000); Fiscal Year 2022: ¥4,680,000 (Direct Cost: ¥3,600,000, Indirect Cost: ¥1,080,000); Fiscal Year 2021: ¥7,540,000 (Direct Cost: ¥5,800,000, Indirect Cost: ¥1,740,000)
Keywords	natural language processing / multi-document summarization / neural models / abstractive summarization / cross-document inter-sentence relations
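Outline of Research at the Start	In this research project, we construct the neural summarization model as a two-stage pipeline and develop: 1) a neural model that generates a set of summary sentences together with their order, taking into account the results of cross-document coreference analysis and cross-document structure analysis; and 2) a neural model that reranks ordered sets of summary sentences based on their degree of redundancy and the coherence of their sentence order, and outputs the optimal ordered set of summary sentences. Because 1) can be subdivided into the development of cross-document coreference and cross-document structure analysis techniques and the development of a neural model that encodes their results to generate the set of summary sentences, the project is in effect decomposed into three core technologies.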
Outline of Final Research Achievements	For document structure analysis, which analyzes the relationships between sentences, we proposed a method that uses large language models (LLMs) to imitate shift-reduce operations through prompts. In evaluation experiments, the proposed method achieved state-of-the-art performance. For text summarization, we proposed a neural model that uses the results of this document structure analysis and confirmed that it improves summarization performance. We further proposed a method that helps the model capture summarization-specific information: during fine-tuning, the encoder predicts the summary length and the decoder generates a summary of the predicted length. We confirmed that this also improves performance.
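Academic Significance and Societal Importance of the Research Achievements	Our group had already achieved the world's best performance with a document structure parser, which analyzes the relationships between sentences; by continuing this research and proposing new methods, we have maintained that position. The idea of predicting the summary length in text summarization had not been proposed before, and in that sense the work rests on a novel idea. Moreover, we showed that training the summarization model to predict the summary length contributes to improved performance, so the academic significance is substantial.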
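
The shift-reduce operations mentioned in the outline of final research achievements can be illustrated with a minimal sketch. This is not the project's implementation: it only shows the generic shift-reduce loop that builds a binary tree over text segments. The names `Node`, `shift_reduce_parse`, `always_reduce`, and `always_shift` are hypothetical, and the `choose` callback is an assumed stand-in for whatever decides the next action (in the reported method, a prompted LLM; here, a trivial fixed rule). Nuclearity and relation labels, which an RST-style parser would also predict, are omitted.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Node:
    """A tree node: a leaf holds one segment, an internal node spans two children."""
    text: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None

    def bracketed(self) -> str:
        """Render the tree structure as nested brackets."""
        if self.left is None or self.right is None:
            return self.text
        return f"({self.left.bracketed()} {self.right.bracketed()})"


# Hypothetical action chooser: given the stack and the queue, return
# "shift" or "reduce". An LLM prompt could play this role; that is an
# assumption of this sketch, not a description of the authors' system.
Chooser = Callable[[List[Node], List[Node]], str]


def shift_reduce_parse(segments: List[str], choose: Chooser) -> Node:
    """Build a binary tree over the segments with shift and reduce actions."""
    if not segments:
        raise ValueError("at least one segment is required")
    stack: List[Node] = []
    queue: List[Node] = [Node(s) for s in segments]
    while queue or len(stack) > 1:
        if len(stack) < 2:
            action = "shift"   # reduce needs two items on the stack
        elif not queue:
            action = "reduce"  # nothing left to shift
        else:
            action = choose(stack, queue)
            if action not in ("shift", "reduce"):
                raise ValueError(f"unknown action: {action}")
        if action == "shift":
            stack.append(queue.pop(0))
        else:
            right = stack.pop()
            left = stack.pop()
            stack.append(Node(f"{left.text} {right.text}", left, right))
    return stack[0]


def always_reduce(stack: List[Node], queue: List[Node]) -> str:
    """Toy chooser: reduce whenever both actions are legal (left-branching tree)."""
    return "reduce"


def always_shift(stack: List[Node], queue: List[Node]) -> str:
    """Toy chooser: shift whenever both actions are legal (right-branching tree)."""
    return "shift"
```

Swapping the chooser changes only the order of actions, not the loop itself, which is why the action decision can be delegated to a separate component.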

Report

(4 results)

2023 Annual Research Report
Final Research Report
2022 Annual Research Report
2021 Annual Research Report

Research Products
(11 results)

Journal Article (1 result; Peer Reviewed: 1) / Presentation (10 results; Int'l Joint Research: 7)

[Journal Article] Neural RST-Style Discourse Parsing Exploiting Agreement Sub-trees as Silver Data (2022)
- Author(s)
  Naoki Kobayashi, Tsutomu Hirao, Hidetaka Kamigaito, Manabu Okumura, Masaaki Nagata
- Journal Title
  Journal of Natural Language Processing
  Volume: 29 Issue: 3 Pages: 875-900
- DOI
  10.5715/jnlp.29.875
- ISSN
  1340-7619, 2185-8314
- Related Report
  2022 Annual Research Report
- Peer Reviewed
[Presentation] Can we obtain significant success in RST discourse parsing by using Large Language Models? (2024)
- Author(s)
  Aru Maekawa, Tsutomu Hirao, Hidetaka Kamigaito, Manabu Okumura
- Organizer
  The 18th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2024)
- Related Report
  2023 Annual Research Report
- Int'l Joint Research
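[Presentation] Imitating Shift-Reduce Rhetorical Structure Parsing with Large Language Models [original title: 大規模言語モデルによるシフト還元修辞構造解析の模倣] (2024)
- Author(s)
  Aru Maekawa, Tsutomu Hirao, Hidetaka Kamigaito, Manabu Okumura
- Organizer
  The 30th Annual Meeting of the Association for Natural Language Processing (NLP2024)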
- Related Report
  2023 Annual Research Report
[Presentation] Abstractive Document Summarization with Summary-length Prediction (2023)
- Author(s)
  Jingun Kwon, Hidetaka Kamigaito and Manabu Okumura
- Organizer
  The 17th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2023)
- Related Report
  2023 Annual Research Report
- Int'l Joint Research
[Presentation] A Simple and Strong Baseline for End-to-End Neural RST-style Discourse Parsing (2022)
- Author(s)
  Naoki Kobayashi, Tsutomu Hirao, Hidetaka Kamigaito, Manabu Okumura and Masaaki Nagata
- Organizer
  The 2022 Conference on Empirical Methods in Natural Language Processing (EMNLP 2022)
- Related Report
  2022 Annual Research Report
- Int'l Joint Research
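[Presentation] Improving Inter-Sentence Rhetorical Structure Parsing through Data Augmentation Using Back-Translation [original title: 逆翻訳を利用したデータ拡張による文間の修辞構造解析の改善] (2022)
- Author(s)
  Aru Maekawa, Naoki Kobayashi, Tsutomu Hirao, Hidetaka Kamigaito, Manabu Okumura
- Organizer
  The 29th Annual Meeting of the Association for Natural Language Processing (NLP2023)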
- Related Report
  2022 Annual Research Report
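[Presentation] A Comparison of Rhetorical Structure Parsers from the Perspectives of Language Models and Parsing Strategies [original title: 言語モデルと解析戦略の観点からの修辞構造解析器の比較] (2022)
- Author(s)
  Naoki Kobayashi, Tsutomu Hirao, Hidetaka Kamigaito, Manabu Okumura, Masaaki Nagata
- Organizer
  The 28th Annual Meeting of the Association for Natural Language Processing (NLP2022)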
- Related Report
  2021 Annual Research Report
[Presentation] Considering Nested Tree Structure in Sentence Extractive Summarization with Pre-trained Transformer (2021)
- Author(s)
  Jingun Kwon, Naoki Kobayashi, Hidetaka Kamigaito and Manabu Okumura
- Organizer
  The 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP 2021)
- Related Report
  2021 Annual Research Report
- Int'l Joint Research
[Presentation] A Language Model-based Generative Classifier for Sentence-level Discourse Parsing (2021)
- Author(s)
  Ying Zhang, Hidetaka Kamigaito and Manabu Okumura
- Organizer
  The 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP 2021)
- Related Report
  2021 Annual Research Report
- Int'l Joint Research
[Presentation] Abstractive Document Summarization with Word Embedding Reconstruction (2021)
- Author(s)
  Jingyi You, Chenlong Hu, Hidetaka Kamigaito, Hiroya Takamura and Manabu Okumura
- Organizer
  RANLP 2021
- Related Report
  2021 Annual Research Report
- Int'l Joint Research
[Presentation] Improving Neural RST Parsing Model with Silver Agreement Subtrees (2021)
- Author(s)
  Naoki Kobayashi, Tsutomu Hirao, Hidetaka Kamigaito, Manabu Okumura and Masaaki Nagata
- Organizer
  NAACL-HLT 2021
- Related Report
  2021 Annual Research Report
- Int'l Joint Research
