Semi-supervised all-words WSD by co-training of forward LSTM and backward LSTM

Research Project

Project/Area Number 19K12093
Research Category

Grant-in-Aid for Scientific Research (C)

Allocation Type Multi-year Fund
Section General
Review Section Basic Section 61030: Intelligent informatics-related
Research Institution Ibaraki University

Principal Investigator

Shinnou Hiroyuki  Ibaraki University, Graduate School of Science and Engineering (Engineering), Professor (10250987)

Project Period (FY) 2019-04-01 – 2022-03-31
Project Status Completed (Fiscal Year 2021)
Budget Amount
¥4,420,000 (Direct Cost: ¥3,400,000, Indirect Cost: ¥1,020,000)
Fiscal Year 2021: ¥1,430,000 (Direct Cost: ¥1,100,000, Indirect Cost: ¥330,000)
Fiscal Year 2020: ¥1,560,000 (Direct Cost: ¥1,200,000, Indirect Cost: ¥360,000)
Fiscal Year 2019: ¥1,430,000 (Direct Cost: ¥1,100,000, Indirect Cost: ¥330,000)
Keywords all-words WSD / semi-supervised learning / Co-training / BERT / Masked Language Model / word sense disambiguation / LSTM
Outline of Research at the Start

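Word sense disambiguation (WSD) is the task of estimating the sense of a polysemous word in a sentence, and all-words WSD is the task of assigning a sense to every word in an input sentence. All-words WSD cannot be achieved with ordinary supervised learning because the amount of labeled training data it requires is enormous. This research establishes a method that achieves all-words WSD from a small amount of labeled data and a large amount of unlabeled data, using semi-supervised learning based on co-training of a forward LSTM and a backward LSTM.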

Outline of Final Research Achievements

In general, a word has multiple senses (meanings). All-words WSD is the task of assigning to each word in an input sentence the sense it has in that sentence. Since this task can be solved by using BERT, the successor model of LSTM, we investigated the use of BERT and showed how to apply it to various tasks, including this one.

Academic Significance and Societal Importance of the Research Achievements

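Many natural language processing tasks can be solved with machine learning. However, building the training data that machine learning requires is costly. This problem is especially pronounced for all-words WSD, the task addressed in this research. BERT is a pre-trained model, and using it makes it possible to learn a highly accurate model from a small amount of training data. We were able to show how to use BERT for various tasks, including the task of this research project.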

Report

(4 results)
  • 2021 Annual Research Report
  • Final Research Report (PDF)
  • 2020 Research-status Report
  • 2019 Research-status Report

Research Products

(42 results)

Presentation (40 results) (of which Int'l Joint Research: 17 results, Invited: 1 result) Book (2 results)

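  • [Presentation] Improving Classification Accuracy through Transfer Learning of BERT and Removal of Mis-leading Data (2022)

    • Author(s)
      岩本昇太, 新納浩幸
    • Organizer
      28th Annual Meeting of the Association for Natural Language Processing, PT4-13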
    • Related Report
      2021 Annual Research Report
  • [Presentation] Image Caption Generation with Added Keywords (2022)

    • Author(s)
      木村文飛, 新納浩幸
    • Organizer
      28th Annual Meeting of the Association for Natural Language Processing, PT3-10
    • Related Report
      2021 Annual Research Report
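  • [Presentation] Vocabulary Expansion with Compound Words in Domain Adaptation of BERT (2022)

    • Author(s)
      田中裕隆, 新納浩幸
    • Organizer
      28th Annual Meeting of the Association for Natural Language Processing, PT2-8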
    • Related Report
      2021 Annual Research Report
  • [Presentation] Construction and Evaluation of Japanese Sentence-BERT Models (2021)

    • Author(s)
      Naoki Shibayama, Hiroyuki Shinnou
    • Organizer
      PACLIC-2021
    • Related Report
      2021 Annual Research Report
    • Int'l Joint Research
  • [Presentation] Application of Mix-Up Method in Document Classification Task using BERT (2021)

    • Author(s)
      Naoki Kikuta, Hiroyuki Shinnou
    • Organizer
      RANLP-2021
    • Related Report
      2021 Annual Research Report
    • Int'l Joint Research
  • [Presentation] Domain-Specific Japanese ELECTRA Model Using a Small Corpus (2021)

    • Author(s)
      Youki Itoh, Hiroyuki Shinnou
    • Organizer
      RANLP-2021
    • Related Report
      2021 Annual Research Report
    • Int'l Joint Research
  • [Presentation] Japanese Syntactic Parsing Using a Simply Downsized BERT (2021)

    • Author(s)
      河野慎司, 新納浩幸
    • Organizer
      IPSJ SIG on Natural Language Processing, NL-251-20
    • Related Report
      2021 Annual Research Report
  • [Presentation] Data Augmentation Using Multiple BERT Models (2021)

    • Author(s)
      高萩恭介, 新納浩幸
    • Organizer
      IPSJ SIG on Natural Language Processing, NL-250-4
    • Related Report
      2021 Annual Research Report
  • [Presentation] Construction and Evaluation of Japanese SentenceBERT (2021)

    • Author(s)
      芝山直希, 新納浩幸
    • Organizer
      IPSJ SIG on Natural Language Processing, NL-249-7
    • Related Report
      2021 Annual Research Report
  • [Presentation] One-click Supervision Using Faster-RCNN (2021)

    • Author(s)
      平野友基, 新納浩幸
    • Organizer
      Joint Workshop of the IPSJ NL, CVIM, and PRMU Study Groups
    • Related Report
      2021 Annual Research Report
  • [Presentation] Construction of Domain-Specific DistilBERT Model by Using Fine-Tuning (2020)

    • Author(s)
      Jing Bai, Rui Cao, Wen Ma and Hiroyuki Shinnou
    • Organizer
      TAAI-2020
    • Related Report
      2020 Research-status Report
    • Int'l Joint Research
  • [Presentation] Construction of document feature vectors using BERT (2020)

    • Author(s)
      Hirotaka Tanaka, Rui Cao, Jing Bai, Wen Ma and Hiroyuki Shinnou
    • Organizer
      TAAI-2020
    • Related Report
      2020 Research-status Report
    • Int'l Joint Research
  • [Presentation] Analysis of Polysemy using Variance Values for Word Embeddings by BERT (2020)

    • Author(s)
      Yanghuizi Ou, Rui Cao, Jing Bai, Wen Ma and Hiroyuki Shinnou
    • Organizer
      TAAI-2020
    • Related Report
      2020 Research-status Report
    • Int'l Joint Research
  • [Presentation] Determining the Logical Relation between Two Sentences by Using the Masked Language Model of BERT (2020)

    • Author(s)
      Yi Zhao, Rui Cao, Jing Bai, Wen Ma and Hiroyuki Shinnou
    • Organizer
      TAAI-2020
    • Related Report
      2020 Research-status Report
    • Int'l Joint Research
  • [Presentation] Composing Word Vectors for Japanese Compound Words Using Bilingual Word Embedding (2020)

    • Author(s)
      Teruo Hirabayashi, Kanako Komiya, Masayuki Asahara and Hiroyuki Shinnou
    • Organizer
      PACLIC-2020
    • Related Report
      2020 Research-status Report
    • Int'l Joint Research
  • [Presentation] Generation and Evaluation of Concept Embeddings Via Fine-Tuning Using Automatically Tagged Corpus (2020)

    • Author(s)
      Kanako Komiya, Daiki Yaginuma, Masayuki Asahara and Hiroyuki Shinnou
    • Organizer
      PACLIC-2020
    • Related Report
      2020 Research-status Report
    • Int'l Joint Research
  • [Presentation] Evaluation of BERT Models by Using Sentence Clustering (2020)

    • Author(s)
      Naoki Shibayama, Rui Cao, Jing Bai, Wen Ma and Hiroyuki Shinnou
    • Organizer
      PACLIC-2020
    • Related Report
      2020 Research-status Report
    • Int'l Joint Research
  • [Presentation] Automatic Creation of Correspondence Table of Meaning Tags from Two Dictionaries in One Language Using Bilingual Word Embedding (2020)

    • Author(s)
      Teruo Hirabayashi, Kanako Komiya, Masayuki Asahara and Hiroyuki Shinnou
    • Organizer
      BUCC-2020
    • Related Report
      2020 Research-status Report
    • Int'l Joint Research
  • [Presentation] Use of BERT for NLP tasks by HuggingFace's transformers (2020)

    • Author(s)
      Hiroyuki Shinnou
    • Organizer
      ROCLING-2020
    • Related Report
      2020 Research-status Report
    • Invited
  • [Presentation] Sentiment Analysis Using Bilingual BERT without Labeled Data in the Target Language (2020)

    • Author(s)
      荘司響之介, 曹鋭, 白静, 馬ブン, 新納浩幸
    • Organizer
      Language Resources Workshop 2020
    • Related Report
      2020 Research-status Report
  • [Presentation] Estimating the Connective Relation between Two Sentences Using the Masked Language Model of BERT (2020)

    • Author(s)
      趙一, 曹鋭, 白静, 馬ブン, 新納浩幸
    • Organizer
      Language Resources Workshop 2020
    • Related Report
      2020 Research-status Report
  • [Presentation] Analysis of the Spread of Word Senses Using Variance Values of Word Embeddings from BERT (2020)

    • Author(s)
      欧陽恵子, 曹鋭, 白静, 馬ブン, 新納浩幸
    • Organizer
      Language Resources Workshop 2020
    • Related Report
      2020 Research-status Report
  • [Presentation] Construction of a Domain-Specific DistilBERT Model by Fine-Tuning (2020)

    • Author(s)
      新納浩幸, 白静, 曹鋭, 馬ブン
    • Organizer
      34th Annual Conference of the Japanese Society for Artificial Intelligence
    • Related Report
      2020 Research-status Report
  • [Presentation] Alignment of Short Unit Words and Long Unit Words Using Bilingual Word Embeddings (2020)

    • Author(s)
      平林照雄, 古宮嘉那子, 新納浩幸
    • Organizer
      26th Annual Meeting of the Association for Natural Language Processing
    • Related Report
      2020 Research-status Report
  • [Presentation] Evaluation of Pre-trained BERT Models Using Sentence Clustering (2020)

    • Author(s)
      芝山直希, 曹鋭, 白静, 馬ブン, 新納浩幸
    • Organizer
      26th Annual Meeting of the Association for Natural Language Processing
    • Related Report
      2020 Research-status Report
  • [Presentation] Detecting Missing Translations in Neural Machine Translation Using Information Quantity in Sentences (2019)

    • Author(s)
      Shin Fujii, Rui Cao, Jing Bai, Wen Ma, Hiroyuki Shinnou
    • Organizer
      TAAI-2019
    • Related Report
      2019 Research-status Report
    • Int'l Joint Research
  • [Presentation] Combination of Feature-based and Instance-based methods for Domain Adaptation in Sentiment Classification (2019)

    • Author(s)
      Jing Bai, Rui Cao, Wen Ma, Hiroyuki Shinnou
    • Organizer
      TAAI-2019
    • Related Report
      2019 Research-status Report
    • Int'l Joint Research
  • [Presentation] Unsupervised Domain Adaptation for Sentimental Classification by Word Embeddings on the Lower Layer of BERT (2019)

    • Author(s)
      Jing Bai, Hirotaka Tanaka, Rui Cao, Wen Ma, Hiroyuki Shinnou
    • Organizer
      TAAI-2019
    • Related Report
      2019 Research-status Report
    • Int'l Joint Research
  • [Presentation] Document Classification by Word Embeddings of BERT (2019)

    • Author(s)
      Hirotaka Tanaka, Hiroyuki Shinnou, Rui Cao, Jing Bai, Wen Ma
    • Organizer
      PACLING-2019
    • Related Report
      2019 Research-status Report
    • Int'l Joint Research
  • [Presentation] Semi-supervised learning for all-words WSD using self-learning and fine-tuning (2019)

    • Author(s)
      Rui Cao, Jing Bai, Wen Ma, Hiroyuki Shinnou
    • Organizer
      PACLIC-2019
    • Related Report
      2019 Research-status Report
    • Int'l Joint Research
  • [Presentation] Composing Word Vectors for Japanese Compound Words Using Dependency Relations (2019)

    • Author(s)
      Kanako Komiya, Takumi Seitou, Minoru Sasaki, Hiroyuki Shinnou
    • Organizer
      CICLING 2019
    • Related Report
      2019 Research-status Report
    • Int'l Joint Research
  • [Presentation] Creating Document Feature Vectors Using BERT (2019)

    • Author(s)
      田中裕隆, 曹鋭, 白静, 馬ブン, 新納浩幸
    • Organizer
      IPSJ SIG on Natural Language Processing, NL-243-8
    • Related Report
      2019 Research-status Report
  • [Presentation] Semi-supervised Learning for Sentiment Analysis with Triple-GAN (2019)

    • Author(s)
      楊金成, 曹鋭, 白静, 馬ブン, 新納浩幸
    • Organizer
      15th Text Analytics Symposium
    • Related Report
      2019 Research-status Report
  • [Presentation] Comparison of Japanese Pretrained BERT Models (2019)

    • Author(s)
      芝山直希, 曹鋭, 白静, 馬ブン, 新納浩幸
    • Organizer
      15th Text Analytics Symposium
    • Related Report
      2019 Research-status Report
  • [Presentation] Clustering of Word Usage Examples Using BERT (2019)

    • Author(s)
      馬ブン, 田中裕隆, 曹鋭, 白静, 新納浩幸
    • Organizer
      Language Resources Workshop 2019
    • Related Report
      2019 Research-status Report
  • [Presentation] Word Sense Disambiguation by Supervised Learning Using BERT (2019)

    • Author(s)
      曹鋭, 田中裕隆, 白静, 馬ブン, 新納浩幸
    • Organizer
      Language Resources Workshop 2019
    • Related Report
      2019 Research-status Report
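  • [Presentation] Constructing Distributed Representations of Word Senses in the Bunrui Goi Hyo (Word List by Semantic Principles) Using All-words WSD and Fine-tuning (2019)

    • Author(s)
      柳沼大輝, 古宮嘉那子, 新納浩幸
    • Organizer
      Language Resources Workshop 2019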
    • Related Report
      2019 Research-status Report
  • [Presentation] A Study of Which Layers of BERT Hold Document Domain Information (2019)

    • Author(s)
      欧陽恵子, 田中裕隆, 曹鋭, 白静, 馬ブン, 新納浩幸
    • Organizer
      Language Resources Workshop 2019
    • Related Report
      2019 Research-status Report
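  • [Presentation] Unsupervised Domain Adaptation for Sentiment Analysis Using Word Embedding Sequences from the Lower Layers of BERT (2019)

    • Author(s)
      白静, 田中裕隆, 曹鋭, 馬ブン, 新納浩幸
    • Organizer
      IPSJ SIG on Natural Language Processing, NL-240-17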
    • Related Report
      2019 Research-status Report
  • [Presentation] Document Classification Using Word Embedding Sequences from BERT (2019)

    • Author(s)
      田中裕隆, 曹鋭, 白静, 馬ブン, 新納浩幸
    • Organizer
      IPSJ SIG on Natural Language Processing, NL-240-16
    • Related Report
      2019 Research-status Report
  • [Book] PyTorch Natural Language Processing Programming: Japanese Text Analysis with word2vec/LSTM/seq2seq/BERT! (2021)

    • Author(s)
      新納 浩幸
    • Total Pages
      240
    • Publisher
      Impress
    • ISBN
      9784295011132
    • Related Report
      2021 Annual Research Report
  • [Book] Object Detection with PyTorch (2020)

    • Author(s)
      新納浩幸
    • Total Pages
      208
    • Publisher
      Ohmsha
    • ISBN
      9784274225932
    • Related Report
      2020 Research-status Report

Published: 2019-04-18   Modified: 2023-01-30  