
Robust Incremental Parsing based on Finite-State Approximation of Context Free Grammar

Research Project

Project/Area Number 15300044
Research Category

Grant-in-Aid for Scientific Research (B)

Allocation Type Single-year Grants
Section General
Research Field Intelligent informatics
Research Institution Nagoya University

Principal Investigator

OGAWA Yasuhiro  Nagoya University, Graduate School of Engineering, Research Assistant, 大学院・情報科学研究科, 助手 (70332707)

Co-Investigator(Kenkyū-buntansha) INAGAKI Yasuyoshi  Aichi Prefectural University, Faculty of Information Science and Technology, Professor, 情報科学部, 教授 (10023079)
TOYAMA Katsuhiko  Nagoya University, Graduate School of Engineering, Associate Professor, 大学院・情報科学研究科, 助教授 (70217561)
MATSUBARA Shigeki  Nagoya University, Information Technology Center, Associate Professor, 情報連携基盤センター, 助教授 (20303589)
MUHTAR Mahsut  Nagoya University, Graduate School of International Development, Research Assistant, 大学院・国際開発研究科, 助手 (20283517)
OHKUBO Hirotaka  Aichi Prefectural University, Faculty of Information Science and Technology, Research Assistant, 情報科学部, 助手 (40295580)
Project Period (FY) 2003 – 2004
Project Status Completed (Fiscal Year 2004)
Budget Amount
¥13,400,000 (Direct Cost: ¥13,400,000)
Fiscal Year 2004: ¥5,000,000 (Direct Cost: ¥5,000,000)
Fiscal Year 2003: ¥8,400,000 (Direct Cost: ¥8,400,000)
Keywords natural language processing / parsing / finite state automaton / algorithm / simultaneous interpretation / spoken language / corpus / context free grammar / finite automaton
Research Abstract

This research aimed to develop fast, incremental parsing technology that can keep pace with speech input, as a basis for multilingual translation technology with a simultaneous interpretation function. The approach taken was to develop a statistically based approximation technique that uses a corpus annotated with syntax trees, so that information such as the usage frequency of grammar rules is properly reflected. Specifically, the following items were pursued.
• Preparation of English, Japanese, and Thai corpora.
The simultaneous interpretation dialogue corpus collected at the Nagoya University Center for Integrated Acoustic Information Research (CIAIR) was used. For Thai, translation data was created in this research from 24 hours of the Japanese data and the corresponding English translations. Syntax tree data in the form of a phrase structure grammar was attached to the corpus of each language.
• Acquisition of statistical information from a large-scale corpus with syntax trees.
Techniques for acquiring various kinds of statistical information from a corpus annotated with syntax trees were examined. The positions in the syntax trees at which each grammar rule appears were located and described together with their context information, and the results were analyzed statistically.
• Development of a finite automaton approximation technique.
A technique for converting a context-free grammar into a finite automaton was studied. In the conversion, the context-free grammar is expressed as a recursive transition network, and the automaton is built by expanding that network top-down. As an expansion method based on probability calculation, an algorithm that expands arcs with high usage frequency first was developed.
• Design and implementation of an incremental parsing system.
A system covering grammar acquisition, finite automaton approximation, and parsing was designed and implemented. To make the analysis practical, a finite automaton consisting of about 50 million arcs was constructed.
• Comparative evaluation of parsing.
Parsing experiments on English, Japanese, and Thai were carried out using benchmarks. The method was then evaluated comparatively from several viewpoints, including accuracy, processing time, and the number and form of the syntax trees produced.
Over the two years of research, the feasibility of robust incremental parsing through automaton approximation of a context-free grammar was verified, and its effect on speeding up parsing was confirmed.
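The conversion described in the abstract (a grammar viewed as a recursive transition network, expanded top-down with frequent arcs first) can be illustrated with a small Python sketch. This is not the project's implementation: the toy grammar `RULES`, its probabilities, the function names, and the `max_expansions` budget are all assumptions made for illustration, and unexpanded nonterminal arcs are simply dropped at the end, so the automaton accepts only a subset of the grammar's language.

```python
import heapq
from collections import defaultdict
from itertools import count

# Hypothetical toy grammar: nonterminal -> list of (probability, right-hand side).
# In practice the probabilities would be relative rule frequencies from a treebank.
RULES = {
    "S": [(1.0, ("NP", "VP"))],
    "NP": [(0.6, ("she",)), (0.3, ("the", "dog")), (0.1, ("NP", "and", "NP"))],
    "VP": [(0.7, ("runs",)), (0.3, ("sees", "NP"))],
}

def approximate(rules, start, max_expansions):
    """Expand nonterminal arcs top-down, most probable first, within a budget."""
    state_ids, tie = count(), count()
    initial, final = next(state_ids), next(state_ids)
    arcs, heap = set(), []

    def add_arc(src, label, dst, weight):
        arcs.add((src, label, dst))
        if label in rules:  # nonterminal arc: candidate for later expansion
            heapq.heappush(heap, (-weight, next(tie), src, label, dst))

    add_arc(initial, start, final, 1.0)
    for _ in range(max_expansions):
        if not heap:
            break
        neg_weight, _, src, label, dst = heapq.heappop(heap)
        arcs.discard((src, label, dst))
        # Replace the nonterminal arc by one path per rule (rules are assumed
        # to have non-empty right-hand sides).
        for prob, rhs in rules[label]:
            prev = src
            for i, symbol in enumerate(rhs):
                nxt = dst if i == len(rhs) - 1 else next(state_ids)
                add_arc(prev, symbol, nxt, -neg_weight * prob)
                prev = nxt

    # Keep only word-labelled arcs; nonterminal arcs left over are dropped.
    transitions = defaultdict(set)
    for src, label, dst in arcs:
        if label not in rules:
            transitions[(src, label)].add(dst)
    return transitions, initial, final

def step(transitions, states, word):
    """Advance every active state over one input word."""
    return {dst for s in states for dst in transitions.get((s, word), ())}

def parse_incrementally(fsa, words):
    """Consume words left to right; report whether each prefix is still viable."""
    transitions, initial, _ = fsa
    states = {initial}
    for word in words:
        states = step(transitions, states, word)
        yield word, bool(states)

def accepts(fsa, sentence):
    """Return True if the whole sentence reaches the final state."""
    transitions, initial, final = fsa
    states = {initial}
    for word in sentence.split():
        states = step(transitions, states, word)
    return final in states

small = approximate(RULES, "S", max_expansions=4)
print(accepts(small, "the dog sees she"))   # True
print(accepts(small, "she and she runs"))   # False: coordination not yet expanded
larger = approximate(RULES, "S", max_expansions=6)
print(accepts(larger, "she and she runs"))  # True
```

Raising the budget lets lower-probability constructions into the automaton, which is the trade-off between automaton size and coverage that the abstract alludes to. Because each input word only advances a set of states, the recognizer works word by word as input arrives.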

Report

(3 results)
  • 2004 Annual Research Report   Final Research Report Summary
  • 2003 Annual Research Report
  • Research Products

    (19 results)

All Journal Article (11 results) Patent(Industrial Property Rights) (2 results) Publications (6 results)

  • [Journal Article] Incremental dependency parsing based on headed context-free grammar (2005)

    • Author(s)
      Yoshihide Kato
    • Journal Title

      Systems and Computers in Japan 36・2

      Pages: 63-77

    • Description
      From the "Summary of Research Results Report (Japanese)"
    • Related Report
      2004 Annual Research Report 2004 Final Research Report Summary
  • [Journal Article] Robust Dependency Parsing of Spontaneous Japanese Spoken Language (2005)

    • Author(s)
      Tomohiro Ohno
    • Journal Title

      IEICE Transactions on Information and Systems E88-D・3

      Pages: 545-552

    • Description
      From the "Summary of Research Results Report (Japanese)"
    • Related Report
      2004 Annual Research Report 2004 Final Research Report Summary
  • [Journal Article] CIAIR In-car Speech Corpus -Influence of Driving Status- (2005)

    • Author(s)
      Nobuo Kawaguchi
    • Journal Title

      IEICE Transactions on Information and Systems E88-D・3

      Pages: 578-582

    • Description
      From the "Summary of Research Results Report (Japanese)"
    • Related Report
      2004 Final Research Report Summary
  • [Journal Article] Incremental dependency parsing based on headed context-free grammar (2005)

    • Author(s)
      Yoshihide Kato
    • Journal Title

      Systems and Computers in Japan Vol.36, No.2

      Pages: 63-77

    • Description
      From the "Summary of Research Results Report (English)"
    • Related Report
      2004 Final Research Report Summary
  • [Journal Article] Robust Dependency Parsing of Spontaneous Japanese Spoken Language (2005)

    • Author(s)
      Tomohiro Ohno
    • Journal Title

      IEICE Transactions on Information and Systems Vol.E88-D, No.3

      Pages: 545-552

    • Description
      From the "Summary of Research Results Report (English)"
    • Related Report
      2004 Final Research Report Summary
  • [Journal Article] CIAIR In-car Speech Corpus -Influence of Driving Status- (2005)

    • Author(s)
      Nobuo Kawaguchi
    • Journal Title

      IEICE Transactions on Information and Systems Vol.E88-D, No.3

      Pages: 578-582

    • Description
      From the "Summary of Research Results Report (English)"
    • Related Report
      2004 Final Research Report Summary
  • [Journal Article] Expansion of a Japanese-Uighur Bilingual Dictionary by Paraphrasing (2005)

    • Author(s)
      Yasuhiro Ogawa
    • Journal Title

      Journal of Natural Language Processing Vol.11, No.5

      Pages: 39-61

    • NAID

      10014051190

    • Description
      From the "Summary of Research Results Report (English)"
    • Related Report
      2004 Final Research Report Summary
  • [Journal Article] CIAIR In-car Speech Corpus -Influence of Driving Status- (2005)

    • Author(s)
      Nobuo Kawaguchi
    • Journal Title

      IEICE Transactions on Information and Systems E88-D・3

      Pages: 578-582

    • Related Report
      2004 Annual Research Report
  • [Journal Article] Expansion of a Japanese-Uighur Bilingual Dictionary by Paraphrasing (2004)

    • Author(s)
      Yasuhiro Ogawa
    • Journal Title

      Journal of Natural Language Processing 11・5

      Pages: 39-61

    • NAID

      10014051190

    • Description
      From the "Summary of Research Results Report (Japanese)"
    • Related Report
      2004 Annual Research Report 2004 Final Research Report Summary
  • [Journal Article] Stochastically Evaluating the Validity of Partial Parse Trees in Incremental Parsing (2004)

    • Author(s)
      Yoshihide Kato
    • Journal Title

      Proceedings of ACL Workshop Incremental Parsing

      Pages: 9-15

    • Description
      From the "Summary of Research Results Report (Japanese)"
    • Related Report
      2004 Annual Research Report 2004 Final Research Report Summary
  • [Journal Article] CIAIR Simultaneous Interpretation Corpus (2004)

    • Author(s)
      Hitomi Tohyama
    • Journal Title

      Proceedings of Oriental COCOSDA 2004

      Pages: 72-77

    • Description
      From the "Summary of Research Results Report (Japanese)"
    • Related Report
      2004 Annual Research Report 2004 Final Research Report Summary
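  • [Patent(Industrial Property Rights)] Apparatus for Constructing Finite-State Transducers for Simultaneous Translation (2004)

    • Inventor(s)
      Shigeki Matsubara, Yasuyoshi Inagaki, Koichiro Ryu
    • Industrial Property Rights Holder
      Nagoya Industrial Science Research Institute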
    • Industrial Property Number
      2004-216878
    • Filing Date
      2004-07-26
    • Description
      From the "Summary of Research Results Report (Japanese)"
    • Related Report
      2004 Final Research Report Summary
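  • [Patent(Industrial Property Rights)] Apparatus for Constructing Finite-State Transducers for Simultaneous Translation (2004)

    • Inventor(s)
      Shigeki Matsubara, Yasuyoshi Inagaki, Koichiro Ryu
    • Industrial Property Rights Holder
      Nagoya Industrial Science Research Institute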
    • Industrial Property Number
      2004-216878
    • Filing Date
      2004-07-26
    • Related Report
      2004 Annual Research Report
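  • [Publications] Makoto Ohara: "Temporal Characteristics of Cross-Lingual Dialogue via Simultaneous Interpretation: Analysis of a Bilingual Corpus Based on Comparison with Consecutive Interpretation" 通訳研究 (Interpretation Studies). 3. 35-53 (2003)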

    • Related Report
      2003 Annual Research Report
  • [Publications] Koichiro Ryu: "Bilingual Speech Dialogue Corpus for Simultaneous Machine Interpretation Research" Proceedings of Oriental International Coordinating Committee on Speech Databases and Speech I/O System Assessment. 217-224 (2003)

    • Related Report
      2003 Annual Research Report
  • [Publications] Tomohiro Ohno: "Spiral Construction of Syntactically Annotated Spoken Language Corpus" Proceedings of IEEE International Conference on Natural Language Processing and Knowledge Engineering. 477-483 (2003)

    • Related Report
      2003 Annual Research Report
  • [Publications] Makoto Ohara: "Automatic Extraction of Translation Patterns from Bilingual Legal Corpus" Proceedings of IEEE International Conference on Natural Language Processing and Knowledge Engineering. 150-157 (2003)

    • Related Report
      2003 Annual Research Report
  • [Publications] Yuki Irie: "An Advanced Japanese Speech Corpus for In-car Spoken Dialogue Research" Proceedings of Oriental International Coordinating Committee on Speech Databases and Speech I/O System Assessment. 209-216 (2003)

    • Related Report
      2003 Annual Research Report
  • [Publications] Itsuki Kishida: "Construction of an Advanced In-Car Spoken Dialogue Corpus and its Characteristic Analysis" Proceedings of 8th European Conference on Speech Communication and Technology. (2003)

    • Related Report
      2003 Annual Research Report

Published: 2003-04-01   Modified: 2021-04-07  
