Augmenting Terminologies through Proactive Extraction of Term Translation Pairs from the Web

Research Project

Project/Area Number 24650122
Research Category

Grant-in-Aid for Challenging Exploratory Research

Allocation Type Multi-year Fund
Research Field Library and information science/Humanistic social informatics
Research Institution The University of Tokyo

Principal Investigator

KAGEURA Kyo  The University of Tokyo, Interfaculty Initiative in Information Studies, Professor (00211152)

Co-Investigator (Kenkyū-buntansha) TAKEUCHI Koichi  Okayama University, Graduate School of Natural Science and Technology, Lecturer (80311174)
Project Period (FY) 2012-04-01 – 2015-03-31
Project Status Completed (Fiscal Year 2014)
Budget Amount
¥3,900,000 (Direct Cost: ¥3,000,000, Indirect Cost: ¥900,000)
Fiscal Year 2014: ¥1,560,000 (Direct Cost: ¥1,200,000, Indirect Cost: ¥360,000)
Fiscal Year 2013: ¥1,300,000 (Direct Cost: ¥1,000,000, Indirect Cost: ¥300,000)
Fiscal Year 2012: ¥1,040,000 (Direct Cost: ¥800,000, Indirect Cost: ¥240,000)
Keywords specialized vocabulary / Web crawling / translation pair extraction / vocabulary growth / vocabulary network
Outline of Final Research Achievements

We examined how native and borrowed constituent elements contribute to the construction of technical terminology and how these elements are used as the terminology grows. By defining a terminological network (with terms as vertices and shared constituents as edges) and a constituent network (with constituent elements as vertices and co-occurrence in terms as edges), we defined indices for evaluating the consistency and coherence of a terminology. Building on these observations, we developed a method that generates candidate bilingual new-term pairs from existing terminologies and validates them against monolingual and comparable domain corpora obtained from the web. Experiments showed that the performance of bilingual term crawling is at least comparable to that of existing corpus-based extraction methods, and complementary in that it extracts different types of pairs, which are more relevant to existing terminologies. The theoretical implications of this work were clarified with respect to lexicographic issues.
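
The two networks described above can be illustrated with a minimal Python sketch. It is not the project's implementation: the toy English terms, the assumption that each term is already segmented into constituent elements, and the function name `build_networks` are all hypothetical, and the consistency and coherence indices themselves are not reproduced here.

```python
from collections import defaultdict
from itertools import combinations


def build_networks(terms):
    """Build edge sets for the two networks from a segmented terminology.

    terms: dict mapping each term to its list of constituent elements
           (the segmentation is assumed to be given).
    Returns (term_edges, constituent_edges), each a set of sorted pairs.
    """
    constituent_edges = set()
    terms_by_constituent = defaultdict(set)

    for term, constituents in terms.items():
        unique = sorted(set(constituents))
        for c in unique:
            terms_by_constituent[c].add(term)
        # Constituent network: elements are linked if they co-occur in a term.
        for a, b in combinations(unique, 2):
            constituent_edges.add((a, b))

    # Terminological network: terms are linked if they share a constituent.
    term_edges = set()
    for members in terms_by_constituent.values():
        for a, b in combinations(sorted(members), 2):
            term_edges.add((a, b))

    return term_edges, constituent_edges


# Hypothetical toy terminology, for illustration only.
toy = {
    "information retrieval": ["information", "retrieval"],
    "information extraction": ["information", "extraction"],
    "term extraction": ["term", "extraction"],
}
term_edges, constituent_edges = build_networks(toy)
```

In this toy case, "information extraction" is linked to both other terms because it shares one constituent with each, while "information retrieval" and "term extraction" share none and remain unlinked.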

Report

(4 results)
  • 2014 Annual Research Report   Final Research Report ( PDF )
  • 2013 Research-status Report
  • 2012 Research-status Report
  • Research Products

    (8 results)

Presentation (4 results, of which Invited: 2 results), Book (4 results)

  • [Presentation] Terminology-driven terminology augmentation (2014)

    • Author(s)
      Koichi Takeuchi and Kyo Kageura
    • Organizer
      The 14th China-Japan Natural Language Processing Joint Research Promotion Conference
    • Place of Presentation
      Chengdu
    • Year and Date
      2014-10-12 – 2014-10-14
    • Related Report
      2014 Annual Research Report
  • [Presentation] The sphere of terminology: between ontological system and textual corpora (2014)

    • Author(s)
      Kyo Kageura
    • Organizer
      Terminology and Knowledge Engineering
    • Place of Presentation
      Berlin
    • Year and Date
      2014-06-19 – 2014-06-21
    • Related Report
      2014 Annual Research Report
    • Invited
  • [Presentation] Terminology-driven Augmentation of Bilingual Terminologies (2013)

    • Author(s)
      Koichi Sato, Koichi Takeuchi and Kyo Kageura
    • Organizer
      MT Summit XIV 2013
    • Place of Presentation
      Nice, France
    • Related Report
      2013 Research-status Report
  • [Presentation] The status of "new terms" from the point of view of language practitioners, and the crawling of new term translation pairs from the Web (2012)

    • Author(s)
      Kyo Kageura
    • Organizer
      Neology in Specialized Languages: Detection, Implantation and Circulation of New Terms
    • Place of Presentation
      Lyon, France
    • Related Report
      2012 Research-status Report
    • Invited
  • [Book] Dury, et al. eds. La Neologie en Langue de Specialite ("Augmenting terminology by crawling new term translation pairs from textual corpora") (2015)

    • Author(s)
      Kyo Kageura
    • Total Pages
      275
    • Publisher
      CRTT
    • Related Report
      2014 Annual Research Report
  • [Book] Kockaert and Steurs, eds. Handbook of Terminology ("Terminology and lexicography") (2015)

    • Author(s)
      Kyo Kageura
    • Total Pages
      872
    • Publisher
      John Benjamins
    • Related Report
      2014 Annual Research Report
  • [Book] Building and Using Comparable Corpora (Kyo Kageura and Takeshi Abekawa "The place of comparable corpora in providing terminological reference information to online translators: a strategic framework" pp. 285-301) (2013)

    • Author(s)
      Serge Sharoff, Pierre Zweigenbaum and Reinhard Rapp
    • Total Pages
      335
    • Publisher
      Springer
    • Related Report
      2013 Research-status Report
  • [Book] The Quantitative Analysis of the Dynamics and Structure of Terminologies (2012)

    • Author(s)
      Kyo Kageura
    • Total Pages
      243
    • Publisher
      John Benjamins
    • Related Report
      2012 Research-status Report

Published: 2013-05-31   Modified: 2019-07-29  