
Construction of Large Scale Japanese Text Database Based on Advanced Retrieval Method

Research Project

Project/Area Number 61880005
Research Category

Grant-in-Aid for Developmental Scientific Research

Allocation Type Single-year Grants
Research Field Informatics
Research Institution University of Tokyo

Principal Investigator

FUJISAKI Hiroya  Faculty of Engineering, University of Tokyo, Professor (80010776)

Co-Investigator(Kenkyū-buntansha) KAMEDA Hiroyuki  Faculty of Engineering, Tokyo Engineering University, Lecturer (00194994)
MIYAZAKI Koichi  Asahi Shimbun Company, Tokyo Head Office, Director of the Production Bureau
KURASHIMA Tokihisa  Dictionary Department, Sanseido co., ltd., Director of the Japanese Dictionary Editorial Office
TANAKA Yasuhito  Management and Information Science Department, Himeji College, Associate Professor (00163585)
OGINO Tsunao  Institute of Literature and Linguistics, University of Tsukuba, Associate Professor (00111443)
MIYAZAKI Koichi  Production Department, Tokyo Main Office, Asahi Shinbun Publish Company
HIROSE Keikichi  Faculty of Engineering, University of Tokyo, Associate Professor (50111472)
Project Period (FY) 1986 – 1987
Project Status Completed (Fiscal Year 1987)
Budget Amount
¥9,500,000 (Direct Cost: ¥9,500,000)
Fiscal Year 1987: ¥3,000,000 (Direct Cost: ¥3,000,000)
Fiscal Year 1986: ¥6,500,000 (Direct Cost: ¥6,500,000)
Keywords Retrieval With Advanced Functions / Large Scale Japanese Text Database / Retrieval of Linguistic Usages / Morpheme Analysis / Automatic Assignment of Part-of-Speech Information / Morphological Analysis / Automatic Assignment of Reading Information
Research Abstract

This study on constructing a large-scale text database with advanced retrieval functions yielded the following outcomes.
1. Compilation of a word dictionary for linguistic analysis: The dictionary, which had already been generated from two computer-readable dictionaries, i.e., the Shinmeikai-Kokugojiten and the Nihongotango-Kikaijisho, was modified and expanded by adding proper nouns and other words that occur frequently in daily newspaper articles, so that it could be applied to the analysis of a large volume of newspaper articles. The number of lexical items in this dictionary reached about 200,000. Each item holds information on spellings, conjugations and declensions.
2. Algorithms for morpheme and part-of-speech analysis and their implementation on a computer: The relationships between parts of speech were exhaustively investigated. As a result, an 86-by-59 part-of-speech connection table was obtained, and the structure of the bunsetsu was described as a transition network. Furthermore, algorithms for morpheme and part-of-speech analysis based on this grammatical knowledge were devised and implemented in FORTRAN77 on a large-scale computer at the University of Tokyo.
3. Preparation of newspaper data: Articles from 84 days of a Japanese daily newspaper, the Asahi Newspaper, in 1982 were selected, processed, and stored on the large-scale computer as text data.
4. Construction of a database system with advanced retrieval functions: Based on these outcomes, a large-scale text database was constructed. Its advanced retrieval functions allow characters, character sequences, words, word sequences, parts of speech, part-of-speech sequences, and arbitrary combinations of these to be used as retrieval keys. The database management system was built in FORTRAN77 on the large-scale computer.
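The connection-table idea in item 2 can be illustrated with a small sketch. This is not the project's FORTRAN77 implementation: the lexicon, the part-of-speech tags, the `BOS`/`EOS` markers, and the handful of allowed pairs in `CONNECT` are all hypothetical stand-ins for the 200,000-item dictionary and the 86-by-59 table, and the function name `analyze` is invented for illustration. The sketch enumerates every segmentation of a string whose adjacent parts of speech are all permitted by the table.

```python
# Hypothetical toy lexicon: surface form -> possible parts of speech.
LEXICON = {
    "は": ["particle"],
    "はは": ["noun"],     # "mother"
    "が": ["particle"],
    "くる": ["verb"],     # "come"
}

# Hypothetical connection table: allowed (left POS, right POS) pairs.
# BOS and EOS mark the beginning and end of the input.
CONNECT = {
    ("BOS", "noun"), ("BOS", "verb"),
    ("noun", "particle"), ("noun", "EOS"),
    ("particle", "noun"), ("particle", "verb"),
    ("verb", "EOS"),
}


def analyze(text):
    """Return every segmentation of `text` into (surface, pos) pairs
    in which each adjacent pair of parts of speech appears in CONNECT."""
    max_len = max(len(word) for word in LEXICON)

    def extend(start, prev_pos):
        # At the end of the input, the last POS must be allowed before EOS.
        if start == len(text):
            return [[]] if (prev_pos, "EOS") in CONNECT else []
        results = []
        last_end = min(len(text), start + max_len)
        for end in range(start + 1, last_end + 1):
            surface = text[start:end]
            for pos in LEXICON.get(surface, []):
                if (prev_pos, pos) not in CONNECT:
                    continue  # this connection is not licensed by the table
                for rest in extend(end, pos):
                    results.append([(surface, pos)] + rest)
        return results

    return extend(0, "BOS")


if __name__ == "__main__":
    for analysis in analyze("ははがくる"):
        print(analysis)
```

In this toy setting the reading that starts with the particle "は" is rejected because the table has no `("BOS", "particle")` entry, so only the noun reading "はは" survives. A real table would be far larger and would encode conjugation classes as well.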
As described above, the aims of this study have been fully accomplished.
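The retrieval keys listed in item 4 (characters, words, parts of speech, sequences of each, and combinations) can be illustrated with the following sketch. It is only an assumption about how such keys might be matched against analyzed text, not a description of the project's database management system. The names `Token`, `Key`, `find_sequences`, and `find_characters`, the sample sentence, and its tags are all hypothetical.

```python
from typing import List, NamedTuple, Optional


class Token(NamedTuple):
    surface: str  # the word as written
    pos: str      # its part of speech


class Key(NamedTuple):
    surface: Optional[str] = None  # None means "any word"
    pos: Optional[str] = None      # None means "any part of speech"


def token_matches(token: Token, key: Key) -> bool:
    """A token satisfies a key if every constraint the key sets is met."""
    return ((key.surface is None or token.surface == key.surface)
            and (key.pos is None or token.pos == key.pos))


def find_sequences(sentence: List[Token], keys: List[Key]) -> List[int]:
    """Return token indices where consecutive tokens satisfy `keys` in order."""
    if not keys:
        return []
    n = len(keys)
    return [i for i in range(len(sentence) - n + 1)
            if all(token_matches(sentence[i + j], keys[j]) for j in range(n))]


def find_characters(sentence: List[Token], chars: str) -> List[int]:
    """Return character offsets where `chars` occurs in the raw text."""
    if not chars:
        return []
    text = "".join(token.surface for token in sentence)
    hits = []
    start = text.find(chars)
    while start != -1:
        hits.append(start)
        start = text.find(chars, start + 1)
    return hits


# Hypothetical analyzed sentence: "母が本を読む" (Mother reads a book).
SENTENCE = [
    Token("母", "noun"), Token("が", "particle"),
    Token("本", "noun"), Token("を", "particle"),
    Token("読む", "verb"),
]

if __name__ == "__main__":
    # Part-of-speech sequence: noun followed by particle.
    print(find_sequences(SENTENCE, [Key(pos="noun"), Key(pos="particle")]))
    # Mixed key: any noun followed by the word "を".
    print(find_sequences(SENTENCE, [Key(pos="noun"), Key(surface="を")]))
    # Character sequence over the raw text.
    print(find_characters(SENTENCE, "本を"))
```

Treating each key as an optional word constraint plus an optional part-of-speech constraint is one simple way to let a single query mix the key types; a production system would also need an index rather than a linear scan.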

Report

(2 results)
  • 1987 Final Research Report Summary
  • 1986 Annual Research Report
  • Research Products

    (18 results)

All Publications (18 results)

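  • [Publications] Hiroya Fujisaki: Proceedings of the 33rd National Convention of the Information Processing Society of Japan. 1831-1832 (1986)

    • Description
      From "Summary of Research Results Report (Japanese)"
    • Related Report
      1987 Final Research Report Summary
  • [Publications] Hiroya Fujisaki: Proceedings of the 35th National Convention of the Information Processing Society of Japan. 1269-1270 (1987)

    • Description
      From "Summary of Research Results Report (Japanese)"
    • Related Report
      1987 Final Research Report Summary
  • [Publications] Hiroya Fujisaki: Proceedings of the 36th National Convention of the Information Processing Society of Japan. (1988)

    • Description
      From "Summary of Research Results Report (Japanese)"
    • Related Report
      1987 Final Research Report Summary
  • [Publications] Tsunao Ogino: Keiryo Kokugogaku. 16. 81-87 (1987)

    • Description
      From "Summary of Research Results Report (Japanese)"
    • Related Report
      1987 Final Research Report Summary
  • [Publications] Yasuhito Tanaka: Proceedings of the 35th National Convention of the Information Processing Society of Japan. 1211-1212 (1987)

    • Description
      From "Summary of Research Results Report (Japanese)"
    • Related Report
      1987 Final Research Report Summary
  • [Publications] Hiroyuki Kameda: Transactions of the Information Processing Society of Japan. 28. 1103-1111 (1987)

    • Description
      From "Summary of Research Results Report (Japanese)"
    • Related Report
      1987 Final Research Report Summary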
  • [Publications] Organization of Large Scale Japanese Text Database with advanced functions: Reports of the 33rd Meeting of Information Processing Society of Japan. 1831-1832 (1986)

    • Description
      From "Summary of Research Results Report (English)"
    • Related Report
      1987 Final Research Report Summary
  • [Publications] Hiroya Fujisaki: "Lexical Category Analysis for a large-scale Japanese Text Database with Advanced Functions" Reports of the 35th Meeting of Information Processing Society of Japan. 1269-1270 (1987)

    • Description
      From "Summary of Research Results Report (English)"
    • Related Report
      1987 Final Research Report Summary
  • [Publications] Hiroya Fujisaki: "Morphemic and Syntactic Analysis for Constructing a Text Database with Advanced Functions" Reports of the 36th Meeting of Information Processing Society of Japan. (1988)

    • Description
      From "Summary of Research Results Report (English)"
    • Related Report
      1987 Final Research Report Summary
  • [Publications] Tsunao Ogino: "Methodology to Evaluate the Performance of Kana-Kanji Conversion Systems" Computational Linguistics. 16. 81-87 (1987)

    • Description
      From "Summary of Research Results Report (English)"
    • Related Report
      1987 Final Research Report Summary
  • [Publications] Yasuhito Tanaka: "Acquisition of Knowledge Data by Analyzing Natural Language" Reports of the 35th Meeting of Information Processing Society of Japan. 1211-1212 (1987)

    • Description
      From "Summary of Research Results Report (English)"
    • Related Report
      1987 Final Research Report Summary
  • [Publications] Hiroyuki Kameda: "Classification and Retrieval System for Newspaper Information Based on a Theme - Key Concept - Key Word Hierarchy" Transactions of Information Processing Society of Japan. 1103-1111 (1987)

    • Description
      From "Summary of Research Results Report (English)"
    • Related Report
      1987 Final Research Report Summary
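  • [Publications] Hiroyuki Kameda: Proceedings of the 33rd National Convention of the Information Processing Society of Japan. 1831-1832 (1986)

    • Related Report
      1986 Annual Research Report
  • [Publications] Hiroyuki Kameda: Proceedings of the 33rd National Convention of the Information Processing Society of Japan. 1833-1834 (1986)

    • Related Report
      1986 Annual Research Report
  • [Publications] Tsunao Ogino: Mai Wapuro (My Word Processor). (1987)

    • Related Report
      1986 Annual Research Report
  • [Publications] Tsunao Ogino: Materials of the 93rd Meeting of the Linguistic Society of Japan. 54 (1986)

    • Related Report
      1986 Annual Research Report
  • [Publications] Yasuhito Tanaka: Proceedings of the 34th National Convention of the Information Processing Society of Japan. (1987)

    • Related Report
      1986 Annual Research Report
  • [Publications] Yasuhito Tanaka: Materials of the Natural Language Study Group, Information Processing Society of Japan. (1987)

    • Related Report
      1986 Annual Research Report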

Published: 1987-03-31   Modified: 2016-04-21  
