
A study on adaptive indexing method for dedicated portals

Research Project

Project/Area Number 18500093
Research Category

Grant-in-Aid for Scientific Research (C)

Allocation Type Single-year Grants
Section General
Research Field Media informatics/Database
Research Institution National Institute of Informatics

Principal Investigator

AIZAWA Akiko  National Institute of Informatics, Digital Content and Media Sciences Research Division, Professor (90222447)

Project Period (FY) 2006 – 2007
Project Status Completed (Fiscal Year 2007)
Budget Amount
¥4,010,000 (Direct Cost: ¥3,500,000, Indirect Cost: ¥510,000)
Fiscal Year 2007: ¥2,210,000 (Direct Cost: ¥1,700,000, Indirect Cost: ¥510,000)
Fiscal Year 2006: ¥1,800,000 (Direct Cost: ¥1,800,000)
Keywords natural language processing / compound extraction / dictionary construction / information retrieval / lexicon / dedicated portal sites / indexing tools / CRF / dedicated portals / vocabulary extraction / specialized portals / EM algorithm
Research Abstract

In recent years, constructing dedicated web portals has become common practice among academics. These portals are valuable information sources that help maintain the diversity of web content and disseminate academic or specialized knowledge to the public. Dedicated portals with specialized content require a good term extraction tool to identify multi-word expressions that are not found in general dictionaries. However, existing segmentation tools are not satisfactory for this purpose.
Based on the above, this study focuses on a keyword extraction method that enhances the search capability of dedicated portal servers. During the two-year research period, we addressed the following:
1. A framework for automatic extraction of multi-word expressions (compounds), in which the following two modules are applied sequentially but independently: (A) a segmentation module that identifies the longest multi-word regions in a given text input, and (B) a parsing module that analyzes the cost of word connections within the same multi-word region.
2. A new method for (B) in which the tree structure of a multi-word expression is determined using a statistical cost function. The parameters of the function are obtained by applying a CRF (conditional random field) to technical terms extracted from the handbooks of academic societies.
Future issues include (i) the implementation of a lightweight tool for automatic keyword extraction using the proposed method, and (ii) the use of the extracted terms for search navigation or text categorization.
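
The two-module framework in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the project's implementation: the tag "T" marking term-candidate tokens, the function names `segment` and `parse`, the greedy split at the costliest connection, and the hand-written `TOY_COSTS` table, which stands in for the CRF-trained cost function that the abstract mentions but does not specify.

```python
from typing import Callable, List, Sequence, Tuple, Union

# A parse tree is either a single word or a pair of subtrees.
Tree = Union[str, Tuple["Tree", "Tree"]]


def segment(tagged: Sequence[Tuple[str, str]], term_tag: str = "T") -> List[List[str]]:
    """Module (A): collect maximal runs of term-candidate tokens (2+ words)."""
    regions: List[List[str]] = []
    run: List[str] = []
    for word, tag in tagged:
        if tag == term_tag:
            run.append(word)
            continue
        if len(run) > 1:
            regions.append(run)
        run = []
    if len(run) > 1:
        regions.append(run)
    return regions


def parse(words: Sequence[str], cost: Callable[[str, str], float]) -> Tree:
    """Module (B): split a region at its costliest connection, recursively."""
    if len(words) == 1:
        return words[0]
    split = max(range(1, len(words)), key=lambda i: cost(words[i - 1], words[i]))
    return (parse(words[:split], cost), parse(words[split:], cost))


# Hypothetical connection costs (lower means more tightly bound).
TOY_COSTS = {
    ("conditional", "random"): 0.2,
    ("random", "field"): 0.1,
    ("field", "training"): 0.9,
}


def toy_cost(left: str, right: str) -> float:
    """Look up a toy cost; unknown pairs get a high default."""
    return TOY_COSTS.get((left, right), 1.0)


# Hypothetical pre-tagged input: "T" marks term candidates, "O" everything else.
TAGGED = [
    ("we", "O"), ("study", "O"),
    ("conditional", "T"), ("random", "T"), ("field", "T"), ("training", "T"),
    ("for", "O"), ("portals", "T"),
]

regions = segment(TAGGED)
trees = [parse(region, toy_cost) for region in regions]
print(trees)
```

With these toy costs, the single region "conditional random field training" is bracketed as ((conditional (random field)) training), and the isolated token "portals" is dropped because it is not a multi-word region. Because the two functions share only the list of regions, either module could be replaced without touching the other, which mirrors the "sequentially but independently" design described above.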

Report

(3 results)
  • 2007 Annual Research Report   Final Research Report Summary
  • 2006 Annual Research Report
  • Research Products

    (25 results)


All Journal Article (14 results) (of which Peer Reviewed: 4 results) Presentation (11 results)

  • [Journal Article] On calculating word similarity using large text corpora (2008)

    • Author(s)
      相澤 彰子
    • Journal Title

      IPSJ Journal 49-3

      Pages: 1426-1436

    • NAID

      110006644536

    • Description
      From the Final Research Report Summary (Japanese version)
    • Related Report
      2007 Final Research Report Summary
    • Peer Reviewed
  • [Journal Article] On calculating word similarity using large text corpora (2008)

    • Author(s)
      Akiko Aizawa
    • Journal Title

      IPSJ Journal 49-3

      Pages: 1426-1436

    • NAID

      110006644536

    • Description
      From the Final Research Report Summary (English version)
    • Related Report
      2007 Final Research Report Summary
  • [Journal Article] On the Effect of Corpus Size in Words Similarity Calculation (2008)

    • Author(s)
      相澤彰子
    • Journal Title

      IPSJ Journal 49-3

      Pages: 1426-1436

    • NAID

      110004824217

    • Related Report
      2007 Annual Research Report
    • Peer Reviewed
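  • [Journal Article] Discovering IS-A relationships from Text: a method based on Dependencies between Nouns and Verbs (2007)

    • Author(s)
      中渡瀬 秀一, 相澤 彰子
    • Journal Title

      Transactions of the Japanese Society for Artificial Intelligence 22-6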

      Pages: 585-594

    • NAID

      10022008204

    • Description
      From the Final Research Report Summary (Japanese version)
    • Related Report
      2007 Final Research Report Summary
    • Peer Reviewed
  • [Journal Article] Co-occurrence based similarity measures (2007)

    • Author(s)
      相澤 彰子
    • Journal Title

      Communications of the Operations Research Society of Japan 52-11

      Pages: 706-712

    • NAID

      110006440287

    • Description
      From the Final Research Report Summary (Japanese version)
    • Related Report
      2007 Final Research Report Summary
  • [Journal Article] Discovering IS-A relationships from Text: a method based on Dependencies between Nouns and Verbs (2007)

    • Author(s)
      Hidekazu Nakawatase, Akiko Aizawa
    • Journal Title

      Transactions of the Japanese Society for Artificial Intelligence Vol. 22, No. 6

      Pages: 585-594

    • NAID

      10022008204

    • Description
      From the Final Research Report Summary (English version)
    • Related Report
      2007 Final Research Report Summary
  • [Journal Article] Co-occurrence based similarity measures (2007)

    • Author(s)
      Akiko Aizawa
    • Journal Title

      Communications of the Operations Research Society of Japan Vol. 52, No. 11

    • Description
      From the Final Research Report Summary (English version)
    • Related Report
      2007 Final Research Report Summary
  • [Journal Article] Discovering IS-A relationships from Text: a method based on Dependencies between Nouns and Verbs (2007)

    • Author(s)
      中渡瀬秀一、相澤彰子
    • Journal Title

      Transactions of the Japanese Society for Artificial Intelligence 22-6

      Pages: 585-594

    • NAID

      10022008204

    • Related Report
      2007 Annual Research Report
    • Peer Reviewed
  • [Journal Article] Co-occurrence based similarity measures (2007)

    • Author(s)
      相澤彰子
    • Journal Title

      Communications of the Operations Research Society of Japan 52-11

      Pages: 706-712

    • NAID

      110006440287

    • Related Report
      2007 Annual Research Report
  • [Journal Article] On the transmission of information through the medium of text (2007)

    • Author(s)
      相澤彰子
    • Journal Title

      Journal of the Japanese Society for Artificial Intelligence 22, 1

      Pages: 14-14

    • Related Report
      2006 Annual Research Report
  • [Journal Article] Detecting Semantic Diversity of Words in Large Scale Corpora (2006)

    • Author(s)
      相澤彰子
    • Journal Title

      IEICE Technical Committee on Artificial Intelligence and Knowledge-Based Processing, Technical Report 106, AI-38

      Pages: 57-62

    • NAID

      110004744920

    • Related Report
      2006 Annual Research Report
  • [Journal Article] Automatic Extraction of Synonyms with Sample Phrases using Dependency Analysis of Text and Its Application to Large-scale Corpora (2006)

    • Author(s)
      相澤彰子, 中渡瀬秀一
    • Journal Title

      Proceedings of the 20th Annual Conference of JSAI

    • NAID

      130005023209

    • Related Report
      2006 Annual Research Report
  • [Journal Article] On the Effect of Corpus Size in Words Similarity Calculation (2006)

    • Author(s)
      相澤彰子
    • Journal Title

      IPSJ, 175th SIG Natural Language Processing Meeting, SIG Technical Report NL-94

      Pages: 91-98

    • NAID

      110004824217

    • Related Report
      2006 Annual Research Report
  • [Journal Article] Prototype of a record linkage system for bibliographic identification (2006)

    • Author(s)
      相澤彰子
    • Journal Title

      Proceedings of "Large-Scale Data Linkage, Data Mining and Statistical Methods"

      Pages: 87-87

    • Related Report
      2006 Annual Research Report
  • [Presentation] Multi-class named entity recognition via bootstrapping with dependency tree-based patterns (2008)

    • Author(s)
      Van B. Dang, Akiko Aizawa
    • Organizer
      The 12th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD 2008)
    • Place of Presentation
      Osaka, Japan
    • Year and Date
      2008-05-23
    • Description
      From the Final Research Report Summary (Japanese version)
    • Related Report
      2007 Final Research Report Summary
  • [Presentation] A study on the analysis and extraction of key phrases for search (2008)

    • Author(s)
      長谷川 新, 相澤 彰子, 浜本 隆之
    • Organizer
      Proceedings of the 70th National Convention of IPSJ
    • Place of Presentation
      Tokyo
    • Year and Date
      2008-03-14
    • Description
      From the Final Research Report Summary (Japanese version)
    • Related Report
      2007 Annual Research Report 2007 Final Research Report Summary
  • [Presentation] Multi-class named entity recognition via bootstrapping with dependency tree-based patterns (2008)

    • Author(s)
      Van B. Dang, Akiko Aizawa
    • Organizer
      The 12th Pacific-Asia Conference on Knowledge Discovery and Data Mining
    • Description
      From the Final Research Report Summary (English version)
    • Related Report
      2007 Final Research Report Summary
  • [Presentation] On calculating word similarity using Web as corpus (2007)

    • Author(s)
      相澤 彰子
    • Organizer
      JSAI Special Interest Group on Knowledge-Based Systems
    • Place of Presentation
      Tokyo
    • Year and Date
      2007-07-14
    • Description
      From the Final Research Report Summary (Japanese version)
    • Related Report
      2007 Annual Research Report 2007 Final Research Report Summary
  • [Presentation] On calculating word similarity using Web as corpus (2007)

    • Author(s)
      Akiko Aizawa
    • Organizer
      JSAI SIG Technical Reports
    • Description
      From the Final Research Report Summary (English version)
    • Related Report
      2007 Final Research Report Summary
  • [Presentation] On the Effect of Corpus Size in Words Similarity Calculation (2006)

    • Author(s)
      相澤 彰子
    • Organizer
      175th SIG Natural Language Processing Meeting / 84th SIG Fundamental Infology Meeting, NL-94
    • Place of Presentation
      Tokyo
    • Year and Date
      2006-09-12
    • Description
      From the Final Research Report Summary (Japanese version)
    • Related Report
      2007 Final Research Report Summary
  • [Presentation] Automatic Extraction of Synonyms with Sample Phrases using Dependency Analysis of Text and Its Application to Large-scale Corpora (2006)

    • Author(s)
      相澤 彰子, 中渡瀬 秀一
    • Organizer
      The 20th Annual Conference of JSAI
    • Place of Presentation
      Tokyo
    • Year and Date
      2006-06-08
    • Description
      From the Final Research Report Summary (Japanese version)
    • Related Report
      2007 Final Research Report Summary
  • [Presentation] Detecting Semantic Diversity of Words in Large Scale Corpora (2006)

    • Author(s)
      相澤 彰子
    • Organizer
      IEICE Technical Committee on Artificial Intelligence and Knowledge-Based Processing
    • Place of Presentation
      Tokyo
    • Year and Date
      2006-05-18
    • Description
      From the Final Research Report Summary (Japanese version)
    • Related Report
      2007 Final Research Report Summary
  • [Presentation] Detecting Semantic Diversity of Words in Large Scale Corpora (2006)

    • Author(s)
      Akiko Aizawa
    • Organizer
      IEICE Tech Reports, AI2006-11
    • Description
      From the Final Research Report Summary (English version)
    • Related Report
      2007 Final Research Report Summary
  • [Presentation] Automatic Extraction of Synonyms with Sample Phrases using Dependency Analysis of Text and Its Application to Large-scale Corpora (2006)

    • Author(s)
      Akiko Aizawa, Hidekazu Nakawatase
    • Organizer
      The 20th Annual Conference of JSAI
    • Description
      From the Final Research Report Summary (English version)
    • Related Report
      2007 Final Research Report Summary
  • [Presentation] On the Effect of Corpus Size in Words Similarity Calculation (2006)

    • Author(s)
      Akiko Aizawa
    • Organizer
      SIG-report of IPSJ
    • Description
      From the Final Research Report Summary (English version)
    • Related Report
      2007 Final Research Report Summary


Published: 2006-04-01   Modified: 2016-04-21  
