Indexing of Table Structures in Internet Search Engines and Its Application to Resolving Semantic Ambiguity

Research Project

Project/Area Number	13780336
Research Category	Grant-in-Aid for Young Scientists (B)
Allocation Type	Single-year Grants
Research Field	Information systems (including information and library science)
Research Institution	The University of Tokushima
Principal Investigator	Masami Shishibori (獅々堀 正幹), The University of Tokushima, Faculty of Engineering, Associate Professor (50274262)
Project Period (FY)	2001 – 2002
Project Status	Completed (Fiscal Year 2002)
Budget Amount	¥2,000,000 (Direct Cost: ¥2,000,000); Fiscal Year 2002: ¥900,000 (Direct Cost: ¥900,000); Fiscal Year 2001: ¥1,100,000 (Direct Cost: ¥1,100,000)
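Keywords	WWW / table structure analysis / table content analysis / information extraction / semantic ambiguity / table structure / indexing / HTML / knowledge acquisition / information retrieval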
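Research Abstract	This research aims to automatically acquire linguistic knowledge from HTML tables on the World Wide Web. Conventional full-text search technology, typified by Internet search engines that target Web data, does not take HTML tag information into account, so for words inside a table the relationships between items are ignored. Many items in tables, however, stand in an attribute and attribute-value relationship, and we expect that linguistic knowledge can be extracted if tables are collected on a large scale. The implementation goal for fiscal year 2001 was "to establish a table structure analysis algorithm that generates position information for each item in a table, and to devise and evaluate an efficient indexing method." To meet it, we devised a method that represents position information as a compact bit string. With this method, position information is represented compactly, and items located in the same row or column of a table can also be retrieved quickly. The implementation goal for fiscal year 2002 was "to establish a table content analysis algorithm that identifies the semantic information of proper nouns in tables, and to develop an Internet search engine that uses the results to take into account the semantic ambiguity of search queries." To meet it, we focused on the fact that the semantic information of each item is reflected in the contents of the items located above, below, to the left and to the right of it (which we call the in-table context), and proposed an algorithm that performs table content analysis by using training data to compute the similarity between contexts with mutual information. As application systems, we developed a system for querying information in tables, a system that reads aloud the tables found on web pages, and others, and confirmed their effectiveness. The results have been presented orally at the Information Processing Society of Japan (IPSJ) Special Interest Groups on Natural Language Processing and on Database Systems, and have also been submitted to the IPSJ Journal (情報処理学会論文誌).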
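
The abstract does not give the actual bit layout or retrieval procedure, so the following is only a minimal sketch of the general idea of storing a cell's position as a compact bit string and using it to find items in the same row or column. The 16-bit field widths and all names (`pack`, `unpack`, `build_index`, `same_row`, `same_column`) are assumptions made for illustration, not the project's method.

```python
# Illustrative sketch only: field widths and function names are assumptions,
# not taken from the project report.
ROW_BITS = 16
COL_BITS = 16
COL_MASK = (1 << COL_BITS) - 1


def pack(row, col):
    """Pack a (row, col) cell position into a single integer bit string."""
    if not (0 <= row < (1 << ROW_BITS) and 0 <= col < (1 << COL_BITS)):
        raise ValueError("position does not fit in the assumed bit widths")
    return (row << COL_BITS) | col


def unpack(code):
    """Recover (row, col) from a packed position code."""
    return code >> COL_BITS, code & COL_MASK


def build_index(table):
    """Map each packed position to the cell text of a rectangular table."""
    index = {}
    for r, row in enumerate(table):
        for c, cell in enumerate(row):
            index[pack(r, c)] = cell
    return index


def same_row(index, code):
    """Cells that share the row of `code`, in column order, excluding itself."""
    row = code >> COL_BITS
    return [cell for key, cell in sorted(index.items())
            if key >> COL_BITS == row and key != code]


def same_column(index, code):
    """Cells that share the column of `code`, in row order, excluding itself."""
    col = code & COL_MASK
    return [cell for key, cell in sorted(index.items())
            if key & COL_MASK == col and key != code]


if __name__ == "__main__":
    table = [["Country", "Capital"], ["Japan", "Tokyo"], ["France", "Paris"]]
    index = build_index(table)
    tokyo = pack(1, 1)
    print(same_row(index, tokyo))     # ['Japan']
    print(same_column(index, tokyo))  # ['Capital', 'Paris']
```

In this sketch the row and column tests are a shift and a mask on the packed code, which is one plausible reason a bit-string position could make row-wise and column-wise lookups cheap. A real index would also have to handle merged cells (`rowspan` and `colspan`), which this sketch ignores.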

Report

(2 results)

2002 Annual Research Report
2001 Annual Research Report

Research Products
(14 results)

All Publications (14 results)

[Publications] Sangkon Lee, Masami Shishibori, Junichi Aoe: "Extraction of Field-coherent passages" Int'l J. Information Processing & Management. Vol. 38, No. 2. 173-207 (2002)
- Related Report
  2002 Annual Research Report
[Publications] E. Atlam, M. Okada, M. Shishibori, J. Aoe: "An Evaluation Method of Word Tendency Depending on Time-series Variation and its Improvements" Int'l J. Information Processing & Management. Vol. 38, No. 2. 157-171 (2002)
- Related Report
  2002 Annual Research Report
[Publications] M. Jung, M. Shishibori, A. Tanaka, J. Aoe: "A Dynamic Construction Algorithm for the Compact Patricia Trie using the Hierarchical Structure" Int'l J. Information Processing & Management. Vol. 38, No. 2. 221-236 (2002)
- Related Report
  2002 Annual Research Report
[Publications] Sangkon Lee, Masami Shishibori: "Passage Segmentation Based on Topic Matter" Int'l J. of Computer Processing of Oriental Languages. Vol. 15, No. 3. 305-340 (2002)
- Related Report
  2002 Annual Research Report
[Publications] M. Shishibori, S. Lee, M. Oono, J. Aoe: "Improvement of the LR Parsing Table and Its Application to Grammatical Error Correction" Int'l J. of Information Sciences. Vol. 148, No. 4. 11-26 (2002)
- Related Report
  2002 Annual Research Report
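[Publications] Satoru Tsuge, Masami Shishibori, Shingo Kuroiwa, Kenji Kita: "サポートベクターマシンによる適合性フィードバックを用いた情報検索" (Information Retrieval Using Relevance Feedback with Support Vector Machines) IPSJ Journal (情報処理学会論文誌). Vol. 44, No. 1. 59-67 (2003)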
- Related Report
  2002 Annual Research Report
[Publications] Kenji Kita, Kazuhiko Tsuda, Masami Shishibori: "情報検索アルゴリズム" (Information Retrieval Algorithms) Kyoritsu Shuppan (共立出版). 212 (2002)
- Related Report
  2002 Annual Research Report
[Publications] Masami Shishibori, Kazuaki Ando, Jun-ichi Aoe: "A Filtering Method for E-mail Documents based on Personal Profiles" Proceedings of the 19th Int'l Conf. on Computer Processing of Oriental Languages. 69-72 (2001)
- Related Report
  2001 Annual Research Report
[Publications] Masami Shishibori, Minsoo Jung, Satoru Tsuge, Jun-ichi Aoe: "Improvement of the Hierarchical Compact Patricia Trie for a Dynamic Large Key Set" Proceedings of the 5th International Conference on Knowledge-Based Intelligent Information Engineering Systems & Allied Technologies. 7. 581-585 (2001)
- Related Report
  2001 Annual Research Report
[Publications] Masami Shishibori, Kazuaki Ando, Jun-ichi Aoe: "A E-mail Filtering System Based on Personal Profiles" Proceedings of the Sixth Natural Language Processing Pacific Rim Symposium. 609-616 (2001)
- Related Report
  2001 Annual Research Report
[Publications] EL-Sayed Atlam, Makoto Okada, Masami Shishibori, Jun-ichi Aoe: "An Evaluation Method of Words Tendency Depending on Time-series Variation and Its Improvements" Journal of Information Processing & Management. Vol. 38, No. 2. 157-171 (2002)
- Related Report
  2001 Annual Research Report
[Publications] Sangkon Lee, Masami Shishibori, Toru Sumitomo, Jun-ichi Aoe: "Extraction of Field-coherent Passages" Journal of Information Processing & Management. Vol. 38, No. 2. 173-207 (2002)
- Related Report
  2001 Annual Research Report
[Publications] Minsoo Jung, Masami Shishibori, Akihiro Tanaka, Jun-ichi Aoe: "A Dynamic Construction Algorithm for the Compact Patricia Trie using the Hierarchical Structure" Journal of Information Processing & Management. Vol. 38, No. 2. 221-236 (2002)
- Related Report
  2001 Annual Research Report
[Publications] Kenji Kita, Kazuhiko Tsuda, Masami Shishibori: "情報検索アルゴリズム" (Information Retrieval Algorithms) Kyoritsu Shuppan Co., Ltd. (共立出版株式会社). 212 (2002)
- Related Report
  2001 Annual Research Report
