Study on High Performance Classification Method for Constructing Information Resources from Large Scale WWW Data

Research Project

Project/Area Number 18300037
Research Category

Grant-in-Aid for Scientific Research (B)

Allocation Type Single-year Grants
Section General
Research Field Media informatics/Database
Research Institution National Institute of Informatics

Principal Investigator

OYAMA Keizo  National Institute of Informatics, Digital Content and Media Sciences Research Division, Professor (90177022)

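Co-Investigator (Kenkyū-buntansha) TAKASU Atsuhiro  National Institute of Informatics, Digital Content and Media Sciences Research Division, Professor (90216648)
AIZAWA Akiko  National Institute of Informatics, Digital Content and Media Sciences Research Division, Professor (90222447)
TAKAKU Masao  National Institute of Informatics, Transdisciplinary Research Integration Center of the Research Organization of Information and Systems, Project Researcher (00399271)
Co-Investigator (Renkei-kenkyūsha) TAKASU Atsuhiro  National Institute of Informatics, Digital Content and Media Sciences Research Division, Professor (90216648)
AIZAWA Akiko  National Institute of Informatics, Digital Content and Media Sciences Research Division, Professor (90222447)
TAKAKU Masao  National Institute for Materials Science, Scientific Information Office, Chief Engineer (00399271)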
Project Period (FY) 2006 – 2008
Project Status Completed (Fiscal Year 2008)
Budget Amount
¥9,450,000 (Direct Cost: ¥7,800,000, Indirect Cost: ¥1,650,000)
Fiscal Year 2008: ¥3,380,000 (Direct Cost: ¥2,600,000, Indirect Cost: ¥780,000)
Fiscal Year 2007: ¥3,770,000 (Direct Cost: ¥2,900,000, Indirect Cost: ¥870,000)
Fiscal Year 2006: ¥2,300,000 (Direct Cost: ¥2,300,000)
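Keywords Web page classification / text classification / machine learning / surrounding pages / performance guarantee / judgment cost / information resources / information retrieval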
Research Abstract

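Reducing the labor of constructing information resources from web data requires more accurate automatic classification of web pages. To improve classification performance by making effective use of the content of surrounding pages, this study developed two methods: one that exploits the latent semantics expressed in the links and directory hierarchy within a website, and one that removes surrounding pages that adversely affect classification. Experiments confirmed the effectiveness of both. These methods make it possible to substantially reduce manual checking and judgment work.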
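
To make the directory-hierarchy idea in the abstract concrete, here is a minimal Python sketch that groups the pages linked from a target page by where they sit relative to the target's directory. It is an illustration only and not the project's actual model: the function names, the relation labels ("same", "upper", "lower", "other", "external"), and the example URLs are all assumptions introduced here.

```python
from collections import defaultdict
from posixpath import dirname
from urllib.parse import urlparse


def _dir_of(path: str) -> str:
    """Return the directory part of a URL path, always ending in '/'."""
    return dirname(path).rstrip("/") + "/"


def directory_relation(target_url: str, other_url: str) -> str:
    """Classify other_url by its position relative to target_url's directory.

    Hypothetical labels: 'external' (different host), 'same', 'upper',
    'lower', or 'other' (same host, unrelated branch).
    """
    t, o = urlparse(target_url), urlparse(other_url)
    if t.netloc != o.netloc:
        return "external"
    t_dir, o_dir = _dir_of(t.path), _dir_of(o.path)
    if o_dir == t_dir:
        return "same"
    if t_dir.startswith(o_dir):
        return "upper"   # other page lives in an ancestor directory
    if o_dir.startswith(t_dir):
        return "lower"   # other page lives in a descendant directory
    return "other"


def group_surrounding(target_url: str, linked_urls: list[str]) -> dict[str, list[str]]:
    """Group linked pages by their directory relation to the target page."""
    groups: dict[str, list[str]] = defaultdict(list)
    for url in linked_urls:
        groups[directory_relation(target_url, url)].append(url)
    return dict(groups)


if __name__ == "__main__":
    target = "http://example.org/~lab/member/index.html"
    links = [
        "http://example.org/~lab/index.html",
        "http://example.org/~lab/member/cv.html",
        "http://example.org/~lab/member/pubs/list.html",
        "http://example.com/elsewhere.html",
    ]
    for relation, urls in group_surrounding(target, links).items():
        print(relation, urls)
```

A classifier could then build separate features from each group, or drop groups that tend to add noise, which is the kind of filtering the abstract alludes to.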

Report

(4 results)
  • 2008 Annual Research Report, Final Research Report (PDF)
  • 2007 Annual Research Report
  • 2006 Annual Research Report
  • Research Products

    (23 results)

All Journal Article (15 results) (of which Peer Reviewed: 9 results) Presentation (8 results)

  • [Journal Article] Web page classification based on surrounding page model representing connection type and directory hierarchy (2009)

    • Author(s)
      Yuxin Wang, Keizo Oyama
    • Journal Title

      IPSJ Transactions on Databases No.TOD-42 (in press)

    • Related Report
      2008 Final Research Report
    • Peer Reviewed
  • [Journal Article] Building web page collections efficiently exploiting local surrounding pages (2009)

    • Author(s)
      Yuxin Wang, Keizo Oyama
    • Journal Title

      Progress in Informatics No.6

      Pages: 27-39

    • NAID

      110007030564

    • Related Report
      2008 Final Research Report
    • Peer Reviewed
  • [Journal Article] Building web page collections efficiently exploiting local surrounding pages (2009)

    • Author(s)
      Yuxin WANG, Keizo OYAMA
    • Journal Title

      Progress in Informatics No. 6

      Pages: 27-39

    • NAID

      110007030564

    • Related Report
      2008 Annual Research Report
    • Peer Reviewed
  • [Journal Article] Web Page Classification based on Surrounding Page Model representing Connection Type and Directory Hierarchy (2009)

    • Author(s)
      Yuxin WANG, Keizo OYAMA
    • Journal Title

      IPSJ Transactions on Databases No. TOD 42 (in press)

    • Related Report
      2008 Annual Research Report
    • Peer Reviewed
  • [Journal Article] Proposal and implementation of a linkage system using large-scale databases (2008)

    • Author(s)
      Akiko Aizawa, Masao Takaku, Keizo Oyama
    • Journal Title

      DBSJ Letters Vol.6, No.4

      Pages: 17-20

    • NAID

      40015959138

    • Related Report
      2008 Final Research Report
    • Peer Reviewed
  • [Journal Article] Proposal and implementation of a linkage system using large-scale databases (2008)

    • Author(s)
      Akiko Aizawa, Masao Takaku, Keizo Oyama
    • Journal Title

      DBSJ Letters 6(4)

      Pages: 17-20

    • NAID

      40015959138

    • Related Report
      2007 Annual Research Report
    • Peer Reviewed
  • [Journal Article] Framework for Building a High-Quality Web Page Collection Considering Page Group Structure (2007)

    • Author(s)
      Yuxin Wang, Keizo Oyama
    • Journal Title

      Proc. APWeb/WAIM 2007, HuangShan, China, June 16-18, 2007 LNCS 4505

      Pages: 95-107

    • Related Report
      2007 Annual Research Report
    • Peer Reviewed
  • [Journal Article] A Smoothing Method for a Statistical String Similarity (2007)

    • Author(s)
      Atsuhiro Takasu, Kenro Aihara, Taizo Yamada
    • Journal Title

      Proc. IEEE Intl. Conf. on Information Reuse and Integration (IRI2007)

      Pages: 67-72

    • Related Report
      2007 Annual Research Report
    • Peer Reviewed
  • [Journal Article] Web Page Classification Considering Page Group Structure for Building a High-Quality Homepage Collection (2007)

    • Author(s)
      Yuxin Wang, Keizo Oyama
    • Journal Title

      Proc. 3rd International Conference on Web Information Systems and Technologies (WEBIST 2007) Vol. WIA

      Pages: 170-175

    • Related Report
      2006 Annual Research Report
  • [Journal Article] Combining page group structure and content for roughly filtering researchers' homepages with high recall (2006)

    • Author(s)
      Yuxin Wang, Keizo Oyama
    • Journal Title

      IPSJ Transactions on Databases Vol.47, No.SIG 8

      Pages: 11-23

    • Related Report
      2008 Final Research Report
    • Peer Reviewed
  • [Journal Article] Combining Page Group Structure and Content for Roughly Filtering Researchers' Homepages with High Recall (2006)

    • Author(s)
      Yuxin Wang, Keizo Oyama
    • Journal Title

      IPSJ Transactions on Databases Vol.47, No.SIG 8 (TOD 30)

      Pages: 11-23

    • Related Report
      2006 Annual Research Report
  • [Journal Article] An Analysis on Topic Features and Difficulties based on Web Navigational Retrieval Experiments (2006)

    • Author(s)
      Masao Takaku, Keizo Oyama, Akiko Aizawa
    • Journal Title

      Proc. Asia Information Retrieval Symposium (AIRS) 2006 LNCS, Vol. 4182/2006

      Pages: 625-632

    • Related Report
      2006 Annual Research Report
  • [Journal Article] Web Page Classification Exploiting Contents of Surrounding Pages for Building a High-quality Homepage Collection (2006)

    • Author(s)
      Yuxin Wang, Keizo Oyama
    • Journal Title

      Proc. 9th International Conference on Asian Digital Libraries (ICADL2006) LNCS, Vol. 4312/2006

      Pages: 515-518

    • Related Report
      2006 Annual Research Report
  • [Journal Article] An Approximate Multi-word Matching Algorithm for Robust Document Retrieval (2006)

    • Author(s)
      Atsuhiro Takasu
    • Journal Title

      Proc. ACM Conference on Information and Knowledge Management (CIKM)

      Pages: 34-42

    • Related Report
      2006 Annual Research Report
  • [Journal Article] Quality Enhancement in Information Extraction from Scanned Documents (2006)

    • Author(s)
      Atsuhiro Takasu, Kenro Aihara
    • Journal Title

      Proc. ACM Symposium on Document Engineering (DocEng)

      Pages: 122-124

    • Related Report
      2006 Annual Research Report
  • [Presentation] Name disambiguation of Japanese researchers: a case study with statistics research community (2008)

    • Author(s)
      Masao Takaku, Akiko Aizawa, Yasumasa Baba
    • Organizer
      Joint Meeting of 4th World Conference of the IASC and 6th Conference of the Asian Regional Section of the IASC on Computational Statistics & Data Analysis (IASC2008)
    • Place of Presentation
      Yokohama, Japan
    • Year and Date
      2008-12-05
    • Related Report
      2008 Final Research Report
  • [Presentation] Name Disambiguation of Japanese Researchers: A Case Study with Statistics Research Community (2008)

    • Author(s)
      Masao Takaku, Akiko Aizawa, Yasumasa Baba
    • Organizer
      Joint Meeting of 4th World Conference of the IASC and 6th Conference of the Asian Regional Section of the IASC on Computational Statistics & Data Analysis (IASC2008)
    • Place of Presentation
      Yokohama, Japan
    • Year and Date
      2008-12-05
    • Related Report
      2008 Annual Research Report
  • [Presentation] Web page classification exploiting surrounding pages with noisy page filtering (2008)

    • Author(s)
      Yuxin Wang, Keizo Oyama
    • Organizer
      The 2008 International Conference on Data Mining (DMIN2008)
    • Place of Presentation
      Las Vegas, Nevada, USA
    • Year and Date
      2008-07-14
    • Related Report
      2008 Final Research Report
  • [Presentation] Web Page Classification exploiting Surrounding Pages with Noisy Page Filtering (2008)

    • Author(s)
      Yuxin Wang, Keizo Oyama
    • Organizer
      The 2008 International Conference on Data Mining (DMIN2008)
    • Place of Presentation
      Las Vegas, Nevada, USA
    • Year and Date
      2008-07-14
    • Related Report
      2008 Annual Research Report
  • [Presentation] A smoothing method for a statistical string similarity (2007)

    • Author(s)
      Atsuhiro Takasu, Kenro Aihara, Taizo Yamada
    • Organizer
      IEEE Intl. Conf. on Information Reuse and Integration (IRI2007)
    • Place of Presentation
      Las Vegas, USA
    • Year and Date
      2007-08-13
    • Related Report
      2008 Final Research Report
  • [Presentation] Framework for building a high-quality web page collection considering page group structure (2007)

    • Author(s)
      Yuxin Wang, Keizo Oyama
    • Organizer
      Joint 9th Asia-Pacific Web Conference, APWeb 2007, and 8th International Conference on Web-Age Information Management, WAIM 2007
    • Place of Presentation
      HuangShan, China
    • Year and Date
      2007-07-16
    • Related Report
      2008 Final Research Report
  • [Presentation] Web page classification considering page group structure for building a high-quality homepage collection (2007)

    • Author(s)
      Yuxin Wang, Keizo Oyama
    • Organizer
      Third International Conference on Web Information Systems and Technologies (WEBIST 2007)
    • Place of Presentation
      Barcelona, Spain
    • Year and Date
      2007-03-03
    • Related Report
      2008 Final Research Report
  • [Presentation] Web page classification exploiting contents of surrounding pages for building a high-quality homepage collection (2006)

    • Author(s)
      Yuxin Wang, Keizo Oyama
    • Place of Presentation
      Kyoto, Japan
    • Year and Date
      2006-11-27
    • Related Report
      2008 Final Research Report

Published: 2006-04-01   Modified: 2016-04-21  