
Discovery Knowledge and Data Mining from Structured Data

Research Project

Project/Area Number 13680459
Research Category

Grant-in-Aid for Scientific Research (C)

Allocation Type Single-year Grants
Section General
Research Field Intelligent informatics
Research Institution Hiroshima City University

Principal Investigator

MIYAHARA Tetsuhiro  Hiroshima City University, Faculty of Information Sciences, Associate Professor (90209932)

Co-Investigator (Kenkyū-buntansha) KUBOYAMA Tetsuji  The University of Tokyo, Center for Collaborative Research, Research Associate (80302660)
SHOUDAI Takayoshi  Kyushu University, Department of Informatics, Associate Professor (50226304)
UCHIDA Tomoyuki  Hiroshima City University, Faculty of Information Sciences, Associate Professor (70264934)
Project Period (FY) 2001 – 2003
Project Status Completed (Fiscal Year 2003)
Budget Amount
¥3,600,000 (Direct Cost: ¥3,600,000)
Fiscal Year 2003: ¥900,000 (Direct Cost: ¥900,000)
Fiscal Year 2002: ¥900,000 (Direct Cost: ¥900,000)
Fiscal Year 2001: ¥1,800,000 (Direct Cost: ¥1,800,000)
Keywords data mining / knowledge discovery / graph-structured data / semistructured data / tree structured pattern / HTML / XML file / inductive inference
Research Abstract

The purpose of this research project is to establish theoretical foundations for data mining systems that work on graph-structured or tree-structured data. Web documents such as HTML and XML files have been increasing rapidly. Such documents have no rigid structure and are called semistructured data; in general, they are represented by rooted trees. We proposed methods for discovering frequent tree-structured patterns in semistructured Web documents, using a tag tree pattern as a hypothesis. A tag tree pattern is an edge-labeled tree that has ordered or unordered children and structured variables. An edge label is a tag or a keyword occurring in the Web documents, and a variable can be replaced by an arbitrary tree, so a tag tree pattern is well suited to representing tree-structured patterns in such documents. Information extraction from semistructured data is becoming increasingly important. To extract meaningful or interesting content, one needs to extract the structured patterns common to the data. We presented a method for extracting characteristic tag tree patterns from irregular semistructured data, based on an algorithm that finds a minimally generalized tag tree pattern explaining the given data. We also gave several learning algorithms for term trees, which are tree-structured patterns with structured variables, from tree-structured data, since such learning algorithms provide theoretical foundations for data mining from semistructured data.
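
To make the idea of a tag tree pattern concrete, the following is a minimal Python sketch under strong simplifying assumptions; it is not the project's algorithm. It assumes ordered children only, represents a tree as the list of `(label, subtree)` edges leaving its root, and lets a variable edge (marked by the hypothetical label `VAR`) stand for exactly one edge with any label and any subtree below it. The names `VAR`, `matches` and `frequency`, and the sample documents, are illustrative and do not come from the report.

```python
# A tree is the list of edges leaving its root; each edge is (label, subtree).
# VAR is a hypothetical marker for a variable edge (an assumption of this sketch).
VAR = None


def matches(pattern, data):
    """Return True if the ordered pattern matches the ordered data tree.

    Simplification: a variable edge matches exactly one data edge,
    whatever its label and whatever subtree hangs below it.
    """
    if len(pattern) != len(data):
        return False
    for (p_label, p_sub), (d_label, d_sub) in zip(pattern, data):
        if p_label is VAR:
            continue  # variable: accept any label and any subtree
        if p_label != d_label or not matches(p_sub, d_sub):
            return False
    return True


def frequency(pattern, documents):
    """Fraction of documents that the pattern matches."""
    return sum(matches(pattern, d) for d in documents) / len(documents)


# Three toy documents, written as trees of tags (made-up sample data).
docs = [
    [("html", [("head", [("title", [])]), ("body", [("p", [])])])],
    [("html", [("head", [("title", [])]), ("body", [("ul", [("li", [])])])])],
    [("html", [("body", [("p", [])])])],
]

# Pattern: html with a head/title and a body whose single child is a variable.
pattern = [("html", [("head", [("title", [])]), ("body", [(VAR, [])])])]

print(frequency(pattern, docs))
```

In this toy setting a pattern would be called frequent when `frequency` reaches a user-chosen threshold. The tag tree patterns described in the abstract are more general: they also allow unordered children and variables that are replaced by arbitrary trees.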

Report

(4 results)
  • 2003 Annual Research Report   Final Research Report Summary
  • 2002 Annual Research Report
  • 2001 Annual Research Report
  • Research Products

    (51 results)

All Publications (51 results)

  • [Publications] Tetsuhiro Miyahara et al.: "Discovery of frequent tree structured patterns in semistructured web documents"Proc.PAKDD-2001, Lecture Notes in Artificial Intelligence, Springer-Verlag. 2035. 47-52 (2001)

    • Description
      From "Summary of Research Results Report (Japanese)"
    • Related Report
      2003 Final Research Report Summary
  • [Publications] Takayoshi Shoudai et al.: "Polynomial time algorithms for finding unordered tree patterns with internal variables"Proc.FCT-2001, Lecture Notes in Computer Science, Springer-Verlag. 2138. 335-346 (2001)

    • Description
      From "Summary of Research Results Report (Japanese)"
    • Related Report
      2003 Final Research Report Summary
  • [Publications] Osamu Maruyama et al.: "Learning conformation rules"Proc.DS-2001, Lecture Notes in Computer Science, Springer-Verlag. 2226. 243-257 (2001)

    • Description
      From "Summary of Research Results Report (Japanese)"
    • Related Report
      2003 Final Research Report Summary
  • [Publications] Tetsuhiro Miyahara et al.: "Discovery of frequent tag tree patterns in semistructured web documents"Proc.PAKDD-2002, Lecture Notes in Artificial Intelligence, Springer-Verlag. 2336. 341-355 (2002)

    • Description
      From "Summary of Research Results Report (Japanese)"
    • Related Report
      2003 Final Research Report Summary
  • [Publications] Kazuyoshi Furukawa et al.: "Extracting characteristic structures among words in semistructured documents"Proc.PAKDD-2002, Lecture Notes in Artificial Intelligence, Springer-Verlag. 2336. 356-367 (2002)

    • Description
      From "Summary of Research Results Report (Japanese)"
    • Related Report
      2003 Final Research Report Summary
  • [Publications] Yusuke Suzuki et al.: "Polynomial Time Inductive Inference of Ordered Tree Patterns with Internal Structured Variables from Positive Data"Proc.COLT02, Lecture Notes in Artificial Intelligence, Springer-Verlag. 2375. 169-184 (2002)

    • Description
      From "Summary of Research Results Report (Japanese)"
    • Related Report
      2003 Final Research Report Summary
  • [Publications] Yusuke Suzuki et al.: "A Polynomial Time Matching Algorithm of Structured Ordered Tree Patterns for Data Mining from Semistructured Data"Proc.ILP02, Lecture Notes in Artificial Intelligence, Springer-Verlag. 2583. 270-284 (2003)

    • Description
      From "Summary of Research Results Report (Japanese)"
    • Related Report
      2003 Final Research Report Summary
  • [Publications] Yusuke Suzuki et al.: "Ordered Term Tree Languages Which Are Polynomial Time Inductively Inferable from Positive Data"Proc.ALT02, Lecture Notes in Artificial Intelligence, Springer-Verlag. 2533. 188-202 (2002)

    • Description
      From "Summary of Research Results Report (Japanese)"
    • Related Report
      2003 Final Research Report Summary
  • [Publications] Osamu Maruyama et al.: "Toward drawing an atlas of hypothesis classes"Proc.DS-2002, Lecture Notes in Computer Science, Springer-Verlag. 2534. 220-232 (2002)

    • Description
      From "Summary of Research Results Report (Japanese)"
    • Related Report
      2003 Final Research Report Summary
  • [Publications] Satoshi Matsumoto et al.: "Learning of Finite Unions of Tree Patterns with Internal Structured Variables from Queries"Proc.AI02, Lecture Notes in Artificial Intelligence, Springer-Verlag. 2557. 523-534 (2002)

    • Description
      From "Summary of Research Results Report (Japanese)"
    • Related Report
      2003 Final Research Report Summary
  • [Publications] Tetsuhiro Miyahara et al.: "Extraction of Tag Tree Patterns with Contractible Variables from Irregular Semistructured data"Proc.PAKDD03, Lecture Notes in Artificial Intelligence, Springer-Verlag. 2637. 430-436 (2003)

    • Description
      From "Summary of Research Results Report (Japanese)"
    • Related Report
      2003 Final Research Report Summary
  • [Publications] Yuko Itokawa et al.: "Finding Frequent Subgraphs from Graph Structured Data with Geometric Information and Its Application to Lossless Compression"Proc.PAKDD03, Lecture Notes in Artificial Intelligence, Springer-Verlag. 2637. 582-594 (2003)

    • Description
      From "Summary of Research Results Report (Japanese)"
    • Related Report
      2003 Final Research Report Summary
  • [Publications] Yusuke Suzuki et al.: "Efficient Learning of Unlabeled Term Trees with Contractible Variables from Positive Data"Proc.ILP03, Lecture Notes in Artificial Intelligence, Springer-Verlag. 2835. 347-364 (2003)

    • Description
      From "Summary of Research Results Report (Japanese)"
    • Related Report
      2003 Final Research Report Summary
  • [Publications] Kazunori Yamagata et al.: "An Effective Grammar-Based Compression Algorithm for Tree Structured Data"Proc.ILP03, Lecture Notes in Artificial Intelligence, Springer-Verlag. 2835. 383-400 (2003)

    • Description
      From "Summary of Research Results Report (Japanese)"
    • Related Report
      2003 Final Research Report Summary
  • [Publications] Yusuke Suzuki et al.: "Efficient Learning of Ordered and Unordered Tree Patterns with Contractible Variables."Proc.ALT03, Lecture Notes in Artificial Intelligence, Springer-Verlag. 2842. 114-128 (2003)

    • Description
      From "Summary of Research Results Report (Japanese)"
    • Related Report
      2003 Final Research Report Summary
  • [Publications] Satoshi Matsumoto et al.: "Learning of Finite Unions of Tree Patterns with Repeated Internal Structured Variables from Queries"Proc.ALT03, Lecture Notes in Artificial Intelligence, Springer-Verlag. 2842. 144-158 (2003)

    • Description
      From "Summary of Research Results Report (Japanese)"
    • Related Report
      2003 Final Research Report Summary
  • [Publications] Sachio Hirokawa et al.: "Semi-Automatic Construction of Metadata from a Series of Web Documents."Proc.AI03, Lecture Notes in Artificial Intelligence, Springer-Verlag. 2903. 942-953 (2003)

    • Description
      From "Summary of Research Results Report (Japanese)"
    • Related Report
      2003 Final Research Report Summary
  • [Publications] T.Miyahara, T.Shoudai, T.Uchida, K.Takahashi, H.Ueda: "Discovery of frequent tree structured patterns in semistructured web documents"Proceedings of the 5th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD-2001) (Springer-Verlag) Lecture Notes in Artificial Intelligence. Vol.2035. 47-52 (2001)

    • Description
      From "Summary of Research Results Report (English)"
    • Related Report
      2003 Final Research Report Summary
  • [Publications] T.Shoudai, T.Uchida, T.Miyahara: "Polynomial time algorithms for finding unordered tree patterns with internal variables"Proceedings of the 13th International Symposium on Fundamentals of Computation Theory (FCT 2001) (Springer-Verlag) Lecture Notes in Computer Science. Vol.2138. 335-346 (2001)

    • Description
      From "Summary of Research Results Report (English)"
    • Related Report
      2003 Final Research Report Summary
  • [Publications] O.Maruyama, T.Shoudai, E.Furuichi, S.Kuhara: "Learning Conformation Rules"Proceedings of the 4th International Conference of Discovery Science (DS-2001) (Springer-Verlag) Lecture Notes in Artificial Intelligence. Vol.2226. 243-257 (2001)

    • Description
      From "Summary of Research Results Report (English)"
    • Related Report
      2003 Final Research Report Summary
  • [Publications] T.Miyahara, Y.Suzuki, T.Shoudai, T.Uchida, K.Takahashi, H.Ueda: "Discovery of Frequent Tag Tree Patterns in Semistructured Web Documents"Proceedings of the 6th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD-2002) (Springer-Verlag) Lecture Notes in Artificial Intelligence. Vol.2336. 341-355 (2002)

    • Description
      From "Summary of Research Results Report (English)"
    • Related Report
      2003 Final Research Report Summary
  • [Publications] K.Furukawa, T.Uchida, K.Yamada, T.Miyahara, T.Shoudai, Y.Nakamura: "Extracting Characteristic Structures among Words in Semistructured Documents"Proceedings 6th Pacific-Asia Conference on Advances in Knowledge Discovery and Data Mining (PAKDD-2002) (Springer-Verlag) Lecture Notes in Artificial Intelligence. Vol.2336. 356-367 (2002)

    • Description
      From "Summary of Research Results Report (English)"
    • Related Report
      2003 Final Research Report Summary
  • [Publications] Y.Suzuki, R.Akanuma, T.Shoudai, T.Miyahara, T.Uchida: "Polynomial Time Inductive Inference of Ordered Tree Patterns with Internal Structured Variables from Positive Data"Proceedings of the 15th Annual Conference on Computational Learning Theory (COLT 2002) (Springer-Verlag) Lecture Notes in Artificial Intelligence. Vol.2375. 169-184 (2002)

    • Description
      From "Summary of Research Results Report (English)"
    • Related Report
      2003 Final Research Report Summary
  • [Publications] Y.Suzuki, K.Inomae, T.Shoudai, T.Miyahara, T.Uchida: "A polynomial time matching algorithm of structured ordered tree patterns for data mining from semistructured data"Proceedings of the 12th International Conference on Inductive Logic Programming (ILP-2002) (Springer-Verlag) Lecture Notes in Artificial Intelligence. Vol.2583. 270-284 (2003)

    • Description
      From "Summary of Research Results Report (English)"
    • Related Report
      2003 Final Research Report Summary
  • [Publications] Y.Suzuki, T.Shoudai, T.Uchida, T.Miyahara: "Ordered Term Tree Languages Which Are Polynomial Time Inductively Inferable from Positive Data"Proceedings 13th International Conference on Algorithmic Learning Theory (ALT-2002) (Springer-Verlag) Lecture Notes in Artificial Intelligence. Vol.2533. 188-203 (2002)

    • Description
      From "Summary of Research Results Report (English)"
    • Related Report
      2003 Final Research Report Summary
  • [Publications] O.Maruyama, T.Shoudai, S.Miyano: "Toward Drawing an Atlas of Hypothesis Classes"Proceedings of the 5th International Conference on Discovery Science (DS-2002) (Springer-Verlag) Lecture Notes in Computer Science. Vol.2534. 220-232 (2002)

    • Description
      From "Summary of Research Results Report (English)"
    • Related Report
      2003 Final Research Report Summary
  • [Publications] S.Matsumoto, T.Shoudai, T.Miyahara, T.Uchida: "Learning of Finite Unions of Tree Patterns with Internal Structured Variables from Queries"Proceedings of the 15th Australian Joint Conference on Artificial Intelligence (AI-2002) (Springer-Verlag) Lecture Notes in Artificial Intelligence. Vol.2557. 523-534 (2002)

    • Description
      From "Summary of Research Results Report (English)"
    • Related Report
      2003 Final Research Report Summary
  • [Publications] T.Miyahara, Y.Suzuki, T.Shoudai, T.Uchida, S.Hirokawa, K.Takahashi, H.Ueda: "Extraction of Tag Tree Patterns with Contractible Variables from Irregular Semistructured Data"Proceedings of the 7th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD-2003) (Springer-Verlag) Lecture Notes in Artificial Intelligence. Vol.2637. 430-436 (2003)

    • Description
      From "Summary of Research Results Report (English)"
    • Related Report
      2003 Final Research Report Summary
  • [Publications] Y.Itokawa, T.Uchida, T.Shoudai, T.Miyahara, Y.Nakamura: "Finding Frequent Subgraphs from Graph Structured Data with Geometric Information and Its Application to Lossless Compression"Proceedings of the 7th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD-2003) (Springer-Verlag) Lecture Notes in Artificial Intelligence. Vol.2637. 582-594 (2003)

    • Description
      From "Summary of Research Results Report (English)"
    • Related Report
      2003 Final Research Report Summary
  • [Publications] Y.Suzuki, T.Shoudai, S.Matsumoto, T.Uchida: "Efficient Learning of Unlabeled Term Trees with Contractible Variables from Positive Data"Proceedings of the 13th International Conference on Inductive Logic Programming (ILP-2003) (Springer-Verlag) Lecture Notes in Artificial Intelligence. Vol.2835. 347-364 (2003)

    • Description
      From "Summary of Research Results Report (English)"
    • Related Report
      2003 Final Research Report Summary
  • [Publications] K.Yamagata, T.Uchida, T.Shoudai, Y.Nakamura: "An Effective Grammar-Based Compression Algorithm for Tree Structured Data"Proceedings 13th International Conference on Inductive Logic Programming (ILP-2003) (Springer-Verlag) Lecture Notes in Artificial Intelligence. Vol.2835. 383-400 (2003)

    • Description
      From "Summary of Research Results Report (English)"
    • Related Report
      2003 Final Research Report Summary
  • [Publications] Y.Suzuki, T.Shoudai, S.Matsumoto, T.Uchida, T.Miyahara: "Efficient Learning of Ordered and Unordered Tree Patterns with Contractible Variables"Proceedings of the 14th Workshop on Algorithmic Learning Theory (ALT-2003) (Springer-Verlag) Lecture Notes in Artificial Intelligence. Vol.2842. 114-128 (2003)

    • Description
      From "Summary of Research Results Report (English)"
    • Related Report
      2003 Final Research Report Summary
  • [Publications] S.Matsumoto, Y.Suzuki, T.Shoudai, T.Miyahara, T.Uchida: "Learning of Finite Unions of Tree Patterns with Repeated Internal Structured Variables from Queries"Proceedings of the 14th Workshop on Algorithmic Learning Theory (ALT-2003) (Springer-Verlag) Lecture Notes in Artificial Intelligence. Vol.2842. 144-158 (2003)

    • Description
      From "Summary of Research Results Report (English)"
    • Related Report
      2003 Final Research Report Summary
  • [Publications] S.Hirokawa, E.Itoh, T.Miyahara: "Semi-Automatic Construction of Metadata from a Series of Web Documents"Proceedings of the 16th Australian Joint Conference on Artificial Intelligence (AI-2003) (Springer-Verlag) Lecture Notes in Artificial Intelligence. Vol.2903. 942-953 (2003)

    • Description
      From "Summary of Research Results Report (English)"
    • Related Report
      2003 Final Research Report Summary
  • [Publications] Yusuke Suzuki: "Efficient Learning of Unlabeled Term Trees with Contractible Variables from Positive Data"Proc.ILP03, Lecture Notes in Artificial Intelligence, Springer-Verlag. 2835. 347-364 (2003)

    • Related Report
      2003 Annual Research Report
  • [Publications] Kazunori Yamagata: "An Effective Grammar-Based Compression Algorithm for Tree Structured Data"Proc.ILP03, Lecture Notes in Artificial Intelligence, Springer-Verlag. 2835. 383-400 (2003)

    • Related Report
      2003 Annual Research Report
  • [Publications] Yusuke Suzuki: "Efficient Learning of Ordered and Unordered Tree Patterns with Contractible Variables."Proc.ALT03, Lecture Notes in Artificial Intelligence, Springer-Verlag. 2842. 114-128 (2003)

    • Related Report
      2003 Annual Research Report
  • [Publications] Satoshi Matsumoto: "Learning of Finite Unions of Tree Patterns with Repeated Internal Structured Variables from Queries"Proc.ALT03, Lecture Notes in Artificial Intelligence, Springer-Verlag. 2842. 144-158 (2003)

    • Related Report
      2003 Annual Research Report
  • [Publications] Sachio Hirokawa: "Semi-Automatic Construction of Metadata from a Series of Web Documents."Proc.AI03, Lecture Notes in Artificial Intelligence, Springer-Verlag. 2903. 942-953 (2003)

    • Related Report
      2003 Annual Research Report
  • [Publications] Tetsuhiro Miyahara: "Discovery of Maximally Frequent Tag Tree Patterns with Contractible Variables from Semistructured Documents"Proc.PAKDD04, Lecture Notes in Artificial Intelligence, Springer-Verlag. (to appear). (2004)

    • Related Report
      2003 Annual Research Report
  • [Publications] Tomoyuki Uchida: "Finding Frequent Structural Features among Words in Tree-Structured Documents"Proc.PAKDD04, Lecture Notes in Artificial Intelligence, Springer-Verlag. (to appear). (2004)

    • Related Report
      2003 Annual Research Report
  • [Publications] Yusuke Suzuki: "Polynomial Time Inductive Inference of Ordered Tree Patterns with Internal Structured Variables from Positive Data"Proc. COLT02, Lecture Notes in Artificial Intelligence, Springer-Verlag. 2375. 169-184 (2002)

    • Related Report
      2002 Annual Research Report
  • [Publications] Yusuke Suzuki: "Ordered Term Tree Languages Which Are Polynomial Time Inductively Inferable from Positive Data"Proc. ALT02, Lecture Notes in Artificial Intelligence, Springer-Verlag. 2533. 188-202 (2002)

    • Related Report
      2002 Annual Research Report
  • [Publications] Satoshi Matsumoto: "Learning of Finite Unions of Tree Patterns with Internal Structured Variables from Queries"Proc.AI02, Lecture Notes in Artificial Intelligence, Springer-Verlag. 2557. 523-534 (2002)

    • Related Report
      2002 Annual Research Report
  • [Publications] Yusuke Suzuki: "A Polynomial Time Matching Algorithm of Structured Ordered Tree Patterns for Data Mining from Semistructured Data"Proc. ILP02, Lecture Notes in Artificial Intelligence, Springer-Verlag. 2583. 270-284 (2002)

    • Related Report
      2002 Annual Research Report
  • [Publications] Tetsuhiro Miyahara: "Extraction of Tag Tree Patterns with Contractible Variables from Irregular Semistructured data"Proc. PAKDD03, Lecture Notes in Artificial Intelligence, Springer-Verlag. (to appear). (2003)

    • Related Report
      2002 Annual Research Report
  • [Publications] Yuko Itokawa: "Finding Frequent Subgraphs from Graph Structured Data with Geometric Information and Its Application to Lossless Compression"Proc. PAKDD03, Lecture Notes in Artificial Intelligence, Springer-Verlag. (to appear). (2003)

    • Related Report
      2002 Annual Research Report
  • [Publications] Tetsuhiro Miyahara: "Discovery of Frequent Tree Structured Patterns in Semistructured Web Documents"Lecture Notes in Artificial Intelligence, Springer-Verlag. 2035. 47-52 (2001)

    • Related Report
      2001 Annual Research Report
  • [Publications] Takayoshi Shoudai: "Polynomial Time Algorithms for Finding Unordered Tree Patterns with Internal Variables"Lecture Notes in Computer Science, Springer-Verlag. 2138. 335-346 (2001)

    • Related Report
      2001 Annual Research Report
  • [Publications] Tetsuhiro Miyahara: "Discovery of Frequent Tag Tree Patterns in Semistructured Web Documents"Lecture Notes in Artificial Intelligence, Springer-Verlag. 2336. (2002)

    • Related Report
      2001 Annual Research Report
  • [Publications] Kazuyoshi Furukawa: "Extracting Characteristic Structures among Words in Semistructured Documents"Lecture Notes in Artificial Intelligence, Springer-Verlag. 2336. (2002)

    • Related Report
      2001 Annual Research Report

Published: 2001-04-01   Modified: 2016-04-21  
