
Algorithms for Extraction of Common Patterns from Data in Bioinformatics

Research Project

Project/Area Number 13680394
Research Category

Grant-in-Aid for Scientific Research (C)

Allocation Type Single-year Grants
Section General
Research Field Computer Science
Research Institution Kyoto University

Principal Investigator

AKUTSU Tatsuya  Kyoto University, Institute for Chemical Research, Professor (90261859)

Co-Investigator (Kenkyū-buntansha) MIYANO Satoru  University of Tokyo, Institute of Medical Science, Professor (50128104)
UEDA Nobuhisa  Kyoto University, Institute for Chemical Research, Assistant Professor (80346048)
Project Period (FY) 2001 – 2003
Project Status Completed (Fiscal Year 2003)
Budget Amount
¥2,600,000 (Direct Cost: ¥2,600,000)
Fiscal Year 2003: ¥1,000,000 (Direct Cost: ¥1,000,000)
Fiscal Year 2002: ¥700,000 (Direct Cost: ¥700,000)
Fiscal Year 2001: ¥900,000 (Direct Cost: ¥900,000)
Keywords Bioinformatics / Pattern matching / Sequence motif / Algorithms / Kernel method / Position specific score matrix / Local alignment / Support vector machine / Homology search / Point set / Congruence testing / Largest common point set / Approximate matching / Motif extraction / Maximum clique / Electrophoresis / Spot matching / Structural alignment / Gibbs sampling / Relative entropy / Local search
Research Abstract

We studied algorithms for extracting common patterns from biological data such as DNA sequences, protein structures, two-dimensional electrophoresis images, and gene expression data. The main results are as follows.
1. Extraction of Common Patterns by Local Alignment. The Gibbs sampling algorithm is widely used for extracting common patterns from sequence data. We developed a variant of the Gibbs sampling algorithm that can be applied to numerical sequences, and applied it to the detection of motifs in protein structures.
2. On the Complexity of Deriving Patterns from Positive and Negative Sequences. We studied theoretical aspects of deriving position specific score matrices (PSSMs) from positive and negative sequences.
3. Pattern Matching Algorithms for Electrophoresis Image Data and Protein Structures. We formulated a pattern matching problem for two-dimensional electrophoresis image data as a geometric matching problem and proved that it is NP-hard. Despite this hardness result, we developed practical algorithms for computing optimal matchings for electrophoresis image data analysis and protein structure alignment.
4. Kernel Method for Protein Sequence Classification. We developed a new kernel function for protein sequences based on the well-known local alignment algorithm for protein sequences. The kernel was combined with a support vector machine and tested on benchmark data. The results show that the new kernel outperforms existing kernels.
5. Inference of Protein-Protein Interactions Using Linear Programming. We developed a new method for inferring the probabilities of domain-domain interactions from experimental data by formulating the inference problem as a linear program. The method was applied to the inference of protein-protein interactions and compared with existing methods. The results show that the proposed method outperforms existing methods for numerical data.
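
For readers unfamiliar with the position specific score matrices mentioned in items 1 and 2, the following is a minimal illustrative sketch and not the project's algorithm. It builds a log-odds matrix from positive examples only and scans a sequence for the best-scoring window. The function names (build_pssm, score_window, best_hit), the DNA alphabet, the uniform background, and the pseudocount of 1 are all assumptions made for this example.

```python
import math
from collections import Counter

ALPHABET = "ACGT"  # assumption: DNA sequences; a protein alphabet would work the same way


def build_pssm(aligned, background=None, pseudocount=1.0):
    """Build a log-odds PSSM from equal-length aligned motif instances.

    Returns a list with one dict per motif position, mapping each symbol
    to log2(observed frequency / background frequency).
    """
    if background is None:
        # assumption: uniform background distribution
        background = {c: 1.0 / len(ALPHABET) for c in ALPHABET}
    width = len(aligned[0])
    total = len(aligned) + pseudocount * len(ALPHABET)
    pssm = []
    for j in range(width):
        counts = Counter(seq[j] for seq in aligned)
        column = {}
        for c in ALPHABET:
            freq = (counts[c] + pseudocount) / total  # smoothed frequency
            column[c] = math.log2(freq / background[c])
        pssm.append(column)
    return pssm


def score_window(pssm, window):
    """Sum the per-position log-odds scores of one window."""
    return sum(column[ch] for column, ch in zip(pssm, window))


def best_hit(pssm, sequence):
    """Return (score, start) of the highest-scoring window.

    The sequence must be at least as long as the motif.
    """
    width = len(pssm)
    scored = [
        (score_window(pssm, sequence[i:i + width]), i)
        for i in range(len(sequence) - width + 1)
    ]
    return max(scored, key=lambda pair: pair[0])


if __name__ == "__main__":
    # hypothetical motif instances and target sequence, for illustration only
    motif_instances = ["TATAAT", "TATAAT", "TACAAT", "TATGAT"]
    matrix = build_pssm(motif_instances)
    score, start = best_hit(matrix, "GGCGTATAATGCC")
    print(f"best window starts at {start} with score {score:.3f}")
```

A Gibbs sampling motif finder of the kind mentioned in item 1 would repeatedly rebuild such a matrix from all but one sequence and resample the motif position in the held-out sequence; that loop is omitted here.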

Report

(4 results)
  • 2003 Annual Research Report   Final Research Report Summary
  • 2002 Annual Research Report
  • 2001 Annual Research Report
  • Research Products

    (27 results)

All Journal Article (14 results) Book (1 results) Publications (12 results)

  • [Journal Article] Protein homology detection using string alignment kernels (2004)

    • Author(s)
      H.Saigo, J-P.Vert, N.Ueda, T.Akutsu
    • Journal Title

      Bioinformatics 20

      Pages: 1682-1689

    • Description
      From the Summary of Research Results Report (Japanese)
    • Related Report
      2003 Final Research Report Summary
  • [Journal Article] Algorithms for point set matching with k-differences (2004)

    • Author(s)
      T.Akutsu
    • Journal Title

      Lecture Notes in Computer Science 3106

      Pages: 249-258

    • Description
      From the Summary of Research Results Report (Japanese)
    • Related Report
      2003 Final Research Report Summary
  • [Journal Article] A simple greedy algorithm for finding functional relations: efficient implementation and average case analysis (2003)

    • Author(s)
      T.Akutsu, S.Miyano, S.Kuhara
    • Journal Title

      Theoretical Computer Science 292

      Pages: 481-495

    • Description
      From the Summary of Research Results Report (Japanese)
    • Related Report
      2003 Final Research Report Summary
  • [Journal Article] Point matching under non-uniform distortions (2003)

    • Author(s)
      T.Akutsu, K.Kanaya, A.Ohyama, A.Fujiyama
    • Journal Title

      Discrete Applied Mathematics 127

      Pages: 5-21

    • Description
      From the Summary of Research Results Report (Japanese)
    • Related Report
      2003 Final Research Report Summary
  • [Journal Article] Identification of genetic networks by strategic gene disruptions and gene overexpressions under a Boolean model (2003)

    • Author(s)
      T.Akutsu, S.Kuhara, O.Maruyama, S.Miyano
    • Journal Title

      Theoretical Computer Science 298

      Pages: 235-251

    • Description
      From the Summary of Research Results Report (Japanese)
    • Related Report
      2003 Final Research Report Summary
  • [Journal Article] Inferring strengths of protein-protein interactions from experimental data using linear programming (2003)

    • Author(s)
      M.Hayashida, N.Ueda, T.Akutsu
    • Journal Title

      Bioinformatics 19

    • Description
      From the Summary of Research Results Report (Japanese)
    • Related Report
      2003 Final Research Report Summary
  • [Journal Article] Finding optimal degenerate patterns in DNA sequences (2003)

    • Author(s)
      D.Shinozaki, T.Akutsu, O.Maruyama
    • Journal Title

      Bioinformatics 19

    • Description
      From the Summary of Research Results Report (Japanese)
    • Related Report
      2003 Final Research Report Summary
  • [Journal Article] On the complexity of deriving position specific score matrices from examples (2002)

    • Author(s)
      T.Akutsu, H.Bannai, S.Miyano, S.Ott
    • Journal Title

      Lecture Notes in Computer Science 2373

      Pages: 168-177

    • NAID

      110002812473

    • Description
      From the Summary of Research Results Report (Japanese)
    • Related Report
      2003 Final Research Report Summary
  • [Journal Article] Inferring a union of halfspaces from examples (2002)

    • Author(s)
      T.Akutsu, S.Ott
    • Journal Title

      Lecture Notes in Computer Science 2387

      Pages: 117-126

    • NAID

      110002812493

    • Description
      From the Summary of Research Results Report (Japanese)
    • Related Report
      2003 Final Research Report Summary
  • [Journal Article] Point matching under non-uniform distortions and protein side chain packing based on an efficient maximum clique algorithm (2002)

    • Author(s)
      K.C.D.Bahadur, T.Akutsu, E.Tomita, T.Seki, A.Fujiyama
    • Journal Title

      Genome Informatics 13

      Pages: 143-152

    • NAID

      130003997148

    • Description
      From the Summary of Research Results Report (Japanese)
    • Related Report
      2003 Final Research Report Summary
  • [Journal Article] A local search algorithm for local multiple alignment: special case analysis and application to cancer classification (2001)

    • Author(s)
      T.Akutsu
    • Journal Title

      Proc.2001 International Conference on Parallel and Distributed Processing Techniques and Applications

      Pages: 1284-1290

    • NAID

      110002936539

    • Description
      From the Summary of Research Results Report (Japanese)
    • Related Report
      2003 Final Research Report Summary
  • [Journal Article] Local multiple alignment of numerical sequences: detection of subtle motifs from protein sequences and structures (2001)

    • Author(s)
      T.Akutsu, K.Horimoto
    • Journal Title

      Genome Informatics 12

      Pages: 83-92

    • Description
      From the Summary of Research Results Report (Japanese)
    • Related Report
      2003 Final Research Report Summary
  • [Journal Article] A local search algorithm for local multiple alignment: special case analysis and application to cancer classification (2001)

    • Author(s)
      T.Akutsu
    • Journal Title

      Proc. 2001 Int.Conf. Parallel and Distributed Processing Techniques and Applications

      Pages: 1284-1290

    • NAID

      110002936539

    • Description
      From the Summary of Research Results Report (English)
    • Related Report
      2003 Final Research Report Summary
  • [Journal Article] Local multiple alignment of numerical sequences: detection of subtle motifs from protein sequences and structures (2001)

    • Author(s)
      T.Akutsu, K.Horimoto
    • Journal Title

      Genome Informatics 12

      Pages: 83-92

    • Description
      From the Summary of Research Results Report (English)
    • Related Report
      2003 Final Research Report Summary
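  • [Book] Mathematical Models and Algorithms for Gene Expression Data Analysis (遺伝子発現情報解析のための数理モデルとアルゴリズム) (2003)

    • Author(s)
      T.Akutsu
    • Total Pages
      36
    • Publisher
      International Institute for Advanced Studies (国際高等研究所)
    • Description
      From the Summary of Research Results Report (Japanese)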
    • Related Report
      2003 Final Research Report Summary
  • [Publications] T.Akutsu, K.Kanaya, A.Ohyama, A.Fujiyama: "Point matching under non-uniform distortions" Discrete Applied Mathematics. 127. 5-21 (2003)

    • Related Report
      2003 Annual Research Report
  • [Publications] T.Akutsu, S.Kuhara, O.Maruyama, S.Miyano: "Identification of genetic networks by strategic gene disruptions and overexpressions under a Boolean model" Theoretical Computer Science. 298. 235-251 (2003)

    • Related Report
      2003 Annual Research Report
  • [Publications] D.Shinozaki, T.Akutsu, O.Maruyama: "Finding optimal degenerate patterns in DNA sequences" Bioinformatics. 19 (Suppl. 2). ii206-ii214 (2003)

    • Related Report
      2003 Annual Research Report
  • [Publications] M.Hayashida, N.Ueda, T.Akutsu: "Inferring strengths of protein-protein interactions from experimental data using linear programming" Bioinformatics. 19 (Suppl. 2). ii58-ii65 (2003)

    • Related Report
      2003 Annual Research Report
  • [Publications] H.Saigo, J-P.Vert, N.Ueda, T.Akutsu: "Protein homology detection using string alignment kernels" Bioinformatics. (in press). (2004)

    • Related Report
      2003 Annual Research Report
  • [Publications] T.Akutsu, H.Bannai, S.Miyano, S.Ott: "On the complexity of deriving position specific score matrices from examples" Lecture Notes in Computer Science. 2373. 168-177 (2002)

    • Related Report
      2002 Annual Research Report
  • [Publications] T.Akutsu, S.Ott: "Inferring a union of halfspaces from examples" Lecture Notes in Computer Science. 2387. 117-126 (2002)

    • Related Report
      2002 Annual Research Report
  • [Publications] K.C.D.Bahadur, T.Akutsu, E.Tomita, T.Seki, A.Fujiyama: "Point matching under non-uniform distortions and protein side chain packing based on an efficient maximum clique algorithm" Genome Informatics. 13. 143-152 (2002)

    • Related Report
      2002 Annual Research Report
  • [Publications] T.Akutsu, S.Miyano, S.Kuhara: "A simple greedy algorithm for finding functional relations: efficient implementation and average case analysis" Theoretical Computer Science. 292. 481-495 (2003)

    • Related Report
      2002 Annual Research Report
  • [Publications] T.Akutsu: "A local search algorithm for local multiple alignment: special case analysis and application to cancer classification" Proceedings of International Conference on Parallel and Distributed Processing Techniques and Applications. 1284-1290 (2001)

    • Related Report
      2001 Annual Research Report
  • [Publications] T.Akutsu, K.Horimoto: "Local multiple alignment of numerical sequences: detection of subtle motifs from protein sequences and structures" Genome Informatics. 12. 83-92 (2001)

    • Related Report
      2001 Annual Research Report
  • [Publications] T.Akutsu, H.Bannai, S.Ott, S.Miyano: "On the complexity of deriving position specific score matrices from examples" Proc. 13th Annual Symposium on Combinatorial Pattern Matching (CPM 2002). (accepted for publication).

    • Related Report
      2001 Annual Research Report

Published: 2001-04-01   Modified: 2016-04-21  