2007 Fiscal Year Annual Research Report

Development of Fast and Accurate Algorithms for Large-Scale Genome Data Processing

Research Project

Project/Area Number	18017015
Research Institution	Nagoya University
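Principal Investigator	Mutsunori Yagiura (柳浦睦憲), Nagoya University, Graduate School of Information Science, Associate Professor (10263120)
Co-Investigator(Kenkyū-buntansha)	Takeaki Uno (宇野毅明), National Institute of Informatics, Principles of Informatics Research Division, Associate Professor (00302977); Hirotaka Ono (小野廣隆), Kyushu University, Faculty of Information Science and Electrical Engineering, Assistant Professor (00346826)
Keywords	Genome information / Advanced search and comparison / Data mining / Enumeration algorithms / Probabilistic analysis / Frequent itemsets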
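Research Abstract	Data involved in genome research are often enormous. Observing overall characteristics, finding and grouping similar items (similarity search and clustering), and discovering plausible rules and distinctive substructures (rule mining and data mining) occupy a very important place in genome research. Because the data are so large, however, conventional naive methods require an enormous amount of computation time. If, instead of comparing all items by brute force, one can efficiently narrow the candidates down to only those pairs that may be similar, the computation can finish in a very short time. This fiscal year, from among the fundamental problems of genome informatics, we took up pattern mining used in the analysis of experimental results, the discovery of optimal classification rules, and homology detection and reordering algorithms used in sequence determination and assembly. For each, we identified points that can be improved by applying optimization and algorithmic techniques and proposed new methods for them. Representative results are listed below.
・We developed algorithms for the problem of finding all near-clique structures in a given graph, and for the problem of finding all sets that are contained, in a sense that tolerates ambiguity, in many items of a database.
・We analyzed properties of a basic pattern-extraction problem for data sets in which each element of a set of vectors is labeled true or false.
・We examined techniques that are effective in designing fast approximation algorithms for the set covering problem and obtained new insights.
・We proposed an algorithm that automatically generates (designs) sets of DNA sequences satisfying prescribed thermodynamic constraints, for use in DNA analysis and related applications.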
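The idea in the abstract of comparing only pairs that can possibly be similar, instead of all pairs, can be illustrated with a minimal Python sketch. This is not the project's algorithm. It assumes equal-length strings, Hamming distance as the similarity measure, and a simple pigeonhole block filter; the function names `hamming` and `similar_pairs` are hypothetical and chosen only for this illustration.

```python
from collections import defaultdict
from itertools import combinations


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def similar_pairs(strings, d):
    """Return all index pairs (i, j), i < j, with Hamming distance <= d.

    Illustrative assumption: every string has the same length.
    Pigeonhole filter: if two strings differ in at most d positions and
    are cut into d + 1 blocks, at least one block must match exactly.
    Only strings sharing an identical block are therefore compared.
    """
    if not strings:
        return set()
    length = len(strings[0])
    if any(len(s) != length for s in strings):
        raise ValueError("all strings must have the same length")

    blocks = d + 1
    result = set()
    for b in range(blocks):
        lo = b * length // blocks
        hi = (b + 1) * length // blocks
        # Group string indices by the content of the current block.
        buckets = defaultdict(list)
        for i, s in enumerate(strings):
            buckets[s[lo:hi]].append(i)
        # Verify only the candidate pairs that share this block.
        for ids in buckets.values():
            for i, j in combinations(ids, 2):
                if hamming(strings[i], strings[j]) <= d:
                    result.add((i, j))
    return result


if __name__ == "__main__":
    reads = ["ACGTACGT", "ACGTACGA", "TTTTTTTT", "ACGAACGA", "TTTTTTTA"]
    print(sorted(similar_pairs(reads, 1)))
```

The filter never loses a true pair, because every pair within distance `d` shares at least one exact block; how many comparisons it saves depends on how varied the block contents are in the data.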

Research Products
(6 results)

All Journal Article (6 results) (of which Peer Reviewed: 6 results)

[Journal Article] An Efficient Algorithm for Finding Similar Short Substrings from Large Scale String Data (2008)
- Author(s)
  Uno, T.
- Journal Title
  
  The Pacific-Asia Conference on Knowledge Discovery and Data Mining
- Peer Reviewed
[Journal Article] A Randomness Based Analysis on the Data Size Needed for Removing Deceptive Patterns (2008)
- Author(s)
  Haraguchi, K., Yagiura, M., Boros, E., and Ibaraki, T.
- Journal Title
  
  IEICE Transactions on Information and Systems E91-D
  
  Pages: 781-788
- Peer Reviewed
[Journal Article] Efficient Polynomial Delay Algorithm for Pseudo Frequent Itemset Mining (2007)
- Author(s)
  Uno, T., and Arimura, H.
- Journal Title
  
  Lecture Notes in Artificial Intelligence 4755
  
  Pages: 219-230
- Peer Reviewed
[Journal Article] Mining complex genotypic features for predicting HIV-1 drug resistance (2007)
- Author(s)
  Saigo, H., Uno, T., and Tsuda, K.
- Journal Title
  
  Bioinformatics 23
  
  Pages: 2455-2462
- Peer Reviewed
[Journal Article] Relaxation Heuristics for the Set Covering Problem (2007)
- Author(s)
  Umetani, S., and Yagiura, M.
- Journal Title
  
  Journal of the Operations Research Society of Japan 50
  
  Pages: 350-375
- Peer Reviewed
[Journal Article] Neighborhood Searches for Thermodynamically Designing DNA Sequence (2007)
- Author(s)
  Kawashimo, S., Ono, H., Sadakane, K., and Yamashita, M.
- Journal Title
  
  Preliminary Proceedings of the 13th International Meeting on DNA Computing, Memphis
  
  Pages: 211-220
- Peer Reviewed
