Developing fast algorithm for analyzing Giga-sequence data

Research Project

Project/Area Number	22700319
Research Category	Grant-in-Aid for Young Scientists (B)
Allocation Type	Single-year Grants
Research Field	Bioinformatics/Life informatics
Research Institution	National Institute of Advanced Industrial Science and Technology
Principal Investigator	SHIMIZU Kana National Institute of Advanced Industrial Science and Technology, Computational Biology Research Center, Researcher (60367050)
Project Period (FY)	2010 – 2011
Project Status	Completed (Fiscal Year 2011)
Budget Amount	¥3,250,000 (Direct Cost: ¥2,500,000, Indirect Cost: ¥750,000); Fiscal Year 2011: ¥1,560,000 (Direct Cost: ¥1,200,000, Indirect Cost: ¥360,000); Fiscal Year 2010: ¥1,690,000 (Direct Cost: ¥1,300,000, Indirect Cost: ¥390,000)
Keywords	Genome / Giga-sequencer / Algorithm / Short read / Similar sequence search / Edit distance / Giga-sequence data / Minimum spanning tree
Research Abstract	Next-generation sequencing (NGS) technology calls for fast and accurate algorithms that can evaluate sequence similarity across huge amounts of data. In this study, we designed and implemented SlideSort, an exact algorithm that finds all pairs of similar sequences in NGS data whose edit distance does not exceed a given threshold. This supports many important analyses, such as de novo genome assembly, identification of frequently appearing sequence patterns, and accurate clustering. Compared with state-of-the-art methods, our method is much faster at finding remote matches and scales easily to tens of millions of sequences. Our software also provides single-link clustering, which is useful for summarizing NGS data before further processing.
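
The problem stated in the abstract can be illustrated with a small brute-force reference, given here only as a sketch. It is not SlideSort: it compares every pair directly, which is the quadratic cost the project set out to avoid. The function names (`edit_distance`, `similar_pairs`, `single_link_clusters`) and the sample reads are assumptions made for illustration. Single-link clustering at a fixed threshold is taken here to mean the connected components of the graph whose edges are the similar pairs.

```python
from itertools import combinations


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via the standard two-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # substitution or match
        prev = curr
    return prev[-1]


def similar_pairs(reads, threshold):
    """Brute force: all index pairs (i, j), i < j, with distance <= threshold."""
    return [(i, j)
            for (i, a), (j, b) in combinations(enumerate(reads), 2)
            if edit_distance(a, b) <= threshold]


def single_link_clusters(n, pairs):
    """Connected components of the similarity graph, using union-find."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i, j in pairs:
        parent[find(i)] = find(j)

    clusters = {}
    for i in range(n):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


# Hypothetical short reads, chosen only to exercise the functions.
reads = ["ACGTACGT", "ACGTACGA", "ACGAACGA", "TTTTGGGG"]
pairs = similar_pairs(reads, threshold=1)
clusters = single_link_clusters(len(reads), pairs)
print(pairs)     # [(0, 1), (1, 2)]
print(clusters)  # [[0, 1, 2], [3]]
```

Reads 0 and 2 differ by two edits, so they are not a similar pair at threshold 1, yet they share a cluster because each is within one edit of read 1. That chaining is the defining behavior of single-link clustering.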

Report

(3 results)

2011 Annual Research Report Final Research Report ( PDF )
2010 Annual Research Report

Research Products
(12 results)

All Journal Article (2 results) (of which Peer Reviewed: 2 results) Presentation (5 results) Remarks (3 results) Patent(Industrial Property Rights) (2 results)

[Journal Article] SlideSort: All Pairs Similarity Search for Short Reads (2011)
- Author(s)
  K. Shimizu and K. Tsuda
- Journal Title
  Bioinformatics
  Volume: 27 Issue: 4 Pages: 464-470
- DOI
  10.1093/bioinformatics/btq677
- Related Report
  2011 Final Research Report
- Peer Reviewed
[Journal Article] SlideSort: All Pairs Similarity Search for Short Reads (2011)
- Author(s)
  Kana Shimizu, Koji Tsuda
- Journal Title
  Bioinformatics
  Volume: 27 Issue: 4 Pages: 464-470
- Related Report
  2010 Annual Research Report
- Peer Reviewed
[Presentation] Fast and exact algorithm for Next Generation Sequencing data analysis (2011)
- Author(s)
  Kana Shimizu, Koji Tsuda
- Organizer
  ISMB/ECCB 2011 Highlights Track
- Place of Presentation
  Vienna
- Year and Date
  2011-07-18
- Related Report
  2011 Final Research Report
[Presentation] Fast and exact algorithm for Next Generation Sequencing data analysis (2011)
- Author(s)
  Kana Shimizu, Koji Tsuda
- Organizer
  ISMB 2011 Highlights Track
- Place of Presentation
  Vienna, Austria
- Year and Date
  2011-07-18
- Related Report
  2011 Annual Research Report
[Presentation] SlideSort: A fast and exact tool for finding all similar pairs from next-generation sequencing data (2011)
- Author(s)
  Kana Shimizu, Koji Tsuda
- Organizer
  RECOMB 2011
- Place of Presentation
  Vancouver
- Year and Date
  2011-03-29
- Related Report
  2011 Final Research Report 2010 Annual Research Report
[Presentation] SLIDESORT: All pairs similarity search for short reads (2010)
- Author(s)
  Kana Shimizu, Koji Tsuda
- Organizer
  2010 Annual Meeting of the Japanese Society for Bioinformatics
- Place of Presentation
  Kyushu
- Year and Date
  2010-12-15
- Related Report
  2011 Final Research Report
[Presentation] Developing an exact method to find similar pairs with small edit-distance (2010)
- Author(s)
  Kana Shimizu, Koji Tsuda
- Organizer
  ISMB 2010
- Place of Presentation
  Boston
- Year and Date
  2010-07-12
- Related Report
  2011 Final Research Report 2010 Annual Research Report
[Remarks]
- Related Report
  2011 Final Research Report
[Remarks]
- URL
  http://www.cbrc.jp/shimizu/slidesort/index.php
- Related Report
  2011 Annual Research Report
[Remarks]
- URL
  http://www.cbrc.jp/~shimizu/slidesort/index.php
- Related Report
  2010 Annual Research Report
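[Patent (Industrial Property Rights)] Sequence analysis device, sequence analysis method, and computer program (2010)
- Inventor(s)
  Kana Shimizu, Koji Tsuda
- Industrial Property Rights Holder
  National Institute of Advanced Industrial Science and Technology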
- Filing Date
  2010-07-09
- Related Report
  2011 Final Research Report
[Patent (Industrial Property Rights)] Sequence analysis device, sequence analysis method, and computer program (2010)
- Inventor(s)
  Kana Shimizu, Koji Tsuda
- Industrial Property Rights Holder
  Kana Shimizu, Koji Tsuda
- Industrial Property Number
  2010-156342
- Filing Date
  2010-07-09
- Related Report
  2010 Annual Research Report
