Efficient Data Staging into The Big Data Analytics Platform

Research Project

Project/Area Number 16K21675
Research Category

Grant-in-Aid for Young Scientists (B)

Allocation Type Multi-year Fund
Research Field Multimedia database
High performance computing
Research Institution National Institute of Advanced Industrial Science and Technology

Principal Investigator

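Tanimura Yusuke  National Institute of Advanced Industrial Science and Technology, Department of Information Technology and Human Factors, Senior Researcher (80415710)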

Project Period (FY) 2016-04-01 – 2018-03-31
Project Status Completed (Fiscal Year 2017)
Budget Amount
¥4,160,000 (Direct Cost: ¥3,200,000, Indirect Cost: ¥960,000)
Fiscal Year 2017: ¥1,820,000 (Direct Cost: ¥1,400,000, Indirect Cost: ¥420,000)
Fiscal Year 2016: ¥2,340,000 (Direct Cost: ¥1,800,000, Indirect Cost: ¥540,000)
Keywords Data storage / Big data analytics / Data staging / Resource management / Cloud
Outline of Final Research Achievements

This project studied three aspects of the big data analytics platform (the main computing system): data staging between the platform and its backend storage, pre- and post-processing in the backend to make that staging efficient, and scheduling that avoids performance interference among concurrent staging operations. A prototype system was designed and implemented using Spark and Alluxio, and its I/O (staging) performance, including storage-side processing, was evaluated. These basic results are expected to enable multi-tenant operation of the big data analytics platform and more effective use of the backend storage.

Report

(3 results)
  • 2017 Annual Research Report   Final Research Report ( PDF )
  • 2016 Research-status Report
  • Research Products

    (3 results)

Presentation (3 results) (of which Int'l Joint Research: 2 results)

  • [Presentation] Storage-Side Processing for Spark with Tiered Storage (2018)

    • Author(s)
      Kaihui Zhang, Yusuke Tanimura, Hidemoto Nakada, Hirotaka Ogawa
    • Organizer
      163rd Meeting of the IPSJ (Information Processing Society of Japan) Special Interest Group on High Performance Computing
    • Related Report
      2017 Annual Research Report
  • [Presentation] Understanding and Improving Disk-based Intermediate Data Caching in Spark (2017)

    • Author(s)
      Kaihui Zhang, Yusuke Tanimura, Hidemoto Nakada, Hirotaka Ogawa
    • Organizer
      6th Workshop on Scalable Cloud Data Management in 2017 IEEE International Conference on Big Data
    • Related Report
      2017 Annual Research Report
    • Int'l Joint Research
  • [Presentation] Towards Efficient Data Staging for Multi-Tenant Big Data Analytics (2016)

    • Author(s)
      Yusuke Tanimura
    • Organizer
      The 25th ACM International Symposium on High-Performance Parallel and Distributed Computing (HPDC)
    • Place of Presentation
      Kyoto International Community House, Kyoto City, Kyoto Prefecture
    • Year and Date
      2016-06-02
    • Related Report
      2016 Research-status Report
    • Int'l Joint Research

Published: 2016-04-21   Modified: 2019-03-29  
