
FY2016 Research Status Report

A machine learning based system for storing and processing big spatial-temporal data

Research Project

Project/Area Number 16K16038
Research Institution: The University of Aizu

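Principal Investigator

Peng Li (李 鵬)  The University of Aizu, School of Computer Science and Engineering, Associate Professor (30735915)

Project Period (FY) 2016-04-01 – 2018-03-31
Keywords: big data processing / cloud
Outline of Research Achievements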

In FY2016, we developed an intelligent software platform for storing and processing big spatial-temporal data. First, we constructed a hierarchical indexing structure based on a 3-dimensional R-tree and distributed the R-tree and its associated data across multiple nodes. Second, we proposed a traffic-aware task placement scheme to minimize the completion time of MapReduce jobs on Spark, and developed an optimization framework that jointly considers data placement and task placement in the MapReduce model. Finally, we studied the randomness in MapReduce job execution and proposed a novel optimization framework that guarantees predictable job completion time.
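The idea of a hierarchical spatial-temporal index spread over several nodes can be illustrated with a minimal sketch. It is not the project's implementation: the names `Box`, `build_index`, and `range_query` are hypothetical, the partitioning rule (contiguous chunks in time order) is an assumption, and each node scans its points linearly where a real system would hold a local 3-dimensional R-tree. The sketch only shows how a top-level bounding box per node lets a range query skip nodes that cannot contain results.

```python
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float, float]  # (x, y, t): two spatial axes plus time


@dataclass
class Box:
    """Axis-aligned 3-dimensional bounding box (hypothetical helper type)."""
    lo: Point
    hi: Point

    def intersects(self, other: "Box") -> bool:
        return all(self.lo[i] <= other.hi[i] and other.lo[i] <= self.hi[i]
                   for i in range(3))

    def contains(self, p: Point) -> bool:
        return all(self.lo[i] <= p[i] <= self.hi[i] for i in range(3))


def bounding_box(points: List[Point]) -> Box:
    lo = tuple(min(p[i] for p in points) for i in range(3))
    hi = tuple(max(p[i] for p in points) for i in range(3))
    return Box(lo, hi)


def partition(points: List[Point], num_nodes: int) -> List[List[Point]]:
    """Assumed rule: sort by time and cut into contiguous chunks, one per node."""
    ordered = sorted(points, key=lambda p: p[2])
    size = max(1, -(-len(ordered) // num_nodes))  # ceiling division, at least 1
    return [ordered[i:i + size] for i in range(0, len(ordered), size)]


def build_index(points: List[Point], num_nodes: int):
    """Top level of the hierarchy: one (bounding box, local data) pair per node."""
    return [(bounding_box(chunk), chunk) for chunk in partition(points, num_nodes)]


def range_query(index, query: Box):
    """Return matching points and the number of nodes that had to be visited."""
    hits, visited = [], 0
    for box, chunk in index:
        if not box.intersects(query):
            continue  # prune: this node cannot hold any matching point
        visited += 1
        # A real node would search its local R-tree here; we scan linearly.
        hits.extend(p for p in chunk if query.contains(p))
    return hits, visited


if __name__ == "__main__":
    pts = [(float(i), float(i), float(i)) for i in range(8)]
    idx = build_index(pts, num_nodes=4)
    found, touched = range_query(idx, Box((2.0, 2.0, 2.0), (3.0, 3.0, 3.0)))
    print(found, "nodes visited:", touched)
```

With eight points on four nodes, the example query touches only the one node whose bounding box overlaps the query range, which is the pruning effect a distributed R-tree hierarchy is meant to provide.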

Current Status of Research Progress (Category)

1: Research has progressed more than originally planned.

Reason

In FY2016, we achieved our research goals by addressing many challenges in system performance optimization. Initially, we built the system and it worked well on a small-scale cluster (fewer than 10 machines). However, when we deployed it on a larger cluster, its performance was unsatisfactory. We therefore focused our research effort on performance optimization, proposing new algorithms for job scheduling, traffic management, and storage management.

Strategy for Future Research Activity

In FY2017, we will continue to study the ML engine as proposed in the original research plan. First, we will adopt machine learning to identify and predict data skew; specifically, we will classify the space into several clusters according to the data skew known from history. Second, we will adopt deep learning to extract data access patterns from activity traces collected at the service/application layer. Specifically, we will create a multi-layer artificial neural network model, use a bottom-up pass to obtain activation probabilities for all hidden units in the network, and then use a top-down pass to obtain good initial weights with minimum error. Finally, we will integrate all components by defining clean interfaces and optimizing the message flow among them.
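The first planned step, grouping regions of space by historically observed data skew, can be sketched as follows. This is an illustration under stated assumptions, not the method the project will use: the grid discretization, the function names `cell_counts`, `kmeans_1d`, and `classify_cells`, and the choice of a plain 1-D k-means over per-cell record counts are all hypothetical.

```python
from collections import Counter
from typing import Dict, List, Tuple

Cell = Tuple[int, int]


def cell_counts(points: List[Tuple[float, float]], cell_size: float) -> Dict[Cell, int]:
    """Count historical records per grid cell (assumed discretization of space)."""
    return Counter((int(x // cell_size), int(y // cell_size)) for x, y in points)


def kmeans_1d(values: List[float], k: int, iters: int = 20) -> List[float]:
    """Plain 1-D k-means; centers start evenly spaced between min and max."""
    lo, hi = min(values), max(values)
    centers = [lo + (hi - lo) * i / max(1, k - 1) for i in range(k)]
    for _ in range(iters):
        groups: List[List[float]] = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda j: abs(v - centers[j]))
            groups[nearest].append(v)
        # An empty group keeps its previous center.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers


def classify_cells(counts: Dict[Cell, int], k: int) -> Dict[Cell, int]:
    """Label each cell with the index of its nearest load-level center."""
    centers = kmeans_1d([float(n) for n in counts.values()], k)
    return {cell: min(range(k), key=lambda j: abs(n - centers[j]))
            for cell, n in counts.items()}


if __name__ == "__main__":
    # Synthetic history: two busy cells and two nearly idle ones.
    history = ([(0.5, 0.5)] * 10 + [(1.5, 0.5)] * 9
               + [(0.5, 1.5)] + [(1.5, 1.5)])
    levels = classify_cells(cell_counts(history, cell_size=1.0), k=2)
    for cell, level in sorted(levels.items()):
        print(cell, "load level", level)
```

Cells that fall into the highest load level are candidates for finer partitioning or replication; how such labels would feed the ML engine is outside what the report specifies.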

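  • Research Products

    (3 results)

All 2017

All Journal Article (2 results) (of which Int'l Joint Research: 2 results, Peer Reviewed: 2 results, Acknowledgement Compliant: 1 result) Presentation (1 result) (of which Int'l Academic Conference: 1 result)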

  • [Journal Article] Traffic-Aware Geo-Distributed Big Data Analytics with Predictable Job Completion Time (2017)

    • Author(s)/Presenter(s)
      Peng Li, Song Guo, Toshiaki Miyazaki, Xiaofei Liao, Hai Jin, Albert Y. Zomaya, and Kun Wang
    • Journal Title

      IEEE Transactions on Parallel and Distributed Systems

      Volume: 28 Pages: 1785-1796

    • DOI

      10.1109/TPDS.2016.2626285

    • Peer Reviewed / Int'l Joint Research
  • [Journal Article] Vehicle-assist Resilient Information and Network System for Disaster Management (2017)

    • Author(s)/Presenter(s)
      Peng Li, Toshiaki Miyazaki, Kun Wang, Song Guo and Weihua Zhuang
    • Journal Title

      IEEE Transactions on Emerging Topics in Computing

      Volume: none Pages: none

    • DOI

      10.1109/TETC.2017.2693286

    • Peer Reviewed / Int'l Joint Research / Acknowledgement Compliant
  • [Presentation] Traffic-aware Task Placement with Guaranteed Job Completion Time for Geo-distributed Big Data (2017)

    • Author(s)/Presenter(s)
      Peng Li, Toshiaki Miyazaki, and Song Guo
    • Conference Name
      IEEE International Conference on Communications
    • Place of Presentation
      Paris, France
    • Date
      2017-05-21 – 2017-05-25
    • Int'l Academic Conference


Published: 2018-01-16
