2016 Fiscal Year Research-status Report

A machine learning based system for storing and processing big spatial-temporal data

Research Project

Project/Area Number	16K16038
Research Institution	The University of Aizu
Principal Investigator	Peng Li (李鵬), The University of Aizu, School of Computer Science and Engineering, Associate Professor (30735915)
Project Period (FY)	2016-04-01 – 2018-03-31
Keywords	big data processing / cloud
Outline of Annual Research Achievements	In FY2016, we developed an intelligent software platform for storing and processing big spatial-temporal data. First, we constructed a hierarchical indexing structure based on a 3-dimensional R-tree and distributed the R-tree and its associated data across multiple nodes. Second, we proposed a traffic-aware task placement scheme to minimize the completion time of MapReduce jobs on Spark. We developed an optimization framework that jointly considers data placement and task placement in the MapReduce model. Finally, we studied the randomness in MapReduce job execution and proposed a novel optimization framework that guarantees predictable job completion time.
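As an illustration of the indexing idea above, the following is a minimal sketch, not the project's implementation. It treats each record as a 3-dimensional bounding box over (x, y, t), keeps one bounding box per storage node as a top-level directory, and answers a window query by visiting only the nodes whose box overlaps the window. The class names, the time-slice placement rule, and the flat per-node lists are all assumptions made for the example; a real R-tree would also split and balance its internal nodes.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Point3 = Tuple[float, float, float]  # (x, y, t)


@dataclass(frozen=True)
class Box3D:
    """Axis-aligned box over (x, y, t) with inclusive corners lo and hi."""
    lo: Point3
    hi: Point3

    def intersects(self, other: "Box3D") -> bool:
        # Two boxes overlap only if their extents overlap on every axis.
        return all(self.lo[i] <= other.hi[i] and other.lo[i] <= self.hi[i]
                   for i in range(3))

    def union(self, other: "Box3D") -> "Box3D":
        # Smallest box that encloses both boxes.
        lo = tuple(min(a, b) for a, b in zip(self.lo, other.lo))
        hi = tuple(max(a, b) for a, b in zip(self.hi, other.hi))
        return Box3D(lo, hi)


class TwoLevelIndex:
    """Hypothetical two-level index: a per-node bounding box on top,
    a flat list of (box, record_id) entries on each node below."""

    def __init__(self, num_nodes: int, t_span: float) -> None:
        self.num_nodes = num_nodes
        self.t_span = t_span  # assumed width of one time slice
        self.entries: Dict[int, List[Tuple[Box3D, str]]] = {
            n: [] for n in range(num_nodes)
        }
        self.node_mbr: Dict[int, Box3D] = {}

    def insert(self, box: Box3D, record_id: str) -> int:
        # Assumed placement rule: round-robin over time slices.
        node = int(box.lo[2] // self.t_span) % self.num_nodes
        self.entries[node].append((box, record_id))
        if node in self.node_mbr:
            self.node_mbr[node] = self.node_mbr[node].union(box)
        else:
            self.node_mbr[node] = box
        return node

    def query(self, window: Box3D) -> List[str]:
        hits: List[str] = []
        for node, mbr in self.node_mbr.items():
            if not mbr.intersects(window):
                continue  # prune the whole node without scanning it
            hits.extend(rid for box, rid in self.entries[node]
                        if box.intersects(window))
        return sorted(hits)
```

In this sketch the top-level directory is what lets a query skip entire nodes, which is the property a distributed index of this kind relies on to limit cross-node traffic.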
Current Status of Research Progress	1: Research has progressed beyond the original plan. Reason: In FY2016, we achieved our research goals by addressing many challenges in system performance optimization. We first built the system, and it worked well in a small-scale cluster (fewer than 10 machines). However, when we deployed the system on a larger cluster, its performance was unsatisfactory. We therefore devoted considerable research effort to performance optimization, proposing new algorithms for job scheduling, traffic management, and storage management.
Strategy for Future Research Activity	In FY2017, we will continue to study the ML-engine proposed in the original research plan. First, we will adopt machine learning techniques to identify and predict data skew. Specifically, we will partition the space into several clusters according to the data skew observed in historical records. Second, we will adopt deep learning techniques to extract data access patterns from activity traces collected at the service/application layer. Specifically, we will create a multi-layer artificial neural network model. We will then use a bottom-up process to obtain activation probabilities for all hidden units in the network. After that, a top-down process will obtain good initial weights with minimum error. Finally, we will integrate all components by defining clean interfaces and will optimize the message flow among them.
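The bottom-up step described above can be pictured with a small sketch. It assumes logistic (sigmoid) units and plain nested lists for the weights; the function names, layer sizes, and weight values are hypothetical, and the top-down step that adjusts the weights is not shown.

```python
import math
from typing import List, Tuple

Layer = Tuple[List[List[float]], List[float]]  # (weights, biases)


def sigmoid(z: float) -> float:
    """Logistic function, read here as an activation probability."""
    return 1.0 / (1.0 + math.exp(-z))


def layer_activations(inputs: List[float],
                      weights: List[List[float]],
                      biases: List[float]) -> List[float]:
    # One row of weights per hidden unit; each row has one weight per input.
    return [sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]


def bottom_up(inputs: List[float], layers: List[Layer]) -> List[List[float]]:
    """Propagate from the input layer upward and return the activation
    probabilities of every hidden layer, lowest layer first."""
    activations: List[List[float]] = []
    current = inputs
    for weights, biases in layers:
        current = layer_activations(current, weights, biases)
        activations.append(current)
    return activations
```

Each layer's output becomes the next layer's input, so a single pass yields the activation probabilities for all hidden units.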

Research Products
(3 results)

Journal Article (2 results) (of which Int'l Joint Research: 2 results, Peer Reviewed: 2 results, Acknowledgement Compliant: 1 result); Presentation (1 result) (of which Int'l Joint Research: 1 result)

[Journal Article] Traffic-Aware Geo-Distributed Big Data Analytics with Predictable Job Completion Time (2017)
- Author(s)
  Peng Li, Song Guo, Toshiaki Miyazaki, Xiaofei Liao, Hai Jin, Albert Y. Zomaya, and Kun Wang
- Journal Title
  IEEE Transactions on Parallel and Distributed Systems
  Volume: 28 Pages: 1785-1796
- DOI
  10.1109/TPDS.2016.2626285
- Peer Reviewed / Int'l Joint Research
[Journal Article] Vehicle-assist Resilient Information and Network System for Disaster Management (2017)
- Author(s)
  Peng Li, Toshiaki Miyazaki, Kun Wang, Song Guo, and Weihua Zhuang
- Journal Title
  IEEE Transactions on Emerging Topics in Computing
  Volume: none Pages: none
- DOI
  10.1109/TETC.2017.2693286
- Peer Reviewed / Int'l Joint Research / Acknowledgement Compliant
[Presentation] Traffic-aware Task Placement with Guaranteed Job Completion Time for Geo-distributed Big Data (2017)
- Author(s)
  Peng Li, Toshiaki Miyazaki, and Song Guo
- Organizer
  IEEE International Conference on Communications
- Place of Presentation
  Paris, France
- Year and Date
  2017-05-21 – 2017-05-25
- Int'l Joint Research
