2016 Fiscal Year Final Research Report

Visual Concept Modeling of Verbs Based on a Large Scale Set of Tagged Videos Provided by Folksonomy

Research Project

Project/Area Number	26730090
Research Category	Grant-in-Aid for Young Scientists (B)
Allocation Type	Multi-year Fund
Research Field	Perceptual information processing
Research Institution	Osaka University
Principal Investigator	Kazuaki Nakamura Osaka University, Graduate School of Engineering, Assistant Professor (10584047)
Project Period (FY)	2014-04-01 – 2017-03-31
Keywords	Visual media processing / Visual concept learning / Action recognition / Video processing / Image processing
Outline of Final Research Achievements	This research project investigates techniques for building visual models of the concepts expressed by verbs, using tagged videos stored on web-based video sharing services such as YouTube. A video generally consists of several segments, each with different semantic content, whereas a tag is attached to the video as a whole. An exact correspondence between tags and segments is therefore not available. To cope with this problem, the project proposes a method that extracts the set of segments whose semantic content appears in common across videos sharing the same tag (called “common segments” in this report) and constructs a visual model of the tag from the extracted segments. Measuring the similarity between two segments is a key step in common segment extraction, so the project also proposes a method for calculating this similarity based on a large-scale set of tagged images.
Free Research Field	Visual media processing
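
The common segment extraction described in the outline can be illustrated with a minimal sketch. This is not the project's actual method: it assumes each segment has already been reduced to a feature vector, and it swaps in plain cosine similarity for the project's similarity measure based on tagged images. The function names (`cosine`, `common_segments`) and the parameters `threshold` and `min_support` are hypothetical, chosen only for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two feature vectors (stand-in measure, an assumption)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def common_segments(videos, threshold=0.8, min_support=0.5):
    """Pick segments that recur across videos sharing one tag.

    videos: list of videos; each video is a list of segment feature vectors.
    A segment is kept when at least `min_support` of the other videos contain
    a segment whose similarity to it is `threshold` or higher.
    Returns a list of (video_index, segment_index) pairs.
    """
    selected = []
    for i, segments in enumerate(videos):
        others = [v for j, v in enumerate(videos) if j != i]
        if not others:
            continue
        for k, seg in enumerate(segments):
            # Count the other videos that contain a segment similar to this one.
            hits = sum(
                1
                for other in others
                if any(cosine(seg, s) >= threshold for s in other)
            )
            if hits / len(others) >= min_support:
                selected.append((i, k))
    return selected


# Three toy videos with the same tag; the first segment of each points in
# roughly the same direction, while the second segments are unrelated.
videos = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[0.9, 0.1], [-1.0, 0.0]],
    [[1.0, 0.05], [0.0, -1.0]],
]
result = common_segments(videos)
```

In this toy input only the first segment of each video is selected, since it is the only content that recurs across all three videos. A visual model of the tag would then be trained on the selected segments alone.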