Project/Area Number | 09558036 |
Research Category | Grant-in-Aid for Scientific Research (B) |
Allocation Type | Single-year Grants |
Section | Development Research |
Research Field | Intelligent informatics |
Research Institution | Osaka Prefecture University |
Principal Investigator | FUKUNAGA Kunio, Osaka Prefecture University, Dept. of Computer and Systems Sciences (Faculty of Engineering), Professor (60081296) |
Co-Investigator (Kenkyū-buntansha) |
KOJIMA Atsuhiro, Osaka Prefecture University, Library and Science Information Center, Research Associate (80291607)
AKASHI Hiroshi, Solar Research Institute, President (Representative Director)
OGIHARA Akio, Osaka Prefecture University, Dept. of Computer and Systems Sciences (Faculty of Engineering), Lecturer (60244654)
IZUMI Masao, Osaka Prefecture University, Dept. of Computer and Systems Sciences (Faculty of Engineering), Associate Professor (60223046)
TAKAMATSU Shinobu, Osaka Prefecture University, Dept. of Information Systems (Faculty of Engineering), Professor (00081290)
|
Project Period (FY) | 1997 – 1998 |
Project Status | Completed (Fiscal Year 1998) |
Budget Amount |
¥13,200,000 (Direct Cost: ¥13,200,000)
Fiscal Year 1998: ¥4,200,000 (Direct Cost: ¥4,200,000)
Fiscal Year 1997: ¥9,000,000 (Direct Cost: ¥9,000,000)
|
Keywords | Model-based position and orientation estimation / Natural language description / Motion recognition / Model-based object recognition / Video image recognition / Image processing / Moving object recognition / Motion description / Textual representation of motion / Position and orientation estimation of moving objects |
Research Abstract |
In this project we propose a system that generates natural language descriptions of human behavior from image sequences and speech. Video-monitoring systems are widely used for patient monitoring in medical facilities, for observing rare animals in zoos, and for condition monitoring in security systems. Such systems need to reduce traffic under limited communication capacity when transmitting the state of the monitored objects or scene. The proposed system sequentially estimates the 3-D position and orientation of a person appearing in a monitoring image sequence. The trace, i.e., the series of estimated positions, is then divided into concrete motion segments. For each segment, features of the person's action are extracted and associated with the most suitable verb, in some cases with the aid of speech recognition. In addition, some sequences of verbs are combined into higher-level verbs. Finally, the system composes sentences expressing the person's movements using the extracted verbs and the corresponding frame-structure expressions based on a frame-structure grammar.
|
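The abstract describes a three-stage pipeline: segment the trace of estimated positions, attach a verb to each segment, and fill a frame structure to produce a sentence. A minimal sketch of that flow is given below; it is not the authors' implementation, and the thresholds, verb vocabulary, and helper names (`segment_trace`, `verb_for`, `describe`) are all invented for illustration.

```python
# Hypothetical sketch of the described pipeline:
# (1) split a trace of estimated 2-D positions into motion segments,
# (2) label each segment with the most suitable verb from simple
#     motion features, and (3) fill a minimal (agent, action) frame
#     to produce a sentence. Thresholds and verbs are illustrative.
import math

def segment_trace(trace, speed_threshold=0.1):
    """Split a list of (x, y) positions into segments at points where
    the subject switches between moving and standing still."""
    segments, current, state = [], [trace[0]], None
    for p, q in zip(trace, trace[1:]):
        s = math.dist(p, q) > speed_threshold  # moving on this step?
        if state is None or s == state:
            current.append(q)
        else:
            segments.append((state, current))
            current = [p, q]
        state = s
    segments.append((state, current))
    return segments

def verb_for(segment):
    """Pick a verb from coarse features of one motion segment."""
    moving, points = segment
    if not moving:
        return "stand"
    # Net displacement versus total path length separates purposeful
    # walking from aimless back-and-forth movement.
    net = math.dist(points[0], points[-1])
    path = sum(math.dist(p, q) for p, q in zip(points, points[1:]))
    return "walk" if net > 0.5 * path else "wander"

def describe(subject, trace):
    """Fill a minimal (agent, action) frame for each motion segment."""
    return [f"{subject} {verb_for(seg)}s." for seg in segment_trace(trace)]
```

For example, a trace that stays at the origin and then moves steadily to the right yields `describe("the person", ...)` of the form `["the person stands.", "the person walks."]`, one sentence per motion segment.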