Project/Area Number |
17K00234
|
Research Category |
Grant-in-Aid for Scientific Research (C)
|
Allocation Type | Multi-year Fund |
Section | General |
Research Field |
Perceptual information processing
|
Research Institution | Kyoto Sangyo University (2018-2020), Osaka University (2017) |
Principal Investigator |
KAWAMURA Arata, Kyoto Sangyo University, Faculty of Information Science and Engineering, Professor (60362646)
|
Project Period (FY) |
2017-04-01 – 2021-03-31
|
Project Status |
Completed (Fiscal Year 2020)
|
Budget Amount |
¥3,510,000 (Direct Cost: ¥2,700,000, Indirect Cost: ¥810,000)
Fiscal Year 2019: ¥1,170,000 (Direct Cost: ¥900,000, Indirect Cost: ¥270,000)
Fiscal Year 2018: ¥1,170,000 (Direct Cost: ¥900,000, Indirect Cost: ¥270,000)
Fiscal Year 2017: ¥1,170,000 (Direct Cost: ¥900,000, Indirect Cost: ¥270,000)
|
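Keywords | image-to-sound conversion / spectrogram / phase spectrum / long-term Fourier transform / iterative phase reconstruction / information systems / image / speech and other recognition / information and communication engineering |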
Outline of Final Research Achievements |
In this study, we proposed an image-to-sound mapping method. The technique treats an image as a spectrogram and maps it to a sound by taking the inverse Fourier transform of that spectrogram. Embedding the image destroys the spectral amplitude of the speech. We compensate for the resulting loss of speech quality by using the speech spectral phase obtained with the long-term Fourier transform (LTFT), because the LTFT phase spectrum carries speech intelligibility. The proposed method synthesizes a speech signal from a spectrogram that combines the original image with the LTFT speech spectral phase. The synthesized speech signal is emitted from a loudspeaker and received by the microphone of a mobile device. The received signal is then transformed into a spectrogram, which directly displays the transmitted image. The proposed method requires no special transformation other than the Fourier transform.
|
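The image-as-spectrogram idea in the outline can be illustrated with a minimal Python sketch. It is not the project's implementation. It assumes non-overlapping rectangular frames and uses a random phase as a stand-in for the LTFT speech phase, so the output is noise-like rather than intelligible speech. The function names `image_to_sound` and `sound_to_image` are hypothetical.

```python
import numpy as np


def image_to_sound(image, rng=None):
    """Treat a 2-D non-negative array as a magnitude spectrogram
    (rows = frequency bins, columns = time frames) and return a 1-D signal.

    Assumption: random phase replaces the LTFT speech phase of the
    original method, and frames do not overlap.
    """
    image = np.asarray(image, dtype=float)
    n_bins, n_frames = image.shape
    n_fft = 2 * (n_bins - 1)  # frame length implied by the number of bins
    rng = np.random.default_rng(0) if rng is None else rng

    phase = rng.uniform(-np.pi, np.pi, size=image.shape)
    # DC and Nyquist bins of a real signal must be real-valued.
    phase[0, :] = 0.0
    phase[-1, :] = 0.0

    spectrum = image * np.exp(1j * phase)
    # Inverse Fourier transform of every column gives one time frame.
    frames = np.fft.irfft(spectrum, n=n_fft, axis=0)
    # Concatenate the frames in time order.
    return frames.T.reshape(-1)


def sound_to_image(signal, n_bins):
    """Recover the image as the magnitude spectrogram of the signal."""
    n_fft = 2 * (n_bins - 1)
    frames = np.asarray(signal, dtype=float).reshape(-1, n_fft).T
    return np.abs(np.fft.rfft(frames, axis=0))


if __name__ == "__main__":
    picture = np.random.default_rng(1).random((17, 8))  # toy 17 x 8 "image"
    sound = image_to_sound(picture)
    restored = sound_to_image(sound, n_bins=17)
    print(sound.shape, np.allclose(restored, picture))
```

In this idealized setting the receiver needs only a forward Fourier transform per frame, which matches the outline's point that no special transformation is required. A real acoustic channel would add noise, reverberation, and frame misalignment that the sketch ignores.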
Academic Significance and Societal Importance of the Research Achievements |
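In this research, synthesized speech with an embedded image is emitted from a loudspeaker or similar device, and the image is recovered from the speech on the receiving side. Once this technology is complete, image information can be conveyed together with the verbal information carried by the speech. Reception also becomes possible in places without Wi-Fi coverage, and the receivable range can be controlled by adjusting the loudspeaker volume. Possible applications are wide-ranging: embedding evacuation routes or photographs of a disaster site in emergency broadcasts from disaster-prevention loudspeakers, embedding weather maps in radio weather forecasts, embedding the illustration of the current page in picture-book read-alouds, embedding products or sales-floor maps in limited-time sale announcements, and embedding translated information in voice announcements on buses and trains abroad.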
|