2020 Fiscal Year Final Research Report
Signal processing technology based on deep learning and application to singing voice and musical instrument sound generation
Project/Area Number |
18K11163
|
Research Category |
Grant-in-Aid for Scientific Research (C)
|
Allocation Type | Multi-year Fund |
Section | General |
Review Section |
Basic Section 60010:Theory of informatics-related
|
Research Institution | Nagoya Institute of Technology |
Principal Investigator |
Oura Keiichiro Nagoya Institute of Technology, Graduate School of Engineering, Researcher (20588579)
|
Project Period (FY) |
2018-04-01 – 2021-03-31
|
Keywords | signal processing / deep learning / singing voice synthesis / speech synthesis / musical instrument sound generation |
Outline of Final Research Achievements |
For singing voices and musical instrument sounds, we conducted research on acoustic modeling, including a method for automatically selecting training data, a method for modeling the speech waveform itself, and an end-to-end structure capable of converting a musical score directly into a waveform, and published some of the results as academic papers. Among these, the deep-learning-based method for generating waveforms from periodic and aperiodic signals, which applies the cycle structure of CycleGAN (a model that shows high performance in the field of image conversion), received the Kiyoshi Awaya Academic Encouragement Award from the Acoustical Society of Japan and the Microsoft Informatics Research Award from the Information Processing Society of Japan.
|
Free Research Field |
Speech synthesis
|
Academic Significance and Societal Importance of the Research Achievements |
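Most current speech-related technologies are founded on conventional digital signal processing theory, which is widely adopted as the most fundamental way of thinking in speech-related research fields. However, because these technologies have been confined to the framework that such transformations and processing can handle, their performance has been limited by excessive constraints on model structure. To bring a breakthrough to this situation, this research aims to pioneer new methods for directly modeling speech waveforms based on deep learning, a field in which technical innovation has been advancing rapidly in recent years.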
|