2015 Fiscal Year Annual Research Report

Emergence of Music Intelligence through Probabilistic Collaboration of Signal Processing and Symbolic Processing

Research Project

Project/Area Number	26700020
Research Institution	Kyoto University
Principal Investigator	Kazuyoshi Yoshii, Kyoto University, Graduate School of Informatics, Senior Lecturer (20510001)
Project Period (FY)	2014-04-01 – 2018-03-31
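Keywords	Nonparametric Bayes / Machine learning / Music information processing
Outline of Annual Research Achievements	In our effort to convert music audio signals into musical scores, we made more progress than originally expected on both the acoustic model and the language model. For the acoustic model, we succeeded in reformulating the composite autoregressive model consisting of infinitely many sources and infinitely many filters (source-filter NMF) on the log-frequency axis, which matches human auditory characteristics. This improved automatic transcription results by more than 10% and achieved performance on par with competing state-of-the-art methods. This work was accepted at ISMIR 2015, a top conference in music information processing. As an improvement to NMF itself, we replaced the complex Gaussian likelihood with a complex t distribution, yielding a robust source separation method with low sensitivity to initialization. This work was accepted at ICASSP 2016, a top conference in signal processing. For chord recognition on music audio signals, we devised an approach that separates the signal into singing voice, accompaniment, and percussive sounds before extracting chroma vector features. For the classifier, instead of a standard HMM, we made chord transitions depend on beat position and used a mixture of von Mises-Fisher distributions as the output distribution, which yielded a substantial performance improvement. For the language model, we revisited the Generative Theory of Tonal Music (GTTM), which had been the standard way of handling music theory computationally, from the standpoint of probabilistic generative models, and succeeded in formulating a probabilistic context-free grammar (PCFG) for melodies (one-dimensional note sequences). Whereas conventional GTTM requires its derivation rules to be tuned by hand, the proposed model infers them automatically and still achieves better parsing accuracy than before. This work was also accepted at ICASSP 2016.
Current Status of Research Progress	1: Research has progressed more than originally planned. Reason: As described in "Outline of Annual Research Achievements", substantial progress was made on both the acoustic model and the language model, which established the foundation for integrating NMF, HMM, PCFG, and related models.
Strategy for Future Research Activity	As originally planned, we will integrate the acoustic model and the language model to devise a method that performs automatic transcription from music audio signals while simultaneously learning chord progressions. We will also work on formulating a note arrangement model for polyphonic music.
Causes of Carryover	To advance research on symbolic processing of musical scores, we decided on short notice to employ one postdoctoral researcher for about half a year, so personnel and honoraria costs exceeded the original plan. As for the research environment, existing equipment was sufficient, so we postponed the purchase of expensive items such as compute servers.
Expenditure Plan for Carryover Budget	This fiscal year we finally move to the stage of large-scale data analysis, so we will consider purchasing a high-speed compute server, timed to coincide with the generational turnover of CPUs and GPUs.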

Research Products
(12 results)

All Presentation (12 results) (of which Int'l Joint Research: 8 results)

[Presentation] Student's t Nonnegative Matrix Factorization and Positive Semidefinite Tensor Factorization for Single-Channel Audio Source Separation (2016)
- Author(s)
  Kazuyoshi Yoshii, Katsutoshi Itoyama, Masataka Goto
- Organizer
  IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)
- Place of Presentation
  Shanghai, China
- Year and Date
  2016-03-20 – 2016-03-25
- Int'l Joint Research
[Presentation] Tree-Structured Probabilistic Model of Monophonic Written Music Based on the Generative Theory of Tonal Music (2016)
- Author(s)
  Eita Nakamura, Masatoshi Hamanaka, Keiji Hirata, Kazuyoshi Yoshii
- Organizer
  IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)
- Place of Presentation
  Shanghai, China
- Year and Date
  2016-03-20 – 2016-03-25
- Int'l Joint Research
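[Presentation] Nonnegative Matrix Factorization and Positive Semidefinite Tensor Factorization Based on Student's t Distribution for Music Audio Signal Analysis (2015)
- Author(s)
  Kazuyoshi Yoshii, Katsutoshi Itoyama, Masataka Goto
- Organizer
  18th IEICE Information-Based Induction Sciences Workshop (IBIS)
- Place of Presentation
  Tsukuba International Congress Center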
- Year and Date
  2015-11-25 – 2015-11-28
[Presentation] Infinite Superimposed Discrete All-pole Modeling for Source-Filter Decomposition of Wavelet Spectrograms (2015)
- Author(s)
  Kazuyoshi Yoshii, Katsutoshi Itoyama, Masataka Goto
- Organizer
  International Society for Music Information Retrieval Conference (ISMIR)
- Place of Presentation
  Malaga, Spain
- Year and Date
  2015-10-26 – 2015-10-30
- Int'l Joint Research
[Presentation] Audio-Visual Beat Tracking based on a State-Space Model for a Music Robot Dancing with Humans (2015)
- Author(s)
  Misato Ohkita, Yoshiaki Bando, Yukara Ikemiya, Katsutoshi Itoyama, Kazuyoshi Yoshii
- Organizer
  IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
- Place of Presentation
  Hamburg, Germany
- Year and Date
  2015-09-28 – 2015-10-02
- Int'l Joint Research
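[Presentation] Nonnegative Matrix Factorization and Positive Semidefinite Tensor Factorization for Source Separation of Non-Gaussian Monaural Audio Signals (2015)
- Author(s)
  Kazuyoshi Yoshii, Katsutoshi Itoyama, Masataka Goto
- Organizer
  108th IPSJ Special Interest Group on Music and Computer (SIGMUS) Meeting
- Place of Presentation
  Nagoya University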
- Year and Date
  2015-08-31 – 2015-09-01
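[Presentation] Chord Recognition Based on Separation of Singing Voice, Accompaniment, and Percussive Sounds for Music Audio Signals (2015)
- Author(s)
  Satoshi Maruo, Yukara Ikemiya, Katsutoshi Itoyama, Kazuyoshi Yoshii
- Organizer
  108th IPSJ Special Interest Group on Music and Computer (SIGMUS) Meeting
- Place of Presentation
  Nagoya University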
- Year and Date
  2015-08-31 – 2015-09-01
[Presentation] A Music Performance Assistance System based on Vocal, Harmonic, and Percussive Source Separation and Content Visualization for Music Audio Signals (2015)
- Author(s)
  Ayaka Dobashi, Yukara Ikemiya, Katsutoshi Itoyama, Kazuyoshi Yoshii
- Organizer
  Sound and Music Computing Conference (SMC)
- Place of Presentation
  Maynooth, Ireland
- Year and Date
  2015-07-26 – 2015-08-01
- Int'l Joint Research
[Presentation] A Score-Informed Piano Tutoring System with Mistake Detection and Score Simplification (2015)
- Author(s)
  Tsubasa Fukuda, Yukara Ikemiya, Katsutoshi Itoyama, Kazuyoshi Yoshii
- Organizer
  Sound and Music Computing Conference (SMC)
- Place of Presentation
  Maynooth, Ireland
- Year and Date
  2015-07-26 – 2015-08-01
- Int'l Joint Research
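[Presentation] Singing Voice and Accompaniment Separation Based on RPCA and Pitch Estimation for Monaural Music Audio Signals (2015)
- Author(s)
  Yukara Ikemiya, Katsutoshi Itoyama, Kazuyoshi Yoshii
- Organizer
  107th IPSJ Special Interest Group on Music and Computer (SIGMUS) Meeting
- Place of Presentation
  The University of Electro-Communications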
- Year and Date
  2015-05-23 – 2015-05-24
[Presentation] Singing Voice Analysis and Editing based on Mutually Dependent F0 Estimation and Source Separation (2015)
- Author(s)
  Yukara Ikemiya, Kazuyoshi Yoshii, Katsutoshi Itoyama
- Organizer
  IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)
- Place of Presentation
  Brisbane, Australia
- Year and Date
  2015-05-19 – 2015-05-24
- Int'l Joint Research
[Presentation] A Feedback Framework for Improved Chord Recognition Based on NMF-based Approximate Note Transcription (2015)
- Author(s)
  Satoshi Maruo, Kazuyoshi Yoshii, Katsutoshi Itoyama, Matthias Mauch, Masataka Goto
- Organizer
  IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)
- Place of Presentation
  Brisbane, Australia
- Year and Date
  2015-05-19 – 2015-05-24
- Int'l Joint Research
