Voice Conversion for People with Articulation Disorders

Research Project

Project/Area Number	14J04514
Research Category	Grant-in-Aid for JSPS Fellows
Allocation Type	Single-year Grants
Section	Domestic
Research Field	Perceptual information processing
Research Institution	Kobe University
Principal Investigator	Ryo Aihara, Kobe University, Graduate School of System Informatics, JSPS Research Fellow (DC1)
Project Period (FY)	2014-04-25 – 2017-03-31
Project Status	Completed (Fiscal Year 2016)
Budget Amount	¥2,800,000 (Direct Cost: ¥2,800,000) Fiscal Year 2016: ¥900,000 (Direct Cost: ¥900,000) Fiscal Year 2015: ¥900,000 (Direct Cost: ¥900,000) Fiscal Year 2014: ¥1,000,000 (Direct Cost: ¥1,000,000)
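Keywords	voice conversion / disability support / discriminative learning / speech rhythm / duration / articulation disorder / cerebral palsy / athetosis / speaker-independent / speech assistance / disability welfare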
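Outline of Annual Research Achievements	Voice conversion is a technique that transforms one speaker's voice so that it sounds as if another person were speaking. The goal of this research is to use voice conversion to make the unclear speech of people with articulation disorders caused by athetoid cerebral palsy easier to understand. Because voice conversion is a speech-to-speech system that does not perform text recognition, it is considered easy to use even for people with speech disorders who have limited movement of their limbs. This fiscal year, aiming to improve conversion accuracy, we worked on two tasks: "discriminative learning" and "speech rhythm conversion." Phoneme ambiguity has been pointed out as one cause of the unclear speech of people with articulation disorders. A phoneme is regarded as the smallest segmentable unit of speech. Because people with articulation disorders have impaired control of the articulators, such as the mouth and tongue, their speech tends to be more ambiguous than that of speakers without such impairments. In the proposed method, we introduced a phoneme-discriminating model into the algorithm we have been studying for voice conversion for articulation disorders, improving it so that speech is converted more clearly. This result was presented at INTERSPEECH2016, one of the world's largest international conferences on speech signal processing. Another characteristic of the speech of people with articulation disorders is that it is unnaturally drawn out. Whereas the speech rhythm of speakers without impairments is basically constant, the rhythm of speakers with the disorder varies greatly depending on the neighboring phonemes and the speaker's physical condition. This variation in rhythm was one of the factors that made their speech hard to understand. There have been few prior examples of speech rhythm conversion; in voice conversion systems in particular, the input speaker's rhythm has almost always been used unchanged. We therefore proposed a new feature for converting speech rhythm and succeeded in bringing the rhythm closer to that of speakers without impairments. These results were presented at meetings of the Acoustical Society of Japan and the IEICE, and are currently under submission to INTERSPEECH2017.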
Research Progress Status	Not entered because fiscal year 2016 is the final year of the project.
Strategy for Future Research Activity	Not entered because fiscal year 2016 is the final year of the project.

Report

(3 results)

Research Products
(45 results)


All Journal Article (10 results) (of which Peer Reviewed: 8 results, Open Access: 2 results) Presentation (34 results) (of which Int'l Joint Research: 10 results) Book (1 results)

[Journal Article] Multiple Non-negative Matrix Factorization for Many-to-many Voice Conversion (2016)
- Author(s)
  Ryo Aihara, Tetsuya Takiguchi, Yasuo Ariki
- Journal Title
  
  IEEE/ACM Transactions on Audio, Speech, and Language Processing
  
  Volume: 24 Issue: 7 Pages: 1175-1184
- DOI
  10.1109/taslp.2016.2522643
- Related Report
  2016 Annual Research Report
- Peer Reviewed
[Journal Article] Multiple Non-negative Matrix Factorization for Many-to-many Voice Conversion (2016)
- Author(s)
  Ryo Aihara, Tetsuya Takiguchi, Yasuo Ariki
- Journal Title
  
  IEEE/ACM Trans. on Audio, Speech, and Language Processing
  
  Volume: PP Pages: 1-10
- Related Report
  2015 Annual Research Report
- Peer Reviewed
[Journal Article] Individuality-Preserving Voice Conversion for Articulation Disorders Using Phoneme-Categorized Exemplars (2015)
- Author(s)
  Ryo Aihara, Tetsuya Takiguchi, Yasuo Ariki
- Journal Title
  
  ACM Trans. on Accessible Computing; Special Issue on Speech and Language Processing for AT
  
  Volume: 6
- Related Report
  2015 Annual Research Report
- Peer Reviewed
[Journal Article] Audio-Visual Speech Recognition Using Convolutive Bottleneck Networks for a Person with Severe Hearing Loss (2015)
- Author(s)
  Yuki Takashima, Yasuhiro Kakihara, Ryo Aihara, Tetsuya Takiguchi, Yasuo Ariki, Nobuyuki Mitani, Kiyohiro Omori, Kaoru Nakazono
- Journal Title
  
  IPSJ Trans. on Computer Vision and Applications
  
  Volume: 7 Pages: 64-68
- Related Report
  2015 Annual Research Report
- Peer Reviewed
[Journal Article] Multimodal voice conversion based on non-negative matrix factorization (2015)
- Author(s)
  Kenta Masaka, Ryo Aihara, Tetsuya Takiguchi, Yasuo Ariki
- Journal Title
  
  EURASIP Journal on Audio, Speech, and Music Processing
  
  Volume: 2015:24 Issue: 1 Pages: 1-9
- DOI
  10.1186/s13636-015-0067-4
- Related Report
  2015 Annual Research Report
- Peer Reviewed / Open Access
[Journal Article] Small-parallel exemplar-based voice conversion in noisy environments using affine non-negative matrix factorization (2015)
- Author(s)
  Ryo Aihara, Takao Fujii, Toru Nakashika, Tetsuya Takiguchi, Yasuo Ariki
- Journal Title
  
  EURASIP Journal on Audio, Speech, and Music Processing
  
  Volume: 2015:32 Issue: 1 Pages: 1-9
- DOI
  10.1186/s13636-015-0075-4
- Related Report
  2015 Annual Research Report
- Peer Reviewed / Open Access
[Journal Article] Individuality-preserving Voice Conversion for Articulation Disorders Using Phoneme-categorized Exemplars (2015)
- Author(s)
  Ryo Aihara, Tetsuya Takiguchi and Yasuo Ariki
- Journal Title
  
  Transactions on Accessible Computing
  
  Volume: To be determined
- Related Report
  2014 Annual Research Report
- Peer Reviewed
[Journal Article] Noise-Robust Voice Conversion Based on Sparse Spectral Mapping Using Non-negative Matrix Factorization (2014)
- Author(s)
  Ryo Aihara, Ryoichi Takashima, Tetsuya Takiguchi and Yasuo Ariki
- Journal Title
  
  IEICE Transactions on Information and Systems
  
  Volume: E97-D Pages: 1411-1418
- Related Report
  2014 Annual Research Report
- Peer Reviewed
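[Journal Article] Individuality-Preserving Voice Conversion for Articulation Disorders Using Sparse Dictionary Learning (2014)
- Author(s)
  Ryo Aihara, Tetsuya Takiguchi, Yasuo Ariki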
- Journal Title
  
  IEICE Technical Report
  
  Volume: 91 Pages: 39-44
- NAID
  40020156739
- Related Report
  2014 Annual Research Report
[Journal Article] Many-to-One Voice Conversion Using Multiple Non-negative Matrix Factorization (2014)
- Author(s)
  Ryo Aihara, Tetsuya Takiguchi, Yasuo Ariki
- Journal Title
  
  IEICE Technical Report
  
  Volume: 114 Pages: 75-80
- NAID
  110009850921
- Related Report
  2014 Annual Research Report
[Presentation] Visual-to-Speech Conversion Based on Maximum Likelihood Estimation (2017)
- Author(s)
  羅里奈
- Organizer
  MVA2017, The Fifteenth IAPR International Conference on Machine Vision Applications
- Place of Presentation
  Nagoya University, Nagoya, Japan
- Year and Date
  2017-05-08
- Related Report
  2016 Annual Research Report
- Int'l Joint Research
[Presentation] Phoneme-Discriminative Features for Voice Conversion (2017)
- Author(s)
  Ryo Aihara
- Organizer
  Acoustical Society of Japan 2017 Spring Meeting
- Place of Presentation
  Meiji University, Kanagawa, Japan
- Year and Date
  2017-03-09
- Related Report
  2016 Annual Research Report
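[Presentation] Speech Generation from Lip Video Using Maximum Likelihood Conversion (2017)
- Author(s)
  羅里奈
- Organizer
  Acoustical Society of Japan 2017 Spring Meeting
- Place of Presentation
  Meiji University, Kanagawa, Japan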
- Year and Date
  2017-03-09
- Related Report
  2016 Annual Research Report
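[Presentation] Statistical Voice Conversion Including Duration for People with Articulation Disorders (2017)
- Author(s)
  Ryo Aihara
- Organizer
  IEICE Technical Committee on Speech (SP)
- Place of Presentation
  Okinawa Industry Support Center, Okinawa, Japan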
- Year and Date
  2017-03-01
- Related Report
  2016 Annual Research Report
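[Presentation] Parallel Dictionary Learning Using Graph Embedding for Voice Conversion Based on Non-negative Matrix Factorization (2016)
- Author(s)
  Ryo Aihara
- Organizer
  Acoustical Society of Japan 2016 Autumn Meeting
- Place of Presentation
  University of Toyama, Toyama, Japan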
- Year and Date
  2016-09-14
- Related Report
  2016 Annual Research Report
[Presentation] A Study of Voice Conversion Using Complex NMF (2016)
- Author(s)
  李権俊
- Organizer
  Acoustical Society of Japan 2016 Autumn Meeting
- Place of Presentation
  University of Toyama, Toyama, Japan
- Year and Date
  2016-09-14
- Related Report
  2016 Annual Research Report
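[Presentation] A Study of Image Features in Multimodal Voice Conversion Using Non-negative Matrix Factorization (2016)
- Author(s)
  羅里奈
- Organizer
  Acoustical Society of Japan 2016 Autumn Meeting
- Place of Presentation
  University of Toyama, Toyama, Japan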
- Year and Date
  2016-09-14
- Related Report
  2016 Annual Research Report
[Presentation] Parallel Dictionary Learning for Voice Conversion Using Discriminative Graph-embedded Non-negative Matrix Factorization (2016)
- Author(s)
  Ryo Aihara
- Organizer
  INTERSPEECH2016
- Place of Presentation
  Hyatt Regency, San Francisco, USA
- Year and Date
  2016-09-08
- Related Report
  2016 Annual Research Report
- Int'l Joint Research
[Presentation] Audio-Visual Speech Recognition Using Bimodal-Trained Bottleneck Features for a Person with Severe Hearing Loss (2016)
- Author(s)
  Yuki Takashima
- Organizer
  INTERSPEECH2016
- Place of Presentation
  Hyatt Regency, San Francisco, USA
- Year and Date
  2016-09-08
- Related Report
  2016 Annual Research Report
- Int'l Joint Research
[Presentation] Parallel Dictionary Learning for Voice Conversion Using Discriminative Graph-embedded Non-negative Matrix Factorization (2016)
- Author(s)
  Ryo Aihara
- Organizer
  IEICE Technical Committee on Speech (SP)
- Place of Presentation
  Kyoto University, Kyoto, Japan
- Year and Date
  2016-08-24
- Related Report
  2016 Annual Research Report
[Presentation] SEMI-NON-NEGATIVE MATRIX FACTORIZATION USING ALTERNATING DIRECTION METHOD OF MULTIPLIERS FOR VOICE CONVERSION (2016)
- Author(s)
  Ryo Aihara, Tetsuya Takiguchi, and Yasuo Ariki
- Organizer
  IEEE ICASSP 2016
- Place of Presentation
  Shanghai, China
- Year and Date
  2016-03-20
- Related Report
  2015 Annual Research Report
- Int'l Joint Research
[Presentation] Dysarthric Speech Modification Using Parallel Utterance Based on Non-negative Temporal Decomposition (2016)
- Author(s)
  Ryo Aihara
- Organizer
  SLPAT 2016, 7th Workshop on Speech and Language Processing for Assistive Technologies
- Place of Presentation
  San Francisco, USA
- Related Report
  2016 Annual Research Report
- Int'l Joint Research
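[Presentation] Parallel Dictionary Learning for Voice Conversion Using Alternating Direction Method of Multipliers (2015)
- Author(s)
  Ryo Aihara, Tetsuya Takiguchi, Yasuo Ariki
- Organizer
  17th Spoken Language Symposium
- Place of Presentation
  Nagoya Institute of Technology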
- Year and Date
  2015-12-02
- Related Report
  2015 Annual Research Report
[Presentation] Individuality-Preserving Voice Conversion for Articulation Disorders Using Phoneme-Categorized Exemplars (2015)
- Author(s)
  Ryo Aihara, Tetsuya Takiguchi, and Yasuo Ariki
- Organizer
  The 17th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS 2015)
- Place of Presentation
  Lisbon, Portugal
- Year and Date
  2015-10-26
- Related Report
  2015 Annual Research Report
- Int'l Joint Research
[Presentation] MANY-TO-ONE VOICE CONVERSION USING EXEMPLAR-BASED SPARSE REPRESENTATION (2015)
- Author(s)
  Ryo Aihara, Tetsuya Takiguchi, and Yasuo Ariki
- Organizer
  IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA2015)
- Place of Presentation
  New Paltz, USA
- Year and Date
  2015-10-18
- Related Report
  2015 Annual Research Report
- Int'l Joint Research
[Presentation] Exemplar-based Voice Conversion for Arbitrary Speakers (2015)
- Author(s)
  Ryo Aihara, Tetsuya Takiguchi, Yasuo Ariki
- Organizer
  IEICE Technical Committee on Speech (SP)
- Place of Presentation
  Kobe University
- Year and Date
  2015-10-15
- Related Report
  2015 Annual Research Report
[Presentation] LIP-TO-SPEECH SYNTHESIS USING LOCALITY-CONSTRAINT NON-NEGATIVE MATRIX FACTORIZATION (2015)
- Author(s)
  Ryo AIHARA, Kenta MASAKA, Tetsuya TAKIGUCHI, Yasuo ARIKI
- Organizer
  The First International Workshop on Machine Learning in Spoken Language Processing (MLSLP)
- Place of Presentation
  Aizu-Wakamatsu, Japan
- Year and Date
  2015-09-19
- Related Report
  2015 Annual Research Report
- Int'l Joint Research
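[Presentation] Many-to-Many Voice Conversion Based on Multiple Non-negative Matrix Factorization (2015)
- Author(s)
  Ryo Aihara, Tetsuya Takiguchi, Yasuo Ariki
- Organizer
  Acoustical Society of Japan 2015 Autumn Meeting
- Place of Presentation
  University of Aizu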
- Year and Date
  2015-09-16
- Related Report
  2015 Annual Research Report
[Presentation] Many-to-many Voice Conversion Based on Multiple Non-negative Matrix Factorization (2015)
- Author(s)
  Ryo Aihara, Tetsuya Takiguchi, and Yasuo Ariki
- Organizer
  INTERSPEECH 2015
- Place of Presentation
  Dresden, Germany
- Year and Date
  2015-09-06
- Related Report
  2015 Annual Research Report
- Int'l Joint Research
[Presentation] NOISE-ROBUST VOICE CONVERSION USING A SMALL PARALLEL DATA BASED ON NON-NEGATIVE MATRIX FACTORIZATION (2015)
- Author(s)
  Ryo Aihara, Takao Fujii, Toru Nakashika, Tetsuya Takiguchi, Yasuo Ariki
- Organizer
  The 23rd European Signal Processing Conference (EUSIPCO)
- Place of Presentation
  Nice, France
- Year and Date
  2015-08-31
- Related Report
  2015 Annual Research Report
- Int'l Joint Research
[Presentation] ACTIVITY-MAPPING NON-NEGATIVE MATRIX FACTORIZATION FOR EXEMPLAR-BASED VOICE CONVERSION (2015)
- Author(s)
  Ryo Aihara, Tetsuya Takiguchi, Yasuo Ariki, Kobe University
- Organizer
  ICASSP2015
- Place of Presentation
  Brisbane, Australia
- Year and Date
  2015-04-21 – 2015-04-24
- Related Report
  2014 Annual Research Report
[Presentation] Many-to-One Voice Conversion Based on Multiple Non-negative Matrix Factorization (2015)
- Author(s)
  Ryo Aihara, Tetsuya Takiguchi, Yasuo Ariki
- Organizer
  Acoustical Society of Japan 2015 Spring Meeting
- Place of Presentation
  Chuo University
- Year and Date
  2015-03-16 – 2015-03-18
- Related Report
  2014 Annual Research Report
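[Presentation] Voice Conversion in Noisy Environments by Non-negative Matrix Factorization Using a Small Amount of Parallel Data (2015)
- Author(s)
  Takao Fujii, Ryo Aihara, Toru Nakashika, Tetsuya Takiguchi, Yasuo Ariki
- Organizer
  Acoustical Society of Japan 2015 Spring Meeting
- Place of Presentation
  Chuo University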
- Year and Date
  2015-03-16 – 2015-03-18
- Related Report
  2014 Annual Research Report
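[Presentation] Speech Generation from Lip Video Based on Non-negative Matrix Factorization (2015)
- Author(s)
  Kenta Masaka, Ryo Aihara, Tetsuya Takiguchi, Yasuo Ariki
- Organizer
  Acoustical Society of Japan 2015 Spring Meeting
- Place of Presentation
  Chuo University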
- Year and Date
  2015-03-16 – 2015-03-18
- Related Report
  2014 Annual Research Report
[Presentation] Exemplar-based Emotional Voice Conversion Using Non-negative Matrix Factorization (2014)
- Author(s)
  Ryo AIHARA, Reina UEDA, Tetsuya TAKIGUCHI, Yasuo ARIKI
- Organizer
  APSIPA2014
- Place of Presentation
  Siem Reap, Cambodia
- Year and Date
  2014-12-09 – 2014-12-12
- Related Report
  2014 Annual Research Report
[Presentation] Multimodal Exemplar-based Voice Conversion using Lip Features in Noisy Environments (2014)
- Author(s)
  Kenta Masaka, Ryo Aihara, Tetsuya Takiguchi, Yasuo Ariki
- Organizer
  Interspeech2014
- Place of Presentation
  Singapore
- Year and Date
  2014-09-14 – 2014-09-18
- Related Report
  2014 Annual Research Report
[Presentation] Error Correction of Automatic Speech Recognition Based on Normalized Web Distance (2014)
- Author(s)
  E. Byambakhishig, K. Tanaka, R. Aihara, T. Nakashika, T. Takiguchi, Y. Ariki
- Organizer
  Interspeech2014
- Place of Presentation
  Singapore
- Year and Date
  2014-09-14 – 2014-09-18
- Related Report
  2014 Annual Research Report
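[Presentation] Voice Conversion Using Non-negative Matrix Factorization with Activity Mapping (2014)
- Author(s)
  Ryo Aihara, Tetsuya Takiguchi, Yasuo Ariki
- Organizer
  Acoustical Society of Japan 2014 Autumn Meeting
- Place of Presentation
  Hokkai-Gakuen University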
- Year and Date
  2014-09-03 – 2014-09-05
- Related Report
  2014 Annual Research Report
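[Presentation] Voice Conversion in Noisy Environments by NMF with Speaker Adaptation (2014)
- Author(s)
  Takao Fujii, Ryo Aihara, Toru Nakashika, Tetsuya Takiguchi, Yasuo Ariki
- Organizer
  Acoustical Society of Japan 2014 Autumn Meeting
- Place of Presentation
  Hokkai-Gakuen University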
- Year and Date
  2014-09-03 – 2014-09-05
- Related Report
  2014 Annual Research Report
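[Presentation] Multimodal NMF Voice Conversion Using High-Speed Camera Images (2014)
- Author(s)
  Kenta Masaka, Ryo Aihara, Tetsuya Takiguchi, Yasuo Ariki
- Organizer
  Acoustical Society of Japan 2014 Autumn Meeting
- Place of Presentation
  Hokkai-Gakuen University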
- Year and Date
  2014-09-03 – 2014-09-05
- Related Report
  2014 Annual Research Report
[Presentation] Individuality-preserving Voice Conversion for Articulation Disorders Using Dictionary Selective Non-negative Matrix Factorization (2014)
- Author(s)
  Ryo Aihara, Tetsuya Takiguchi, Yasuo Ariki
- Organizer
  SLPAT 2014, 5th Workshop on Speech and Language Processing for Assistive Technologies
- Place of Presentation
  Baltimore, U.S.
- Year and Date
  2014-06-26
- Related Report
  2014 Annual Research Report
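[Presentation] A Method for Correcting Speech Recognition Errors Using Normalized Web Distance (2014)
- Author(s)
  E. Byambakhishig, K. Tanaka, R. Aihara, T. Takiguchi, Y. Ariki
- Organizer
  28th Annual Conference of the Japanese Society for Artificial Intelligence
- Place of Presentation
  Ehime Prefectural Citizens' Cultural Hall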
- Year and Date
  2014-05-12 – 2014-05-15
- Related Report
  2014 Annual Research Report
[Presentation] VOICE CONVERSION BASED ON NON-NEGATIVE MATRIX FACTORIZATION USING PHONEME-CATEGORIZED DICTIONARY (2014)
- Author(s)
  Ryo AIHARA, Toru NAKASHIKA, Tetsuya TAKIGUCHI, Yasuo ARIKI
- Organizer
  ICASSP2014
- Place of Presentation
  Florence, Italy
- Year and Date
  2014-05-04 – 2014-05-09
- Related Report
  2014 Annual Research Report
[Presentation] MULTIMODAL VOICE CONVERSION USING NON-NEGATIVE MATRIX FACTORIZATION IN NOISY ENVIRONMENTS (2014)
- Author(s)
  Kenta MASAKA, Ryo AIHARA, Tetsuya TAKIGUCHI, Yasuo ARIKI
- Organizer
  ICASSP2014
- Place of Presentation
  Florence, Italy
- Year and Date
  2014-05-04 – 2014-05-09
- Related Report
  2014 Annual Research Report
[Book] Computer and Information Science (2016)
- Author(s)
  Roger Lee (Editor), Ryo Aihara, Kenta Masaka, Tetsuya Takiguchi, Yasuo Ariki
- Total Pages
  181
- Publisher
  Springer International Publishing
- Related Report
  2016 Annual Research Report
