2019 Fiscal Year Annual Research Report

Design of Robust Speech Recognition System and Development of its Energy Harvesting System

Research Project

Project/Area Number	18H03212
Research Institution	Hokkaido University
Principal Investigator	Yoshikazu Miyanaga, Hokkaido University, Faculty of Information Science and Technology, Professor (20166185)
Project Period (FY)	2018-04-01 – 2022-03-31
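Keywords	speech recognition system / speech information processing / energy harvesting / low-power technology / noise robustness / digital signal processing / hardware-software co-design / circuits and systems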
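Outline of Annual Research Achievements	Over a four-year period, this project designs and develops speech recognition LSI systems that are effective in two environments (adverse acoustic environments and sustainability-oriented environments) and conducts field experiments with them, with the aim of realizing highly practical speech recognition and dialogue systems. In the first two years of the plan (FY2018 and FY2019), we designed, developed, and implemented new speech recognition techniques for adverse conditions. We developed a new noise-robust speech recognition method and, in parallel, carried out system design based on hardware/software co-design for low power consumption. The following results were obtained in FY2019. (1) Proposed method for analyzing speech buried in noise: We introduced a time-varying model analysis method into the noise-robust speech recognition system already developed by the applicant. In addition, based on psychoacoustic theory, we modeled the newly proposed dynamic masking phenomenon, improved the accuracy of speech feature analysis, and extended Missing Feature Theory. (2) Proposed method for speech recognition under adverse conditions: We optimized noise suppression and echo cancellation methods and their operating conditions for speaker clusters such as children, adult men and women, and the elderly. Assuming a variety of observation environments and echo and noise environments, we designed noise suppression methods, and determined their operating conditions, so that an optimal solution is obtained under each condition. Hardware/software co-design was carried out on the basis of these evaluation results. (3) Proposed speech rejection method for suppressing misrecognition: We proposed and developed speech rejection processing that extracts features of unwanted signals, sounds, and speech in the cepstral and time domains, computes similarity by a likelihood test, evaluates the characteristics of that similarity from multiple viewpoints using several evaluation criteria, and automatically removes unwanted signals and non-target speech. Its performance evaluation is in progress.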
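The speech rejection step in item (3) of the outline above can be illustrated with a small likelihood-ratio test. This is only a sketch under stated assumptions: the report does not specify the models, features, or thresholds, so the diagonal-covariance Gaussian models, the 12-dimensional toy "cepstral" vectors, the zero threshold, and the names `diag_gauss_loglik` and `should_reject` are all hypothetical.

```python
import numpy as np

def diag_gauss_loglik(frames, mean, var):
    """Per-frame log-likelihood under a diagonal-covariance Gaussian."""
    frames = np.asarray(frames, dtype=float)
    diff = frames - mean
    return -0.5 * np.sum(np.log(2.0 * np.pi * var) + diff * diff / var, axis=1)

def should_reject(frames, speech_model, garbage_model, threshold=0.0):
    """Return (reject, score).

    score is the mean log-likelihood ratio (target speech vs. unwanted
    sound) over all frames; the input is rejected when it falls below
    the threshold. Both models are (mean, variance) pairs and are
    assumptions made for this sketch.
    """
    ll_speech = diag_gauss_loglik(frames, *speech_model)
    ll_garbage = diag_gauss_loglik(frames, *garbage_model)
    score = float(np.mean(ll_speech - ll_garbage))
    return score < threshold, score

# Toy data: random vectors stand in for real cepstral features.
rng = np.random.default_rng(0)
dim = 12
speech_model = (np.zeros(dim), np.ones(dim))
garbage_model = (np.full(dim, 3.0), np.ones(dim))
speech_like = rng.normal(0.0, 1.0, size=(50, dim))
noise_like = rng.normal(3.0, 1.0, size=(50, dim))

print(should_reject(speech_like, speech_model, garbage_model))
print(should_reject(noise_like, speech_model, garbage_model))
```

In this toy setup the speech-like input is accepted and the noise-like input is rejected. A system that evaluates similarity with several criteria, as the report describes, would combine more than one such score.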
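Current Status of Research Progress	2: Research has progressed on the whole more than it was originally planned. Reason: The FY2019 research plan covered the development and implementation of new speech recognition techniques for adverse conditions and the development of a new technique for low power consumption (an ultra-low-power architecture). Specifically, the plan was to propose and implement techniques for discriminating speech from non-speech, estimating the features of speech buried in noise, recognizing speech under adverse conditions, and rejecting speech to suppress misrecognition, together with system design based on hardware/software co-design. (1) Proposed method for analyzing speech buried in noise: We introduced the time-varying model analysis method proposed by the applicant, and designed and developed a new masking model based on psychoacoustic theory, improving the accuracy of speech feature extraction. These results have been published in international conference and journal papers. (2) Proposed method for speech recognition under adverse conditions: Designing a noise suppression method that adapts to every condition is not realistic; what matters is designing the optimal method for each of the various conditions that can be anticipated. We optimized noise suppression and echo cancellation methods and their operating conditions for speaker clusters such as children, adult men and women, and the elderly. These results have also already been presented at several international conferences. (3) Proposed speech rejection method for suppressing misrecognition: We designed and developed speech rejection processing that extracts features of unwanted signals, sounds, and speech in the cepstral and time domains, computes similarity by a likelihood test, evaluates the characteristics of that similarity from multiple viewpoints using several evaluation criteria, and automatically removes unwanted signals and non-target speech. We are currently continuing its performance evaluation with the aim of improving performance. With the new techniques (1) to (3), high recognition performance was achieved even in a variety of echo environments and under poor SNR conditions. These methods are being implemented as a system through hardware/software co-design, and the work is progressing largely smoothly in line with the planned research schedule.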
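Item (1) above builds on Missing Feature Theory, in which spectral components dominated by noise are treated as unreliable. The sketch below shows the classic binary reliability mask based on a local SNR estimate; it is not the project's psychoacoustic masking model, which the report does not describe in detail. The function name `reliability_mask`, the use of the leading frames as a noise-only estimate, and the 0 dB threshold are assumptions.

```python
import numpy as np

def reliability_mask(noisy_power, noise_frames=5, snr_threshold_db=0.0):
    """Binary missing-feature mask from a (frames x bands) power spectrogram.

    Assumes the first `noise_frames` frames contain noise only. A cell is
    marked reliable (True) when its estimated local SNR exceeds the threshold.
    """
    noisy_power = np.asarray(noisy_power, dtype=float)
    # Per-band noise power, averaged over the leading noise-only frames.
    noise_est = np.maximum(noisy_power[:noise_frames].mean(axis=0), 1e-12)
    # Spectral-subtraction estimate of the speech power, floored to stay positive.
    speech_est = np.maximum(noisy_power - noise_est, 1e-12)
    local_snr_db = 10.0 * np.log10(speech_est / noise_est)
    return local_snr_db > snr_threshold_db

# Toy spectrogram: flat noise everywhere, extra energy in band 2 from frame 10 on.
noisy = np.ones((20, 8))
noisy[10:, 2] += 10.0
mask = reliability_mask(noisy)
print(mask.sum(), "reliable cells out of", mask.size)
```

A recognizer using such a mask would then ignore or reconstruct the unreliable cells before computing features.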
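Strategy for Future Research Activity	This project designs and develops speech recognition LSI systems that are effective in two environments (adverse acoustic environments and sustainability-oriented environments) and conducts field experiments with them, aiming to realize highly practical speech recognition and dialogue systems. The first two years of the plan (FY2018 and FY2019) targeted the development and implementation of new speech recognition techniques for adverse conditions, and the proposal and design of the new methods have been completed. Their performance was evaluated in FY2019. In parallel, we carried out system design based on hardware/software co-design for low power consumption. In the latter two years (FY2020 and FY2021), we will design and develop a low-power LSI system, evaluate its power consumption, and implement on an FPGA a high-performance speech recognition LSI for 2000 spoken phrases. We will also develop a speech recognition and dialogue system using this LSI through hardware/software co-design and carry out field demonstration experiments. The dialogue model will be a learned dialogue model based on deep neural networks or similar methods. The dialogue assumed in this project is a task-oriented model in which devices such as home appliances and automobiles are controlled by voice. In particular, we plan to introduce natural energy harvesting technology and realize a recognition system that uses ultra-low-power techniques. To that end, we plan to develop the following new technique, which is item 4 of the research plan in the applicant's proposal. (4) Design and development of a recognition system oriented to energy harvesting systems: Parallelizing the processing lowers the clock frequency, but the total gate count increases, so power consumption due to leakage current increases. We will therefore newly design a dynamic architecture that enables advanced parallel and pipelined processing with a small number of gates, aiming to realize a system that minimizes power consumption through parallel pipelined processing (gate count reduction, clock reduction, leakage current reduction).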
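The trade-off stated in item (4) can be made concrete with a first-order CMOS power model: dynamic power scales with switched capacitance, supply voltage squared, and clock frequency, while leakage power scales with gate count. The sketch below is illustrative only. Every numeric default, the `gate_sharing` parameter (the fraction of replicated gates that an architecture avoids), and the function name `power_breakdown` are assumptions, and supply-voltage scaling is deliberately left out.

```python
def power_breakdown(parallelism, gate_sharing=0.0, base_gates=100_000,
                    base_clock_hz=50e6, vdd=1.0, activity=0.1,
                    cap_per_gate_f=1e-15, leak_per_gate_a=1e-9):
    """Return (dynamic_w, leakage_w) at constant throughput.

    All default values are illustrative assumptions, not measured data.
    """
    if parallelism < 1 or not 0.0 <= gate_sharing <= 1.0:
        raise ValueError("parallelism must be >= 1 and gate_sharing in [0, 1]")
    # Same throughput with `parallelism` units lets the clock drop proportionally.
    clock_hz = base_clock_hz / parallelism
    # Naive replication multiplies the gate count; sharing removes part of it.
    gates = base_gates * (1 + (parallelism - 1) * (1.0 - gate_sharing))
    dynamic_w = activity * cap_per_gate_f * gates * vdd ** 2 * clock_hz
    leakage_w = leak_per_gate_a * gates * vdd
    return dynamic_w, leakage_w

serial = power_breakdown(1)
naive = power_breakdown(4)
shared = power_breakdown(4, gate_sharing=0.75)

for name, (dyn, leak) in [("serial", serial), ("4x naive", naive), ("4x shared", shared)]:
    print(f"{name:10s} dynamic={dyn * 1e3:.3f} mW  leakage={leak * 1e3:.3f} mW")
```

Under these assumptions, naive four-way replication leaves dynamic power unchanged and quadruples leakage, whereas reusing gates across the parallel units lowers both terms, which matches the motivation for a small-gate-count parallel pipelined architecture.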

Research Products
(19 results)

All Int'l Joint Research (2 results) Journal Article (12 results) (of which Int'l Joint Research: 6 results, Peer Reviewed: 12 results, Open Access: 3 results) Presentation (4 results) (of which Int'l Joint Research: 4 results, Invited: 4 results) Remarks (1 results)

[Int'l Joint Research] Chulalongkorn University/Faculty of Engineering (Thailand)
- Country Name
  THAILAND
- Counterpart Institution
  Chulalongkorn University/Faculty of Engineering
[Int'l Joint Research] Gadjah Mada University/Faculty of Engineering (Indonesia)
- Country Name
  INDONESIA
- Counterpart Institution
  Gadjah Mada University/Faculty of Engineering
[Journal Article] Psychoacoustical Masking Effect-Based Feature Extraction for Robust Speech Recognition (2019)
- Author(s)
  Hay Mar Soe Naing, Risanuri Hidayat, Bondhan Winduratna, Yoshikazu Miyanaga
- Journal Title
  International Journal of Innovative Computing, Information and Control
  Volume: 15(5) Pages: 1641-1654
- DOI
  10.24507/ijicic.15.05.1641
- Peer Reviewed / Open Access / Int'l Joint Research
[Journal Article] An Architecture for Real-Time Retinex-Based Image Enhancement and Haze Removal and Its FPGA Implementation (2019)
- Author(s)
  KASAUKA Dabwitso, SUGIYAMA Kenta, TSUTSUI Hiroshi, OKUHATA Hiroyuki, MIYANAGA Yoshikazu
- Journal Title
  IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences
  Volume: E102.A Pages: 775-782
- DOI
  10.1587/transfun.E102.A.775
- Peer Reviewed
[Journal Article] Hierarchical-P Reference Picture Selection Based Error Resilient Video Coding Framework for High Efficiency Video Coding Transmission Applications (2019)
- Author(s)
  Maung Maung Htoo, Aramvith Supavadee, Miyanaga Yoshikazu
- Journal Title
  Electronics
  Volume: 8 Pages: 310
- DOI
  10.3390/electronics8030310
- Peer Reviewed / Open Access / Int'l Joint Research
[Journal Article] Fast Coding Unit Encoding Scheme for HEVC Using Genetic Algorithm (2019)
- Author(s)
  Tun Ei Ei, Aramvith Supavadee, Miyanaga Yoshikazu
- Journal Title
  IEEE Access
  Volume: 7 Pages: 68010-68021
- DOI
  10.1109/ACCESS.2019.2918508
- Peer Reviewed / Open Access / Int'l Joint Research
[Journal Article] An Evaluation of Stack Light Indicator Color Detection System Using Web Cameras for Automatic Production Lines (2019)
- Author(s)
  Hiroshi Tsutsui, Kentaro Yamada, Akihiro Sudou, Yoshikazu Miyanaga
- Journal Title
  Proceedings of Asia-Pacific Signal and Information Processing Association Annual Summit and Conference
  Volume: 1 Pages: 1423-1426
- Peer Reviewed
[Journal Article] Voice Activity Detection Using Running Spectrum Analysis for Noise Robust Speech Recognition (2019)
- Author(s)
  Riku Takanashi, Tatsuya Nakagoshi, Noboru Hayasaka, Yoshikazu Miyanaga, Hiroshi Tsutsui
- Journal Title
  Proceedings of International Symposium on Multimedia and Communication Technology
  Volume: 1 Pages: RS2-4 - RS2-6
- Peer Reviewed
[Journal Article] Robust Isolated Speech Recognition for Keyword Detection System Using Hidden Markov Model (2019)
- Author(s)
  Jiayue Tang, Yu Tian, Hiroshi Tsutsui, Yoshikazu Miyanaga
- Journal Title
  Proceedings of International Symposium on Multimedia and Communication Technology
  Volume: 1 Pages: RS2-1 - RS2-3
- Peer Reviewed
[Journal Article] Filterbank Analysis of MFCC Feature Extraction in Robust Children Speech Recognition (2019)
- Author(s)
  Hay Mar Soe Naing, Yoshikazu Miyanaga, Risanuri Hidayat, Bondhan Winduratna
- Journal Title
  Proceedings of International Symposium on Multimedia and Communication Technology
  Volume: 1 Pages: RS2-7 - RS2-12
- Peer Reviewed / Int'l Joint Research
[Journal Article] Encoder Control Enhancement in HEVC Based on R-Lambda Coefficient Distribution (2019)
- Author(s)
  Sovann Chen, Supavadee Aramvith, Yoshikazu Miyanaga
- Journal Title
  Proceedings of International Symposium on Multimedia and Communication Technology
  Volume: 1 Pages: RS4-25 - RS4-28
- Peer Reviewed / Int'l Joint Research
[Journal Article] Improvement on Children Speech Recognition under Low Signal-to-Noise Ratio Environment (2019)
- Author(s)
  Yu Tian, Jiayue Tang, Hiroshi Tsutsui, Yoshikazu Miyanaga
- Journal Title
  Proceedings of International Symposium on Multimedia and Communication Technology
  Volume: 1 Pages: RS1-4 - RS1-6
- Peer Reviewed
[Journal Article] Construction and Management of Fingerprint Database with Estimated Reference Locations for WiFi Indoor Positioning Systems (2019)
- Author(s)
  Myat Hsu Aung, Hiroshi Tsutsui, Yoshikazu Miyanaga
- Journal Title
  Proceedings of the 23rd Multi-conference on Systemics, Cybernetics and Informatics
  Volume: 2 Pages: 7-10
- Peer Reviewed
[Journal Article] A Fast CU Depth Estimation Algorithm for HEVC Inter Coding (2019)
- Author(s)
  Ei Ei Tun, Supavadee Aramvith, Yoshikazu Miyanaga
- Journal Title
  Proceedings of 2019 IEEE International Conference on Consumer Electronics - Asia
  Volume: 1 Pages: 120-121
- Peer Reviewed / Int'l Joint Research
[Presentation] Psycho-acoustic Masking Effect for Robust Speech Communication Robot (2019)
- Author(s)
  Yoshikazu Miyanaga
- Organizer
  International Conference on Humanoid, Nanotechnology, Information Technology, Communication and Control, Environment, and Management
- Int'l Joint Research / Invited
[Presentation] Autonomous ROBOT System with Psycho-acoustic Masking Speech Recognition (2019)
- Author(s)
  Yoshikazu Miyanaga
- Organizer
  Regional Conference on Computer Information and Engineering
- Int'l Joint Research / Invited
[Presentation] Psychoacoustic Masking Effect for Noise Robust Speech Recognition Robot (2019)
- Author(s)
  Yoshikazu Miyanaga
- Organizer
  2019 14-th International Symposium on Signals, Circuits, and Systems, IEEE
- Int'l Joint Research / Invited
[Presentation] Noise Robust Speech Recognition Robot with Psychoacoustic Effect (2019)
- Author(s)
  Yoshikazu Miyanaga
- Organizer
  2019 IEEE International Conference on Consumer Electronics - Asia
- Int'l Joint Research / Invited
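[Remarks] Information Communication Networks Laboratory, Media Network Course, Division of Information Science, Faculty of Information Science and Technology, Hokkaido University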
- URL
  https://csw.ist.hokudai.ac.jp/
