Speech Representation Using Emotion-Speaker Controllable Probabilistic Model Based on Extended Boltzmann Distribution

Toru NAKASHIKA

01/04/2018–31/03/2021

Abstract

Funding Organization: Japan Society for the Promotion of Science
System Name: Grants-in-Aid for Scientific Research
Category: Grant-in-Aid for Early-Career Scientists
Fund Type: competitive_research_funding
Overall Grant Amount: - (direct: 3200000, indirect: 960000)

In speech signal processing, few established methods perform multiple different tasks, such as speaker recognition and emotion recognition, simultaneously. In this research, we focused on the Boltzmann machine, which has a high capacity for representing the relationships between various factors, and examined whether it can simultaneously realize speaker recognition, emotion recognition, speaker conversion, and emotion conversion. The experimental results showed that all four tasks can be achieved using only a Boltzmann machine. We also found that a Boltzmann machine that represents speakers and emotions jointly outperformed a Boltzmann machine that represents only speakers or only emotions, in both recognition accuracy and voice conversion accuracy.

Files and Links (1)

URL

https://kaken.nii.ac.jp/grant/KAKENHI-PROJECT-18K18069

Details

Title: Speech Representation Using Emotion-Speaker Controllable Probabilistic Model Based on Extended Boltzmann Distribution
Creator – No position: Toru NAKASHIKA
ID: 991002558209407421
Organization: The University of Electro-Communications
Resource Type: Other
Resource Subtype: rm_research_projects