Research output list
Conference presentation
A study toward a Japanese text-to-speech system that controls prosody and voice quality through prompts
Published 06/06/2025
音学シンポジウム2025, 13/06/2025–14/06/2025
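Conference presentation
Published 06/03/2019
Acoustical Society of Japan 2019 Spring Meeting, The University of Electro-Communications
With the aim of diagnosing leaks in the pipelines of agricultural irrigation facilities, we examine acoustic data recorded during on-site flow tests. We discuss the characteristics of the various acoustic events and leak sounds observed, and their automatic detection, in comparison with data from an experimental channel.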
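Conference presentation
Published 12/09/2018
Acoustical Society of Japan 2018 Autumn Meeting, Oita University Dannoharu Campus
With the aim of diagnosing leaks in the pipelines of agricultural irrigation facilities, we attempted to detect leak sounds automatically using multi-band acoustic models of leak and non-leak sounds, trained by machine learning on acoustic data recorded in an experimental channel.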
Conference presentation
Automatic detection of water leakage sound in a water pipeline from underwater acoustic data
Published 13/03/2018
Acoustical Society of Japan 2018 Spring Meeting, Nippon Institute of Technology Miyashiro Campus
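Conference presentation
A study of continuous language models for language identification based on articulatory class posterior probabilities
Published 13/03/2018
Acoustical Society of Japan 2018 Spring Meeting, Nippon Institute of Technology Miyashiro Campus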
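Conference presentation
Language identification using articulatory class posterior probabilities: improved feature extraction using linear discriminant analysis
Published 26/09/2017
Acoustical Society of Japan 2017 Autumn Meeting, Ehime University
Our laboratory has proposed a language identification method that focuses on articulation. In previous work, the recognition rate of articulatory class extraction was roughly 60 to 90%. We therefore applied linear discriminant analysis (LDA) to raise the accuracy of articulatory class extraction, with the aim of improving the language identification rate.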
Conference presentation
Published 11/03/2016
Acoustical Society of Japan 2016 Spring Meeting, Toin University of Yokohama
In most current speech processing techniques, MFCC obtained from the amplitude spectrum and delta-MFCC calculated as the time derivative of MFCC are widely used as acoustic features. However, these features consider neither the frequency derivative of the amplitude spectrum nor the phase information of the speech waveform. The local feature and the group delay spectrum are among the features that previous work claims carry such information useful for speech processing. We therefore examine their effect on speech recognition performance. We conducted phoneme recognition experiments using speaker-dependent phoneme HMMs trained with the local feature, the group delay spectrum, and MFCC under same-speaker, same-gender, and different-gender conditions. The local feature gave the highest recognition rate, while the other features performed better for some phonemes. Likelihood combination of the local feature, group delay spectrum, and MFCC HMMs yielded a better phoneme recognition rate than any of the HMMs used alone. The results suggest that combining the local feature, the group delay spectrum, and MFCC can alleviate degradation in recognition performance.
Conference presentation
Published 03/03/2015
IEICE Technical Report, March 2015 Speech Research Meeting, 南の美ら花ホテルミヤヒラ
In most current speech processing techniques, MFCC obtained from the amplitude spectrum and delta-MFCC calculated as the time derivative of MFCC are widely used as acoustic features. However, these features consider neither the frequency derivative of the amplitude spectrum nor the phase information of the speech waveform. The local feature and the group delay spectrum are among the features that previous work claims carry such information useful for speech processing. We therefore examine their effect on speech recognition performance. We conducted phoneme recognition experiments using speaker-dependent phoneme HMMs trained with the local feature, the group delay spectrum, and MFCC under same-speaker, same-gender, and different-gender conditions. The local feature gave the highest recognition rate, while the other features performed better for some phonemes. Likelihood combination of the local feature, group delay spectrum, and MFCC HMMs yielded a better phoneme recognition rate than any of the HMMs used alone. The results suggest that combining the local feature, the group delay spectrum, and MFCC can alleviate degradation in recognition performance.
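The abstract above does not say how the likelihoods of the three HMM sets were combined. One common approach, shown here only as a minimal sketch, is a weighted sum of per-phoneme log-likelihoods across feature streams followed by an argmax. The function name `combine_and_decide`, the equal weights, and all score values are illustrative assumptions, not the authors' method or data.

```python
def combine_and_decide(stream_scores, weights=None):
    """Combine per-phoneme log-likelihoods from several feature streams.

    stream_scores: {stream_name: {phoneme: log_likelihood}}
    weights: optional {stream_name: weight}; equal weights are assumed if omitted.
    Returns the best phoneme and the combined score table.
    """
    streams = list(stream_scores)
    if weights is None:
        weights = {s: 1.0 / len(streams) for s in streams}
    # Only score phonemes that every stream has a likelihood for.
    phonemes = set.intersection(*(set(stream_scores[s]) for s in streams))
    combined = {
        p: sum(weights[s] * stream_scores[s][p] for s in streams)
        for p in phonemes
    }
    best = max(combined, key=combined.get)
    return best, combined


# Made-up log-likelihoods for one speech segment (hypothetical values).
scores = {
    "mfcc":        {"a": -10.0, "i": -9.0,  "u": -14.0},
    "local":       {"a": -8.0,  "i": -12.0, "u": -13.0},
    "group_delay": {"a": -9.0,  "i": -11.0, "u": -10.0},
}
best, combined = combine_and_decide(scores)
mfcc_only = max(scores["mfcc"], key=scores["mfcc"].get)
print(best, mfcc_only)
```

In this toy case the MFCC stream alone would choose a different phoneme than the combined score, which is the kind of complementarity the abstract reports for some phonemes.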
Conference presentation
Published 03/03/2015
IEICE Technical Report, March 2015 Speech Research Meeting, 南の美ら花ホテルミヤヒラ
We define articulatory classes based on articulatory features, and we use posterior probabilities on the articulatory classes for language identification. The posterior probability on each articulatory class is calculated by an articulatory feature extractor based on GMMs. The posterior probability values of the articulatory classes are concatenated to form a vector at each analysis frame. These vectors are then quantized to yield a VQ code sequence, which is used as the training data for an n-gram language model. The results of a language identification experiment between Japanese and English showed that identification performance changes with codebook size. The method that uses a language-dependent articulatory feature extractor showed an identification rate of 98.1% when the codebook size was 64, and the method that uses a language-independent articulatory feature extractor showed an identification rate of 95.6% when the codebook size was 256.
Conference presentation
Automatic Language Identification Based on Posterior Probability on Articulatory Classes
Published 16/12/2014
IEICE Technical Report, 16th Spoken Language Symposium, Tokyo Institute of Technology Suzukakedai Campus
Extracting features from input speech that are effective in distinguishing the language is a key issue for language identification systems. We use posterior probabilities on articulatory classes as features for language identification. The posterior probability on each articulatory class is calculated by GMMs. Each GMM is trained with MFCC data of speech segments labeled with the phonemes or acoustic events that correspond to the articulatory class. The posterior probability values of the articulatory classes are concatenated to form an articulatory-feature-class-posterior-probability (AFCPP) vector at each analysis frame. These vectors are then quantized to yield a VQ code sequence, which is used as the training data for an n-gram language model. Language identification is performed by selecting the n-gram model that yields the highest likelihood for the AFCPP vector sequence of the input utterance. A language identification experiment between Japanese and English using the present method showed an identification rate of 97.1%.
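The back end of the pipeline described in this abstract (vector quantization of posterior vectors, n-gram modeling of the code sequence, and selection of the highest-likelihood model) can be sketched roughly as follows. This is a toy illustration under several assumptions not stated in the abstract: the GMM posterior step is taken as already done, n is fixed at 2, add-one smoothing is used, and the codebook, training sequences, and names (`quantize`, `BigramModel`, `identify`) are all hypothetical.

```python
import math
from collections import Counter


def quantize(vectors, codebook):
    """Map each posterior vector to the index of its nearest codeword."""
    def nearest(v):
        return min(
            range(len(codebook)),
            key=lambda k: sum((a - b) ** 2 for a, b in zip(v, codebook[k])),
        )
    return [nearest(v) for v in vectors]


class BigramModel:
    """Bigram model over VQ codes with add-one smoothing (an assumption)."""

    def __init__(self, sequences, codebook_size):
        self.size = codebook_size
        self.pair_counts = Counter()
        self.context_counts = Counter()
        for seq in sequences:
            for prev, cur in zip(seq, seq[1:]):
                self.pair_counts[(prev, cur)] += 1
                self.context_counts[prev] += 1

    def log_likelihood(self, seq):
        total = 0.0
        for prev, cur in zip(seq, seq[1:]):
            num = self.pair_counts[(prev, cur)] + 1
            den = self.context_counts[prev] + self.size
            total += math.log(num / den)
        return total


def identify(seq, models):
    """Pick the language whose model gives the highest log-likelihood."""
    return max(models, key=lambda lang: models[lang].log_likelihood(seq))


# Toy two-class posterior vectors and a two-entry codebook (made-up values).
codebook = [(0.9, 0.1), (0.1, 0.9)]
train = {"ja": [[0, 1, 0, 1, 0, 1]], "en": [[0, 0, 0, 1, 1, 1]]}
models = {lang: BigramModel(seqs, len(codebook)) for lang, seqs in train.items()}

frames = [(0.8, 0.2), (0.2, 0.8), (0.7, 0.3), (0.3, 0.7)]
codes = quantize(frames, codebook)
result = identify(codes, models)
print(codes, result)
```

The labels "ja" and "en" here only name the two toy models; the training sequences are invented and carry no information about either language.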