Excellence in Research and Innovation for Humanity

Keisuke Nakamura

Publications

2

Publications

2
16406
Automatic Distance Compensation for Robust Voice-based Human-Computer Interaction
Abstract:

Distant-talking voice-based HCI system suffers from performance degradation due to mismatch between the acoustic speech (runtime) and the acoustic model (training). Mismatch is caused by the change in the power of the speech signal as observed at the microphones. This change is greatly influenced by the change in distance, affecting speech dynamics inside the room before reaching the microphones. Moreover, as the speech signal is reflected, its acoustical characteristic is also altered by the room properties. In general, power mismatch due to distance is a complex problem. This paper presents a novel approach in dealing with distance-induced mismatch by intelligently sensing instantaneous voice power variation and compensating model parameters. First, the distant-talking speech signal is processed through microphone array processing, and the corresponding distance information is extracted. Distance-sensitive Gaussian Mixture Models (GMMs), pre-trained to capture both speech power and room property are used to predict the optimal distance of the speech source. Consequently, pre-computed statistic priors corresponding to the optimal distance is selected to correct the statistics of the generic model which was frozen during training. Thus, model combinatorics are post-conditioned to match the power of instantaneous speech acoustics at runtime. This results to an improved likelihood in predicting the correct speech command at farther distances. We experiment using real data recorded inside two rooms. Experimental evaluation shows voice recognition performance using our method is more robust to the change in distance compared to the conventional approach. In our experiment, under the most acoustically challenging environment (i.e., Room 2: 2.5 meters), our method achieved 24.2% improvement in recognition performance against the best-performing conventional method.

Keywords:
Human Machine Interaction, Human Computer Interaction, Voice Recognition, Acoustic Model Compensation, Acoustic Speech Enhancement.
1
13786
Extending E-learning systems based on Clause-Rule model
Abstract:
E-Learning systems are used by many learners and teachers. The developer is developing the e-Learning system. However, the developer cannot do system construction to satisfy all of users- demands. We discuss a method of constructing e-Learning systems where learners and teachers can design, try to use, and share extending system functions that they want to use; which may be nally added to the system by system managers.
Keywords:
Clause-Rule-Model, database-access, e-Learning,Web-Application.