A Less Invasive and Computationally Efficient Silent Speech Interface Using Facial Electromyography

Date

2023

Publisher

National Institute of Technology Karnataka, Surathkal

Abstract

Silent Speech Interface (SSI) is a promising area of Human Computer Interaction (HCI) research. Surface Electromyography (SEMG) based SSI is a technique in which the electrical activity of facial muscles is used to detect speech. Existing SSI techniques rely on computationally expensive methods and complex machine learning algorithms to identify silently uttered speech. This computational expense prevents real-time implementation of SSI models, especially in cost-efficient applications such as communicative assistance for laryngectomy patients. The objective of this research work is therefore to develop a less complex and computationally less expensive SEMG based SSI model with superior accuracy. To achieve this goal, many feature extraction methods are investigated for their suitability for SEMG based SSI. Detrended Fluctuation Analysis (DFA) is found to be promising for the recognition of silent speech using SEMG. Computationally less expensive classification algorithms are also sought in order to develop a simpler and faster SSI model. The research identified K Nearest Neighbors and Decision Trees as suitable pattern recognition algorithms for this work. The number of channels used in SEMG based SSI is another important concern. A state-of-the-art model uses seven channels of SEMG data for the recognition of silent speech. Because some unipolar electrodes are used along with the bipolar ones, the number of electrodes to be accommodated on the face usually ranges from eight to twelve. For practical applications this is a high number, especially given the medical conditions faced by laryngectomy patients. Too many electrodes on the face are inconvenient for users who have undergone laryngectomy. They can hinder facial movement and can also contribute to cross talk between different facial muscles.
Reducing the number of channels is therefore necessary and is included as an important objective of this research work. The effectiveness of DFA for channel reduction is investigated thoroughly. The DFA-based analysis is also compared with channel reduction performed on models that employ existing state-of-the-art methods. The availability of reliable data is vital for fruitful research, but for SEMG based SSI, data availability is a major concern. Very few reliable data sets (with sufficient vocabulary) are available for SEMG based research. This is primarily due to the popular research orientation towards acoustic speech recognition. The creation of an extensive database is thus a promising direction, and taking the initial steps towards it is also an important goal of this research work. Hardware purchase and assembly, drafting of a detailed data acquisition methodology, and a sample data collection are done as part of the work.
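
For readers unfamiliar with Detrended Fluctuation Analysis, the following is a minimal sketch of the textbook DFA procedure, written with only the Python standard library. It is not taken from the thesis: the function names (linear_fit, dfa_exponent), the choice of first-order (linear) detrending, and the window sizes are illustrative assumptions, and the abstract does not state how the DFA output is turned into features for the classifiers.

```python
import math


def linear_fit(xs, ys):
    """Least-squares slope and intercept of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def dfa_exponent(signal, window_sizes):
    """Estimate the DFA scaling exponent of a 1-D signal (illustrative sketch)."""
    mean = sum(signal) / len(signal)

    # Step 1: integrate the mean-removed signal to obtain the profile.
    profile = []
    total = 0.0
    for value in signal:
        total += value - mean
        profile.append(total)

    log_sizes, log_fluct = [], []
    for n in window_sizes:
        t = list(range(n))
        squared_error = 0.0
        count = 0
        # Step 2: remove a linear trend from each non-overlapping window.
        for start in range(0, len(profile) - n + 1, n):
            window = profile[start:start + n]
            slope, intercept = linear_fit(t, window)
            squared_error += sum(
                (y - (slope * k + intercept)) ** 2 for k, y in zip(t, window)
            )
            count += n
        # Step 3: root-mean-square fluctuation F(n) at this window size.
        log_sizes.append(math.log(n))
        log_fluct.append(math.log(math.sqrt(squared_error / count)))

    # Step 4: the exponent is the slope of log F(n) against log n.
    alpha, _ = linear_fit(log_sizes, log_fluct)
    return alpha
```

Under these assumptions, one exponent per SEMG channel could serve as a compact feature for a lightweight classifier such as K Nearest Neighbors, but that pairing is a guess at the general idea rather than a description of the method used in the thesis.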

Keywords

SEMG, Human-Machine Interactions, Channel Reduction
