A Less Invasive and Computationally Efficient Silent Speech Interface Using Facial Electromyography
Date
2023
Publisher
National Institute of Technology Karnataka, Surathkal
Abstract
Silent Speech Interface (SSI) is one of the promising areas of Human Computer
Interaction (HCI) research. Surface Electromyography (SEMG) based SSI is
a technique in which the electrical activity of facial muscles is used to detect speech.
The existing SSI techniques use computationally expensive methods and complex
machine learning algorithms for the identification of silently uttered speech. The
increased computational expense prevents real-time implementation of SSI models,
especially in low-cost applications such as communicative assistance for laryngectomy
patients. Thus the objective of this research work is to develop a less
complex and computationally less expensive SEMG based SSI model with superior
accuracy. To achieve this goal, several feature extraction
methods are investigated for their suitability for SEMG based SSI. Detrended Fluctuation
Analysis (DFA) is found to be promising for the recognition of silent speech using
SEMG. The use of computationally less expensive classification algorithms was
envisioned in this research work to develop a simpler and faster SSI model. The
research identified K Nearest Neighbors and Decision Trees as suitable pattern
recognition algorithms for this work.
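
As a rough illustration of the feature named above, the following is a minimal sketch of first-order Detrended Fluctuation Analysis on a single channel. The abstract does not state the DFA order, the window sizes, or how the resulting values are turned into a feature vector, so the function name `dfa_exponent`, the default `window_sizes`, and the use of the scaling exponent alone as the feature are all assumptions made for illustration, not the method of the thesis.

```python
import math
import random


def _linear_fit(xs, ys):
    """Return (slope, intercept) of the least-squares line of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def dfa_exponent(signal, window_sizes=(8, 16, 32, 64, 128)):
    """Estimate the DFA scaling exponent of a 1-D signal (hypothetical helper).

    window_sizes is an assumed default, not a value taken from the thesis.
    """
    # Step 1: integrate the mean-removed signal to obtain the profile.
    mean = sum(signal) / len(signal)
    profile, running = [], 0.0
    for value in signal:
        running += value - mean
        profile.append(running)

    log_n, log_f = [], []
    for n in window_sizes:
        n_windows = len(profile) // n
        if n_windows < 2:
            continue  # too few windows for a stable estimate at this scale
        t = list(range(n))
        squared_residuals = 0.0
        # Step 2: remove a straight-line trend from each non-overlapping window.
        for w in range(n_windows):
            segment = profile[w * n:(w + 1) * n]
            slope, intercept = _linear_fit(t, segment)
            squared_residuals += sum(
                (y - (slope * i + intercept)) ** 2 for i, y in zip(t, segment)
            )
        # Step 3: root-mean-square fluctuation at this window size.
        fluctuation = math.sqrt(squared_residuals / (n_windows * n))
        if fluctuation > 0.0:
            log_n.append(math.log(n))
            log_f.append(math.log(fluctuation))

    if len(log_n) < 2:
        raise ValueError("signal too short or too flat for the given window sizes")
    # Step 4: the exponent is the slope of log F(n) against log n.
    alpha, _ = _linear_fit(log_n, log_f)
    return alpha


if __name__ == "__main__":
    # Synthetic stand-in for one SEMG channel; not real recorded data.
    rng = random.Random(0)
    channel = [rng.gauss(0.0, 1.0) for _ in range(2000)]
    print("DFA exponent:", dfa_exponent(channel))
```

In a multi-channel setting, one such value per channel could be stacked into a short feature vector and passed to a lightweight classifier such as K Nearest Neighbors; that pairing is only a plausible reading of the abstract.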
The number of channels associated with SEMG based SSI is also an
important concern. A state-of-the-art model uses seven channels of SEMG data for
the recognition of silent speech. Considering the use of some unipolar electrodes
along with the bipolar ones, the number of electrodes to be accommodated on the
face usually ranges from eight to twelve. For practical applications this is a high
number, especially given the medical condition of laryngectomy patients.
Too many electrodes on the face inconvenience
users who have undergone laryngectomy. They can hinder facial movement and can
also contribute to cross talk between different facial muscles.
Thus a reduction in the number of channels is necessary, and it is included as
an important objective of this research work. The effectiveness of using DFA for
successful channel reduction is investigated thoroughly. The analysis using DFA is
also compared with channel reduction performed on models that employ existing
state-of-the-art methods.
The availability of reliable data is vital for every researcher to carry out fruitful
research. For SEMG based SSI, however, data availability is a
major concern. There are very few reliable data sets (with sufficient vocabulary)
available for SEMG based research. This is primarily because research has been
oriented mostly towards acoustic speech recognition. Thus the creation of an extensive
database is worth pursuing, and the initial steps towards it are
also an important goal of this research work. Hardware purchase
and assembly, drafting of a detailed data acquisition methodology, and a sample
data collection are carried out as part of the work.
Keywords
SEMG, Human-Machine Interactions, Channel Reduction
