Conference Papers
Permanent URI for this collection: https://idr.nitk.ac.in/handle/123456789/28506
3 results
Search Results
Item Estimating multiple physical parameters from speech data
(IEEE Computer Society, 2016) Kalluri, S.B.; Vijayakumar, A.; Vijayasenan, D.; Singh, R.
In this work, we explore prediction of different physical parameters from speech data. We aim to predict shoulder size and waist size of people from speech data in addition to the conventional height and weight parameters. A dataset with this information is created from 207 volunteers. A bag of words representation based on log magnitude spectrum is used as features. A support vector regression predicts the physical parameters from the bag of words representation. The system achieves a root mean square error of 6.6 cm for height estimation, 2.6 cm for shoulder size, 7.1 cm for waist size and 8.9 kg for weight estimation. The results of height estimation are on par with state-of-the-art results. © 2016 IEEE.

Item Robust features for automatic estimation of physical parameters from speech
(Institute of Electrical and Electronics Engineers Inc., 2017) Kalluri, K.S.; Vijayasenan, D.
Estimating a speaker's physical parameters such as height, weight and shoulder size can assist in voice forensics by providing additional knowledge about the speaker. In this work, statistics of the components of a background GMM are employed as features for estimating the physical parameters. These features improved the performance of height and shoulder size estimation compared to our earlier attempt based on a Bag of Words representation. The robustness of the features is validated using two different training subsets containing different languages. © 2017 IEEE.

Item Study of Wireless Channel Effects on Audio Forensics
(Institute of Electrical and Electronics Engineers Inc., 2018) Vijayasenan, D.; Kalluri, S.B.; Sreekanth, K.; Issac, A.
In this work, we study the effect of a wireless channel on physical parameter prediction based on speech data. Speech data from 207 speakers, along with each speaker's height and weight, is collected. A three-path Rayleigh fading channel with typical values of Doppler shift, path gain and path delay is used to create the mobile channel output audio. A Bag of Words (BoW) representation based on log magnitude spectrum is used as features. Support Vector Regression (SVR) predicts the physical parameters of the speaker from the BoW representation. The proposed system achieves a Root Mean Square Error (RMSE) of 6.6 cm for height estimation and 8.9 kg for weight estimation for clean speech. The Rayleigh channel increases the RMSE values to 8.17 cm and 11.84 kg for height and weight, respectively. © 2016 IEEE.
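
The error figures quoted in these abstracts are root mean square errors (RMSE) between predicted and true values. As a minimal sketch of how such a figure is computed, the snippet below defines a hypothetical `rmse` helper and applies it to invented height values; neither the function name nor the numbers come from the papers.

```python
import math


def rmse(predicted, actual):
    """Root mean square error between two equal-length sequences."""
    if len(predicted) != len(actual) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Square each prediction error, average the squares, then take the root.
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical heights in cm, invented for illustration only.
true_heights = [170.0, 165.0, 180.0, 158.0]
predicted_heights = [174.0, 162.0, 175.0, 160.0]

print(f"Height RMSE: {rmse(predicted_heights, true_heights):.2f} cm")
```

Because the errors are squared before averaging, a few large misses raise the RMSE more than many small ones, which is worth keeping in mind when comparing the clean-speech and Rayleigh-channel figures.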
