Faculty Publications

Permanent URI for this community: https://idr.nitk.ac.in/handle/123456789/18736

Publications by NITK Faculty

Browse

Search Results

Now showing 1 - 6 of 6

Noise measurement in a mechanized opencast bauxite mine: A case study
(Multi-Science Publishing Co. Ltd, 2015) Tripathy, D.P.; Rao, D.S.
Occupational noise is one of the most common pollutants in the mining sector. Prolonged exposure of miners to high levels of noise can cause noise-induced hearing loss besides several non-auditory health effects. As these effects have a significant impact on people's health, it is essential to assess the noise exposure and to adopt ameliorative solutions. This work has been carried out to study the noise levels of different machines in a highly mechanized opencast bauxite mine of Odisha, India. A number of measurements have been carried out using an Extech 1/3 octave band analyser and sound level meter to collect data where the miners are exposed to noise during different occupational tasks. Noise measurements were carried out at the work places of 10 different machinery sources in the bauxite mine. The average noise levels generated by the different machines were compared with the permissible exposure levels for workers as per OSHA guidelines. Further, noise contour maps were plotted to show the noise levels at each work place.
Accurate estimation of decay coefficients for dynamic range compressors in hearing aids and a hardware level comparison of different architectures
(Elsevier B.V., 2020) Deepu, S.P.; Ramesh Kini, M.R.; Sumam David, S.S.
The Dynamic Range Compression (DRC) algorithm helps to protect the residual hearing ability of hearing aid users by compressing the signal levels which go above a particular threshold. This paper addresses two different aspects of DRC for hearing aid applications. In the first part, methods are proposed to accurately estimate the decay coefficients corresponding to the required time constants for a feed-forward DRC architecture, so as to meet the hearing aid specifications. The effect of compression on the attack and release time parameters is compensated with the new formula. The hardware implementation of four different DRC architectures is explained in the second part of the paper. The estimated decay coefficients for a test signal were used for the corresponding hardware implementations, and the validity of the proposed algorithmic modifications was verified. The architectures were implemented using UMC 65 nm standard cell libraries and the power and error results were compared. The proposed methods to estimate the decay coefficients for both attack and release phases show close to 0 dB error from expected output values, while conventional methods do not meet the specifications. Hardware implementation shows that there is not much improvement in power performance between a lower resolution Look-Up Table (LUT) based logarithm implementation and a higher resolution one. From the results, we propose using the absolute level detector based DRC with higher resolution logarithm without a gain smoothing stage at the output for lowest power consumption and better approximation error performance. © 2020 Elsevier B.V.
Design and implementation of a signal processing ASIC for digital hearing aids
(Elsevier B.V., 2022) Deepu, D.; Ramesh Kini, R.K.; Sumam David, S.
People with hearing loss can benefit from assistive devices like hearing aids. This article presents the implementation of a signal processing chip for digital hearing aid applications. The functionality of the proposed design was tested in real-time using two field programmable gate arrays (FPGAs), one of them modeled as a hearing aid processor and the other as an external audio CODEC. The hearing aid processor contains an 18-band 1/3-octave ANSI S1.11 filter bank, which performs the audiogram compensation, and a dynamic range compression algorithm to restrict the output signal to an acceptable loudness. The functionality of an external audio CODEC was replicated in the other FPGA to act as the analog front end circuit of a hearing aid. Serial Peripheral Interface (SPI) was used for communication between the two FPGAs. The SPI protocol was modified to make the hearing aid programmable through the data-in line of the interface itself. The proposed hearing aid chip was implemented using a standard cell based design flow with a 5x5 mm fixed die size intended to fit in a 48-pin package. © 2022 Elsevier B.V.
An Improved Noise Reduction Technique for Enhancing the Intelligibility of Sinewave Vocoded Speech: Implication in Cochlear Implants
(Institute of Electrical and Electronics Engineers Inc., 2023) Poluboina, V.; Pulikala, A.; Pitchaimuthu, A.N.P.
A cochlear implant (CI) is the most suitable option for individuals with severe-to-profound hearing loss. A CI restores audibility to near perfection and offers good speech understanding in quiet. However, speech perception in noise with CIs is less than optimal, as most speech coding strategies of CIs encode only the temporal envelope. Besides, the current CI signal coding strategies lack sophisticated pre-processing. In the current study, we proposed a novel pre-processing method to improve speech intelligibility in noise and tested it using acoustic simulations of cochlear implants. The proposed noise reduction technique aims to minimize the mean square error (MSE) between the temporal envelopes of the enhanced speech and the corresponding clean speech. Therefore, the proposed method will be suitable for CI applications. This paper provides an analysis of the theoretical derivation of the noise suppression function and also the performance evaluation using objective and subjective tests. The effectiveness of the proposed method was objectively evaluated using the SRMR-CI and ESTOI. Additionally, speech recognition through the acoustic simulations of the cochlear implant was done for the subjective evaluation. Performance of the proposed method was compared with the Wiener filter (WF) and sigmoidal functions. The sinewave vocoder was used to simulate the cochlear implant perception. Both objective and subjective scores revealed that the performance of the proposed technique is superior to the WF and sigmoidal function. © 2013 IEEE.
Video Captioning using Sentence Vector-enabled Convolutional Framework with Short-Connected LSTM
(Springer, 2024) Naik, D.; Jaidhar, C.D.
The principal objective of video/image captioning is to portray the dynamics of a video clip in plain natural language. Captioning is motivated by its ability to make the video more accessible to deaf and hard-of-hearing individuals, to help people focus on and recall information more readily, and to let them watch it in sound-sensitive locations. The most frequently utilized design paradigm is the revolutionary structurally improved encoder-decoder configuration. Recent developments emphasize the utilization of various creative structural modifications to maximize efficiency while demonstrating their viability in real-world applications. The utilization of well-known and well-researched technological advancements such as deep Convolutional Neural Networks (CNNs) and Sentence Transformers is trending in encoder-decoders. This paper proposes an approach for efficiently captioning videos using CNN and a short-connected LSTM-based encoder-decoder model blended with a sentence context vector. This sentence context vector emphasizes the relationship between the video and text spaces. Inspired by the human visual system, the attention mechanism is utilized to selectively concentrate on the context of the important frames. Also, a contextual hybrid embedding block is presented for connecting the two vector spaces generated during the encoding and decoding stages. The proposed architecture is investigated through well-known CNN architectures and various word embeddings. It is assessed using two benchmark video captioning datasets, MSVD and MSR-VTT, considering standard evaluation metrics such as BLEU, METEOR, ROUGE, and CIDEr. In the experiments, when the proposed model with NASNet-Large alone is compared across all three embeddings, the BERT results on the MSVD dataset were better than the results obtained with the other two embeddings. Inception-v4 outperformed VGG-16, ResNet-152, and NASNet-Large for feature extraction. Considering word embeddings, BERT is far superior to ELMo and GloVe on the MSR-VTT dataset. © 2023, The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature.
Rare Sound Event Detection Using Multi-resolution Cochleagram Features and CRNN with Attention Mechanism
(Birkhauser, 2025) Pandey, G.; Koolagudi, S.G.
Acoustic event detection (AED) or sound event detection (SED) is a problem that focuses on automatically detecting acoustic events in an audio recording along with their onset and offset times. Rare acoustic event detection in AED is a challenging problem. Rare AED aims to detect rare but significant sound events in an audio signal. Traditional methods used for SED often struggle to accurately detect rare sound events due to their infrequent occurrence and diverse characteristics. This paper introduces novel features named multi-resolution cochleagrams (MRCGs) for rare SED tasks. Cochleagrams with different resolutions are extracted from the audio recording and stacked to get the MRCG feature vector. The equivalent rectangular bandwidth (ERB) scale used in the cochleagram simulates the human auditory filter. The classifier used is a convolutional recurrent neural network (CRNN) embedded with an attention module. This work considers the Task 2 DCASE 2017 dataset for detecting rare sound events. Results show that the proposed combination of MRCG and CRNN with attention improves the performance. The proposed method achieved an average error rate of 0.11 and an average F1 score of 94.3%. © The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2025.
