Repository logo
Communities & Collections
All of DSpace
  • English
  • العربية
  • বাংলা
  • Català
  • Čeština
  • Deutsch
  • Ελληνικά
  • Español
  • Suomi
  • Français
  • Gàidhlig
  • हिंदी
  • Magyar
  • Italiano
  • Қазақ
  • Latviešu
  • Nederlands
  • Polski
  • Português
  • Português do Brasil
  • Srpski (lat)
  • Српски
  • Svenska
  • Türkçe
  • Yкраї́нська
  • Tiếng Việt
Log In
Have you forgotten your password?
  1. Home
  2. Browse by Author

Browsing by Author "Sreenivas, T.V."

Filter results by typing the first few letters
Now showing 1 - 6 of 6
  • Results Per Page
  • Sort Options
  • No Thumbnail Available
    Item
    Comparison of low-dimension speech segment embeddings: Application to speaker diarization
    (2019) Chetupalli, S.R.; Sreenivas, T.V.; Gopalakrishnan, A.
    Segment clustering is a crucial step in unsupervised speaker diarization. Bottom-up approaches, such as, hierarchical agglomerative clustering technique are used traditionally for segment clustering. In this paper, we consider the top-down approach to clustering, in which a speaker sensitive, low-dimensional representation of segments (speaker space) is obtained first, followed by Gaussian mixture model (GMM) based clustering. We explore three methods of obtaining the low dimension segment representation: (i) multi-dimensional scaling (MDS) based on segment to segment stochastic distances; (ii) traditional principal component analysis (PCA), and (iii) factor analysis (i-vectors), of GMM mean super-vectors. We found that, MDS based embeddings result in better representation and hence result in better diarization performance compared to PCA and even i-vector embeddings. � 2019 IEEE.
  • No Thumbnail Available
    Item
    Comparison of low-dimension speech segment embeddings: Application to speaker diarization
    (Institute of Electrical and Electronics Engineers Inc., 2019) Chetupalli, S.R.; Sreenivas, T.V.; Gopalakrishnan, A.
    Segment clustering is a crucial step in unsupervised speaker diarization. Bottom-up approaches, such as, hierarchical agglomerative clustering technique are used traditionally for segment clustering. In this paper, we consider the top-down approach to clustering, in which a speaker sensitive, low-dimensional representation of segments (speaker space) is obtained first, followed by Gaussian mixture model (GMM) based clustering. We explore three methods of obtaining the low dimension segment representation: (i) multi-dimensional scaling (MDS) based on segment to segment stochastic distances; (ii) traditional principal component analysis (PCA), and (iii) factor analysis (i-vectors), of GMM mean super-vectors. We found that, MDS based embeddings result in better representation and hence result in better diarization performance compared to PCA and even i-vector embeddings. © 2019 IEEE.
  • No Thumbnail Available
    Item
    Feature selection and model optimization for semi-supervised speaker spotting
    (2016) Chetupalli, S.R.; Gopalakrishnan, A.; Sreenivas, T.V.
    We explore, experimentally, feature selection and optimization of stochastic model parameters for the problem of speaker spotting. Based on an initially identified segment of speech of a speaker, an iterative model refinement method is developed along with a latent variable mixture model so that segments of the same speaker are identified in a long speech record. It is found that a GMM with moderate number of mixtures is better suited for the task than a large number mixture model as used in speaker identification. Similarly, a PCA based low-dimensional projection of MFCC based feature vector provides better performance. We show that about 6 seconds of initially identified speaker data is sufficient to achieve > 90% performance of speaker segment identification. � 2016 IEEE.
  • No Thumbnail Available
    Item
    Feature selection and model optimization for semi-supervised speaker spotting
    (European Signal Processing Conference, EUSIPCO, 2016) Chetupalli, S.R.; Gopalakrishnan, A.; Sreenivas, T.V.
    We explore, experimentally, feature selection and optimization of stochastic model parameters for the problem of speaker spotting. Based on an initially identified segment of speech of a speaker, an iterative model refinement method is developed along with a latent variable mixture model so that segments of the same speaker are identified in a long speech record. It is found that a GMM with moderate number of mixtures is better suited for the task than a large number mixture model as used in speaker identification. Similarly, a PCA based low-dimensional projection of MFCC based feature vector provides better performance. We show that about 6 seconds of initially identified speaker data is sufficient to achieve > 90% performance of speaker segment identification. © 2016 IEEE.
  • Thumbnail Image
    Item
    Low complexity bit allocation algorithms for MP3/AAC encoding
    (2008) Nithin, S.; Suresh, K.; Sreenivas, T.V.
    We have developed two reduced complexity bit-allocation algorithms for MP3/AAC based audio encoding, which can be useful at low bit-rates. One algorithm derives optimum bit-allocation using constrained optimization of weighted noise-to-mask ratio and the second algorithm uses decoupled iterations for distortion control and rate control, with convergence criteria. MUSHRA based evaluation indicated that the new algorithm would be comparable to AAC but requiring only about 1/10 th the complexity.
  • No Thumbnail Available
    Item
    Low complexity bit allocation algorithms for MP3/AAC encoding
    (2008) Nithin, S.; Suresh, K.; Sreenivas, T.V.
    We have developed two reduced complexity bit-allocation algorithms for MP3/AAC based audio encoding, which can be useful at low bit-rates. One algorithm derives optimum bit-allocation using constrained optimization of weighted noise-to-mask ratio and the second algorithm uses decoupled iterations for distortion control and rate control, with convergence criteria. MUSHRA based evaluation indicated that the new algorithm would be comparable to AAC but requiring only about 1/10 th the complexity.

Maintained by Central Library NITK | DSpace software copyright © 2002-2026 LYRASIS

  • Privacy policy
  • End User Agreement
  • Send Feedback
Repository logo COAR Notify