speech recognition 2018
Lattice generation in attention-based speech recognition models
Attention-based neural speech recognition models are frequently decoded with beam search, which produces a tree of hypotheses. In many cases, such as when using external language models, numerous decoding hypotheses need to be considered, requiring large
Advances in automatic speech recognition for child speech using factored time delay neural network
Automatic speech recognition (ASR) has shown huge advances in adult speech; however, when the models are tested on child speech, the performance does not achieve satisfactory word error rates (WER). This is mainly due to the high variance in acoustic features of child
Tower Controller Command Prediction for Future Speech Recognition Applications
Air traffic controllers' (ATCos) workload is often a limiting factor for air traffic capacity. Thus, electronic support systems intend to reduce ATCos' workload. Automatic Speech Recognition (ASR) can extract controller command elements from verbal clearances to
Dialect-Specific Models for Automatic Speech Recognition of African American Vernacular English
African American Vernacular English (AAVE) is a widely spoken dialect of English, yet it is under-represented in major speech corpora. As a result, speakers of this dialect are often misunderstood by NLP applications. This study explores the effect on transcription
Generative Noise Modeling and Channel Simulation for Robust Speech Recognition in Unseen Conditions
Multi-conditioned training is a state-of-the-art approach to achieve robustness in Automatic Speech Recognition (ASR) systems. This approach works well in practice for seen degradation conditions. However, the performance of such systems is still an issue for
Kernel Approximation Methods for Speech Recognition
We study the performance of kernel methods on the acoustic modeling task for automatic speech recognition, and compare their performance to deep neural networks (DNNs). To scale the kernel methods to large data sets, we use the random Fourier feature method of
Automatic recognition of Slovak-English bilingual speech
This article describes the progress of a joint project on Multilingual Automatic Speech Recognition using Deep Neural Networks, in which the Technical University works together with National Taipei University of Technology in Taiwan. During the last year, we managed
Extract, Adapt and Recognize: an End-to-end Neural Network for Corrupted Monaural Speech Recognition
Automatic speech recognition (ASR) in challenging conditions, such as in the presence of interfering speakers or music, remains an unsolved problem. This paper presents Extract, Adapt, and Recognize (EAR), an end-to-end neural network that allows fully learnable
Speech Audio Super-Resolution For Speech Recognition
Automatic bandwidth extension (restoring high-frequency information from low sample rate audio) has a number of applications in speech processing. We introduce an end-to-end deep learning based system for speech bandwidth extension for use in a downstream
Emotion Impacts Speech Recognition Performance
It has been established that the performance of speech recognition systems depends on multiple factors including the lexical content, speaker identity and dialect. Here we use three English datasets of acted emotion to demonstrate that emotional content also impacts the
Advancing sequence-to-sequence based speech recognition
The paper presents our endeavor to improve state-of-the-art speech recognition results using attention-based neural network approaches. Our test focus was LibriSpeech, a well-known, publicly available, large speech corpus, but the methodologies are clearly
Scaling Up Online Speech Recognition Using ConvNets
We design an online end-to-end speech recognition system based on Time-Depth Separable (TDS) convolutions and Connectionist Temporal Classification (CTC). The system has almost three times the throughput of a well-tuned hybrid ASR baseline while also
Life after Speech Recognition: Fuzzing Semantic Misinterpretation for Voice Assistant Applications
Popular Voice Assistant (VA) services such as Amazon Alexa and Google Assistant are now rapidly appifying their platforms to allow more flexible and diverse voice-controlled service experience. However, the ubiquitous deployment of VA devices and the increasing number
Forget a Bit to Learn Better: Soft Forgetting for CTC-based Automatic Speech Recognition
Prior work has shown that connectionist temporal classification (CTC)-based automatic speech recognition systems perform well when using bidirectional long short-term memory (BLSTM) networks unrolled over the whole speech utterance. This is because whole
Jointly Adversarial Enhancement Training for Robust End-to-End Speech Recognition
Recently, the end-to-end system has made significant breakthroughs in the field of speech recognition. However, this single end-to-end architecture is not especially robust to input variations caused by noise and reverberation, resulting in performance degradation
On the choice of modeling unit for sequence-to-sequence speech recognition
In conventional speech recognition, phoneme-based models outperform grapheme-based models for non-phonetic languages such as English. The performance gap between the two typically reduces as the amount of training data is increased. In this work, we examine the
End-to-End Articulatory Attribute Modeling for Low-resource Multilingual Speech Recognition
The end-to-end (E2E) model allows for training of automatic speech recognition (ASR) systems without the hand-designed language-specific pronunciation lexicons. However, constructing the multilingual low-resource E2E ASR system is still challenging due to the
Trainable Dynamic Subsampling for End-to-End Speech Recognition
Jointly optimised attention-based encoder-decoder models have yielded impressive speech recognition results. The recurrent neural network (RNN) encoder is a key component in such models: it learns the hidden representations of the inputs. However, it is difficult for RNNs to
Speech Recognition Systems: A Comprehensive Study of Concepts and Mechanism
Speech recognition systems nowadays use many interdisciplinary technologies, ranging from pattern recognition and signal processing to natural language processing, implemented in a unified statistical framework. Such systems find a wide area of
Improving transformer-based end-to-end speech recognition with connectionist temporal classification and language model integration
The state-of-the-art neural network architecture named Transformer has been used successfully for many sequence-to-sequence transformation tasks. The advantage of this architecture is that it has a fast iteration speed in the training stage because there is no