speech recognition 2018



Lattice generation in attention-based speech recognition models

Attention-based neural speech recognition models are frequently decoded with beam search, which produces a tree of hypotheses. In many cases, such as when using external language models, numerous decoding hypotheses need to be considered, requiring large

Advances in automatic speech recognition for child speech using factored time delay neural network

Automatic speech recognition (ASR) has shown huge advances in adult speech; however, when the models are tested on child speech, the performance does not achieve satisfactory word error rates (WER). This is mainly due to the high variance in acoustic features of child

Tower Controller Command Prediction for Future Speech Recognition Applications

Air traffic controllers' (ATCos) workload is often a limiting factor for air traffic capacity. Thus, electronic support systems intend to reduce ATCos' workload. Automatic Speech Recognition (ASR) can extract controller command elements from verbal clearances to

Dialect-Specific Models for Automatic Speech Recognition of African American Vernacular English

African American Vernacular English (AAVE) is a widely-spoken dialect of English, yet it is under-represented in major speech corpora. As a result, speakers of this dialect are often misunderstood by NLP applications. This study explores the effect on transcription

Generative Noise Modeling and Channel Simulation for Robust Speech Recognition in Unseen Conditions

Multi-conditioned training is a state-of-the-art approach to achieve robustness in Automatic Speech Recognition (ASR) systems. This approach works well in practice for seen degradation conditions. However, the performance of such systems is still an issue for

Kernel Approximation Methods for Speech Recognition

We study the performance of kernel methods on the acoustic modeling task for automatic speech recognition, and compare their performance to deep neural networks (DNNs). To scale the kernel methods to large data sets, we use the random Fourier feature method of

Automatic recognition of Slovak-English bilingual speech

This article describes the progress of a joint project on Multilingual Automatic Speech Recognition using Deep Neural Networks, in which the Technical University works together with National Taipei University of Technology in Taiwan. During the last year, we managed

Extract, Adapt and Recognize: an End-to-end Neural Network for Corrupted Monaural Speech Recognition

Automatic speech recognition (ASR) in challenging conditions, such as in the presence of interfering speakers or music, remains an unsolved problem. This paper presents Extract, Adapt, and Recognize (EAR), an end-to-end neural network that allows fully learnable

Speech Audio Super-Resolution For Speech Recognition

Automatic bandwidth extension (restoring high-frequency information from low sample rate audio) has a number of applications in speech processing. We introduce an end-to-end deep learning based system for speech bandwidth extension for use in a downstream

Emotion Impacts Speech Recognition Performance

It has been established that the performance of speech recognition systems depends on multiple factors including the lexical content, speaker identity and dialect. Here we use three English datasets of acted emotion to demonstrate that emotional content also impacts the

Advancing sequence-to-sequence based speech recognition

The paper presents our endeavor to improve state-of-the-art speech recognition results using attention-based neural network approaches. Our test focus was LibriSpeech, a well-known, publicly available, large speech corpus, but the methodologies are clearly

Scaling Up Online Speech Recognition Using ConvNets

We design an online end-to-end speech recognition system based on Time-Depth Separable (TDS) convolutions and Connectionist Temporal Classification (CTC). The system has almost three times the throughput of a well-tuned hybrid ASR baseline while also

Life after Speech Recognition: Fuzzing Semantic Misinterpretation for Voice Assistant Applications

Popular Voice Assistant (VA) services such as Amazon Alexa and Google Assistant are now rapidly appifying their platforms to allow more flexible and diverse voice-controlled service experience. However, the ubiquitous deployment of VA devices and the increasing number

Forget a Bit to Learn Better: Soft Forgetting for CTC-based Automatic Speech Recognition

Prior work has shown that connectionist temporal classification (CTC)-based automatic speech recognition systems perform well when using bidirectional long short-term memory (BLSTM) networks unrolled over the whole speech utterance. This is because whole

Jointly Adversarial Enhancement Training for Robust End-to-End Speech Recognition

Recently, the end-to-end system has made significant breakthroughs in the field of speech recognition. However, this single end-to-end architecture is not especially robust to input variations caused by noise and reverberation, resulting in performance degradation

On the choice of modeling unit for sequence-to-sequence speech recognition

In conventional speech recognition, phoneme-based models outperform grapheme-based models for non-phonetic languages such as English. The performance gap between the two typically reduces as the amount of training data is increased. In this work, we examine the

End-to-End Articulatory Attribute Modeling for Low-resource Multilingual Speech Recognition

The end-to-end (E2E) model allows for training of automatic speech recognition (ASR) systems without the hand-designed language-specific pronunciation lexicons. However, constructing the multilingual low-resource E2E ASR system is still challenging due to the

Trainable Dynamic Subsampling for End-to-End Speech Recognition

Jointly optimised attention-based encoder-decoder models have yielded impressive speech recognition results. The recurrent neural network (RNN) encoder is a key component in such models; it learns the hidden representations of the inputs. However, it is difficult for RNNs to

Speech Recognition Systems: A Comprehensive Study of Concepts and Mechanism

Speech recognition systems nowadays use many interdisciplinary technologies, ranging from pattern recognition and signal processing to natural language processing, implemented in a unified statistical framework. Such systems find a wide area of

Improving transformer-based end-to-end speech recognition with connectionist temporal classification and language model integration

The state-of-the-art neural network architecture named Transformer has been used successfully for many sequence-to-sequence transformation tasks. The advantage of this architecture is that it has a fast iteration speed in the training stage because there is no

