ASR: Automatic Speech Recognition
Automatic Speech Recognition (ASR) is the process of deriving the transcription (word sequence) of an utterance, given the speech waveform. Speech understanding goes one step further, and gleans the meaning of the utterance in order to carry out the speaker’s command.
History of modulation spectrum in ASR
Most natural signals change over time. A dominant source of change in speech signals is the changing shape of the vocal tract, which enhances and attenuates individual spectral components of the spectral envelope of speech. Most of the phonetic information is carried in these changes. Spectral components
Improving ASR Robustness to Perturbed Speech Using Cycle-consistent Generative Adversarial Networks
Naturally introduced perturbations in the audio signal, caused by the emotional and physical states of the speaker, can significantly degrade the performance of Automatic Speech Recognition (ASR) systems. In this paper, we propose a front-end based on Cycle-Consistent Generative
On the use of artificial reverberation for ASR in highly reverberant environments
In this paper, we discuss the use of artificial room reverberation methods to increase the performance of automatic speech recognition (ASR) systems in highly reverberant enclosures. Our approach consists of training acoustic models on artificially reverberated
Temporal signal processing for ASR
For decades, speech recognition systems have used pattern recognition techniques to identify lexical items from a sequence of short-term spectra or cepstra, often with some additional linear or nonlinear processing of these features. This basic scheme
Efficient assessment of ASR Systems by Using Subsets of a Test Database
In this paper, assessment of ASR systems with a limited set of speech data selected from a larger testing corpus was studied for connected Dutch digits. Three methods of data selection were applied, namely random, knowledge-based, and data-driven selection. The
Auditory effects for ASR
I briefly (and informally) summarize some of the kinds of auditory or psychoacoustic effects that one might want to try to exploit in automatic speech recognition (ASR). It's an admittedly Lyon-centric view, attempting to justify all this auditory modeling stuff I do in terms of its
A comparison of Transformer and LSTM encoder decoder models for ASR
We present competitive results using a Transformer encoder-decoder-attention model for end-to-end speech recognition needing less training time compared to a similarly performing LSTM model. We observe that the Transformer training is in general more stable compared
Blind deconvolution for multi-microphone speech dereverberation: Application to ASR in reverberant environments
In this paper, a deterministic time-domain algorithm for multichannel blind deconvolution is presented. The proposed algorithm assumes that a source signal is measured by several sensors after propagating through finite impulse response channels and being corrupted by
Why is ASR harder for fast speech and what can we do about it
It has been observed in various NIST evaluations (e.g., WSJ-Nov93, RM-Sep92) that ASR systems typically have about 2-3 times higher word error rates on very fast speakers [3]. This observation naturally inspires the following question: why do ASR
Purely sequence-trained neural networks for ASR based on lattice-free MMI.
In this paper we describe a method to perform sequence-discriminative training of neural network acoustic models without the need for frame-level cross-entropy pre-training. We use the lattice-free version of the maximum mutual information (MMI) criterion: LF-MMI. To make
Robust ASR based on clean speech models: An evaluation of missing data techniques for connected digit recognition in noise
In this study, techniques for classification with missing or unreliable data are applied to the problem of noise-robustness in Automatic Speech Recognition (ASR). The techniques described make minimal assumptions about any noise background and rely instead on what
ASR corrective feedback on pronunciation: Does it really work
We studied a group of immigrants who were following regular, teacher-fronted Dutch classes, and who were assigned to three groups using either a) Dutch CAPT, an ASR-based Computer Assisted Pronunciation Training (CAPT) system that provides feedback on a
ASR dependent techniques for speaker recognition
This thesis is concerned with improving the performance of speaker recognition systems in three areas: speaker modeling, verification score computation, and feature extraction in telephone quality speech. We first seek to improve upon traditional modeling approaches for
Which ASR should I choose for my dialogue system
We present an analysis of several publicly available automatic speech recognizers (ASRs) in terms of their suitability for use in different types of dialogue systems. We focus in particular on cloud-based ASRs that have recently become available to the community. We
Automotive shredder residue (ASR) and compact disc (CD) waste: options for recovery of materials and energy
Two types of solid waste streams that will be rapidly increasing in the near future, requiring more processing capacity, are automotive shredder residue (ASR, in Finnish: autopaloittamojäte) and waste compact discs (CDs). Both contain large fractions of polymers
Modeling pronunciation variation for ASR: Overview and comparison of methods
In this contribution an overview is provided of the papers presented at this workshop. First, the most important characteristics that distinguish the various studies on pronunciation variation modeling are discussed. Subsequently, the issues of evaluation and comparison
Elicited Imitation as an Oral Proficiency Measure with ASR Scoring.
This paper discusses development and evaluation of a practical, valid and reliable instrument for evaluating the spoken language abilities of second-language (L2) learners of English. First we sketch the theory and history behind elicited imitation (EI) tests and the
Effect of alkali silica reaction (ASR) in geopolymer concrete
Alkali silica reaction (ASR) occurs due to a chemical reaction between hydroxyl ions in the pore water within the concrete matrix and certain forms of silica. This reaction could lead to strength loss, cracking, volume expansion and potentially failure of the structure. This
Comparing ASR modeling methods for spoken dialogue simulation and optimal strategy learning
Speech enabled interfaces are nowadays becoming ubiquitous. The most advanced ones rely on probabilistic pattern matching systems and especially on automatic speech recognition systems. Because of their statistical nature, performances of such systems never
Far-Field ASR Without Parallel Data.
In far-field speech recognition systems, training acoustic models with alignments generated from parallel close-talk microphone data provides significant improvements. However, it is not practical to assume the availability of large corpora of parallel close-talk microphone
Ratings of relations between DSM-IV diagnostic categories and items of the Adult Self-Report (ASR) and Adult Behavior Checklist (ABCL)
This project was designed to: (a) construct DSM-oriented scales comprising ASR and ABCL items that mental health professionals rated as very consistent with DSM-IV categories; and (b) identify items that clinicians may be particularly concerned about ("critical items")
New Nonsense Syllables Database–Analyses and Preliminary ASR Experiments
In the first half of the 20th century, a series of experiments on human perception of nonsense syllables was carried out at Bell Laboratories. Since the original data were never recorded, the Linguistic Data Consortium designed and recorded a corpus which loosely corresponds to
Towards large vocabulary ASR on embedded platforms
In this paper we present an overview of an automatic speech recognition system implementation in the context of embedded systems. Specific challenges presented by low resource platforms will be addressed for the basic components of an ASR decoder. Our
ASR systems in noisy environment: Analysis and solutions for increasing noise robustness
This paper deals with the analysis of Automatic Speech Recognition (ASR) suitable for use in noisy environments and suggests an optimum configuration under various noisy conditions. The behavior of standard parameterization techniques was analyzed from the
Multilingual hierarchical MRASTA features for ASR.
Recently, a multilingual Multi Layer Perceptron (MLP) training method was introduced without having to explicitly map the phonetic units of multiple languages to a common set. This paper further investigates this method using bottleneck (BN) tandem
Effective feedback on L2 pronunciation in ASR-based CALL
Computer Assisted Language Learning (CALL) has now established itself as a prolific area whose advantages are well-known to educators. Yet, many authors lament the lack of a reliable integrated conceptual framework linking technology advances and second
TC-STAR: New language resources for ASR and SLT purposes.
In TC-STAR a variety of Language Resources (LR) is being produced. In this contribution we address the resources that have been created for Automatic Speech Recognition and Spoken Language Translation. As yet, these are 14 LR in total: two training SLR for ASR
Joint learning of phonetic units and word pronunciations for ASR
The creation of a pronunciation lexicon remains the most inefficient process in developing an Automatic Speech Recognizer (ASR). In this paper, we propose an unsupervised alternative requiring no language-specific knowledge to the conventional manual
Dynamic Stream Weighting for Turbo-Decoding-Based Audiovisual ASR.
Automatic speech recognition (ASR) enables very intuitive human-machine interaction. However, signal degradations due to reverberation or noise reduce the accuracy of audio-based recognition. The introduction of a second signal stream that is not affected by