ASR: Automatic Speech Recognition



Automatic Speech Recognition (ASR) is the process of deriving the transcription (word sequence) of an utterance, given the speech waveform. Speech understanding goes one step further, and gleans the meaning of the utterance in order to carry out the speaker’s command.

History of modulation spectrum in ASR

Most natural signals change over time. A dominant source of change in speech signals is the changing shape of the vocal tract, which enhances and attenuates individual spectral components of the spectral envelope of speech. Most of the phonetic information is carried in these changes. Spectral components

Improving ASR Robustness to Perturbed Speech Using Cycle-consistent Generative Adversarial Networks

Naturally introduced perturbations in the audio signal, caused by emotional and physical states of the speaker, can significantly degrade the performance of Automatic Speech Recognition (ASR) systems. In this paper, we propose a front-end based on Cycle-Consistent Generative

On the use of artificial reverberation for ASR in highly reverberant environments

In this paper, we discuss the use of artificial room reverberation methods to increase the performance of automatic speech recognition (ASR) systems in highly reverberant enclosures. Our approach consists of training acoustic models on artificially reverberated
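
As a rough illustration of what artificially reverberating training speech can look like, the sketch below convolves a clean signal with a synthetic room impulse response. It is not the method of the paper above: the exponentially decaying noise model, the function names `synthetic_rir` and `reverberate`, and the `rt60_s` parameter are all assumptions chosen for illustration.

```python
import math
import random


def synthetic_rir(rt60_s, sample_rate, seed=0):
    """Build a toy room impulse response (assumed model, not from the paper).

    The response is Gaussian noise under an exponential envelope that has
    fallen by 60 dB (a factor of 1000 in amplitude) at t = rt60_s.
    """
    rng = random.Random(seed)
    n = max(1, int(rt60_s * sample_rate))
    decay = math.log(1000.0) / rt60_s
    rir = [rng.gauss(0.0, 1.0) * math.exp(-decay * k / sample_rate)
           for k in range(n)]
    rir[0] = 1.0  # direct path arrives first at full amplitude
    return rir


def reverberate(clean, rir):
    """Convolve `clean` with `rir`, truncated to the length of `clean`."""
    out = [0.0] * len(clean)
    for i, x in enumerate(clean):
        if x == 0.0:
            continue
        for k, h in enumerate(rir):
            if i + k >= len(clean):
                break
            out[i + k] += x * h
    return out


# A unit impulse comes out as the start of the impulse response itself.
rir = synthetic_rir(rt60_s=0.01, sample_rate=1000)
wet = reverberate([1.0] + [0.0] * 7, rir)
```

In practice one would use a vectorised convolution and measured or simulated room responses; the pure-Python loop here only keeps the example dependency-free.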

Temporal signal processing for ASR

For decades, speech recognition systems have used pattern recognition techniques to identify lexical items from a sequence of short-term spectra or cepstra, often with some additional linear or nonlinear processing of these features. This basic scheme

Efficient assessment of ASR Systems by Using Subsets of a Test Database

In this paper, assessment of ASR systems with a limited set of speech data selected from a larger testing corpus was studied for connected Dutch digits. Three methods of data selection were applied, namely random, knowledge-based, and data-driven selection. The

Auditory effects for ASR

I briefly (and informally) summarize some of the kinds of auditory or psychoacoustic effects that one might want to try to exploit in automatic speech recognition (ASR). It's an admittedly Lyon-centric view, attempting to justify all this auditory modeling stuff I do in terms of its

A comparison of Transformer and LSTM encoder-decoder models for ASR

We present competitive results using a Transformer encoder-decoder-attention model for end-to-end speech recognition, needing less training time than a similarly performing LSTM model. We observe that the Transformer training is in general more stable compared

Blind deconvolution for multi-microphone speech dereverberation: Application to ASR in reverberant environments

In this paper, a deterministic time-domain algorithm for multichannel blind deconvolution is presented. The proposed algorithm assumes that a source signal is measured by several sensors after propagating through finite impulse response channels and being corrupted by

Why is ASR harder for fast speech and what can we do about it?

It has been observed in various NIST evaluations (e.g., WSJ-Nov93, RM-Sep92) that ASR systems typically have about 2-3 times higher word error rates on very fast speakers [3]. This observation naturally inspires the following question: why do ASR
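
For readers unfamiliar with the metric, word error rate is conventionally the word-level edit distance (substitutions, deletions, and insertions) between the reference transcript and the recognizer output, divided by the number of reference words. The sketch below is a minimal illustration; the function name `word_error_rate` and the whitespace tokenization are assumptions, and real scoring tools also normalize text before comparison.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return word-level edit distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from output
            insertion = curr[j - 1] + 1  # extra word in output
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# One dropped word out of six reference words.
wer = word_error_rate("the cat sat on the mat", "the cat sat on mat")
```

Because insertions count as errors, the value can exceed 1.0 when the hypothesis is much longer than the reference.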

Purely sequence-trained neural networks for ASR based on lattice-free MMI.

In this paper we describe a method to perform sequence-discriminative training of neural network acoustic models without the need for frame-level cross-entropy pre-training. We use the lattice-free version of the maximum mutual information (MMI) criterion: LF-MMI. To make

Robust ASR based on clean speech models: An evaluation of missing data techniques for connected digit recognition in noise

In this study, techniques for classification with missing or unreliable data are applied to the problem of noise-robustness in Automatic Speech Recognition (ASR). The techniques described make minimal assumptions about any noise background and rely instead on what

ASR corrective feedback on pronunciation: Does it really work?

We studied a group of immigrants who were following regular, teacher-fronted Dutch classes, and who were assigned to three groups using either a) Dutch CAPT, an ASR-based Computer Assisted Pronunciation Training (CAPT) system that provides feedback on a

ASR dependent techniques for speaker recognition

This thesis is concerned with improving the performance of speaker recognition systems in three areas: speaker modeling, verification score computation, and feature extraction in telephone quality speech. We first seek to improve upon traditional modeling approaches for

Which ASR should I choose for my dialogue system?

We present an analysis of several publicly available automatic speech recognizers (ASRs) in terms of their suitability for use in different types of dialogue systems. We focus in particular on cloud-based ASRs that have recently become available to the community. We

Automotive shredder residue (ASR) and compact disc (CD) waste: options for recovery of materials and energy

Two types of solid waste streams that will be rapidly increasing in the near future, requiring more processing capacity, are automotive shredder residue (ASR, in Finnish: autopaloittamojäte) and waste compact discs (CDs). Both contain large fractions of polymers

Modeling pronunciation variation for ASR: Overview and comparison of methods

In this contribution an overview is provided of the papers presented at this workshop. First, the most important characteristics that distinguish the various studies on pronunciation variation modeling are discussed. Subsequently, the issues of evaluation and comparison

Elicited Imitation as an Oral Proficiency Measure with ASR Scoring.

This paper discusses development and evaluation of a practical, valid and reliable instrument for evaluating the spoken language abilities of second-language (L2) learners of English. First we sketch the theory and history behind elicited imitation (EI) tests and the

Effect of alkali silica reaction (ASR) in geopolymer concrete

Alkali silica reaction (ASR) occurs due to a chemical reaction between hydroxyl ions in the pore water within the concrete matrix and certain forms of silica. This reaction could lead to strength loss, cracking, volume expansion and potentially failure of the structure. This

Comparing ASR modeling methods for spoken dialogue simulation and optimal strategy learning

Speech enabled interfaces are nowadays becoming ubiquitous. The most advanced ones rely on probabilistic pattern matching systems and especially on automatic speech recognition systems. Because of their statistical nature, performances of such systems never

Far-Field ASR Without Parallel Data.

In far-field speech recognition systems, training acoustic models with alignments generated from parallel close-talk microphone data provides significant improvements. However, it is not practical to assume the availability of large corpora of parallel close-talk microphone

Ratings of relations between DSM-IV diagnostic categories and items of the Adult Self-Report (ASR) and Adult Behavior Checklist (ABCL)

This project was designed to: (a) construct DSM-oriented scales comprising ASR and ABCL items that mental health professionals rated as very consistent with DSM-IV categories; and (b) identify items that clinicians may be particularly concerned about ("critical items")

New Nonsense Syllables Database–Analyses and Preliminary ASR Experiments

In the first half of the 20th century, a series of experiments on human perception of nonsense syllables was carried out at Bell Laboratories. Since the original data were never recorded, the Linguistic Data Consortium designed and recorded a corpus which loosely corresponds to

Towards large vocabulary ASR on embedded platforms

In this paper we present an overview of an automatic speech recognition system implementation in the context of embedded systems. Specific challenges presented by low resource platforms will be addressed for the basic components of an ASR decoder. Our

ASR systems in noisy environment: Analysis and solutions for increasing noise robustness

This paper deals with the analysis of Automatic Speech Recognition (ASR) suitable for use in noisy environments and suggests an optimum configuration under various noisy conditions. The behavior of standard parameterization techniques was analyzed from the

Multilingual hierarchical MRASTA features for ASR.

Recently, a multilingual Multi Layer Perceptron (MLP) training method was introduced without having to explicitly map the phonetic units of multiple languages to a common set. This paper further investigates this method using bottleneck (BN) tandem

Effective feedback on L2 pronunciation in ASR-based CALL

Computer Assisted Language Learning (CALL) has now established itself as a prolific area whose advantages are well-known to educators. Yet, many authors lament the lack of a reliable integrated conceptual framework linking technology advances and second

TC-STAR: New language resources for ASR and SLT purposes.

In TC-STAR a variety of Language Resources (LR) is being produced. In this contribution we address the resources that have been created for Automatic Speech Recognition and Spoken Language Translation. As yet, these are 14 LR in total: two training SLR for ASR

Joint learning of phonetic units and word pronunciations for ASR

The creation of a pronunciation lexicon remains the most inefficient process in developing an Automatic Speech Recognizer (ASR). In this paper, we propose an unsupervised alternative requiring no language-specific knowledge to the conventional manual

Dynamic Stream Weighting for Turbo-Decoding-Based Audiovisual ASR.

Automatic speech recognition (ASR) enables very intuitive human-machine interaction. However, signal degradations due to reverberation or noise reduce the accuracy of audio-based recognition. The introduction of a second signal stream that is not affected by