Speech & natural language publications
-
Detecting nonnative speech using speaker recognition approaches
Detecting whether a talker is speaking his native language is useful for speaker recognition, speech recognition, and intelligence applications. We study the problem of detecting nonnative speakers of American English,…
-
An anticorrelation kernel for improved system combination in speaker verification
This paper presents a method for training SVM-based classification systems for combination with other existing classification systems designed for the same task.
-
Recognizing Arabic speakers with English phones
We investigate the question of whether phone recognition models trained on large English databases can be used for speaker recognition in another language.
-
Improving NER in Arabic using a morphological tagger
We discuss a named entity recognition system for Arabic, and show how we incorporated the information provided by MADA, a full morphological tagger which uses a morphological analyzer.
-
Reranking Machine Translation Hypotheses With Structured and Web-based Language Models
In this paper, we investigate the use of linguistically motivated and computationally efficient structured language models for reranking N-best hypotheses in a statistical machine translation system.
-
Building A Highly Accurate Mandarin Speech Recognizer
We describe a highly accurate large-vocabulary continuous Mandarin speech recognizer, a collaborative effort among four research organizations. In particular, we build two acoustic models (AMs) with significant differences but with similar…
-
OOV Detection by Joint Word/Phone Lattice Alignment
We propose a new method for detecting out-of-vocabulary (OOV) words for large vocabulary continuous speech recognition (LVCSR) systems. Our method is based on performing a joint alignment between independently generated…
-
Integrating Several Annotation Layers for Statistical Information Distillation
We present a sentence extraction algorithm for Information Distillation, a task where, for a given templated query, relevant passages must be extracted from massive audio and textual document sources.
-
Morph-Based Speech Recognition and Modeling of Out-of-Vocabulary Words Across Languages
We explore the use of morph-based language models in large-vocabulary continuous speech recognition systems across four so-called “morphologically rich” languages: Finnish, Estonian, Turkish, and Egyptian Colloquial Arabic. The morphs are…
-
Extending Boosting for Large Scale Spoken Language Understanding
We propose three methods for extending the Boosting family of classifiers motivated by the real-life problems we have encountered. Our results indicate that it is possible to obtain the same…
-
Capturing a Taxonomy of Failures During Automatic Interpretation of Questions Posed in Natural Language
In this paper, we present a study – conducted in the context of the Halo Project – cataloging the types of failures that occur when capturing knowledge from natural language.
-
Capturing and Answering Questions Posed to a Knowledge-Based System
As part of the ongoing Project Halo, our goal is to build a system capable of answering questions posed by novice users to a formal knowledge base. In our…