Publications
-
Can Prosody Aid the Automatic Classification of Dialog Acts in Conversational Speech?
This study asks whether current approaches, which use mainly word information, could be improved by adding prosodic information. The study is based on more than 1000 conversations from the Switchboard…
-
MVIEWS: Multimodal Tools for the Video Analyst
SRI has developed MVIEWS, a system for annotating, indexing, extracting, and disseminating information from video streams for surveillance and intelligence applications. MVIEWS is implemented within the Open Agent Architecture, a…
-
Information extraction from HTML: application of a general machine learning approach
We show how information extraction can be cast as a standard machine learning problem, and argue for the suitability of relational learning in solving it.
-
Melding Authentic Science, Technology, And Inquiry-Based Teaching: Experiences Of The Globe Program
Initial findings from the evaluation of the GLOBE Program are used to shed light on three issues concerning student-scientist partnerships.
-
Globe Year 3 Evaluation: Implementation And Progress
Year-three evaluation of the Global Learning and Observations to Benefit the Environment (GLOBE) program, an international environmental science research and education program.
-
Automatic Detection of Discourse Structure for Speech Recognition and Understanding
We describe a new approach for statistical modeling and detection of discourse structure for natural conversational speech. Our model is based on 42 'Dialog Acts' (DAs) (question, answer, backchannel, agreement,…
-
Using Information Extraction to Improve Information Retrieval
The authors describe an approach to applying a particular kind of Natural Language Processing (NLP) system to the TREC routing task in Information Retrieval (IR).
-
Diagrammatic Methods for Deriving and Relating Temporal Neural Network Algorithms
We present an alternative approach based on a set of simple block diagram manipulation rules. The approach provides a common framework to derive popular algorithms including backpropagation and backpropagation-through-time, without…
-
A Lognormal Tied Mixture Model of Pitch for Prosody-Based Speaker Recognition
In this work, we develop a statistical model of pitch that allows unbiased estimation of pitch statistics from pitch tracks which are subject to doubling and/or halving.
-
Modeling Linguistic Segment and Turn Boundaries for N-best Rescoring of Spontaneous Speech
We present an N-best rescoring algorithm that removes the effect of segmentation mismatch. Furthermore, we show that explicit language modeling of hidden linguistic segment boundaries is improved by including turn-boundary…
-
Explicit Word Error Minimization in N-best List Rescoring
We show that the standard hypothesis scoring paradigm used in maximum-likelihood-based speech recognition systems is not optimal with regard to minimizing the word error rate, the commonly used performance metric…
-
Mixture Input Transformations for Adaptation of Hybrid Connectionist Speech Recognizers
In this paper, we propose a new algorithm to train mixtures of transformation networks (MTNs) in the hybrid connectionist recognition framework. We apply the new algorithm to nonnative speaker adaptation,…