A Unified Language Model Architecture for Web-based Speech

A Unified Language Model Architecture for Web-based Speech Recognition Grammars
[Diagram labels: XML, ABNF, IHD, BNF, JSGF]
Wesley Holland, Daniel May, Julie Baca, Georgios Lazarou, Joseph Picone
Center for Advanced Vehicular Systems
Mississippi State University

Speech Recognition
• Acoustic Model
  • Maps audio data to words or phonemes
• Language Model
  • Specifies order in which a sequence of words or phonemes is likely to occur
  • Described using grammar
Language Model Grammar Conversion, Page 1 of 10

Grammar Specifications
• Backus-Naur Form (BNF)
• Augmented BNF (ABNF)
• JSpeech Grammar Format (JSGF)
• Speech Recognition Grammar Specification (SRGS)
• ISIP Hierarchical Digraph (IHD)
Example: the same grammar in each format
• BNF: <A>::=a.B  <B>::=b.B  <B>::=ε
• ABNF: <A>::=ab*
• JSGF: <A>=a(b)*;
• XML-SRGS: a <item repeat="0-"> b </item>
• IHD: [digraph figure]

Conversion Design
• Goals
  • JSGF ↔ IHD
  • XML-SRGS ↔ IHD
  • Determination of equivalence
  • Grammar minimization
• Final Architecture
  [Diagram nodes: XML, ABNF, IHD, JSGF]

JSGF/XML-SRGS → ABNF
• JSGF → ABNF
  • Trivial
  • Similar in syntax and structure to ABNF
  • Example: JSGF <A>=ab*; → ABNF <A>::=ab*
• XML-SRGS → ABNF
  • Harder than JSGF
  • Different in syntax and structure from ABNF
  • Requires enumeration of certain repeat attributes
  • Examples:
    XML-SRGS <item repeat='1-2'> a b </item> → ABNF <S>::=(ab)|(abab)
    XML-SRGS <item repeat='2-'> a b </item> → ABNF <S>::=abab(ab)*
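The enumeration of repeat attributes shown above can be sketched in a few lines. This is a minimal illustration, not the converter's actual code: the function name `expand_repeat` is hypothetical, the body is treated as an already-flattened string such as "ab", and rendering zero copies as ε is an assumption made here for completeness.

```python
def expand_repeat(body: str, repeat: str) -> str:
    """Rewrite an SRGS repeat attribute as an ABNF-style expansion of `body`.

    Supported forms (as in the slide examples): "n", "m-n" and "m-".
    """
    low_text, dash, high_text = repeat.strip().partition("-")
    low = int(low_text)
    if dash and not high_text.strip():
        # Open-ended "m-": m mandatory copies followed by a Kleene star.
        return body * low + f"({body})*"
    # "n" is treated as the closed range "n-n".
    high = int(high_text) if dash else low
    if high < low:
        raise ValueError(f"invalid repeat range: {repeat!r}")
    # Closed range: enumerate every allowed count as its own alternative.
    # Assumption: zero copies are written as ε.
    return "|".join(f"({body * k})" if k else "ε" for k in range(low, high + 1))


print(expand_repeat("ab", "1-2"))  # (ab)|(abab)
print(expand_repeat("ab", "2-"))   # abab(ab)*
```

A closed range grows linearly with its upper bound, which is one reason this direction is harder than the JSGF case.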

JSGF/XML-SRGS → ABNF
• XML-SRGS → ABNF (continued)
  • Different weighting mechanisms (weight and repeat-prob attributes)
  a <item repeat="0-" repeat-prob=".45"> b </item>
  <one-of> <item weight=".4">c</item> <item weight=".6">d</item> </one-of>
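To make the two weighting mechanisms concrete, the sketch below scores a sentence of the form "a b...b (c|d)" for the fragment above. It rests on two assumptions that the slides do not state: repeat-prob is read as the probability of taking one more repetition (so exactly n repetitions has probability p^n (1 - p)), and weights are normalized by their sum to act as probabilities. Recognizers may interpret weights differently, and all function names are hypothetical.

```python
def repeat_probability(n: int, repeat_prob: float) -> float:
    """P(exactly n repetitions) for repeat="0-": n continuations, then one stop.

    Assumption: repeat-prob is the chance of each successive repetition.
    """
    return (repeat_prob ** n) * (1.0 - repeat_prob)


def choice_probabilities(weights: dict) -> dict:
    """Normalize <one-of> weights so they sum to 1 (an assumed interpretation)."""
    total = sum(weights.values())
    return {word: weight / total for word, weight in weights.items()}


def sentence_probability(n_b: int, last_word: str,
                         repeat_prob: float = 0.45,
                         weights: dict = None) -> float:
    """Score "a", then n_b copies of "b", then "c" or "d"."""
    weights = weights or {"c": 0.4, "d": 0.6}
    return repeat_probability(n_b, repeat_prob) * choice_probabilities(weights)[last_word]


# "a b b c": two repetitions of b, then the alternative with weight .4
print(sentence_probability(2, "c"))
```

Under these assumptions the scores of all sentences in the fragment sum to 1, which is a convenient sanity check when mapping weights between formats.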

ABNF → BNF
• Normalized BNF
  • Consists of rules of the following formats:
    • (RULE_NAME)::=(TERMINAL),(NON_TERMINAL)
    • (RULE_NAME)::=(NON_TERMINAL)
    • (RULE_NAME)::=ε
• ABNF → BNF
  • Complicated
  • Accomplished using a recursive algorithm that extracts sets of normalized BNF rules from a set of ABNF rules
1. Break rule into multiple rules at each top-level alternation. Recurse on each rule.
2. For each concatenation, Kleene star, or Kleene plus, extract a set of left symbols and a set of right symbols.
3. For n left symbols and m right symbols, create n × m connecting rules.
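The slides do not give the details of the recursive algorithm. One plausible way to realize the left-symbol and right-symbol steps is the textbook Glushkov (position) construction, sketched below as a stand-in rather than as the authors' implementation. The class names, the `R<n>`/`RS`/`RT` rule naming, and the requirement that each terminal occurrence carries a unique position number are all assumptions of this sketch. Alternation is handled by taking set unions instead of splitting rules first.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sym:
    name: str  # terminal word
    pos: int   # unique position; doubles as the rule index


@dataclass(frozen=True)
class Cat:
    left: object
    right: object


@dataclass(frozen=True)
class Alt:
    left: object
    right: object


@dataclass(frozen=True)
class Star:
    body: object


@dataclass(frozen=True)
class Plus:
    body: object


def analyse(node, arcs, names):
    """Return (can_be_empty, left_symbols, right_symbols) for `node`.

    Connecting pairs (right symbol, left symbol) are accumulated in `arcs`.
    """
    if isinstance(node, Sym):
        names[node.pos] = node.name
        return False, {node.pos}, {node.pos}
    if isinstance(node, (Alt, Cat)):
        e1, left1, right1 = analyse(node.left, arcs, names)
        e2, left2, right2 = analyse(node.right, arcs, names)
        if isinstance(node, Alt):
            return e1 or e2, left1 | left2, right1 | right2
        # Concatenation: n right symbols x m left symbols -> n x m connections.
        arcs.update((a, b) for a in right1 for b in left2)
        return (e1 and e2,
                left1 | (left2 if e1 else set()),
                right2 | (right1 if e2 else set()))
    # Kleene star or plus: the body's right symbols loop back to its left symbols.
    empty, left, right = analyse(node.body, arcs, names)
    arcs.update((a, b) for a in right for b in left)
    return empty or isinstance(node, Star), left, right


def to_normalized_bnf(root):
    """Emit rules in the three normalized formats listed on the slide."""
    arcs, names = set(), {}
    empty, left, right = analyse(root, arcs, names)
    rules = [f"RS ::= R{p}" for p in sorted(left)]
    if empty:
        rules.append("RS ::= RT")
    rules += [f"R{a} ::= {names[a]}, R{b}" for a, b in sorted(arcs)]
    rules += [f"R{p} ::= {names[p]}, RT" for p in sorted(right)]
    rules.append("RT ::= ε")
    return rules


# (A | B) C+
grammar = Cat(Alt(Sym("A", 1), Sym("B", 2)), Plus(Sym("C", 3)))
for rule in to_normalized_bnf(grammar):
    print(rule)
```

For the expression (A | B) C+ this yields two start rules, three connecting rules (two from the concatenation, one from the Kleene plus), one exit rule, and the ε rule.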

BNF ↔ IHD
• BNF ↔ IHD
  • Each arc translates to a normalized BNF rule
  • Terminals correspond to nodes; concatenations correspond to arcs
BNF:
  RS→R0
  RS→R1
  R0→A,R3
  R1→B,R3
  R3→C,R3
  R3→C,RT
  RT→ε
IHD:
  Nodes: 1: A, 2: B, 3: C
  Arcs: (S,1), (S,2), (1,3), (2,3), (3,3), (3,T)
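The arc-to-rule correspondence can be sketched directly on the example digraph. This is an illustration under assumptions, not the ISIP converter: the function name `ihd_to_bnf` is hypothetical, the digraph is held as a plain dict of nodes and a list of arcs, and rules are named after node ids (R1, R2, R3), so the indices differ from the R0/R1 labels used on the slide.

```python
def ihd_to_bnf(nodes, arcs):
    """Translate each arc of a digraph into one normalized BNF rule.

    nodes: {node_id: terminal}; arcs: list of (src, dst) pairs, where
    "S" and "T" stand for the start and terminal nodes.
    """
    rules = []
    for src, dst in arcs:
        if src == "S":
            # An arc leaving the start node emits no terminal.
            rules.append(f"RS ::= R{dst}")
        else:
            # Any other arc emits its source node's terminal, then moves on.
            rules.append(f"R{src} ::= {nodes[src]}, R{dst}")
    rules.append("RT ::= ε")
    return rules


nodes = {1: "A", 2: "B", 3: "C"}
arcs = [("S", 1), ("S", 2), (1, 3), (2, 3), (3, 3), (3, "T")]
for rule in ihd_to_bnf(nodes, arcs):
    print(rule)
```

Six arcs plus the closing ε rule give seven rules, the same count as the BNF listing on the slide.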

BNF → JSGF/XML-SRGS
• BNF → JSGF/XML-SRGS
  • Rule-by-rule
  • Trivial
BNF: <A>::=a.B  <B>::=b.B  <B>::=ε
JSGF: <A>=a.B; <B>=b|b.B;
XML-SRGS:
  <rule id="a"> a <ruleref uri="#b"/> </rule>
  <rule id="b"> <one-of> <item> b <ruleref uri="#b"/> </item> <item> <ruleref special="NULL"/> </item> </one-of> </rule>
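A rule-by-rule translation to JSGF might look like the sketch below. It is an assumption-laden illustration: the function name `bnf_to_jsgf` and the triple representation of normalized rules are hypothetical, and the output uses standard JSGF rule references (`<B>`) and the special `<NULL>` rule for ε instead of the shorthand notation on the slide.

```python
def bnf_to_jsgf(rules):
    """Group normalized BNF rules by name and emit one JSGF rule per name.

    Each input rule is a triple (name, terminal, next_rule); terminal and
    next_rule may be None, and (name, None, None) stands for name ::= ε.
    """
    grouped = {}
    for name, terminal, next_rule in rules:
        parts = [p for p in (terminal, f"<{next_rule}>" if next_rule else None) if p]
        # An empty right-hand side becomes the JSGF special rule <NULL>.
        grouped.setdefault(name, []).append(" ".join(parts) or "<NULL>")
    return [f"<{name}> = {' | '.join(alts)};" for name, alts in grouped.items()]


bnf = [("A", "a", "B"), ("B", "b", "B"), ("B", None, None)]
for line in bnf_to_jsgf(bnf):
    print(line)
# <A> = a <B>;
# <B> = b <B> | <NULL>;
```

Because every normalized rule has at most one terminal and one rule reference, no parsing or restructuring is needed; alternatives for the same rule name are simply joined with "|".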

Software Tools
• ISIP Network Converter
  • Console tool to perform conversions to and from arbitrary grammar formats
• ISIP Network Builder
  • Java-based graphical tool to design grammars as finite state machines
  • Can export grammars to JSGF, XML-SRGS, ABNF, and IHD
• ISIP Language Model Tester
  • Console tool for testing grammars
  • Can generate valid sentences in a given grammar
  • Can parse sentences and determine whether they are accepted by a given grammar
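The two tester capabilities, sentence generation and acceptance checking, can be illustrated on a small digraph. This sketch is not the ISIP Language Model Tester and does not reflect its interface; the function names, the dict-and-set digraph representation, and the "S"/"T" start and terminal markers are assumptions. The generator additionally assumes every node can reach "T".

```python
import random


def accepts(nodes, arcs, words):
    """Return True if `words` labels some path from "S" to "T".

    Words are emitted by nodes, so each step moves to a node whose
    terminal equals the next word.
    """
    current = {"S"}
    for word in words:
        current = {dst for src, dst in arcs
                   if src in current and nodes.get(dst) == word}
        if not current:
            return False
    return any((src, "T") in arcs for src in current)


def generate(nodes, arcs, rng):
    """Random walk from "S" to "T", collecting node terminals.

    Assumption: every reachable node has a path to "T".
    """
    words, here = [], "S"
    while True:
        successors = sorted((dst for src, dst in arcs if src == here), key=str)
        here = rng.choice(successors)
        if here == "T":
            return words
        words.append(nodes[here])


nodes = {1: "A", 2: "B", 3: "C"}
arcs = {("S", 1), ("S", 2), (1, 3), (2, 3), (3, 3), (3, "T")}

print(accepts(nodes, arcs, ["A", "C", "C"]))  # True
print(accepts(nodes, arcs, ["A"]))            # False
sentence = generate(nodes, arcs, random.Random(0))
print(sentence, accepts(nodes, arcs, sentence))
```

Feeding generated sentences back into the acceptance check is a simple round-trip test for a converted grammar.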

Summary
• Future Work
  • Web-based front-end to speech recognition software
  • Mobile speech recognition
• Public Domain Toolkit
  • Contains language model conversion tools
  • Public domain – available for download