Special Projects in Speech Signal Processing: Well-Known Application

Special Projects in Speech Signal Processing (專題研究: 語音訊號處理專題)
Lin-shan Lee (李琳山)

Well-Known Application Example of Speech and Language Technologies – Speaking Personal Assistant
• Examples:
– Siri (Apple), Google Now (Google), Cortana (Microsoft)
• Example requests:
– Weather in New York next week?
– Who is the president of the US? What did he say today?
– How can I go to National Taiwan University?
– Short messaging, personal scheduling, etc.
– Special requests: Tang and Song poetry (唐詩宋詞), the Chu Shi Biao (出師表)…; "tell a joke" (說個笑話)…
• Diagram components: Input Speech Signals, Speech Recognition, Language Understanding, Dialogue Manager, Language Generation, Speech Synthesis, Output Speech Signals
• Supporting components: Wikipedia, Knowledge Graph, Information Retrieval, Question Answering, Machine Translation

Live Demonstration: Broadcast News Browser (電視新聞瀏覽器), 2006
• Summary
• News Video

Speech Recognition
• Speech Signal → Text: 今天天氣很好 ("The weather is very nice today")
• Techniques: Signal Modeling, Machine Learning, Deep Learning (machine mimics human brains)
• Not only used in speech, but also in many other fields

Spoken Content Retrieval
• With the success of speech recognition, voice search has become popular.

Spoken Content Retrieval
• Spoken content: lectures, broadcast programs, multimedia content

Spoken Content Retrieval
• User query: Find the lectures related to “deep learning”
• Text annotation is not needed

Speech Summarization
• Retrieved audio file: 1 hour long
• Summary: 10 minutes
• Select the most informative segments to form a compact version
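The selection step can be illustrated with a minimal sketch. The slides do not say how "informative" is measured or how segments are chosen, so the greedy strategy, the `summarize` function, and the per-segment scores below are all assumptions made up for illustration.

```python
def summarize(segments, budget_sec):
    """Greedily pick high-scoring segments whose total length fits the budget.

    segments: list of (start_sec, duration_sec, score) tuples, where the
    score is a hypothetical "informativeness" value.
    Returns the chosen segments in chronological order.
    """
    chosen, used = [], 0.0
    # Visit segments from most to least informative.
    for seg in sorted(segments, key=lambda s: s[2], reverse=True):
        if used + seg[1] <= budget_sec:
            chosen.append(seg)
            used += seg[1]
    # Play the summary back in the original time order.
    return sorted(chosen, key=lambda s: s[0])


# Invented example: four 5-minute segments with made-up scores.
segments = [
    (0, 300, 0.2),
    (300, 300, 0.9),
    (600, 300, 0.4),
    (900, 300, 0.8),
]
summary = summarize(segments, budget_sec=600)  # 10-minute budget
```

With these invented scores, the two highest-scoring segments (starting at 300 s and 900 s) exactly fill the 10-minute budget.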

Dialogues
• Machines can interact with humans.
• Usually based on some rules
• Machines interact with humans without pre-defined rules.
• Example: the user says “US President”; the machine asks, “Is it related to ‘Obama’?”

Computer Assisted Language Learning
• Everyone needs to learn one or more languages in addition to the native language
– One-to-one tutoring is most effective but costly
– A computer can be your language tutor

Towards Language Understanding
• Can machines understand human language?
• Possible with deep learning
• Machines learn to understand human language by reading the posts on PTT
– 魯蛇 = loser
– “天” : “地” = “溫拿” : “魯蛇” (“heaven” : “earth” = “winner” : “loser”)
(These are true examples.)
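One common way to obtain analogies of the form a : b = c : d is vector arithmetic over word embeddings. The slides do not state which method produced the example, so the sketch below is only an assumption, and the 2-D vectors in `EMB` are invented for illustration rather than learned from PTT posts.

```python
import math

# Hypothetical 2-D embeddings, invented for illustration only.
EMB = {
    "天": (1.0, 1.0),     # heaven / sky
    "地": (1.0, -1.0),    # earth
    "溫拿": (3.0, 1.0),   # winner
    "魯蛇": (3.0, -1.0),  # loser
    "天氣": (-2.0, 0.5),  # weather (distractor word)
}


def cosine(u, v):
    """Cosine similarity between two 2-D vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))


def analogy(a, b, c, emb=EMB):
    """Return the word d that best completes a : b = c : d."""
    # Offset arithmetic: emb[b] - emb[a] + emb[c].
    target = tuple(y - x + z for x, y, z in zip(emb[a], emb[b], emb[c]))
    # Exclude the three query words, then take the closest remaining word.
    candidates = (w for w in emb if w not in (a, b, c))
    return max(candidates, key=lambda w: cosine(emb[w], target))


result = analogy("天", "地", "溫拿")
```

With these toy vectors, `result` is "魯蛇", matching the analogy on the slide.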

Massive Open On-line Courses (MOOCs)
• An enormous number of on-line courses

Today’s Retrieval Techniques
[Figure: lectures a1 to a4, b1 to b3, and c1 to c3. Each filled circle represents a lecture.]

Today’s Retrieval Techniques
• Go through all the paths.
• Focus on one path.
• Learner: “Which lectures should I watch?”
[Figure: lectures a1, a3, a4, b1, b2, b3, c1, c2, c3; two X marks appear in the figure.]

Learning Map
• Constructing a learning map to visualize all related lectures in different courses
• Learning map: directed graph
• Node: a set of lectures in the same topic (the black circles are nodes)
• Link: suggested learning order
[Figure: lectures a1 to a4, b1, b2, and c1 to c3 arranged into a learning map.]
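A learning map of this kind can be sketched as a directed graph from which one valid learning order is read off by topological sorting. The node groupings and links below are hypothetical (the figure's actual edges are not recoverable from the slide text), and the sketch assumes Python 3.9+ for the standard-library `graphlib` module.

```python
from graphlib import TopologicalSorter

# Hypothetical learning map. Each key is a node (a set of lectures on the
# same topic); its value is the set of nodes suggested to be learned first.
prerequisites = {
    "a1": set(),
    "a2+c1": {"a1"},
    "c2": {"a2+c1"},
    "a3+b1": {"a2+c1"},
    "b2": {"a3+b1", "c2"},
}

# static_order() yields every node after all of its prerequisites.
order = list(TopologicalSorter(prerequisites).static_order())
```

Any order returned this way respects every link, so a learner following it never meets a topic before its suggested predecessors. Several valid orders may exist when nodes are independent of each other.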

Summary
• Machine Learning & Deep Learning
• Speech Recognition
• Retrieval
• Summarization
• Dialogue
• Machine Understands Human Language
• Computer Assisted Language Learning

Speech Signal Processing – Processing of Double-Level Information
• Speech Signal Level
– Signal Samples
– Processing Algorithms
– Chips or Computers
• Linguistic Knowledge Level
– Linguistic Structure
– Lexicon
– Grammar
Example sentence: 今天的天氣非常好 (“The weather today is very good”), segmented as 今天 / 的 / 天氣 / 非常 / 好

Interspeech 2005, Lisbon, Portugal

Interspeech 2006, New York (Central Park)

ICASSP 2006, Toulouse, France

SLT 2006, Aruba, South Caribbean

Tokyo, Japan (with Prof. Furui)

Tokyo, Japan (with Prof. Sagayama)
