How Ideas Evolve into Speech - A Computer Animation

Derek J. SMITH, CEng, CITP
Centre for Psychology, University of Wales Institute, Cardiff
smithsrisca@btinternet.com
http://www.smithsrisca.co.uk/

As presented to the 9th Annual Conference of the Consciousness and Experiential Psychology Section of the British Psychological Society, St Anne's College, Oxford, 18th September 2005

Copyright Notice: This material was written and published in Wales by Derek J. Smith (Chartered Engineer), Senior Lecturer in Cognitive Science and Informatics at University of Wales Institute, Cardiff. It forms part of a multifile e-learning resource and, subject only to acknowledging Derek J. Smith's rights under international copyright law to be identified as author, may be freely downloaded and printed off in single complete copies solely for the purposes of private study and/or review. Commercial exploitation rights are reserved. The remote hyperlinks have been selected for the academic appropriacy of their contents; they were free of offensive and litigious content when selected, and will be periodically checked to have remained so. Copyright © 2005, Derek J. Smith (Chartered Engineer). Publication was by PowerPoint presentation on 18th September 2005, running offline with inactive hyperlinks. This online version, complete with activated hyperlinks, comes to you for follow-up private study. Paragraphs rendered faint form part of the fuller narrative but were not unduly emphasised at time of presentation.

ABOUT THE AUTHOR
• Derek Smith graduated as a psychologist in 1972, but is now bi-professional as both psychology lecturer and systems engineer.
• During the 1980s he was with British Telecom, Cardiff, where he specialised in the design and operation of very large "semantic network" databases. Since 1991 he has taught psycholinguistics and neuropsychology to Speech and Language Therapy undergraduates.
• The essence of a computer database under interrogation is the quasi-linguistic "linearisation" of fragmented conceptual memory. Since this is also the essence of human speech production, Derek is fascinated by the possibility that the mind is a biological database.
• For a gentle introduction to what goes on inside semantic network databases, see Smith (1998) or Smith (2005).

PLAN OF ATTACK
• This paper looks at the positioning of conscious experience within the complex of processing modules involved in speech praxis, with a view to identifying the critical information flows and supporting memory types.
• The paper is organised into two sections, one long and introductory, and the other more focused but exploratory.
• In Section 1, we familiarise ourselves with the modules and stages of spoken language processing, firstly in static box-and-arrow diagram format, and secondly in computer animation. We shall be paying special attention to the role of the “speech act” in communication, and the consequent need for interaction between semantic and pragmatic command systems.
• In Section 2, we then take a closer look at the distribution of different memory types, encoding systems, and feedback circuits, and demonstrate how the act of animating those circuits can generate highly specific research questions. An appeal for a detailed cybernetic analysis of language production is made, for it is long overdue.

SECTION 1 THE MODULES AND STAGES OF SPOKEN LANGUAGE PROCESSING

SPEECH PRODUCTION STAGES (1) LORDAT’S (1843) FIVE-STAGE MODEL
http://www.smithsrisca.co.uk/PSYlordat1843.html
• In 1843 the French neurologist Jacques Lordat identified five post-ideational processing stages within speech production, and described the first of these as “isolating” the idea to be expressed.
• The four subsequent processes then co-operate in shaping the final spoken output.
• Lordat's analysis was recently converted into box-and-arrow format by Lecours, Nespoulos, and Pioger (1987), as now shown ...

• Here is Lecours et al's graphical representation of Lordat's analysis.
• The initial ideational stage is shown as "THINKING IN GENERAL".
• Each stage receives a coded message from the one before, adds to it in some clever way, and then passes it on to the one after.
BUT NOTE THE DIFFICULTY COUNTING STAGES WHEN ONE STAGE IS ALLOWED TO CONTAIN NESTED SUB-STAGES. MORE ON THIS AS WE GO.

AND NOTE ALSO THAT IF WE COUNT IDEATION AS A PROCESSING STAGE AS WELL, IT MAKES SIX STAGES IN TOTAL ...

SPEECH PRODUCTION STAGES (2) TWO "SHAPES" TO THE DIAGRAMS
• Lordat’s explanatory schema was duly incorporated into a number of later 19th century aphasiological models, and his analysis is of distinct historical significance today because with surprisingly few alterations it is still with us.
• Unfortunately, there has never been a standard form of the supporting diagram. Different authors draw things different ways and at different levels of detail.
• Nevertheless, diagrams tend to come in only two basic shapes, namely “A-shaped” (with the “clever” bits, the mind’s “higher functions”, at the top) or “X-shaped” (with the clever bits at the central cross-over point).
• Here are two of the early A-shaped ones (we'll be seeing one of the X-shaped ones later on) ...

SPEECH PRODUCTION STAGES (3) LICHTHEIM’S (1885) TWO-LAYER MODEL
http://www.smithsrisca.co.uk/PSYlichtheim1885.html
• Here, from the Golden Age of Aphasiology, is a three-module two-level model of the totality of language processing. Speech perception takes place in module A, speech production in module M, and understanding in module B [B = "Begriff", the German word for "concept", from "begreifen", to grasp or understand].
• Note that Lichtheim's B and M stages are doing exactly the same job as Lordat's six stages!
AGAIN NOTE THE DIFFICULTY COUNTING STAGES WHEN ONE STAGE IS ALLOWED TO CONTAIN NESTED SUB-STAGES WITHOUT EVEN SHOWING THEM.

SPEECH PRODUCTION STAGES (4) LICHTHEIM’S (1885) TWO-LAYER MODEL
http://www.smithsrisca.co.uk/PSYlichtheim1885.html
• Note also that all three modules are major consumers of LTM resources. Thus Module B needs to be functionally located either within, or close to, the mind's "semantic network", whilst Modules A and M need instant access to non-semantic word stores. However ...
• ... box-and-arrow diagrams like these often leave the actual memory stores implicit, and thus under-specified.
NOTE THE TWO MODALITY-SPECIFIC WORD STORES (OR "LEXICONS"), BOTH SUBORDINATED TO THE CENTRE FOR HIGHER FUNCTION.

SPEECH PRODUCTION STAGES (5) KUSSMAUL’S (1878) FOUR LEXICON MODEL
http://www.smithsrisca.co.uk/PSYkussmaul1878.html
NOTE THE FOUR SEPARATE WORD STORES (OR "LEXICONS") IN THIS CONTEMPORARY ALTERNATIVE TO LICHTHEIM'S MODEL. THESE, TOO, ARE ALL SUBORDINATED TO A HIGHER PROCESSING CENTRE.

SPEECH PRODUCTION STAGES (6)
• Speech production was then largely ignored until UCLA's Victoria A. Fromkin reawakened interest in it as a study area in the early 1970s (Fromkin, 1971).
• The most popular modern models of speech production come from the Max Planck Institute's Willem Levelt and the University of Arizona's Merrill F. Garrett.
• Like Lordat, Fromkin identified ideation plus five post-ideational processing stages, as follows ...

SPEECH PRODUCTION STAGES (7) FROMKIN’S FIRST THREE STAGES
• Stage 1 – Pre-Lexical Semantics: Decides the meaning to be conveyed. Code not known, but preverbal.
• Stage 2 – Pre-Lexical Syntax: Decides the grammatical skeleton of the sentence. Code not known, but preverbal.
• Stage 3 – Lexical: Selects the necessary “content words” (i.e. nouns and verbs) from the mental lexicon, thus making ideas verbal for the first time.
• It’s worth remembering these three stages as a unit, because they give us our ability with LANGUAGE.

SPEECH PRODUCTION STAGES (8) FROMKIN’S LAST THREE STAGES
• Stage 4 – Prosody: Adds in emotionality via intonation pattern. Code not known, but mediated by the hindbrain.
• Stage 5 – Phonology: Decides the final syntax and word morphology. Phonemic code.
• Stage 6 – Final Sound Production: Commits concrete sounds ("allophones") to the motor nerves for respiration, phonation, and articulation.
• We’re not really concerned with these three stages in this paper, because they are all post-semantic. They take units of language and convert them into SPEECH.
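The six stages just listed can be pictured as a strictly top-down pipeline in which each stage keeps what it receives and adds its own layer. The following is only a minimal sketch under that assumption: the stage names come from the slides, but the `Message` class, the `run_cascade` function, and the idea that each stage merely appends a label are illustrative inventions, not Fromkin's own formalism.

```python
from dataclasses import dataclass, field

# Stage names follow the slides; the LANGUAGE/SPEECH grouping is the one
# drawn there. Everything else in this sketch is a hypothetical illustration.
STAGES = [
    ("pre-lexical semantics", "LANGUAGE"),
    ("pre-lexical syntax", "LANGUAGE"),
    ("lexical", "LANGUAGE"),
    ("prosody", "SPEECH"),
    ("phonology", "SPEECH"),
    ("final sound production", "SPEECH"),
]


@dataclass
class Message:
    idea: str
    layers: list = field(default_factory=list)  # what each stage has added so far


def run_cascade(idea):
    """Pass one idea down through the six stages, strictly top to bottom."""
    message = Message(idea)
    for name, group in STAGES:
        # Each stage keeps what it received and adds its own contribution.
        message.layers.append((group, name))
    return message


result = run_cascade("The Redcoats are coming")
language = [name for group, name in result.layers if group == "LANGUAGE"]
speech = [name for group, name in result.layers if group == "SPEECH"]
print(language)
print(speech)
```

Note that nothing in this sketch flows back up the cascade; that limitation is exactly what Section 2 goes on to question.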

SPEECH PRODUCTION STAGES (9) NORMAN’S (1990) THREE-LAYER MODEL
http://www.smithsrisca.co.uk/PSYnorman1990.html
THE LANGUAGE LEVEL
LICHTHEIM'S MODULE A, BUT NOW WITH SUBSTAGES OF SPEECH PERCEPTION
LICHTHEIM'S MODULE B, BUT NOW WITH VARIOUS SUBDIVISIONS OF FUNCTION
LICHTHEIM'S MODULE M, BUT NOW WITH SUBSTAGES OF SPEECH PRODUCTION

SPEECH PRODUCTION STAGES (10)
Or to put it all a little more vividly ...

SPEECH PRODUCTION STAGES (11)
• The result is a mental champagne cascade ...
• ... with ideas pouring down from the top ...
• ... words being added on the way down ...
• ... the ideas plus the words give your language ...
• ... sounds are then added below that ...
• ... and “linear” speech emerges at the bottom.

SPEECH PRODUCTION STAGES (12)
Or more correctly ...

SPEECH PRODUCTION STAGES (13) BECAUSE WE DON'T REALLY KNOW WHAT GOES ON UP HERE

SPEECH PRODUCTION STAGES (14)
And the big problem, of course, is ...

SPEECH PRODUCTION STAGES (15)
... what is the true nature of our "champagne", the ideation at the top of the cascade? And who (or what) the heck is pouring it?
BECAUSE WE DON'T REALLY KNOW WHAT GOES ON UP HERE

STATE-OF-THE-ART PSYCHOLINGUISTIC MODELING
• Norman (1990) is more or less state-of-the-art amongst the A-shaped diagrams.
• The PALPA (Kay, Lesser, and Coltheart, 1992) is a typical X-shaped psycholinguistic diagram.
• It derives from the 19th century four-lexicon models, via earlier modern modeling efforts by John Morton, Andrew Ellis (e.g. Ellis, 1982), Karalyn Patterson, John Marshall, etc.
• It is characterised by having all the higher functions located in the middle of the diagram, not at the top. Here it is ...

The PALPA (1)
http://www.smithsrisca.co.uk/PSYkayetal1992.html
• All input channels are now at the top, and all output channels at the bottom. Ideation is in the centre box, and speech praxis is the bottom left processing leg.
THIS IS WHERE ALL THE PHILOSOPHICALLY MYSTERIOUS THINGS HAPPEN

The PALPA (2)
http://www.smithsrisca.co.uk/PSYkayetal1992.html
NEXT WE'RE GOING TO SEE THIS SPEECH PRODUCTION LEG IN CLOSE UP ...

The PALPA (3)
http://www.smithsrisca.co.uk/PSYkayetal1992.html
• Here is the bottom left quadrant of the full PALPA diagram.
• The similarity with the speech output legs of the Lichtheim, Kussmaul, and Norman diagrams should now be apparent.
The focus of the present paper is this: WHAT IS IDEATION, AND HOW DO IDEAS MAKE IT OUT OF THE CENTRE BOX AND DOWN THE FIRST ARROW?

The PALPA (4)
http://www.smithsrisca.co.uk/PSYkayetal1992.html
IN FACT, ONE OF THE MAIN STRATEGIC GOALS OF COGNITIVE SCIENCE HAS ALWAYS BEEN TO OPEN UP THIS BLACK BOX. THIS MEANS MODELING ALL THE HIGHER FUNCTIONS SHOWN ON THE NORMAN (1990) DIAGRAM, PLUS SPECIFYING WHERE THE MIND'S SEMANTIC NETWORK DATABASE MIGHT BE SITUATED AND HOW IT MIGHT CONTRIBUTE TO SAID HIGHER FUNCTIONS. THIS, IN TURN, REQUIRES SEPARATING OUT THE SUBSYSTEMS FOR CONSCIOUSNESS, SEMANTICS, AND PRAGMATICS.

SPEECH ACTS AND IDEATION
• One of the keys to unravelling higher functions is to include speech acts in our modeling.
• Unfortunately, speech praxis is so complex in this respect that it has recently spawned its own science, “pragmatics”, with its own very powerful theory, Speech Act Theory (Austin, 1962; Searle, 1969).
• Speech Act Theory studies not just the words people use, but the units of intention, the “speech acts”, which preceded those words. Pragmatics is thus the science of the first down arrow on diagrams like the PALPA ...
• Each speech act is (a) calculated to achieve some discrete behavioural "perlocutionary" effect, but (b) has not yet been fully formed lexically or grammatically. The code is preverbal, perhaps “sprites” or ideograms of some sort.

ANIMATED PALPA – SMITH (2000)
http://www.smithsrisca.co.uk/PALPA.avi
• So what might a speech act look like? Where do these all-important sprites come from, where do they go, and what happens to them when they get there?
• And are they (or the feedback generated during their processing) involved in consciousness?
• To get a better idea of the process, we need to see the static flow diagram “in motion”. So here, from Smith (2000), is sentence production at about one third natural speed, for the specimen sentence “The Redcoats are coming” ...
• Technical NB: If accessing this presentation over the Internet you should note that the latest versions of PowerPoint no longer play this video from within the presentation. To get around this problem, download the corresponding .avi file from the address above and view it using MS Media Player or equivalent.

ANIMATED PALPA – SMITH (2000) KEY POINTS
• Watch out for ...
• ... the central functional separation of awareness, understanding, and will, closely associated with affective processes.
• ... the converging flow of semantic and pragmatic icons onto the primary sentence construction process, and the parallel movement of the affective icon onto the lower speech production process.
• ... the need for constant signal acknowledgement and onward transmission.
• ... the number of alternative feedback routes for said acknowledgements to take.
• ... the need for interrupt-resend mechanisms.

SECTION 2 THE MEMORY TYPES, THE ENCODING SYSTEMS, AND THE FEEDBACK CIRCUITS IN SPOKEN LANGUAGE PROCESSING

THE BASIC PROBLEM (1)
• There are five related problems with modeling the cognitive system ...
• Firstly, the system being modeled just won't keep still.
• Secondly, when it moves, it moves very quickly.
• Thirdly, when it moves quickly we can neither see, nor conceptually keep up with, what it's doing at a reductionist level.
• Fourthly, when we slow it down or look closely at it we lose sight of what it's doing at a holistic level.
• Finally, there is no single explanatory science ...

THE BASIC PROBLEM (2)
• ... in fact, the following disciplines all have something to say about where the true secrets of cognition lie: Anatomical Neuroscience, Artificial Intelligence, Clinical Neurology, Clinical Neuropsychology, Clinical Psychology, Cognitive Palaeontology, Comparative Ethology, Consciousness Studies, Cybernetics, Epistemology, Linguistic Philosophy, Mental Philosophy, Neuroethology, Physical Anthropology, Physiological Neuroscience, and Psycholinguistics.
• We have highlighted the science whose voice has not yet matched its potential contribution, namely cybernetics, the study of control systems in the abstract.
• The following screens show where a little cybernetics might make a lot of difference ...

FEEDFORWARD CONTROL
• Let us look again at the PALPA's speech production leg as published.
• Note that all the arrows are "feedforward" information flows. They pass content, together with instructions on what to do with it, down to lower modules.

FEEDBACK CONTROL (1)
• There are no "feedback" information flows on this diagram. (We instantly know this because none of the arrows point up the screen.)
• So no module can communicate problems back to the module above it.
• This makes for an extremely inefficient real-time information processing architecture, so let's add some up arrows ...
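To make the cost of a feedforward-only architecture concrete, here is a small sketch of three chained modules that can only pass packets downwards. The module names ("upper", "middle", "lower"), the `FeedforwardModule` class, and the `fails_on` trigger are all hypothetical, chosen purely for illustration; they are not boxes taken from the PALPA diagram.

```python
class FeedforwardModule:
    """A module that can only pass content and instructions downwards."""

    def __init__(self, name, downstream=None, fails_on=None):
        self.name = name
        self.downstream = downstream
        self.fails_on = fails_on  # hypothetical content this module cannot process
        self.received = []

    def send_down(self, content, instruction):
        # Feedforward only: nothing is ever returned to the sender.
        self.received.append((content, instruction))
        if content == self.fails_on:
            return  # the failure is dropped silently at this point
        if self.downstream is not None:
            self.downstream.send_down(content, instruction)


lower = FeedforwardModule("lower")
middle = FeedforwardModule("middle", downstream=lower, fails_on="Redcoats")
upper = FeedforwardModule("upper", downstream=middle)

results = [upper.send_down(word, "say") for word in ["The", "Redcoats", "are", "coming"]]
delivered = [content for content, _ in lower.received]
print(delivered)
```

The top module sends four words and learns nothing in return, while the bottom module only ever receives three of them. Without an upward channel the omission cannot be detected, let alone repaired, from above.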

FEEDBACK CONTROL (2)
• Now we have allowed for the "feedback" of the success or failure of any component of the speech production process.
• Note the multiple "concentric" feedback loops, both "antidromic" and indirect. [Antidromic = back up the down channel, and possibly even back up the down neuron.]
• BUT WHICH TYPE OF FEEDBACK IS BEST? OR DO WE, PERHAPS, JUST NEED AS MUCH OF IT AS WE CAN GET OUR HANDS ON?
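One way to picture the direct kind of feedback is as an acknowledgement travelling back up the same channel that carried the instruction down, with the sender resending when a failure is reported. The sketch below assumes exactly that and nothing more; the `AckModule` class, the `send_with_resend` function, the `fail_first` parameter, and the limit of three tries are hypothetical illustration, not a claim about how the animated model implements its loops.

```python
class AckModule:
    """A module that reports success or failure back to whoever sent to it."""

    def __init__(self, name, downstream=None, fail_first=0):
        self.name = name
        self.downstream = downstream
        self.fail_first = fail_first  # assumed number of initial attempts that fail
        self.attempts = 0

    def process(self, content):
        """Return True to acknowledge, or False to report a problem upwards."""
        self.attempts += 1
        if self.attempts <= self.fail_first:
            return False
        if self.downstream is None:
            return True
        # Only acknowledge upwards once the module below has acknowledged.
        return send_with_resend(self.downstream, content)


def send_with_resend(target, content, max_tries=3):
    """Send content down, wait for the acknowledgement, and resend on failure."""
    for _ in range(max_tries):
        if target.process(content):
            return True
    return False


lower = AckModule("lower", fail_first=1)  # fails once, then succeeds
middle = AckModule("middle", downstream=lower)
ok = send_with_resend(middle, "Redcoats")
print(ok, middle.attempts, lower.attempts)
```

Here the momentary failure in the lower module is absorbed by one local resend, so the module above it never has to repeat its own work. Indirect feedback routes would need a separate return path and are not modelled in this sketch.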

FEEDBACK CONTROL (3)
• We may gain additional insight into what is involved by looking at feedback in the A-shaped diagram.
• It's everywhere! Even on the INPUT leg!
• Note especially the difference between KR (knowledge of results) and the aforementioned control interrupts.

FEEDBACK CONTROL (4)
• Here is the pay-off ... FEEDBACK MECHANISMS ARE MAJOR CONSUMERS OF SHORT-TERM MEMORY, SO, IN GETTING THE CONTROL LOOPS RIGHT, YOU GET THE BULK OF THE MEMORY REQUIREMENTS RIGHT AS WELL.
• Here are some of the memory stores required to support the arrows already specified ...
• C = caching buffers
• E = efference copy
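The link between control loops and memory can be sketched as follows: a sending module must hold each instruction in a buffer until it is acknowledged (so that it can be resent), and must keep a copy of what it commanded so that later feedback can be checked against it. The `SendingModule` class and its method names are hypothetical; the sketch only assumes that "C" and "E" behave as a pending-message cache and a comparison copy respectively.

```python
class SendingModule:
    """Keeps the short-term stores a sender needs in order to use feedback."""

    def __init__(self):
        self.cache = {}      # C: instructions held until they are acknowledged
        self.efference = {}  # E: copies of what was commanded, for later comparison
        self.next_id = 0

    def send(self, instruction):
        """Record an outgoing instruction in both stores and return its id."""
        msg_id = self.next_id
        self.next_id += 1
        self.cache[msg_id] = instruction
        self.efference[msg_id] = instruction
        return msg_id

    def acknowledge(self, msg_id):
        """An acknowledgement frees the cached copy; no resend will be needed."""
        self.cache.pop(msg_id, None)

    def compare(self, msg_id, observed):
        """Check the observed result against the efference copy, then discard it."""
        expected = self.efference.pop(msg_id)
        return expected == observed

    def pending(self):
        """Instructions that would have to be resent after an interrupt."""
        return [self.cache[i] for i in sorted(self.cache)]


module = SendingModule()
first = module.send("the")
second = module.send("redcoats")
module.acknowledge(first)
print(module.pending())
print(module.compare(first, "the"), module.compare(second, "redcoast"))
```

Each up arrow added to the diagram therefore implies at least one store of this kind at its upper end, which is the sense in which getting the loops right fixes most of the memory requirements.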

FEEDBACK CONTROL (5)
• ... which presents us with a number of opportunities for both conscious and unconscious experience.
HEARING YOUR OWN VOICE
HEARING YOUR INNER SPEECH
THE UNCONSCIOUS SENSE OF LUCIDITY WHICH COMES WHEN YOU FIND ALL THE WORDS YOUR IDEAS DEMAND (A SORT OF "MOT JUSTE EFFECT")
THE UNCONSCIOUS SENSE OF LUCIDITY WHICH COMES WHEN YOUR TONGUE DOES WHAT IT'S TOLD. NOTE THAT IF PHONO PROCESSING FAILS EVEN MOMENTARILY, THE INTERRUPT/RESEND MECHANISMS NEED TO BE INVOKED IN AN UPWARDS CASCADE.
CONFIRMATION THAT THE WORLD IS REACTING AS REQUESTED

FEEDBACK CONTROL (6) SPECIFIC RESEARCH ISSUES
• What is the down-module STM retention time when a block of instructions is received, and what is the nature of the code used?
• What is the nature of the down-module processing carried out on those instructions?
• What is the nature of the feedback loops in force?
• Are there differences in the up-module processing of the antidromic and reafferent feedback types?
• The caching and efference copy activities should already be visible in the functional neuroimaging literature, but, without an adequate reference model to go by, risk being misinterpreted as artifacts.

CONCLUSION: THE ARGUMENT IN A NUTSHELL
• We have been looking at the positioning of conscious experience within the complex of processing modules involved in speech praxis.
• We began by familiarising ourselves with the modules and stages of spoken language processing, firstly in static box-and-arrow diagram format, and secondly in computer animation.
• In the animation we recognised the “speech act” as a major feedforward instruction stream, and considered how and where this stream would interact with our semantic and awareness systems.
• A specific appeal for a cybernetic analysis of language production was then made, supported by a closer look at the distribution of different memory types in the different feedback circuits.
• Finally, we demonstrated how the act of animating those circuits can generate highly specific research questions, such as whether copies are taken and whether feedback is direct (antidromic) or indirect.

REFERENCES
• Austin, J. L. (1962). How to do Things with Words. Oxford: Oxford University Press.
• Fromkin, V. A. (1971). The non-anomalous nature of anomalous utterances. Language, 47: 27-52.
• Lecours, A. R., Nespoulos, J. L., and Pioger, D. (1987). Jacques Lordat or the Birth of Cognitive Neuropsychology. In Keller, E. and Gopnik, M. (Eds.), Motor and Sensory Processes in Language. Hillsdale, NJ: Erlbaum.
• Lichtheim, L. (1885). On aphasia. Brain, 7: 433-484.
• Lordat, J. (1843). Leçons tirées du cours de physiologie de l'année scolaire 1842-1843. Journal de la Société de médecine pratique de Montpellier, 7: 333-353; 7: 417-433, and 8: 1-17. [But reviewed in detail in Lecours, Nespoulos, and Pioger (1987).]
• Searle, J. R. (1969). Speech Acts: An Essay in the Philosophy of Language. Cambridge: Cambridge University Press.
• Smith, D. J. (1998). Commentary on "Cortical Activity and the Explanatory Gap" by J. G. Taylor. Consciousness and Cognition, 7: 214-215.
• Smith, D. J. (2000). A slow-motion video analysis of information feedback in a computer-animated psycholinguistic model. Computer-animated poster presented 10th April 2000 at the Tucson 2000 Towards a Science of Consciousness conference, University of Arizona, Tucson, AZ.
• Smith, D. J. (2005). On database keys, with an application to the Praxisproblem. In Callaos, N., Lesso, W., and Palesi, M. (Eds.), The 9th World Multi-Conference on Systemics, Cybernetics, and Informatics, July 10-13, 2005 - Orlando, Florida, USA (Volume IV). Orlando, FL: International Institute of Informatics and Systemics.