Dialogues
C: I want you to tell me the names of the fellows on the St. Louis team.
A: I'm telling you. Who's on first, What's on second, I Don't Know is on third.
C: You know the fellows' names?
A: Yes.
C: Well, then, who's playing first?
A: Yes.
C: I mean the fellow's name on first.
A: Who.
C: The guy on first base.
A: Who is on first.
C: Well what are you askin' me for?
A: I'm not asking you — I'm telling you. Who is on first.
Who's on First — Bud Abbott and Lou Costello's version of an old burlesque standard. [JM]
Dialogue is characterized by turn-taking: speaker A says something, then speaker B, then speaker A, and so on.
Having a turn (or "taking the floor") is a resource to be allocated; what are the processes involved in this allocation? How do speakers know when it is the proper time to contribute their turn?
It appears that conversation and language itself are structured in such a way as to deal efficiently with this resource allocation problem. One source of evidence for this is the timing of the utterances in normal human conversations. [JM]
Turn-taking behavior is generally studied in the field of Conversation Analysis (CA). Sacks et al. (1974) argued that turn-taking behavior is governed by a set of "turn-taking rules." These rules apply at a transition-relevance place (TRP): a point where the structure of the language allows a speaker shift to occur.
A version of the turn-taking rules, simplified from Sacks et al. (1974) by Jurafsky [JM]
Example Definition:
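At each TRP of each turn:
(a) If during this turn the current speaker has selected A as the next speaker, then A must speak next.
(b) If the current speaker does not select the next speaker, any other speaker may take the next turn.
(c) If no one else takes the next turn, the current speaker may take the next turn.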
Rule (a) implies that there are some utterances by which the speaker specifically selects who the next speaker will be.
The rules imply that transitions between speakers don't occur just anywhere; the transition-relevance places where they tend to occur are generally at utterance boundaries.
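The turn-taking rules can be read as a small decision procedure applied at each TRP. The following is a minimal sketch, assuming the usual three-part summary of Sacks et al. (1974): a selected speaker goes next, otherwise anyone else may self-select, otherwise the current speaker may continue. The function name `next_speaker`, its parameters, and the convention that the first listed volunteer "wins" are illustrative assumptions, not part of the original rules.

```python
from typing import Optional, Sequence


def next_speaker(current: str,
                 selected: Optional[str] = None,
                 volunteers: Sequence[str] = ()) -> str:
    """Decide who holds the floor after a transition-relevance place (TRP)."""
    # Rule (a): the current speaker selected someone, so that person speaks next.
    if selected is not None:
        return selected
    # Rule (b): nobody was selected, so any other speaker may take the turn.
    # Assumption: the first volunteer listed is the one who started speaking first.
    others = [v for v in volunteers if v != current]
    if others:
        return others[0]
    # Rule (c): no one else took the turn, so the current speaker may continue.
    return current


# A question addressed to B selects B as the next speaker (rule a).
print(next_speaker("A", selected="B"))
# Nobody was selected; C self-selects before B does (rule b).
print(next_speaker("A", volunteers=["C", "B"]))
# Silence from everyone else; A may go on (rule c).
print(next_speaker("A"))
```

Real conversation is messier than this (overlaps, pauses, and repairs all occur), so the sketch only captures the ordering of the rules, not their timing.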
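Austin (1962) distinguished three kinds of act performed in making an utterance: the locutionary act (uttering a sentence with a particular meaning), the illocutionary act (asking, answering, promising, and so on, in uttering it), and the perlocutionary act (producing effects on the feelings, thoughts, or actions of the addressee). The term speech act is generally used to describe illocutionary acts rather than either of the other two. [JA]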
Searle (1975b) suggested a modified taxonomy with all speech acts classified into one of five major classes: assertives, directives, commissives, expressives, and declarations.
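Searle's five classes (assertives, directives, commissives, expressives, declarations) map naturally onto an enumeration. The sketch below is only illustrative: the `SpeechAct` enum, the tiny `PERFORMATIVE_VERBS` table, and the `classify` helper are hypothetical names, and keyword lookup is a toy stand-in for real dialogue act tagging, which needs context and cannot rely on explicit performative verbs.

```python
from enum import Enum
from typing import Optional


class SpeechAct(Enum):
    """Searle's five classes, each with a one-line gloss."""
    ASSERTIVE = "commits the speaker to something being the case"
    DIRECTIVE = "attempts to get the addressee to do something"
    COMMISSIVE = "commits the speaker to a future course of action"
    EXPRESSIVE = "expresses the speaker's psychological state"
    DECLARATION = "changes the state of the world by being uttered"


# Hypothetical lookup table: one explicit performative verb per class.
PERFORMATIVE_VERBS = {
    "conclude": SpeechAct.ASSERTIVE,
    "request": SpeechAct.DIRECTIVE,
    "promise": SpeechAct.COMMISSIVE,
    "apologize": SpeechAct.EXPRESSIVE,
    "resign": SpeechAct.DECLARATION,
}


def classify(utterance: str) -> Optional[SpeechAct]:
    """Return the class of the first known performative verb, if any."""
    for word in utterance.lower().split():
        act = PERFORMATIVE_VERBS.get(word.strip(".,!?"))
        if act is not None:
            return act
    return None  # no explicit performative verb found


print(classify("I promise to call you tomorrow."))
print(classify("Nice weather today."))
```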
Component architecture of a conversational agent [JM]
     --------------------       --------------------------
--> | Speech recognition | --> | Nat. Lang. understanding | ---.
     --------------------       --------------------------     |
                                                               V
                                                     ------------------        --------------
                                                    | dialogue manager | <--> | task manager |
                                                     ------------------        --------------
                                                               |
     ----------------       ------------                       |
<-- | text-to-speech | <-- | Nat. Lang. | <--------------------'
    | synthesis      |     | generation |
     ----------------       ------------
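The component architecture can be sketched as a chain of functions around a dialogue manager that consults a task manager. Everything below is a toy under stated assumptions: the function and class names, the single `find_flight` intent, and the flight code `XX100` are all hypothetical, and each component is a stub standing in for a real recognizer, parser, generator, or synthesizer.

```python
from dataclasses import dataclass, field
from typing import Dict, List


def recognize_speech(audio: str) -> str:
    """Stub recognizer: the 'audio' is already text in this toy."""
    return audio.lower().strip()


def understand(words: str) -> Dict[str, str]:
    """Stub understanding: map the word string to a tiny semantic frame."""
    if "flight" in words:
        return {"intent": "find_flight"}
    return {"intent": "unknown"}


class TaskManager:
    """Holds knowledge about the task domain (hypothetical data)."""

    def lookup(self, frame: Dict[str, str]) -> Dict[str, str]:
        if frame["intent"] == "find_flight":
            return {"flight": "XX100"}
        return {}


@dataclass
class DialogueManager:
    """Tracks the dialogue and decides what the system does next."""
    task_manager: TaskManager
    history: List[Dict[str, str]] = field(default_factory=list)

    def next_action(self, frame: Dict[str, str]) -> Dict[str, str]:
        self.history.append(frame)                 # remember the user's turn
        result = self.task_manager.lookup(frame)   # two-way link to the task
        if result:
            return {"act": "inform", **result}
        return {"act": "clarify"}


def generate(action: Dict[str, str]) -> str:
    """Stub generation: turn the chosen action into a sentence."""
    if action["act"] == "inform":
        return "I found flight " + action["flight"] + "."
    return "Sorry, could you rephrase that?"


def synthesize(text: str) -> str:
    """Stub synthesis: wrap the text instead of producing audio."""
    return "<speech>" + text + "</speech>"


def respond(audio: str, dm: DialogueManager) -> str:
    """One pass through the loop, from user speech in to system speech out."""
    frame = understand(recognize_speech(audio))
    return synthesize(generate(dm.next_action(frame)))


dm = DialogueManager(TaskManager())
print(respond("Book a FLIGHT please", dm))
print(respond("hello", dm))
```

Keeping the dialogue manager as the only stateful component mirrors the diagram, where it sits between understanding and generation and is the only box that talks to the task manager.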
Architecture of a generator portion of a dialogue system [Walker and Rambow 2002].
What to say     How to say it
-----------     --------------------------------------------------------
Content         Sentence       Surface        Prosody        Speech
Planner    -->  Planner   -->  Realizer  -->  Assigner  -->  Synthesizer
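The five generator stages split into one "what to say" step and four "how to say it" steps. The sketch below illustrates only that division of labor; the stage functions, the dictionary formats passed between them, the example goal, and the rule of accenting the final word are all hypothetical simplifications, not the design of Walker and Rambow's system.

```python
from typing import Dict, List, Tuple


def content_planner(goal: Dict[str, str]) -> List[Dict[str, str]]:
    """What to say: choose the propositions to convey."""
    return [{"flight": goal["flight"], "time": goal["time"]}]


def sentence_planner(props: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """How to say it, step 1: choose words and sentence structure."""
    return [{"subject": "Flight " + p["flight"],
             "verb": "departs",
             "modifier": "at " + p["time"]} for p in props]


def surface_realizer(plans: List[Dict[str, str]]) -> str:
    """Step 2: linearize each sentence plan into a word string."""
    return " ".join(p["subject"] + " " + p["verb"] + " " + p["modifier"] + "."
                    for p in plans)


def prosody_assigner(text: str) -> List[Tuple[str, bool]]:
    """Step 3 (toy rule): mark only the final word as accented."""
    words = text.split()
    return [(w, i == len(words) - 1) for i, w in enumerate(words)]


def speech_synthesizer(marked: List[Tuple[str, bool]]) -> str:
    """Step 4 (stub): show accents as upper case instead of producing audio."""
    return " ".join(w.upper() if accented else w for w, accented in marked)


def generate(goal: Dict[str, str]) -> str:
    """Run the stages in the order shown in the diagram."""
    props = content_planner(goal)
    return speech_synthesizer(
        prosody_assigner(surface_realizer(sentence_planner(props))))


print(generate({"flight": "XX100", "time": "9 am"}))
```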