«Far East Journal of Experimental and Theoretical Artificial Intelligence Volume 1, Issue 2, 2008, Pages 87-125 Published online: August 12, 2008 This ...»
(QUESTION (EXPLAIN *RAP*))
A hedged answer may express uncertainty. The input understander recognizes that the student is hedging when the input contains an adverb like “probably” or “maybe”, or ends with one or more question marks as in “CO???”. Another frequently attested hedge in our human tutoring sessions is a variation of “I think”, as in the examples below.
K20-st-35-5: SO I THINK THAT SV GOES UP.
K20-st-37-1: SINCE CC IS NOT CHANGING THEN I WOULD THINK NO CHANGE IN SV.
K20-st-37-4: D THEN RAPI I THINK SV I.
In transcripts of human tutoring, “how about” can be ambiguous between a marker for a hedged answer and a marker for an initiative. In this example the student uses “How about” to hedge an answer:
K25-tu-52-2: But what determines the volume of blood in the central venous compartment?
K25-st-53-1: How about co?
But our expert tutors often use “How about” to ask a question:
K11-tu-65-2: How about the influence of a change in CO on RAP?
K11-st-66-1: Ico-Ivenous pressure.
MICHAEL S. GLASS and MARTHA W. EVENS
Hedge language is stripped out before the sentence is analyzed further. Michael and Rovick decided after the first eight transcribed human tutoring sessions that they should stop responding to hedges, but in fact they are observed to sometimes give more explicit explanations and more enthusiastic positive acknowledgments when students hedge. Recently Pon-Barry et al. have shown that responding to student uncertainty in the SCoT tutor improves learning outcomes.
Beyond recognizing hedges as an input phenomenon, CIRCSIM-Tutor does not take hedges into account in its planning, but this might be an opportunity for improvement. In fact, we have seen very few hedges and student initiatives in our trials with CIRCSIM-Tutor so far, but we may see more as the system’s natural language capabilities improve.
Processing with a cascade of finite state transducers
Cascaded finite state transducers have often been used in information extraction tasks; a good example is FASTUS. Finite state machines are popular because they are fast and modular. Their running time is linear in the length of the input, while most algorithms for context-free grammars are slower. When transducers are cascaded, each machine produces an output that is some modification of the input. For example, a common model is to construct a transducer that takes input with the parts of speech marked, looks for noun phrases, and outputs the same string with markers at the beginning and end of each noun phrase. This output might then be input to a transducer that looks for coreference between the phrases found by the previous machine. The output of the last machine in the sequence is used for constructing the input understander’s result, which is passed back to the planner.
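The cascade idea can be illustrated with a small Python sketch. It is not the system's LISP code: the tag names, the marker tokens, and the functions mark_noun_phrases, collect_noun_phrases and run_cascade are all hypothetical, and the second stage simply collects the bracketed phrases rather than resolving coreference. Each stage is a single left-to-right pass, so the running time stays linear in the length of the input.

```python
# Illustrative cascade: every stage maps a token list to a new token list.
# Tags and marker names are assumptions made for this sketch.
NOMINAL_TAGS = {"det", "adj", "noun"}

def mark_noun_phrases(tokens):
    """Stage 1: wrap each maximal run of det/adj/noun tokens in markers."""
    out, inside = [], False
    for word, tag in tokens:
        if tag in NOMINAL_TAGS and not inside:
            out.append(("[NP", "open"))
            inside = True
        elif tag not in NOMINAL_TAGS and inside:
            out.append(("]", "close"))
            inside = False
        out.append((word, tag))
    if inside:
        out.append(("]", "close"))
    return out

def collect_noun_phrases(tokens):
    """Stage 2: read the markers left by stage 1 and return the phrases."""
    phrases, current = [], None
    for word, tag in tokens:
        if tag == "open":
            current = []
        elif tag == "close":
            phrases.append(" ".join(current))
            current = None
        elif current is not None:
            current.append(word)
    return phrases

def run_cascade(stages, tokens):
    """Feed the output of each stage to the next one."""
    for stage in stages:
        tokens = stage(tokens)
    return tokens

tagged = [("the", "det"), ("cardiac", "adj"), ("output", "noun"),
          ("falls", "verb"), ("quickly", "adv")]
phrases = run_cascade([mark_noun_phrases, collect_noun_phrases], tagged)
```

Because each stage only sees the previous stage's output, a stage can be replaced or reordered without touching the others, which is the modularity the text refers to.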
Figure 4 is an example of a non-deterministic Mealy machine finite state transducer, a graph of state nodes connected by arcs. An arc consists of a label, which is matched against a symbol from the input string, and a string of output symbols to emit when the label is matched.
EXTRACTING INFORMATION FOR AN ITS
For example, in Figure 4 one arc matches any noun from the input and emits the same word, the next arc matches “is” and emits nothing, and another arc matches verb participles (designated “v-ing” and “v-en”). This fragment of transducer will, for example, transform “pressure is decreasing” to “pressure decreasing”. Each machine has a comparison function that is used for matching input symbols against arc labels. The interpreter for these machines, a LISP function, iterates over the input string, making state transitions, and collecting emitted symbols until the input string has been consumed. The machines are non-deterministic since it is possible for more than one arc or even none to match in a given state. The interpreter mechanism maintains a set of all possible routes, where each route contains a possible current state together with the list of symbols emitted en route to the state.
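A minimal Python sketch of such an interpreter follows. It is an illustration under stated assumptions, not the LISP function itself: the lexicon, the state names, the arc encoding (label, echo flag, target state) and the names matches and run are all invented for the example, and only the three-arc fragment described above is modelled.

```python
# Hypothetical lexicon mapping words to the categories used as arc labels.
LEXICON = {"pressure": "noun", "volume": "noun", "is": "is",
           "decreasing": "v-ing", "increased": "v-en"}

def matches(label, word):
    """Comparison function: does the word's category equal the arc label?"""
    return LEXICON.get(word) == label

# state -> list of arcs; each arc is (label, echo input word?, next state)
ARCS = {
    "start":      [("noun", True, "after-noun")],
    "after-noun": [("is", False, "after-is")],        # matches "is", emits nothing
    "after-is":   [("v-ing", True, "done"), ("v-en", True, "done")],
}

def run(arcs, start, finals, words):
    """Track every possible route as a (current state, emitted symbols) pair."""
    routes = [(start, [])]
    for word in words:
        next_routes = []
        for state, emitted in routes:
            for label, echo, target in arcs.get(state, []):
                if matches(label, word):
                    next_routes.append((target, emitted + ([word] if echo else [])))
        routes = next_routes      # routes with no matching arc die out here
    return [emitted for state, emitted in routes if state in finals]

result = run(ARCS, "start", {"done"}, "pressure is decreasing".split())
```

Here result holds one surviving route whose emitted symbols are "pressure" and "decreasing". An input that no arc sequence accepts leaves the route set empty, which is how this sketch represents "none match".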
The question at hand partly determines which finite state transducers process the student’s utterance, thus extracting the information needed for answering the question. For example, to process the answer to a question about the relationship between two variables, machines for recognizing variable names and relationships are employed.
If the question was about the qualitative change in a variable, the variable name machine is used in conjunction with a qualitative change machine.
CIRCSIM-Tutor’s language understanding issues can be enumerated by looking at the finite state machines we built to handle each one. We describe the function of many of these here.
The finite state machine in Figure 4 deletes finite forms of the verb “to be” in student inputs like “SV is not changed” and “it is SV”. We need a special machine to perform copula deletion in CIRCSIM-Tutor because one of the important domain parameters is named “Inotropic State”, usually abbreviated “IS”. We cannot rely on case to distinguish the abbreviation “IS” from the copula “is”. This machine will not alter the input “is increased” or “sv increased”, but “sv is increased” will be changed to “sv increased” and “is is increased” becomes “is increased”.
This machine, the first machine in many of the cascades, illustrates the style of the finite state transducer. After it runs, succeeding machines can assume that any occurrence of “is” is a reference to Inotropic State, not a finite verb.
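The behaviour described for the copula deletion machine can be approximated in Python as follows. This is a deliberately simplified, deterministic stand-in for Figure 4, not a transcription of it: the NOMINALS set and the rule "an is directly after a nominal is a copula" are assumptions chosen so that the examples quoted above come out as stated.

```python
# Hypothetical set of words that can precede a copula. "is" is included
# because it may itself be the parameter name Inotropic State (IS).
NOMINALS = {"sv", "co", "rap", "tpr", "it", "is"}

def delete_copula(words):
    """Drop an 'is' that directly follows a nominal; keep every other word."""
    kept = []
    after_nominal = False
    for word in words:
        lowered = word.lower()
        if lowered == "is" and after_nominal:
            after_nominal = False   # copula consumed, nothing emitted
            continue
        kept.append(word)
        after_nominal = lowered in NOMINALS
    return kept

examples = {
    "is increased": delete_copula("is increased".split()),
    "sv increased": delete_copula("sv increased".split()),
    "sv is increased": delete_copula("sv is increased".split()),
    "is is increased": delete_copula("is is increased".split()),
}
```

Because case is ignored, the sketch treats "IS" and "is" identically, which mirrors the reason given above for needing a dedicated machine.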
Figure 4. Finite state machine for copula deletion.
Since CIRCSIM-Tutor is designed to teach qualitative causal reasoning it spends much of its time discussing qualitative changes in parameters. Several finite state machines in concert detect whether the student is talking about such a change. One machine looks for parameter names, another for qualitative changes (up, down, or no change), another for combinations of the two. The meaning attributes are emitted as shown in Table 2. For example, the word “afterload” is translated into “MAP” (short for Mean Arterial Pressure). Any word not selected in this process is dropped from the string and ignored in further steps of the translation process.
The negation machine looks for a negation followed by a qualitative change and combines the two, so “doesn’t change” is transformed into “neg + change” in the parameter extraction step and then into “nochange” in the next stage. Another machine tries to recognize whether a relationship between two parameters is being described as direct or inverse.
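The two steps just described, extraction followed by negation combining, can be sketched in Python. Only the mapping of “afterload” to “MAP” and the “neg” + “change” to “nochange” rewrite come from the text; every other vocabulary entry, the symbol names “up” and “down”, and the function names extract and combine_negation are assumptions, since Table 2 is not reproduced in this sketch.

```python
# Assumed vocabularies; only afterload -> MAP is taken from the text.
PARAMETERS = {"afterload": "MAP", "map": "MAP", "sv": "SV", "co": "CO"}
CHANGES = {"increases": "up", "rises": "up", "decreases": "down",
           "falls": "down", "change": "change", "changes": "change"}
NEGATIONS = {"doesn't", "not", "no"}

def extract(words):
    """Stage 1: emit meaning symbols; any unrecognized word is dropped."""
    symbols = []
    for word in words:
        lowered = word.lower().replace("’", "'")   # normalize curly apostrophe
        if lowered in PARAMETERS:
            symbols.append(PARAMETERS[lowered])
        elif lowered in CHANGES:
            symbols.append(CHANGES[lowered])
        elif lowered in NEGATIONS:
            symbols.append("neg")
    return symbols

def combine_negation(symbols):
    """Stage 2: rewrite the pair 'neg', 'change' as the single symbol 'nochange'."""
    out, i = [], 0
    while i < len(symbols):
        if symbols[i] == "neg" and i + 1 < len(symbols) and symbols[i + 1] == "change":
            out.append("nochange")
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return out

stage1 = extract("afterload doesn't change".split())
stage2 = combine_negation(stage1)
```

Running the stages in this order means the negation stage never has to look at raw words, only at the small symbol alphabet produced by extraction.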
Sometimes students use “D” and “+” to indicate that a relationship is direct; sometimes these same symbols are used for the qualitative changes “decrease” and “increase”, respectively. The polysemy is resolved because the question at hand determines which transducer is applied to the student’s utterance: either the relationship recognizer or the qualitative change recognizer.
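A small Python sketch shows how selecting the recognizer by question type removes the ambiguity. The question-type keys, the output symbols, and the function interpret are hypothetical; only the two readings of “D” and “+” are taken from the text.

```python
# Two recognizers that read the same surface symbols differently.
RELATIONSHIP = {"d": "direct", "+": "direct", "direct": "direct", "inverse": "inverse"}
QUALITATIVE = {"d": "down", "+": "up", "decrease": "down", "increase": "up"}

# Assumed question-type names used to pick a recognizer.
RECOGNIZERS = {"relationship": RELATIONSHIP, "qualitative-change": QUALITATIVE}

def interpret(question_type, words):
    """Apply only the recognizer that the pending question calls for."""
    table = RECOGNIZERS[question_type]
    return [table[w.lower()] for w in words if w.lower() in table]

as_relationship = interpret("relationship", ["D"])
as_change = interpret("qualitative-change", ["D"])
```

The same token therefore never has to be disambiguated in isolation; the pending question supplies the missing context.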
The topic of neural control of variables is the central issue in the problems that CIRCSIM-Tutor’s students solve. It is important that the system recognize a fairly fine-grained variety of student inputs here, since in human tutoring we observe a lot of attention to the details of neural control. The neural mechanism finite state machine recognizes answers to the question “By what mechanism is X controlled?” It looks for an optional parameter name plus anything that can be matched to the mechanism ontology shown in Figure 5. It therefore recognizes “sympathetics” and “TPR neural” and “TPR controlled by nervous system” as meaning that total peripheral resistance is a neural variable.
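As a rough Python illustration of the "optional parameter name plus mechanism term" pattern, the sketch below handles the three example answers. It is a strong simplification: the real ontology in Figure 5 is finer grained, and the term list, the PARAMETERS table, and the function recognize_mechanism are assumptions made for this example only.

```python
# Assumed vocabulary; the real mechanism ontology distinguishes more categories.
PARAMETERS = {"tpr": "TPR"}
NEURAL_TERMS = {"sympathetics", "sympathetic", "neural", "nervous"}

def recognize_mechanism(words):
    """Return (parameter or None, mechanism or None) found in the answer."""
    parameter, mechanism = None, None
    for word in words:
        lowered = word.lower()
        if lowered in PARAMETERS and parameter is None:
            parameter = PARAMETERS[lowered]
        elif lowered in NEURAL_TERMS:
            mechanism = "neural"
    return parameter, mechanism

answers = [recognize_mechanism(text.split()) for text in
           ("sympathetics", "TPR neural", "TPR controlled by nervous system")]
```

A fuller version would return the specific ontology node that matched rather than only the top-level category, so that near misses could be distinguished from exact answers.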
Figure 5. Ontology of mechanism answers.
Another purpose of the neural control ontology is to recognize student answers that are correct, and thus should not be contradicted, but to which the human tutors nevertheless respond because of fine distinctions in the student’s physiological language. In this example the tutor subtly changes the student’s answer “sympathetic vasoconstriction” to “neural”:
K11-tu-49-3: How is TPR controlled?
K11-st-50-1: Sympathetic vasoconstriction
K11-tu-51-1: Right. TPR is primarily under neural control.
Although the input understander’s ontology is capable of identifying these linguistic near misses, the planner has no response for them at this time.
Producing the logic form and checking for errors
Unless a student initiative was recognized and reported back to the planner, the result of the input understanding process is typically a representation of an answer to the question most recently asked. For example, the question represented by:
causes the input understander to emit back to the planner a form:
(ANSWER (AFFECTED-BY var ((varlist)))),
where var and varlist contain information extracted from the student’s answer. The output of the final transducer is assembled into the required logic form, using a different subroutine for each question.
Otherwise, if the input understander did not extract an answer to the question at hand, it attempts conversational repair with a message that explains what kind of input the system is expecting [30, 31]. The earlier version of CIRCSIM-Tutor responded to unrecognized input with “I am sorry. I did not understand you. Please rephrase”. This response did nothing to repair the problem. In the current regime, dialogues such as the following are common:
T: Is the relationship from Stroke Volume to Cardiac Output direct or is it inverse?
T: Didn’t recognize directly or inversely related.
A complete list of these messages is given in Table 6 below along with the frequency of occurrence of each message in the trials with students.
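The repair behaviour can be sketched in Python as a lookup keyed by the kind of question that was asked. Both message strings are quoted from the text above; the question-type key, the function respond, its return format, and the choice to fall back on the older generic message when no specific one is known are assumptions of this sketch, not a description of the system.

```python
# Question-specific repair messages; only one example is quoted in the text.
REPAIR_MESSAGES = {
    "relationship": "Didn't recognize directly or inversely related.",
}
# Message used by the earlier version of the system, kept here as a fallback.
GENERIC_MESSAGE = "I am sorry. I did not understand you. Please rephrase"

def respond(question_type, extracted_answer):
    """Pass an extracted answer on, or explain what input was expected."""
    if extracted_answer is not None:
        return ("ANSWER", extracted_answer)
    return ("REPAIR", REPAIR_MESSAGES.get(question_type, GENERIC_MESSAGE))

repair = respond("relationship", None)
accepted = respond("relationship", "direct")
```

The point of the design is that the repair message tells the student which category of answer was expected, which the generic message could not do.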
CIRCSIM-Tutor was evaluated by having a class of medical students use it. In this paper we evaluate the performance of the new input understanding component by illustrating the types of student input phenomena encountered and seeing how well it responded. We also briefly note learning gains, which are more fully evaluated elsewhere [30, 54].
Michael and Rovick arranged to test the system in a regularly scheduled laboratory in their first-year physiology class at Rush Medical College in November, 1999. The students first took a paper pre-test for thirty minutes. Then they worked with CIRCSIM-Tutor for a full hour. Afterwards, they took a post-test and filled out a questionnaire about their reactions to the experience, also on paper. To validate the two test instruments used in pre- and post-testing, half the students used test A as a pre-test and half used test B as a pre-test, with the tests switched for post-testing. Of the 42 students who participated, 14 worked together in pairs on the machines but took the pre- and post-tests individually, resulting in 35 computer sessions and 42 pre-tests and post-tests.
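The session count can be checked with a short calculation, assuming that each pair shared one session and every other student worked alone. The variable names are illustrative.

```python
students = 42            # participants, each with an individual pre- and post-test
paired_students = 14     # students who worked in pairs

pair_sessions = paired_students // 2          # 7 shared sessions
solo_sessions = students - paired_students    # 28 individual sessions
sessions = pair_sessions + solo_sessions      # total computer sessions
```

Under that assumption the total is 7 + 28 = 35 sessions, matching the figure reported above.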
Evidence of learning gain
In the majority of the sessions (21 out of the 35) all 8 procedures supported by CIRCSIM-Tutor were completed. All but 4 sessions involved at least 5 completed procedures. The distribution is shown in Table 3.
Comparison of the pre-test and post-test results in Table 4 shows that the students indeed learned some physiology from their experience with CIRCSIM-Tutor. The first section of each test asked the student to recall all twelve physiological relationships and whether they are directly or inversely related, yielding a top score of 24. The second section of each test asked for 21 predictions (for 7 variables at 3 different stages) as a result of a perturbation. This part of the test is closest to the problem-solving task in the tutoring sessions. The maximum number of correct predictions is 21.
In the entire group of 42 students, there is a significant improvement from a mean score of 13.52 to 16.33 on the relation task (Section 1), with a one-tailed t-test result of p = 0.024. The group shows an improvement from a mean score of 13.40 to 17.00 on the prediction task (Section 2), with a one-tailed t-test result p = 0.000036.
To check that pre- and post-testing were comparable, students who used test A as a pre-test are shown above the line in Table 4 and those who used the same test as a post-test are shown below it. The pre-A, post-B group showed significant improvement in both the relation and prediction tasks, with means increasing from 13.52 to 17.4 (p = 0.02) and from 13.17 to 17.29 (p = 0.004), respectively. The pre-B, post-A group showed significant improvement on the prediction task, with the mean rising from 13.56 to 16.8 (p = 0.002), but improvement on the relation task was not significant, with the mean rising from 13.52 to 15.6 (p = 0.15). Despite the possibility that the two tests were not perfectly matched, we can conclude that CIRCSIM-Tutor is an effective learning tool.