Speech and Language Technologies for STEM (Science, Technology, Engineering and Mathematics) Education

Abeer Alwan, Maxine Eskenazi, Diane Litman, Martin Russell, Klaus Zechner

SLTC Newsletter, July 2012

Over recent years education has become an established application area for speech and language technology. The first STiLL (Speech Technology in Language Learning) workshop took place in Marholmen in Sweden in 1998, and 2006 saw the creation of the ISCA SIG (Special Interest Group) on Speech and Language Technology in Education (SLaTE). Currently, SLaTE is dominated by applications in literacy and language learning, and a study of the technical programs of the biennial SLaTE workshops held in 2007, 2009 and 2011 reveals only a handful of contributions outside this area. However, it seems likely that there are compelling applications of speech and language technology in other areas of learning, and particularly in the STEM subjects (Science, Technology, Engineering and Mathematics). This article reviews some existing work on applications of speech and language technologies in STEM education, and previews the Special Session on “Speech Technologies for STEM” at Interspeech 2012 in Portland.


It is well known that the effectiveness of educational software can be greatly enhanced by the incorporation of speech and language technology. Since speech is a natural way to communicate, students accept it as a valid means of interacting with tutoring and learning systems. The potential advantages of speech will already be very familiar to readers of this newsletter: it enables those whose hands are busy to maintain interaction with the tutoring system (for example, when learning to manipulate something in a science laboratory); it allows students to gesture with their hands while they interact with the system; it promotes a larger number of interactions; it enables much more fluid and natural feedback in the learning process (for example, when students are asked to perform a think-aloud exercise); it enables those whose eyes are busy, such as in a fast-paced educational game, to have richer interaction with the system, and has been shown to increase learning; it enables those who cannot yet read to interact fully, instead of relying on passive work or on the use of icons; and it enables non-native elementary students to have a more fulfilling learning experience.

Applications of Speech and Language Technology in STEM Education

Some groups are already working on applications of speech and language technology in STEM education. An early example, in the domain of circuit design, is described in (Smith, 1996). More recently, "My Science Tutor", an intelligent science tutor being developed for children aged between 7 and 11, allows the child to communicate with a virtual tutor using speech (Ward et al. 2006). Initial results suggest that the use of a conversational multimedia virtual tutor promotes student engagement, interest and motivation. However, for automated interactive literacy and language tuition, the advantages of speech and language technology extend beyond the provision of a natural and intuitive interface. For example, they are the key enabling technologies for automatic pronunciation verification. Similarly, speech can be more than an interface for interactive STEM tutors. Speech provides a window into the learner's emotional state, which can be used to infer key factors such as the learner's degree of engagement and uncertainty. This, in turn, can influence the system's response to a correct or incorrect answer from the learner. For examples in physics tuition and shipboard damage control, see (Forbes-Riley and Litman, 2011) and (Pon-Barry et al., 2006), respectively.

Systems that use text for natural language interaction with STEM tutoring systems are more common than those that use speech. The relative educational benefits of speech and text-based interaction with intelligent tutoring systems that support natural language dialogues are discussed in (D'Mello et al. 2011). Text-based intelligent STEM tutors have been investigated for physics (for example, Chi et al. 2011, Katz et al. 2011, Vanlehn et al. 2007, Katz et al. 2007), computer science (for example, Kersey et al. 2010) and thermodynamics (for example, Rose et al. 2004).

Interspeech 2012 Special Session on Speech Technologies for STEM

The relative lack of significant activity in this area led to a successful proposal for a Special Session on “Speech Technologies for STEM” at Interspeech 2012 in Portland, Oregon (to be held on Tuesday 11th September 10:00-12:00). This session is intended to serve as a starting point for exploration of this new direction, revealing what has been learned so far about the use of speech and language in education that can be applied across learning disciplines, demonstrating what has been done so far in the area of STEM learning, and charting out possible research directions for future work. The special session will be sponsored by the U.S. National Science Foundation (NSF) as a vehicle to explore the possibility of a new research initiative based on its findings.

Eight papers will be presented at the Special Session, including papers that address some of the issues concerning the application of speech and language technology in STEM education that are discussed above. In addition, the session will cover the more general use of speech technology and spoken dialogue systems for education, intelligent tutoring systems using speech, the development of spoken language resources for educational applications, and the proper assessment of speech and speech technology applications.

Further information

For more information, see:

Abeer Alwan is with the Speech Processing and Auditory Perception Laboratory in the Electrical Engineering Department at UCLA, USA (email: alwan@ee.ucla.edu); Maxine Eskenazi is Principal Systems Scientist at the Language Technologies Institute, Carnegie Mellon University, Pittsburgh, USA (email: max+@cs.cmu.edu); Diane Litman is with the Department of Computer Science at the University of Pittsburgh, USA (email: litman@cs.pitt.edu); Martin Russell is in the School of Electronic, Electrical and Computer Engineering at the University of Birmingham, UK (email: m.j.russell@bham.ac.uk); Klaus Zechner is a Managing Senior Research Scientist at the Educational Testing Service (email: kzechner@ets.org).