For assignment two, we worked on a group design, but each of us contributed an individual paper on the design topic at hand. Each group was to design a cellphone application. We were given a list of topics, but I volunteered our group to design a novel idea: a coffee-finding search application.
Because I’m a computational linguist by prior education and experience, I decided that we should design an application that allows for fleximodal interaction: that is, an interaction design that lets users switch flexibly between modes of conversation, depending on where they are in the conversational task. The two modes are SMS text and speech. With no further fanfare, here is an abstract and link to the paper, along with a simple automaton that captures the essence of the concept.

Please cite this paper if you reference material from it.
Abstract
Theories of conversation apply very well in this era of short messages and rapid exchange on mobile devices. In particular, conversational models for information-seeking dialogues have advanced nearly to the point of commercialization (Larsson and Villing 2007). Such models apply well to both speech and text messaging interaction on cellphones. Allowing the user to interact in short messages, using whichever channel is most efficient at that point in the interaction, may provide a means of improving both the efficiency of search and the relevance of search results, as well as the experience of search on cellphones.
In designing a novel fleximodal cellphone interaction, I studied both the GOOG-411 and Tellme.com voice-interactive systems. Here are my conclusions. If they make no sense, read the paper! I’ll follow up with sample transcripts in my next post.
Despite the limitations of VoiceXML, text messaging formats, and the availability of Location-Based Services (LBS), it is possible to make fairly significant improvements to both the efficiency of location search and the relevance of search results. Below are a number of specific observations made by closely examining Tellme and Google voice and SMS location search.
- Avoid presuming that a user’s search criteria are based solely on location. Both the Tellme and GOOG-411 voice systems appear to make this presumption. Users may need to review results in spatial or social contexts before requesting directions to one or another.
- Extend conversational interaction by using dialogue context, so that users can incrementally filter search results and pursue multiple search alternatives without losing the context of the broader search. There is a hint of this in both of the SMS systems provided by Tellme and Google. However, once a user selects a particular location, the query is considered complete. There is no way for the user to back up to his original goal. Given the limitations of VoiceXML, this may prove to be a serious technical challenge for voice. It should, however, be quite achievable in SMS interaction.
- Allow for semantic accommodation and thus enable a highly efficient text search capability. This should include resolving pronominal reference as well as allowing the user to volunteer information that answers questions the system has not yet asked.
- Make few assumptions about whether a user should use text or voice at any point during an interaction. The interaction should feel seamless, and the user should choose the mode depending on his cognitive state and the task at hand. The user should not feel as if interaction styles or task plans have changed.
- Improve the relevance of search by incorporating user-contributed data (e.g., ratings, comments, photos) and allowing users to search and sort along these criteria.
- Use GPS location data dynamically during conversational interaction, instead of passing a user off to a separate GPS mapping service. Such information can be used in a variety of ways, for example to recognize when a user has reached a destination. For now, however, naive users are averse to installing custom software, so this is not necessarily the best option.
- There is much potential for expanding social services by enabling seamless sharing of location with friends through automatic SMS updates and a shared map.
- Provide for an online chat recommender system. If a user is in a new city, he may wish to consult with locals. It may be useful to include a web chat capability to speak with other users more directly. It may also be useful to mine web chat conversation for messages pertaining to a particular query response.
- Work with other vendors and standards bodies to address limitations of VoiceXML and SMS technologies.
- Encourage the development and use of a threaded conversation management tool for both SMS and voice messages. Currently, cellphone providers see these two modalities as separate and it’s not possible to integrate the two in a single log.
- Finally, though this was not discussed above, generating both the VoiceXML and SMS dialogue recipes from a single source may be valuable for keeping the interaction style and strategies consistent across modalities, and for keeping user experience, dialogue design, and capabilities development in lockstep.
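
To make the dialogue-context observation a little more concrete, here is a minimal Python sketch of how a search session might keep earlier result sets around so that a user can refine a search, back up, and switch between SMS and voice without losing context. This is not taken from the paper or from any Tellme or Google system. The `Place` and `DialogueContext` classes, their fields, and the cafe names are all hypothetical, invented purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Place:
    # Hypothetical record for one search result.
    name: str
    distance_km: float
    rating: float
    wifi: bool


@dataclass
class DialogueContext:
    """Hypothetical session state shared by the SMS and voice channels."""
    results: list
    history: list = field(default_factory=list)   # earlier result sets
    channels: list = field(default_factory=list)  # channel used on each turn

    def refine(self, predicate, channel):
        """Narrow the current results, remembering the previous set."""
        self.history.append(self.results)
        self.channels.append(channel)
        self.results = [p for p in self.results if predicate(p)]
        return self.results

    def back(self, channel):
        """Return to the result set that preceded the last refinement."""
        self.channels.append(channel)
        if self.history:
            self.results = self.history.pop()
        return self.results


# Made-up sample data for the coffee-finding scenario.
places = [
    Place("Bean There", 0.4, 4.5, True),
    Place("Drip Stop", 1.2, 3.8, False),
    Place("Mug Shot", 2.5, 4.7, True),
]

ctx = DialogueContext(results=places)
ctx.refine(lambda p: p.wifi, channel="sms")                 # "with wifi" by text
ctx.refine(lambda p: p.distance_km < 1.0, channel="voice")  # "under a kilometer" by voice
ctx.back(channel="sms")                                     # "go back" by text
print([p.name for p in ctx.results])  # ['Bean There', 'Mug Shot']
```

The point of the sketch is only that the context belongs to the conversation rather than to the channel: each turn records which mode was used, but refining and backing up work the same way in either one.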