Mobile Voice - Voice UI for the Blind

Won the Dr. James R. Chen Award for Outstanding Master's Project at UC Berkeley

w/ Wenqin Chen and Pi-Tan Hu
 

Mobile Voice is a framework for designing and developing usable voice interfaces for the blind. It is an ongoing master's thesis project on which I collaborated with two fellow Berkeley students, Wenqin Chen and Pi-Tan Hu.

 

Ideation

The project started as an open-ended exploration of how technology could address challenges faced by blind users. We brainstormed many need areas and then conducted contextual inquiries with blind volunteers.

 
 

We asked them how they exercised, cooked, used social media, and used fitness trackers, and observed them using a desktop computer, a smartphone, an MP3 player, a braille typewriter, a refreshable braille display, and a dollar bill value reader. In the end, we decided to focus on the following challenge: blind smartphone users have a hard time navigating their mobile apps.

For blind users, navigation inside a smartphone app is done through VoiceOver, which reads aloud the elements on the screen. Here is an example of one of our users searching for flight information in the Orbitz iPhone app with VoiceOver. She had great difficulty locating the functionality she needed.
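For context, VoiceOver can only read what an app exposes to it. Below is a minimal sketch, using standard UIKit accessibility properties, of how an app marks an element for VoiceOver; the button and its strings are hypothetical, not from the Orbitz app.

```swift
import UIKit

// A minimal sketch of how an iOS app exposes a control to VoiceOver.
// VoiceOver speaks the accessibilityLabel when the element gains focus;
// "searchButton" and its strings here are hypothetical examples.
let searchButton = UIButton(type: .system)
searchButton.setTitle("Search", for: .normal)
searchButton.isAccessibilityElement = true          // include in the VoiceOver focus order
searchButton.accessibilityLabel = "Search flights"  // what VoiceOver reads aloud
searchButton.accessibilityHint = "Searches for flights matching your dates"
searchButton.accessibilityTraits = .button          // announced as "button"
```

When labels like these are missing or inaccurate, VoiceOver users are left swiping through unlabeled elements, which is exactly the kind of difficulty our user ran into.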

 

Proposed Solution

We hypothesized that a well-designed voice UI for a mobile app would be more usable for blind users. To validate this hypothesis, we prototyped an Uber voice application (many blind people use Uber for transportation) and ran within-subject usability tests comparing Uber Voice against the Uber GUI with VoiceOver (the existing way of interacting). Using extreme case sampling, we tested with three blind users. All three preferred the voice app over the GUI app: two strongly, one weakly.

 

Our Product

Our goal is to create a voice UI framework that defines usable design patterns and provides an SDK for easy development. Such a framework would encourage more voice UI applications to be created, and thus benefit the blind population.

Competitive Analysis

We evaluated Siri and Amazon Echo, the two primary voice services currently available:

 

VUI Feature          Siri                           Echo
Question Answer      Good                           Good
Procedural Task      Limited to 1-2 interactions    Limited to 1-2 interactions
Data Entry           Good                           Good
Data Presentation    No voice presentation          Low Control & Freedom

 

Based on this evaluation, we decided that our framework would use current technology for question answering and data entry, but improve on procedural tasks and data presentation.
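To make "data presentation" concrete: today's voice services tend to read a long result set in one monolithic utterance. Below is a minimal sketch of the alternative we are after, reading results in small chunks with positional context; the VoicePresenter type and the ride strings are illustrative, not part of any real SDK.

```swift
// Illustrative sketch: present a list by voice in small, navigable pages,
// always telling the listener where they are in the list.
struct VoicePresenter {
    let pageSize = 3

    /// Builds the utterance for one "page" of results, with positional
    /// context ("Results 1 to 3 of 4") so the listener keeps their place.
    func utterance(for results: [String], page: Int) -> String {
        let start = page * pageSize
        guard start < results.count else { return "No more results." }
        let slice = results[start ..< min(start + pageSize, results.count)]
        let header = "Results \(start + 1) to \(start + slice.count) of \(results.count)."
        let body = slice.enumerated()
            .map { "\(start + $0.offset + 1): \($0.element)" }
            .joined(separator: ". ")
        return "\(header) \(body). Say next for more."
    }
}

let rides = ["UberX, 4 minutes away", "UberXL, 7 minutes away",
             "Uber Black, 9 minutes away", "Uber Pool, 3 minutes away"]
print(VoicePresenter().utterance(for: rides, page: 0))
```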

 

VUI Model for Uber

We created this demo procedural flowchart for the Uber app using our Mobile Voice design framework:

 

 
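While the flowchart itself is a diagram, here is a minimal sketch of how a procedural flow like it could be encoded as a state machine. The steps and prompts are simplified illustrations under assumed names; they are not our actual Uber model.

```swift
// Illustrative sketch: a procedural voice flow as a simple state machine.
enum RideStep {
    case askPickup, askDestination, confirmRide, done

    /// The question the voice UI speaks on entering this step.
    var prompt: String {
        switch self {
        case .askPickup:      return "Where should the driver pick you up?"
        case .askDestination: return "Where are you going?"
        case .confirmRide:    return "Request an UberX now? Say yes or no."
        case .done:           return "Your ride is on the way."
        }
    }

    /// The step that follows once the user has answered.
    var next: RideStep {
        switch self {
        case .askPickup:      return .askDestination
        case .askDestination: return .confirmRide
        case .confirmRide:    return .done
        case .done:           return .done
        }
    }
}

// Walking the flow from the first step prints each prompt in order.
var step = RideStep.askPickup
while step != .done {
    print(step.prompt)
    step = step.next   // in a real app, this would depend on the user's answer
}
print(step.prompt)
```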

User Research

We're currently conducting heuristic evaluations and usability tests of the voice UI model above.

Voice UI Heuristics

Because standard heuristic evaluation (Nielsen's 10 usability heuristics) was written for graphical UIs, we reinterpreted the 10 heuristics for voice UIs (see the code sketch after this list for how a few of them might look in practice):

1. Visibility of System Status - "Where am I?", "Where can I go next?" (context of where you are, e.g. step 1 of 5, step 2 of 5)
2. Match Between the System and the Real World (Questions use everyday language)
3. User Control and Freedom (Exit, undo/redo, escape, go back, help, avoid making assumptions)
4. Consistency and Standards (Keeping names/labels consistent between different parts of interface; Does it follow standards of existing VUI apps like Echo and Siri?)
5. Error Prevention (Questions are unambiguous and easy to understand)
6. Recognition Rather than Recall (Also be aware of memory overload and working memory constraints)
7. Flexibility and Ease of Use (Shortcuts e.g. "take me home from here", user-specified custom commands and custom actions)
8. Aesthetic and Minimalist Design (Questions are simple; tone and language have personality)
9. Help Users Recognize, Diagnose, and Recover from Errors (Error handling is crucial: if a user does not understand their options or is distracted, the system must tell them where they are, what their options are, and what they can do next)
10. Help and Documentation (help/introduction for advanced features like setting up custom commands or explaining the flow for the first time; tutorials/explanations should be accessible easily from universal help command at any time)
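As promised above, here is a hedged sketch of how heuristics 1, 3, and 10 might be enforced in code: every prompt announces where the user is in the flow, and universal commands such as "go back" and "help" are intercepted before normal input handling. The Dialog and DialogStep types are illustrative only, not part of our framework's API.

```swift
// Illustrative sketch of heuristics 1, 3, and 10 in a dialog loop.
struct DialogStep {
    let question: String
    let helpText: String
}

struct Dialog {
    let steps: [DialogStep]
    var index = 0

    /// Heuristic 1: prefix each question with where the user is in the flow.
    var prompt: String {
        "Step \(index + 1) of \(steps.count). \(steps[index].question)"
    }

    /// Heuristics 3 and 10: universal commands work at any point;
    /// anything else is treated as an answer and advances the flow.
    mutating func respond(to input: String) -> String {
        switch input.lowercased() {
        case "go back":
            index = max(index - 1, 0)
            return prompt
        case "help":
            return steps[index].helpText
        default:
            index = min(index + 1, steps.count - 1)
            return prompt
        }
    }
}

var dialog = Dialog(steps: [
    DialogStep(question: "Where should the driver pick you up?",
               helpText: "Say an address, or say, take me home."),
    DialogStep(question: "Confirm your ride? Say yes or no.",
               helpText: "Saying yes requests the ride immediately."),
])
print(dialog.prompt)                     // Step 1 of 2. Where should the driver...
print(dialog.respond(to: "help"))        // help is available at any time
print(dialog.respond(to: "123 Main St")) // a normal answer advances the flow
```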

 

Future Steps

In the next few months, we will finish an interactive design console that helps voice UI designers create flowcharts like the one above, and we will also develop an SDK that converts the flowchart into working code. Stay tuned as exciting things happen!
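As a speculative illustration of that console-to-SDK pipeline, the flowchart a designer draws could be exported as plain data that the SDK then turns into a running dialog. Nothing below reflects the SDK's final API; the FlowNode type and its fields are assumptions for the sake of the sketch.

```swift
import Foundation

// Speculative sketch: a flowchart node as data the SDK could consume.
struct FlowNode: Codable {
    let id: String
    let prompt: String                 // what the voice UI says at this node
    let transitions: [String: String]  // spoken answer -> next node id
}

let flow = [
    FlowNode(id: "pickup",
             prompt: "Where should the driver pick you up?",
             transitions: ["*": "confirm"]),
    FlowNode(id: "confirm",
             prompt: "Request an UberX now?",
             transitions: ["yes": "done", "no": "pickup"]),
]

// The design console could serialize the flowchart to JSON for the SDK to load.
let data = try! JSONEncoder().encode(flow)
print(String(data: data, encoding: .utf8)!)
```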