Voice Assistant · Wizard-of-Oz · Aging

A Voice Assistant to Help Older Adults Navigate Mobile Interfaces

Query taxonomy across 3 studies + Wizard-of-Oz evaluation showing 87% adoption and 77% immediate recovery

Published at CHI 2023 — "Where is history": Toward Designing a Voice Assistant to Help Older Adults Locate Interface Features Quickly · Yu, Parde & Chattopadhyay

Abstract

We explore a voice assistant that helps older adults locate UI features by verbal query. Through three studies — a think-aloud study with 19 older adults, a baseline with 12 younger adults, and a Wizard-of-Oz experiment (n=15) — we contribute a taxonomy of 5 query types and show that 87% of older adults used the voice assistant when lost, recovering immediately 77% of the time.

Keywords: voice assistant, older adults, mobile interface, Wizard-of-Oz, query taxonomy, accessibility, CHI 2023

1. Query Taxonomy (Studies 1 & 2)

How Older Adults Verbalize Interaction Problems

In an exploratory think-aloud study (19 older adults, 16 unique apps), we asked participants to speak their thoughts aloud whenever the interface confused them. We collected 111 questions and, using inductive open coding, identified five query types:

| Type | Description | Example | Wh-word |
| --- | --- | --- | --- |
| Validation | Verify whether doing the right thing | "Should I click the 'View Cart' option?" | Y/N |
| Informational–Directed | Clear next action, unclear which feature | "How do I find an address on this map?" | how |
| Informational–Undirected | No clear intuition about the next feature | "Which one to choose?" | what/which |
| Navigational | Where a particular feature is | "Where is history?" | where |
| Conceptual | About general app workflow, not UI location | "Why is it just processing?" | why |
Table 2. Taxonomy of questions asked by older adults during mobile UI exploration (CHI 2023).

Compared to younger adults (n=12), older adults asked significantly more validation questions (U=46, p=0.004, r=0.6) and more directed informational questions (U=53, p=0.007, r=0.53). The most frequent question types, in order, were validation (37.8%), directed informational (22.5%), and navigational (18%).

Design Insight

"Where" and "how" questions — the most common UI-related types — map directly onto a voice query input: users already phrase their problem as a question the assistant can answer.

2. Design Probe

Just-in-Time, Just-in-Place Voice Assistant

The proposed system works as follows: at any point during app exploration, the user invokes the voice assistant. The assistant extracts keywords from the speech query with a Transformer-based extractor (APE model, 85% accuracy), reads the current screen's UI structure via Android's Assist API, extracts keywords from the on-screen components using TF-IDF, and highlights up to 3 matching UI elements ranked by ConceptNet similarity (F1 = 0.8). A simplified sketch of this matching step follows Figure 1.

Voice assistant design probe diagram
Figure 1. The design solution: (a) user invokes voice assistant mid-task, (b) keyword set extracted from speech query, (c) keyword set extracted from current UI components, (d) top 3 matching features highlighted on-screen. (CHI 2023)
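
The matching step can be pictured with a short sketch. The code below is not the paper's implementation: it substitutes TF-IDF cosine similarity for the ConceptNet-based semantic matching, and the screen labels, query keywords, and function name top_matches are hypothetical.

```python
# Sketch of the matching step, assuming TF-IDF cosine similarity as a stand-in
# for the ConceptNet-based matcher described above.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def top_matches(query_keywords, ui_labels, k=3):
    """Rank the current screen's UI labels against the query keywords; return the top k."""
    query_text = " ".join(query_keywords)
    vectorizer = TfidfVectorizer().fit(ui_labels + [query_text])
    ui_vectors = vectorizer.transform(ui_labels)
    query_vector = vectorizer.transform([query_text])
    scores = cosine_similarity(query_vector, ui_vectors)[0]
    ranked = sorted(zip(ui_labels, scores), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]  # at most k elements get highlighted on-screen

# Hypothetical screen: labels read from the current page's UI hierarchy.
labels = ["Search", "Filter results", "View cart", "Order history", "Account settings"]
print(top_matches(["order", "history"], labels))  # "Order history" ranks first
```

Capping the result at k = 3 mirrors the design constraint below: enough guidance to recover, without cluttering the screen.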

Key design constraints: the assistant is limited to the current screen (no automatic navigation to other pages); it highlights at most 3 features to balance guidance against visual clutter; and it operates at the platform level, not as an in-app feature.

WoZ visual highlights on mobile web
Figure 2. Visual highlights shown to participants during the WoZ study. (Left) Filter + search highlighted for "Is there a search function?" (Right) Arrow indicating scroll for "Where is a senior discount?" (CHI 2023)

Wizard's Interface

The wizard used a custom GUI to add highlights and arrows in real time via drag-and-drop. A pre-generated question list for each task allowed rapid responses without perceptible delays.

Audio tones confirmed query receipt and highlight rendering. Highlights disappeared as soon as participants resumed interaction.

3. Wizard-of-Oz Study

5 Tasks · 2 Websites · 15 Older Adults

Study tasks used two feature-rich mobile websites: choosechicago.com (tourism) and transitchicago.com (public transit). Sessions were conducted remotely via video call due to COVID-19.

15 participants (Mdn age 65 years, IQR = 8)
931 total interaction events tagged in the video corpus
33 voice questions asked across all sessions (Mdn = 2)

Interactions were coded using 9 event types: unique, non-unique, successful, and off-task selections; revert; cycle; retry; slip; and questioning. Participants were marked as "getting lost" when they made consecutive non-unique or off-task selections.
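
As an illustration of how such a rule could be applied to a coded session, the sketch below scans an event stream and flags where a run of non-unique or off-task selections occurs. The event label strings and the two-event threshold are assumptions for illustration; the actual coding was done manually on the video corpus.

```python
# Illustrative "getting lost" detector over a coded event stream.
# Label strings and the threshold of two consecutive events are assumptions.
LOST_EVENTS = {"non_unique_selection", "off_task_selection"}

def mark_getting_lost(events, threshold=2):
    """Return indices at which a run of lost-type events reaches the threshold."""
    lost_at, run = [], 0
    for i, event in enumerate(events):
        run = run + 1 if event in LOST_EVENTS else 0
        if run == threshold:
            lost_at.append(i)
    return lost_at

# Hypothetical session trace:
trace = ["unique_selection", "non_unique_selection", "off_task_selection",
         "questioning", "successful_selection"]
print(mark_getting_lost(trace))  # [2] -> participant flagged as getting lost
```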

4. Results

87% Used the Assistant · 77% Recovered Immediately

87% of participants used the voice assistant (13 of 15)
63% of getting-lost episodes were followed by a voice query (26 of 41)
77% of VA uses led to immediate recovery (20 of 26)
Two examples of user interaction paths with voice assistant
Figure 3. Two interaction examples. Red steps indicate getting lost. P6 asked a question near the start of Task 1 (A); P8 invoked the VA after lengthy exploration in Task 2 (B). Both recovered immediately after the VA highlighted the relevant feature. (CHI 2023)

Bonus Finding

Participants indirectly learned from VA outputs. Features highlighted in one task were reused in subsequent tasks — P11: "VA made me realize that the search function is quite useful." This was not a design goal but emerged organically.

VA use was significantly more frequent after getting lost than when not lost (Z=2.3, p=0.01, r=0.71). Among participants who used the VA, recovery relied on the VA significantly more than on self-exploration (Z=2.4, p=0.009, r=0.86).