The cheapest way to prototype a voice skill is to pretend to be Alexa. We call it prototalking. Here’s how it works:
- Using a cell phone or speakerphone, dial up a person sitting in another room
- (Set the phone to speakerphone mode)
- Pretend the phone is a voice-enabled device of your choice. Say stuff like “Alexa, what time is it?”
- The person in the other room responds as if they are Alexa (or your target avatar). They might say “The time is 3:32 pm.”
- Try testing out a skill that you are designing, and have the person follow the script. Say things like “Alexa, tell Standup to start the meeting” (this was for a skill we were designing to streamline stand-up meetings)
- Listen to your colleague awkwardly respond according to the script you designed
That’s it. What prototalking is not: rocket science. What prototalking is: like scribbling an interface on a sheet of paper and pretending to tap a screen.
Creating quick prototypes is important for any design exercise, digital or otherwise. There are about three hundred and eighty-five thousand tools for digital designers to create prototypes. Service designers have been bodystorming and building cardboard furniture for years. Designing for voice needs equally scrappy prototyping hacks. Fortunately, we all have a cell phone. Problem solved.
Designing for voice is tricky in part because it’s the wild west out there. Customers don’t really have well-defined expectations for voice apps or “skills” because frankly most of them suck and/or don’t do much, and only a very few get used anyway. Conventions are weak or non-existent. There are some best practices, for sure, and you definitely want to follow those. But in general, as a profession, UX designers don’t have a lot of mature techniques for designing in this new medium.
Try prototalking. It’ll help you in a bunch of ways:
- Pace/rhythm — hearing (and waiting for) a response is a better way to evaluate the pace of the conversation than reading a script or a flowchart
- Is this genuinely useful? — testing a skill in a simulated context can give you a sense of whether something works, or just seems inane
- Feedback — real users can give pretty good impressions of a voice solution even if it’s your colleague reading a script
Here are some tips to help with making a great prototalk:
- Use the same awkward/slow pace that the real device uses. Practice pausing for about as many seconds as the device would before responding.
- Simulate all the pieces. If the skill plays music, play some music through the speakerphone. If the skill does something interesting like post to a Slack channel, make a Slack account for your skill and manually post to the Slack channel on cue.
- Use the script as designed. You’re looking for rough edges. Test the design, don’t improvise. Remember that you are playing an unintelligent robot. If the tester doesn’t say the wake word, or hesitates too long before issuing the command, or doesn’t use the right syntax, give a realistic response (most likely an annoying non-answer).
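If it helps the person playing the device stay in character, the "use the script as designed" rule can be written down as a tiny lookup. The following is only a rough sketch under our own assumptions: the names `SCRIPT`, `respond`, `FALLBACK`, and `MAX_HESITATION_SECONDS` are hypothetical, the eight-second cutoff is a placeholder to tune against the real device, the "Starting the meeting." reply is a made-up stand-in for whatever your script says, and treating a missing wake word or a long pause as silence is our guess at a realistic response.

```python
from typing import Optional

WAKE_WORD = "alexa"                     # wake word used in the examples above
MAX_HESITATION_SECONDS = 8.0            # hypothetical cutoff; tune to the real device
FALLBACK = "Sorry, I don't know that."  # hypothetical annoying non-answer

# Hypothetical script: exact command (after the wake word) -> scripted response.
SCRIPT = {
    "what time is it": "The time is 3:32 pm.",
    "tell standup to start the meeting": "Starting the meeting.",  # placeholder reply
}


def normalize(text: str) -> str:
    """Lowercase and drop punctuation so matching ignores transcription details."""
    kept = "".join(ch for ch in text.lower() if ch.isalnum() or ch.isspace())
    return " ".join(kept.split())


def respond(utterance: str, hesitation_seconds: float = 0.0) -> Optional[str]:
    """Return what the person playing the device should say, or None for silence."""
    words = normalize(utterance)
    if not words.startswith(WAKE_WORD + " "):
        return None  # assumption: no wake word means the device stays silent
    if hesitation_seconds > MAX_HESITATION_SECONDS:
        return None  # assumption: after a long pause the device stops listening
    command = words[len(WAKE_WORD):].strip()
    # Only exact scripted commands get a real answer; everything else falls back.
    return SCRIPT.get(command, FALLBACK)
```

For example, `respond("Alexa, what time is it?")` returns the scripted time, while `respond("Alexa, what's the time?")` returns the fallback because the phrasing is not in the script. The exact-match lookup is deliberate: the point of the exercise is to expose phrasings the script does not handle, not to guess what the tester meant.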
We’re not done exploring new techniques and best practices for the design community as we create great voice-enabled experiences. We’ll share what we come up with. Please do the same.