Frictionless Speed / Zero Intent Skills

Teams had taken four runs down the 1,450-meter (almost 1 mile) track, made of reinforced concrete and covered in ice. Still, their combined times were separated by less than the blink of an eye. The ice on the track varied: the colder and harder it was, the faster the sleds would go. The metal blades of a sled, which touch the ice, were coated and polished to minimize friction. At the start of each run, however, athletes first had to overcome static friction to move and accelerate the sled, which weighed between 170 kg and 210 kg (375–463 lbs).

Alexa, who won Bobsleigh at the 2010 Winter Olympics in Vancouver, Canada?


Excuse the pun, but it’s those effortless requests, the ones Alexa performs without any friction, that make this voice user experience so delightful. Here are some of my favorites:

Me: Alexa, play my favorite songs.

Alexa: Your liked songs from Wolf’s Spotify. [.. music starts playing]

Me: Alexa, set a timer for ten minutes.

[music fades ..] Alexa: Ten minutes, starting now. [.. music volume increases to previous level]

Me: Alexa, what’s the weather this afternoon?

[music fades ..] Alexa: This afternoon, you can look for clear skies and sun, with temperatures around 59 degrees Fahrenheit. [.. music volume increases to previous level]

Admittedly, not everything happens this smoothly on Alexa, especially when 3rd party skills get invoked. American Express, for instance, has a skill that can be invoked with ‘Amex’, a nickname for the popular credit card.

Me: Alexa, open Amex.

Alexa: Welcome back. To get started, please say the four-digit PIN you set up when enabling this skill.

Me: 9 3 4 7

Alexa: Great, you are all set. You can say things like

  • check my membership rewards balance
  • when is my payment due date
  • or make a payment

What would you like to do?

Amex is one of the best Alexa skills, but it still reminds me of the despised IVR systems (interactive voice response, a technology that allows a computer to interact with humans through voice and DTMF tones entered via a keypad). Just as pressing 0 multiple times sometimes offered a shortcut with IVRs, experienced users know this new shortcut:

Me: Alexa, ask Amex for my account balance.

Alexa: Welcome back. To get started, please say the four-digit PIN you set up when enabling this skill.

Me: 9 3 4 7

Alexa: Great, you are all set. Your remaining statement balance for your Blue Cash Preferred is zero dollars and your total balance is 779 dollars and 90 cents. What would you like to do next?

Invoking a 3rd party skill on Alexa requires the wake-up word (Alexa) followed by one or more invocation words (Amex) that identify the skill. The 3rd party skill then takes over, most likely greeting me and suggesting some ways to proceed.

After a while, once I have memorized some options, it’s more efficient to say the wake-up word, invocation word, and intent all in one longer sentence, like “Alexa, ask Amex for my account balance.” Still, all of this is far from the smooth and frictionless experience mentioned above. It feels like someone put sand on an icy bobsled track.


Zero Intent Skills

Reducing the number of actions required to complete a task is one of the 4 Golden Rules of UI Design; we sometimes call this “speed to benefit.” This rule seems to apply to VUIs (voice user interfaces) even more.

I think it’s the absence of an invocation word, combined with a succinct objective, that makes communication with Alexa enjoyable, as demonstrated above:

  • Alexa, play my favorite songs
  • Alexa, set a timer for ten minutes
  • Alexa, what’s the weather

However, when developing and declaring a 3rd party Alexa Skill, a two-word invocation needs to be provided (I don’t know how American Express got away with just one). Imagine we used ‘account balance’ as the invocation of a custom skill.

The ‘account balance’ skill is not versatile but very single-minded: it just tells the balance of one account. The skill declares neither custom intents nor entities (aka slots), which is why I coined the term Zero-Intent Skill.
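To make the idea a little more concrete, here is a minimal sketch of what the back end of such a skill could look like, assuming it is served by an AWS Lambda-style Python function that receives the Alexa request as a dictionary. The function names and the placeholder sentence are my own illustration, not the code behind the demo. The point is only that the launch request itself is answered right away, with no intent routing and no slots.

```python
from typing import Any, Dict


def build_speech_response(text: str) -> Dict[str, Any]:
    """Wrap plain text in the Alexa custom-skill response envelope and end the session."""
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {"type": "PlainText", "text": text},
            "shouldEndSession": True,
        },
    }


def get_balance_speech() -> str:
    """Hypothetical stub: a real skill would query the account data here."""
    return "Let me check. The balance of your checking account is 3107 dollars."


def lambda_handler(event: Dict[str, Any], context: Any = None) -> Dict[str, Any]:
    """Answer directly on launch; there are no custom intents or slots to route."""
    request_type = event.get("request", {}).get("type")
    if request_type == "LaunchRequest":
        return build_speech_response(get_balance_speech())
    # Assumption: any other request type just closes the session silently.
    return {"version": "1.0", "response": {"shouldEndSession": True}}
```

Because the session ends with the first answer, there is no follow-up prompt and nothing for me to memorize beyond the invocation itself.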

Among all my financial accounts, I consider my checking account the most important. It’s used for direct deposit, credit cards get paid from it automatically, and two utilities also withdraw automatically. Pretty scary, I know.

Demo

Me: Alexa, account balance

Alexa: Let me check .. the balance of your checking account is three thousand one hundred and seven dollars.


While I hear “Let me check …”, the actual query gets executed. For simplicity, the skill looks for the first checking account in the dataset. Before sending the result to the speech synthesizer, the skill rounds the amount down, i.e., I won’t hear any cents, all in the interest of smoothness and speed. I also think the amount becomes more memorable that way. I don’t know of a way to learn an account balance with less friction, or in less than 10 seconds.
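The lookup and rounding described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the skill's actual code: the `ACCOUNTS` list, its field names, and both function names are hypothetical stand-ins for whatever the real dataset looks like.

```python
import math
from decimal import Decimal
from typing import Dict, List, Optional

# Hypothetical dataset; the real account data and its schema are not shown in this post.
ACCOUNTS: List[Dict[str, object]] = [
    {"type": "savings", "balance": Decimal("12050.10")},
    {"type": "checking", "balance": Decimal("3107.42")},
    {"type": "checking", "balance": Decimal("88.00")},
]


def first_checking_account(accounts: List[Dict[str, object]]) -> Optional[Dict[str, object]]:
    """Return the first account of type 'checking', or None if there is none."""
    return next((a for a in accounts if a["type"] == "checking"), None)


def balance_speech(accounts: List[Dict[str, object]]) -> str:
    """Build the sentence handed to the speech synthesizer, without cents."""
    account = first_checking_account(accounts)
    if account is None:
        return "I could not find a checking account."
    whole_dollars = math.floor(account["balance"])  # round down, drop the cents
    return f"The balance of your checking account is {whole_dollars} dollars."


print(balance_speech(ACCOUNTS))
# prints: The balance of your checking account is 3107 dollars.
```

Rounding down rather than to the nearest dollar means the spoken amount never overstates what is actually in the account.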


Privacy

I know that some people have their Amazon Echo devices in areas where privacy can be an issue, and I am well aware of the tension between convenience and speed on one side and privacy and security on the other. The voice PIN that the American Express skill has implemented protects access to my account only slightly, and only until a “potential attacker” hears me saying it out loud.

In the current COVID-19 pandemic environment, touching surfaces is discouraged, and face recognition doesn’t work all that well when wearing a face mask. All this may work in favor of voice biometrics, and my hope is that in the not-too-distant future, “voice-id” will be as prevalent for authentication as touch-id and face-id are today. Just saying “Alexa, account balance” would be enough to authenticate me as the owner of my account.
