The Path to the CUI is Heavily Mined and Booby-Trapped

Cross-Post from my article in Chatbots Magazine

The concept of the Conversational User Interface (CUI) is not really new. Wolfgang Wahlster of the German Research Center for AI, DFKI, wrote 12 years ago in his paper on Conversational User Interfaces:

“Conversational user interfaces allow various natural communication modes like speech, gestures and facial expressions for input as well as output and exploit the context in which an input is used to compute its meaning. The growing emphasis on conversational user interfaces is fundamentally inspired by the aim to support natural, flexible, efficient and powerfully expressive means of human-computer communication that are easy to learn and use.”

Conversational User Interfaces Have a Chance to Become the Next Platform

Several trends support the idea that voice or textual user interfaces have a shot at becoming the next platform after mobile.

  1. Mobile devices by nature have smaller screens, which makes typical GUI menu hierarchies harder to use.
  2. Just saying what you want can be a powerful and direct approach that dramatically simplifies interactions, especially when context awareness is taken into account: “OK Google, Navigate Home.”
  3. Messaging apps are among the most popular applications on mobile devices; they are used for real-time text chats between two or more users. Understandably, companies try to go where their customers are and create a presence inside these messaging applications. Messaging app providers would gladly turn their apps into platforms, hoping to get filthy rich on transaction fees. But don’t mistake the popularity of messaging apps for an endorsement of chatbots; the once-popular landline didn’t imply that IVRs (Interactive Voice Response Systems) were all the rage either.
  4. Companies like Microsoft, which have no foothold in the mobile ecosystem (no mobile OS or popular mobile app), see the CUI as a new chance to (re-)gain influence by providing the services a Conversational User Interface needs, i.e., speech recognition, intent extraction, natural language understanding, and speech synthesis.

Definition of the term “conversation” by Merriam-Webster
“oral exchange of sentiments, observations, opinions, or ideas”

Uttering commands like “What’s the weather in Mountain View tomorrow” or “Navigate Home” hardly qualifies as a conversation. Corporate bots living inside messaging platforms like Facebook Messenger very often mimic the much-hated behavior of IVRs. While there is some back and forth, it doesn’t quite qualify as a conversation either.

Having a Conversation or Operating a Virtual Vending Machine?

1–800 Flowers bot integrated into Facebook Messenger

Providing an alternative to calling their 1–800 number, this chatbot represents a virtual vending machine that wants to be operated by text messages. But does it really represent, or at least hint at, a platform shift?

Defining ‘Platform’ Depends on Your Viewpoint

A software or content developer may loosely define platforms simply as Desktop, Web, Phone, and now Bots. An end user, on the other hand, may have a slightly different view. A long time ago, he transitioned from text and keyboard to icons and mouse, then to the touchscreen of his smartphone, and now to something that may require him to use his voice. A user may very informally think of those platform shifts as going from type to click to touch, and now talk.

Platform shifts don’t happen overnight. It took about 10 years to bring the GUI into the mainstream. At the beginning of a platform shift, it seems that different possible futures exist, like branches on a tree. Many of us, however, only see the direct path we are currently on.

The transition from text to a graphical UI had an evolutionary dead end. While that branch adopted colors, menus, and windows, it still relied mainly on the keyboard’s cursor keys for navigation and on a character-based display. This dead end was clearly an evolution from DOS, but the revolutionary pixel-based graphics and mouse pointer eventually triumphed.

At the time, this was not as obvious as we now believe. For instance, John C. Dvorak wrote in 1984: “The Macintosh uses an experimental pointing device called a ‘mouse’. There is no evidence that people want to use these things. Why would I want this?”

When the iPhone launched in 2007, the touch interface was revolutionary, but doubted by many. Even three years later, many smartphones, including almost all of those from Motorola (where I was working at the time), were still on the evolutionary path from Palm and BlackBerry devices and came with a keyboard. Mike Lazaridis (RIM CEO) said on Nov. 1, 2007: “Try typing on a touchscreen on an Apple iPhone, that’s a real challenge.”

I know this is controversial, especially to those who are already heavily invested in chatbots, but at the beginning of this platform shift, there once again seem to be an evolutionary and a revolutionary future. This time it’s type vs. talk.

There is the notion that a Voice User Interface is just a chatbot that has a speech recognizer and synthesizer attached. I don’t think that’s quite right.


A couple of months ago, I published this graphic in an article on Medium, showing how user interfaces have expanded in their dimensions.

We are all used to the traditional 2D environment (think desktop or mousepad). With virtual reality devices, however, we are now able to explore inside a three-dimensional user interface.

On the other side, we have chatbots, which reuse many visual widgets from the 2D desktop but force the user into a single dimension, allowing them only to scroll back through the history.

The Voice User Interface is what I call a “Zero-D” experience: there are no visual cues, so the progress of a conversation is left to the user’s memory and ingenuity. It offers, however, the most frictionless form of communication.

