Talks

If you’re interested in having me present these talks to your team or user group, please get in touch: wolf@wolfpaulus.com


The Conversational User Interface Is a Minefield

SpeechTEK 2017, April 24-26, Washington, D.C.

With VR/AR user interfaces, we leave behind the two-dimensional world of mouse and touchpad and move toward saying exactly what we mean or want. Once users are accustomed to “talking” with Siri or Alexa, they tend to use voice elsewhere, too. Chatbots seem to be a transitional step toward a future where we “talk” to services. This presentation focuses on voice vs. chat user interfaces and introduces attributes of good chatbot user interfaces.


The path to the CUI is heavily mined and booby-trapped

Conversational Interaction Conference 2017, January 30-31, San Jose, CA

Don’t misinterpret the popularity of messaging apps as a glowing endorsement of chatbots. No one ever claimed that IVRs were popular just because people ordered landline phones.
While they can benefit greatly from each other, there is no need to create a dependency between the Conversational User Interface and Machine Learning; it is not hard to imagine how a Conversational User Interface could be put to good use with an existing service infrastructure. However, a flood of cruddy, IVR-style bots is the shortest path to nuking this nascent opportunity. This talk tries to identify the use cases that truly work in a conversational UI by providing customer benefit and delight, in an all-out effort to avoid re-creating the much-hated IVR experience.


Patterns for Natural Language Applications – beyond declaring User Intents

Mobile Voice Conference 2016, April 11-12, San Jose, CA

Simple patterns, like adaptive greeting, randomness, maintaining context, or predictive follow-up, can make an already good Voice User Interface spectacular.
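To make two of these patterns concrete, here is a minimal Java sketch of an adaptive greeting and of randomized responses that avoid repeating a recent reply. The class and method names are invented for illustration and don’t come from any particular framework.

    import java.time.LocalTime;
    import java.util.ArrayDeque;
    import java.util.Deque;
    import java.util.List;
    import java.util.Random;

    // Illustrative only: two small VUI patterns in one class.
    final class PatternSketch {
        private final Random random = new Random();
        private final Deque<String> recent = new ArrayDeque<>();   // remembers recent replies

        // Adaptive greeting: vary the opener by time of day.
        String greeting(LocalTime now) {
            if (now.getHour() < 12) return "Good morning!";
            if (now.getHour() < 18) return "Good afternoon!";
            return "Good evening!";
        }

        // Randomness: pick a response variant; when enough variants
        // exist, avoid any of the last three replies.
        String pick(List<String> variants) {
            String choice;
            do {
                choice = variants.get(random.nextInt(variants.size()));
            } while (variants.size() > 3 && recent.contains(choice));
            recent.addLast(choice);
            if (recent.size() > 3) recent.removeFirst();
            return choice;
        }
    }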


Bridging the gap between Speech Recognition and Business Logic

SpeechTEK Conference 2015, August 17-19, New York City, NY
Mobile Voice Conference 2015, April 20, 2015, San Jose, CA

Speech Recognition is readily available for integration into mobile and IoT projects and products. It’s affordable, mostly accurate, and, when streamed, incredibly fast. Context-free recognizers require post-recognition processing, e.g., dictionaries, string-to-sound language encoders, and the like. Web services like AIML/Pandorabots, Wit.ai, and Api.ai, as well as Amazon’s Alexa Skills Kit, support a declarative approach to defining rules that identify user intents and entities. Let’s try to make sense of all of this.
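As a rough, vendor-neutral illustration of that declarative style – not the actual API of Wit.ai, Api.ai, or the Alexa Skills Kit – here is a Java sketch in which each rule maps an utterance pattern to an intent, with named groups serving as entities:

    import java.util.Map;
    import java.util.Optional;
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    // Hypothetical sketch: declarative rules map utterances to intents.
    final class IntentMatcher {
        // Each rule pairs an intent name with a pattern; named groups become entities.
        private static final Map<String, Pattern> RULES = Map.of(
            "WeatherQuery", Pattern.compile("what('s| is) the weather (like )?in (?<city>\\w+)"),
            "SetTimer",     Pattern.compile("set a timer for (?<minutes>\\d+) minutes?")
        );

        static Optional<String> match(String utterance) {
            for (Map.Entry<String, Pattern> rule : RULES.entrySet()) {
                Matcher m = rule.getValue().matcher(utterance.toLowerCase());
                if (m.matches()) {
                    return Optional.of(rule.getKey());  // entities via m.group("city"), etc.
                }
            }
            return Optional.empty();                    // hand off to a fallback strategy
        }
    }

A real service replaces the regular expressions with trained language models, but the contract is the same: utterance in, intent and entities out.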


Emotional Prosody

Mobile Voice Conference, March 3-5, 2014, San Francisco, CA

Voice-enabled mobile applications provide users access to information instantly, naturally, and almost effortlessly. Simple voice commands, however, have failed to gain traction, probably because it’s hard to remember the exact utterance of a command phrase. More lenient and flexible conversational-style software agents have been more successful instead.

When it comes to communicating results back to the user, a text response often seems sufficient. Still, to provide a truly hands-free, eyes-free user experience, a text response needs to be synthesized and played through the phone’s speaker. The quality of the speech synthesis is determined by many factors, including sound quality (sampling rate, dynamic range), prosody (rhythm, stress, and intonation of speech), and, maybe less obviously, Emotional Prosody, conveyed through changes in pitch, loudness, timbre, speech rate, and pauses. This talk shares the ideas, concepts, and technology needed to build a prototype that synthesizes text augmented with emotional values.
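For a taste of the prototype’s direction, here is a Java sketch for Android that maps a coarse emotion label onto pitch and speech-rate settings via the platform’s TextToSpeech API. The emotion labels and the numeric values are assumptions made for this example, not part of Android.

    import android.speech.tts.TextToSpeech;

    // Illustrative only: the emotion labels and value mappings are invented.
    final class EmotionalSpeaker {
        private final TextToSpeech tts;

        EmotionalSpeaker(TextToSpeech tts) {
            this.tts = tts;
        }

        void speak(String text, String emotion) {
            switch (emotion) {
                case "excited":              // faster and higher-pitched
                    tts.setPitch(1.3f);
                    tts.setSpeechRate(1.2f);
                    break;
                case "sad":                  // slower and lower-pitched
                    tts.setPitch(0.8f);
                    tts.setSpeechRate(0.8f);
                    break;
                default:                     // neutral baseline
                    tts.setPitch(1.0f);
                    tts.setSpeechRate(1.0f);
            }
            tts.speak(text, TextToSpeech.QUEUE_FLUSH, null, "utterance-1");
        }
    }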


Voice-Enabling Chatbots

Chatbots 3.3 Conference, March 23, 2013, Philadelphia, PA

You don’t have to put your ear to the ground to hear it coming: the broad introduction of Voice User Interfaces, allowing interaction with mobile devices through voice, may become the biggest advancement in user interface design since the transition from text-based to graphical user interfaces.


Voice User Interface

How to voice-enable your mobile application

“What does a weasel look like?” – We take a closer look at Android’s Speech-To-Text (STT) and Text-To-Speech (TTS) capabilities, develop and deploy three small apps, each a little more capable than the last, and finally walk through the steps of building a voice-controlled assistant.

Android uses Google’s cloud-based Speech-To-Text engine, but Text-To-Speech capabilities have been baked right into the platform since Android 1.6 (Donut), using SVOX Pico with six language packages (US and UK English, German, French, Italian, and Spanish).
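For a flavor of how little code is needed on the recognition side, here is a minimal Java sketch that launches Android’s stock recognizer via RecognizerIntent and reads back the transcript. The activity name and prompt text are mine; the rest is the standard Android API.

    import android.app.Activity;
    import android.content.Intent;
    import android.os.Bundle;
    import android.speech.RecognizerIntent;
    import java.util.ArrayList;

    // Minimal sketch: launch the stock speech recognizer, read the transcript.
    public class ListenActivity extends Activity {
        private static final int REQUEST_SPEECH = 1;

        @Override
        protected void onCreate(Bundle savedInstanceState) {
            super.onCreate(savedInstanceState);
            Intent intent = new Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH);
            intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL,
                    RecognizerIntent.LANGUAGE_MODEL_FREE_FORM);
            intent.putExtra(RecognizerIntent.EXTRA_PROMPT, "What does a weasel look like?");
            startActivityForResult(intent, REQUEST_SPEECH);
        }

        @Override
        protected void onActivityResult(int requestCode, int resultCode, Intent data) {
            super.onActivityResult(requestCode, resultCode, data);
            if (requestCode == REQUEST_SPEECH && resultCode == RESULT_OK && data != null) {
                ArrayList<String> results =
                        data.getStringArrayListExtra(RecognizerIntent.EXTRA_RESULTS);
                // results.get(0) is the most likely transcript; hand it to the interpreter.
            }
        }
    }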

While Speech Recognition, Interpretation, and Text-To-Speech Synthesis are addressed by phone equipment and OS makers, the core problem of how to capture knowledge and make it accessible to smart software agents is ignored; services like Siri or Google Voice Actions remain closed, i.e., not easily extensible with third-party information or knowledge.


See the slides of some of my previous talks.

Get In Touch.

If you are interested in working together, or in having me present one of my talks to your team or user group, please do get in touch.