Videos

I believe that some concepts are communicated best through video clips or short films. Enjoy some short HD films that I have created over the past few years.

Onward to Conversational Applications

When companies started publishing their websites in the nineties, they needed people with a new skill set: Webmasters. Today, this dated job title has morphed into a broad field of tech employment, including graphic design, search engine optimization, and content strategy. The separation of concerns has led to a better and faster process of building modern web experiences.
This talk explores how much of this can already be observed in the development process of Conversational and Voice User Interfaces and how a new approach to API design may make the “Webmasters for VUI” obsolete.
Today’s shallow CUIs and VUIs are the equivalent of those goofy static web pages and need to be replaced by full-blown conversational applications.


Open Sky Network – Alexa Skill

Airplanes periodically broadcast their position, velocity, and other information on the 1090 MHz radio frequency. This Alexa skill uses that data to provide interesting information about the airplanes above your vicinity. For example, after launching the skill, you can ask for the closest, fastest, or lowest-flying airplane. On devices with a screen, you may even see a picture of that plane.
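For anyone curious how such a lookup could work, here is a minimal Python sketch (not the skill’s actual code) that queries the public OpenSky Network REST API for aircraft in a bounding box around a hypothetical location and reports the closest one; the coordinates are assumptions, and the field indices follow OpenSky’s documented /states/all endpoint.

    import math
    import requests

    LAT, LON = 33.13, -117.16                  # hypothetical "your vicinity"

    def haversine_km(lat1, lon1, lat2, lon2):
        """Great-circle distance between two points, in kilometers."""
        p1, p2 = math.radians(lat1), math.radians(lat2)
        dp, dl = math.radians(lat2 - lat1), math.radians(lon2 - lon1)
        a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
        return 6371 * 2 * math.asin(math.sqrt(a))

    resp = requests.get(
        "https://opensky-network.org/api/states/all",
        params={"lamin": LAT - 0.5, "lamax": LAT + 0.5,
                "lomin": LON - 0.5, "lomax": LON + 0.5},
        timeout=10,
    )
    # State-vector indices: 1=callsign, 5=longitude, 6=latitude, 7=baro altitude (m), 9=velocity (m/s)
    states = [s for s in (resp.json().get("states") or []) if s[5] is not None and s[6] is not None]
    if states:
        closest = min(states, key=lambda s: haversine_km(LAT, LON, s[6], s[5]))
        dist = haversine_km(LAT, LON, closest[6], closest[5])
        print("Closest aircraft:", (closest[1] or "unknown").strip(), "at %.1f km" % dist)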


Striving for likability

Recent research shows that when communicating emotions, your voice matters more than your words. In other words, it is not WHAT you say but HOW you say it; the linguistic and paralinguistic cues most influence the emotion that is communicated when you talk. Interestingly, those emotions are perceived more accurately in a voice-only interaction than in a multi-modal one.
What else? Genderless bots, and an approach that uses machine learning models, including “Neural Word Embeddings”, to detect bias before a bot relays it to its users.
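As a rough illustration of how word embeddings can surface such bias (the model name and word lists below are my own assumptions, not from the talk), one can compare how much closer a word sits to “she” than to “he” in a pre-trained embedding space:

    import gensim.downloader as api

    vectors = api.load("glove-wiki-gigaword-50")   # pre-trained GloVe vectors, downloaded on first use

    def gender_lean(word):
        """Positive values lean toward 'she', negative toward 'he'."""
        return vectors.similarity(word, "she") - vectors.similarity(word, "he")

    for word in ("nurse", "engineer", "teacher", "mechanic"):
        print("%-10s %+.3f" % (word, gender_lean(word)))
    # A bot could flag a drafted response whose salient words lean strongly one way.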


Cuboid

A short while ago, I wrote about running MicroPython on an ESP32 (HUZZAH32) chip. More recently, I have added a triple-axis accelerometer (LIS3DH), a battery, and wireless charging, and put it all into a very nice case.
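A minimal MicroPython sketch of how the LIS3DH could be read over I2C on the HUZZAH32 (the pins and the 0x18 address are assumptions, the register addresses come from the LIS3DH datasheet, and this is not the Cuboid’s actual firmware):

    from machine import I2C, Pin
    import struct, time

    LIS3DH_ADDR = 0x18                         # 0x19 if the SDO pin is pulled high
    i2c = I2C(0, scl=Pin(22), sda=Pin(23))     # assumed HUZZAH32 default I2C pins

    assert i2c.readfrom_mem(LIS3DH_ADDR, 0x0F, 1)[0] == 0x33   # WHO_AM_I sanity check
    i2c.writeto_mem(LIS3DH_ADDR, 0x20, b"\x57")                # CTRL_REG1: 100 Hz, X/Y/Z enabled

    while True:
        raw = i2c.readfrom_mem(LIS3DH_ADDR, 0x28 | 0x80, 6)    # OUT_X_L with auto-increment
        x, y, z = struct.unpack("<hhh", raw)                   # raw left-justified counts
        print("x={} y={} z={}".format(x, y, z))
        time.sleep_ms(200)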


Mumbler

Let’s make sure the skills or bots we are building respond kindly, considerately, and, where warranted, empathically, and thereby truly deserve a user’s politeness. The number of “Thanks” a bot hears may tell you whether you are on the right track.
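As a toy illustration of that metric (the phrases and names below are mine, not Mumbler’s), a bot could track the share of utterances that contain an expression of gratitude:

    import re
    from collections import Counter

    GRATITUDE = re.compile(r"\b(thanks|thank you|thx|appreciate it)\b", re.IGNORECASE)

    def gratitude_rate(utterances):
        """Fraction of user utterances that thank the bot."""
        counts = Counter("thanks" if GRATITUDE.search(u) else "other" for u in utterances)
        return counts["thanks"] / max(sum(counts.values()), 1)

    print(gratitude_rate(["what's the weather", "thanks!", "thank you so much", "stop"]))  # 0.5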


Prosperity Light

Imagine a solution that counters the paralyzing effect of messaging overload: a dedicated and delightful indicator helping you to achieve your most important financial goal. We call it Prosperity Light. Success often begins with declaring a goal, and visualization helps in pursuing it.

Imagine for a second you needed a car or small-business loan. A good credit score would be all-important. But it’s a backward-looking indicator, and simply monitoring it might not be enough. Credit utilization is consequential: staying within 30% of your card’s limit will improve a score quickly. The light’s color changes from blue to red the closer you get to the 30% threshold.
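A minimal sketch of that color logic (the blue-to-red blend and the exact thresholds are my assumptions about the demo, not its actual code):

    def utilization_color(balance, credit_limit, threshold=0.30):
        """Blend from blue (0% utilization) to red at or above the 30% threshold."""
        utilization = balance / credit_limit
        t = min(max(utilization / threshold, 0.0), 1.0)   # clamp to 0.0 .. 1.0
        return (int(255 * t), 0, int(255 * (1 - t)))      # (red, green, blue)

    print(utilization_color(450, 3000))    # 15% utilization -> mostly blue
    print(utilization_color(1050, 3000))   # 35% utilization -> fully red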


Just having a little fun with Alexa and a Huzzah board

This little thing can even do Python, as I have shown here: https://wolfpaulus.com/micro-python-esp32/


The closest thing to science fiction – Getting CUI right

“The Amazon Echo is the closest thing to the science fiction concept of interacting with computers, in part because it responds to voice inquiries so quickly,” says William Meisel, executive director of the Applied Voice Input/Output Society (AVIOS) and president of TMA Associates.
It is not hard to imagine how a Conversational User Interface could be put to good use with an existing service infrastructure. In fact, take a look at this conversation, where a freelance photographer interacts with the Amazon Echo, which, on his behalf, accesses QuickBooks Self-Employed.
By the way, I find that the best voice user interfaces are those that have me say very little.


Alexa, take Me For A Ride

This video is about Voice User Interfaces.


Raspberry Pi 2 – Translator

This short demo video shows the Raspberry Pi running a translator app, using web services from Google and Microsoft for speech recognition, speech synthesis, and translation.
More details can be found here: wolfpaulus.com/jounal/embedded/raspberrypi2-translator/
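For the curious, the overall flow looks roughly like this; the helper functions below are hypothetical placeholders standing in for the Google and Microsoft web-service calls, so only the recognize-translate-synthesize pipeline is meant to be illustrative:

    def recognize_speech(wav_bytes, language="en-US"):
        # hypothetical placeholder for a speech-recognition web-service call
        return "where is the train station"

    def translate_text(text, source="en", target="de"):
        # hypothetical placeholder for a translation web-service call
        return "wo ist der Bahnhof"

    def synthesize_speech(text, language="de-DE"):
        # hypothetical placeholder for a text-to-speech web-service call
        return b"<synthesized audio>"

    def translate_utterance(wav_bytes):
        text = recognize_speech(wav_bytes)        # 1. speech -> text
        translated = translate_text(text)         # 2. text -> translated text
        return synthesize_speech(translated)      # 3. translated text -> audio

    print(translate_utterance(b"<recorded audio>"))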


My Interview with an Avatar

This video is about something new and cool. On which side of the uncanny valley it resides is for you to decide.


Speech Recognition – Coming of Age

Most people read text faster than they can listen to it being read aloud, and many type text faster than they can dictate it.
But even in traditional computing, Voice has been used successfully for short-cutting complex navigation trees or formulating unstructured queries.

A more diversified mobile device landscape than ever requires us to rethink established UX patterns. For this next generation of mobile devices, the use of Voice for input and output will be significant. UI widgets, tables, and forms simply don’t work on wearables and in-car systems.

Let’s have a look at what a “hands-free – eyes-free” user experience could look like, even for a task that is not necessarily a favorable showcase: “Capturing Receipt Information.”

Listening for voice commands stays dormant until the user says “OK Intuit.” This hot-word activates the app’s on-device speech recognizer, which is optimized for recognizing about a dozen menu items and commands. However, if the application detects that it was opened through the Google Voice launcher, it automatically activates the on-device recognizer.
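The app itself runs on a phone, but the hot-word idea can be sketched on a desktop with the Python bindings for CMU’s PocketSphinx (the keyphrase and threshold below are illustrative, and newer versions of the bindings expose a different API):

    from pocketsphinx import LiveSpeech

    # Keyword spotting: ignore everything until the keyphrase is heard.
    speech = LiveSpeech(
        lm=False,                 # no full language model, keyword spotting only
        keyphrase="ok intuit",    # the hot-word to listen for
        kws_threshold=1e-20,      # detection sensitivity
    )

    for phrase in speech:         # blocks, yielding one item per detection
        print("Hot-word detected:", phrase)
        # here the app would hand control to its small command recognizer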

On-device recognition is performed by an open-source toolkit developed by Carnegie Mellon University. It is highly accurate for only a few dozen words, which makes it well suited for in-application navigation. At application launch, complex dynamic grammars that are unique to every customer are uploaded to a LumenVox recognition server in our private cloud.

Capturing the four required data fields for an expense happens in a two-step process: step one asks for Amount and Payee; step two asks for Payment Method and Category.

While speech recognition has come a long way, it is far from perfect, and if the recognition confidence is too low, we intelligently ask again, including the information we already got right. At the end, we present a summary and ask for confirmation.
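A simplified sketch of that two-step capture with low-confidence re-prompting (the field names match the video; the ask() interface, threshold, and flow are my own simplification, not the production code):

    CONFIDENCE_THRESHOLD = 0.6
    STEPS = [("Amount", "Payee"), ("Payment Method", "Category")]

    def capture_expense(ask):
        """`ask(prompt, fields)` is assumed to return {field: (value, confidence)}."""
        captured = {}
        for fields in STEPS:
            missing = list(fields)
            while missing:
                prompt = "Please tell me the " + " and the ".join(missing)
                answers = ask(prompt, missing)
                for field, (value, confidence) in answers.items():
                    if confidence >= CONFIDENCE_THRESHOLD:
                        captured[field] = value      # keep what was recognized confidently
                missing = [f for f in fields if f not in captured]  # re-ask only the rest
        return captured                              # caller summarizes and asks to confirm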


Cora, Taught to Amaze

Hi, my name is Cora. I hope you still remember me. Last time, I helped you find out the balance of your checking account and answer whether you could afford eating out that night. I also provided you with up-to-the-minute stock quotes, all using natural language.

So here we are, a couple of weeks later, and I am back. You can now tell me to “Open Preferences” and I will show you this application’s preferences screen, or tell me to “Shut Down” and I will quit the Cora application for you. And about the stock quotes: you no longer have to know the ticker symbols; I have that covered now. And while we are talking about stock quotes, I would love to show you something really cool.

Please allow me to show you how a dual-screen experience could help you the next time you do your taxes.


Mobile NFC Card Readers and Terminals

We present two ideas that could eventually allow small and micro merchants to use mobile phones to accept payments from NFC credit cards or stickers. Both approaches hide most of the involved complexity behind an elegant solution, allowing small merchants to never miss a single sale.


Google Wallet

Tap & Go, a.k.a. PayPass, is a new, simple way of paying. PayPass is a payment method that lets you make purchases without having to swipe your card or provide your signature. A simple tap with a card, key fob, or mobile phone is all it takes to pay at checkout.
So this Saturday morning, I put paying with a mobile phone to the test – the only method of payment available to me was the Google Wallet application on a Samsung Nexus S Android phone running on Sprint’s 4G network.
Google Wallet can be linked to a Citi MasterCard or, like I did, used as a prepaid card funded with any of my existing credit cards.