Wolf Paulus

Journal

Speech Synthesis and the Quality of Voice

Posted on Mar 1, 2013 in Android

When listening to the radio or a podcast while driving to work, I don’t think I imagine what the person I’m listening to looks like. Still, if I later happen to see them for the first time in a picture or video, I often find myself surprised.

A verbally responding mobile application has many obvious advantages. For instance, users don’t have to decipher tiny fonts on small displays; in fact, they don’t have to look at the display at all. Just as colors and typography contribute considerably to the look and feel of an application, so does voice quality for a voice-enabled mobile application.

There are at least three different approaches to synthesizing speech from text.

First, there might be a Text-To-Speech module built into the OS, or separately installed Text-To-Speech engines can plug in to the OS’s Text-To-Speech module.
Secondly, instead of requiring a separate install, a synthesizer and voices can be packaged and shipped with the application.
Lastly, a web service can be used to synthesize text. The advantage of this would be a more predictable and consistent voice quality, comparatively independent of the hardware and operating system used on the mobile client.
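
The three approaches can be put side by side in a small sketch. This is plain Java rather than Android code, and every name in it (`Approach`, `Synthesizer`, `StubSynthesizer`, `SynthesizerChooser`) is hypothetical and not part of any SDK. It only assumes that an application keeps a preference-ordered list of synthesizers and uses the first one that is available; the order itself is up to the application.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

// The three approaches named in the text; the enum is illustrative only.
enum Approach { OS_ENGINE, BUNDLED_WITH_APP, WEB_SERVICE }

// Hypothetical abstraction over one way of synthesizing text.
interface Synthesizer {
    Approach approach();
    boolean isAvailable();      // e.g. engine installed, or network reachable
    String speak(String text);  // a real implementation would produce audio
}

// Stand-in implementation so the sketch runs without a device or a network.
final class StubSynthesizer implements Synthesizer {
    private final Approach approach;
    private final boolean available;

    StubSynthesizer(Approach approach, boolean available) {
        this.approach = approach;
        this.available = available;
    }

    public Approach approach() { return approach; }
    public boolean isAvailable() { return available; }
    public String speak(String text) { return "[" + approach + "] " + text; }
}

final class SynthesizerChooser {
    private SynthesizerChooser() {}

    // Returns the first available candidate, in the caller's preference order.
    static Optional<Synthesizer> choose(List<Synthesizer> inPreferenceOrder) {
        return inPreferenceOrder.stream()
                .filter(Synthesizer::isAvailable)
                .findFirst();
    }
}

final class SynthesizerDemo {
    public static void main(String[] args) {
        // Assumed preference: web service first, then OS engine, then bundled voices.
        List<Synthesizer> candidates = Arrays.asList(
                new StubSynthesizer(Approach.WEB_SERVICE, false), // pretend we are offline
                new StubSynthesizer(Approach.OS_ENGINE, true),
                new StubSynthesizer(Approach.BUNDLED_WITH_APP, true));

        String spoken = SynthesizerChooser.choose(candidates)
                .map(s -> s.speak("Hello"))
                .orElse("no synthesizer available");
        System.out.println(spoken);
    }
}
```

With the web service marked unavailable, the demo falls back to the OS engine. On a device, each stub would be replaced by a real implementation of the corresponding approach.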

Artist on Android w/ Voice Recognition

Posted on Feb 21, 2013 in Android

E*Trade Mobile – Voice Commands

Posted on Jan 27, 2013 in Android

E*Trade provides a great mobile app experience on iPhone, iPad, Android, Windows Phone, and BlackBerry. I think it’s almost expected that the feature sets provided by the dedicated native mobile applications are not quite the same. The Windows version, and especially the one for BlackBerry, fall far behind what E*Trade has to offer on Android and iOS.

For instance, in April 2012, speech recognition was first added to their iPhone (not iPad) mobile application and later to the Android app as well, allowing the user to request stock quotes, options chains, company information, or to launch a stock order, just by using voice.

“Investors are becoming more accustomed to interacting with voice-enabled technology, and we’re proud to be one of the first to offer this innovative feature to our mobile users,” said Michael Curcio, President, E*TRADE Securities. “By integrating voice technology, E*TRADE provides a mobile experience unlike any other – creating a state-of-the-art and convenient approach to navigation.”

E*Trade uses speech recognition and speech synthesis software provided by Nuance Communications, Inc. The application is feature-packed, comes as an 11 MB download, and is not what you would call a thin client. Only very few of those features, however, are accessible through Voice Commands.

Sh!t we dislike – I’m Watch

Posted on Jan 23, 2013 in Hardware

The broad introduction of Voice User Interfaces, which allow humans to interact with computers through voice and speech, may be the next revolution in User Interface Engineering and represent an even bigger change than the transition from text-based to graphical user interfaces. A Voice User Interface seems to be especially attractive on smaller devices, like small phones or watches, that don’t allow for many buttons or have a touch screen that is just not big enough to allow for comfortable interactions.

The first generation of watches that connect to a smartphone via Bluetooth is now available. Most of them are not equipped with a microphone and speaker, i.e., they don’t provide a good platform for a Voice User Interface.

  • Pebble, E-Paper Watch for iPhone and Android.
  • Smartwatch, Sony Android watch.
  • MOTOACTV, Motorola GPS Fitness Tracker with MP3 Player.
  • NEXD, The Next-gen Android watch.
  • i’m Watch, i’m SpA, Smartwatch connects via Bluetooth to a phone.

Grand Slam

Posted on Nov 28, 2012 in Life

A Grand Slam in baseball is a home run hit when each of the three bases is occupied by a runner, thus scoring four runs. The Grand Slam of Ultrarunning is recognition for those who complete four of the oldest 100-mile trail runs in the U.S. And in tennis, the Grand Slam describes the four Grand Slam tournaments, also called Majors, the most important annual tennis events.

A few years back, we came up with our very own definition of a Grand Slam, the Code Camp Grand Slam, which would recognize those who had spoken at four distinct Code Camps during one calendar year. After having fallen short for the last two years, speaking at only three venues, Tom and I finally completed the Code Camp Grand Slam: four Code Camps in a single year.