My name is Wolf Paulus. I am a photographer, hiker, hacker, and technologist based in Ramona, California. I focus on embedded, mobile, and open source technologies and help accelerate the discovery and adoption of emerging mobile technologies. I aim to inspire and innovate, with an emphasis on mobile and wearables, voice user interfaces, speech recognition and synthesis, and natural language understanding.
I created the Java-based open source XUL Engine SwixML, which Sun’s CTO called “The strongest straightforward design of declarative UI implementations”.
SwixML introduced ideas that are now heavily reused in Google’s Android SDK. (Graphical User Interfaces are described declaratively in XML documents that are parsed and rendered into UI widget hierarchies at runtime.)
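To make the idea of a declaratively described UI concrete, here is a minimal sketch in plain Java and Swing. It is not SwixML's actual API or tag set: the class name `DeclarativeUiSketch`, the three supported tags (`panel`, `label`, `button`), and the `text` attribute are assumptions made up for this illustration. It only shows the general technique of parsing an XML document and turning it into a widget hierarchy at runtime.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.swing.JButton;
import javax.swing.JComponent;
import javax.swing.JLabel;
import javax.swing.JPanel;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical illustration only; tag names and attributes are assumptions,
// not the vocabulary of SwixML or Android.
public class DeclarativeUiSketch {

    // Map one XML element to a Swing widget, then recurse into its children.
    static JComponent render(Element element) {
        JComponent widget;
        switch (element.getTagName()) {
            case "panel":
                widget = new JPanel();
                break;
            case "label":
                widget = new JLabel(element.getAttribute("text"));
                break;
            case "button":
                widget = new JButton(element.getAttribute("text"));
                break;
            default:
                throw new IllegalArgumentException("Unknown tag: " + element.getTagName());
        }
        NodeList children = element.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child instanceof Element) {
                widget.add(render((Element) child));
            }
        }
        return widget;
    }

    // Parse an XML string and build the widget hierarchy it describes.
    static JComponent parse(String xml) {
        try {
            Element root = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                    .getDocumentElement();
            return render(root);
        } catch (IllegalArgumentException e) {
            throw e;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse UI description", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<panel><label text=\"Name\"/><button text=\"OK\"/></panel>";
        JComponent ui = parse(xml);
        // Print the resulting hierarchy instead of opening a window.
        System.out.println(ui.getClass().getSimpleName()
                + " with " + ui.getComponentCount() + " children");
    }
}
```

The sketch stops at building the component tree; a real engine would also handle layout, event wiring, and a much larger tag vocabulary.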
But I have created much more that I’m extremely proud of.
A lot of my work revolves around early technology prototyping. Still, I’m trying to put some ideas into real-world mobile applications.
Take a look at Artist on Android, the Horsemen of Speech Recognition, or other apps that I have published under the Techcasita Productions label in Google’s Play Store.
Most mobile applications consume some sort of cloud service. Speed is extremely important for Voice User Interfaces to work well, which means you want to do as much as possible on-device. However, speech recognition accuracy and speech synthesis quality often require a cloud-based implementation. Cloud services that I have recently implemented include speech synthesis, aggregation, AIML-based natural language understanding, and text summarization, including simple sentiment analysis.
I have been appointed to the advisory committee for the Mobile App Development Certificate at the University of California, Irvine, and occasionally speak at conferences and user groups on topics ranging from Embedded Technology to Declarative Programming, emphasizing UI Generation at Runtime, and, of course, everything related to Voice User Interfaces.
Take a look at some slides from my most recent talks.
April 11-12, Mobile Voice Conference 2016, San Jose, California
On April 11-12, I will be speaking at the Mobile Voice Conference 2016 on “Natural Language for Developers – beyond declaring User Intents”
Moderator: Alexander Rudnicky, Research Professor, School of Computer Science, Carnegie Mellon University
Natural Language for Developers – beyond declaring User Intents – Wolf Paulus, Engineer, Intuit
Driving Natural Language Interaction Using Knowledge Representation – William Meisel, President, TMA Associates
Many new concepts that I implement in mobile applications are communicated best through video clips or short films.
Take a look at some high-quality short HD films that I have created over the last few months and years.
“Amateur Professionalism”, a concept used since 2004, describes an emerging sociological and economic trend of people pursuing amateur activities to professional standards. That pretty much describes how I look at my photography work today.
If you like, take a look at some of my photos and the stories behind them at http://ramonaphoto.com
The Applied Voice Input Output Society (AVIOS) and TMA Associates organize the annual Mobile Voice Conference, which this year took place at the Westin in San Jose, California on April 11 and 12.
Recognizing that speech recognition, speech synthesis, and language interpretation have matured, the Mobile Voice Conference 2016 focused on Language User Interfaces and explored trends towards Conversational User Interfaces.
While WIMP (Windows, Icons, Menus, Pointer) based user interfaces are becoming more complex (e.g. force-touch), a language user interface (using text or voice input) can often deliver results more conveniently, i.e., more easily and faster. Prominent examples of such language user interfaces are Siri, Cortana, Google Now, and Alexa. While promising, their rule-based (versus learning) nature exposes their limits rather quickly.
If you want to add non-trivial speech output to your application, no matter if it’s a desktop, web, or mobile app, you need to find a way to convert text into speech (TTS) and eventually provide it in a sound format (like MP3) that can be played back on end-users’ devices. While some operating systems come with TTS capabilities built-in, the quality of the voice sound may vary more than you like, and a user experience spanning multiple OSes and platforms almost always justifies or even requires the deployment of a TTS Web service.
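As a rough sketch of what a client of such a TTS Web service might do, the snippet below builds the request URL for a text-to-MP3 conversion. Everything service-specific here is an assumption made up for illustration: the host `tts.example.com`, the `/speak` path, and the `voice`, `format`, and `text` query parameters do not refer to any real service. The only point is that the text has to be URL-encoded before it is handed to the service.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical client-side helper; endpoint and parameter names are assumptions.
public class TtsRequestSketch {

    // Build a GET URL asking the (hypothetical) service for MP3 audio of the text.
    static String buildUrl(String baseUrl, String text, String voice) {
        return baseUrl
                + "?voice=" + URLEncoder.encode(voice, StandardCharsets.UTF_8)
                + "&format=mp3"
                + "&text=" + URLEncoder.encode(text, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // The resulting URL could be handed to any HTTP client or media player.
        System.out.println(buildUrl("https://tts.example.com/speak", "Hello world", "en-US"));
    }
}
```

Because the same URL works from a desktop, web, or mobile client, the voice sounds identical on every platform, which is the main argument for a service over OS-specific TTS.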
[ISO 500 | 27mm | f/2.8 | 1/1000sec]
This photo was taken using a multi-copter, but arriving at photos like this is harder than you might think. It takes a lot of practice and luck to frame aerial shots. The vibrations of the copter require an unusually high shutter speed, which to some degree dictates the f-stop and ISO (sensitivity). Not knowing exactly what the camera is seeing and focusing on while it’s up in the air means taking lots and lots of shots and hoping for the best.