I inspire and innovate, with an emphasis on voice user interfaces, speech recognition and synthesis, NLU, and AIML.
I focus on embedded, mobile, and open-source technologies and help accelerate the discovery and adoption of emerging mobile technologies.
I created the Java-based open source XUL Engine SwixML, which Sun’s CTO called “The strongest straightforward design of declarative UI implementations”.
SwixML represents ideas that today are heavily reused in Google's Android SDK: graphical user interfaces are described declaratively in XML documents that are parsed and rendered into UI widget hierarchies at runtime.
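To give a flavor of the approach, a declarative UI descriptor might look like the snippet below. This is an illustrative, SwixML-style sketch; the exact tag and attribute names may differ from the real schema.

```xml
<!-- Illustrative SwixML-style descriptor: the engine parses this XML
     at runtime and renders it into a Swing widget hierarchy. -->
<frame title="Hello" size="640,280">
  <panel>
    <label text="Name:"/>
    <textfield id="name"/>
    <button text="Submit" action="submit"/>
  </panel>
</frame>
```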
But I have created much more software that I'm extremely proud of.
A lot of my work revolves around early technology prototyping. Still, I try to put some of those ideas into real-world mobile applications.
Take a look at Artist on Android, the Horsemen of Speech Recognition, or other apps that I have published under the Techcasita Productions brand in Google's Play Store.
Most mobile applications consume some sort of cloud service. Speed is extremely important for voice user interfaces to work well, which means you want to do as much as possible on-device. However, speech recognition accuracy and speech synthesis quality often require implementing these services in the cloud.
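One common compromise is a hybrid setup: try a fast on-device recognizer first, and pay for the cloud round-trip only when the local result looks shaky. Here is a minimal sketch of that idea in plain Java; the `Recognizer` abstraction, class names, and confidence threshold are hypothetical and not a real speech API.

```java
import java.util.function.Function;

public class HybridRecognizer {
    // Hypothetical result type: recognized text plus a confidence score.
    public record Result(String text, double confidence) {}

    private final Function<byte[], Result> onDevice; // fast, lower accuracy
    private final Function<byte[], Result> cloud;    // slower, higher accuracy
    private final double threshold;

    public HybridRecognizer(Function<byte[], Result> onDevice,
                            Function<byte[], Result> cloud,
                            double threshold) {
        this.onDevice = onDevice;
        this.cloud = cloud;
        this.threshold = threshold;
    }

    // Try on-device first; fall back to the cloud only when the
    // local confidence is below the threshold.
    public Result recognize(byte[] audio) {
        Result local = onDevice.apply(audio);
        return local.confidence() >= threshold ? local : cloud.apply(audio);
    }
}
```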
Related services I have recently implemented in the cloud include aggregation, text summarization, and sentiment analysis.
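As a toy illustration of the last of these, a lexicon-based sentiment scorer fits in a few lines. The word lists and class name below are made up for the example and have nothing to do with my actual cloud service.

```java
import java.util.Set;

public class SentimentSketch {
    // Tiny illustrative lexicons; a real service would use far larger
    // lists (or a trained model) and handle negation, emphasis, etc.
    static final Set<String> POSITIVE = Set.of("good", "great", "love", "excellent");
    static final Set<String> NEGATIVE = Set.of("bad", "poor", "hate", "terrible");

    // Returns +1 (positive), 0 (neutral), or -1 (negative).
    static int score(String text) {
        int s = 0;
        for (String w : text.toLowerCase().split("\\W+")) {
            if (POSITIVE.contains(w)) s++;
            if (NEGATIVE.contains(w)) s--;
        }
        return Integer.signum(s);
    }
}
```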
I serve on the advisory committee for the Mobile App Development Certificate at the University of California, Irvine, and occasionally speak at conferences and user groups on topics ranging from Embedded Technology to Declarative Programming, emphasizing UI Generation at Runtime and, of course, everything related to voice user interfaces.
Have a look at some slides from my most recent talks.
Many new concepts that I implement in mobile applications are communicated best through video clips or short films, and I'm not talking about simple screen grabs.
Take a look at some high-quality short HD films that I have created over the last few months and years.
Amateur professionalism, a concept used since 2004, describes an emerging sociological and economic trend of people pursuing amateur activities to professional standards. Well … that pretty much describes how I look at my photography today.
If you like, take a look at some of my photos at http://ramonaphoto.com
Like most of us who still maintain our own web presence, instead of giving in entirely to Facebook or maybe posting exclusively on Google+, I'm never really satisfied with the layout and look and feel of my site.
I don't remember the exact date, but I published my first HTML page on a server running at the University of Marburg, Germany, in fall 1995. I still remember the URL, which appears on the Wayback Machine as early as May 2, 1997.
Yesterday was a big, maybe even a huge, day for voice user interfaces.
Microsoft introduced us to Cortana, and Amazon introduced the Fire TV box, which includes a remote control that supports voice input. Considering that neither is first to market (Siri, Comcast's X1/Xfinity), their entries further validate VUIs.
Fire TV's voice recognition seems so good that when Jon Fortt demonstrated it today on live television (CNBC), he dared to ask it for "Pawn Stars": http://video.cnbc.com/gallery/?video=3000264102
Wedbush analyst Shyam Patil wrote that Nuance Communications was likely powering Amazon.com’s Fire TV voice search, while reiterating a neutral rating and $15.00 price target on Nuance. (Nuance traded today for $17.59)
The consolidation in speech recognition and synthesis seems to continue. Apple acquired speech recognition pioneer Novauris last year, but the deal was not made public until today. One of Novauris's biggest differentiators in the competitive landscape is that it operated in both the embedded space (i.e., on-device, like OpenEars and PocketSphinx) and the server space (like LumenVox and Nuance), and it also owned the core engine.
“NovaSearch doesn’t carry out recognition at the word or sequence-of-words level, but rather identifies complete phrases from start to finish by matching them against a potentially huge inventory of possible utterances. This enables it to assemble information about what has been spoken over utterances of virtually any length and take near-optimal decisions.”
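The contrast with word-level decoding can be sketched in a few lines: instead of assembling a transcript word by word, match the whole heard utterance against an inventory of complete phrases and pick the closest one. The toy below is my own illustration, using plain edit distance as the similarity measure; it is not Novauris's actual algorithm.

```java
import java.util.List;

public class PhraseMatcher {
    // Whole-utterance matching: score the heard text against every
    // phrase in the inventory and return the closest one.
    static String match(String heard, List<String> inventory) {
        String best = null;
        int bestD = Integer.MAX_VALUE;
        for (String p : inventory) {
            int d = distance(heard.toLowerCase(), p.toLowerCase());
            if (d < bestD) { bestD = d; best = p; }
        }
        return best;
    }

    // Classic Levenshtein edit distance via dynamic programming.
    static int distance(String a, String b) {
        int[][] dp = new int[a.length() + 1][b.length() + 1];
        for (int i = 0; i <= a.length(); i++) dp[i][0] = i;
        for (int j = 0; j <= b.length(); j++) dp[0][j] = j;
        for (int i = 1; i <= a.length(); i++)
            for (int j = 1; j <= b.length(); j++)
                dp[i][j] = Math.min(
                        Math.min(dp[i - 1][j] + 1, dp[i][j - 1] + 1),
                        dp[i - 1][j - 1] + (a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1));
        return dp[a.length()][b.length()];
    }
}
```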
The fourth annual Mobile Voice Conference took place at the Hyatt Fisherman’s Wharf, San Francisco, on March 3rd-5th, 2014.
Opening the Mobile Voice Conference, Robert Weideman, GM and Executive VP at Nuance, stated in his keynote address that building an intelligent multichannel virtual assistant, one that delivers personalized customer service via a human-like conversational interface built on their Nina technology, requires a collaborative effort between Nuance's consulting team and the client's engineering team; Nuance would probably be doing 40% of the work. While he thinks that the Nina platform will eventually evolve and be made available as an SDK, that isn't something he sees happening anytime soon. Knowing Robert from the days when we both worked at Cardiff Software (where he served as VP of Marketing), I consider those numbers and facts absolutely trustworthy.
Advancements in an alternative approach to building adaptive spoken dialogue systems, namely AIML 2.0 (Artificial Intelligence Markup Language), were explained by Mike McTear, author of the just-released book "Voice Application Development for Android" and Professor at the University of Ulster in Belfast, Northern Ireland (Spoken Dialog Systems), where the 2013 Loebner Prize / Turing Test competition took place. Mike mentioned to me that tools for efficiently training bots, i.e. automatically feeding information into knowledge bases, are actively being worked on and should give AIML a real boost.
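For readers who haven't seen AIML: a category pairs a pattern the bot matches against user input with a template it replies with. The greeting below is a minimal illustrative example of my own, not taken from Mike's book.

```xml
<aiml version="2.0">
  <!-- One category: the pattern is matched against normalized user
       input, and the template is the bot's reply. -->
  <category>
    <pattern>HELLO</pattern>
    <template>Hi there! How can I help you?</template>
  </category>
</aiml>
```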