Thursday, November 10, 2011

Siri's Deeper Voice (UK version)


Siri's voice is a large part of the subtle way the digital assistant is meant to connect itself to users.  In the UK, that Siri voice is actually male and the original voice behind Mr. Siri has spoken out. Turns out he laid down the five thousand sentence voice tracks six years ago for a previous product later acquired by Nuance.

Saturday, October 22, 2011

Siri: The First 40 Years

In different times, Siri could just be the name of another Malawi child adopted by Madonna or Angelina and splashed on the cover of magazines wearing little designer jeans riding in a Land Rover car seat.  Not this Siri.  Although our Siri had humble roots and is also now in the hands of deep-pocketed famous parents, she has a complex family tree that actually goes back several decades and includes some of the best upbringing and education money can buy.


Today, Siri is almost 43 years old, yet in most ways really an infant.  Now that lady's older voice starts to make more sense....

Seeing the Forest through the Family Trees
Siri was actually born long before Apple adopted her last year for $200 million.  Her birth parents are in some ways really an amalgamation of top inventors, researchers, universities as well as organizations like SRI, NASA and ARPA.

It really takes a village to raise a future digital personal assistant.

Dad in 1968
But some would say that she was really born on December 9, 1968 when her virtual father, inventor Douglas Engelbart, introduced her roots at a small computer conference in San Francisco. That memorable 90-minute presentation (original flyer to the right) was not focused on Siri as she is today, but instead showcased the concepts which have shaped how we interface and interact with computers today.  Looking now more like an old newsreel of a Tomorrowland exhibit at Disney World, it was also the first public demonstration of the computer mouse (later licensed to Apple for $40,000).  It was also the debut of other key innovations in use today such as hypertext, object addressing and dynamic file linking.

Siri was essentially formed that day within those proofs of concept and spent the following years in obscurity as those same ideas helped build out the hardware and software she would someday need to flourish.  Engelbart's focus at that time was on how computers could augment tasks and solve problems for humans.  He continues to pursue that goal even today.  At the time this involved various input devices, but actual voice recognition was still far off.

Considering that there was no internet at the time, Engelbart and his team were true pioneers, formulating the ideas that would later be needed to harness the massive amounts of data that would become available decades later - via the web and subsequently on mobile devices.  That future would end up being Siri's real calling and formally launch her into the limelight almost four decades later.

The Student becomes the Teacher
Fast forward to around 2003, when the Defense Advanced Research Projects Agency (DARPA) awarded a five-year project to SRI to "revolutionize how computers support decision-makers" using a personal assistant approach. This project was part of DARPA's PAL program (Personalized Assistant that Learns) and was envisioned as a way to help the military better manage multiple information sources within their command-and-control environments.

CALO
That project became CALO (Cognitive Assistant that Learns and Organizes) within SRI and thus began Siri's re-birth as the project searched for ways to gather and maintain information in a learning-based environment.  The teams on CALO included some of the best AI scientists, including researchers from Stanford, Yale, MIT, Harvard and Carnegie Mellon.  Not a bad set of teachers for their budding new student, Siri.  But she was not ready to come out just yet.

After a few years, as DARPA's interest in the project waned, SRI began to look at commercializing the technology coming out of the CALO project. With the advent of the growing mobile market, the project was combined with another SRI project called Vanguard, which looked to enhance the ease of use of data services.  These data services were predicted to overtake the declining voice services market within the telephony industry - how right they were.

So Siri's official launching pad was set – a digital assistant to make using the burgeoning data services sector easier to navigate and use.

I'm Sorry HAL, I'm Afraid We Can't Do That
This new combined team at SRI was now working on the technology it internally called HAL. Yes, that HAL from Arthur C. Clarke’s 2001: A Space Odyssey.  That was changed soon after, however, owing in part to the fictional computer's tendency toward homicide, to the more sedate "Active Technologies".


As part of the spinoff plan, a venture firm was called in and began the process of raising funding for the new company, to be called "Siri", a nod to her birthplace.  In October of 2008, Siri, Inc. raised $8.5 million and in 2009 another $15.5 million.  That’s now $24 million towards her upbringing, on top of the $150 million or more of taxpayer money already invested at SRI.

The Apple of her Eye
Siri began making the rounds at computer conferences, and the advent of the smartphone helped give more form to her function.  Interestingly, Vlingo was initially used as the speech recognition engine when she was first built, prior to the eventual switch to Nuance (the VR part could be changed again if needed). The Apple iPhone 3GS was used as the mobile platform to launch the first Siri app on February 4, 2010. Within three short months, she was adopted by her current parents - Apple.  She was bought for an estimated $200 million or so, which, when added to the initial costs of her upbringing, brings the tab easily over the half a billion mark. The original Siri app was eventually put on ice as the technology was re-worked into the iPhone's base architecture.

Siri's future is now in the hands of Apple, with many of the same development team who raised her now alongside her as Apple employees.  Her recent debut on the iPhone 4S has some convinced the "S" is for Siri, as she has the potential to dramatically change how we use technology once again, which is just what she set out to do over 40 years ago.

Her parents sure think she can.  They should know.

Wednesday, October 19, 2011

"My hovercraft is full of eels"


Although built on an excellent voice recognition engine and using its servers, Siri sometimes needs a little help to understand what you are trying to say. At times it can feel like the infamous Monty Python sketch that gave this post its title.  It is unclear how much the engine learns over time from your speech, but she purports to adapt to your speaking the more you use her. For now, some things can help ease the learning curve between man and machine...girl. Like her carbon-based counterparts, Siri sometimes acts like she only hears what she wants to hear. Although light years from the handwriting recognition of the Apple Newton and much better than early cellphone voice dialing attempts ("call Cindy"...."now calling Mindy"), Siri still needs some teaching or cheating to understand some of your commands.

Since many other applications use the same Nuance-inspired VR engine (which was also born at SRI, where Siri was hatched before being sold to Apple in 2010), some of those experiences can help make the relationship with Siri work even better.  Apps like Dragon Dictation can provide some history of the "nuances" of the VR engine, and many of its commands seem to work with Siri.  Here is also a list of commands that should work and give you some ideas.

There are also some other things that can help.

Siri can be easily coaxed or tricked into thinking she is right when the situation arises.  As an example, when referring to a stored contact "Costco", Siri had a little trouble at first recognizing the name as spoken. She did not know what "cosco" was when it was enunciated.  The store was correctly spelled and saved in the Contacts file, so what is up, Siri?  Well, being her literally perfect self, she was mostly right for being a bit confused. When many people pronounce the name of the large membership warehouse store, they pronounce it as "coss co", not as it is spelled, "cost co".  The solution was to either pronounce it as spelled with the "t" properly voiced, or do a little sleight of tongue.

The Contacts card has a couple of helpful things which can solve this problem without having to change your improper diction.  Simply edit the Contact, tap "Add a Field", and choose to add a Nickname entered as you say it, "cossco".  You can also choose to add it phonetically in the provided Phonetic field for the first name.  Now she can understand your mangled use of the language and you don't need to misspell the contact name.  Siri can also be corrected on the fly by retyping the misinterpreted name when Siri writes it back for your approval or clarification.
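
To make the idea concrete, here is a minimal sketch of why a Nickname or Phonetic entry helps. It is not Siri's actual matching logic, which Apple has not published; the `Contact` class, `normalize` and `find_contact` are hypothetical names made up for illustration. The only assumption is that whatever was heard gets compared against every name field on the card, not just the spelled name.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Contact:
    """Hypothetical, simplified contact card (not Apple's data model)."""
    name: str
    nickname: str = ""
    phonetic_name: str = ""


def normalize(text: str) -> str:
    """Lowercase the text and keep only letters and digits."""
    return "".join(ch for ch in text.lower() if ch.isalnum())


def find_contact(heard: str, contacts: List[Contact]) -> Optional[Contact]:
    """Return the first contact whose name, nickname or phonetic name
    matches what was heard, or None if nothing matches."""
    key = normalize(heard)
    for contact in contacts:
        fields = (contact.name, contact.nickname, contact.phonetic_name)
        if any(field and normalize(field) == key for field in fields):
            return contact
    return None


# "coss co" does not match the spelled name "Costco"...
plain = [Contact(name="Costco")]
print(find_contact("coss co", plain))

# ...but it does once the nickname is stored the way it is spoken.
fixed = [Contact(name="Costco", nickname="cossco")]
print(find_contact("coss co", fixed))
```

The contact keeps its correct spelling; only the extra field carries the mangled version.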


It also appears that steady commands with no hesitation work best, so as always with dictation systems, know what you are going to say before asking her to listen to you.  She can be a bit impatient.

Location, Location, Location


One of the best examples of Siri's power is her ability to accomplish multiple tasks based on a single request.  Location-based reminders using geo-fencing demonstrate this fairly well. This is the ability to ask Siri to remind you of something when you arrive at or leave a known location (ex. "remind me to get milk at Acme"). This feature takes a little extra set-up for any place besides Home or Work and has tended to be somewhat tricky for many who are trying to use it.

The first thing, obviously, is to make sure Location Services are turned ON for Siri.  The second is to make sure that the iCloud settings (Settings -> iCloud) have Reminders set ON.



Next, under Settings -> "Mail, Contacts, Calendar", scroll all the way down the page and make sure the Reminders "Default List" is set to use Reminders under iCloud.

Now for some important tweaks to Contacts. It is important that places you want to refer to as a location are properly set up in your Contacts.  Besides the obvious things like the name and address of the place ("Acme Food"), it appears that setting up a nickname is also important for Siri to use that contact's location for reminders.  This is accomplished by editing the contact, scrolling down to "Add a Field" and tapping it.

This will bring up fields you can select. Choose the Nickname field at the bottom of the first section and enter the name you will use to reference that location ("Acme", "grocery store", etc.).  This is how Siri will know this is the place or contact you are asking to be located.  Save the edited contact and wait briefly as it is uploaded to iCloud.  This appears to happen fairly quickly, but not always fast enough for the nickname to work instantly.  It can take minutes at times.

Now you are ready to have Siri be the nagging reminder you need based on where you are at the moment with your phone on you.
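
For the curious, the geo-fencing behind those reminders can be sketched in a few lines. This is only an illustration of the general idea (a circle around a saved address, with a reminder firing when you cross it), not Apple's implementation. The function names, the 150-meter radius and the coordinates are all assumptions chosen for the example.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters


def distance_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two points (haversine formula)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def inside_fence(position, center, radius_m=150.0):
    """True if position (lat, lon) lies within radius_m of center (lat, lon).
    The 150 m default is an arbitrary value for this sketch."""
    return distance_m(*position, *center) <= radius_m


def crossing(was_inside, is_inside):
    """Report 'arrive' or 'leave' when the fence boundary is crossed, else None."""
    if not was_inside and is_inside:
        return "arrive"
    if was_inside and not is_inside:
        return "leave"
    return None


# Made-up coordinates standing in for the address on the "Acme" contact card.
acme = (40.0000, -75.0000)
down_the_road = (40.0100, -75.0000)   # roughly a kilometer north
in_the_lot = (40.0005, -75.0000)      # a few dozen meters away

before = inside_fence(down_the_road, acme)
after = inside_fence(in_the_lot, acme)
print(crossing(before, after))
```

This also shows why the contact needs a proper address: without one there is no center point to draw the fence around.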


Some tips: Speak your commands in a fluid voice without hesitations or breaks.  If you say, "remind me to get bread..." and then hesitate before saying where and when, Siri will jump in and ask you when you want the reminder.  You would then need to say, "at Acme"; however, this seems more prone to failure than when she accepts it in a single statement.  Instead, say, "remind me to get bread at Acme" in one flowing command and you can avoid having to add that additional clarification.  If successful, she will then show you the Reminder and ask for a confirmation or cancellation.  If you have several different food stores you shop at, you may want to make sure you use a nickname that you will remember.  Using "grocery store" would be fine only if you live in a town with only one grocery store or only frequent that one store for food. It is also wise to see what Siri is using when you speak the location name.  It could be a simple case of pronunciation, which is easily solved.
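
A toy example shows why the one-breath version is the safer bet: a single statement carries both the task and the place, while a truncated one leaves the place missing and forces a follow-up question. The `parse_reminder` function below is a hypothetical helper written for this post. It says nothing about how Siri really interprets speech, and it only understands the exact phrasing "remind me to ... at ...".

```python
def parse_reminder(utterance):
    """Split 'remind me to <task> at <place>' into its two parts.

    Returns None if the utterance is not a reminder request, and a dict
    with place=None if no ' at <place>' was spoken.
    """
    text = utterance.strip()
    prefix = "remind me to "
    if not text.lower().startswith(prefix):
        return None
    body = text[len(prefix):]
    # Split on the LAST " at " so that a task such as "look at shoes"
    # is not cut in half when a place follows it.
    task, separator, place = body.rpartition(" at ")
    if not separator:
        return {"task": body, "place": None}
    return {"task": task, "place": place}


print(parse_reminder("remind me to get bread at Acme"))
print(parse_reminder("remind me to get bread"))
```

In the second call the place comes back empty, which is the point where a real assistant has to stop and ask.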


Tuesday, October 18, 2011

Calling All Assistants


Nothing looks more unseemly than someone yelling into a cell phone held out in front of them tilted away, as though they were examining a radioactive rock just dropped from space. For some reason, speaking commands to computing devices has evolved into a learned behavior which presumes that one must hold the device at arm's length, tilt it and then speak as though giving directions to a partially deaf, geriatric foreigner.

Although handy for showing off Siri's features, this communication form is best used for those public displays where a group is treated to what is usually a failed attempt at showing off voice recognition on your new phone.  If the invited on-lookers walked away, she would of course suddenly understand what you said.
 

Thankfully, Siri can be accessed and commanded without having to hold the iPhone out and yell at it or hold it perpendicular to your face.  She can discreetly be called up from her slumber by simply raising the phone to your ear, as though answering an incoming call, when the phone is unlocked.  After a second, the phone will beep, letting you know Siri is ready for a new command.  That command can be spoken into the phone as though you were on a call, thereby not annoying those around you (any more than they may already be).

This feature can be turned on and off in the Siri settings, under General.  I would set "Raise to Speak" on for those times when you want to impress yourself and her without calling attention to yourself.