The Coming Wave of Gadgets That Listen and Obey

From: http://www.nytimes.com/2008/01/27/business/27proto.html?_r=2&oref=slogin&oref=slogin

Published: January 27, 2008

INNOVATION usually needs time to steep. Time to turn the idea into something tangible, time to get it to market, time for people to decide they accept it. Speech recognition technology has steeped for a long time: Mike Phillips remembers that in the 1980s, when he was a Carnegie Mellon graduate student trying to develop rudimentary speech recognition systems, “it seemed almost impossible.”

Now, devices that incorporate speech recognition are starting to hit the mass market, thanks to entrepreneurs like Mr. Phillips. He is the chief technology officer and a co-founder of the Vlingo Corporation, an 18-month-old start-up in Cambridge, Mass., that is selling services to cellular carriers and other software companies that want to give their customers the ability to let their mouths do the walking — and the searching.

Vlingo’s service lets people talk naturally, rather than making them use a limited number of set phrases. Dave Grannan, the company’s chief executive, demonstrated the Vlingo Find application by asking his phone for a song by Mississippi John Hurt (try typing that with your thumbs), for the location of a local bakery and for a Web search for a consumer product. It was all fast and efficient. Vlingo is designed to adapt to the voice of its primary user, but I was also able to use Mr. Grannan’s phone to find an address.

The Find application is in the beta test phase at AT&T and Sprint. Consumers who use certain cellphones from those companies can download the application from vlingo.com.

Mr. Phillips has spent more than 15 years in the trenches at companies that nourished speech recognition. In 1994, he was one of the founders of Speechworks, which made early interactive voice-response systems, the now-ubiquitous automated services that answer when we call a company. In 2000, Speechworks was acquired by ScanSoft, which five years later bought Nuance Communications, keeping Nuance as the name. Mr. Phillips left that year to work at M.I.T. as a visiting researcher.

In 2006, he and a colleague from ScanSoft, John Nguyen, started Vlingo because they thought that speech recognition technology, cellular networks and phones were all becoming powerful enough to allow voice navigation systems on cellphones. “We couldn’t have done this five years ago,” he says.

Now, Mr. Phillips is in a race for market share. Another start-up, Yap Inc., based in Charlotte, N.C., is running a beta test of its service, which is similar to Vlingo’s but already has text messaging. Igor and Victor Jablokov, Yap’s co-founders, decided to start the company because they saw their teenage sister text-messaging while in a car.

She wasn’t driving at the time, but Igor Jablokov says cellular companies tell him in meetings that two-thirds of their teenage customers have either sent or read a text message while behind the wheel.

Big companies are also attracted to this market. Nuance started its Nuance Voice Control system last August, the same month that Vlingo’s appeared. Nuance’s system is in use at Sprint and Rogers Communications and can be downloaded to 66 models of hand-held phones, with many more on the way.

Microsoft is a significant potential competitor, thanks in part to its purchase of TellMe Networks last March. TellMe offers a speech-driven search application for cellphones that is available to customers of AT&T — only those who were part of Cingular before the merger — and Sprint. TellMe’s system is built-in on the new Mysto phone from Helio, a mobile phone operator started by Earthlink and SK Telecom, and is the engine for 1800call411, a free directory information service.

Over all, speech recognition was a $1.6 billion market in 2007, according to Opus Research, which predicts an annual growth rate of 14.5 percent over the next three years. Dan Miller, an analyst at Opus, said that companies that have licensed speech recognition technology would probably see faster revenue growth, as more consumers used the technology. The cellphone market holds the most potential, given its billions of phones, but cellular providers are still working out the business model for such services.

Igor Jablokov, Yap’s chief executive, says that he wants his application to be supported by advertising, but that the carriers with whom he is negotiating, which he declined to name, want to charge customers for the service.

To be sure, speech recognition technology has been available on personal computers since 2001 in applications like Microsoft Office, but few people use it. In cellphone and other markets, however, speech recognition “is on the cusp of a curve,” says Bill Meisel, editor of Speech Strategy News, an industry newsletter.

Speech recognition, already used in high-end G.P.S. systems and luxury cars from Cadillac and Lexus, is now spreading to less expensive systems and cars — witness those slapstick Ford Sync commercials, featuring vignettes like one showing a young woman who approaches her office building and says “door open,” expecting it to respond the way her car does. It doesn’t, and she and her coffee cup smack directly into it.

Sync was developed by Microsoft and Ford, and based on Nuance technology. And the speech technology chief at I.B.M. Research, David Nahamoo, says the company has an automotive customer testing speech recognition to help drivers find songs quickly while driving — no more pushing buttons.

Then there’s SimulScribe, a New York company that is one of several businesses using speech recognition to convert voice mail into e-mail. “Voice recognition has finally hit the point where someone like ourselves can take it over the hump for specific applications,” says James Siminoff, SimulScribe’s chief executive.

James R. Glass, a principal research scientist at the Computer Science and Artificial Intelligence Laboratory at M.I.T., says speech technology “is going to end up everywhere speech can be useful.” He says machines will keep improving their ability to recognize the way humans naturally talk, even if they have strong accents, and that the technology will find myriad new uses.

THIS doesn’t mean that people will always choose to speak. Genevieve Bell, director of user experience at the digital home group of Intel, says people are unlikely to want to use speech recognition to handle their finances, at least in public spaces. It also may not work well in the living room.

Ms. Bell jokes that if she could, she would yell “cricket!” at the television anytime she walked into a room, so her favorite sport would appear on the screen.

Even a digital expert like her cautions that some people may never be satisfied with the quality of speech recognition technology — thanks to a steady diet of fictional books, movies and television shows featuring machines that understand everything a person says, no matter how sharp the diction or how loud the ambient noise. But soon we will be able to speak our minds to many of our machines, and have them obey our commands.

Michael Fitzgerald writes about business, technology and culture.
