The machine had been delivered two days ago on her first adult birthday. She had said, "But father, everybody - just everybody in the class who has the slightest pretensions to being anybody has one. Nobody but some old drips would use hand machines - "

The salesman had said, "There is no other model as compact on the one hand and as adaptable on the other. It will spell and punctuate correctly according to the sense of the sentence. Naturally, it is a great aid to education since it encourages the user to employ careful enunciation and breathing in order to make sure of the correct spelling, to say nothing of demanding a proper and elegant delivery for correct punctuation."

Even then her father had tried to get one geared for type-print as if she were some dried-up, old-maid teacher. But when it was delivered, it was the model she wanted - obtained perhaps with a little more wail and sniffle than quite went with the adulthood of fourteen - and copy was turned out in a charming and entirely feminine handwriting, with the most beautifully graceful capitals anyone ever saw. Even the phrase, "Oh, golly." somehow breathed glamour when the Transcriber was done with it.

--Isaac Asimov, Second Foundation - 1953



Here at AnandTech, we do our best to cover the topics that will interest our readers. Naturally, some topics are of interest to the vast majority of readers, while others target a more limited audience. At first glance, this article falls squarely into the latter category. However, when we think about where computers started and where they are now, and then extrapolate to where they are heading, the user interface clearly has to play a substantial part in making computers easier to use for a larger portion of the population. Manual typewriters gave way to keyboards; text interfaces have (mostly) been replaced by GUIs; and we now have mice, trackballs, touchpads, and WYSIWYG interfaces. Unfortunately, we have yet to realize the vision of Isaac Asimov and other science fiction writers, in which computers can fully understand human speech.

Why does any of this really matter? After all, we're all basically familiar with keyboards and mice, and they seem to get the job done quite well. Certainly, it's difficult to imagine speech recognition becoming the preferred way of playing games. (Well, some types of games at least.) There are also people who can type at 140 wpm or faster -- wouldn't dictating to the computer just slow them down?

There are plenty of seemingly valid concerns, and change can be a difficult process. However, think back for a moment to the first time you saw Microsoft's new wheel mouse. I don't know how other people reacted, but the first time I saw one I thought it was the stupidest gimmick I had ever seen. I already had a three button mouse, and while the right mouse button was generally useful, the middle mouse button served little purpose. How could turning the middle mouse button into a wheel possibly make anything better? Fast forward to today, and it irritates me to no end if I have to use a mouse that doesn't have a wheel. In fact, when I finally tried out the wheel mouse, it only took about two hours of use before I was hooked. I've heard the same thing from many other people. In other words, just because something is different or you haven't tried it before, don't assume that it's worthless.

There are a couple of areas in which speech recognition can be extremely useful. For one, there are handicapped people who don't have proper control over their arms and hands, and yet they can speak easily. Given how pervasive computers have become in everyday life, flat-out denying access to certain people would be unconscionable. Many businesses are finding speech recognition to be useful as well -- or more appropriately, voice recognition. (The difference between speech recognition and voice recognition is that voice recognition generally only has to deal with a limited vocabulary.) As an example, warehousing job functions only require a relatively small vocabulary of around 400 words, and letting a computer system interface with the user via earphones and a microphone frees up the hands to do other things. The end result is increased productivity and fewer errors, which in turn yields better profitability.

38 Comments

  • Googer - Saturday, April 22, 2006 - link

    BMW 7 series Speech recognition is about 50-75% accurate (my guess) and some users have more luck with it than others.
  • Googer - Friday, April 21, 2006 - link

    I think you should re-benchmark these on a system that is not overclocked. Overclocking may have contributed to erroneous test results. It is possible that some of the benchmarks could have been better on a normal system. Also, I am surprised this was not tested on an Intel system. Perhaps one of the programs may benefit from the NetBurst architecture, with or without dual core.

    Also, I would love to download the Dictation and Normal Voice wav files, so I can understand the difference between them. Thanks for the article, it came at the perfect time; someone who is handicapped was asking me about this last night.
  • JarredWalton - Friday, April 21, 2006 - link

    I'll see about putting up some MP3s of the wave files -- of course, that will open the door for all of you to make fun of how I speak. LOL

    In case this wasn't entirely clear in the article, this was all done on the system that I use every day for work. It's overclocked, and it's been that way for six months. I run stress tests (Folding at Home -- on both cores) all the time. I would be very surprised if the overclock has done anything to affect accuracy, especially since I did run some tests on a couple of other systems that were not overclocked. I removed those results from the article because they would have taken more time to write up and didn't give me any new information.

    It's pretty obvious that neither of these algorithms benefits from multiple processing cores -- HyperThreading, dual core, SMP, whatever. I also wasn't sure how much interest there would be from people in this topic, but if a lot of people want to know how this runs on Intel systems I could go back and look at one. One thing worth noting is that SysMark 2004 does include Dragon NaturallySpeaking version 6.5 as one of the tests. Of course, the results are buried in the composite scores.
  • JarredWalton - Friday, April 21, 2006 - link

    MP3 links available:

    http://www.anandtech.com/multimedia/showdoc.aspx?i...

    Note that DNS only uses WAV files (AFAICT), but uploading 45MB WAV files seems pointless. Convert the MP3s to WAVs if you want to try them with Dragon.
  • Googer - Saturday, April 22, 2006 - link

    Excellent job on the dictation/wav files, you are a very good reader and have a nice, clear and concise voice. ;ThumbsUP)
  • stelleg151 - Friday, April 21, 2006 - link

    Cool article. I hope that voice recognition continues to improve, for I think it could be incredibly useful for areas like HTPC, or as you said, messaging while doing other things (gaming).
  • Zerhyn - Friday, April 21, 2006 - link

    Have you ever tried out speech recognition and been underwhelmed? To you yearn to play the role of Scotty and call out..

    ?
  • PrinceGaz - Friday, April 21, 2006 - link

    Yes, that was the first thing I noticed before I even started reading the article. Maybe they used speech-recognition software to enter that.

    I think they should have an editor (or at least let another contributor read what others have written) who has to approve an article before it goes live as the current number of tyops is unforgiveable ;)
  • JarredWalton - Friday, April 21, 2006 - link

    I'm doing my best to catch typos before anything goes live, but after being up all night trying to finish off this article, I went to post and realized I didn't have a title or intro. So, I put one in using Dragon, but my diction goes to pot when I'm tired, as does my eyesight and proofing ability. One typo in a 44 word intro (I didn't proof/edit it at all) isn't too bad for the software. Bad for me? Maybe, but mistakes do happpen. :)
  • johnsonx - Friday, April 21, 2006 - link

    One nice thing about Dragon, despite the high CPU utilization shown in the article, is that it will run quite happily on very lowly systems. I have a customer who uses it all day long on PentiumIII-850's with only 512MB RAM (the max for those particular systems). The heaviest user there recently upgraded to a low-end Sempron64 with a gig of RAM, and he says the overall system is far more responsive (of course), but Dragon's operation isn't radically better; it worked great on the PIII, and works great now.
