February 21, 2011

I'll Take Deep QA for $1000, Alex

Last week, after three years and over $100 million in research and development by IBM, we saw the 3-day advertisement/gameshow that was "Watson vs the best (human) Jeopardy players ever".

Predictably, Watson won.  And equally predictably, it did surprisingly well with some questions and surprisingly poorly with others.  I found it fascinating television.

I suspect it's hard for folks outside the software business to appreciate how difficult it must have been to develop Watson.  This is software that can take the quasi-natural language surrounding the Jeopardy category names and clues, quickly search an internal database, come up with the top three responses, calculate a probability of correctness for each, then (if the top probability is high enough) press the buzzer and give the response.  And it does all of this faster and more accurately than the best people ever to play the game.
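For the programmers in the audience, the last part of that loop (rank the candidate responses, then buzz only if the best one is confident enough) can be sketched in a few lines of Python.  To be clear, this is a toy illustration and not IBM's code: the `Candidate` type, the `decide` function, the 0.5 threshold, and the sample confidences are all made up for the example, and the genuinely hard parts (understanding the clue and scoring the candidates) are assumed to have already happened.

```python
from typing import List, NamedTuple, Optional


class Candidate(NamedTuple):
    """A hypothetical candidate response with its estimated chance of being right."""
    response: str
    confidence: float  # 0.0 to 1.0


def top_candidates(candidates: List[Candidate], k: int = 3) -> List[Candidate]:
    """Return the k highest-confidence candidates, best first."""
    return sorted(candidates, key=lambda c: c.confidence, reverse=True)[:k]


def decide(candidates: List[Candidate], threshold: float = 0.5) -> Optional[str]:
    """Return the response to give, or None to stay off the buzzer.

    The threshold is an assumed, fixed value chosen for illustration only.
    """
    ranked = top_candidates(candidates)
    if ranked and ranked[0].confidence >= threshold:
        return ranked[0].response
    return None


# Made-up confidences, purely for illustration.
confident = [Candidate("A", 0.92), Candidate("B", 0.05), Candidate("C", 0.02)]
unsure = [Candidate("A", 0.30), Candidate("B", 0.25), Candidate("C", 0.20)]

print(decide(confident))  # buzzes in with "A"
print(decide(unsure))     # None: no candidate clears the threshold
```

Even in this toy form, you can see where the interesting testing questions hide: everything depends on how trustworthy those confidence numbers are.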

I watched it with the same awe that I watch computer-animated graphics in games and movies.  Smart people, creating interesting systems.

If the goal was to attract attention to IBM, then certainly this was a success.  But I wonder why the audience (in the studio built at IBM facilities) was packed with suits, rather than younger folks?  Wouldn't IBM want to use this opportunity to attract (presumably younger) smart new talent to their company?  To me, seeing a room full of mostly VP-types clapping loses some of that "this would be a cool place to work" aura that could have happened.  Do you think the studio audience would have had a slightly different demographic had Google developed the Jeopardy champ instead of IBM?  I do.

Anyway, well done IBM.  Now that you've conquered chess and Jeopardy, what's next?  Obviously Wheel of Fortune is too simple, and there's too much chance involved.  Probably not Are You Smarter Than a 5th Grader.  How about American Idol?  Maybe Celebrity Apprentice?  Perhaps Dancing with the Stars?

"There's no shame in losing to silicon,"
- Ken Jennings 
"What is Toronto????"
- Watson, in the Final Jeopardy category "US Cities" 
"The challenge is over.  Watson, Ken Jennings and Brad Rutter concluded their final round of Jeopardy and the winner was... resoundingly, humankind."
- IBM 
"The opening line of the sales pitch is already written:  'This uses the same technology that beat Ken Jennings on Jeopardy!'"
- Hellerman Baretz Communications 
"How would I test something like that?"
- me 
"It was Watson's "human attributes" that make him so compelling,"
- Joanna Weiss, The Boston Globe. 
"The big winner in this contest was science."
- St. Petersburg Times 
"I for one welcome our new computer overlords"
- Ken Jennings 
I wish IBM had named the project "Deep Q&A", rather than "Deep QA".  I have a hard enough time differentiating those terms for actual humans I meet.

This article originally appeared in my blog: All Things Quality
My name is Joe Strazzere and I'm currently a Director of Quality Assurance.
I like to lead, to test, and occasionally to write about leading and testing.
Find me at http://strazzere.blogspot.com/.


  1. Joe,

    Priceless... great post buddy. But I guess we could finally say that this was "Automagic, and not just Automation".


  2. Thanks, Jim.

    I agree that this is not just Automation.

  3. I'd like to point out that Watson did not do all of the processing that the humans did.

    The humans had to interpret the symbols presented to them (the answers read by Alex or displayed on the screen).

    The text was piped electronically directly to Watson.

    There was no chance for it to misread the question or hear it incorrectly, and it did not expend any effort in converting the symbols into the actual information.

  4. ??? Perhaps I don't understand your point?

    The humans had input fed to them using two senses (reading and hearing) at the same time, in their native language.

    Watson did indeed have to interpret the text symbols fed to it - text clearly isn't the native language of computers, and certainly not text in the format used in Jeopardy.

    And that clearly required processing effort to convert.

  5. The point is that speech recognition or optical character recognition components would require additional processing to turn the symbols into knowledge queries.

    The humans had to go Get Symbols > English > Internal Processing. Watson just had to go English > Internal Processing.

    As a point of order, English is not the native language of the brain's processing, either.

  6. Yes, speech recognition or OCR would have required different processing than text input.

  7. And it is an additional step that the humans performed which Watson did not.

    Advantage: Watson.