How did IBM train the Watson AI platform?

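Google can rely on its search engine and on user interactions with its many applications to train its AI systems, but IBM does not have nearly as many consumer-facing applications and products. So how was the Watson AI system trained?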




1, A huge amount of publicly available unstructured and semi-structured data is fed into the Watson database, much as a search engine builds its index. This phase is offline, i.e. it is done before the Jeopardy! show takes place.

2, At the show, the questions are sent to Watson in text form at the same moment the human players see them. [Thanks to Marcus, see comment for the link]

3, The question, in its text form, is used as a search query against the database, much like a Google search. Only the few hundred best search results are kept.

4, The search results, together with the question, are used to retrieve supporting evidence from the database.



5, Each search result, taken as a candidate answer to the question, now forms a hypothesis. This hypothesis is then evaluated against the retrieved evidence, and the answer is scored on many dimensions.

6, The answers, each scored on many dimensions, are ranked using a merging algorithm, and one of them comes out on top.


7, If Watson is confident enough in its final answer, it will attempt to answer the question. Of course, it first converts the answer into the form of a question, as Jeopardy! requires.
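To make the steps above concrete, here is a toy sketch of the whole flow in Python. It is not Watson's actual code: the three-passage corpus, the two scoring dimensions, the hand-picked weights and the 0.7 confidence threshold are all assumptions made up for illustration, and the real system uses far more data and many more scoring algorithms.

```python
import re

# Hypothetical toy corpus standing in for the data ingested offline.
CORPUS = {
    "Paris": "Paris is the capital of France and sits on the river Seine.",
    "Lyon": "Lyon is a city in France known for its cuisine.",
    "Berlin": "Berlin is the capital of Germany.",
}
# Hand-picked weights and threshold; purely illustrative assumptions.
WEIGHTS = {"overlap": 0.8, "not_in_question": 0.2}
CONFIDENCE_THRESHOLD = 0.7


def tokens(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def build_index(corpus):
    """Offline phase: map each term to the titles of the passages containing it."""
    index = {}
    for title, text in corpus.items():
        for term in set(tokens(text)):
            index.setdefault(term, set()).add(title)
    return index


def search(index, question, keep=2):
    """Use the question text as a query and keep only the best candidates."""
    hits = {}
    for term in set(tokens(question)):
        for title in index.get(term, ()):
            hits[title] = hits.get(title, 0) + 1
    # Sort by hit count, breaking ties by title so the result is deterministic.
    return sorted(hits, key=lambda t: (-hits[t], t))[:keep]


def retrieve_evidence(corpus, candidate):
    """Collect every passage that mentions the (single-word) candidate answer."""
    name = candidate.lower()
    return [text for text in corpus.values() if name in tokens(text)]


def score(candidate, question, evidence):
    """Treat the candidate as a hypothesis and score it on two dimensions."""
    q_terms = set(tokens(question))
    # Dimension 1: best fraction of question terms found in any evidence passage.
    overlap = max(
        (len(q_terms & set(tokens(p))) / max(len(q_terms), 1) for p in evidence),
        default=0.0,
    )
    # Dimension 2: an answer that already appears in the question is suspicious.
    not_in_question = 0.0 if candidate.lower() in q_terms else 1.0
    return {"overlap": overlap, "not_in_question": not_in_question}


def merge(scores):
    """Collapse the per-dimension scores into one confidence value."""
    return sum(WEIGHTS[name] * value for name, value in scores.items())


def answer(question, corpus=CORPUS):
    """Run the whole pipeline; return a Jeopardy!-style response or None."""
    index = build_index(corpus)  # rebuilt per call only to keep the sketch short
    ranked = []
    for candidate in search(index, question):
        evidence = retrieve_evidence(corpus, candidate)
        ranked.append((merge(score(candidate, question, evidence)), candidate))
    if not ranked:
        return None
    confidence, best = max(ranked)
    if confidence < CONFIDENCE_THRESHOLD:
        return None  # not confident enough to attempt an answer
    return f"What is {best}?"
```

Calling `answer("This city is the capital of France")` gives a response phrased as a question, while a clue that matches the corpus only weakly falls below the threshold and returns `None`, i.e. the toy system stays silent.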


That said, Watson is a complicated system in which each of the phases described above uses a variety of algorithms. The system runs on a parallel platform so that it can give its answer as quickly as possible.
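As a rough illustration of why a parallel platform helps: every candidate answer can be scored independently of the others, so the scoring work can be fanned out and the results collected at the end. The sketch below shows only that structure, using Python's standard `concurrent.futures`; the scorer `score_candidate` and the helper `score_in_parallel` are hypothetical names, and the scoring rule is a simple word-overlap stand-in. Note that in CPython a thread pool does not speed up CPU-bound work, so a real system would spread the work across processes or machines.

```python
from concurrent.futures import ThreadPoolExecutor


def score_candidate(job):
    """Hypothetical scorer: fraction of question words found in the evidence text."""
    question, candidate, evidence = job
    q_words = set(question.lower().split())
    e_words = set(evidence.lower().split())
    return candidate, len(q_words & e_words) / max(len(q_words), 1)


def score_in_parallel(question, candidates, max_workers=4):
    """Score all (candidate, evidence) pairs concurrently and return the best one."""
    jobs = [(question, name, evidence) for name, evidence in candidates.items()]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves the order of the jobs in its results.
        results = list(pool.map(score_candidate, jobs))
    return max(results, key=lambda pair: pair[1])
```

For example, `score_in_parallel("capital of france", {"Paris": "paris is the capital of france", "Berlin": "berlin is the capital of germany"})` scores both candidates concurrently and returns the pair for Paris.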

For further information, search Google for "DeepQA".
