What is an oracle experiment?

Question:
I read a machine learning paper that includes an oracle experiment to compare the authors' approach with another study, but it is not clear to me what an oracle experiment is.

Answer:
An "oracle" is an imaginary entity that always gives the right answer. An oracle experiment is used to compare your actual system to how your system would behave if some component of it always did the right thing.
For example, in the NLP domain, let's assume you built a parser that takes part-of-speech (POS) tagged sentences as input. In the real world, you would have to run real sentences through an actual POS tagger. This tagger would probably produce results with accuracy above 90%, but less than 100%. Since the accuracy of your parser depends on the accuracy of the incoming tags, your parser's performance will be negatively affected by this loss.
To see how well your parser would perform if the POS tagger were perfect, you could run an experiment with an oracle tagger. In this experiment, you would replace the real POS tagger with a program that knows the actual (gold) POS tags for the sentences and therefore always returns tags with 100% accuracy.
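So, if your parser gets 85% accuracy in an experiment with a real tagger, and 90% in an experiment with an oracle tagger, then you know that 5 percentage points of your performance loss are directly due to the mistakes of the tagger. The oracle result is also an upper bound: no improvement to the tagger alone can push this parser past 90%.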
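To make the comparison concrete, here is a minimal sketch in Python. Everything in it is a made-up illustration rather than anything from the paper in question: the five-sentence data set, the `LEXICON` lookup tagger, and the `toy_parser` (which only guesses the root word as the first token tagged VERB) are all assumptions chosen so the example stays small. The point is only the shape of the experiment: run the same downstream component twice, once with predicted tags and once with gold tags, and compare.

```python
from typing import Callable, List, Tuple

# Toy gold-standard data (hypothetical): tokens, gold POS tags, gold root index.
Example = Tuple[List[str], List[str], int]
DATA: List[Example] = [
    (["dogs", "bark"], ["NOUN", "VERB"], 1),
    (["the", "dogs", "run"], ["DET", "NOUN", "VERB"], 2),
    (["they", "book", "flights"], ["PRON", "VERB", "NOUN"], 1),
    (["time", "flies"], ["NOUN", "VERB"], 1),
    (["dogs", "that", "bark", "bite"], ["NOUN", "PRON", "VERB", "VERB"], 3),
]

# A deliberately imperfect "real" tagger: one fixed tag per word,
# so ambiguous words such as "book" and "flies" are sometimes mis-tagged.
LEXICON = {
    "dogs": "NOUN", "bark": "VERB", "the": "DET", "run": "VERB",
    "they": "PRON", "book": "NOUN", "flights": "NOUN", "time": "NOUN",
    "flies": "NOUN", "that": "PRON", "bite": "VERB",
}

Tagger = Callable[[List[str], List[str]], List[str]]


def lexicon_tagger(tokens: List[str], gold_tags: List[str]) -> List[str]:
    """Real component: looks only at the tokens and ignores the gold tags."""
    return [LEXICON.get(token, "NOUN") for token in tokens]


def oracle_tagger(tokens: List[str], gold_tags: List[str]) -> List[str]:
    """Oracle component: returns the gold tags, so it is always right."""
    return list(gold_tags)


def toy_parser(tags: List[str]) -> int:
    """Downstream component: guess the root as the first VERB, else token 0."""
    for index, tag in enumerate(tags):
        if tag == "VERB":
            return index
    return 0


def root_accuracy(tagger: Tagger) -> float:
    """Run the full pipeline with the given tagger and score the parser."""
    correct = 0
    for tokens, gold_tags, gold_root in DATA:
        predicted_root = toy_parser(tagger(tokens, gold_tags))
        correct += int(predicted_root == gold_root)
    return correct / len(DATA)


real_acc = root_accuracy(lexicon_tagger)
oracle_acc = root_accuracy(oracle_tagger)
print(f"real tagger:   {real_acc:.0%}")
print(f"oracle tagger: {oracle_acc:.0%}")
print(f"loss attributable to the tagger: {oracle_acc - real_acc:.0%}")
```

On this toy data the pipeline scores 2 out of 5 with the lexicon tagger and 4 out of 5 with the oracle tagger. The gap between the two is the share of the error caused by tagging mistakes, and the one sentence the parser still gets wrong with perfect tags is the parser's own error.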

