Stanford NLP Notes 9: Weighted Minimum Edit Distance

Why compute a weighted minimum edit distance

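  1. In spell checking, some letters are more likely to be mistyped than others.

  2. In computational biology, some insertions and deletions are more likely to occur than others.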

[Image 1: table of how often each letter is mistyped as another]

The table above shows that vowels are often confused with one another: a is frequently mistyped as e or u, whereas a is almost never mistyped as b.


Weighted minimum edit distance

With weights, the initialization and the recurrence relation become:

[Image 2: initialization and recurrence relation for the weighted minimum edit distance]
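The image itself is not reproduced in this copy of the notes. As a hedged reconstruction, the weighted form usually given for this algorithm is written out below. The symbols x, y (the two strings), N, M (their lengths), and the substitution table sub are assumptions taken from the standard presentation, not from the surviving text.

```latex
\begin{aligned}
\text{Initialization:}\quad
  & D(0,0) = 0 \\
  & D(i,0) = D(i-1,0) + \mathrm{del}[x(i)], \qquad 1 \le i \le N \\
  & D(0,j) = D(0,j-1) + \mathrm{ins}[y(j)], \qquad 1 \le j \le M \\[4pt]
\text{Recurrence:}\quad
  & D(i,j) = \min
    \begin{cases}
      D(i-1,j) + \mathrm{del}[x(i)] \\
      D(i,j-1) + \mathrm{ins}[y(j)] \\
      D(i-1,j-1) + \mathrm{sub}[x(i), y(j)]
    \end{cases} \\[4pt]
\text{Termination:}\quad
  & D(N,M) \text{ is the distance}
\end{aligned}
```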

In other words, we now need two extra tables that store the del and ins costs, looked up by the character at each position.
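A minimal Python sketch of this idea follows. It is only an illustration under stated assumptions: the function name weighted_edit_distance, the cost callables, and the toy vowel-based substitution cost are all hypothetical and do not come from the lecture. Real cost tables would be estimated from data such as the confusion counts discussed earlier.

```python
from typing import Callable

def weighted_edit_distance(
    x: str,
    y: str,
    del_cost: Callable[[str], float],
    ins_cost: Callable[[str], float],
    sub_cost: Callable[[str, str], float],
) -> float:
    """Weighted minimum edit distance between x and y (illustrative sketch)."""
    n, m = len(x), len(y)
    # D[i][j] = cheapest way to turn x[:i] into y[:j]
    D = [[0.0] * (m + 1) for _ in range(n + 1)]

    # Initialization: delete every character of x, or insert every character of y
    for i in range(1, n + 1):
        D[i][0] = D[i - 1][0] + del_cost(x[i - 1])
    for j in range(1, m + 1):
        D[0][j] = D[0][j - 1] + ins_cost(y[j - 1])

    # Recurrence: take the cheapest of deletion, insertion, substitution
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            D[i][j] = min(
                D[i - 1][j] + del_cost(x[i - 1]),
                D[i][j - 1] + ins_cost(y[j - 1]),
                D[i - 1][j - 1] + sub_cost(x[i - 1], y[j - 1]),
            )
    return D[n][m]

# Toy costs (assumed values, for illustration only):
# vowel-for-vowel substitutions are cheap, all other substitutions are expensive.
VOWELS = set("aeiou")

def toy_sub_cost(a: str, b: str) -> float:
    if a == b:
        return 0.0
    if a in VOWELS and b in VOWELS:
        return 0.5
    return 2.0

def unit_cost(ch: str) -> float:
    return 1.0

if __name__ == "__main__":
    print(weighted_edit_distance("cat", "cet", unit_cost, unit_cost, toy_sub_cost))
    print(weighted_edit_distance("cat", "cbt", unit_cost, unit_cost, toy_sub_cost))
```

Passing the costs as functions keeps the sketch short. With lookup tables instead, del_cost and ins_cost would each be a one-dimensional table indexed by character, and sub_cost a two-dimensional one indexed by character pair.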


Where the name "dynamic programming" comes from

"I spent the Fall quarter (of 1950) at RAND. My first task was to find a name for multistage decision processes. An interesting question is, Where did the name, dynamic programming, come from? The 1950s were not good years for mathematical research. We had a very interesting gentleman in Washington named Wilson. He was Secretary of Defense, and he actually had a pathological fear and hatred of the word research. I’m not using the term lightly; I’m using it precisely. His face would suffuse, he would turn red, and he would get violent if people used the term research in his presence. You can imagine how he felt, then, about the term mathematical. The RAND Corporation was employed by the Air Force, and the Air Force had Wilson as its boss, essentially. Hence, I felt I had to do something to shield Wilson and the Air Force from the fact that I was really doing mathematics inside the RAND Corporation. What title, what name, could I choose? In the first place I was interested in planning, in decision making, in thinking. But planning, is not a good word for various reasons. I decided therefore to use the word “programming”. I wanted to get across the idea that this was dynamic, this was multistage, this was time-varying. I thought, let's kill two birds with one stone. Let's take a word that has an absolutely precise meaning, namely dynamic, in the classical physical sense. It also has a very interesting property as an adjective, and that is it's impossible to use the word dynamic in a pejorative sense. Try thinking of some combination that will possibly give it a pejorative meaning. It's impossible. Thus, I thought dynamic programming was a good name. It was something not even a Congressman could object to. So I used it as an umbrella for my activities."

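In short: Bellman was working at RAND at the time. RAND was employed by the US Air Force, and the Air Force in turn answered to Secretary of Defense Wilson, who had a pathological dislike of the word "research". Bellman therefore avoided any name that sounded like research or mathematics and chose the word "programming" (in the sense of planning) instead. "Dynamic" was chosen to describe the method accurately, namely its multistage, time-varying character, and because the word is hard to use in a pejorative sense.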

