有限自动机

阅读更多

有限自动机

摘自《 Finite Automata 》by Mark V. Lawson
刘建文略译( http://blog.csdn.net/keminlau)

Published in: book Finite Automata

Introduction

The theory of finite automata is the mathematical theory of a simple class of algorithms that are important in computer science. Algorithms are recipes that tell us how to solve problems; the rules we learn in school for adding, subtracting, multiplying and dividing numbers are good examples of algorithms. Although algorithms have always been important in mathematics, mathematicians did not spell out precisely what they were until the early decades of the twentieth century, when they started to ask questions about the nature of mathematical proof. Perhaps motivated in part by the mechanical devices that were used to help in arithmetical calculations, mathematicians wanted to know whether there were mechanical ways of proving theorems.

有限自动机理论是一类简单类算法的数学理论,这些算法虽然简单但对计算机科学很重要。算法是指导我们解决问题的处方,就像我们在学校里学习的算术规则就是一个很好的例子。尽管算法在数学中一直存在而且很重要,但是数学家在20世纪之前并不知道算法具体是什么,直到他们开始拷问数学证明的本质才开始清晰。可能得益于早期发明的辅助算术运算的机械装置,数学家们想知道能不能用同样的机械方式来实现定理的证明。

By mechanical, they did not intend to build real machines that could prove theorems; rather they wanted to know whether it was possible in principle to prove theorems mechanically. In mathematical terms, they wanted to know whether there was an algorithm for proving the theorems of mathematics. The answer to this question obviously depended on what was meant by the word 'algorithm.' Numerous definitions were suggested by different mathematicians from various perspectives and, remarkably, they all proved to be equivalent. By the end of the 1930s, mathematicians had a precise definition of what an algorithm was, and using this definition they were able to show that there were problems that could not be solved by algorithmic means. It was perhaps no accident that, as mathematicans were laying the foundations of the theory of algorithms, engineers were constructing machines that could implement algorithms as programs. Algorithms and programs are just two sides of the same coin.

事实上,他们并不是想去建造一部真能自动地进行定理证明的机器;他们想知道理论上这种机械证明是否可行。从数学的角度上讲,就是有没有证明数学定理的算法。很明显这个问题的答案依赖于“算法”的定义。很多数学家从不同的角度给出了不同的算法定义,而这些定义现在已经被证明是等价的。到20世纪30年代后期,数学家们已经给出了算法的准确定义,并利用这个定义证明了存在算法不能解决的问题。在数学家们正为算法理论构建基础的同时,工程师们在建造能以程序文本形式实现算法的机器,二者有必然的联系,算法与程序是一物两面。

Mathematicians were initially interested in the dividing line between what was and what was not algorithmic. But simply knowing that a problem could be solved by means of an algorithm was not the end of the story. Certainly, this implied you could write a program to solve the problem, but did not imply your program would run quickly or efficiently. For this reason, two approaches to classifying algorithms were developed. The first classified them according to their running times and is known as complexity theory, whereas the second classified them according to the types of memory used in implementing them and is known as language theory. In language theory, the simplest algorithms are those which can be implemented by finite automata, the subject of this book.

数学们起初感兴趣的是算法的有与无的界限,然而,简单地知道问题是否有解决的算法并不完事了。因为”一个问题是算法可解“虽然意味着你能写一个程序来解决这个问题,但不意味着你的程序就会快速而有效地解决问题。缘于此,(KEMIN:定性完后进一步定量研究一个问题)研究出两种对算法进行分类的办法。第一种是根据算法所需的运行时间分类,也就是所谓的复杂性理论;第二种是根据算法实现使用的内存种类分类,也就是语言理论。在语言理论,最简单的一类算法就是可以用有限自动机实现的算法,也是本书的主题。

Finite automata were first studied in the 1950s by Stephen Kleene, and found a number of important applications in computer science: for example, in the design of computer circuits, and in the lexical analyzers of compilers. In the 1960s and 1970s, mathematicians such as Samuel Eilenberg, Marcel-Paul Schuumltzenberger, and John Rhodes pioneered the mathematics of finite automata. More recently, other mathematicians have come to appreciate the usefulness of automata in such areas as combinatorial group theory and symbolic dynamics.

有限自动机首次在50年代被Stephen Kleene研究,并发现其在计算机科学在有很多重要的应用,比如,电路设计、编译器的词法分析器。六七年代后,数学家开创了有限自机的数学理论。直接最近,还有一些领域得益于自动机理论,像组合群论和符号动力学。

This book is intended to be an introduction to the mathematical theory of finite automata, assuming as background only a first course in discrete mathematics. The Appendix outlines the prerequisites.

本书可作为有限自动机的数学理论的导论,要求读者有离散数学的背景。

Structure of the book

The book is notionally divided into two parts: Chapters 1 to 6 form the first part, and Chapters 7 to 12 the second; Chapter 7 is the bridge that links them.

PART I. This centres on Kleene's Theorem, the first major result proved about finite automata. I describe two different ways of proving this theorem with Chapter 1 common to both:

本书总体分为两大部分,第一部分(1到6章)主要讲述克林定理的证明;第二部分(7到12章)主要讲述可辨别语言的代数理论……

• The quickest route to proving Kleene's Theorem is the following: Sections 3.1-3.3 for constructions involving non-deterministic automata, Section 5.1 for the definition of regular expressions, and then Theorem 5.2.1 of Section 5.2, which is the proof of Kleene's Theorem itself. Chapter 2, omitting Section 2.5, can be regarded as a collection of examples.

• The route that emphasises algorithms more than proofs is the following: Sections 2.1-2.3 for practice in designing automata, Sections 3.1 and 3.2 for the accessible subset construction, Section 4.1 and Theorem 4.2.1 of Section 4.2 for ε-automata, Section 5.1 for regular expressions, and Section 5.3 for an algorithmic proof of Kleene's Theorem.

Section 2.6 on the Pumping Lemma can be read at any point after Chapter 1. Section 5.4 describes an algebraic technique for converting automata into regular expressions based on solving equations. Chapter 6, on local languages, describes a different way of converting regular expressions into automata from that used in any of the above proofs of Kleene's theorem.

PART II. This centres on the algebraic theory of recognisable languages, the main goals being Schutzenberger's characterisation of star-free languages, and the Variety Theorem of Eilenberg and Schutzenberger. Chapters 7 and 8 have a strong algorithmic flavour, but Chapters 9-12 are increasingly mathematical.

Chapter 7 describes how to find the smallest automaton recognising a given language. Two different techniques are given depending on whether the language is described by an automaton or by a regular expression. If the former, then Section 7.2 describes the algorithm for converting the automaton into a minimal automaton. The theory of minimal automata is developed in Sections 7.3 and 7.4. If you just want the algorithm for minimising an automaton then Sections 7.1 and 7.2 are all you need. If the language is described by a regular expression, then there is a beautiful technique, called the Method of Quotients, which will construct the minimal automaton directly from the expression. Unfortunately, there are 'issues' connected with this method, but I have relegated a discussion of these to the end of Section 7.5.

The minimal automaton is obviously important from a practical point of view, but it is also the starting point of an algebraic technique for studying recognisable languages. This is introduced in Chapter 8. Here the transition monoid of an automaton is defined and an algorithm described for computing it. At the conclusion of this chapter, semigroups and monoids are introduced.

The point of Chapter 8 will only become clear in Chapter 9, when we prove the algebraic counterpart of Kleene's Theorem: a language is recognisable if and only if its syntactic monoid is finite. The syntactic monoid of a recognisable language is isomorphic to the transition monoid of the minimal automaton of the language.

The main result of Chapter 9 tells us that finite monoids may be useful in studying recognisable languages. Because of this, Chapter 10 develops the theory of finite monoids we shall need. We also show how results about recognisable languages, which we previously proved using automata, can also be proved using monoids.

The real justification for studying the syntactic monoid of a recognisable language comes in Chapter 11, where we show that an important class of recognisable languages are characterised by the algebraic properties of their syntactic monoids. Specifically, we prove Schuumltzenberger's Theorem: a language is star-free if and only if its syntactic monoid is aperiodic.

Schuumltzenberger's Theorem opened up a whole new area of research: classifying recognisable languages by means of the algebraic properties of their syntactic monoids. In Chapter 12, we prove the Variety Theorem of Eilenberg and Schuumltzenberger, which provides the template for proving results of this type.

参考

  • http://www.informaworld.com/smpp/section~content=a729200462~db=all~fulltext=743473315~dontcount=true+
  • http://www.ma.hw.ac.uk/~markl/teaching/AUTOMATA/

你可能感兴趣的:(算法,UP,Blog)