A Lexical Analyzer Written in C# (Compiler Principles)

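        My compiler principles instructor recently asked us to build a lexical analyzer. I am learning C# at the moment, so I wrote one in C# for fun. A first round of testing suggests it meets the assignment's requirements, so I am posting the code here. If you spot anything wrong, please point it out!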

        First, create a new C# Windows application project. I named mine WordAnalysis.

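        First, the requirements. The task is to analyze statements in a Pascal-like language. The textbook's requirements are fairly simple: recognize the keywords DIM, IF, DO, STOP and END, variables (at most 8 characters long), integer constants (reported as $INT), the operators (=, *, **, +), commas, and parentheses. The results must be reported as two-element (name, code) pairs, so we define a struct to represent one pair. The struct is defined as follows: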

        struct AnalysisResult
        {
            public AnalysisResult(string HelpStr, int code)
            {
                this.helpStr = HelpStr;
                this.AnsCode = code;
            }

            public override string ToString()
            {
                return "("+helpStr+","+AnsCode.ToString()+")";
            }

            public string helpStr;
            public int AnsCode;
        }

        Here I override the struct's ToString() method so that it produces output of the form ($DIM,1).

        So the possible pairs are:
        ($SPACE,0) ($DIM,1) ($IF,2) ($DO,3) ($STOP,4) ($END,5) ($ID,6) ($INT,7) ($ASSIGH, 8)
        ($PLUS,9) ($STAR,10) ($POWER,11) ($COMMA,12) ($LPAR,13) ($RPAR,14) 
        ($ENTER,15) ($ERROR,16)      
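
        As an illustration only, the code table above could also be written as an enum. The names TokenCode and TokenFormat are hypothetical and do not appear in the implementation below, which uses raw int literals; this is just a sketch of how the ($NAME,code) text relates to the table.

```csharp
using System;

// Hypothetical enum mirroring the code table above (assumption: not part of
// the article's implementation, which passes int literals directly).
enum TokenCode
{
    SPACE = 0, DIM = 1, IF = 2, DO = 3, STOP = 4, END = 5, ID = 6, INT = 7,
    ASSIGH = 8, PLUS = 9, STAR = 10, POWER = 11, COMMA = 12, LPAR = 13,
    RPAR = 14, ENTER = 15, ERROR = 16
}

static class TokenFormat
{
    // Builds the "($NAME,code)" text for one pair from its enum member.
    public static string Pair(TokenCode code)
    {
        return "($" + code.ToString() + "," + ((int)code).ToString() + ")";
    }
}
```

        For example, TokenFormat.Pair(TokenCode.DIM) produces ($DIM,1), the same text the struct's ToString() returns.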

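        The textbook explains what each one means, and the English names make it fairly obvious what each one recognizes. The important part is the analysis itself. My approach is this: split the input into an array of lines and analyze it line by line; split each line on spaces into an array of strings; analyze each string; and finally output the results. If a lexical error is found during analysis, the error type is output as well. There are three error types here: illegal identifier, malformed expression, and identifier longer than 8 characters. You can extend these yourself.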
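
        The two-level split described above can be sketched on its own as follows. SplitSketch and SplitIntoChunks are hypothetical names used only for this illustration, and accepting both "\r\n" and "\n" line endings is my assumption about the input.

```csharp
using System;
using System.Collections.Generic;

static class SplitSketch
{
    // Returns, for each input line, its non-empty space-separated chunks.
    public static List<string[]> SplitIntoChunks(string source)
    {
        var result = new List<string[]>();
        // Accept both Windows ("\r\n") and Unix ("\n") line endings.
        string[] lines = source.Split(new[] { "\r\n", "\n" }, StringSplitOptions.None);
        foreach (string line in lines)
        {
            // Drop the empty entries produced by consecutive spaces.
            result.Add(line.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries));
        }
        return result;
    }
}
```

        Each chunk is then analyzed independently, which is why an expression such as A=B+1 written without spaces arrives at the analyzer as a single string.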

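        Now for how each string is analyzed. My approach may be somewhat long-winded and you are welcome to optimize it; it is only a reference. First check the string's length. If it is 1, it clearly cannot be a keyword, which narrows the checks: test in turn whether it is a digit, an operator, or a letter (a variable name), and otherwise report an error (illegal identifier). If the length is not 1, proceed as follows: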

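        First check whether the string contains an operator. If it does not, the string is a single unit; otherwise it is an expression. With no operator, test in turn whether it is an integer, a keyword, or a variable name, and otherwise report an error (illegal identifier). With an operator it is an expression, and the analysis is a little more involved, so it is described separately: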
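
        The operator-free case can be sketched as a single function. WordSketch and ClassifyWord are hypothetical names, and returning the numeric code directly (16 for any error) is a simplification I chose for the sketch; the real implementation below records error messages separately.

```csharp
using System;

static class WordSketch
{
    // Returns the code for an operator-free chunk:
    // 7 ($INT), 1-5 (keywords), 6 ($ID), or 16 ($ERROR).
    public static int ClassifyWord(string word)
    {
        if (word.Length == 0) return 16;

        // 1. Integer: every character is a digit.
        bool allDigits = true;
        foreach (char c in word)
        {
            if (!char.IsDigit(c)) { allDigits = false; break; }
        }
        if (allDigits) return 7;

        // 2. Keyword: compared case-insensitively.
        switch (word.ToUpperInvariant())
        {
            case "DIM": return 1;
            case "IF": return 2;
            case "DO": return 3;
            case "STOP": return 4;
            case "END": return 5;
        }

        // 3. Identifier: a letter or '_' followed by letters, digits, or '_',
        //    and at most 8 characters long.
        if (!char.IsLetter(word[0]) && word[0] != '_') return 16;
        for (int i = 1; i < word.Length; i++)
        {
            if (!char.IsLetterOrDigit(word[i]) && word[i] != '_') return 16;
        }
        return word.Length <= 8 ? 6 : 16;
    }
}
```

        The order of the tests matters: the keyword check has to come before the identifier check, because every keyword is also a syntactically valid identifier.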

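        Convert the string to a character array and scan from the first character. When the n-th character turns out to be an operator, treat the not-yet-analyzed characters before it as one unit and test in turn whether that unit is a number, a keyword, or a variable, reporting an error otherwise. Then determine which operator the n-th character is. There is one special case: when it is *, check whether the (n+1)-th character is also *. If so, output ($POWER,11) and continue from character n+2; otherwise output ($STAR,10) and continue from character n+1. Repeat these steps until all characters have been analyzed.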
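
        The scanning loop just described, including the one-character lookahead for **, can be sketched in isolation. ExprSketch and Scan are hypothetical names; to keep the sketch short it only splits the chunk into operand and operator texts and leaves classifying the operands to the steps described earlier.

```csharp
using System;
using System.Collections.Generic;

static class ExprSketch
{
    // Maps a single operator character to its code, or 0 if it is not an operator.
    static int OperatorCode(char c)
    {
        switch (c)
        {
            case '=': return 8;
            case '+': return 9;
            case '*': return 10;
            case ',': return 12;
            case '(': return 13;
            case ')': return 14;
            default: return 0;
        }
    }

    // Splits a chunk such as "A=B**2" into operand texts and operator symbols.
    public static List<string> Scan(string chunk)
    {
        var pieces = new List<string>();
        int start = 0; // index where the current operand begins
        for (int i = 0; i < chunk.Length; i++)
        {
            if (OperatorCode(chunk[i]) == 0) continue;
            // Everything between the previous operator and this one is one operand.
            if (i > start) pieces.Add(chunk.Substring(start, i - start));
            // Look one character ahead to tell "**" from "*".
            if (chunk[i] == '*' && i + 1 < chunk.Length && chunk[i + 1] == '*')
            {
                pieces.Add("**");
                i++; // skip the second '*'
            }
            else
            {
                pieces.Add(chunk[i].ToString());
            }
            start = i + 1;
        }
        // Whatever follows the last operator is the final operand.
        if (start < chunk.Length) pieces.Add(chunk.Substring(start));
        return pieces;
    }
}
```

        Checking i + 1 < chunk.Length before reading the next character avoids relying on an exception when * is the last character of the chunk.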

         That is the general idea. Below is my actual implementation for reference.

         Create a new class file named Analysis.cs with the following code:

using System;
using System.Collections.Generic;
using System.Text;

namespace WordsAnalysis
{
    class Analysis
    {
        private string ansStr;
        private string[] ArrayByLine;
        private List<string> errList;
        private List<AnalysisResult> resultList;

        /// <summary>
        /// One analysis result: a (name, code) pair such as ($DIM,1).
        /// </summary>
        struct AnalysisResult
        {
            public AnalysisResult(string HelpStr, int code)
            {
                this.helpStr = HelpStr;
                this.AnsCode = code;
            }

            public override string ToString()
            {
                return "("+helpStr+","+AnsCode.ToString()+")";
            }

            public string helpStr;
            public int AnsCode;
        }
        /// <summary>
        /// Constructor for the lexical analyzer class.
        /// </summary>
        public Analysis()
        {
            errList = new List<string>();
            resultList = new List<AnalysisResult>();
        }

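        /// <summary>
        /// Constructor for the lexical analyzer class.
        /// </summary>
        /// <param name="str">The text to analyze.</param>
        public Analysis(string str)
        {
            errList = new List<string>();
            resultList = new List<AnalysisResult>();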
            this.ansStr=str;
        }

        /// <summary>
        /// The text to analyze.
        /// </summary>
        public string AnalysisString
        {
            get { return this.ansStr; }
            set { this.ansStr = value; }
        }

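        /// <summary>
        /// Splits the input into an array of lines.
        /// </summary>
        /// <param name="splitStr">The input text.</param>
        private void SplitByLine(string splitStr)
        {
            // A multi-line TextBox separates lines with "\r\n"; a bare "\n" is accepted too.
            this.ArrayByLine = splitStr.Split(new string[] { "\r\n", "\n" }, StringSplitOptions.None);
        }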

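        /// <summary>
        /// Splits text on spaces into an array.
        /// </summary>
        /// <param name="splitStr">The text to split.</param>
        /// <returns>The resulting array.</returns>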
        private string[] SplitBySpace(string splitStr)
        {
            return splitStr.Split(' ');
        }

        /// <summary>
        /// Starts the analysis.
        /// </summary>
        public void Anaslysis()
        {
            SplitByLine(ansStr);
            for (int i = 0; i < ArrayByLine.Length; i++)
            {
                string[] checkStrings = SplitBySpace(ArrayByLine[i]);
                foreach (string checkStr in checkStrings)
                {
                    if (checkStr != "")
                    {
                        check(checkStr, i + 1);
                        resultList.Add(new AnalysisResult("$SPACE", 0));
                    }
                }
                resultList.Add(new AnalysisResult("$ENTER", 15));
            }
        }

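        /// <summary>
        /// Analyzes one space-delimited chunk of code.
        /// </summary>
        /// <param name="str">The chunk to analyze.</param>
        /// <param name="LineCode">The line number.</param>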
        private void check(string str, int LineCode)
        {
            if (str.Length == 1)
            {
                if(Char.IsNumber(str.ToCharArray()[0])==true)
                {
                    resultList.Add(new AnalysisResult("$INT",7));
                    return;
                }
                else if(CheckOperend(str)!=0)
                {
                    int n = CheckOperend(str);
                    switch (n)
                    {
                        case 8: resultList.Add(new AnalysisResult("$ASSIGH", 8)); break;
                        case 9: resultList.Add(new AnalysisResult("$PLUS", 9)); break;
                        case 10: resultList.Add(new AnalysisResult("$STAR", 10)); break;
                        case 12: resultList.Add(new AnalysisResult("$COMMA", 12)); break;
                        case 13: resultList.Add(new AnalysisResult("$LPAR", 13)); break;
                        case 14: resultList.Add(new AnalysisResult("$RPAR", 14)); break;
                    }
                    return;
                }
                else if (Char.IsLetter(str, 0) == true)
                {
                    resultList.Add(new AnalysisResult("$ID", 6));
                    return;
                }
                else
                {
                    // errMessage already appends ($ERROR,16) to resultList.
                    errList.Add(errMessage(LineCode, 1, str));
                    return;
                }
               
            }
            if (HasOperend(str) == false)
            {
                if (IsNumeric(str.ToCharArray()) == true)
                {
                    resultList.Add(new AnalysisResult("$INT", 7));
                    return;
                }
                else if (CheckKeepWord(str.ToUpper()) != 0)
                {
                    int n = CheckKeepWord(str.ToUpper());
                    switch (n)
                    {
                        case 1: resultList.Add(new AnalysisResult("$DIM", 1)); break;
                        case 2: resultList.Add(new AnalysisResult("$IF", 2)); break;
                        case 3: resultList.Add(new AnalysisResult("$DO", 3)); break;
                        case 4: resultList.Add(new AnalysisResult("$STOP", 4)); break;
                        case 5: resultList.Add(new AnalysisResult("$END", 5)); break;
                    }
                    return;
                }
                else if (IsID(str.ToCharArray()) == true)
                {
                    if (str.Length <= 8)
                    {
                        resultList.Add(new AnalysisResult("$ID", 6));
                        return;
                    }
                    else
                    {
                        errList.Add(errMessage(LineCode, 3, str));
                        return;
                    }
                }
                else
                {
                    errList.Add(errMessage(LineCode, 1, str));
                    return;
                }
            }
            else if(HasOperend(str)==true)
            {
                char[] chars = str.ToCharArray();
                int k = 0;
                for (int i = 0; i < chars.Length; i++)
                {
                    if(IsOperend(chars[i])==true)
                    {
                        if ((i - k) == 0)
                        {
                            int n = CheckOperend(chars[i]);
                            switch (n)
                            {
                                case 8: resultList.Add(new AnalysisResult("$ASSIGH", 8)); break;
                                case 9: resultList.Add(new AnalysisResult("$PLUS", 9)); break;
                                case 10:
                                    {
                                        try
                                        {
                                            char power = chars[i + 1];
                                            if (power == '*')
                                            {
                                                resultList.Add(new AnalysisResult("$POWER", 11));
                                                i++;
                                            }
                                            else
                                            {
                                                resultList.Add(new AnalysisResult("$STAR", 10));
                                            }
                                        }
                                        catch
                                        {
                                            resultList.Add(new AnalysisResult("$STAR", 10));
                                        }
                                        break;
                                    }
                                case 12: resultList.Add(new AnalysisResult("$COMMA", 12)); break;
                                case 13: resultList.Add(new AnalysisResult("$LPAR", 13)); break;
                                case 14: resultList.Add(new AnalysisResult("$RPAR", 14)); break;
                                default: break;
                            }
                        }
                        else
                        {
                            char[] tempChar = TempChar(chars, k, i);
                            if (tempChar.Length == 1)
                            {
                                if (Char.IsNumber(tempChar[0]) == true)
                                {
                                    resultList.Add(new AnalysisResult("$INT", 7));
                                }
                                else if (Char.IsLetter(tempChar[0]) == true)
                                {
                                    resultList.Add(new AnalysisResult("$ID", 6));
                                }
                                else
                                {
                                    errList.Add(errMessage(LineCode, 2, str));
                                }
                                int n = CheckOperend(Convert.ToString(chars[i]));
                                switch (n)
                                {
                                    case 8: resultList.Add(new AnalysisResult("$ASSIGH", 8)); break;
                                    case 9: resultList.Add(new AnalysisResult("$PLUS", 9)); break;
                                    case 10:
                                        {
                                            try
                                            {
                                                char power = chars[i + 1];
                                                if (power == '*')
                                                {
                                                    resultList.Add(new AnalysisResult("$POWER", 11));
                                                    i++;
                                                }
                                                else
                                                {
                                                    resultList.Add(new AnalysisResult("$STAR", 10));
                                                }
                                            }
                                            catch
                                            {
                                                resultList.Add(new AnalysisResult("$STAR", 10));
                                            }
                                            break;
                                        }
                                    case 12: resultList.Add(new AnalysisResult("$COMMA", 12)); break;
                                    case 13: resultList.Add(new AnalysisResult("$LPAR", 13)); break;
                                    case 14: resultList.Add(new AnalysisResult("$RPAR", 14)); break;
                                }
                            }
                            else
                            {
                                // new string(char[]) yields the text; Convert.ToString(char[]) would yield "System.Char[]".
                                if (CheckKeepWord(new string(tempChar).ToUpper()) != 0)
                                {
                                    int n = CheckKeepWord(new string(tempChar).ToUpper());
                                    switch (n)
                                    {
                                        case 1: resultList.Add(new AnalysisResult("$DIM", 1)); break;
                                        case 2: resultList.Add(new AnalysisResult("$IF", 2)); break;
                                        case 3: resultList.Add(new AnalysisResult("$DO", 3)); break;
                                        case 4: resultList.Add(new AnalysisResult("$STOP", 4)); break;
                                        case 5: resultList.Add(new AnalysisResult("$END", 5)); break;
                                    }
                                }
                                else if (IsID(tempChar) == true)
                                {
                                    resultList.Add(new AnalysisResult("$ID", 6));
                                }
                                else if (IsNumeric(tempChar) == true)
                                {
                                    resultList.Add(new AnalysisResult("$INT", 7));
                                }
                                else
                                {
                                    errList.Add(errMessage(LineCode, 2, str));
                                }
                                int x = CheckOperend(Convert.ToString(chars[i]));
                                switch (x)
                                {
                                    case 8: resultList.Add(new AnalysisResult("$ASSIGH", 8)); break;
                                    case 9: resultList.Add(new AnalysisResult("$PLUS", 9)); break;
                                    case 10:
                                        {
                                            // Look ahead in the full chunk (chars, not tempChar) for "**".
                                            if (i + 1 < chars.Length && chars[i + 1] == '*')
                                            {
                                                resultList.Add(new AnalysisResult("$POWER", 11));
                                                i++;
                                            }
                                            else
                                            {
                                                resultList.Add(new AnalysisResult("$STAR", 10));
                                            }
                                            break;
                                        }
                                    case 12: resultList.Add(new AnalysisResult("$COMMA", 12)); break;
                                    case 13: resultList.Add(new AnalysisResult("$LPAR", 13)); break;
                                    case 14: resultList.Add(new AnalysisResult("$RPAR", 14)); break;
                                }
                            }
                        }
                        k = i + 1;
                    }
                }
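                // Analyze whatever remains after the last operator, however long it is.
                if (k < chars.Length)
                {
                    char[] tempChar = TempChar(chars, k, chars.Length);
                    // new string(char[]) yields the text; Convert.ToString(char[]) would yield "System.Char[]".
                    if (CheckKeepWord(new string(tempChar).ToUpper()) != 0)
                    {
                        int n = CheckKeepWord(new string(tempChar).ToUpper());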
                        switch (n)
                        {
                            case 1: resultList.Add(new AnalysisResult("$DIM", 1)); break;
                            case 2: resultList.Add(new AnalysisResult("$IF", 2)); break;
                            case 3: resultList.Add(new AnalysisResult("$DO", 3)); break;
                            case 4: resultList.Add(new AnalysisResult("$STOP", 4)); break;
                            case 5: resultList.Add(new AnalysisResult("$END", 5)); break;
                        }
                    }
                    else if (IsID(tempChar) == true)
                    {
                        resultList.Add(new AnalysisResult("$ID", 6));
                    }
                    else if (IsNumeric(tempChar) == true)
                    {
                        resultList.Add(new AnalysisResult("$INT", 7));
                    }
                    else
                    {
                        errList.Add(errMessage(LineCode, 2, str));
                    }
                }
                return;
            }
        }

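        /// <summary>
        /// Checks whether every character is a digit.
        /// </summary>
        /// <param name="chars">The characters to check.</param>
        /// <returns>true if so; otherwise false.</returns>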
        private bool IsNumeric(char[] chars)
        {
            foreach(char c in chars)
            {
                if (!Char.IsNumber(c))
                {
                    return false;
                }
            }
            return true;
        }

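        /// <summary>
        /// Checks whether a single character is a digit.
        /// </summary>
        /// <param name="c">The character to check.</param>
        /// <returns>true if so; otherwise false.</returns>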
        private bool IsOneNumeric(char c)
        {
            if (c >= '0' && c <= '9') return true;
            return false;
        }

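        /// <summary>
        /// Checks whether a character is a letter.
        /// </summary>
        /// <param name="c">The character to check.</param>
        /// <returns>true if so; otherwise false.</returns>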
        private bool IsLetter(char c)
        {
            if (Char.IsLetter(c)) return true;
            return false;
        }

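        /// <summary>
        /// Checks whether the text contains an operator.
        /// </summary>
        /// <param name="str">The text to check.</param>
        /// <returns>true if it does; otherwise false.</returns>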
        private bool HasOperend(string str)
        {
            int n = 0;
            char[] operends={'=','+','*',',','(',')'};
            n = str.IndexOfAny(operends);
            if (n >= 0)
                return true;
            return false;
        }

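        /// <summary>
        /// Checks whether a character is an operator.
        /// </summary>
        /// <param name="c">The character to check.</param>
        /// <returns>true if so; otherwise false.</returns>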
        private bool IsOperend(char c)
        {
            int n = 0;
            string str = Convert.ToString(c);
            char[] operends ={ '=', '+', '*', ',', '(', ')' };
            n = str.IndexOfAny(operends);
            if (n >=0)
                return true;
            return false;
        }

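        /// <summary>
        /// Copies the given range into a new char array.
        /// </summary>
        /// <param name="chars">The source array.</param>
        /// <param name="start">The start index (inclusive).</param>
        /// <param name="end">The end index (exclusive).</param>
        /// <returns>A new char array of length end-start.</returns>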
        private char[] TempChar(char[] chars, int start, int end)
        {
            char[] tempChar=new char[end-start];
            int n = 0;
            for (int i = start; i < end; i++)
            {
                tempChar[n]=chars[i];
                n++;
            }
            return tempChar;
        }

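        /// <summary>
        /// Determines the operator type.
        /// </summary>
        /// <param name="c">The character to check.</param>
        /// <returns>The operator code, or 0 if it is not an operator.</returns>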
        private int CheckOperend(char c)
        {
            if (c == '=') return 8;
            else if (c == '+') return 9;
            else if (c == '*') return 10;
            else if (c == ',') return 12;
            else if (c == '(') return 13;
            else if (c == ')') return 14;
            else return 0;
        }

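        /// <summary>
        /// Determines the operator type.
        /// </summary>
        /// <param name="s">The text to check.</param>
        /// <returns>The operator code, or 0 if it is not an operator.</returns>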
        private int CheckOperend(string s)
        {
            if (s == "=") return 8;
            else if (s == "+") return 9;
            else if (s == "*") return 10;
            else if (s == ",") return 12;
            else if (s == "(") return 13;
            else if (s == ")") return 14;
            else return 0;
        }

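        // Checks whether the characters form a valid identifier:
        // a letter or '_' followed by letters, digits, or '_'.
        private bool IsID(char[] c)
        {
            if (Char.IsLetter(c[0])==false&&c[0]!='_')
            {
                return false;
            }
            // Check each remaining character c[i], stopping before c.Length.
            for (int i = 1; i < c.Length; i++)
            {
                if (Char.IsLetter(c[i]) == false && Char.IsNumber(c[i]) == false && c[i] != '_')
                {
                    return false;
                }
            }
            return true;
        }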

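        /// <summary>
        /// Checks whether the characters form a reserved word.
        /// </summary>
        /// <param name="c">The characters to check.</param>
        /// <returns>The reserved-word code, or 0 if it is not a reserved word.</returns>
        private int CheckKeepWord(char[] c)
        {
            // new string(char[]) yields the text; Convert.ToString(char[]) would yield "System.Char[]".
            string str = new string(c).ToUpper();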
            int n = 0;
            switch (str)
            {
                case "DIM": n= 1; break;
                case "IF": n= 2; break;
                case "DO": n= 3; break;
                case "STOP": n= 4; break;
                case "END": n= 5; break;
                default: n= 0; break;
            }
            return n;
        }

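        /// <summary>
        /// Checks whether the text is a reserved word.
        /// </summary>
        /// <param name="str">The text to check (already upper-cased by the caller).</param>
        /// <returns>The reserved-word code, or 0 if it is not a reserved word.</returns>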
        private int CheckKeepWord(string str)
        {
            int n = 0;
            switch (str)
            {
                case "DIM": n = 1; break;
                case "IF": n = 2; break;
                case "DO": n = 3; break;
                case "STOP": n = 4; break;
                case "END": n = 5; break;
                default: n = 0; break;
            }
            return n;
        }


        /// <summary>
        /// Returns the error messages.
        /// </summary>
        /// <returns>All error messages, one per line.</returns>
        public string ShowError()
        {
            string error = "";
            string[] errs = errList.ToArray();
            foreach (string err in errs)
            {
                error += err + "\r\n";
            }
            return error;
        }

        /// <summary>
        /// Returns the analysis results.
        /// </summary>
        /// <returns>All analysis results as text.</returns>
        public string ShowResult()
        {
            string result = "";
            AnalysisResult[] ans = resultList.ToArray();
            foreach (AnalysisResult an in ans)
            {
                if (an.AnsCode == 15)
                {
                    result += "\r\n";
                }
                else if(an.AnsCode==0)
                {
                    result += " ";
                }
                else
                {
                    result += an.ToString();
                }
            }
            return result;
        }

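        /// <summary>
        /// Builds an error message and records ($ERROR,16) in the results.
        /// </summary>
        /// <param name="errLine">The line number of the error.</param>
        /// <param name="errCode">The error code (1, 2, or 3).</param>
        /// <param name="errStr">The offending text.</param>
        /// <returns>The error message.</returns>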
        private string errMessage(int errLine, int errCode, string errStr)
        {
            string errClass="";
            if (errCode == 1)
            {
                errClass = "Illegal identifier";
            }
            else if (errCode == 2)
            {
                errClass = "Malformed expression";
            }
            else if (errCode == 3)
            {
                errClass = "Identifier longer than 8 characters";
            }
            resultList.Add(new AnalysisResult("$ERROR", 16));
            return "Line " + errLine + " " + errClass + ": " + errStr;
        }

    }
}



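        The form's controls are as follows:

[Figure 1: screenshot of the application form]

There are three TextBox controls, named InputTxt, AnsResultTxt, and errTxt (the last two are read-only), and one Button named OKBtn. That is the whole basic interface, and you can adapt it to your own needs. Then put the following code in OKBtn's Click event handler: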

            Analysis ans = new Analysis();
            ans.AnalysisString = InputTxt.Text.ToString();
            ans.Anaslysis();
            AnsResultTxt.Text = ans.ShowResult();
            errTxt.Text = ans.ShowError();

After that you can run it and test.

That is all of it. Give it a try, and if you run into any problems, please let me know. Thanks!
