An ini File Parser

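Finite automata are a powerful tool for text matching and text parsing. Following reference [1], this post implements a parser for ini configuration files. A state machine parses text like this: it reads the input one character at a time and handles each character according to the current state, where handling mainly means actions such as switching to another state, and it keeps going until every input character has been processed.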

A typical ini file looks like this:

;this is comment

[section1]

aa = 1

bb = 2

[section2]

cc = 3

dd = 4

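Parsing an ini file involves the following states:

Start state: the initial state

SectionState: inside a section label (named SectionLabelState in the program)

KeyState: reading a key

ValueState: reading a value

CommentState: inside a comment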

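The state transitions are:

Start state:

  on '[', enter SectionState

  on a letter or digit, enter KeyState

  on ';', enter CommentState (the program also accepts '#')

SectionState:

  on ']', return to the start state

KeyState:

  on '=', extract the key and enter ValueState

ValueState:

  on '\n', extract the value and return to the start state

CommentState:

  on '\n', return to the start state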
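
The transition list above can also be written as a single pure function from (state, character) to the next state. The following is only a minimal sketch: the names `State` and `NextState` are hypothetical and do not appear in the program below, and it covers only the transitions, not the extraction of keys and values.

```cpp
#include <cctype>

// Hypothetical names for this sketch; they mirror the states listed above.
enum class State { Start, Section, Key, Value, Comment };

// Returns the state that follows `state` after reading character `c`.
// Characters that trigger no transition leave the state unchanged.
State NextState(State state, char c) {
  switch (state) {
    case State::Start:
      if (c == '[') return State::Section;
      if (c == ';') return State::Comment;
      if (std::isalnum(static_cast<unsigned char>(c))) return State::Key;
      return State::Start;
    case State::Section:
      return c == ']' ? State::Start : State::Section;
    case State::Key:
      return c == '=' ? State::Value : State::Key;
    case State::Value:
      return c == '\n' ? State::Start : State::Value;
    case State::Comment:
      return c == '\n' ? State::Start : State::Comment;
  }
  return state;  // Not reached for valid enum values.
}
```

Keeping the transitions in one function like this makes them easy to check against the list, at the cost of separating them from the actions (extracting the key and value) that the full program performs in the same place.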

The complete program follows:

#include <stdio.h>
#include <map>
#include <string>
// Returns true for ASCII letters and digits, the characters that may start a key.
bool IsAlphabet(char c) {
  return (c >= 'a' && c <= 'z') ||
         (c >= 'A' && c <= 'Z') ||
         (c >= '0' && c <= '9');
}
// A comment starts with ';' or '#'.
bool IsCommentStart(char c) { return c == ';' || c == '#'; }
// A section label starts with '[' ...
bool IsSectionLabelStart(char c) { return c == '['; }
// ... and ends with ']'.
bool IsSectionLabelEnd(char c) { return c == ']'; }
// A key ends at '='.
bool IsKeyEnd(char c) { return c == '='; }
// A value ends at the end of the line.
bool IsValueEnd(char c) { return c == '\n'; }
// A comment ends at the end of the line.
bool IsCommentEnd(char c) { return c == '\n'; }
// Parses init_buffer and stores every key/value pair in *properties.
// Section labels are recognized and skipped; their names are not recorded.
// Keys and values are stored exactly as they appear between the delimiters,
// including any surrounding whitespace.
// Always returns true, because malformed input is skipped rather than rejected.
bool ParseInit(const std::string& init_buffer, std::map<std::string, std::string>* properties) {
  enum ParseState {
    StartState,
    SectionLabelState,
    KeyState,
    ValueState,
    CommentState
  };
  std::string::size_type offset = 0;
  std::string::size_type start_offset = 0;  // Start of the current key or value.
  std::string key;
  std::string value;
  ParseState parse_state = StartState;
  while (offset < init_buffer.size()) {
    char c = init_buffer[offset];
    switch (parse_state) {
      case StartState:
        if (IsSectionLabelStart(c)) {
          parse_state = SectionLabelState;
        } else if (IsAlphabet(c)) {
          parse_state = KeyState;
          start_offset = offset;
        } else if (IsCommentStart(c)) {
          parse_state = CommentState;
        }
        break;
      case SectionLabelState:
        if (IsSectionLabelEnd(c)) {
          parse_state = StartState;
        }
        break;
      case KeyState:
        if (IsKeyEnd(c)) {
          parse_state = ValueState;
          key = init_buffer.substr(start_offset, offset - start_offset);
          start_offset = offset + 1;  // The value begins after '='.
        }
        break;
      case ValueState:
        if (IsValueEnd(c)) {
          parse_state = StartState;
          value = init_buffer.substr(start_offset, offset - start_offset);
          (*properties)[key] = value;
        }
        break;
      case CommentState:
        if (IsCommentEnd(c)) {
          parse_state = StartState;
        }
        break;
      default:
        break;
    }
    offset++;
  }
  // The input may end without a trailing newline; store the pending value.
  if (parse_state == ValueState) {
    value = init_buffer.substr(start_offset, offset - start_offset);
    (*properties)[key] = value;
  }
  return true;
}
int main(int argc, char** argv) {
  // Sample input; a section label and a key may share one line.
  std::string init_buffer = "  [section1]  aa = 1 \n bb = 2 \n [section2] \n cc = 3 \n [section3] \n dd = 4 \n ff = 5\n";
  std::map<std::string, std::string> properties;
  ParseInit(init_buffer, &properties);
  // Print every parsed pair in key order.
  std::map<std::string, std::string>::iterator it = properties.begin();
  for (; it != properties.end(); ++it) {
    printf("key: %s, value %s \n", it->first.c_str(), it->second.c_str());
  }
  return 0;
}
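
Because the program takes the text between delimiters verbatim, a line such as `aa = 1` yields a key with a trailing space and a value with surrounding spaces. One possible remedy is to trim both strings before storing them. The helper below is a sketch under that assumption; `Trim` is a hypothetical name and is not part of the original program.

```cpp
#include <string>

// Hypothetical helper, not part of the original program: returns `s`
// without leading and trailing spaces, tabs, carriage returns, or newlines.
std::string Trim(const std::string& s) {
  const char* kWhitespace = " \t\r\n";
  std::string::size_type first = s.find_first_not_of(kWhitespace);
  if (first == std::string::npos) return "";  // Empty or all whitespace.
  std::string::size_type last = s.find_last_not_of(kWhitespace);
  return s.substr(first, last - first + 1);
}
```

With such a helper, storing `(*properties)[Trim(key)] = Trim(value);` inside ParseInit would keep the whitespace around '=' out of the map.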

To keep the parser flexible, each condition check is wrapped in its own function, which makes later changes easier.
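
As an illustration of that flexibility, a predicate can be swapped or passed in as a parameter without touching the scanning logic. The sketch below is not taken from the program above: `IsKeyEndColonOrEquals` and `SplitLine` are hypothetical names, and the example handles a single line only, to show the idea of a replaceable key-end condition.

```cpp
#include <string>

// Hypothetical variant of IsKeyEnd that also treats ':' as a separator.
bool IsKeyEndColonOrEquals(char c) { return c == '=' || c == ':'; }

// Splits one "key<sep>value" line at the first character accepted by
// `is_key_end`. Returns false when the line contains no separator, in
// which case *key and *value are left untouched.
bool SplitLine(const std::string& line, bool (*is_key_end)(char),
               std::string* key, std::string* value) {
  for (std::string::size_type i = 0; i < line.size(); ++i) {
    if (is_key_end(line[i])) {
      *key = line.substr(0, i);
      *value = line.substr(i + 1);
      return true;
    }
  }
  return false;
}
```

Supporting a new separator then means writing one more small predicate; the loop that consumes characters stays the same.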

References

[1] 系统程序员成长计划, p. 188
