I'm writing a log collection / analysis application in Python and I need to write a "rules engine" to match and act on log messages.
It needs to feature:
Regular expression matching for the message itself
Arithmetic comparisons for message severity/priority
Boolean operators
I envision An example rule would probably be something like:
(message ~ "program\\[\d+\\]: message" and severity >= high) or (severity >= critical)
I'm thinking about using PyParsing or similar to actually parse the rules and construct the parse tree.
The current (not yet implemented) design I have in mind is to have classes for each rule type, and construct and chain them together according to the parse tree. Then each rule would have a "matches" method that could take a message object return whether or not it matches the rule.
Very quickly, something like:
class RegexRule(Rule):
def __init__(self, regex):
self.regex = regex
def match(self, message):
return self.regex.match(message.contents)
class SeverityRule(Rule):
def __init__(self, operator, severity):
self.operator = operator
def match(self, message):
if operator == ">=":
return message.severity >= severity
# more conditions here...
class BooleanAndRule(Rule):
def __init__(self, rule1, rule2):
self.rule1 = rule1
self.rule2 = rule2
def match(self, message):
return self.rule1.match(message) and self.rule2.match(message)
These rule classes would then be chained together according to the parse tree of the message, and the match() method called on the top rule, which would cascade down until all the rules were evaluated.
I'm just wondering if this is a reasonable approach, or if my design and ideas are way totally out of whack? Unfortunately I never had the chance to take a compiler design course or anything like that in Unviersity so I'm pretty much coming up with this stuff of my own accord.
Could someone with some experience in these kinds of things please chime in and evaluate the idea?
EDIT:
Some good answers so far, here's a bit of clarification.
The aim of the program is to collect log messages from servers on the network and store them in the database. Apart from the collection of log messages, the collector will define a set of rules that will either match or ignore messages depending on the conditions and flag an alert if necessary.
I can't see the rules being of more than a moderate complexity, and they will be applied in a chain (list) until either a matching alert or ignore rule is hit. However, this part isn't quite as relevant to the question.
As far the syntax being close to Python syntax, yes that is true, however I think it would be difficult to filter the Python down to the point where the user couldn't inadvertently do some crazy stuff with the rules that was not intended.
解决方案
Do not invent yet another rules language.
Either use Python or use some other existing, already debugged and working language like BPEL.
Just write your rules in Python, import them and execute them. Life is simpler, far easier to debug, and you've actually solved the actual log-reading problem without creating another problem.
Imagine this scenario. Your program breaks. It's now either the rule parsing, the rule execution, or the rule itself. You must debug all three. If you wrote the rule in Python, it would be the rule, and that would be that.
"I think it would be difficult to filter the Python down to the point where the user couldn't inadvertently do some crazy stuff with the rules that was not intended."
This is largely the "I want to write a compiler" argument.
1) You're the primary user. You'll write, debug and maintain the rules. Are there really armies of crazy programmers who will be doing crazy things? Really? If there is any potential crazy user, talk to them. Teach Them. Don't fight against them by inventing a new language (which you will then have to maintain and debug forever.)
2) It's just log processing. There's no real cost to the craziness. No one is going to subvert the world economic system with faulty log handling. Don't make a small task with a few dozen lines of Python onto a 1000 line interpreter to interpret a few dozen lines of some rule language. Just write the few dozen lines of Python.
Just write it in Python as quickly and clearly as you can and move on to the next project.