Extracting Log Content with Python Regular Expressions

The log format is as follows:

[main] INFO com.jzdata.press.core.PressTest - select cs_bill_customer_sk,count(*) from catalog_sales where cs_item_sk =2  group by cs_bill_customer_sk order by cs_bill_customer_sk limit 100; true  2640
[main] INFO com.jzdata.press.core.PressTest - select cs_bill_customer_sk,count(*) from catalog_sales where cs_item_sk =16  group by cs_bill_customer_sk order by cs_bill_customer_sk limit 100; true  282
[main] INFO com.jzdata.press.core.PressTest - select cs_bill_customer_sk,count(*) from catalog_sales where cs_item_sk =13  group by cs_bill_customer_sk order by cs_bill_customer_sk limit 100; true  291
[main] INFO com.jzdata.press.core.PressTest - select cs_bill_customer_sk,count(*) from catalog_sales where cs_item_sk =11  group by cs_bill_customer_sk order by cs_bill_customer_sk limit 100; true  320


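We need to pick out the lines whose cs_item_sk value ends in 1 and whose flag near the end of the line is true, and extract both the cs_item_sk value and the number that follows true.
The code is as follows: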

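import re

# Group 1: the cs_item_sk value, which must end in 1 and be followed by whitespace.
# Group 2: the number that follows "true" at the end of the line.
pattern = re.compile(r'cs_item_sk[\s=]*(\d*?1+)\s+.+?true\s*(\d+)$')

with open('./src.txt', 'r') as f:
    # Iterate over the file directly instead of loading every line with readlines().
    for line in f:
        line = line.strip()
        m = pattern.search(line)
        if m is not None:
            # Prints a tuple of strings: (cs_item_sk value, trailing number).
            print(m.groups())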
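
To check the pattern without a log file on disk, the same expression can be run against the four sample lines directly. The following is only a minimal sketch: the names TEMPLATE, SAMPLE_LINES and extract_matches are hypothetical helpers introduced here for illustration, and converting the captured strings to int is an added convenience, not something the original script does.

```python
import re

# Same pattern as in the script above.
PATTERN = re.compile(r'cs_item_sk[\s=]*(\d*?1+)\s+.+?true\s*(\d+)$')

# Hypothetical helper: rebuilds the four sample log lines from a template.
TEMPLATE = (
    "[main] INFO com.jzdata.press.core.PressTest - "
    "select cs_bill_customer_sk,count(*) from catalog_sales "
    "where cs_item_sk ={}  group by cs_bill_customer_sk "
    "order by cs_bill_customer_sk limit 100; true  {}"
)
SAMPLE_LINES = [
    TEMPLATE.format(item_sk, number)
    for item_sk, number in [(2, 2640), (16, 282), (13, 291), (11, 320)]
]


def extract_matches(lines):
    """Return (cs_item_sk, trailing number) pairs for the lines that match."""
    results = []
    for line in lines:
        m = PATTERN.search(line.strip())
        if m is not None:
            # Both groups contain only digits, so int() is safe here.
            results.append((int(m.group(1)), int(m.group(2))))
    return results


print(extract_matches(SAMPLE_LINES))  # → [(11, 320)]
```

Only the line with cs_item_sk =11 matches. The values 2, 16 and 13 are rejected because the captured digits must end in 1 and be followed immediately by whitespace.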
