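This article covers the same ground as the tutorial "Java模拟定时生成日志到文件" (simulating scheduled log generation to a file with Java), but implements it in Python.
a. Create a new file named generate_log.py and add the following code: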
# coding=UTF-8
import random
import time

# URL paths that the simulated visitors request.
url_paths = [
    "article/112.html",
    "article/113.html",
    "article/114.html",
    "article/115.html",
    "article/116.html",
    "article/117.html",
    "article/118.html",
    "article/119.html",
    "video/821",
    "tag/list"
]

# Pool of numbers from which the four segments of a fake IP are drawn.
ip_splices = [102, 71, 145, 33, 67, 54, 164, 121]

# Search-engine referer templates; {query} is filled with a search keyword.
http_referers = [
    "https://www.baidu.com/s?wd={query}",
    "https://www.sogou.com/web?query={query}",
    "https://cn.bing.com/search?q={query}",
    "https://search.yahoo.com/search?p={query}"
]

search_keyword = [
    "复制粘贴玩大数据",
    "Bootstrap全局css样式的使用",
    "Elasticsearch的安装(windows)",
    "Kafka的安装及发布订阅消息系统(windows)",
    "window7系统上Centos7的安装",
    "复制粘贴玩大数据系列教程说明",
    "Docker搭建Spark集群(实践篇)"
]

status_codes = ["200", "404", "500"]


def sample_url():
    return random.choice(url_paths)


def sample_ip():
    # Pick four distinct numbers from the pool and join them with dots.
    splice = random.sample(ip_splices, 4)
    return ".".join(str(item) for item in splice)


def sample_referer():
    # Roughly 80% of requests carry no referer and are logged as "-".
    if random.uniform(0, 1) > 0.2:
        return "-"
    refer_str = random.choice(http_referers)
    query_str = random.choice(search_keyword)
    return refer_str.format(query=query_str)


def sample_status_code():
    return random.choice(status_codes)


def generate_log(count=10):
    # All lines written in one run share the same timestamp.
    time_str = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime())
    # "w+" truncates the file, so each run replaces the previous contents.
    with open("D:\\python-logs.txt", "w+") as f:
        for _ in range(count):
            # Fields: IP, access time, request line, referer, status code
            log_line = "{ip}\t{localtime}\t \"GET /{url} HTTP/1.1 \" \t{referer}\t{code}".format(
                url=sample_url(),
                ip=sample_ip(),
                referer=sample_referer(),
                code=sample_status_code(),
                localtime=time_str
            )
            # print() with a single argument works on both Python 2 and Python 3.
            print(log_line)
            f.write(log_line + "\n")


if __name__ == '__main__':
    generate_log(100)
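To make the line layout concrete, here is a minimal sketch that reads one generated line back into named fields. It assumes the tab-separated format string used in generate_log() above; `LogRecord`, `parse_log_line`, and the sample line are hypothetical and not part of the original script.

```python
from collections import namedtuple

# Field order follows the format string in generate_log():
# IP, access time, request line, referer, status code.
LogRecord = namedtuple(
    "LogRecord", ["ip", "time", "method", "url", "referer", "status"]
)


def parse_log_line(line):
    """Split one tab-separated log line into a LogRecord (hypothetical helper)."""
    ip, time_str, request, referer, status = line.rstrip("\n").split("\t")
    # The request field looks like ' "GET /article/112.html HTTP/1.1 " ',
    # so strip the surrounding spaces and quotes before splitting it.
    method, url, _protocol = request.strip().strip('"').split()
    return LogRecord(ip, time_str, method, url, referer, int(status))


# Hand-written sample line in the generator's format (illustrative values).
sample = '102.71.145.33\t2019-12-30 00:12:00\t "GET /article/112.html HTTP/1.1 " \t-\t200'
record = parse_log_line(sample)
print(record.ip, record.url, record.status)
```

A parser like this is mostly useful as a sanity check that the generated file has the shape a downstream job expects.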
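a. Introduction to crontab
For usage details, see: http://tool.lu/crontab
The site covers three kinds of cron syntax: Linux, Java (Spring), and Java (Quartz).
This tutorial uses the Linux syntax. For example, the crontab expression that runs once every minute is:
*/1 * * * *
For this expression, the tool reports the upcoming run times, for example:
2019-12-30 00:12:00
2019-12-30 00:13:00
2019-12-30 00:14:00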
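The run times listed above can be reproduced with a small sketch. It handles only the minute-step form `*/N * * * *` and assumes every other field is `*`; the function name `next_runs` and the starting time of 00:11:30 are assumptions made for illustration.

```python
from datetime import datetime, timedelta


def next_runs(now, step_minutes=1, count=3):
    """Return the next `count` times a '*/N * * * *' job would fire after `now`.

    Only the minute-step form is handled; all other cron fields are assumed '*'.
    """
    # Start from the next whole minute after `now`.
    candidate = now.replace(second=0, microsecond=0) + timedelta(minutes=1)
    runs = []
    while len(runs) < count:
        # '*/N' in the minute field matches minutes divisible by N.
        if candidate.minute % step_minutes == 0:
            runs.append(candidate)
        candidate += timedelta(minutes=1)
    return runs


for run in next_runs(datetime(2019, 12, 30, 0, 11, 30)):
    print(run.strftime("%Y-%m-%d %H:%M:%S"))
```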
Explanation of the fields (Linux):
*    *    *    *    *
-    -    -    -    -
|    |    |    |    |
|    |    |    |    +----- day of week (0 - 7) (Sunday=0 or 7) OR sun,mon,tue,wed,thu,fri,sat
|    |    |    +---------- month (1 - 12) OR jan,feb,mar,apr ...
|    |    +--------------- day of month (1 - 31)
|    +-------------------- hour (0 - 23)
+------------------------- minute (0 - 59)
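PS: The finest granularity cron supports on Linux is one minute. If you need a shorter interval, you will have to look up another approach.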
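One possible workaround, sketched here as an assumption and not as something the original tutorial prescribes, is to let cron start a script once a minute and have that script repeat its work at a shorter interval itself. `run_every` is a hypothetical helper; the 0.01-second interval below only keeps the demonstration fast.

```python
import time


def run_every(interval_seconds, job, iterations):
    """Call `job()` `iterations` times, sleeping `interval_seconds` between calls."""
    for i in range(iterations):
        job()
        # No need to sleep after the final call.
        if i < iterations - 1:
            time.sleep(interval_seconds)


# Demonstration: record a timestamp three times, 10 ms apart.
calls = []
run_every(0.01, lambda: calls.append(time.time()), iterations=3)
print(len(calls))
```

Under this scheme, a call such as `run_every(10, job, 6)` started by a once-a-minute cron entry would run `job` roughly every ten seconds.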
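b. Using the Python log generator
Create a wrapper shell script:
vi /home/hadoop-sny/shell/log_generator.sh
Add the following content:
python /home/hadoop-sny/shell/generate_log.py
Grant execute permission to the wrapper script (run this from /home/hadoop-sny/shell):
chmod u+x log_generator.sh
Place generate_log.py in the /home/hadoop-sny/shell directory.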
c. Setting up the scheduled job
crontab -e
Then add the following line:
*/1 * * * * /home/hadoop-sny/shell/log_generator.sh
To change the output path, edit the path passed to open() in generate_log.py. The original call is:
open("D:\\python-logs.txt","w+")
For example, on Linux:
open("/home/hadoop-sny/logs/access.log","w+")
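Instead of editing the source for every environment, the path could also be supplied from outside the script. The sketch below is one way to do that; the helper `resolve_log_path` and the `LOG_PATH` environment variable are hypothetical names, not something the original script defines.

```python
import os
import sys

# Default taken from the example path in this tutorial.
DEFAULT_LOG_PATH = "/home/hadoop-sny/logs/access.log"


def resolve_log_path(argv, environ):
    """Pick the log path: first CLI argument, then LOG_PATH, then the default."""
    if len(argv) > 1:
        return argv[1]
    # LOG_PATH is a hypothetical environment variable used for illustration.
    return environ.get("LOG_PATH", DEFAULT_LOG_PATH)


if __name__ == "__main__":
    print(resolve_log_path(sys.argv, os.environ))
```

Note that the "w+" mode truncates the file on every run, so with a once-a-minute schedule each run replaces the previous contents. Opening the file with "a" would append instead.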
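At this point you can run the date command to check the current system time.
Wait about a minute, then check whether log lines have appeared in the configured file: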
tail -200f /home/hadoop-sny/logs/access.log
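The same check can be done from Python. This is a rough sketch that returns the last lines of a file, similar to `tail -n` but without the follow behaviour of `-f`; `last_lines` is a hypothetical helper, and the demonstration runs against a temporary file, not the real access.log.

```python
import os
import tempfile
from collections import deque


def last_lines(path, n=200):
    """Return the last `n` lines of a text file, similar to `tail -n`."""
    with open(path) as f:
        # A bounded deque keeps only the most recent n lines while reading.
        return [line.rstrip("\n") for line in deque(f, maxlen=n)]


# Demonstration on a temporary file standing in for the log file.
with tempfile.NamedTemporaryFile("w", suffix=".log", delete=False) as tmp:
    tmp.write("line 1\nline 2\nline 3\n")
demo_path = tmp.name

tail_result = last_lines(demo_path, 2)
print(tail_result)
os.remove(demo_path)
```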
a. Run the script to generate 100 log lines. With the default setting, they are written to D:\python-logs.txt:
python generate_log.py
One hundred log lines are generated.
About the author: 邵奈一
Full-stack engineer, market observer, column editor
Original work by 邵奈一. If you repost this article, please credit the source.