前言:
AWVS web漏洞扫描工具报告默认只能输出pdf与html类型。但现有需求要excel类型的报告。开始使用AWVS api 接口从AWVS scan里爬取漏洞数据输出EXCEL,但此方法有诸多不便之处。巧在近期对Python爬虫技术领域学有小成,得以用武之地。
0x01
所用到的库:bs4 openpyxl re os
爬取AWVS报告的类型:Affected_Items
0x02
为方便批量提取,在这里定义一个函数来判断文件类型,再定义一个函数遍历html文件并放到beautifulsoul里面解析,这里使用lxml解析器:
def endWith(file,*endstring):
array = map(file.endswith,endstring)
if True in array:
return True
else:
return False
def openFile():
file = os.listdir('.')
for k in range(len(file)):
if endWith(file[k],'.html'):
soup = BeautifulSoup(open(file[k], mode='r', encoding='utf-8'), 'lxml')
get_detail(soup)
0x03
对页面进行分析,发现每个漏洞的数据都单独存放在一个table标签里,使用find_all传入一个函数来定位这些table标签。再使用css选择器语法在每个table标签里取子节点tr、td 放到字典里。
def has_border_but_no_class(tag):
return tag.has_attr('border') and not tag.has_attr('class')
def get_detail(soup):
vulnerabilities = {
}
tables = soup.find_all(has_border_but_no_class)
url = soup.select(".ax-scan-summary > tbody > tr:nth-of-type(3) > td:nth-of-type(2)")[0].string
len(list(tables))
for table in tables:
scan_url = url
vl_path = table.select('tr > td > b')[0].string.strip()
vl_name = table.select('tr:nth-of-type(2) > td > b')[1].string.strip()
vl_severity = table.select('tr:nth-of-type(3) > td:nth-of-type(2)')[0].string.strip()
vl_description = table.select('tr:nth-of-type(4) > td:nth-of-type(2)')[0].get_text().strip()
vl_detail = table.select('tr:nth-of-type(7) > td:nth-of-type(2)')[0].get_text().strip()
vl_post = table.select('tr:nth-of-type(8) > td')[0].string
vl_recommendations = table.select('tr:nth-of-type(5) > td:nth-of-type(2)')[0].get_text().strip()
vulnerabilities['url'] = scan_url
vulnerabilities['path'] = vl_path
vulnerabilities['name'] = vl_name
vulnerabilities['severity'] = vl_severity
vulnerabilities['vl_description'] = vl_description
vulnerabilities['detail'] = vl_detail
vulnerabilities['post'] = vl_post
vulnerabilities['recommendtions'] = vl_recommendations
write_xlsx(vulnerabilities)
def write_xlsx(vulnerabilities):
wb = ws.load_workbook("AwvsReport.xlsx")
sheet1 = wb['Sheet']
num = sheet1.max_row
sheet1.cell(row = num+1, column=1, value=vulnerabilities['url'])
sheet1.cell(row = num+1,column = 2,value = vulnerabilities['name'])
sheet1.cell(row = num+1,column = 3,value = vulnerabilities['path'])
sheet1.cell(row = num+1,column = 4,value = vulnerabilities['severity'])
sheet1.cell(row = num+1,column = 5,value = vulnerabilities['vl_description'])
sheet1.cell(row=num + 1, column=6, value=vulnerabilities['detail'])
sheet1.cell(row=num + 1, column=7, value=vulnerabilities['post'])
sheet1.cell(row=num + 1, column=8, value=vulnerabilities['recommendtions'])
wb.save("AwvsReport.xlsx")
def creat_xlsx():
s = 0
wb = ws.Workbook()
ws1 = wb.active
word=['风险目标','风险名称','风险地址','风险等级','风险描述','风险详细','风险请求','整改意见'] #风险参数
for i in word:
s = s + 1
ws1.cell(row =1,column = s,value = i)
wb.save("AwvsReport.xlsx")
GitHub地址:https://github.com/pppppig/AwvsReportConvert