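Pornographic videos are an unavoidable part of life, so identifying them is a problem worth solving. This post builds on the NSFW project open_nsfw (see the references below) and improves its packaging so that it can check whether a video is pornographic. The project is based on Caffe and uses a ResNet network (see the paper in the references).
To check a video, I use FFmpeg to extract frames from it, one image every 20 seconds. The interval can be shortened later for more precise detection.
Each frame falls into one of three levels: a score < 0.2 means very safe, a score > 0.8 means the frame is very likely pornographic, and anything in between is uncertain.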
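At the end the program prints:
the total number of frames extracted from the video and checked
score < 0.2: safe, with its count and proportion
score >= 0.2 && score <= 0.8: medium, between dangerous and safe, with its count and proportion
score > 0.8: dangerous, very likely pornographic, with its count and proportion
The proportion of dangerous frames then lets us decide whether the video is pornographic.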
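The three levels and the final summary can be sketched in a few lines of Python. This is only an illustration: the names `bucket` and `summarize` are hypothetical and not part of open_nsfw; only the thresholds 0.2 and 0.8 come from the description above.

```python
from collections import Counter

SAFE_MAX = 0.2       # scores below this count as "safe"
DANGEROUS_MIN = 0.8  # scores above this count as "dangerous"


def bucket(score):
    """Map one NSFW score in [0, 1] to one of the three levels."""
    if score < SAFE_MAX:
        return "safe"
    if score > DANGEROUS_MIN:
        return "dangerous"
    return "medium"


def summarize(scores):
    """Return {level: (count, percentage)} for a list of frame scores."""
    counts = Counter(bucket(s) for s in scores)
    total = len(scores)
    summary = {}
    for level in ("safe", "medium", "dangerous"):
        # Guard against an empty list so we never divide by zero.
        percent = round(100.0 * counts[level] / total, 3) if total else 0.0
        summary[level] = (counts[level], percent)
    return summary


if __name__ == "__main__":
    print(summarize([0.05, 0.1, 0.5, 0.95]))
```

Scores of exactly 0.2 or 0.8 land in the medium level here, matching the ranges listed above.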
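Enough talk; on to the hands-on part.
The first step is installing ffmpeg. I am on Ubuntu 14, and installing it took some effort, but I eventually found a working PPA and the installation succeeded: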
sudo add-apt-repository ppa:mc3man/trusty-media
sudo apt-get update
sudo apt-get install ffmpeg gstreamer0.10-ffmpeg
Next comes Caffe. If you have never installed it, don't worry: Docker is all you need.
Install Docker:
I won't go into that here.
Build the CPU version of the Caffe image:
docker build -t caffe:cpu https://raw.githubusercontent.com/BVLC/caffe/master/docker/cpu/Dockerfile
Check the Caffe version:
docker run caffe:cpu caffe --version
Download open_nsfw:
git clone https://github.com/yahoo/open_nsfw
Enter the working directory:
cd open_nsfw
Note that when starting the container, the working directory has to be mounted into it, for example:
docker run -ti --volume=/home/duan/open_nsfw/:/workspace caffe:cpu bash
There is no need to run this step yet.
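The mount path above is specific to my machine. As a rough sketch, the same command can be assembled from whatever directory you are in. The helper name `docker_run_command` is hypothetical, and `--rm` (remove the container when it exits) is my own addition rather than part of the original command.

```python
import os


def docker_run_command(host_dir, inner_command, image="caffe:cpu"):
    """Build a `docker run` argument list that mounts host_dir at /workspace."""
    host_dir = os.path.abspath(host_dir)  # Docker needs an absolute host path
    return [
        "docker", "run", "--rm",
        "--volume={}:/workspace".format(host_dir),
        image, "bash", "-c", inner_command,
    ]


if __name__ == "__main__":
    # Print the command for the current directory instead of running it.
    print(" ".join(docker_run_command(".", "python video_detect.py")))
```

Passing the resulting list to `subprocess.call` would start the container; printing it first is an easy way to check the mount path.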
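Next is the frame-extraction code (launch_video_detact.py), which extracts one frame every 20 seconds and stores the frames in the open_nsfw/picture folder: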
# -*- encoding:utf-8 -*-
__date__ = "17/1/16"
__author__ = "duan"

import os
import shutil
from argparse import ArgumentParser


def video_to_frames(video_path, frames_path, step_size=20):
    """Extract one frame every `step_size` seconds from the video into frames_path."""
    # Start from an empty output folder so frames from earlier runs are not mixed in.
    if os.path.exists(frames_path):
        shutil.rmtree(frames_path)
    os.makedirs(frames_path)
    output_file = frames_path + "/out%05d.jpg"
    # fps=1/N makes ffmpeg emit one frame every N seconds,
    # e.g. --step 10 gives fps=fps=1/10.
    command = "ffmpeg -i {} -f image2 -vf fps=fps=1/{} {}".format(
        video_path, step_size, output_file)
    print(command)  # show exactly the command that is about to run
    os.system(command)


if __name__ == '__main__':
    parser = ArgumentParser()
    parser.add_argument('--content',
                        dest='content', help='name of the video to check',
                        metavar='CONTENT', required=True)
    parser.add_argument('--step', type=int, default=20,
                        dest='step', help='seconds between extracted frames',
                        metavar='STEP')
    options = parser.parse_args()
    video_name = options.content  # the video to check
    step_size = options.step  # seconds between extracted frames
    video_path = "./"  # the video is expected in the current folder
    frames_path = "picture"
    video_to_frames(video_path + video_name, frames_path, step_size)
    # Launch the container with the open_nsfw folder mounted as /workspace
    # and run the detection script inside it. Change the path to your own.
    workspace = "/home/duan/open_nsfw/"
    os.system("docker run -ti --volume={}:/workspace caffe:cpu bash -c \"python video_detect.py\"".format(workspace))
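The launch script hands ffmpeg a single command string, so a video name containing spaces would be split by the shell. One possible alternative, shown here only as a sketch, is to build the command as an argument list and pass it to `subprocess.call`. The function names below are hypothetical; the ffmpeg options are the same ones the script uses.

```python
import subprocess


def ffmpeg_extract_command(video_path, frames_path, step_size=20):
    """Return the ffmpeg argument list that saves one frame every step_size seconds."""
    output_pattern = "{}/out%05d.jpg".format(frames_path)
    return [
        "ffmpeg", "-i", video_path,
        "-f", "image2",
        "-vf", "fps=fps=1/{}".format(step_size),
        output_pattern,
    ]


def extract_frames(video_path, frames_path, step_size=20):
    """Run ffmpeg without a shell and return its exit code (0 means success)."""
    return subprocess.call(ffmpeg_extract_command(video_path, frames_path, step_size))


if __name__ == "__main__":
    # Only print the command here; extract_frames() needs ffmpeg and a real video.
    print(ffmpeg_extract_command("my video.mp4", "picture", 10))
```

Because each argument is its own list element, the file name reaches ffmpeg unchanged, and the exit code can be checked before starting the detection step.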
video_detect.py, the script run inside the container, scores every extracted frame and prints the statistics:
# -*- encoding:utf-8 -*-
__date__ = "17/1/16"
__author__ = "duan"

import os

import video_nsfw

frames_path = "picture"
results = []
safe = 0
medium = 0
dangerous = 0
for name in sorted(os.listdir(frames_path)):
    path = os.path.join(frames_path, name)
    # Skip sub-directories; test the full path, not the bare file name.
    if os.path.isdir(path):
        continue
    res = video_nsfw.detact("nsfw_model/deploy.prototxt",
                            "nsfw_model/resnet_50_1by2_nsfw.caffemodel", path)
    if res < 0.2:
        safe += 1
    elif res <= 0.8:
        medium += 1
    else:
        dangerous += 1
    results.append(res)

total = len(results)
print(total)
if total:
    # Multiplying by 100.0 keeps the division a float division on Python 2 as well.
    print("safe count: {}, proportion: {}%".format(safe, round(safe * 100.0 / total, 3)))
    print("medium count: {}, proportion: {}%".format(medium, round(medium * 100.0 / total, 3)))
    print("dangerous count: {}, proportion: {}%".format(dangerous, round(dangerous * 100.0 / total, 3)))
video_nsfw.py wraps the open_nsfw classification code and scores a single image:
# -*- encoding:utf-8 -*-
__date__ = "17/1/16"
__author__ = "duan"

# Written for Python 2: the StringIO module used here does not exist in Python 3.
import os
from StringIO import StringIO

import numpy as np
from PIL import Image
import caffe


def resize_image(data, sz=(256, 256)):
    """
    Resize image. Please use this resize logic for best results instead of
    Caffe's own, since it was used to generate the training dataset.
    """
    img_data = str(data)
    im = Image.open(StringIO(img_data))
    if im.mode != "RGB":
        im = im.convert('RGB')
    imr = im.resize(sz, resample=Image.BILINEAR)
    # Re-encode the resized image as JPEG in memory and return the encoded bytes.
    fh_im = StringIO()
    imr.save(fh_im, format='JPEG')
    fh_im.seek(0)
    return bytearray(fh_im.read())


def caffe_preprocess_and_compute(pimg, caffe_transformer=None, caffe_net=None,
                                 output_layers=None):
    """
    Run a Caffe network on an input image after preprocessing it to prepare
    it for Caffe.
    """
    if caffe_net is not None:
        # Grab the default output names if none were requested specifically.
        if output_layers is None:
            output_layers = caffe_net.outputs
        img_data_rs = resize_image(pimg, sz=(256, 256))
        image = caffe.io.load_image(StringIO(img_data_rs))
        # Centre-crop the resized image to the network's input size.
        H, W, _ = image.shape
        _, _, h, w = caffe_net.blobs['data'].data.shape
        # Floor division keeps the offsets integers so they can be used as slice indices.
        h_off = max((H - h) // 2, 0)
        w_off = max((W - w) // 2, 0)
        crop = image[h_off:h_off + h, w_off:w_off + w, :]
        transformed_image = caffe_transformer.preprocess('data', crop)
        # Add a leading batch dimension of size 1.
        transformed_image.shape = (1,) + transformed_image.shape
        input_name = caffe_net.inputs[0]
        all_outputs = caffe_net.forward_all(blobs=output_layers,
                                            **{input_name: transformed_image})
        outputs = all_outputs[output_layers[0]][0].astype(float)
        return outputs
    else:
        return []


def detact(model_def, pretrained_model, input_file):
    """Return the NSFW probability of one image file."""
    # Read the raw bytes; binary mode keeps the JPEG data intact.
    with open(input_file, 'rb') as f:
        image_data = f.read()
    # Pre-load caffe model.
    nsfw_net = caffe.Net(model_def,  # pylint: disable=invalid-name
                         pretrained_model, caffe.TEST)
    # Load transformer
    # Note that the parameters are hard-coded for best results
    caffe_transformer = caffe.io.Transformer({'data': nsfw_net.blobs['data'].data.shape})
    caffe_transformer.set_transpose('data', (2, 0, 1))  # move image channels to outermost
    caffe_transformer.set_mean('data', np.array([104, 117, 123]))  # subtract the dataset-mean value in each channel
    caffe_transformer.set_raw_scale('data', 255)  # rescale from [0, 1] to [0, 255]
    caffe_transformer.set_channel_swap('data', (2, 1, 0))  # swap channels from RGB to BGR
    # Classify.
    scores = caffe_preprocess_and_compute(image_data, caffe_transformer=caffe_transformer,
                                          caffe_net=nsfw_net, output_layers=['prob'])
    # Scores is the array containing SFW / NSFW image probabilities
    # scores[1] indicates the NSFW probability
    print("NSFW score: {}".format(scores[1]))
    return scores[1]
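The centre crop performed in caffe_preprocess_and_compute is easy to check in isolation. The sketch below recomputes the offsets in plain Python. The helper name `center_crop_box` is hypothetical, and the 224 x 224 input size is only an assumption for the example, since the real size is read from the network's data blob at run time.

```python
def center_crop_box(image_h, image_w, crop_h, crop_w):
    """Return (top, bottom, left, right) slice bounds for a centred crop."""
    # Floor division keeps the offsets integers; a plain `/` would give floats
    # on Python 3, which cannot be used as slice indices.
    top = max((image_h - crop_h) // 2, 0)
    left = max((image_w - crop_w) // 2, 0)
    return top, top + crop_h, left, left + crop_w


if __name__ == "__main__":
    # A 256 x 256 image cropped to an assumed 224 x 224 network input.
    print(center_crop_box(256, 256, 224, 224))
```

With a 256 x 256 image and a 224 x 224 target, 16 pixels are trimmed from each side.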
Finally, to run everything:
python launch_video_detact.py --content 1995.mp4 --step 20
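The step option is the number of seconds between extracted frames. It can be omitted and defaults to 20. The final result does of course depend on this choice.
As a last test I ran the detector on The Shawshank Redemption, with the following result:
Sampling the video every 20 seconds gave 429 frames in total, and from the result we can say with 93.7% certainty that The Shawshank Redemption is very safe.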
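As a quick sanity check on those numbers, the frame count and the step size together give the approximate length of video that was sampled. The function name `covered_seconds` is hypothetical, and the figure is only approximate because ffmpeg may emit one frame more or fewer at the edges.

```python
def covered_seconds(frame_count, step_seconds):
    """Approximate video length covered by frames taken every step_seconds."""
    return frame_count * step_seconds


if __name__ == "__main__":
    seconds = covered_seconds(429, 20)
    print(seconds, seconds / 60.0)  # seconds and minutes of video covered
```

That works out to 8580 seconds, or 143 minutes, which is in line with the film's running time of roughly 142 minutes.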
Of course, you can also try it on a pornographic video yourself, hehe.
If you repost this article, please credit the source: http://blog.csdn.net/willduan1/article/details/54577351
------------------------EOF------------------------------
References:
https://github.com/yahoo/open_nsfw/blob/master/README.md
He, Kaiming, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. “Deep residual learning for image recognition.” arXiv preprint arXiv:1512.03385 (2015).
Simonyan, Karen, and Andrew Zisserman. “Very deep convolutional networks for large-scale image recognition.” arXiv preprint arXiv:1409.1556 (2014).
Iandola, Forrest N., Matthew W. Moskewicz, Khalid Ashraf, Song Han, William J. Dally, and Kurt Keutzer. “SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and 1MB model size.” arXiv preprint arXiv:1602.07360 (2016).
He, Kaiming, and Jian Sun. “Convolutional neural networks at constrained time cost.” In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 5353-5360. 2015.
Szegedy, Christian, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed, Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, and Andrew Rabinovich. “Going deeper with convolutions.” In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1-9. 2015.
Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E. Hinton. “Imagenet classification with deep convolutional neural networks.” In Advances in neural information processing systems, pp. 1097-1105. 2012.