Two-Stream Convolutional Networks for Action Recognition in Videos [Paper Part]

1.Contribution

  • propose a two-stream ConvNet architecture
    • spatial & temporal
  • a ConvNet trained on multi-frame dense optical flow achieves very good performance in spite of limited training data
  • multi-task learning applied to two different action classification datasets
    • can increase the amount of training data
    • can improve the performance on both datasets

2.Two-Stream

  • spatial stream
    • action recognition from still video frames
  • temporal stream
    • recognize action from motion in the form of dense optical flow
  • based on the two-pathway hypothesis of the human visual system
    • ventral stream
      • performs object recognition
    • dorsal stream
      • recognises motion

3.Video

  • spatial
    • in the form of individual frame appearance
    • carries information about scenes and objects depicted in the video
  • temporal
    • in the form of motion across the frames
    • conveys the movement of the observer (the camera) and the objects

4.Spatial Stream ConvNet

  • operates on individual video frames
    • effectively performing action recognition from still images
  • some actions are strongly associated with particular objects
    • the stream can therefore be built on an image classification architecture and pre-trained on a large image dataset such as ImageNet (a minimal sketch follows this list)
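A minimal sketch of such a spatial stream, assuming PyTorch/torchvision: an ImageNet-pre-trained image classifier whose last layer is replaced by an action classifier. The paper trains a CNN-M-2048-style network in Caffe, so the ResNet-18 backbone below is only a convenient stand-in.

```python
import torch
import torch.nn as nn
from torchvision import models

NUM_ACTIONS = 101  # e.g. UCF-101

# Start from an ImageNet-pre-trained backbone and replace the classifier.
spatial_net = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
spatial_net.fc = nn.Linear(spatial_net.fc.in_features, NUM_ACTIONS)

frame = torch.randn(1, 3, 224, 224)  # one RGB frame cropped to 224x224
scores = spatial_net(frame)          # per-action class scores
print(scores.shape)                  # torch.Size([1, 101])
```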

5.Optical Flow ConvNets

  • input
    • formed by stacking optical flow displacement fields between several consecutive frames
    • explicitly describes the motion between video frames
      • make the recognition easier
        • the network does not need to estimate motion

6.Mean Flow Subtraction

  • from each displacement field d we subtract its mean vector
    • a simple way of compensating the global (camera) motion between the frames (a small sketch follows)
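A small numpy sketch of this subtraction; the function name and the h × w × 2 array layout are assumptions for illustration.

```python
import numpy as np

def subtract_mean_flow(flow):
    """Subtract the mean displacement vector from one flow field.

    flow: array of shape (h, w, 2) holding the horizontal and vertical
    displacement components. Removing the per-field mean vector is a
    simple way of suppressing global (camera) motion.
    """
    return flow - flow.mean(axis=(0, 1), keepdims=True)
```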

7.Architecture

  • sample a 224×224×2L sub-volume from the flow volume I and pass it to the net as input (see the crop sketch after this list)
  • hidden layer configuration same as the spatial net
  • testing is similar to the spatial ConvNet
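A minimal sketch of that sampling step, assuming a numpy flow volume of shape h × w × 2L with h, w ≥ 224; the function name is illustrative.

```python
import numpy as np

def random_crop_volume(I, size=224):
    """Randomly crop a size x size x 2L sub-volume from the flow volume I
    (shape h x w x 2L, with h and w at least `size`), as fed to the
    temporal net during training."""
    h, w, _ = I.shape
    top = np.random.randint(0, h - size + 1)
    left = np.random.randint(0, w - size + 1)
    return I[top:top + size, left:left + size, :]
```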

8.Optical Flow Stacking

  • a dense optical flow can be seen as a set of displacement vector fields dt between pairs of consecutive frames t and t+1
  • dt(u,v)
    • the displacement vector at the point (u,v) in frame t, which moves the point to the corresponding point in the following frame t+1
  • dtx & dty
    • the horizontal and vertical components of the vector field, which can be seen as image channels
    • well suited to recognition using a convolutional network
  • w,h
    • the width and height of the video frames
  • IT(u,v,2k-1) = dx_{T+k-1}(u,v)
    IT(u,v,2k)   = dy_{T+k-1}(u,v)
    • u = [1; w], v = [1; h], k = [1; L]
    • this builds a ConvNet input volume IT ∈ R^(w×h×2L) for an arbitrary frame T
      • 2L input channels
    • the channel IT(u,v,c) stores the displacement component at the location (u,v)
  • for an arbitrary point (u,v), the channels IT(u,v,c) encode the motion at that point over a sequence of L frames (a numpy sketch of this stacking follows the list)
    • c = [1; 2L]
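A numpy sketch of the stacking above, assuming 0-based Python indexing, a list of forward flow fields of shape h × w × 2 (flows[t] between frames t and t+1), and illustrative names.

```python
import numpy as np

def stack_flows(flows, T, L):
    """Build the temporal-net input volume I_T by stacking L flow fields.

    flows: list of dense flow fields; flows[t] has shape (h, w, 2) and is
           the displacement field between frames t and t+1.
    T:     0-based index of the starting frame.
    L:     number of stacked flow fields; the result has 2L channels.
    """
    h, w, _ = flows[0].shape
    I = np.empty((h, w, 2 * L), dtype=np.float32)
    for k in range(1, L + 1):
        d = flows[T + k - 1]             # d_{T+k-1}
        I[:, :, 2 * k - 2] = d[:, :, 0]  # channel 2k-1: horizontal component
        I[:, :, 2 * k - 1] = d[:, :, 1]  # channel 2k:   vertical component
    return I
```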

9.Trajectory Stacking

  • sample along the motion trajectory
  • IT(u,v,2k-1) = dx_{T+k-1}(Pk)
    IT(u,v,2k)   = dy_{T+k-1}(Pk)
    • u = [1; w], v = [1; h], k = [1; L]
    • the input volume IT corresponds to a frame T
    • Pk is the k-th point along the trajectory
      • it starts at the location (u,v) in frame T
      • it is defined by the following recurrence relation
        • P1 = (u,v)
          Pk = P_{k-1} + d_{T+k-2}(P_{k-1})   (k > 1)
      • IT stores the vectors sampled at the locations Pk along the trajectory (a sketch of this sampling follows the list)
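A numpy sketch of trajectory stacking under the same assumptions as the previous sketch; nearest-pixel rounding is used when sampling the flow along the trajectory, a detail these notes do not prescribe.

```python
import numpy as np

def stack_flows_along_trajectories(flows, T, L):
    """Build the input volume by sampling the k-th flow field at the point
    P_k reached by following the flow, rather than always at (u, v)."""
    h, w, _ = flows[0].shape
    I = np.empty((h, w, 2 * L), dtype=np.float32)
    # P_1 = (u, v) for every output location (kept as float positions).
    ys, xs = np.mgrid[0:h, 0:w].astype(np.float32)
    for k in range(1, L + 1):
        d = flows[T + k - 1]                             # d_{T+k-1}
        yi = np.clip(np.rint(ys), 0, h - 1).astype(int)  # nearest pixel
        xi = np.clip(np.rint(xs), 0, w - 1).astype(int)
        dx = d[yi, xi, 0]
        dy = d[yi, xi, 1]
        I[:, :, 2 * k - 2] = dx
        I[:, :, 2 * k - 1] = dy
        # Advance the trajectory: P_{k+1} = P_k + d_{T+k-1}(P_k).
        xs = xs + dx
        ys = ys + dy
    return I
```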

10.Bi-directional Optical Flow

  • compute an additional set of displacement fields in the opposite direction
  • construct an input volume IT by stacking L/2 forward flows between frames T and T+L/2 and L/2 backward flows between frames T−L/2 and T (a sketch follows this list)
  • the flow can be represented using either of the methods (1) and (2)
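A sketch of that construction, assuming the same h × w × 2 flow layout as before; backward_flows[t] is assumed to hold the displacement from frame t+1 back to frame t, and the ordering of the forward and backward halves within the 2L channels is an illustrative convention.

```python
import numpy as np

def stack_bidirectional(forward_flows, backward_flows, T, L):
    """Stack L/2 forward flows starting at frame T and L/2 backward flows
    ending at frame T into one (h, w, 2L) volume (L is assumed even)."""
    assert L % 2 == 0
    half = L // 2
    h, w, _ = forward_flows[0].shape
    I = np.empty((h, w, 2 * L), dtype=np.float32)
    for k in range(half):
        fwd = forward_flows[T + k]       # between frames T+k and T+k+1
        bwd = backward_flows[T - 1 - k]  # between frames T-k and T-k-1
        I[:, :, 2 * k]     = fwd[:, :, 0]
        I[:, :, 2 * k + 1] = fwd[:, :, 1]
        I[:, :, 2 * (half + k)]     = bwd[:, :, 0]
        I[:, :, 2 * (half + k) + 1] = bwd[:, :, 1]
    return I
```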

11.Relation of The Temporal ConvNet Architecture to Previous Representation

  • motion is explicitly represented using the optical flow displacement field, computed based on the assumptions of constancy of the intensity and smoothness of the flow

12.Visualisation of Learnt Convolutional Filters

  • first-layer convolutional filters learnt on 10 stacked optical flows
  • the visualisation is split into 96 columns and 20 rows
    • each column corresponds to a filter
    • each row corresponds to an input channel
  • each of the 96 filters has a spatial receptive field of 7×7 pixels,and spans components of 10 stacked optical flow displacement fields d
  • some filters compute spatial derivatives of the optical flow
    • they capture how motion changes with image location
    • this generalises derivative-based hand-crafted descriptors
      • e.g. MBH
  • other filters compute temporal derivatives
    • they capture changes in motion over time

13.Multi-task Learning

  • combine several datasets
  • the aim is to learn a (video) representation that is not only applicable to the task in question (e.g. HMDB-51 classification), but also to other tasks (e.g. UCF-101 classification)
    • additional tasks act as a regulariser and allow for the exploitation of additional training data
  • in our case, the ConvNet architecture has two softmax classification layers on top of the last fully-connected layer (a PyTorch-style sketch follows this list)
    • one computes HMDB-51 classification scores
    • the other computes UCF-101 scores
    • each of the layers is equipped with its own loss function
    • the overall training loss is computed as the sum of the individual tasks' losses
    • the network weight derivatives can be found by back-propagation
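A PyTorch-style sketch of the two-head set-up (the paper's implementation is Caffe-based); the feature dimension, the names, and the way a batch from a single dataset only drives its own head are illustrative assumptions.

```python
import torch
import torch.nn as nn

class MultiTaskHead(nn.Module):
    """Two softmax classification heads on top of a shared feature layer:
    one for HMDB-51 (51 classes) and one for UCF-101 (101 classes).
    The shared ConvNet trunk is left abstract here."""

    def __init__(self, feat_dim=2048):
        super().__init__()
        self.hmdb_head = nn.Linear(feat_dim, 51)
        self.ucf_head = nn.Linear(feat_dim, 101)

    def forward(self, features):
        return self.hmdb_head(features), self.ucf_head(features)

# Training-step sketch: each head's loss is computed on the samples coming
# from its own dataset, the overall loss is the sum of the per-task losses,
# and back-propagation yields the weight derivatives.
model = MultiTaskHead()
criterion = nn.CrossEntropyLoss()
features = torch.randn(8, 2048)           # features from the shared trunk
hmdb_labels = torch.randint(0, 51, (8,))  # pretend this batch is from HMDB-51
hmdb_scores, ucf_scores = model(features)
loss = criterion(hmdb_scores, hmdb_labels)  # + UCF-101 loss for UCF batches
loss.backward()
```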

14.Implementation details

  • ConvNets configuration
    • all hidden weight layers use the rectification (ReLU) activation function
    • max pooling is performed over 3×3 spatial windows with stride 2
    • local response normalisation uses the same setting as 《ImageNet Classification with Deep Convolutional Neural Networks》
    • difference between the spatial and temporal ConvNet configurations
      • the second normalisation layer is removed from the latter to reduce memory consumption
  • training
    • spatial net training
      • a 224×224 sub-image is randomly cropped from the selected frame, then undergoes random horizontal flipping and RGB jittering
      • videos are rescaled beforehand
      • the sub-image is sampled from the whole frame
    • temporal net training
      • compute an optical flow volume I for the selected training frame; from I, a fixed-size 224×224×2L input is randomly cropped and flipped
    • learning rate
      • initially set to 10^-2, then decreased according to a fixed schedule, which is kept the same for all training sets
      • changed to 10^-3 after 50k iterations; training stops after 80k iterations
      • in fine-tuning, the rate is changed to 10^-3 after 14k iterations; training stops after 20k iterations
    • testing
      • sample a fixed number of frames(25) with equal temporal spacing between them
      • get 10 ConvNet inputs from each of the frames by cropping and flipping four corners and the center of the frame
      • class scores for the whole video are then obtained by averaging the scores across the sampled frames and crops therein
    • pre-training on ImageNet ILSVRC-2012
      • pre-train the spatial ConvNet
      • use the same training and test data augmentation (cropping, flipping, RGB jittering)
      • sample from the whole image
    • Multi-GPU training
      • derived from Caffe, with a number of modifications, including parallel training on multiple GPUs installed in a single system
      • exploits data parallelism and splits each SGD batch across several GPUs
        • 3.2× speed-up
    • optical flow
      • using the off-the-shelf GPU implementation of 《High accuracy optical flow estimation based on a theory for warping》from the OpenCV toolbox
      • the flow is pre-computed before training
      • the horizontal and vertical components of the flow are linearly rescaled to a [0, 255] range and compressed using JPEG, to avoid storing the displacement fields as floats; this reduces the flow storage for the UCF-101 dataset from 1.5TB to 27GB (a sketch follows this list)
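A sketch of pre-computing, rescaling and JPEG-compressing the flow. OpenCV's Farneback flow is used here only as a stand-in for the Brox et al. GPU implementation used in the paper, and the clipping bound for the linear rescaling is an assumption (the notes only state the [0, 255] target range).

```python
import cv2
import numpy as np

def precompute_flow(prev_gray, next_gray, out_prefix, bound=20.0):
    """Compute dense flow between two grayscale frames and store each
    component as an 8-bit JPEG instead of a float array."""
    flow = cv2.calcOpticalFlowFarneback(prev_gray, next_gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    for c, name in enumerate(("x", "y")):
        # Clip to [-bound, bound], then rescale linearly to [0, 255].
        comp = np.clip(flow[:, :, c], -bound, bound)
        comp = ((comp + bound) * 255.0 / (2 * bound)).astype(np.uint8)
        cv2.imwrite(f"{out_prefix}_flow_{name}.jpg", comp)
```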

15.Evaluation

  • datasets and evaluation protocol
    • the evaluation is performed on UCF-101 and HMDB-51
      • UCF-101 contains 13k videos of 101 action classes
      • HMDB-51 contains 6.8k videos of 51 actions
  • evaluation protocol
    • the organisers provide 3 splits into training and testing data
    • the performance is measured by the mean classification accuracy across the splits
    • UCF-101 contains 9.5k training videos per split
    • HMDB-51 contains 3.7k training videos per split
    • we begin by comparing different architectures on the first split of the UCF-101 dataset
    • we then follow the standard evaluation protocol and report the average accuracy over three splits on both UCF-101 and HMDB-51
  • spatial ConvNet
    • measure the performance of the spatial stream ConvNet
    • we choose to train only the last layer on top of a pre-trained ConvNet
  • temporal ConvNet
    • in particular, we measure the effect of
      • using multiple (L = {5, 10}) stacked optical flows
      • trajectory stacking
      • mean displacement subtraction
      • using bi-directional optical flow
    • use an aggressive dropout ratio of 0.9 to help improve generalisation
    • results
      • stacking multiple (L > 1) displacement fields in the input is highly beneficial
        • it provides the network with long-term motion information
      • mean subtraction is helpful
        • it reduces the effect of global motion between the frames
      • the temporal ConvNet significantly outperforms the spatial ConvNet
        • this confirms the importance of motion information for action recognition
      • we also implement the “slow fusion” architecture of 《Large-scale video classification with convolutional neural networks》
        • it amounts to applying a ConvNet to a stack of RGB frames
        • it performs worse than the networks operating on optical flow: while multi-frame information is important, it is also important to present it to a ConvNet in an appropriate manner
  • multi-task learning of temporal ConvNets
    • training the ConvNet on HMDB-51 differs from training on UCF-101, since the HMDB-51 training set is much smaller
    • multi-task learning performs the best
      • it allows the training procedure to exploit all available training data
  • two-stream ConvNet
    • we evaluate the complete two-stream model
      • it combines the two recognition streams
    • the softmax scores are fused using either averaging or a linear SVM (a minimal averaging sketch follows this list)
    • conclusions
      • the temporal and spatial recognition streams are complementary
        • their fusion significantly improves on both
          • by 6% over the temporal and 14% over the spatial net
      • SVM-based fusion of the softmax scores outperforms fusion by averaging
      • using bi-directional flow is not beneficial in the case of ConvNet fusion
      • the temporal ConvNet trained using multi-task learning performs the best, both alone and when fused with the spatial net
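A minimal sketch of fusion by averaging the softmax scores; array shapes and names are assumptions, and the SVM-based variant would instead train a linear SVM on the stacked scores.

```python
import numpy as np

def fuse_by_averaging(spatial_scores, temporal_scores):
    """Late fusion of the two streams by averaging their (softmax) class
    scores; both inputs have shape (num_videos, num_classes)."""
    fused = (spatial_scores + temporal_scores) / 2.0
    return fused.argmax(axis=1)  # predicted class per video
```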

16.Comparison with the State of the Art

  • both our spatial and temporal nets alone outperform the deep architecture of 《Large-scale video classification with convolutional neural networks》and 《A large video database for human motion recognition》by a large margin
  • the combination of two nets
    • further improves the results
    • is comparable to very recent state-of-the-art hand-crafted models
  • confusion matrix and per-class recall for UCF-101 classification
    • the worst class is Hammering, which is confused with the HeadMassage and BrushingTeeth classes
    • reason
      • the spatial ConvNet confuses Hammering with HeadMassage, which can be caused by the significant presence of human faces in both classes
      • the temporal ConvNet confuses Hammering with BrushingTeeth, as both actions contain recurring motion patterns
        • a hand moving up and down

17.Conclusion

  • proposed a deep video classification model with competitive performance, which incorporates separate spatial and temporal recognition streams based on ConvNets
    • training a temporal ConvNet on optical flow is significantly better than training on raw stacked frames
    • our temporal model does not require significant hand-crafting, despite using optical flow as input
      • since the flow is computed using a method based on the generic assumptions of constancy and smoothness
  • training on extra data poses a significant challenge on its own
    • due to the gigantic amount of training data
      • multiple TBs
  • some essential ingredients of the state-of-the-art shallow representation are still missing from our current architecture
    • local feature pooling over spatio-temporal tubes, centred at the trajectories
      • even though the input (2) captures the optical flow along the trajectories, the spatial pooling in our network does not take the trajectories into account
    • explicit handling of camera motion, which in our case is compensated by mean displacement subtraction
