title: Hand Analysis Record
date: 2020-06-20 11:32:44
author: liudongdong1
img: https://gitee.com/github-25970295/blogImage/raw/master/img/dataglove.jpg
reprintPolicy: cc_by
cover: false
categories: AIOT
tags:
level: CVPR CCF_A
author: Tomas Simon Carnegie Mellon University
date: 2017
keyword:
previous work:
input: a cropped image patch $I \in R^{w*h*3}$
output: $P$ keypoint locations $x_p \in R^2$, each with an associated confidence $c_p$.
Keypoint detector: $d(I) \rightarrow \{(x_p, c_p)\ \text{for}\ p \in [1 \dots P]\}$
【Multiview Bootstrapped Training】
Multiview Bootstrap:
Supplement for the mathematics:
for (1): for one frame, for each keypoint p we have V detections $(x_p^v, c_p^v)$; robustly triangulate each point p into a 3D location, using RANSAC on the detections with confidence above a detection threshold $\lambda$.
for (2): $I_p^f$ is the inlier set, $X_p^f \in R^3$ is the 3D triangulated keypoint p in frame f, and $P_v(X) \in R^2$ denotes the projection of 3D point $X$ into view $v$. All landmarks of each finger (4 points) are triangulated at a time.
for (3): pick the best frame for every window of $W$ frames. Sort the frames in descending order according to their score to obtain an ordered sequence of frames $[s_1, s_2, \dots, s_{F'}]$, where $F'$ is the number of subsampled frames and $s_i$ is the ordered frame index.
for (4): $T_{i+1}=\{(I_v^{s_n},\{P_v(X_p^{s_n}):\ v \in [1 \dots V],\ p \in [1 \dots P]\})\ \text{for}\ n \in [1 \dots N]\}$
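As a concrete illustration of steps (1)-(2), below is a minimal NumPy sketch of RANSAC-based triangulation of one keypoint from V views, assuming known 3×4 camera projection matrices; the helper names and thresholds are illustrative, not the paper's code.

```python
import numpy as np

def triangulate_dlt(proj_mats, points_2d):
    """Linear (DLT) triangulation of one 3D point from >= 2 views."""
    A = []
    for P, (u, v) in zip(proj_mats, points_2d):
        A.append(u * P[2] - P[0])
        A.append(v * P[2] - P[1])
    _, _, Vt = np.linalg.svd(np.asarray(A))
    X = Vt[-1]
    return X[:3] / X[3]

def reproject(P, X):
    x = P @ np.append(X, 1.0)
    return x[:2] / x[2]

def ransac_triangulate(proj_mats, dets, confs, conf_thresh=0.3,
                       reproj_thresh=5.0, iters=100, rng=None):
    """Robustly triangulate one keypoint; dets: (V, 2) detections, confs: (V,) confidences."""
    rng = rng or np.random.default_rng(0)
    cand = [v for v in range(len(dets)) if confs[v] > conf_thresh]   # confidence above lambda
    if len(cand) < 2:
        return None, []
    best_X, best_inliers = None, []
    for _ in range(iters):
        v1, v2 = rng.choice(cand, size=2, replace=False)
        X = triangulate_dlt([proj_mats[v1], proj_mats[v2]], [dets[v1], dets[v2]])
        inliers = [v for v in cand
                   if np.linalg.norm(reproject(proj_mats[v], X) - dets[v]) < reproj_thresh]
        if len(inliers) > len(best_inliers):
            best_X, best_inliers = X, inliers
    if best_inliers:
        # refine using all inlier views
        best_X = triangulate_dlt([proj_mats[v] for v in best_inliers],
                                 [dets[v] for v in best_inliers])
    return best_X, best_inliers
```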
【Detection Architecture】
Bootstrap steps:
RANSAC: unlike robust estimation techniques such as M-estimators and least-median squares, which the computer vision community adopted from the statistics literature, RANSAC was developed from within the computer vision community.
level: CVPR
author: Kuo Du
date: 2019
keyword:
previous work:
【Heat-map guided feature extraction】
【Baseline feature refinement architecture】
【New Feature refinement architecture】
【Loss Functions Defines】
Flex sensors from Spectra-Symbol for angle displacement measurements.
Apply a linear response delay filter to the raw sensor output for noise reduction and signal smoothing.
some sensors
presents a Thai sign language recognition framework using a glove-based device with flex sensors and gyroscopes.
the measurements from the sensors are processed using finite Legendre and Linear Discriminant Analysis, then classified using k-nearest neighbors.
the gyroscopes can return values in three different types of measurement
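A rough sketch of such a classification pipeline, assuming the "finite Legendre" step means fitting low-order Legendre polynomial coefficients to each sensor's time series within a gesture segment; the toy data and hyper-parameters are placeholders.

```python
import numpy as np
from numpy.polynomial import legendre
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline

def legendre_features(segment, degree=4):
    """segment: (T, n_sensors) readings of one gesture.
    Fit Legendre coefficients to each sensor channel and concatenate them."""
    T, n_sensors = segment.shape
    t = np.linspace(-1.0, 1.0, T)  # Legendre polynomials live on [-1, 1]
    coeffs = [legendre.legfit(t, segment[:, s], degree) for s in range(n_sensors)]
    return np.concatenate(coeffs)  # (n_sensors * (degree + 1),)

# toy data: 100 segments, 50 samples each, 5 flex sensors, 3 gesture classes
rng = np.random.default_rng(0)
segments = rng.normal(size=(100, 50, 5))
labels = rng.integers(0, 3, size=100)

X = np.stack([legendre_features(seg) for seg in segments])
clf = make_pipeline(LinearDiscriminantAnalysis(), KNeighborsClassifier(n_neighbors=3))
clf.fit(X, labels)
print(clf.predict(X[:5]))
```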
Data processing
segment and normalize the data (how to segment the data is unclear??)
The values from the flex sensors differ greatly from person to person, so a calibration phase is required in which the user clenches and releases his hands at least 3 times to determine the maximum and minimum values of each flex sensor; the data is then quantized to 3 possible values (0, 1, 2).
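A minimal sketch of the calibration and 3-level quantization described above; splitting the calibrated range at 1/3 and 2/3 is my assumption, not a value from the paper.

```python
import numpy as np

def calibrate(calib_readings):
    """calib_readings: (T, n_sensors) raw values recorded while the user
    clenches and releases the hand a few times."""
    return calib_readings.min(axis=0), calib_readings.max(axis=0)

def quantize(raw, vmin, vmax):
    """Map raw flex readings to {0, 1, 2} using the calibrated range."""
    norm = (raw - vmin) / np.maximum(vmax - vmin, 1e-6)  # 0..1 per sensor
    norm = np.clip(norm, 0.0, 1.0)
    return np.digitize(norm, [1 / 3, 2 / 3])             # 0, 1 or 2

# example with 5 flex sensors
rng = np.random.default_rng(1)
calib = rng.uniform(200, 800, size=(120, 5))
vmin, vmax = calibrate(calib)
sample = rng.uniform(200, 800, size=5)
print(quantize(sample, vmin, vmax))
```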
Taheri, Omid, et al. “GRAB: A dataset of whole-body human grasping of objects.” European Conference on Computer Vision. Springer, Cham, 2020.
- present a multi-modal mobile teleoperation system that consists of a novel vision-based hand pose regression network and IMU-based arm tracking methods.
- observe the human hand through a depth camera and generate joint angles and depth images of paired robot hand poses through an image-to-image translation process.
- Transteleop takes the depth image of the human hand as input, then estimates the joint angles of the robot hand, and also generates the reconstructed image of the robot hand.
- design a keypoint-based reconstruction loss to focus on the local reconstruction quality around the keypoints of the hand.
previous work:
【Question 1】how to discover the latent feature embedding $Z_{pose}$ between the human hand and the robot hand?
using an encoder-decoder module
【Question 2】how to get more accurate local features, such as the positions of the fingertips, rather than global features such as image style?
design a keypoint-based reconstruction loss to capture the overall structure of the hand and concentrate on the pixels around the 15 keypoints of the hand.
using a mean squared error (MSE) loss on the joint angles predicted from $Z_R$ (the robot feature)
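A rough PyTorch sketch of a keypoint-weighted reconstruction loss combined with an MSE joint loss, assuming per-image keypoint pixel coordinates are available; the window size and weight are placeholders rather than the paper's settings.

```python
import torch
import torch.nn.functional as F

def keypoint_recon_loss(recon, target, keypoints_uv, win=8, kp_weight=5.0):
    """recon, target: (B, 1, H, W) generated / ground-truth robot-hand depth images.
    keypoints_uv: (B, K, 2) pixel coordinates of the hand keypoints."""
    B, _, H, W = recon.shape
    weight = torch.ones_like(recon)
    for b in range(B):
        for u, v in keypoints_uv[b].long():
            u0, u1 = max(int(u) - win, 0), min(int(u) + win, W)
            v0, v1 = max(int(v) - win, 0), min(int(v) + win, H)
            weight[b, :, v0:v1, u0:u1] = kp_weight  # emphasize pixels around keypoints
    return (weight * (recon - target) ** 2).mean()

def joint_loss(pred_joints, gt_joints):
    """MSE between predicted and ground-truth robot joint angles."""
    return F.mse_loss(pred_joints, gt_joints)

# usage with dummy tensors
recon, target = torch.rand(2, 1, 96, 96), torch.rand(2, 1, 96, 96)
kps = torch.randint(0, 96, (2, 15, 2))
loss = keypoint_recon_loss(recon, target, kps) + joint_loss(torch.rand(2, 16), torch.rand(2, 16))
print(loss)
```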
【Question 3】how to handle the considerable variation in the global orientation of human hand poses?
apply a spatial transformer network (STN), which provides spatial transformation capabilities on the input images before the encoder module.
【Question 4】the hand easily disappears from the field of view of the camera, and the camera position is uncertain ?
using a cheap 3D-printed camera holder
using Perception Neuron device to control the arm of the robot.
Environment:
level: PerDial’19
author:
date: 2019
keyword:
previous work:
level: IJCAI
author: Yang Yi (Media Lab, Tencent), Feng Ni (Peking University)
date: 2019
keyword:
previous work:
Temporal Modeling for Action Recognition
Gesture Recognition:
system overview:
【Multi-Kernel Temporal Block】
【Global Refinement Block】
MKTB mainly focuses on local neighborhoods, but the global temporal features across channels are not sufficiently attended to.
GRB is designed to perform weighted temporal aggregation, in which distant temporal features are allowed to contribute to the filtered temporal features according to their cross-similarity. Remaining questions: how is the similarity computed, and how is the summation in MKTB performed?
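A minimal sketch of weighted temporal aggregation driven by cross-similarity (a non-local / self-attention style block over the time axis); this is one plausible reading of GRB, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class GlobalRefinementBlock(nn.Module):
    """Aggregate features across all time steps, weighted by cross-similarity."""
    def __init__(self, channels):
        super().__init__()
        self.query = nn.Conv1d(channels, channels, 1)
        self.key = nn.Conv1d(channels, channels, 1)
        self.value = nn.Conv1d(channels, channels, 1)

    def forward(self, x):            # x: (B, C, T) temporal features
        q, k, v = self.query(x), self.key(x), self.value(x)
        sim = torch.softmax(q.transpose(1, 2) @ k / q.shape[1] ** 0.5, dim=-1)  # (B, T, T)
        out = v @ sim.transpose(1, 2)   # each output step is a weighted sum over all steps
        return x + out                  # residual connection

feats = torch.rand(4, 256, 32)          # batch of 32-frame feature sequences
print(GlobalRefinementBlock(256)(feats).shape)  # torch.Size([4, 256, 32])
```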
level: CCF_A CVPR
author: Liuhao Ge, Nanyang Technological University
date: 2018
keyword:
previous work:
Input: depth image containing a hand;
outputs: a set of 3D hand joint locations in the camera coordinate system.
【Basic PointNet】 [23] directly takes a set of points as the input and is able to extract discriminative features of the point cloud, but it cannot capture local structures of the point cloud in a hierarchical way.
The basic PointNet architecture takes N points as input. Each D-dim input point is mapped into a C-dim feature through an MLP; the per-point features are aggregated into a global feature by max-pooling and mapped into an F-dim output vector.
【Hierarchical PointNet】[25]:
The hierarchical structure is composed of a number of set abstraction levels. At each level, a set of points is processed and abstracted to produce a new set with fewer elements. Each set abstraction level is made of three key layers, sampling + grouping + local feature extraction (S+G+P); a structure containing these three parts is called a Set Abstraction level (a code sketch of one such level follows the list below):
- sampling layer: selects a set of points from the input points, which defines the centroids of local regions; iterative farthest point sampling (FPS) is used to choose the subset of points.
- grouping layer: constructs local region sets by finding "neighboring" points around the centroids. The output is N'×K×(d+C): d-dim coordinates and C-dim point features, where K is the number of points in the neighborhood of each centroid. Ball query finds all points that are within a radius of the query point.
- PointNet layer: uses a mini-PointNet to encode local region patterns into feature vectors.
- The classification network extracts features layer by layer and finally summarizes them into a global feature.
- The segmentation network first extracts a global feature from the point cloud and then progressively upsamples from this global feature. The centroids of each new level are sampled from the previous level's feature subset, and the number of centroids equals the number of grouped point sets; as the number of levels increases, the number of centroids gradually decreases, capturing the local structural features of the point cloud.
When the point cloud is non-uniform, using the same ball radius for every sub-region during partitioning leaves too few sampled points in sparse regions. Two remedies are multi-scale grouping (MSG) and multi-resolution grouping (MRG).
- **Multi-scale grouping (MSG):** for a chosen centroid, group with several radii and concatenate the features extracted by PointNet from each region as the feature of that centroid.
- **Multi-resolution grouping (MRG):** concatenate features extracted at different feature levels (resolutions). Taking the right figure above as an example, the final concatenation contains two parts, coming from the low-level and high-level feature extraction: the low-level grouped point cloud is passed through a PointNet and concatenated with the high-level feature, in the spirit of a skip connection during feature extraction. When a local point-cloud region is sparse, the features extracted at the higher level may be less reliable than those from the lower level, so the lower-level features are given a higher weight; when the point-cloud density is high, more features can of course be extracted. This approach mitigates the problem of extracting features directly from sparse point clouds and is also more efficient than MSG.
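A compact PyTorch sketch of one set abstraction level (sampling + grouping + mini-PointNet), written from the description above rather than from the official PointNet++ code: FPS is implemented naively, the ball query simply falls back to the nearest farther points when fewer than K neighbors lie within the radius, and only xyz coordinates are used as input features.

```python
import torch
import torch.nn as nn

def farthest_point_sampling(xyz, npoint):
    """xyz: (B, N, 3) -> indices (B, npoint) of an approximately uniform subset."""
    B, N, _ = xyz.shape
    idx = torch.zeros(B, npoint, dtype=torch.long)
    dist = torch.full((B, N), 1e10)
    farthest = torch.randint(0, N, (B,))
    batch = torch.arange(B)
    for i in range(npoint):
        idx[:, i] = farthest
        centroid = xyz[batch, farthest].unsqueeze(1)     # (B, 1, 3)
        dist = torch.minimum(dist, ((xyz - centroid) ** 2).sum(-1))
        farthest = dist.argmax(-1)                       # point farthest from the sampled set
    return idx

def ball_group(xyz, centroids, radius, K):
    """Group K neighbours around each centroid; points outside the radius are
    de-prioritized (a simplification of the paper's ball query)."""
    d = torch.cdist(centroids, xyz)                      # (B, S, N)
    d = d.masked_fill(d > radius, float('inf'))
    return d.topk(K, dim=-1, largest=False).indices      # (B, S, K)

class SetAbstraction(nn.Module):
    """One S+G+P level: sample centroids, group local regions, run a mini-PointNet."""
    def __init__(self, npoint, radius, K, in_ch, out_ch):
        super().__init__()
        self.npoint, self.radius, self.K = npoint, radius, K
        self.mlp = nn.Sequential(nn.Conv2d(in_ch, out_ch, 1), nn.BatchNorm2d(out_ch), nn.ReLU())

    def forward(self, xyz):                              # xyz: (B, N, 3)
        B = xyz.shape[0]
        centroids = xyz[torch.arange(B).unsqueeze(1),
                        farthest_point_sampling(xyz, self.npoint)]   # (B, S, 3)
        group_idx = ball_group(xyz, centroids, self.radius, self.K)  # (B, S, K)
        grouped = xyz[torch.arange(B)[:, None, None], group_idx]     # (B, S, K, 3)
        grouped = grouped - centroids.unsqueeze(2)                    # local coordinates
        feats = self.mlp(grouped.permute(0, 3, 1, 2))                 # (B, C, S, K)
        return centroids, feats.max(dim=-1).values                    # max-pool over each region

pts = torch.rand(2, 1024, 3)
new_xyz, new_feat = SetAbstraction(128, 0.2, 32, 3, 64)(pts)
print(new_xyz.shape, new_feat.shape)   # (2, 128, 3) (2, 64, 128)
```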
【OBB-based Point Cloud Normalization】to deal with the large variation in the global orientation of the hand, the hand point cloud is normalized into a canonical coordinate system in which the global orientations of the transformed hand point clouds are as consistent as possible. This normalization step ensures that the method is robust to variations in hand global orientation.
The following pictures show the sensitivity of points in three local regions to two filters at each of the first two levels; each column corresponds to the same local region, and each row corresponds to the same filter.
【Refine the Fingertip】
Based on the observation that the fingertip location of a straightened finger is usually easy to refine, since the K nearest neighboring points of the fingertip will not change much even if the estimated location deviates from the ground-truth location to some extent, when K is relatively large.
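A small sketch of the neighborhood extraction behind this refinement idea: gather the K nearest points around the initially estimated fingertip, which can then be fed to a small refinement network (the network itself is omitted here).

```python
import torch

def fingertip_neighborhood(points, fingertip_est, K=64):
    """points: (N, 3) hand point cloud; fingertip_est: (3,) initial fingertip estimate.
    Returns the K nearest points, which stay nearly the same even if the estimate
    is slightly off, and can be fed to a small refinement network."""
    d = ((points - fingertip_est) ** 2).sum(dim=-1)   # squared distances to the estimate
    idx = d.topk(K, largest=False).indices
    return points[idx]

cloud = torch.rand(1024, 3)
neigh = fingertip_neighborhood(cloud, torch.tensor([0.9, 0.9, 0.9]))
print(neigh.shape)  # torch.Size([64, 3])
```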
Farthest Point Sampling (FPS) is a very commonly used sampling algorithm. Because it guarantees uniform coverage of the samples, it is widely used: the 3D point-cloud deep-learning framework PointNet++ applies FPS to the sample points and then clusters them to form receptive fields; the 3D object-detection network VoteNet applies FPS to the scattered points obtained by voting before clustering them; and the 6D pose-estimation algorithm PVN3D uses it to select 8 feature points of an object for voting and pose computation.
- The input point cloud has N points. Select a point P0 from the point cloud as the starting point, giving the sampled set S = {P0};
- Compute the distances of all points to P0, forming an N-dimensional array L; select the point with the maximum distance as P1 and update the sampled set to S = {P0, P1};
- Compute the distances of all points to P1; for each point Pi, if its distance to P1 is smaller than L[i], update L[i] = d(Pi, P1); the array L therefore always stores each point's shortest distance to the sampled set S;
- Select the point with the maximum value in L as P2 and update the sampled set to S = {P0, P1, P2};
- Repeat steps 2-4 until N' target points have been sampled.
- Initial point selection:
  - choose a random point: the result differs between runs;
  - choose the point farthest from the point-cloud centroid: the result is the same every run and is generally a local extremum with strong descriptive ability;
- Distance metric:
  - Euclidean distance: mainly for point clouds, uniform sampling in 3D volume space;
  - geodesic distance: mainly for triangle meshes, uniform sampling on the mesh surface.
from __future__ import print_function
import torch
from torch.autograd import Variable
def farthest_point_sample(xyz, npoint):
"""
Input:
        xyz: pointcloud data, [B, 3, N] (transposed to [B, N, 3] inside)
npoint: number of samples
Return:
centroids: sampled pointcloud index, [B, npoint]
"""
xyz = xyz.transpose(2,1)
device = xyz.device
B, N, C = xyz.shape
    centroids = torch.zeros(B, npoint, dtype=torch.long).to(device)  # sampled point index matrix (B, npoint)
    distance = torch.ones(B, N).to(device) * 1e10  # distance from every point to the sampled set (B, N)
    batch_indices = torch.arange(B, dtype=torch.long).to(device)  # batch index array
    #farthest = torch.randint(0, N, (B,), dtype=torch.long).to(device)  # alternative: pick a random initial point
    barycenter = torch.sum((xyz), 1)  # compute the barycenter and the point farthest from it
    barycenter = barycenter/xyz.shape[1]
    barycenter = barycenter.view(B, 1, 3)
    dist = torch.sum((xyz - barycenter) ** 2, -1)
    farthest = torch.max(dist,1)[1]  # use the point farthest from the barycenter as the first sampled point
for i in range(npoint):
print("-------------------------------------------------------")
print("The %d farthest pts %s " % (i, farthest))
        centroids[:, i] = farthest  # record the i-th farthest point
        centroid = xyz[batch_indices, farthest, :].view(B, 1, 3)  # xyz coordinates of this farthest point
        dist = torch.sum((xyz - centroid) ** 2, -1)  # Euclidean distance from every point to this farthest point
print("dist : ", dist)
mask = dist < distance
print("mask %i : %s" % (i,mask))
        distance[mask] = dist[mask]  # update distance: each point's minimum distance to all sampled points so far
        print("distance: ", distance)
        farthest = torch.max(distance, -1)[1]  # index of the point farthest from the sampled set
return centroids
if __name__ == '__main__':
sim_data = Variable(torch.rand(1,3,8))
print(sim_data)
centroids = farthest_point_sample(sim_data, 4)
print("Sampled pts: ", centroids)
Each row of the dataset holds six values per point, i.e., each point has six features (3D coordinates plus the normal vector). "normal" refers to the normal vector and can be enabled or disabled; if enabled, each point of the input point cloud has, besides the 3 position values x, y, z, three normal components Nx, Ny, Nz, for 6 features per point in total.
class PointNetEncoder(nn.Module):
def __init__(self, global_feat=True, feature_transform=False, channel=3):
super(PointNetEncoder, self).__init__()
self.stn = STN3d(channel)
self.conv1 = torch.nn.Conv1d(channel, 64, 1)
self.conv2 = torch.nn.Conv1d(64, 128, 1)
self.conv3 = torch.nn.Conv1d(128, 1024, 1)
self.bn1 = nn.BatchNorm1d(64)
self.bn2 = nn.BatchNorm1d(128)
self.bn3 = nn.BatchNorm1d(1024)
self.global_feat = global_feat
self.feature_transform = feature_transform
if self.feature_transform:
self.fstn = STNkd(k=64)
def forward(self, x):
B, D, N = x.size()
trans = self.stn(x)
x = x.transpose(2, 1)
if D > 3:
feature = x[:, :, 3:]
x = x[:, :, :3]
x = torch.bmm(x, trans)
if D > 3:
x = torch.cat([x, feature], dim=2)
x = x.transpose(2, 1)
x = F.relu(self.bn1(self.conv1(x)))
if self.feature_transform:
trans_feat = self.fstn(x)
x = x.transpose(2, 1)
x = torch.bmm(x, trans_feat)
x = x.transpose(2, 1)
else:
trans_feat = None
pointfeat = x
x = F.relu(self.bn2(self.conv2(x)))
x = self.bn3(self.conv3(x))
x = torch.max(x, 2, keepdim=True)[0]
x = x.view(-1, 1024)
if self.global_feat:
return x, trans, trans_feat
else:
x = x.view(-1, 1024, 1).repeat(1, 1, N)
return torch.cat([x, pointfeat], 1), trans, trans_feat
class PointNetCls(nn.Module):
def __init__(self, k = 2):
super(PointNetCls, self).__init__()
self.k = k
self.feat = PointNetEncoder(global_feat=False)
self.conv1 = torch.nn.Conv1d(1088, 512, 1)
self.conv2 = torch.nn.Conv1d(512, 256, 1)
self.conv3 = torch.nn.Conv1d(256, 128, 1)
self.conv4 = torch.nn.Conv1d(128, self.k, 1)
self.bn1 = nn.BatchNorm1d(512)
self.bn2 = nn.BatchNorm1d(256)
self.bn3 = nn.BatchNorm1d(128)
def forward(self, x):
        '''classification network'''
        batchsize = x.size()[0]
        n_pts = x.size()[2]
        x, trans, trans_feat = self.feat(x)  # PointNetEncoder returns (features, trans, trans_feat)
x = F.relu(self.bn1(self.conv1(x)))
x = F.relu(self.bn2(self.conv2(x)))
x = F.relu(self.bn3(self.conv3(x)))
x = self.conv4(x)
x = x.transpose(2,1).contiguous()
x = F.log_softmax(x.view(-1,self.k), dim=-1)
x = x.view(batchsize, n_pts, self.k)
return x
class PointNetPartSeg(nn.Module):
def __init__(self,num_class):
super(PointNetPartSeg, self).__init__()
self.k = num_class
self.feat = PointNetEncoder(global_feat=False)
self.conv1 = torch.nn.Conv1d(1088, 512, 1)
self.conv2 = torch.nn.Conv1d(512, 256, 1)
self.conv3 = torch.nn.Conv1d(256, 128, 1)
self.conv4 = torch.nn.Conv1d(128, self.k, 1)
self.bn1 = nn.BatchNorm1d(512)
self.bn1_1 = nn.BatchNorm1d(1024)
self.bn2 = nn.BatchNorm1d(256)
self.bn3 = nn.BatchNorm1d(128)
def forward(self, x):
        '''segmentation network'''
        batchsize = x.size()[0]
        n_pts = x.size()[2]
        x, trans, trans_feat = self.feat(x)  # PointNetEncoder returns (features, trans, trans_feat)
x = F.relu(self.bn1(self.conv1(x)))
x = F.relu(self.bn2(self.conv2(x)))
x = F.relu(self.bn3(self.conv3(x)))
x = self.conv4(x)
x = x.transpose(2,1).contiguous()
x = F.log_softmax(x.view(-1,self.k), dim=-1)
x = x.view(batchsize, n_pts, self.k)
return x, trans
By introducing grouping at different resolutions/scales, PointNet is applied to each local region to obtain that region's aggregated feature, and the features from different scales are finally concatenated; in addition, randomly dropping some points during training increases the model's robustness to missing points. --> addresses the point sparsity problem.
import torch.nn as nn
import torch.nn.functional as F
from pointnet2_utils import PointNetSetAbstraction
import torch
import numpy as np
class get_model(nn.Module):
def __init__(self,num_class,normal_channel=True):
super(get_model, self).__init__()
in_channel = 6 if normal_channel else 3
self.normal_channel = normal_channel
self.sa1 = PointNetSetAbstraction(npoint=512, radius=0.2, nsample=32, in_channel=in_channel, mlp=[64, 64, 128], group_all=False)
self.sa2 = PointNetSetAbstraction(npoint=128, radius=0.4, nsample=64, in_channel=128 + 3, mlp=[128, 128, 256], group_all=False)
self.sa3 = PointNetSetAbstraction(npoint=None, radius=None, nsample=None, in_channel=256 + 3, mlp=[256, 512, 1024], group_all=True)
self.fc1 = nn.Linear(1024, 512)
self.bn1 = nn.BatchNorm1d(512)
self.drop1 = nn.Dropout(0.4)
self.fc2 = nn.Linear(512, 256)
self.bn2 = nn.BatchNorm1d(256)
self.drop2 = nn.Dropout(0.4)
self.fc3 = nn.Linear(256, num_class)
def forward(self, xyz):
B, _, _ = xyz.shape
print("xyz.shape",xyz.shape)
if self.normal_channel:
norm = xyz[:, 3:, :]
xyz = xyz[:, :3, :]
else:
norm = None
l1_xyz, l1_points = self.sa1(xyz, norm)
l2_xyz, l2_points = self.sa2(l1_xyz, l1_points)
l3_xyz, l3_points = self.sa3(l2_xyz, l2_points)
x = l3_points.view(B, 1024)
x = self.drop1(F.relu(self.bn1(self.fc1(x))))
x = self.drop2(F.relu(self.bn2(self.fc2(x))))
x = self.fc3(x)
x = F.log_softmax(x, -1)
return x, l3_points
class get_loss(nn.Module):
def __init__(self):
super(get_loss, self).__init__()
def forward(self, pred, target, trans_feat):
total_loss = F.nll_loss(pred, target)
return total_loss
if __name__ == "__main__":
data = torch.ones([24,3,1024])
print(data.shape)
model = get_model(num_class=40,normal_channel=False)
print(model)
parameters = filter(lambda p: p.requires_grad, model.parameters())
parameters = sum([np.prod(p.size()) for p in parameters]) / 1_000_000
print('Trainable Parameters: %.3fM' % parameters)
pred, trans_feat = model(data)
print("Shape of out :", pred.shape) # [10,30,10]
get_model(
(sa1): PointNetSetAbstraction(
(mlp_convs): ModuleList(
(0): Conv2d(3, 64, kernel_size=(1, 1), stride=(1, 1))
(1): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1))
(2): Conv2d(64, 128, kernel_size=(1, 1), stride=(1, 1))
)
(mlp_bns): ModuleList(
(0): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
(sa2): PointNetSetAbstraction(
(mlp_convs): ModuleList(
(0): Conv2d(131, 128, kernel_size=(1, 1), stride=(1, 1))
(1): Conv2d(128, 128, kernel_size=(1, 1), stride=(1, 1))
(2): Conv2d(128, 256, kernel_size=(1, 1), stride=(1, 1))
)
(mlp_bns): ModuleList(
(0): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
(sa3): PointNetSetAbstraction(
(mlp_convs): ModuleList(
(0): Conv2d(259, 256, kernel_size=(1, 1), stride=(1, 1))
(1): Conv2d(256, 512, kernel_size=(1, 1), stride=(1, 1))
(2): Conv2d(512, 1024, kernel_size=(1, 1), stride=(1, 1))
)
(mlp_bns): ModuleList(
(0): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
(fc1): Linear(in_features=1024, out_features=512, bias=True)
(bn1): BatchNorm1d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(drop1): Dropout(p=0.4, inplace=False)
(fc2): Linear(in_features=512, out_features=256, bias=True)
(bn2): BatchNorm1d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(drop2): Dropout(p=0.4, inplace=False)
(fc3): Linear(in_features=256, out_features=40, bias=True)
)
% create point cloud from depth image
% author: Liuhao Ge
clc;clear;close all;
% If fread reports "Invalid file identifier. Use fopen to generate a valid file identifier.", the file path is wrong.
dataset_dir='C:\Users\liudongdong\OneDrive - tju.edu.cn\桌面\HandPointNet\data\cvpr15_MSRAHandGestureDB\';%'../data/cvpr15_MSRAHandGestureDB/'
save_dir='./';
subject_names={'P0','P1','P2','P3','P4','P5','P6','P7','P8'};
%subject_names={'P0'};
%gesture_names={'1'};
gesture_names={'1','2','3','4','5','6','7','8','9','I','IP','L','MP','RP','T','TIP','Y'};
JOINT_NUM = 21;
SAMPLE_NUM = 1024;
sample_num_level1 = 512;
sample_num_level2 = 128;
load('msra_valid.mat');
for sub_idx = 1:length(subject_names)
mkdir([save_dir subject_names{sub_idx}]);
for ges_idx = 1:length(gesture_names)
gesture_dir = [dataset_dir subject_names{sub_idx} '/' gesture_names{ges_idx}];
depth_files = dir([gesture_dir, '/*.bin']);
% 1. read ground truth
fileID = fopen([gesture_dir '/joint.txt']);
        frame_num = fscanf(fileID,'%d',1); % read the number of frames
        A = fscanf(fileID,'%f', frame_num*21*3); % read the keypoint data of all frames
        gt_wld=reshape(A,[3,21,frame_num]); % reshape the data
gt_wld(3,:,:) = -gt_wld(3,:,:);
gt_wld=permute(gt_wld, [3 2 1]);
fclose(fileID);
% 2. get point cloud and surface normal
        save_gesture_dir = [save_dir subject_names{sub_idx} '/' gesture_names{ges_idx}]; % MATLAB path string concatenation
        mkdir(save_gesture_dir); % create the save directory
        display(save_gesture_dir); % display the variable
Point_Cloud_FPS = zeros(frame_num,SAMPLE_NUM,6);
Volume_rotate = zeros(frame_num,3,3);
Volume_length = zeros(frame_num,1);
Volume_offset = zeros(frame_num,3);
Volume_GT_XYZ = zeros(frame_num,JOINT_NUM,3);
valid = msra_valid{sub_idx, ges_idx};
for frm_idx = 1:length(depth_files)
            if ~valid(frm_idx) % the valid array marks whether this frame is usable
continue;
end
%% 2.1 read binary file
            fileID = fopen([gesture_dir '/' num2str(frm_idx-1,'%06d'), '_depth.bin']); % num2str(id,'%06d') gives the file name format
img_width = fread(fileID,1,'int32');
img_height = fread(fileID,1,'int32');
bb_left = fread(fileID,1,'int32');
bb_top = fread(fileID,1,'int32');
bb_right = fread(fileID,1,'int32');
bb_bottom = fread(fileID,1,'int32');
bb_width = bb_right - bb_left;
bb_height = bb_bottom - bb_top;
valid_pixel_num = bb_width*bb_height;
            hand_depth = fread(fileID,[bb_width, bb_height],'float32'); % read the valid depth values of the hand region
hand_depth = hand_depth';
fclose(fileID);
%% 2.2 convert depth to xyz
fFocal_MSRA_ = 241.42; % mm
hand_3d = zeros(valid_pixel_num,3);
for ii=1:bb_height
for jj=1:bb_width
                    idx = (jj-1)*bb_height+ii; % index of each pixel in the hand-region depth map, column-major
hand_3d(idx, 1) = -(img_width/2 - (jj+bb_left-1))*hand_depth(ii,jj)/fFocal_MSRA_;
hand_3d(idx, 2) = (img_height/2 - (ii+bb_top-1))*hand_depth(ii,jj)/fFocal_MSRA_;
                    hand_3d(idx, 3) = hand_depth(ii,jj); % depth value; question: should the true z satisfy x*x+y*y+z*z=d*d ??
end
end
valid_idx = 1:valid_pixel_num;
valid_idx = valid_idx(hand_3d(:,1)~=0 | hand_3d(:,2)~=0 | hand_3d(:,3)~=0);
            hand_points = hand_3d(valid_idx,:); % filter out invalid points
jnt_xyz = squeeze(gt_wld(frm_idx,:,:));
%% 2.3 create OBB
            [coeff,score,latent] = pca(hand_points); % coeff = pca(X) returns the principal component coefficients (loadings) of the n-by-p data matrix X; rows of X are observations, columns are variables.
            % The coefficient matrix is p-by-p. Each column of coeff contains the coefficients of one principal component, and the columns are sorted in descending order of component variance. By default, pca centers the data and uses the singular value decomposition (SVD) algorithm.
if coeff(2,1)<0
coeff(:,1) = -coeff(:,1);
end
if coeff(3,3)<0
coeff(:,3) = -coeff(:,3);
end
            coeff(:,2)=cross(coeff(:,3),coeff(:,1)); % the purpose of these few steps is not entirely clear?
ptCloud = pointCloud(hand_points);
            hand_points_rotate = hand_points*coeff; % a normalization-like step that makes the bounding-box orientations roughly consistent
            %% 2.4 sampling % when there are too few points, existing points are simply reused; unclear whether they could be used directly
if size(hand_points,1)<SAMPLE_NUM
tmp = floor(SAMPLE_NUM/size(hand_points,1));
rand_ind = [];
for tmp_i = 1:tmp
rand_ind = [rand_ind 1:size(hand_points,1)];
end
                rand_ind = [rand_ind randperm(size(hand_points,1), mod(SAMPLE_NUM, size(hand_points,1)))]; % randperm returns a row vector of k unique integers randomly chosen between 1 and size(hand_points,1)
else
rand_ind = randperm(size(hand_points,1),SAMPLE_NUM);
end
hand_points_sampled = hand_points(rand_ind,:);
hand_points_rotate_sampled = hand_points_rotate(rand_ind,:);
%% 2.5 compute surface normal
normal_k = 30;
normals = pcnormals(ptCloud, normal_k);
normals_sampled = normals(rand_ind,:);
sensorCenter = [0 0 0];
for k = 1 : SAMPLE_NUM
p1 = sensorCenter - hand_points_sampled(k,:);
% Flip the normal vector if it is not pointing towards the sensor.
angle = atan2(norm(cross(p1,normals_sampled(k,:))),p1*normals_sampled(k,:)');
if angle > pi/2 || angle < -pi/2
normals_sampled(k,:) = -normals_sampled(k,:);
end
end
normals_sampled_rotate = normals_sampled*coeff;
            %% 2.6 Normalize Point Cloud % scale using each axis's extreme values multiplied by a scale factor
x_min_max = [min(hand_points_rotate(:,1)), max(hand_points_rotate(:,1))];
y_min_max = [min(hand_points_rotate(:,2)), max(hand_points_rotate(:,2))];
z_min_max = [min(hand_points_rotate(:,3)), max(hand_points_rotate(:,3))];
scale = 1.2;
bb3d_x_len = scale*(x_min_max(2)-x_min_max(1));
bb3d_y_len = scale*(y_min_max(2)-y_min_max(1));
bb3d_z_len = scale*(z_min_max(2)-z_min_max(1));
max_bb3d_len = bb3d_x_len;
hand_points_normalized_sampled = hand_points_rotate_sampled/max_bb3d_len;
if size(hand_points,1)<SAMPLE_NUM
offset = mean(hand_points_rotate)/max_bb3d_len;
else
offset = mean(hand_points_normalized_sampled);
end
hand_points_normalized_sampled = hand_points_normalized_sampled - repmat(offset,SAMPLE_NUM,1);
%% 2.7 FPS Sampling
pc = [hand_points_normalized_sampled normals_sampled_rotate];
% 1st level
sampled_idx_l1 = farthest_point_sampling_fast(hand_points_normalized_sampled, sample_num_level1)';
other_idx = setdiff(1:SAMPLE_NUM, sampled_idx_l1);
new_idx = [sampled_idx_l1 other_idx];
pc = pc(new_idx,:);
% 2nd level
sampled_idx_l2 = farthest_point_sampling_fast(pc(1:sample_num_level1,1:3), sample_num_level2)';
other_idx = setdiff(1:sample_num_level1, sampled_idx_l2);
new_idx = [sampled_idx_l2 other_idx];
pc(1:sample_num_level1,:) = pc(new_idx,:);
%% 2.8 ground truth
jnt_xyz_normalized = (jnt_xyz*coeff)/max_bb3d_len;
jnt_xyz_normalized = jnt_xyz_normalized - repmat(offset,JOINT_NUM,1);
Point_Cloud_FPS(frm_idx,:,:) = pc;
Volume_rotate(frm_idx,:,:) = coeff;
Volume_length(frm_idx) = max_bb3d_len;
Volume_offset(frm_idx,:) = offset;
Volume_GT_XYZ(frm_idx,:,:) = jnt_xyz_normalized;
end
% 3. save files
save([save_gesture_dir '/Point_Cloud_FPS.mat'],'Point_Cloud_FPS');
save([save_gesture_dir '/Volume_rotate.mat'],'Volume_rotate');
save([save_gesture_dir '/Volume_length.mat'],'Volume_length');
save([save_gesture_dir '/Volume_offset.mat'],'Volume_offset');
save([save_gesture_dir '/Volume_GT_XYZ.mat'],'Volume_GT_XYZ');
save([save_gesture_dir '/valid.mat'],'valid');
end
end
nstates_plus_1 = [64,64,128]
nstates_plus_2 = [128,128,256]
nstates_plus_3 = [256,512,1024,1024,512]
class PointNet_Plus(nn.Module):
def __init__(self, opt):
super(PointNet_Plus, self).__init__()
self.num_outputs = opt.PCA_SZ
self.knn_K = opt.knn_K
self.ball_radius2 = opt.ball_radius2
self.sample_num_level1 = opt.sample_num_level1
self.sample_num_level2 = opt.sample_num_level2
self.INPUT_FEATURE_NUM = opt.INPUT_FEATURE_NUM
self.netR_1 = nn.Sequential(
# B*INPUT_FEATURE_NUM*sample_num_level1*knn_K
nn.Conv2d(self.INPUT_FEATURE_NUM, nstates_plus_1[0], kernel_size=(1, 1)),
nn.BatchNorm2d(nstates_plus_1[0]),
nn.ReLU(inplace=True),
# B*64*sample_num_level1*knn_K
nn.Conv2d(nstates_plus_1[0], nstates_plus_1[1], kernel_size=(1, 1)),
nn.BatchNorm2d(nstates_plus_1[1]),
nn.ReLU(inplace=True),
# B*64*sample_num_level1*knn_K
nn.Conv2d(nstates_plus_1[1], nstates_plus_1[2], kernel_size=(1, 1)),
nn.BatchNorm2d(nstates_plus_1[2]),
nn.ReLU(inplace=True),
# B*128*sample_num_level1*knn_K
nn.MaxPool2d((1,self.knn_K),stride=1)
# B*128*sample_num_level1*1
)
self.netR_2 = nn.Sequential(
# B*131*sample_num_level2*knn_K
nn.Conv2d(3+nstates_plus_1[2], nstates_plus_2[0], kernel_size=(1, 1)),
nn.BatchNorm2d(nstates_plus_2[0]),
nn.ReLU(inplace=True),
# B*128*sample_num_level2*knn_K
nn.Conv2d(nstates_plus_2[0], nstates_plus_2[1], kernel_size=(1, 1)),
nn.BatchNorm2d(nstates_plus_2[1]),
nn.ReLU(inplace=True),
# B*128*sample_num_level2*knn_K
nn.Conv2d(nstates_plus_2[1], nstates_plus_2[2], kernel_size=(1, 1)),
nn.BatchNorm2d(nstates_plus_2[2]),
nn.ReLU(inplace=True),
# B*256*sample_num_level2*knn_K
nn.MaxPool2d((1,self.knn_K),stride=1)
# B*256*sample_num_level2*1
)
self.netR_3 = nn.Sequential(
# B*259*sample_num_level2*1
nn.Conv2d(3+nstates_plus_2[2], nstates_plus_3[0], kernel_size=(1, 1)),
nn.BatchNorm2d(nstates_plus_3[0]),
nn.ReLU(inplace=True),
# B*256*sample_num_level2*1
nn.Conv2d(nstates_plus_3[0], nstates_plus_3[1], kernel_size=(1, 1)),
nn.BatchNorm2d(nstates_plus_3[1]),
nn.ReLU(inplace=True),
# B*512*sample_num_level2*1
nn.Conv2d(nstates_plus_3[1], nstates_plus_3[2], kernel_size=(1, 1)),
nn.BatchNorm2d(nstates_plus_3[2]),
nn.ReLU(inplace=True),
# B*1024*sample_num_level2*1
nn.MaxPool2d((self.sample_num_level2,1),stride=1),
# B*1024*1*1
)
self.netR_FC = nn.Sequential(
# B*1024
nn.Linear(nstates_plus_3[2], nstates_plus_3[3]),
nn.BatchNorm1d(nstates_plus_3[3]),
nn.ReLU(inplace=True),
# B*1024
nn.Linear(nstates_plus_3[3], nstates_plus_3[4]),
nn.BatchNorm1d(nstates_plus_3[4]),
nn.ReLU(inplace=True),
# B*512
nn.Linear(nstates_plus_3[4], self.num_outputs),
# B*num_outputs
)
def forward(self, x, y):
# x: B*INPUT_FEATURE_NUM*sample_num_level1*knn_K, y: B*3*sample_num_level1*1
x = self.netR_1(x)
# B*128*sample_num_level1*1
x = torch.cat((y, x),1).squeeze(-1)
# B*(3+128)*sample_num_level1
inputs_level2, inputs_level2_center = group_points_2(x, self.sample_num_level1, self.sample_num_level2, self.knn_K, self.ball_radius2)
# B*131*sample_num_level2*knn_K, B*3*sample_num_level2*1
# B*131*sample_num_level2*knn_K
x = self.netR_2(inputs_level2)
# B*256*sample_num_level2*1
x = torch.cat((inputs_level2_center, x),1)
# B*259*sample_num_level2*1
x = self.netR_3(x)
# B*1024*1*1
x = x.view(-1,nstates_plus_3[2])
# B*1024
x = self.netR_FC(x)
# B*num_outputs
return x
level: CVPR, CCF_A
author: Pavlo Molchanov, Xiaodong Yang (NVIDIA)
date: 2016
keyword:
previous work:
Input: a video clip as a volume $C_t \in R^{k*l*c*m}$; $m$: number of sequential frames; $c$: channels of size $k*l$ pixels.
$h_t \in R^d$: a hidden state vector;
$W_{in} \in R^{d*q}$, $W_h \in R^{d*d}$, $W_s \in R^{w*d}$: weight matrices;
$b \in R^w$: bias;
$S$: softmax function, $R^w \rightarrow R^w_{[0,1]}$, where $[S(x)]_i = e^{x_i}/\sum_k e^{x_k}$
$F: R^{k*l*c*m} \rightarrow R^q$, where $f_t = F(C_t)$;
$h_t = R(W_{in} f_t + W_h h_{t-1})$;
$s_t = S(W_s h_t + b)$
For a video $V$ of $T$ clips, get the set of probabilities $S$:
$S = \{s_0, s_1, \dots, s_{T-1}\}$
$s^{avg} = \frac{1}{T}\sum_{s \in S} s$
predicted label: $y = \arg\max_i([s^{avg}]_i)$
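A toy PyTorch sketch of the recurrent classifier defined above: clip features $f_t = F(C_t)$ (the 3D-CNN $F$ is stubbed as a linear layer), hidden state $h_t = R(W_{in} f_t + W_h h_{t-1})$ with ReLU as $R$, per-clip softmax $s_t$, and averaging over the $T$ clips; all dimensions are placeholders.

```python
import torch
import torch.nn as nn

class ClipRNNClassifier(nn.Module):
    def __init__(self, feat_dim=512, hidden=256, n_classes=25):
        super().__init__()
        self.F = nn.Linear(8 * 8 * 3 * 4, feat_dim)      # stand-in for the 3D-CNN F
        self.W_in = nn.Linear(feat_dim, hidden, bias=False)
        self.W_h = nn.Linear(hidden, hidden, bias=False)
        self.W_s = nn.Linear(hidden, n_classes)          # includes the bias b

    def forward(self, clips):                            # clips: (T, k, l, c, m)
        T = clips.shape[0]
        h = torch.zeros(self.W_h.in_features)
        scores = []
        for t in range(T):
            f_t = self.F(clips[t].reshape(-1))           # f_t = F(C_t)
            h = torch.relu(self.W_in(f_t) + self.W_h(h)) # h_t = R(W_in f_t + W_h h_{t-1})
            scores.append(torch.softmax(self.W_s(h), dim=-1))  # s_t = S(W_s h_t + b)
        s_avg = torch.stack(scores).mean(dim=0)          # average over the T clips
        return s_avg.argmax(), s_avg                     # predicted label y and s^avg

clips = torch.rand(6, 8, 8, 3, 4)                        # T=6 clips with k=l=8, c=3, m=4
y, s_avg = ClipRNNClassifier()(clips)
print(int(y), s_avg.shape)
```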
【Pre-training the 3D-CNN】
【Cost Function】
$L_v = -\frac{1}{P}\sum_{i=0}^{P-1}\log(p(y_i|V_i))$, where $p(y_i|V_i) = [s^{avg}]_{y_i}$
【Learning Rule】
$\theta_i = \theta_{i-1} + v_i - \gamma\lambda\theta_{i-1}$
$v_i = \mu v_{i-1} - \lambda\langle\partial E/\partial\theta\rangle_{batch}$