呆呆的猫

[Digital Humans] 2. MODA | Audio-Driven Digital Human Generation from a Single Image Based on Facial Keypoints (ICCV 2023)

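Contents

- 1. Background
- 2. Method
  - 2.1 Problem formulation and data preprocessing
  - 2.2 Mapping-Once network with Dual Attentions
  - 2.3 Facial Composer Network
  - 2.4 Synthesizing portrait images with TPE
- 3. Results
  - 3.1 Training details
  - 3.2 Data
  - 3.3 Evaluation metrics
  - 3.4 Comparison of results
- 4. Code
  - 4.1 Data preprocessing
  - 4.2 Training
  - 4.3 Inference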

Paper: MODA: Mapping-Once Audio-driven Portrait Animation with Dual Attentions

Code: https://tinyurl.com/iccv23-moda

Venue: ICCV 2023

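Contributions:

- Proposes a unified MODA network that, in a single mapping pass, produces both the deterministic lip motion and the non-deterministic motion of the rest of the face.
- It is a dense-keypoint-based method that drives the mouth, eyes, head, and shoulders together, which gives more natural motion.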

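Comparison with representative methods:

- Wav2Lip (MM 2020): the lower half of the face comes out blurry (Wav2Lip-GFPGAN chains two models to raise the output resolution).
- PC-AVS (CVPR 2021): mostly frontal faces, with little variety in head motion.
- MakeItTalk (SIGGRAPH 2020): the face gets distorted because it relies on 2D warping.
- Audio2Head (IJCAI 2021): only produces frontal faces, and the face gets distorted because it relies on 2D warping.
- SadTalker (CVPR 2023): good lip sync, fairly sharp lips, and fairly rich head motion, but the teeth are not sharp enough. It models no facial expression beyond lip motion and blinking, so expressions look rather fixed.
- MODA (ICCV 2023): uses one model with two branches to learn keypoints for the deterministic part (lips) and the non-deterministic part (eyes, face, head, body). In theory this makes the motion look more natural and preserves more facial detail.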

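1. Background

Talking-head generation drives a portrait image with a given speech signal to synthesize a video of the person speaking in sync with the audio.

Earlier methods [7,29,52] learn the relationship between speech and image frames and usually ignore head pose, on the assumption that head pose is hard to separate from facial motion.

Many 3D face reconstruction methods and GAN-based methods estimate an intermediate representation (3D face shape, 2D landmarks, face expression parameters, and so on) to help generation.

However, these sparse representations lose a lot of facial detail and lead to over-smoothed results.

NeRF [10,44] has also attracted attention for its high-fidelity results, but it is hard to control.

Despite all these methods, generating a realistic and expressive talking video is still hard: people are very sensitive to flaws in synthesized video, so the bar for a usable result is high.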

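The main requirements are:

- Correctness: the synthesized video must be tightly consistent with the driving audio.
- High visual quality: the synthesized video must be high-resolution and rich in detail.
- Diversity: while speaking, it is mainly the lips that need to be tightly synchronized with the sound. Blinking and head motion are non-deterministic, but they still need to resemble how a real person moves when talking.

To meet these three goals, some earlier methods learn mouth landmarks and head pose separately with different sub-networks [22,50], while others model only the mouth motion and borrow the head pose from another video [29,52]. Either way, lip sync and the other motions lack correlation, which leads to uncoordinated results.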

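In this paper the authors propose MODA, a mapping-once network with dual attentions: a unified architecture that generates the different representations in one pass and simplifies the pipeline.

To tie lip motion to the other motions, the authors design a dual-attention module that separately learns a deterministic mapping (the definite mouth movement) and a probabilistic sampling (the diverse head pose and eye blinking from time to time).

- Transformer-based dual-attention module: generates accurate and diverse feature representations.
- Facial composer network: produces more accurate and detailed facial landmarks.
- Temporally guided renderer: synthesizes the video.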

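2. Method

The overall framework is shown in Figure 2. The goal is to generate a high-fidelity talking head with deterministic lip motion plus other multi-modal motions (head pose, eye blinking, torso movements).

It has three parts:

- First, given the driving audio and the conditioned subject, MODA generates multi-modal and correct semantic portrait components.
- Next, the facial composer network combines the facial components and adds fine facial detail.
- Finally, a portrait renderer with temporally positional embedding (TPE) synthesizes a high-fidelity, stable video.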

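2.1 Problem formulation and data preprocessing

The input is an audio sequence of length $T$, $A=\{a_0, a_1,...,a_T\}$, sampled at rate $r$.

The main goal of the talking-portrait method is to map this audio to the corresponding video clip $V=\{I_0,I_1,...,I_K\}$ at $f$ FPS, where $K=\lfloor{fT/r}\rfloor$.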
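The frame count $K=\lfloor{fT/r}\rfloor$ is easy to check numerically. A minimal sketch: the function name is mine, and the 16 kHz sample rate is only an assumed example value (25 fps is the rate the dataset is converted to later in this post).

```python
def num_video_frames(num_audio_samples, sample_rate, fps):
    """K = floor(f * T / r), using integer arithmetic to avoid rounding error."""
    return (fps * num_audio_samples) // sample_rate

# 10 s of audio at an assumed 16 kHz, rendered at 25 fps
print(num_video_frames(160_000, 16_000, 25))  # 250
```

Multiplying before dividing keeps the result exact even when the sample rate is not a multiple of the frame rate.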

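Because $V$ is far higher-dimensional than $A$, many methods generate $V$ step by step and introduce intermediate representations $R$. For $V$ to look natural, the constraints placed on $R$ matter a great deal.

In earlier audio-driven face generation work, $R$ is usually some kind of face information (facial landmarks, head pose, and so on).

To describe a talking portrait better, the authors define $R$ as a set of several portrait descriptors, $R=\{P^M, P^E, P^F, H, P^T\}$:

- Mouth keypoints $P^M$: 40 points.
- Eye keypoints $P^E$: 60 points, covering the eyes and eyebrows. They control blinking.
- Facial keypoints $P^F$: 478 points, dense 3D facial keypoints that control the details of facial expression.
- Head motion $H$: 6 values, head rotation $(\theta, \phi, \psi)$ and head translation $(x, y, z)$.
- Torso points $P^T$: 18 points, 9 per shoulder.

The whole talking-portrait task can therefore be written as A→R→V, and the authors design a dedicated network for each stage.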

Data preprocessing: keypoint extraction

- Use Mediapipe to extract 478 3D facial keypoints.
- Use WHENet to estimate head pose.
- Use BiseNet for segmentation, then extract the shoulder keypoints.

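2.2 Mapping-Once network with Dual Attentions

Mapping-once architecture (Figure 3):

Given the driving audio A and the subject condition S, MODA projects them onto R (lip movement, eye blinking, head pose, and torso) in a single forward pass.
- Step 1: two encoders encode the audio feature and extract the subject style, respectively.
- Step 2: a dual-attention module generates motion features that are both diverse and deterministic.
- Step 3: four decoders produce the corresponding keypoints.

Audio feature processing:

- Audio feature extraction: Wav2Vec [30] first extracts contextual speech information, and an MLP then maps it to $s_a \in R^{d \times T}$, where $d$ is the feature dimension of one frame and $T$ is the number of video frames to generate.
- To model different speaking styles, the authors take the facial vertices of the conditioned subject as input and map them, again with an MLP, to a $d$-dimensional vector $v_s$ that serves as the subject style code. $s_a$ and $v_s$ are then combined into a fused feature $s$.
- The dual-attention module takes $s$ and $s_a$ as input and outputs the temporal context $s_t$.
- Four MLPs then decode the different keypoints.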

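Dual-attention module:

- Specific attention branch: SpecAttn
- Probabilistic attention branch: ProbAttn

Talking-portrait generation has to produce multi-modal outputs from limited driving information, so the task carries a lot of uncertainty.

The proposed dual-attention model decouples it into two sub-tasks:

- Specific mapping: produces deterministic, temporally aligned features between audio and lip movement.
- Probabilistic mapping: produces probabilistic, temporally correlated features between audio and the other movements.

The authors learn the two kinds of features with two sub-modules and then aggregate them with time-wise concatenation.

The two branches, SpecAttn and ProbAttn, are covered in turn.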

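1. SpecAttn branch: the specific attention branch captures the temporally aligned attention $s_{sa}$ between $s$ and the audio feature $s_a$. Following FaceFormer, SpecAttn is formulated as follows:

$d$ is the dimension of $s_a$.
The alignment bias $M_A$ is defined as follows:

Unlike FaceFormer, which uses cross-attention only inside autoregressive decoding, this method applies cross-attention over the whole sequence at once, which speeds up computation by a factor of $T$.

To capture richer temporal information, the authors also apply periodic positional encoding (PPE) and biased causal self-attention to $s$:

$M_T$ is a matrix whose upper-triangular region is negative infinity, which prevents the prediction for the current frame from seeing future frames.

$q$ is a hyperparameter that controls the period of the sequence.
As a result, the encoded feature $s'$ carries richer spatio-temporal information and supports more accurate generation.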
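The causal mask $M_T$ and the periodic encoding can be sketched in a few lines. This is an illustration under assumptions, not the MODA code: the PPE here follows the sinusoidal form published with FaceFormer, applied to `t mod q`, and the function names are mine.

```python
import math

def causal_mask(T):
    """T x T additive mask: 0 on and below the diagonal, -inf above it."""
    return [[0.0 if j <= i else -math.inf for j in range(T)] for i in range(T)]

def periodic_positional_encoding(T, d, q):
    """Sinusoidal encoding of (t mod q), so positions repeat every q frames."""
    table = []
    for t in range(T):
        phase = t % q
        row = []
        for k in range(d):
            angle = phase / (10000 ** (2 * (k // 2) / d))
            # Even channels use sine, odd channels use cosine.
            row.append(math.sin(angle) if k % 2 == 0 else math.cos(angle))
        table.append(row)
    return table

mask = causal_mask(4)
ppe = periodic_positional_encoding(T=8, d=4, q=3)
```

Adding `mask` to the attention logits before the softmax gives zero weight to every future frame, and `ppe[t]` equals `ppe[t + q]` by construction.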

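2. ProbAttn branch

To generate more realistic results and avoid over-smoothing, it is important to learn a probabilistic mapping between audio features and portrait motion. A VAE [17] can model probabilistic generation and performs well on temporal generation tasks.

So, building on an advanced transformer Variational Autoencoder (t-VAE), the authors design the probabilistic attention branch to generate more diverse results.

Given the feature representation $s$, probabilistic attention aims to generate a more diverse feature $s_{pa}$:

- First, $s$ is fed to the encoder (Enc), which learns $\mu$ and $\theta$ to model $s$.
- Then the decoder (Dec) generates multi-modal outputs by resampling.

$\Phi$: an MLP.
$U(\mu, \theta)$: a Gaussian distribution.

To let ProbAttn learn richer styles, a KL-divergence loss constrains the t-VAE features:

$d_l$: the dimension of $\mu$.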
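For readers who want to see the resampling step and the KL term concretely, here is the textbook VAE version. It is a sketch under assumptions: I parameterize the Gaussian by a log-variance vector (the post only names $\mu$ and $\theta$), and whether the paper sums or averages over the $d_l$ dimensions is not stated here.

```python
import math
import random

def sample_latent(mu, log_var, rng=random):
    """Reparameterized sample: mu + sigma * eps with eps ~ N(0, 1)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mu, log_var)]

def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, sigma^2) || N(0, 1) ), summed over the latent dimensions."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv) for m, lv in zip(mu, log_var))

mu, log_var = [0.5, -0.2], [0.0, -1.0]
z = sample_latent(mu, log_var, random.Random(0))
kl = kl_to_standard_normal(mu, log_var)
```

Each call to `sample_latent` gives a different `z` for the same audio feature, which is where the diversity of head pose and blinking comes from, while the KL term keeps the learned distribution close to a standard normal.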

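3. Merging the outputs of the two attention branches

Loss function:

MODA has four decoders, each generating the motion coefficients of a different body part.

So the authors use a multi-task learning scheme that minimizes the corresponding $L_1$ distances:

Adding the KL loss: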

2.3 Facial Composer Network

The facial composer network (FaCo-Net) takes the subject information $S$, the mouth points $P^M$, and the eye points $P^E$ as input.

The goal of FaCo-Net is to synthesize finer facial landmarks $P^F$:

FaCo-Net architecture:

Three encoders encode the three kinds of features separately:
- Subject encoder: maps the facial points $S$ to a style code $p_f$.
- $P^M$ encoder: maps $P^M$ to $p_m$, in the same space as $p_f$.
- $P^E$ encoder: maps $P^E$ to $p_e$, in the same space as $p_f$.
One decoder generates the facial landmarks:
- FaCo-Net acts as the generator: it produces dense facial points that "look realistic".

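The generator loss is as follows:

$L_{GAN}$ is the adversarial loss, with $\hat{z}=D(P^F)$.
$\lambda$: 10

Discriminator D: uses a GAN as the discriminator backbone to decide whether the facial points are real or generated.

The adversarial loss used to optimize D is the LSGAN loss:

- $z$: the discriminator output when the input is the ground-truth face points.
- $\hat{z}$: the discriminator output when the input is the generated face points.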
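As a reminder of what the LSGAN objective looks like, here is a plain-Python sketch of the standard least-squares form, with $z$ pushed toward 1 and $\hat{z}$ toward 0. The 0.5 factors and the averaging are the usual convention and an assumption on my part; the paper's exact weighting may differ.

```python
def lsgan_d_loss(z_real, z_fake):
    """Discriminator: push real scores toward 1 and fake scores toward 0."""
    real_term = sum((z - 1.0) ** 2 for z in z_real) / len(z_real)
    fake_term = sum(z ** 2 for z in z_fake) / len(z_fake)
    return 0.5 * (real_term + fake_term)

def lsgan_g_loss(z_fake):
    """Generator: push the scores of generated samples toward 1."""
    return 0.5 * sum((z - 1.0) ** 2 for z in z_fake) / len(z_fake)
```

A perfect discriminator (`z_real` all 1, `z_fake` all 0) has zero loss, and the generator's loss reaches zero only when its samples are scored as real.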

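Once the facial landmarks $P^F$ are generated, they are transformed into camera coordinates according to the head pose.

The torso points and the transformed facial landmarks are then projected to image space for photorealistic rendering.

2.4 Synthesizing portrait images with TPE

The last step is to render the portrait from the outputs above, as in Figure 2.

The authors use a U-Net-like renderer $G_R$ with TPE to generate a high-fidelity, stable video.

TPE:

$G_R$ then renders the result $I_t$ for frame $t$:

- $I_t^c$: the condition image at frame index $t$.
- $I_r$: the reference image.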

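3. Results

3.1 Training details

Training details:

- Hyperparameters $(\beta_1, \beta_2)=(0.9,0.99)$
- Learning rate: $10^{-4}$
- On a single 3090 GPU, the three parts take (30, 2, 6) hours, with (200, 300, 100) epochs and batch sizes (32, 32, 4).
- At test time, the model with the lowest validation loss is used.
- A sliding window handles videos of arbitrary length (window size 300, stride 150).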
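The sliding-window setting can be made concrete by listing the windows for a given sequence length. A minimal sketch, assuming the window and stride are counted in frames and that the final window is shifted back so it ends exactly at the last frame. How the overlapping predictions are merged is not described in this post, so the sketch only enumerates the spans.

```python
def sliding_windows(num_frames, window=300, stride=150):
    """Return (start, end) spans that cover frames [0, num_frames)."""
    if num_frames <= window:
        return [(0, num_frames)]
    spans = []
    start = 0
    while start + window < num_frames:
        spans.append((start, start + window))
        start += stride
    # Final window is aligned to the end of the sequence.
    spans.append((num_frames - window, num_frames))
    return spans

print(sliding_windows(750))
```

With a stride of half the window, every interior frame falls inside two windows, which leaves room for blending at the seams.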

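3.2 Data

The authors use the HDTF and LSP datasets. Videos are about 1 to 5 minutes long and are converted to 25 fps.

The authors randomly pick 80% of the videos for training and use the rest for testing, which gives 132 training videos and 32 test videos.

All videos are centered on the face and resized to 512x512.

Data preprocessing:

- First, Mediapipe extracts 478 3D facial landmarks from every video.
- Then an open-source method estimates the head pose H, and the 3D facial landmarks are projected to canonical space according to that head pose.
- Finally, a face parsing method estimates the torso boundary from the segmentation result.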

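3.3 Evaluation metrics

- LMD: mouth landmark distance. Measures how accurate the lips are in the generated video.
- LMD-v: velocity of mouth landmark distance. Measures how accurate the lip motion is in the generated video.
- MA: the IoU between the predicted mouth area and the ground-truth mouth area.
- Confidence score from SyncNet: measures audio-video synchronization.
- Natural Image Quality Evaluator (NIQE): measures image quality and captures fine image detail.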
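Two of these metrics are simple enough to sketch directly. This is an illustration under assumptions: landmarks are plain (x, y) tuples, mouth areas are given as sets of pixel coordinates, and no normalization (by face size, for example) is applied, which the paper's evaluation code may well do differently.

```python
import math

def lmd(pred, gt):
    """Mean Euclidean distance between matching mouth landmarks.

    pred and gt are lists of frames; each frame is a list of (x, y) points.
    """
    total, count = 0.0, 0
    for frame_p, frame_g in zip(pred, gt):
        for p, g in zip(frame_p, frame_g):
            total += math.dist(p, g)
            count += 1
    return total / count

def mask_iou(mask_a, mask_b):
    """IoU of two regions given as sets of (row, col) pixel coordinates."""
    union = mask_a | mask_b
    return len(mask_a & mask_b) / len(union) if union else 1.0
```

Lower LMD is better, while a higher mouth-area IoU is better.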

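3.4 Comparison of results

Quantitative comparison with SOTA methods:

User study:

Ablation studies:

Ablation of dual-attention:

- Replacing dual-attention with an LSTM: the LSTM cannot produce multi-modal results, and the diversity score drops to 0.
- Removing the specific attention branch: without it, the lip motion generated by MODA is over-smoothed.
- Removing both attention branches.

FaCo-Net ablation: this module is meant to generate natural and consistent representations for the renderer.

The authors remove the module and use the dense facial landmarks directly in place of the eye and mouth landmarks. Figure 6a shows the result without FaCo-Net: the lip region is not connected properly, and some facial detail is lost.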

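TPE ablation

The authors measure frame-wise consistency with a temporal consistency metric (TCM):

- $O_t$: frame $t$ of the reference video $O$.
- $V_t$: frame $t$ of the generated video $V$.
- warp(.): the warping function based on optical flow.

Figure 6b compares results with and without TPE. With TPE the output video is more stable.

Limitations of the method:

- It does not generalize well to different target subjects or to out-of-domain audio.
- For a new subject, the rendering model has to be retrained.

Training and inference time on a single 3090: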

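4. Code

4.1 Data preprocessing

git clone https://github.com/DreamtaleCore/MODA.git

1. Set up the environment

The official installation instructions did not work for me, so I installed the packages one by one with conda commands.

2. Download the HDTF data

So far I have only found the HDTF data:

A download tool for HDTF is on GitHub: https://github.com/universome/HDTF

- To download: python download.py --output_dir /path/to/output/dir --num_workers 8
- Note: you need a proxy that can reach YouTube, and ffmpeg and youtube-dl must be installed, otherwise the script fails. The cause of a failure can be found in the log under the download directory.
- Note: change line 168 of download.py to video_selection = f"best[ext={video_format}]". Without this change the downloaded videos have no audio.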

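3. Process the data

The data processing code is under MODA/data_prepare/:

Step 1: build the 3DDFA-V2 environment:

cd 3DDFA-V2
bash build.sh
cd ..

The 3DDFA-V2 copy bundled with MODA would not build for me. I cloned a fresh copy of 3DDFA_V2 and built that instead:

 sh ./build.sh

Step 2: download the face-parsing model and put it in face-parsing/res/cp

Step 3: run the processing script:

python process.py -i your/video/dir -o your/output/dir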

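Error 1: the path on line 42 of step0 is not writable, so the program cannot write to it while running. Change it to a directory you have write permission for.

Error 2: unrecognized option 'crf'

This typically happens when ffmpeg needs libx264 but the ffmpeg binary was built without the libx264 option. The component is not compiled in by default, hence the error.

You can install ffmpeg with apt:

sudo apt install ffmpeg    # install ffmpeg via apt

Or try fixing it like this:

conda install x264

conda install x264 ffmpeg -c conda-forge

Neither worked for me, so I simply dropped the -crf option, haha.

Change line 51 of step0 as follows:

# cvt_wav_cmd = 'ffmpeg -i ' + vfp + f' -vf scale={args.target_h}:{args.target_w} -crf 2 ' + args.out_video_fp + ' -y' # cannot handle the crf option
cvt_wav_cmd = 'ffmpeg -i ' + vfp + f' -vf scale={args.target_h}:{args.target_w} '+ args.out_video_fp + ' -y' # note the space after {args.target_w}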
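The missing-space pitfall goes away if the command is built as an argument list instead of one concatenated string. A hedged sketch: `build_resize_cmd` is a hypothetical helper, not part of MODA, and the file names are placeholders. Note also that ffmpeg's scale filter expects width:height, which only matters when the target is not square.

```python
import shlex

def build_resize_cmd(src, dst, width, height, crf=None):
    """Build an ffmpeg resize command as an argument list (no manual spacing)."""
    cmd = ["ffmpeg", "-i", src, "-vf", f"scale={width}:{height}"]
    if crf is not None:
        # Only pass -crf when the local ffmpeg build accepts it.
        cmd += ["-crf", str(crf)]
    return cmd + [dst, "-y"]

cmd = build_resize_cmd("in.mp4", "out.mp4", 512, 512)
print(shlex.join(cmd))
# To execute: subprocess.run(cmd, check=True)
```

Passing the list to `subprocess.run` also avoids problems with spaces in file paths.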

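Error 3: no module named 'FaceBoxes'

I changed the import to an absolute path for now, which fixed it.

Error 4: viz_pose2 cannot be found

I was using the upstream 3DDFA_V2 source, which does not have this function, so I copied it over from the MODA copy and that fixed it.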

Error 5:

Could not find a backend to open `/mnt/cpfs/dataset/tuxiangzu/Face_Group/WM/project/MODA/HDTF_PROCESS/RD_Radio11_000/video.mp4`` with iomode `r?`

 python -m pip install imageio[ffmpeg]
 python -m pip install imageio[pyav]

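Error 6:

3DDFA-V2/config/mb1_120x120.yml cannot be found in step2. The author wrote the directory name with a hyphen rather than an underscore, which took me a long time to notice. The copy I use is named 3DDFA_V2, so remember to change it.

Error 7: onnxruntime.InferenceSession raises an error

Add the parameter that the error message asks for: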
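The exact message is not reproduced in this post. If it is the common complaint from recent onnxruntime releases that `providers` must be set explicitly, a sketch of the fix looks like this. `choose_providers` is a hypothetical helper and `model_path` is a placeholder, while `get_available_providers` and the `providers` argument are part of onnxruntime itself.

```python
def choose_providers(available, preferred=("CUDAExecutionProvider", "CPUExecutionProvider")):
    """Return the preferred execution providers that this onnxruntime build offers."""
    chosen = [p for p in preferred if p in available]
    if not chosen:
        raise RuntimeError(f"none of {preferred} is available in {available}")
    return chosen

# Typical use (requires onnxruntime; model_path is a placeholder):
#   import onnxruntime
#   providers = choose_providers(onnxruntime.get_available_providers())
#   session = onnxruntime.InferenceSession(model_path, providers=providers)
print(choose_providers(["TensorrtExecutionProvider", "CUDAExecutionProvider", "CPUExecutionProvider"]))
```

Listing the CPU provider last gives a fallback when the GPU provider cannot be initialized.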

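Error 8: the paths written in the config cannot be found: No such file or directory: 'weights/mb1_120x120.pth', No such file or directory: 'configs/bfm_noneck_v3.pkl'

I am not sure whether this is a build issue or something else, but none of the relative paths work, so for now I changed every path in mb1_120x120.yml to an absolute path.

Error 9: module 'numpy' has no attribute 'long'. Change it to np.longlong

numpy.long was deprecated in numpy 1.20 and removed in numpy 1.24. numpy.longlong is a possible replacement.

Error 10: AttributeError: module 'numpy' has no attribute 'int'.

Change it to np.int_, then rebuild with sh ./build.sh

Error 11: ModuleNotFoundError: No module named 'RobustVideoMatting'

Error 12: this one is really a warning, but it is still best to fix it by adding the n_init parameter in step5:

After that everything ran happily. Most of my problems really came down to relative paths not being found.

Processing the 167 HDTF videos is estimated to take one or two days with 8 threads.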

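Error during training: shoulder-billboard.npy is missing

Looking at the data processing pipeline, step6 is never run, so shoulder-billboard.npy is never generated from shoulder.npy, and audio2repr_dataset.py cannot find the file at training time.

The author's code differs a little from the described logic here: it does not generate a separate shoulder.npy but writes the data into feature.npz, which can be accessed as follows. So you can add step6 after step5 and set force_update=False in process.py, which skips any step whose output files already exist. That way only step6 runs, and it generates the missing shoulder-billboard.npy.

process.py

step6.py

Comment out line 62 and add line 64

Some of the downloaded videos are corrupted or empty and need to be deleted:

WDA_MaggieHassan_000.mp4
WRA_PeterKing_000.mp4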
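4.2 Training

First, create your own train.txt and val.txt

The authors say the split is random, and the code does not show how it was chosen, so I also just picked a random subset for now: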

import os
import random

# Paths are relative to the current working directory.
datapath = 'MODA/assets/dataset/HDTF/HDTF_PROCESS'
dir_list = os.listdir(datapath)
# Randomly hold out 32 clips for validation; the rest go to training.
val_indices = set(random.sample(range(len(dir_list)), 32))
with open('assets/dataset/HDTF/train.txt', 'w') as f_train, \
        open('assets/dataset/HDTF/val.txt', 'w') as f_val:
    for i, name in enumerate(dir_list):
        target = f_val if i in val_indices else f_train
        target.write('HDTF_PROCESS/' + name + '\n')

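The resulting txt files contain paths like these:

Error 1: Expected more than 1 value per channel when training, got input size [1,128]

The likely cause is that the last batch ends up with a single sample, so dropping the last batch fixes it.

Enable drop_last=True for self.dataloader in MODA/dataset/__init__.py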
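Why the last batch can end up with one sample is plain arithmetic. BatchNorm1d in training mode needs more than one value per channel, and the MLP heads in this model contain BatchNorm1d(128). A small sketch, where the helper name and the sample count 129 are made up for illustration:

```python
def batch_sizes(num_samples, batch_size, drop_last=False):
    """Sizes of the batches a sequential loader would yield."""
    full, rest = divmod(num_samples, batch_size)
    sizes = [batch_size] * full
    if rest and not drop_last:
        sizes.append(rest)
    return sizes

print(batch_sizes(129, 32))                  # last batch has a single sample
print(batch_sizes(129, 32, drop_last=True))  # the 1-sample batch is discarded
```

Whenever the number of training samples modulo the batch size equals 1, the error above appears at the end of an epoch unless drop_last is enabled.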

Model structure:

model [MODAModel] was created
---------- Networks initialized -------------
[Network MODA] Total number of parameters : 96.718 M
-----------------------------------------------
---------- Networks initialized -------------
DataParallel(
  (module): MODANet(
    (audio_encoder): Wav2Vec2Model(
      (feature_extractor): Wav2Vec2FeatureEncoder(
        (conv_layers): ModuleList(
          (0): Wav2Vec2GroupNormConvLayer(
            (conv): Conv1d(1, 512, kernel_size=(10,), stride=(5,), bias=False)
            (activation): GELUActivation()
            (layer_norm): GroupNorm(512, 512, eps=1e-05, affine=True)
          )
          (1): Wav2Vec2NoLayerNormConvLayer(
            (conv): Conv1d(512, 512, kernel_size=(3,), stride=(2,), bias=False)
            (activation): GELUActivation()
          )
          (2): Wav2Vec2NoLayerNormConvLayer(
            (conv): Conv1d(512, 512, kernel_size=(3,), stride=(2,), bias=False)
            (activation): GELUActivation()
          )
          (3): Wav2Vec2NoLayerNormConvLayer(
            (conv): Conv1d(512, 512, kernel_size=(3,), stride=(2,), bias=False)
            (activation): GELUActivation()
          )
          (4): Wav2Vec2NoLayerNormConvLayer(
            (conv): Conv1d(512, 512, kernel_size=(3,), stride=(2,), bias=False)
            (activation): GELUActivation()
          )
          (5): Wav2Vec2NoLayerNormConvLayer(
            (conv): Conv1d(512, 512, kernel_size=(2,), stride=(2,), bias=False)
            (activation): GELUActivation()
          )
          (6): Wav2Vec2NoLayerNormConvLayer(
            (conv): Conv1d(512, 512, kernel_size=(2,), stride=(2,), bias=False)
            (activation): GELUActivation()
          )
        )
      )
      (feature_projection): Wav2Vec2FeatureProjection(
        (layer_norm): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
        (projection): Linear(in_features=512, out_features=768, bias=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): Wav2Vec2Encoder(
        (pos_conv_embed): Wav2Vec2PositionalConvEmbedding(
          (conv): Conv1d(768, 768, kernel_size=(128,), stride=(1,), padding=(64,), groups=16)
          (padding): Wav2Vec2SamePadLayer()
          (activation): GELUActivation()
        )
        (layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
        (layers): ModuleList(
          (0): Wav2Vec2EncoderLayer(
            (attention): Wav2Vec2Attention(
              (k_proj): Linear(in_features=768, out_features=768, bias=True)
              (v_proj): Linear(in_features=768, out_features=768, bias=True)
              (q_proj): Linear(in_features=768, out_features=768, bias=True)
              (out_proj): Linear(in_features=768, out_features=768, bias=True)
            )
            (dropout): Dropout(p=0.1, inplace=False)
            (layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
            (feed_forward): Wav2Vec2FeedForward(
              (intermediate_dropout): Dropout(p=0.1, inplace=False)
              (intermediate_dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
              (output_dense): Linear(in_features=3072, out_features=768, bias=True)
              (output_dropout): Dropout(p=0.1, inplace=False)
            )
            (final_layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
          )
          (1): Wav2Vec2EncoderLayer(
            (attention): Wav2Vec2Attention(
              (k_proj): Linear(in_features=768, out_features=768, bias=True)
              (v_proj): Linear(in_features=768, out_features=768, bias=True)
              (q_proj): Linear(in_features=768, out_features=768, bias=True)
              (out_proj): Linear(in_features=768, out_features=768, bias=True)
            )
            (dropout): Dropout(p=0.1, inplace=False)
            (layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
            (feed_forward): Wav2Vec2FeedForward(
              (intermediate_dropout): Dropout(p=0.1, inplace=False)
              (intermediate_dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
              (output_dense): Linear(in_features=3072, out_features=768, bias=True)
              (output_dropout): Dropout(p=0.1, inplace=False)
            )
            (final_layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
          )
          (2): Wav2Vec2EncoderLayer(
            (attention): Wav2Vec2Attention(
              (k_proj): Linear(in_features=768, out_features=768, bias=True)
              (v_proj): Linear(in_features=768, out_features=768, bias=True)
              (q_proj): Linear(in_features=768, out_features=768, bias=True)
              (out_proj): Linear(in_features=768, out_features=768, bias=True)
            )
            (dropout): Dropout(p=0.1, inplace=False)
            (layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
            (feed_forward): Wav2Vec2FeedForward(
              (intermediate_dropout): Dropout(p=0.1, inplace=False)
              (intermediate_dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
              (output_dense): Linear(in_features=3072, out_features=768, bias=True)
              (output_dropout): Dropout(p=0.1, inplace=False)
            )
            (final_layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
          )
          (3): Wav2Vec2EncoderLayer(
            (attention): Wav2Vec2Attention(
              (k_proj): Linear(in_features=768, out_features=768, bias=True)
              (v_proj): Linear(in_features=768, out_features=768, bias=True)
              (q_proj): Linear(in_features=768, out_features=768, bias=True)
              (out_proj): Linear(in_features=768, out_features=768, bias=True)
            )
            (dropout): Dropout(p=0.1, inplace=False)
            (layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
            (feed_forward): Wav2Vec2FeedForward(
              (intermediate_dropout): Dropout(p=0.1, inplace=False)
              (intermediate_dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
              (output_dense): Linear(in_features=3072, out_features=768, bias=True)
              (output_dropout): Dropout(p=0.1, inplace=False)
            )
            (final_layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
          )
          (4): Wav2Vec2EncoderLayer(
            (attention): Wav2Vec2Attention(
              (k_proj): Linear(in_features=768, out_features=768, bias=True)
              (v_proj): Linear(in_features=768, out_features=768, bias=True)
              (q_proj): Linear(in_features=768, out_features=768, bias=True)
              (out_proj): Linear(in_features=768, out_features=768, bias=True)
            )
            (dropout): Dropout(p=0.1, inplace=False)
            (layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
            (feed_forward): Wav2Vec2FeedForward(
              (intermediate_dropout): Dropout(p=0.1, inplace=False)
              (intermediate_dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
              (output_dense): Linear(in_features=3072, out_features=768, bias=True)
              (output_dropout): Dropout(p=0.1, inplace=False)
            )
            (final_layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
          )
          (5): Wav2Vec2EncoderLayer(
            (attention): Wav2Vec2Attention(
              (k_proj): Linear(in_features=768, out_features=768, bias=True)
              (v_proj): Linear(in_features=768, out_features=768, bias=True)
              (q_proj): Linear(in_features=768, out_features=768, bias=True)
              (out_proj): Linear(in_features=768, out_features=768, bias=True)
            )
            (dropout): Dropout(p=0.1, inplace=False)
            (layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
            (feed_forward): Wav2Vec2FeedForward(
              (intermediate_dropout): Dropout(p=0.1, inplace=False)
              (intermediate_dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
              (output_dense): Linear(in_features=3072, out_features=768, bias=True)
              (output_dropout): Dropout(p=0.1, inplace=False)
            )
            (final_layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
          )
          (6): Wav2Vec2EncoderLayer(
            (attention): Wav2Vec2Attention(
              (k_proj): Linear(in_features=768, out_features=768, bias=True)
              (v_proj): Linear(in_features=768, out_features=768, bias=True)
              (q_proj): Linear(in_features=768, out_features=768, bias=True)
              (out_proj): Linear(in_features=768, out_features=768, bias=True)
            )
            (dropout): Dropout(p=0.1, inplace=False)
            (layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
            (feed_forward): Wav2Vec2FeedForward(
              (intermediate_dropout): Dropout(p=0.1, inplace=False)
              (intermediate_dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
              (output_dense): Linear(in_features=3072, out_features=768, bias=True)
              (output_dropout): Dropout(p=0.1, inplace=False)
            )
            (final_layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
          )
          (7): Wav2Vec2EncoderLayer(
            (attention): Wav2Vec2Attention(
              (k_proj): Linear(in_features=768, out_features=768, bias=True)
              (v_proj): Linear(in_features=768, out_features=768, bias=True)
              (q_proj): Linear(in_features=768, out_features=768, bias=True)
              (out_proj): Linear(in_features=768, out_features=768, bias=True)
            )
            (dropout): Dropout(p=0.1, inplace=False)
            (layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
            (feed_forward): Wav2Vec2FeedForward(
              (intermediate_dropout): Dropout(p=0.1, inplace=False)
              (intermediate_dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
              (output_dense): Linear(in_features=3072, out_features=768, bias=True)
              (output_dropout): Dropout(p=0.1, inplace=False)
            )
            (final_layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
          )
          (8): Wav2Vec2EncoderLayer(
            (attention): Wav2Vec2Attention(
              (k_proj): Linear(in_features=768, out_features=768, bias=True)
              (v_proj): Linear(in_features=768, out_features=768, bias=True)
              (q_proj): Linear(in_features=768, out_features=768, bias=True)
              (out_proj): Linear(in_features=768, out_features=768, bias=True)
            )
            (dropout): Dropout(p=0.1, inplace=False)
            (layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
            (feed_forward): Wav2Vec2FeedForward(
              (intermediate_dropout): Dropout(p=0.1, inplace=False)
              (intermediate_dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
              (output_dense): Linear(in_features=3072, out_features=768, bias=True)
              (output_dropout): Dropout(p=0.1, inplace=False)
            )
            (final_layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
          )
          (9): Wav2Vec2EncoderLayer(
            (attention): Wav2Vec2Attention(
              (k_proj): Linear(in_features=768, out_features=768, bias=True)
              (v_proj): Linear(in_features=768, out_features=768, bias=True)
              (q_proj): Linear(in_features=768, out_features=768, bias=True)
              (out_proj): Linear(in_features=768, out_features=768, bias=True)
            )
            (dropout): Dropout(p=0.1, inplace=False)
            (layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
            (feed_forward): Wav2Vec2FeedForward(
              (intermediate_dropout): Dropout(p=0.1, inplace=False)
              (intermediate_dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
              (output_dense): Linear(in_features=3072, out_features=768, bias=True)
              (output_dropout): Dropout(p=0.1, inplace=False)
            )
            (final_layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
          )
          (10): Wav2Vec2EncoderLayer(
            (attention): Wav2Vec2Attention(
              (k_proj): Linear(in_features=768, out_features=768, bias=True)
              (v_proj): Linear(in_features=768, out_features=768, bias=True)
              (q_proj): Linear(in_features=768, out_features=768, bias=True)
              (out_proj): Linear(in_features=768, out_features=768, bias=True)
            )
            (dropout): Dropout(p=0.1, inplace=False)
            (layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
            (feed_forward): Wav2Vec2FeedForward(
              (intermediate_dropout): Dropout(p=0.1, inplace=False)
              (intermediate_dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
              (output_dense): Linear(in_features=3072, out_features=768, bias=True)
              (output_dropout): Dropout(p=0.1, inplace=False)
            )
            (final_layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
          )
          (11): Wav2Vec2EncoderLayer(
            (attention): Wav2Vec2Attention(
              (k_proj): Linear(in_features=768, out_features=768, bias=True)
              (v_proj): Linear(in_features=768, out_features=768, bias=True)
              (q_proj): Linear(in_features=768, out_features=768, bias=True)
              (out_proj): Linear(in_features=768, out_features=768, bias=True)
            )
            (dropout): Dropout(p=0.1, inplace=False)
            (layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
            (feed_forward): Wav2Vec2FeedForward(
              (intermediate_dropout): Dropout(p=0.1, inplace=False)
              (intermediate_dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
              (output_dense): Linear(in_features=3072, out_features=768, bias=True)
              (output_dropout): Dropout(p=0.1, inplace=False)
            )
            (final_layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
          )
        )
      )
    )
    (audio_encoder_head): MLP(
      (layers): Sequential(
        (0): Linear(in_features=768, out_features=128, bias=True)
        (1): BatchNorm1d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (2): LeakyReLU(negative_slope=0.2)
        (3): Linear(in_features=128, out_features=128, bias=True)
      )
    )
    (subject_encoder_head): MLP(
      (layers): Sequential(
        (0): Linear(in_features=1434, out_features=128, bias=True)
        (1): BatchNorm1d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (2): LeakyReLU(negative_slope=0.2)
        (3): Linear(in_features=128, out_features=128, bias=True)
        (4): BatchNorm1d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (5): LeakyReLU(negative_slope=0.2)
        (6): Linear(in_features=128, out_features=128, bias=True)
      )
    )
    (temporal_body): DualTemporalMoudleV2(
      (short_layer): TemporalAlignedBlock(
        (decoder): TransformerDecoder(
          (layers): ModuleList(
            (0): TransformerDecoderLayer(
              (self_attn): MultiheadAttention(
                (out_proj): NonDynamicallyQuantizableLinear(in_features=128, out_features=128, bias=True)
              )
              (multihead_attn): MultiheadAttention(
                (out_proj): NonDynamicallyQuantizableLinear(in_features=128, out_features=128, bias=True)
              )
              (linear1): Linear(in_features=128, out_features=128, bias=True)
              (dropout): Dropout(p=0.1, inplace=False)
              (linear2): Linear(in_features=128, out_features=128, bias=True)
              (norm1): LayerNorm((128,), eps=1e-05, elementwise_affine=True)
              (norm2): LayerNorm((128,), eps=1e-05, elementwise_affine=True)
              (norm3): LayerNorm((128,), eps=1e-05, elementwise_affine=True)
              (dropout1): Dropout(p=0.1, inplace=False)
              (dropout2): Dropout(p=0.1, inplace=False)
              (dropout3): Dropout(p=0.1, inplace=False)
            )
            (1): TransformerDecoderLayer(
              (self_attn): MultiheadAttention(
                (out_proj): NonDynamicallyQuantizableLinear(in_features=128, out_features=128, bias=True)
              )
              (multihead_attn): MultiheadAttention(
                (out_proj): NonDynamicallyQuantizableLinear(in_features=128, out_features=128, bias=True)
              )
              (linear1): Linear(in_features=128, out_features=128, bias=True)
              (dropout): Dropout(p=0.1, inplace=False)
              (linear2): Linear(in_features=128, out_features=128, bias=True)
              (norm1): LayerNorm((128,), eps=1e-05, elementwise_affine=True)
              (norm2): LayerNorm((128,), eps=1e-05, elementwise_affine=True)
              (norm3): LayerNorm((128,), eps=1e-05, elementwise_affine=True)
              (dropout1): Dropout(p=0.1, inplace=False)
              (dropout2): Dropout(p=0.1, inplace=False)
              (dropout3): Dropout(p=0.1, inplace=False)
            )
            (2): TransformerDecoderLayer(
              (self_attn): MultiheadAttention(
                (out_proj): NonDynamicallyQuantizableLinear(in_features=128, out_features=128, bias=True)
              )
              (multihead_attn): MultiheadAttention(
                (out_proj): NonDynamicallyQuantizableLinear(in_features=128, out_features=128, bias=True)
              )
              (linear1): Linear(in_features=128, out_features=128, bias=True)
              (dropout): Dropout(p=0.1, inplace=False)
              (linear2): Linear(in_features=128, out_features=128, bias=True)
              (norm1): LayerNorm((128,), eps=1e-05, elementwise_affine=True)
              (norm2): LayerNorm((128,), eps=1e-05, elementwise_affine=True)
              (norm3): LayerNorm((128,), eps=1e-05, elementwise_affine=True)
              (dropout1): Dropout(p=0.1, inplace=False)
              (dropout2): Dropout(p=0.1, inplace=False)
              (dropout3): Dropout(p=0.1, inplace=False)
            )
          )
        )
        (ppe): PeriodicPositionalEncoding(
          (dropout): Dropout(p=0.1, inplace=False)
        )
      )
      (long_layer): TemporalVAEBlock(
        (embedding): PositionalEncoding(
          (dropout): Dropout(p=0.1, inplace=False)
        )
        (encoder): TransformerEncoder(
          (layers): ModuleList(
            (0): TransformerEncoderLayer(
              (self_attn): MultiheadAttention(
                (out_proj): NonDynamicallyQuantizableLinear(in_features=128, out_features=128, bias=True)
              )
              (linear1): Linear(in_features=128, out_features=128, bias=True)
              (dropout): Dropout(p=0.1, inplace=False)
              (linear2): Linear(in_features=128, out_features=128, bias=True)
              (norm1): LayerNorm((128,), eps=1e-05, elementwise_affine=True)
              (norm2): LayerNorm((128,), eps=1e-05, elementwise_affine=True)
              (dropout1): Dropout(p=0.1, inplace=False)
              (dropout2): Dropout(p=0.1, inplace=False)
            )
            (1): TransformerEncoderLayer(
              (self_attn): MultiheadAttention(
                (out_proj): NonDynamicallyQuantizableLinear(in_features=128, out_features=128, bias=True)
              )
              (linear1): Linear(in_features=128, out_features=128, bias=True)
              (dropout): Dropout(p=0.1, inplace=False)
              (linear2): Linear(in_features=128, out_features=128, bias=True)
              (norm1): LayerNorm((128,), eps=1e-05, elementwise_affine=True)
              (norm2): LayerNorm((128,), eps=1e-05, elementwise_affine=True)
              (dropout1): Dropout(p=0.1, inplace=False)
              (dropout2): Dropout(p=0.1, inplace=False)
            )
            (2): TransformerEncoderLayer(
              (self_attn): MultiheadAttention(
                (out_proj): NonDynamicallyQuantizableLinear(in_features=128, out_features=128, bias=True)
              )
              (linear1): Linear(in_features=128, out_features=128, bias=True)
              (dropout): Dropout(p=0.1, inplace=False)
              (linear2): Linear(in_features=128, out_features=128, bias=True)
              (norm1): LayerNorm((128,), eps=1e-05, elementwise_affine=True)
              (norm2): LayerNorm((128,), eps=1e-05, elementwise_affine=True)
              (dropout1): Dropout(p=0.1, inplace=False)
              (dropout2): Dropout(p=0.1, inplace=False)
            )
          )
        )
        (decoder): TransformerDecoder(
          (layers): ModuleList(
            (0): TransformerDecoderLayer(
              (self_attn): MultiheadAttention(
                (out_proj): NonDynamicallyQuantizableLinear(in_features=128, out_features=128, bias=True)
              )
              (multihead_attn): MultiheadAttention(
                (out_proj): NonDynamicallyQuantizableLinear(in_features=128, out_features=128, bias=True)
              )
              (linear1): Linear(in_features=128, out_features=128, bias=True)
              (dropout): Dropout(p=0.1, inplace=False)
              (linear2): Linear(in_features=128, out_features=128, bias=True)
              (norm1): LayerNorm((128,), eps=1e-05, elementwise_affine=True)
              (norm2): LayerNorm((128,), eps=1e-05, elementwise_affine=True)
              (norm3): LayerNorm((128,), eps=1e-05, elementwise_affine=True)
              (dropout1): Dropout(p=0.1, inplace=False)
              (dropout2): Dropout(p=0.1, inplace=False)
              (dropout3): Dropout(p=0.1, inplace=False)
            )
            (1): TransformerDecoderLayer(
              (self_attn): MultiheadAttention(
                (out_proj): NonDynamicallyQuantizableLinear(in_features=128, out_features=128, bias=True)
              )
              (multihead_attn): MultiheadAttention(
                (out_proj): NonDynamicallyQuantizableLinear(in_features=128, out_features=128, bias=True)
              )
              (linear1): Linear(in_features=128, out_features=128, bias=True)
              (dropout): Dropout(p=0.1, inplace=False)
              (linear2): Linear(in_features=128, out_features=128, bias=True)
              (norm1): LayerNorm((128,), eps=1e-05, elementwise_affine=True)
              (norm2): LayerNorm((128,), eps=1e-05, elementwise_affine=True)
              (norm3): LayerNorm((128,), eps=1e-05, elementwise_affine=True)
              (dropout1): Dropout(p=0.1, inplace=False)
              (dropout2): Dropout(p=0.1, inplace=False)
              (dropout3): Dropout(p=0.1, inplace=False)
            )
            (2): TransformerDecoderLayer(
              (self_attn): MultiheadAttention(
                (out_proj): NonDynamicallyQuantizableLinear(in_features=128, out_features=128, bias=True)
              )
              (multihead_attn): MultiheadAttention(
                (out_proj): NonDynamicallyQuantizableLinear(in_features=128, out_features=128, bias=True)
              )
              (linear1): Linear(in_features=128, out_features=128, bias=True)
              (dropout): Dropout(p=0.1, inplace=False)
              (linear2): Linear(in_features=128, out_features=128, bias=True)
              (norm1): LayerNorm((128,), eps=1e-05, elementwise_affine=True)
              (norm2): LayerNorm((128,), eps=1e-05, elementwise_affine=True)
              (norm3): LayerNorm((128,), eps=1e-05, elementwise_affine=True)
              (dropout1): Dropout(p=0.1, inplace=False)
              (dropout2): Dropout(p=0.1, inplace=False)
              (dropout3): Dropout(p=0.1, inplace=False)
            )
          )
        )
        (out): Sequential(
          (0): Linear(in_features=128, out_features=128, bias=True)
        )
        (to_mu): Linear(in_features=128, out_features=128, bias=True)
        (to_logvar): Linear(in_features=128, out_features=128, bias=True)
        (decode_latent): Linear(in_features=128, out_features=128, bias=True)
      )
    )
    (lipmotion_tail): MLP(
      (layers): Sequential(
        (0): Linear(in_features=256, out_features=512, bias=True)
        (1): BatchNorm1d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (2): LeakyReLU(negative_slope=0.2)
        (3): Linear(in_features=512, out_features=120, bias=True)
      )
    )
    (eyemovement_tail): MLP(
      (layers): Sequential(
        (0): Linear(in_features=256, out_features=256, bias=True)
        (1): BatchNorm1d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (2): LeakyReLU(negative_slope=0.2)
        (3): Linear(in_features=256, out_features=256, bias=True)
        (4): BatchNorm1d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (5): LeakyReLU(negative_slope=0.2)
        (6): Linear(in_features=256, out_features=180, bias=True)
      )
    )
    (headmotion_tail): MLP(
      (layers): Sequential(
        (0): Linear(in_features=256, out_features=256, bias=True)
        (1): BatchNorm1d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (2): LeakyReLU(negative_slope=0.2)
        (3): Linear(in_features=256, out_features=256, bias=True)
        (4): BatchNorm1d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (5): LeakyReLU(negative_slope=0.2)
        (6): Linear(in_features=256, out_features=7, bias=True)
      )
    )
    (torsomotion_tail): MLP(
      (layers): Sequential(
        (0): Linear(in_features=256, out_features=256, bias=True)
        (1): BatchNorm1d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (2): LeakyReLU(negative_slope=0.2)
        (3): Linear(in_features=256, out_features=256, bias=True)
        (4): BatchNorm1d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (5): LeakyReLU(negative_slope=0.2)
        (6): Linear(in_features=256, out_features=54, bias=True)
      )
    )
  )
)
[Network MODA] Total number of parameters : 96.718 M

Lip decoder: MLP

Sequential(
  (0): Linear(in_features=256, out_features=512, bias=True)
  (1): BatchNorm1d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (2): LeakyReLU(negative_slope=0.2)
  (3): Linear(in_features=512, out_features=120, bias=True)
)
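As a quick sanity check on the size of this head, the parameter count can be worked out by hand from the printed shapes. The sketch below is plain Python and needs no PyTorch. The helper names `linear_params` and `batchnorm_params` are made up for illustration, and it assumes that the affine scale and shift are the only learnable parameters of the BatchNorm layer (running statistics are buffers, not parameters).

```python
def linear_params(in_features: int, out_features: int, bias: bool = True) -> int:
    """Learnable parameters of a fully connected layer: weight matrix plus optional bias."""
    return in_features * out_features + (out_features if bias else 0)


def batchnorm_params(num_features: int, affine: bool = True) -> int:
    """Learnable parameters of BatchNorm1d: per-feature scale and shift when affine=True."""
    return 2 * num_features if affine else 0


# Shapes copied from the printed lip decoder above.
lip_decoder_params = (
    linear_params(256, 512)      # (0) Linear 256 -> 512
    + batchnorm_params(512)      # (1) BatchNorm1d(512)
    + 0                          # (2) LeakyReLU has no parameters
    + linear_params(512, 120)    # (3) Linear 512 -> 120
)
print(lip_decoder_params)  # 194168
```

Under these assumptions the lip head holds roughly 0.19 M of the 96.718 M parameters reported above, so almost all of the capacity sits in the layers before the tails.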

After batch norm, the output features (x_1) are almost all identical.

Output of layer_0:

Output of layer_1: all values are negative
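Observations such as "almost identical" and "all negative" are easier to track across runs when they are turned into numbers. The following is a minimal sketch, not code from the MODA repository: `summarize_activations` is a hypothetical helper that works on a plain list of rows, and the toy values stand in for a dumped activation tensor (a 2-D PyTorch tensor could be passed as `x.tolist()`).

```python
from statistics import pstdev


def summarize_activations(batch):
    """Summarize a batch of activations given as a list of equal-length rows.

    Returns the global min/max, the fraction of negative entries, and the
    mean per-feature standard deviation across the batch. A value near zero
    for the last number means every sample produces almost the same features.
    """
    flat = [v for row in batch for v in row]
    columns = list(zip(*batch))  # one tuple per feature
    per_feature_std = [pstdev(col) for col in columns]
    return {
        "min": min(flat),
        "max": max(flat),
        "frac_negative": sum(v < 0 for v in flat) / len(flat),
        "mean_feature_std": sum(per_feature_std) / len(per_feature_std),
    }


# Toy data standing in for a dumped activation tensor (hypothetical values).
collapsed = [[-0.50, -0.20, -0.90], [-0.50, -0.21, -0.90], [-0.51, -0.20, -0.90]]
stats = summarize_activations(collapsed)
print(stats["frac_negative"])            # 1.0
print(stats["mean_feature_std"] < 0.01)  # True
```

A `frac_negative` of 1.0 together with a near-zero `mean_feature_std` would match the two symptoms described above.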

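My guess is that the model simply has not trained properly. The cause could be the learning rate, or it could be the targets.

I changed the learning rate from the original 1e-4 to 1e-3 and to 1e-5, and neither made any real difference. The loss stays very large, with the headmotion loss in particular on the order of several hundred thousand, so the problem most likely lies in the training targets.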
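A loss in the hundreds of thousands says more about the scale of the targets than about the learning rate. The sketch below assumes a mean-squared-error loss (the loss actually used by the repository is not shown here) and uses made-up target values in the hundreds. It only illustrates that squared error grows with the square of the target magnitude.

```python
def mse(pred, target):
    """Mean squared error between two equal-length sequences."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)


# Hypothetical raw head-motion targets with magnitudes in the hundreds.
raw_target = [600.0, -450.0, 520.0, 700.0]
pred = [0.0, 0.0, 0.0, 0.0]  # stand-in for an untrained head that outputs values near zero

print(mse(pred, raw_target))  # 330725.0

# The same targets divided by 1000 give a loss one million times smaller.
scaled_target = [t / 1000.0 for t in raw_target]
print(mse(pred, scaled_target))
```

If the real targets behave like this, changing the learning rate by a factor of ten in either direction would not bring the loss into a normal range, which is consistent with what was observed.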

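So I looked into why the loss is so large, and found that target_headmotion and target_torsomotion span a very wide range of values:

By comparison, the other targets are fairly small:

Next, look at how audio2repr_dataset.py processes the data:

data: len=17. The 1200 here corresponds to batch=2, with 600 frames per batch.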

data_list[file_index][0]: audio_array, tensor([-0.8657, -0.9239, -0.8294, …, -0.0095, -0.0519, -0.1292]), torch.Size([640128])
data_list[file_index][1]: av_rate, 533
data_list[file_index][2]: face_vertices, torch.Size([1200, 478, 3])
data_list[file_index][3]: face_vert_ref mean, [478, 3]
data_list[file_index][4]: face_vert_ref variance, [478, 3]
data_list[file_index][5]: face_headposes, [1200, 3]
data_list[file_index][6]: face_head_ref mean, [3]
data_list[file_index][7]: face_head_ref variance, [3]
data_list[file_index][8]: face_transposes, [1200, 3]
data_list[file_index][9]: face_trans_ref mean, [3]
data_list[file_index][10]: face_trans_ref variance, [3]
data_list[file_index][11]: face_scales, [1200, 1]
data_list[file_index][12]: face_scale_ref mean, [1]
data_list[file_index][13]: face_scale_ref variance, [1]
data_list[file_index][14]: torso_info, [1200, 18, 3]
data_list[file_index][15]: torso_info_ref mean, [18, 3]
data_list[file_index][16]: torso_info_ref variance, [18, 3]
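The 17 entries follow a regular pattern: the audio array, av_rate, then five groups of (values, mean, variance). The sketch below unpacks that layout and standardizes one group with its stored statistics. It uses plain lists instead of tensors. `unpack_sample` and `standardize` are hypothetical helpers, and dividing by the square root of the variance is an assumption, since how audio2repr_dataset.py actually applies these statistics is not shown here.

```python
import math

FIELD_GROUPS = ["face_vertices", "face_headposes", "face_transposes",
                "face_scales", "torso_info"]


def unpack_sample(sample):
    """Split a 17-entry sample into audio, av_rate and five (values, mean, var) groups."""
    assert len(sample) == 2 + 3 * len(FIELD_GROUPS)
    audio, av_rate = sample[0], sample[1]
    groups = {}
    for i, name in enumerate(FIELD_GROUPS):
        values, mean, var = sample[2 + 3 * i: 5 + 3 * i]
        groups[name] = {"values": values, "mean": mean, "var": var}
    return audio, av_rate, groups


def standardize(frames, mean, var, eps=1e-8):
    """Per-channel (x - mean) / sqrt(var + eps) for a list of frames."""
    return [[(x - m) / math.sqrt(v + eps) for x, m, v in zip(frame, mean, var)]
            for frame in frames]


# Toy sample: two frames of 3-channel head translation with large raw values.
toy = [[0.0] * 8, 4] + [None] * 15
toy[8:11] = [[[100.0, 200.0, 300.0], [300.0, 400.0, 500.0]],   # face_transposes
             [200.0, 300.0, 400.0],                              # mean
             [10000.0, 10000.0, 10000.0]]                        # variance
audio, av_rate, groups = unpack_sample(toy)
t = groups["face_transposes"]
normalized = standardize(t["values"], t["mean"], t["var"])
print(normalized)  # every value is close to -1 or +1

# av_rate in the listing is consistent with audio samples per video frame.
print(640128 // 1200)  # 533
```

Once the targets are standardized like this, their values stay near unit scale, which is the range where the other, smaller targets already sit.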

4.3 Inference

First, use mediapipe to extract the facial landmarks.

# A snippet excerpted from utils.py, shown only to illustrate the approach
import mediapipe as mp
mp_drawing_styles = mp.solutions.drawing_styles
mp_connections = mp.solutions.face_mesh_connections
def get_semantic_indices():
    semantic_connections = {
        'Contours':     mp_connections.FACEMESH_CONTOURS,
        'FaceOval':     mp_connections.FACEMESH_FACE_OVAL,
        'LeftIris':     mp_connections.FACEMESH_LEFT_IRIS,
        'LeftEye':      mp_connections.FACEMESH_LEFT_EYE,
        'LeftEyebrow':  mp_connections.FACEMESH_LEFT_EYEBROW,
        'RightIris':    mp_connections.FACEMESH_RIGHT_IRIS,
        'RightEye':     mp_connections.FACEMESH_RIGHT_EYE,
        'RightEyebrow': mp_connections.FACEMESH_RIGHT_EYEBROW,
        'Lips':         mp_connections.FACEMESH_LIPS,
        'Tesselation':  mp_connections.FACEMESH_TESSELATION
    }

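    def get_compact_idx(connections):
        # Each connection is a (start, end) pair of landmark indices;
        # collect every index once and return them in ascending order.
        ret = []
        for conn in connections:
            ret.append(conn[0])
            ret.append(conn[1])

        return sorted(set(ret))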
    
    semantic_indexes = {k: get_compact_idx(v) for k, v in semantic_connections.items()}

    return semantic_indexes
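To see what the inner helper does without installing mediapipe, here is a standalone sketch of the same idea run on a toy edge set. It assumes that each connection set is an iterable of (start, end) index pairs. The toy set below is not a real MediaPipe constant.

```python
def get_compact_idx(connections):
    """Collect the unique landmark indices that appear in a set of (start, end) edges."""
    indices = set()
    for start, end in connections:
        indices.add(start)
        indices.add(end)
    return sorted(indices)


# Toy edge set (not a real MediaPipe constant): a small closed loop.
toy_connections = frozenset({(7, 33), (33, 133), (133, 7)})
print(get_compact_idx(toy_connections))  # [7, 33, 133]
```

Each landmark appears in two edges of the loop but only once in the result, which is why the region lists are much shorter than the raw connection sets.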

The facial information produced by generate_feature.py is as follows:

{
'Contours': [0, 7, 10, 13, 14, 17, 21, 33, 37, 39, 40, 46, 52, 53, 54, 55, 58, 61, 63, 65, 66, 67, 70, 78, 80, 81, 82, 84, 87, 88, 91, 93, 95, 103, 105, 107, 109, 127, 132, 133, 136, 144, 145, 146, 148, 149, 150, 152, 153, 154, 155, 157, 158, 159, 160, 161, 162, 163, 172, 173, 176, 178, 181, 185, 191, 234, 246, 249, 251, 263, 267, 269, 270, 276, 282, 283, 284, 285, 288, 291, 293, 295, 296, 297, 300, 308, 310, 311, 312, 314, 317, 318, 321, 323, 324, 332, 334, 336, 338, 356, 361, 362, 365, 373, 374, 375, 377, 378, 379, 380, 381, 382, 384, 385, 386, 387, 388, 389, 390, 397, 398, 400, 402, 405, 409, 415, 454, 466], 
'FaceOval': [10, 21, 54, 58, 67, 93, 103, 109, 127, 132, 136, 148, 149, 150, 152, 162, 172, 176, 234, 251, 284, 288, 297, 323, 332, 338, 356, 361, 365, 377, 378, 379, 389, 397, 400, 454], 
'LeftIris': [474, 475, 476, 477], 
'LeftEye': [249, 263, 362, 373, 374, 380, 381, 382, 384, 385, 386, 387, 388, 390, 398, 466], 
'LeftEyebrow': [276, 282, 283, 285, 293, 295, 296, 300, 334, 336], 
'RightIris': [469, 470, 471, 472], 
'RightEye': [7, 33, 133, 144, 145, 153, 154, 155, 157, 158, 159, 160, 161, 163, 173, 246], 
'RightEyebrow': [46, 52, 53, 55, 63, 65, 66, 70, 105, 107], 
'Lips': [0, 13, 14, 17, 37, 39, 40, 61, 78, 80, 81, 82, 84, 87, 88, 91, 95, 146, 178, 181, 185, 191, 267, 269, 270, 291, 308, 310, 311, 312, 314, 317, 318, 321, 324, 375, 402, 405, 409, 415], 
'Tesselation': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191, 192, 193, 194, 195, 196, 197, 198, 199, 200, 201, 202, 203, 204, 205, 206, 207, 208, 209, 210, 211, 212, 213, 214, 215, 216, 217, 218, 219, 220, 221, 222, 223, 224, 225, 226, 227, 228, 229, 230, 231, 232, 233, 234, 235, 236, 237, 238, 239, 240, 241, 242, 243, 244, 245, 246, 247, 248, 249, 250, 251, 252, 253, 254, 255, 256, 257, 258, 259, 260, 261, 262, 263, 264, 265, 266, 267, 268, 269, 270, 271, 272, 273, 274, 275, 276, 277, 278, 279, 280, 281, 282, 283, 284, 285, 286, 287, 288, 289, 290, 291, 292, 293, 294, 295, 296, 297, 298, 299, 300, 301, 302, 303, 304, 305, 306, 307, 308, 309, 310, 311, 312, 313, 314, 315, 316, 317, 318, 319, 320, 321, 322, 323, 324, 325, 326, 327, 328, 329, 330, 331, 332, 333, 334, 335, 336, 337, 338, 339, 340, 341, 342, 343, 344, 345, 346, 347, 348, 349, 350, 351, 352, 353, 354, 355, 356, 357, 358, 359, 360, 361, 362, 363, 364, 365, 366, 367, 368, 369, 370, 371, 372, 373, 374, 375, 376, 377, 378, 379, 380, 381, 382, 383, 384, 385, 386, 387, 388, 389, 390, 391, 392, 393, 394, 395, 396, 397, 398, 399, 400, 401, 402, 403, 404, 405, 406, 407, 408, 409, 410, 411, 412, 413, 414, 415, 416, 417, 
418, 419, 420, 421, 422, 423, 424, 425, 426, 427, 428, 429, 430, 431, 432, 433, 434, 435, 436, 437, 438, 439, 440, 441, 442, 443, 444, 445, 446, 447, 448, 449, 450, 451, 452, 453, 454, 455, 456, 457, 458, 459, 460, 461, 462, 463, 464, 465, 466, 467]}
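The index counts above line up with the output sizes of the four tails in the printed model (120, 180, 7, 54) if each landmark carries three coordinates. The check below is only an observation that the numbers agree. The grouping chosen for the eye tail (eyes, eyebrows and irises) and for the head tail (rotation, translation, scale) is a guess, not something confirmed by the source.

```python
# Index lists copied from the dictionary printed above.
regions = {
    "Lips": [0, 13, 14, 17, 37, 39, 40, 61, 78, 80, 81, 82, 84, 87, 88, 91, 95,
             146, 178, 181, 185, 191, 267, 269, 270, 291, 308, 310, 311, 312,
             314, 317, 318, 321, 324, 375, 402, 405, 409, 415],
    "LeftEye": [249, 263, 362, 373, 374, 380, 381, 382, 384, 385, 386, 387,
                388, 390, 398, 466],
    "RightEye": [7, 33, 133, 144, 145, 153, 154, 155, 157, 158, 159, 160, 161,
                 163, 173, 246],
    "LeftEyebrow": [276, 282, 283, 285, 293, 295, 296, 300, 334, 336],
    "RightEyebrow": [46, 52, 53, 55, 63, 65, 66, 70, 105, 107],
    "LeftIris": [474, 475, 476, 477],
    "RightIris": [469, 470, 471, 472],
}
COORDS = 3  # x, y, z per landmark

lip_dim = len(regions["Lips"]) * COORDS
# Guess: the eye tail covers eyes, eyebrows and irises together.
eye_dim = sum(len(v) for k, v in regions.items() if k != "Lips") * COORDS
# Guess: head pose (3) + translation (3) + scale (1), as in the dataset listing.
head_dim = 3 + 3 + 1
# torso_info has shape [frames, 18, 3] in the dataset listing.
torso_dim = 18 * COORDS

print(lip_dim, eye_dim, head_dim, torso_dim)  # 120 180 7 54
```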
