This chapter is a translation of the GitHub README accompanying the paper's code repository: triplet-reid.
If anything here is inappropriate, please point it out and I will revise or remove it.
Triplet-based Person Re-Identification
Code for reproducing the results of our "In Defense of the Triplet Loss for Person Re-Identification" paper.
Training your own models
If you want more flexibility, we now provide code for training your own models.
This is not the code that was used in the paper (which became an unusable mess),
but rather a clean re-implementation of it in TensorFlow,
achieving about the same performance.
- This repository requires at least version 1.4 of TensorFlow.
- The TensorFlow code is Python 3 only and won't work in Python 2!
- If your data is difficult, don't forget to tune the learning rate.
Defining a dataset
A dataset consists of two things:
- An image_root folder which contains all images, possibly in sub-folders.
- A dataset .csv file describing the dataset.
To create a dataset, you simply create a new .csv file for it of the following form:
identity,relative_path/to/image.jpg
Where the identity is also often called PID (Person IDentity) and corresponds to the "class name";
it can be any arbitrary string, but should be the same for images belonging to the same identity.
The relative_path/to/image.jpg is relative to the aforementioned image_root.
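To make the expected file format concrete, the following is a minimal sketch (not part of the repository) of reading such a dataset file with plain Python; the example rows shown in the comment are purely illustrative:

import csv

# Illustrative rows of a dataset .csv file (identity, path relative to image_root):
#   0001,bounding_box_train/0001_c1s1_001051_00.jpg
#   0001,bounding_box_train/0001_c2s1_000301_00.jpg
#   0002,bounding_box_train/0002_c1s1_000451_00.jpg

def load_dataset(csv_path):
    """Return parallel lists of PIDs and relative image paths."""
    pids, fids = [], []
    with open(csv_path, newline='') as f:
        for pid, fid in csv.reader(f):
            pids.append(pid)
            fids.append(fid)
    return pids, fids

pids, fids = load_dataset('data/market1501_train.csv')
print(len(pids), 'images,', len(set(pids)), 'identities')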
Training
Given the dataset file and the image_root, you can already train a model.
The minimal way of training a model is to just call train.py in the following way:
python train.py \
--train_set data/market1501_train.csv \
--image_root /absolute/image/root \
--experiment_root ~/experiments/my_experiment
This will start training with all default parameters.
We recommend writing a script file similar to market1501_train.sh where you define all kinds of parameters;
it is highly recommended you tune hyperparameters such as net_input_{height,width}, learning_rate, decay_start_iteration, and many more.
See the top of train.py for a list of all parameters.
As a convenience, we store all the parameters that were used for a run in experiment_root/args.json.
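Since args.json is plain JSON, it is also easy to inspect programmatically. A small sketch, assuming the experiment_root from the invocation above:

import json
import os

experiment_root = os.path.expanduser('~/experiments/my_experiment')

# args.json maps every command-line flag of the run to the value that was used.
with open(os.path.join(experiment_root, 'args.json')) as f:
    args = json.load(f)

for name, value in sorted(args.items()):
    print(name, '=', value)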
Pre-trained initialization
If you want to initialize the model using pre-trained weights, such as done for TriNet,
you need to specify the location of the checkpoint file through --initial_checkpoint.
For most common models, you can download the checkpoints provided by Google here.
For example, that's where we get our ResNet50 pre-trained weights from,
and what you should pass as second parameter to market1501_train.sh.
Example training log
This is what a healthy training on Market1501 looks like, using the provided script.
The Histograms tab in tensorboard also shows some interesting logs.
Interrupting and resuming training
Since training can take quite a while, interrupting and resuming training is important.
You can interrupt training at any time by hitting Ctrl+C or sending SIGINT (2) or SIGTERM (15)
to the training process; it will finish the current batch, store the model and optimizer state,
and then terminate cleanly.
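This clean shutdown follows the common pattern of catching the signal, setting a flag, and letting the loop stop at a safe point; a generic Python sketch of that pattern (not the repository's actual training loop):

import signal

stop_requested = False

def request_stop(signum, frame):
    # Only record the request; the loop decides when it is safe to stop.
    global stop_requested
    stop_requested = True

signal.signal(signal.SIGINT, request_stop)   # Ctrl+C
signal.signal(signal.SIGTERM, request_stop)  # e.g. kill <pid>

num_steps = 1000  # stand-in for the real iteration count
for step in range(num_steps):
    # ... run one training batch here ...
    if stop_requested:
        # ... store the model and optimizer state here ...
        break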
Because of the args.json file, you can later resume that run simply by running:
python train.py --experiment_root ~/experiments/my_experiment --resume
The last checkpoint is determined automatically by TensorFlow using the contents of the checkpoint file.
Performance issues
For some reason, current TensorFlow is known to have inconsistent performance and can sometimes become very slow.
The only currently known workaround is to install Google's performance-tools and preload tcmalloc:
env LD_PRELOAD=/usr/lib/libtcmalloc_minimal.so.4 python train.py ...
This fixes the issues for us most of the time, but not always.
If you know more, please open an issue and let us know!
Out of memory
The setup as described in the paper requires a high-end GPU with a lot of memory.
If you don't have that, you can still train a model, but you should either use a smaller network,
or adjust the batch-size, which itself also adjusts learning difficulty, which might change results.
The two arguments for playing with the batch-size are --batch_p, which controls the number of distinct
persons in a batch, and --batch_k, which controls the number of pictures per person.
We usually lower batch_p first.
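A batch therefore always contains batch_p * batch_k images. The following is a rough sketch of this "PK" sampling scheme, simplified from what a real input pipeline does; the default values shown are only illustrative:

import random
from collections import defaultdict

def sample_pk_batch(pids, fids, batch_p=32, batch_k=4):
    """Sample batch_p identities, then batch_k images of each."""
    by_pid = defaultdict(list)
    for pid, fid in zip(pids, fids):
        by_pid[pid].append(fid)

    batch = []
    for pid in random.sample(list(by_pid), batch_p):
        images = by_pid[pid]
        # Sample with replacement if this person has fewer than batch_k images.
        chosen = (random.sample(images, batch_k) if len(images) >= batch_k
                  else random.choices(images, k=batch_k))
        batch.extend((pid, fid) for fid in chosen)
    return batch  # batch_p * batch_k (pid, image) pairs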
Custom network architecture
TODO: Documentation. It's also pretty straightforward.
The core network
The network head
Computing embeddings
Given a trained net, one often wants to compute the embeddings of a set of pictures for further processing.
This can be done with the embed.py script, which can also serve as inspiration for using a trained model in a larger program.
The following invocation computes the embeddings of the Market1501 query set using some network:
python embed.py \
--experiment_root ~/experiments/my_experiment \
--dataset data/market1501_query.csv \
--filename test_embeddings.h5
The embeddings will be written into the HDF5 file at ~/experiments/my_experiment/test_embeddings.h5 as dataset embs.
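For downstream processing, the stored embeddings can be read back with h5py; a short sketch using the path from the invocation above:

import os
import h5py

path = os.path.expanduser('~/experiments/my_experiment/test_embeddings.h5')
with h5py.File(path, 'r') as f:
    embs = f['embs'][()]  # NumPy array: one row per image, in dataset order

print(embs.shape)  # (number of images, embedding dimension)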
Most relevant settings are automatically loaded from the experiment's args.json file, but some can be overruled on the commandline.
If the training was performed using data augmentation (highly recommended),
one can invest some more time in the embedding step in order to compute augmented embeddings,
which are usually more robust and perform better in downstream tasks.
The following is an example that computes extensively augmented embeddings:
python embed.py \
--experiment_root ~/experiments/my_experiment \
--dataset data/market1501_query.csv \
--filename test_embeddings_augmented.h5 \
--flip_augment \
--crop_augment five \
--aggregator mean
This will take 10 times longer, because we perform a total of 10 augmentations per image (2 flips times 5 crops).
All individual embeddings will also be stored in the .h5 file, thus the disk-space also increases.
One question is how the embeddings of the various augmentations should be combined.
When training using the euclidean metric in the loss, simply taking the mean is what makes most sense,
and also what the above invocation does through --aggregator mean.
But if one for example trains a normalized embedding (by using a _normalize head for instance),
the embeddings must be re-normalized after averaging, and so one should use --aggregator normalized_mean.
The final combined embedding is again stored as embs in the .h5 file, as usual.
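Conceptually, the two aggregators differ only in a final re-normalization step. A NumPy sketch of that difference, assuming the augmented embeddings are arranged as an array of shape (num_images, num_augmentations, dim):

import numpy as np

def aggregate(aug_embs, aggregator='mean'):
    """aug_embs: array of shape (num_images, num_augmentations, dim)."""
    combined = aug_embs.mean(axis=1)
    if aggregator == 'normalized_mean':
        # Re-normalize after averaging, for embeddings trained to unit length.
        combined /= np.linalg.norm(combined, axis=1, keepdims=True)
    return combined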
Evaluating embeddings
Once the embeddings have been generated, it is a good idea to compute CMC curves and mAP for evaluation.
With only minor modifications, the embedding .h5 files can be used in
the official Market1501 MATLAB evaluation code,
which is exactly what we did for the paper.
For convenience, and to spite MATLAB, we also implemented our own evaluation code in Python.
This code additionally depends on scikit-learn,
and still uses TensorFlow only for re-using the same metric implementation as the training code, for consistency.
We verified that it produces the exact same results as the reference implementation.
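To illustrate what such an evaluation computes, here is a heavily simplified sketch of mAP and rank-1 CMC without any excluder logic; it is not the repository's evaluate.py, and it assumes every query has at least one true match in the gallery:

import numpy as np
from scipy.spatial.distance import cdist
from sklearn.metrics import average_precision_score

def evaluate(query_embs, query_pids, gallery_embs, gallery_pids):
    """Return (mAP, rank-1); pids are given as NumPy arrays."""
    dists = cdist(query_embs, gallery_embs, metric='euclidean')
    aps, rank1_hits = [], 0
    for i in range(len(query_embs)):
        matches = gallery_pids == query_pids[i]
        # Negated distances serve as similarity scores for average precision.
        aps.append(average_precision_score(matches, -dists[i]))
        rank1_hits += matches[np.argmin(dists[i])]
    return np.mean(aps), rank1_hits / len(query_embs)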
The following is an example of evaluating a Market1501 model; notice it takes a lot of parameters:
./evaluate.py \
--excluder market1501 \
--query_dataset data/market1501_query.csv \
--query_embeddings ~/experiments/my_experiment/market1501_query_embeddings.h5 \
--gallery_dataset data/market1501_test.csv \
--gallery_embeddings ~/experiments/my_experiment/market1501_test_embeddings.h5 \
--metric euclidean \
--filename ~/experiments/my_experiment/market1501_evaluation.json
The only thing that really needs explaining here is the excluder.
For some datasets, especially multi-camera ones, one often excludes pictures of the query person from the gallery (for that one person) if it is taken from the same camera.
This way, one gets more of a feeling for across-camera performance.
Additionally, the Market1501 dataset contains some "junk" images in the gallery which should be ignored too.
All this is taken care of by excluders.
We provide one for the Market1501 dataset, and a diagonal one, which should be used where there is no such restriction, for example the Stanford Online Products dataset.
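To illustrate the idea, here is a simplified sketch of a Market1501-style excluder; the repository's actual implementation may differ in details, and the parsing assumes the standard Market1501 file-naming scheme ('<pid>_c<camera>...'):

import numpy as np

def market1501_excluder(query_fid, gallery_fids):
    """Boolean mask of gallery images to ignore for one query."""
    def pid_cam(fid):
        pid, cam_part = fid.split('/')[-1].split('_')[:2]
        return pid, cam_part[1]  # the camera digit right after the 'c'

    q_pid, q_cam = pid_cam(query_fid)
    mask = []
    for fid in gallery_fids:
        g_pid, g_cam = pid_cam(fid)
        # Ignore same-person/same-camera shots and junk images (PID "-1").
        mask.append((g_pid == q_pid and g_cam == q_cam) or g_pid == '-1')
    return np.array(mask)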
Independent re-implementations
These are the independent re-implementations of our paper that we are aware of,
please send a pull-request to add more:
- Open-ReID (PyTorch, MIT license)