1 Selecting test cases
As a result, an algorithm is likely to perform similarly on many test cases, and such behavior makes comparative studies unreliable. Additionally, if we employ test cases where all the methods perform very well or very poorly, then any measure will fail to robustly characterize performance differences.
A good evaluation test set should consist of representative examples that are neither too easy nor too difficult for the methods under evaluation.
2 Precision-Recall curves
Precision-Recall curves may not offer a definite answer as to whether method A is better than method B, since different methods often perform well on different types of images. Precision: what fraction of the returned results are correct? Recall: what fraction of the objects in the ground truth data set are found?
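As a small illustration, precision and recall can be computed from raw counts. This is only a sketch: the function name and the convention of returning 0.0 when a denominator is zero are assumptions made here, not part of the notes.

```python
def precision_recall(true_positives, false_positives, false_negatives):
    """Return (precision, recall) computed from raw detection counts."""
    returned = true_positives + false_positives   # everything the method reported
    relevant = true_positives + false_negatives   # everything in the ground truth
    # Assumption: report 0.0 instead of dividing by zero.
    precision = true_positives / returned if returned else 0.0
    recall = true_positives / relevant if relevant else 0.0
    return precision, recall


# 8 correct detections, 2 spurious detections, 4 ground truth objects not found.
p, r = precision_recall(true_positives=8, false_positives=2, false_negatives=4)
```

Sweeping the detection threshold and recomputing the pair at each setting gives the points of a Precision-Recall curve.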
3 Frame detection accuracy
This measure is built from two numbers per frame: a, the true number of objects in the frame, and b, the number of objects detected in the frame. Their ratio is c = b/a. Averaging c over all frames of the video sequence gives the frame detection accuracy.
The result is affected by the detection threshold.
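The computation above might be sketched as follows. The function name is hypothetical, and skipping frames with no ground truth objects (a = 0) is an assumption, since the notes do not say how such frames are handled.

```python
def frame_detection_accuracy(true_counts, detected_counts):
    """Mean of b/a over frames, where a is the true and b the detected object count."""
    ratios = []
    for a, b in zip(true_counts, detected_counts):
        if a == 0:
            continue  # assumption: frames without ground truth objects are skipped
        ratios.append(b / a)
    return sum(ratios) / len(ratios) if ratios else 0.0


# Four frames; the third has no ground truth objects and is ignored.
accuracy = frame_detection_accuracy([4, 2, 0, 5], [2, 2, 1, 5])
```

Note that the ratio only compares counts, so a frame with as many false detections as misses still scores 1.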
4 The whole data
The first step is to assign each case to one of the following conditions:
correct assignment (one to one): one detected object corresponds to one ground truth object,
over-segmentation (one ground truth object to many detected objects),
over-grouping (many ground truth objects to one detected object),
multiple assignments (many ground truth objects to many detected objects),
missed detection (one ground truth object to zero detected objects),
false detection (zero ground truth objects to one detected object).
The second step is to count how often each condition occurs in the video sequence.
The third step is to compute the distance between the ground truth object and the detected object. This gives two numbers, D(true) and D(false), which serve as the two thresholds for deciding whether a detected object is a true object, or which of the conditions above applies.
Finally, the resulting data are plotted so that the performance of the two methods can be compared directly.
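The first two steps (classifying and counting) can be sketched as below. The sketch assumes the matching between ground truth and detected objects has already been decided, for example by the distance thresholds, and is given as a list of (ground truth index, detection index) pairs. Grouping the matches into connected components is one possible way to apply the six conditions; all names are hypothetical.

```python
from collections import Counter


def label_group(n_gt, n_det):
    """Name the condition for a group of n_gt ground truth and n_det detected objects."""
    if n_gt == 1 and n_det == 1:
        return "correct assignment"
    if n_gt == 1 and n_det > 1:
        return "over-segmentation"
    if n_gt > 1 and n_det == 1:
        return "over-grouping"
    if n_gt > 1 and n_det > 1:
        return "multiple assignments"
    if n_gt == 1 and n_det == 0:
        return "missed detection"
    return "false detection"  # the only remaining case: n_gt == 0 and n_det == 1


def count_conditions(n_gt, n_det, pairs):
    """Count the conditions in one frame, given matched (gt_index, det_index) pairs."""
    # Build an undirected bipartite graph over ground truth and detected objects.
    neighbours = {("gt", i): set() for i in range(n_gt)}
    neighbours.update({("det", j): set() for j in range(n_det)})
    for i, j in pairs:
        neighbours[("gt", i)].add(("det", j))
        neighbours[("det", j)].add(("gt", i))

    counts = Counter()
    seen = set()
    for start in neighbours:
        if start in seen:
            continue
        # Depth-first search collects one connected group of objects.
        seen.add(start)
        stack = [start]
        size = {"gt": 0, "det": 0}
        while stack:
            node = stack.pop()
            size[node[0]] += 1
            for other in neighbours[node]:
                if other not in seen:
                    seen.add(other)
                    stack.append(other)
        counts[label_group(size["gt"], size["det"])] += 1
    return counts


# gt 0 matches det 0; gt 1 is split into det 1 and det 2;
# gt 2 and gt 3 are unmatched; det 3 is unmatched.
counts = count_conditions(4, 4, [(0, 0), (1, 1), (1, 2)])
```

Summing the per-frame counters over the sequence gives the totals used in the second step.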
5 A simple method
The author uses two kinds of numbers to express the experimental results.
The first is the CPU computation time.
The second is a direct measure of quality, for example the number of frames n after which the tracker fails, and the mean square error.
6 Evaluating tracking
We must know the following five conditions:
Mostly Tracked (MT): the resulting trajectory covers more than 80% of the ground truth (GT) trajectory.
Mostly Lost (ML): more than 80% of the GT trajectory is not tracked.
Fragmented (Fgmt): the resulting trajectory covers less than 80% of the GT trajectory.
False Track (FAT): the resulting trajectory corresponds to no real object.
ID Switches (IDS): identities are exchanged between a pair of trajectories.
Finally, the method also gives a detection rate (DR): DR = 1 - (total number of frames with no trajectory) / (total number of trajectory frames in the video sequence).
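A minimal sketch of the coverage-based labels and the detection rate follows. As written above, the Fragmented condition (less than 80% covered) also includes Mostly Lost trajectories; the sketch assumes ML is checked first, so that Fgmt means "neither MT nor ML". The function names and this ordering are assumptions.

```python
def classify_trajectory(tracked_frames, gt_frames, threshold=0.8):
    """Label one ground truth trajectory by the fraction of its frames that are tracked."""
    coverage = tracked_frames / gt_frames
    if coverage > threshold:
        return "MT"    # more than 80% of the GT trajectory is covered
    if 1.0 - coverage > threshold:
        return "ML"    # more than 80% of the GT trajectory is not tracked
    return "Fgmt"      # assumption: everything in between counts as fragmented


def detection_rate(untracked_frames, total_trajectory_frames):
    """DR = 1 - untracked frames / all trajectory frames in the sequence."""
    return 1.0 - untracked_frames / total_trajectory_frames


labels = [classify_trajectory(n, 100) for n in (90, 50, 10)]
dr = detection_rate(untracked_frames=25, total_trajectory_frames=100)
```

FAT and IDS are not covered here because they need the full track-to-object assignment rather than a coverage fraction.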
7 (SODA)
SODA is calculated using the ratio of the spatial intersection to the spatial union of an output object and its mapped ground truth object.
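The intersection-over-union ratio underlying this measure can be sketched for axis-aligned bounding boxes. This shows only the ratio for a single pair, not the full SODA computation; the (x1, y1, x2, y2) box format and the function name are assumptions.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap rectangle, clamped at zero when disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - intersection
    return intersection / union if union > 0 else 0.0


# Two 2x2 boxes overlapping in a 1x1 square: intersection 1, union 7.
ratio = iou((0, 0, 2, 2), (1, 1, 3, 3))
```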
8 (SFDA)
The Sequence Frame Detection Accuracy (SFDA) is a frame-level measure that accounts for the number of objects detected, missed detections, false positives, and the spatial alignment of system output and ground truth objects.
For a given frame t, the Frame Detection Accuracy (FDA(t)) is calculated as:
Overlap_ratio is the overlap between the true object area and the detection area. The function is:
Ng refers to the number of ground truth objects in frame t.
Nd refers to the number of detected objects in frame t.
The Sequence Frame Detection Accuracy (SFDA) is then calculated as the average of the FDA measure over all the relevant frames in the sequence.
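For reference, in the usual VACE/CLEAR formulation FDA(t) is the summed intersection-over-union of the mapped object pairs in frame t, divided by the average of Ng and Nd, and a frame is "relevant" when it contains at least one ground truth or detected object. The sketch below follows that formulation and should be checked against the original paper. It assumes the per-pair overlap ratios are already computed, and all names are hypothetical.

```python
def fda(mapped_overlaps, n_gt, n_det):
    """Frame Detection Accuracy for one frame.

    mapped_overlaps: intersection-over-union of each mapped (ground truth, detection) pair.
    Returns None for a frame with no ground truth and no detected objects.
    """
    if n_gt == 0 and n_det == 0:
        return None  # not a relevant frame
    overlap_ratio = sum(mapped_overlaps)
    return overlap_ratio / ((n_gt + n_det) / 2)


def sfda(frames):
    """Average FDA over the relevant frames; frames is a list of (overlaps, n_gt, n_det)."""
    scores = [fda(overlaps, n_gt, n_det) for overlaps, n_gt, n_det in frames]
    scores = [s for s in scores if s is not None]
    return sum(scores) / len(scores) if scores else 0.0


frames = [
    ([1.0, 0.5], 2, 2),  # FDA = 1.5 / 2 = 0.75
    ([0.5], 1, 3),       # FDA = 0.5 / 2 = 0.25
    ([], 0, 0),          # empty frame, ignored
]
score = sfda(frames)
```

Dividing by the average of Ng and Nd penalizes both missed objects and false positives, since either one raises the denominator without adding overlap.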
9 Average tracking accuracy (ATA)
STDA:
The Average Tracking Accuracy (ATA) is calculated as the average of STDA over all the unique objects in the sequence.
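For reference, the Sequence Track Detection Accuracy (STDA) and ATA are usually written as below in the VACE/CLEAR formulation. This is a reconstruction under that assumption and should be checked against the original paper. Here G_i^{(t)} and D_i^{(t)} denote the i-th mapped ground truth and detected object in frame t, the inner denominator counts the frames in which either of the two exists, and N_G and N_D are the numbers of unique ground truth and detected objects in the sequence.

```latex
\mathrm{STDA} = \sum_{i=1}^{N_{mapped}}
  \frac{\displaystyle\sum_{t=1}^{N_{frames}}
        \frac{\left| G_i^{(t)} \cap D_i^{(t)} \right|}{\left| G_i^{(t)} \cup D_i^{(t)} \right|}}
       {N_{(G_i \cup D_i \neq \emptyset)}}
\qquad
\mathrm{ATA} = \frac{\mathrm{STDA}}{\left( N_G + N_D \right) / 2}
```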
10 Multiple object detection precision/accuracy (MODP/MODA)
MODA:
To assess the accuracy aspect of system performance, we utilize the missed detection and false positive counts. Assuming that the number of misses is indicated by m_t and the number of false positives is indicated by fp_t for each frame t, we can compute the Multiple Object Detection Accuracy (MODA) for the sequence as:
MODP:
Mapped Overlap Ratio is the overlap area between the ground truth object and the detected object.
N(mapped) is the number of mapped object pairs, where the mapping is done between objects which have the best spatial overlap in the given frame t.
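A sketch of both measures under the usual CLEAR formulation: MODA is one minus the weighted sum of misses and false positives divided by the total number of ground truth objects, and MODP for a frame is the mapped overlap ratio divided by the number of mapped pairs. The cost weights c_m and c_f come from that formulation, not from the text above, and are assumed to default to 1; the function names are hypothetical.

```python
def moda(misses, false_positives, gt_counts, c_m=1.0, c_f=1.0):
    """Multiple Object Detection Accuracy over a sequence of per-frame counts."""
    total_gt = sum(gt_counts)
    if total_gt == 0:
        raise ValueError("MODA is undefined without ground truth objects")
    # Assumption: c_m and c_f weight misses and false positives (both 1 by default).
    errors = sum(c_m * m + c_f * fp for m, fp in zip(misses, false_positives))
    return 1.0 - errors / total_gt


def modp(mapped_overlaps):
    """Multiple Object Detection Precision for one frame.

    mapped_overlaps: intersection-over-union of each mapped pair in the frame.
    """
    if not mapped_overlaps:
        return 0.0  # assumption: a frame without mapped pairs scores 0
    return sum(mapped_overlaps) / len(mapped_overlaps)


# Two frames with 5 ground truth objects each, one miss and one false positive overall.
accuracy = moda(misses=[1, 0], false_positives=[0, 1], gt_counts=[5, 5])
precision = modp([0.5, 1.0])
```

MODA can become negative when the number of errors exceeds the number of ground truth objects.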
11 Multiple object tracking precision/accuracy (MOTP/MOTA)
MOTA:
To extract the accuracy aspect of the system output track, we compute the number of missed detections, false positives, and switches in the system output track for a given reference ground truth track.
where, after computing the mapping for frame t, m_t is the number of misses, fp_t is the number of false positives, and ID-SWITCHES_t is the number of ID mismatches in frame t considering the mapping in frame (t-1). It should be noted that because of the log function, we start the ID-SWITCH count at 1.
MOTP:
To obtain the precision score, we calculate the spatio-temporal overlap between the reference tracks and the system output tracks. The Multiple Object Tracking Precision (MOTP) is defined as:
where N_mapped refers to the mapped system output objects over an entire reference track, taking into account splits and merges, and N^t_mapped refers to the number of mapped objects in the t-th frame.
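For reference, the two definitions are usually written as below in the VACE/CLEAR formulation. This is a reconstruction under that assumption and should be checked against the original paper. The cost weights c_m and c_f and the ground truth count N_G^{(t)} in frame t are symbols from that formulation rather than from the text above, and the base of the logarithm is left unspecified here because it is not given in these notes.

```latex
\mathrm{MOTA} = 1 -
  \frac{\displaystyle\sum_{t=1}^{N_{frames}}
        \left( c_m \, m_t + c_f \, fp_t + \log\left(\text{ID-SWITCHES}_t\right) \right)}
       {\displaystyle\sum_{t=1}^{N_{frames}} N_G^{(t)}}
\qquad
\mathrm{MOTP} =
  \frac{\displaystyle\sum_{i=1}^{N_{mapped}} \sum_{t=1}^{N_{frames}}
        \frac{\left| G_i^{(t)} \cap D_i^{(t)} \right|}{\left| G_i^{(t)} \cup D_i^{(t)} \right|}}
       {\displaystyle\sum_{t=1}^{N_{frames}} N_{mapped}^{t}}
```

Starting the ID-SWITCH count at 1 makes the log term zero for a frame without any switch.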