Thanks to the organizers and congrats to all the winners and my wonderful teammates @nvnnghia and @steamedsheep
This result was really unexpected for us, because we didn't do any NEW THING: we just kept optimizing the local cross-validation F2 of our pipeline from the beginning to the end.
We designed a 2-stage pipeline: object detection -> classification re-scoring, followed by a post-processing step.
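In case it helps, here is a minimal sketch of the second-stage re-scoring idea. The crop-and-classify flow, the `classifier` callable, and the geometric-mean combination are all illustrative assumptions, not the exact rule we tuned:

```python
import numpy as np

def rescore(image, boxes, det_scores, classifier):
    """Re-score detections with a second-stage classifier.
    boxes: (N, 4) int array of [x1, y1, x2, y2]; det_scores: (N,) detector confidences.
    classifier(crop) is a hypothetical callable returning P(starfish) for a crop."""
    new_scores = []
    for (x1, y1, x2, y2), det in zip(boxes, det_scores):
        crop = image[y1:y2, x1:x2]             # cut the detected region out of the frame
        cls = classifier(crop)                 # second-stage probability
        new_scores.append(np.sqrt(det * cls))  # geometric mean of the two stages (an assumption)
    return np.array(new_scores)
```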
Validation strategy: 3-fold cross-validation, split by video_id.
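This kind of split can be reproduced with scikit-learn's GroupKFold; a minimal sketch, assuming the standard train.csv metadata from the competition:

```python
import pandas as pd
from sklearn.model_selection import GroupKFold

train = pd.read_csv("train.csv")   # competition training metadata
gkf = GroupKFold(n_splits=3)

# Group by video_id so frames from the same video never end up
# in both the training and validation sides of a fold.
for fold, (tr_idx, va_idx) in enumerate(gkf.split(train, groups=train["video_id"])):
    train.loc[va_idx, "fold"] = fold
```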
Finally, we use a simple post-processing method to further boost our CV to 0.74+.
For example, suppose the model has predicted a set of boxes B at frame #N. We select the boxes from B that have high confidence and mark them as the "attention area".
Then, in frames #N+1, #N+2 and #N+3, for every predicted box with conf > 0.01, if it has an IoU with the attention area larger than 0, we boost its score with score += confidence * IoU, as sketched below.
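Concretely, the boosting step looks roughly like this. The 3-frame window and the conf > 0.01 cutoff come from the description above; HIGH_CONF, the per-frame prediction format, and the IoU helper are illustrative assumptions:

```python
HIGH_CONF = 0.5  # assumption: cutoff for "high confidence" attention boxes

def iou(a, b):
    """Standard IoU of two boxes given as [x1, y1, x2, y2]."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def boost_scores(preds):
    """preds[f] is a list of (box, score) tuples for frame f."""
    for n in range(len(preds)):
        # High-confidence boxes at frame N define the attention area.
        attention = [box for box, score in preds[n] if score > HIGH_CONF]
        for k in range(n + 1, min(n + 4, len(preds))):   # frames N+1 .. N+3
            for i, (box, score) in enumerate(preds[k]):
                if score <= 0.01:
                    continue
                best = max((iou(box, a) for a in attention), default=0.0)
                if best > 0:
                    # score += confidence * IoU
                    preds[k][i] = (box, score + score * best)
    return preds
```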
We also tried a tracking method, which gave us a CV gain of +0.002. However, it introduced two additional hyperparameters, so we chose not to use it.
At the beginning of the competition, each of the three members of our team used a different F2 implementation, and we later found that we computed different scores for the same OOF predictions.
For example, nvnn shared an OOF file with F2 = 0.62, while sheep calculated F2 = 0.66 and I calculated F2 = 0.68 on it.
We finally chose nvnn's F2 implementation, the one with the lowest score, to evaluate all our models.
Here’s our final F2 algorithm; if you are interested, you can use it to compare your CV with ours!
https://www.kaggle.com/haqishen/f2-evaluation/script
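For reference, a stripped-down F2 computation for box predictions looks like the sketch below. It greedily matches predictions (sorted by confidence) to ground truth at a single IoU threshold; the full competition metric averages over a range of IoU thresholds, so the linked notebook remains the exact implementation we used. It reuses the iou() helper from the earlier sketch:

```python
def f2_score(preds, gts, iou_thr=0.5):
    """preds: list of (box, score); gts: list of ground-truth boxes."""
    if not preds and not gts:
        return 1.0
    preds = sorted(preds, key=lambda p: -p[1])  # match confident boxes first
    matched = set()
    tp = 0
    for box, _ in preds:
        best_j, best_iou = -1, iou_thr
        for j, gt in enumerate(gts):
            if j in matched:
                continue
            v = iou(box, gt)
            if v >= best_iou:
                best_j, best_iou = j, v
        if best_j >= 0:
            matched.add(best_j)
            tp += 1
    fp = len(preds) - tp
    fn = len(gts) - tp
    # F-beta with beta = 2 weights recall 4x as much as precision:
    # F2 = 5*TP / (5*TP + 4*FN + FP)
    return 5 * tp / (5 * tp + 4 * fn + fp + 1e-9)
```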
As usual, I trained many models in this competition on a Z by HP Z8 G4 workstation with dual A6000 GPUs. The large 48 GB memory of a single GPU allowed me to train on high-resolution images with ease. Thanks to Z by HP for sponsoring!
Reference:
https://www.kaggle.com/competitions/tensorflow-great-barrier-reef/discussion/307878