DIUx xView 2018 Detection Challenge--Tutorial

Steps to have your solution evaluated

This is an object detection Challenge: the task is to predict bounding boxes around each object in an input RGB image. Bounding boxes define the extent of each object in the image, and classification labels (one per bounding box) identify the type of object. A single image may contain a few objects, or many. This tutorial explains how to structure your inference code so it can be evaluated on the validation dataset and the results displayed on the Challenge leaderboard.

1. Download data

First, download the data

Go to the Dataset Download page to download three files: (i) train_images.zip contains 846 high-resolution RGB satellite images in TIFF format; (ii) train_labels.zip contains a GeoJSON file with bounding boxes, class labels, and metadata for each image in the training set; (iii) val_images.zip contains the 282 images of the validation set, a separate set of images used to compute the public leaderboard. Labels are not available for the validation set, but because the validation images are available for download, you can also generate predictions for them locally on your own. These are large files, so they may take some time to download.
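
Once train_labels.zip is decompressed, you can take a quick look at the labels with a few lines of Python. This is only a sketch: the filename and the property names ("image_id", "type_id") are assumptions, so check them against the GeoJSON file you actually extracted.

import json
from collections import Counter

# Quick look at the training labels. Filename and property names are
# assumptions -- verify them against the extracted GeoJSON.
with open("xView_train.geojson") as f:
    labels = json.load(f)

features = labels["features"]
per_image = Counter(feat["properties"]["image_id"] for feat in features)
per_class = Counter(feat["properties"]["type_id"] for feat in features)

print(len(features), "labeled objects across", len(per_image), "images")
print("most common class ids:", per_class.most_common(5))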

Inspect the images

The input data are RGB satellite images in TIFF format. Once the files have downloaded, decompress them and inspect a few! Some images contain many objects; others contain only a few. There are 60 classes of objects, with a variety of sizes, shapes, and contexts.
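
For example, here is a minimal sketch for opening one image and checking its dimensions. It assumes Pillow and numpy are installed; the filename is just a placeholder for any image you extracted.

import numpy as np
from PIL import Image

# Open one training image and report its size.
Image.MAX_IMAGE_PIXELS = None  # xView images are large; disable Pillow's size guard
image_path = "train_images/1047.tif"  # placeholder -- use any extracted image
img = np.array(Image.open(image_path))
print("shape (H, W, channels):", img.shape, "dtype:", img.dtype)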


2. Develop your solution

Train a model

Object detection models can be complex, with many parameters. The purpose of "training" a model is to generate a set of parameters that can be used for inference (i.e., to make predictions). This tutorial assumes you have trained a model. If you don't have a trained model of your own yet, you can start from the xView baseline model, which uses the Tensorflow object detection API. The inputs for this Challenge are RGB images, and the outputs are bounding boxes and class labels. Any choices of representation, model parameters, or other inner workings of a solution are up to you. However you choose to train your model, you must consider how it will be used to make predictions on new input data.

You can use any language or framework to develop/train your solution. The baseline model is available for you to inspect, download, and build upon if you wish. If you choose to use the same Tensorflow object detection API, then preparing inference code is as easy as adding a new checkpoint file to a container, as shown on the next page of this tutorial.

Inference

Although you will probably spend more time training and evaluating your model, inference is the focus of your submission to the Challenge. Your inference code must:

  1. Load state from your trained model (i.e., your latest set of trained parameters);
  2. Read a TIFF image as input;
  3. Write out bounding boxes and object classes as predicted by your model.

And that's it! You can write inference code in any language or framework, but to be able to evaluate your code, your submission needs to follow a few conventions:

Input

The inputs are RGB satellite images in TIFF format. Your inference code must:

  • read a single TIFF image as input
  • accept the path to that image as a command-line argument (a minimal sketch follows below)
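
As a rough illustration, inference code in Python might read that argument and load the image as shown below. The --input-image flag name matches the run.sh example later in this tutorial, but it is otherwise arbitrary; any convention works as long as your run.sh passes the path through.

import argparse

import numpy as np
from PIL import Image

# Minimal input-handling sketch: accept one TIFF path on the command line
# and load it as an array. The flag name is an illustrative choice.
parser = argparse.ArgumentParser()
parser.add_argument("--input-image", required=True, help="path to a single TIFF image")
args = parser.parse_args()

Image.MAX_IMAGE_PIXELS = None
image = np.array(Image.open(args.input_image))
print("loaded", args.input_image, "with shape", image.shape)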

The entire xView dataset has been split into three parts for the Challenge:

  • Training set: train_images.zip + train_labels.zip. Solvers should train models against these inputs.
  • Validation set: val_images.zip. Solvers should run inference code against these inputs to generate predictions, but will not have access to the validation labels during the Challenge. The labels are used to compute the public leaderboard which is displayed throughout the Challenge.
  • Holdout test set: after the Challenge deadline, a private leaderboard will be computed using this holdout set. Neither images nor labels from the holdout set are available for download.

Examples

Here's a link to example inference code using the Tensorflow object detection API.

Output

Inference code should generate a single output file for each input TIFF image. Each line in the output file represents one detected object. Your code must:

  • write a single text file with predictions for the single input TIFF image
  • name the output file after the input file, with ".txt" appended (e.g., predictions for 1234.tif go in 1234.tif.txt)
  • use the space-delimited, headerless flat-text format defined below

The space-delimited fields are: [ x_min, y_min, x_max, y_max, class_id, confidence ]

Example:

64 270 74 283 18 0.905418
245 18 253 28 18 0.905101
...

The coordinate values are the corners of the estimated bounding box as generated by your inference code: x_min and y_min define the top-left corner of the bounding box, and x_max and y_max define the lower-right corner.
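
To make the convention concrete, here is a minimal sketch of writing such a file. The detections argument is a stand-in for whatever your model actually returns; the model itself is out of scope here.

import os

def write_predictions(input_image_path, output_dir, detections):
    # detections is assumed to be an iterable of
    # (x_min, y_min, x_max, y_max, class_id, confidence) tuples from your model.
    out_name = os.path.basename(input_image_path) + ".txt"  # e.g. 1234.tif -> 1234.tif.txt
    out_path = os.path.join(output_dir, out_name)
    with open(out_path, "w") as f:
        for x_min, y_min, x_max, y_max, class_id, confidence in detections:
            f.write("%d %d %d %d %d %f\n"
                    % (x_min, y_min, x_max, y_max, class_id, confidence))

# Example usage with dummy values:
# write_predictions("/1234.tif", "/tmp", [(64, 270, 74, 283, 18, 0.905418)])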

Running your code

For the Challenge to support any language and framework, we need to define a common entry point to run inference code. You must:

  • Write a run.sh script to run your code
  • Make sure this script is present in your Docker container
  • Make sure this script is set as executable (e.g., using chmod +x run.sh)
  • Write the script to take two arguments, as in the example below. The first argument is a fully qualified path to a single input TIFF image. The second argument is a fully qualified path to an output directory.

Example: 
run.sh /absolute/path/to/a/test/image.tif /absolute/path/to/output/directory

where run.sh is a script that you provide.

For example, here's what your run.sh would look like if you are using python and Tensorflow and specifying the name of your saved model at runtime:

  #!/bin/bash
  python inference.py -c ./model.pb --input-image $1 --output-dir $2

Here's an example of what run.sh might look like if your code was written in R:

  #!/bin/bash
  Rscript --vanilla inference.R $1 $2

When your solution is submitted to the xView platform and evaluated on the validation set, it is parallelized and run many times to perform inference over the whole set of validation images. Each time your code is run, it sees a single input image, and should produce a single output file containing your predictions for the given input.

Score

Your code only needs to perform inference, not scoring; the evaluation system will automatically compute scores over your predictions. Since it is probably useful to know how your predictions will be scored, we have provided the scoring code that will be used by the xView app to score submissions and compute the leaderboard. When you submit a containerized solution through the app, your container is evaluated on the validation set. This dataset is quite large, so it may take some time for scores to be computed. You can monitor the progress of current and previous jobs on your Submissions page.

We recommend running scoring code locally to vet your inference model during development, using a small set of input images from the training set. You can run the following one-liner to test your model locally:

python3 score.py /path/to/all/your/prediction/files/ /path/to/labels --output /path/to/output/directory

This will generate metrics.txt and score.txt in the output directory. The value in the score.txt file is the mean average precision of your predictions; this is the value used by the xView app to rank your submissions against the other entries. Note: the provided scoring code must be run with Python 3. You can check your Python version with python --version.
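
If you want to script that local check, a wrapper along the following lines can run your entry point over a few training images and then call the provided scorer. All directory names are placeholders for your local layout, and the label path/format expected by score.py should be checked against the released scoring code.

import glob
import os
import subprocess

# Placeholders for your local layout.
IMAGE_DIR = "train_images_sample"   # a small subset of train_images
PRED_DIR = "local_predictions"      # where run.sh will write *.tif.txt files
LABEL_DIR = "train_labels"          # ground truth, in the form score.py expects
OUT_DIR = "local_scores"

os.makedirs(PRED_DIR, exist_ok=True)
os.makedirs(OUT_DIR, exist_ok=True)

# Generate predictions for each sample image using your run.sh entry point.
for image_path in sorted(glob.glob(os.path.join(IMAGE_DIR, "*.tif"))):
    subprocess.run(["./run.sh", os.path.abspath(image_path), os.path.abspath(PRED_DIR)],
                   check=True)

# Score the predictions with the provided scoring code.
subprocess.run(["python3", "score.py", PRED_DIR + os.sep, LABEL_DIR,
                "--output", OUT_DIR], check=True)
print("see score.txt and metrics.txt in", OUT_DIR)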

3. Prepare your submission


This section describes how to package your code for submission as a Docker container. Generally speaking you need to:

  • add your model/checkpoint to the Dockerfile
  • make sure the container includes all code needed for inference
  • make sure the container includes a run.sh script for your inference code

Using the containerized baseline solutions

We have released both trained model checkpoints and containerized solutions using pre-trained baseline model checkpoints. You can view the container images on Docker Hub; there are three different tagged versions, each corresponding to a trained model checkpoint as released on Github.

To inspect a container image, first you need to pull it to your local filesystem with a command like:

docker pull xview2018/baseline:vanilla-v1-1

Then you can run the container locally and inspect it:

docker run -it xview2018/baseline:vanilla-v1-1 /bin/bash

When inspecting the contents of the container you can use regular Linux shell commands (e.g., type ls to list the directory contents). You will see it has the following files:

  • a run.sh script that launches the inference code for a given input
  • inference code that loads a Tensorflow checkpoint and predicts bounding boxes
  • a requirements.txt file to install Python dependencies
  • a Tensorflow protobuf checkpoint, model.pb, which is used as input by the inference code
  • a Dockerfile that defines the container build process

Depending on which tagged container version you are inspecting, the contents of the model.pb file will differ. You can verify which checkpoint is in a container by comparing checksums between the GitHub release checkpoint file and the model.pb file in the container (e.g., shasum -a 1 FILENAME computes the SHA-1 hash).

If your trained model uses the Tensorflow object detection API, then you may not need to modify the inference code or scripts at all; you only need to load your trained checkpoint into the provided container in place of the existing model.pb file.

To load your own checkpoint into the container, edit the Dockerfile so that your checkpoint file is renamed to model.pb inside the working directory of the container (or, alternatively, edit the run.sh script to point at your checkpoint file instead of model.pb). You can either use a COPY command to explicitly add the file, or make sure your checkpoint is present in the same directory/context as your Dockerfile and edit the RUN mv vanilla.pb model.pb line in the Dockerfile so that model.pb is overwritten with your checkpoint.

Using your own inference code

You can use any language or framework you wish to develop a solution for the Challenge, as long as it follows the requirements above. Feel free to use the containerized solutions as a template.

Build your container

After you have developed your solution and prepared it as described above (and each time you modify something), you will need to rebuild the container. To do so, navigate to the directory that contains your model assets and your Dockerfile, and run a command like:

docker build . -t alice/xview-inference

To ensure you can test your container image locally, make sure you have included an RGB TIFF image in the container. Input images should be located in the root directory of your container. For example, you might add a line like ADD 1234.tif /1234.tif to your Dockerfile to include an example RGB image. The baseline containers already have a single example image located at /1047.tif.

Tag and push the container

You will need to push your image to a public registry so that the xView app can download it. You MUST tag your image with a version of some kind. If you don't explicitly tag your container, your image will default to using the latest tag, and then when you submit the container for evaluation, the system will see the old/stale version rather than any updates you may have made. So, use a meaningful tag, and consider using a programmatic method (such as a Makefile) to bump versions each time you build a container!

docker tag alice/xview-inference alice/xview-inference:v3
docker push alice/xview-inference:v3

Test your image

Before submitting your containerized solution to xView for evaluation, run a quick local test to make sure the code respects the input/output requirements and gives the output you expect. An easy way to test this is:

$ docker run alice/xview-inference:v3 bash -c './run.sh /1234.tif /tmp && cat /tmp/1234.tif.txt'

And the output (from /tmp/1234.tif.txt) should look something like this:

35 11 61 63 60 0.28
3 32 99 78 51 0.58
14 19 96 80 87 0.19
13 20 87 66 60 0.09
39 35 80 83 63 0.16
1 1 76 81 37 0.35
...

This will validate that your code runs against a real input (assuming you've included a /1234.tif in your Docker image) and produces a real output in the right place. If you are using the containerized baselines, replace 1234.tif in the example above with 1047.tif, which is actually present in the baseline containers.
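
Beyond eyeballing the numbers, a quick format check on the prediction file can catch mistakes before you submit. The sketch below assumes confidences are reported in the 0 to 1 range, as in the examples above.

def check_prediction_file(path):
    # Verify the "x_min y_min x_max y_max class_id confidence" format.
    with open(path) as f:
        for lineno, line in enumerate(f, start=1):
            fields = line.split()
            assert len(fields) == 6, "line %d: expected 6 fields, got %d" % (lineno, len(fields))
            x_min, y_min, x_max, y_max = (float(v) for v in fields[:4])
            class_id, confidence = int(fields[4]), float(fields[5])
            assert x_min <= x_max and y_min <= y_max, "line %d: degenerate box" % lineno
            assert 0.0 <= confidence <= 1.0, "line %d: confidence out of range" % lineno
    print(path, "looks well formed")

# check_prediction_file("/tmp/1234.tif.txt")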

4. Submit for evaluation

To submit for evaluation:

  • Navigate to the Submit page
  • Enter information for your container image. Make sure you are using a unique tag. Write a description that will help you keep track of this submission as you refer back to it over the course of the Challenge. The descriptions are only displayed for you, and will not be shown on the leaderboard or to any other users.
  • Agree to terms by checking the box.
  • Make sure you don't have any other submissions in the queue for evaluation. You may only have one submission in evaluation at a time; a new submission will interrupt any other submission in the queue. Previous submissions which have finished evaluation (including any submissions which have returned an error) will not be impacted.
  • Click Submit.
  • Check the "My Submissions" tab to see the current state of your submission. Does your submission return an error? Many types of errors will return quickly, within a few minutes. There are many pixels and many objects in xView, so evaluation is very compute-intensive. After your submission is queued for evaluation, it may take a few hours for your submission to be fully evaluated.

After your submission is scored, if the score is higher than any of your previous submissions, it will be ranked. If it is in the top 50 scores from all solvers, it will appear on the public leaderboard. For each submission, you can always refer back to the My Submissions page to see its status (errors, scores, metrics, etc.).


