View Detections

Using the SmartML Inference Service to visualize model predictions

After training your model with SmartML, you will need to use our Inference Service to visualize your model predictions.

The Inference Service allows users to run SmartML models on images and return visualizations or detections. The current version is available under sixgill/smartml-inference:1.1.0-cuda11.0.

Example usage, assuming a SmartML output ZIP file is available as example-model.zip in the working directory:

docker container run --rm --gpus all --network="host" -e SMARTML_MODELS_DIR=/workspace/ \
-v $PWD/example-model.zip:/workspace/example-model.zip \
-it sixgill/smartml-inference:1.1.0-cuda11.0

More models can be made accessible by mounting them into the directory specified by SMARTML_MODELS_DIR. Each model is then addressed by the name of its ZIP file; e.g., inference for the model model-x.zip is available under /models/model-x/inference.
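For instance, two models can be served at once by mounting both ZIP files into the models directory (a sketch based on the single-model example above; model-x.zip is a placeholder name):

```shell
docker container run --rm --gpus all --network="host" -e SMARTML_MODELS_DIR=/workspace/ \
  -v $PWD/example-model.zip:/workspace/example-model.zip \
  -v $PWD/model-x.zip:/workspace/model-x.zip \
  -it sixgill/smartml-inference:1.1.0-cuda11.0
```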

API Spec

Each endpoint below lists its request format followed by its response format.

/form (GET)

Request Format: N/A

Response Format: An interactive form for performing inference visualization.

/models (GET)

Request Format: N/A

Response Format: A JSON list of all available model objects, each in the format returned by /models/<model_name>.

/models/<model_name> (GET)

Request Format: N/A

Response Format: A JSON object with the following keys:

  • id: The model ID.

  • loaded: Whether or not the model is currently loaded.

  • schema: An object with the following keys:

    • type: One of ‘featurePoints’, ‘polygon’, or ‘rectangle’.

    • classes: A list of class label names.

    • score_threshold: A threshold between 0.0 and 1.0 indicating the minimum confidence for reported detections. Detections with a lower confidence will not be returned.

    • preprocessor: A preprocessor description; see SmartML Backend for format details.

    • inference_resize_desc: An object describing the model resize policy. It has the following keys:

      • type: Either ‘none’ or ‘fixed’. ‘none’ indicates no resizing; the other arguments are then ignored. ‘fixed’ indicates resizing to a fixed width and/or height.

      • width (optional): The width in pixels to resize to for the ‘fixed’ type. If -1, the image is resized to the fixed height and the width is set to preserve the aspect ratio.

      • height (optional): The height in pixels to resize to for the ‘fixed’ type. If -1, the image is resized to the fixed width and the height is set to preserve the aspect ratio.
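As an illustration of the resize policy above, here is a minimal Python sketch of how the resulting dimensions could be computed. The function name and dict layout are assumptions for illustration, not part of the service:

```python
def resize_dimensions(orig_w, orig_h, desc):
    """Compute the (width, height) an image would be resized to,
    given a resize-policy object like the one described above."""
    if desc.get("type", "none") == "none":
        return orig_w, orig_h  # no resizing
    w = desc.get("width", -1)
    h = desc.get("height", -1)
    if w == -1 and h != -1:
        # Fixed height; width preserves the aspect ratio.
        return round(orig_w * h / orig_h), h
    if h == -1 and w != -1:
        # Fixed width; height preserves the aspect ratio.
        return w, round(orig_h * w / orig_w)
    return w, h  # both dimensions fixed
```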

/models/<model_name>/inference (POST)

Performs inference on an image with the specified model and returns a list of Sense format detections.

Request Format: Content-Type: multipart/form-data, with a single PNG or JPG file upload named image.

Response Format: A JSON object with the following keys:

  • ok: true/false indicator of whether the request was successful.

  • detections (present only if ok is true): A list of Sense format ‘annotations’ objects. Keys are class labels; values are lists of detections associated with that label.

  • error (present only if ok is false): A string describing the error encountered.
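A minimal Python sketch of handling this response body. The helper name and the merging of per-annotations dicts are assumptions for illustration; the service only specifies the ok, detections, and error keys:

```python
import json

def parse_inference_response(body):
    """Parse the JSON body returned by the inference endpoint.

    Returns a dict mapping class labels to lists of detections;
    raises RuntimeError on an error response.
    """
    resp = json.loads(body)
    if not resp.get("ok"):
        raise RuntimeError(resp.get("error", "unknown error"))
    merged = {}
    for annotations in resp["detections"]:
        # Each 'annotations' object maps class labels to detection lists.
        for label, dets in annotations.items():
            merged.setdefault(label, []).extend(dets)
    return merged
```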

/models/<model_name>/visualization (POST)

Performs inference on an image with the specified model and returns a visualization of the detections.

Request Format: Content-Type: multipart/form-data, with a single PNG or JPG file upload named image.

Response Format: A visualization rendered as a PNG image.

/models/<model_name>/load (POST)

Loads the specified model into GPU memory.

Request Format: Empty

Response Format: “OK”

/models/<model_name>/unload (POST)

Unloads the specified model from GPU memory.

Request Format: Empty

Response Format: “OK”

Failures

On failure, all API endpoints will return a JSON object with an “error” key describing the error.

Endpoints that require GPU memory can fail when GPU memory is full. In that case, reduce GPU memory usage by:

  • Unloading unused models.

  • Reducing the image sizes used for inference.

  • Terminating other processes using GPU memory on the same machine.
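The first suggestion can be automated against the /models listing. A sketch, where models_to_unload is a hypothetical helper built on the id and loaded keys documented for /models/<model_name>:

```python
def models_to_unload(models, keep):
    """Given the model list returned by /models, return the IDs of
    loaded models not in `keep`, i.e. candidates for a POST to
    /models/<model_name>/unload."""
    return [m["id"] for m in models if m["loaded"] and m["id"] not in keep]
```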