How it Works


The Dockerfile is used to build an image, pushed to Amazon ECR, that the training instance runs. The Dockerfile installs the following important dependencies:

  • TensorFlow for GPU

  • Python 2 and 3

  • Coral retraining scripts

  • WPILib scripts


Images should be labelled in Supervisely and downloaded as JPEG + JSON inside a tar file. When the user calls the estimator's `fit("s3://bucket")` method, SageMaker automatically downloads the contents of that bucket/folder to /opt/ml/input/data/training inside the training instance.

A data-preparation script converts the tar into the two .record files and the .pbtxt file used by the retraining script. It automatically finds the ONLY tar in the specified folder and extracts it, then converts the JSON annotations into two large CSV files. The CSV files are in turn converted into .record files. Finally, the meta.json file is parsed to create the .pbtxt file, which is a label map.
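The first stages of that preparation step, locating the single tar under SageMaker's download directory and flattening the JSON annotations into CSV rows, might look roughly like this. The function names and the assumed annotation schema are illustrative, not the repository's actual scripts:

```python
import csv
import json
import tarfile
from pathlib import Path

# Where SageMaker places the downloaded training channel
DATA_DIR = Path("/opt/ml/input/data/training")

def find_only_tar(folder):
    """Return the single tar file in `folder`; error if there isn't exactly one."""
    tars = sorted(Path(folder).glob("*.tar"))
    if len(tars) != 1:
        raise RuntimeError(f"expected exactly one tar in {folder}, found {len(tars)}")
    return tars[0]

def extract(tar_path, dest):
    """Unpack the labelled-image archive next to the rest of the input data."""
    with tarfile.open(tar_path) as tar:
        tar.extractall(dest)

def jsons_to_csv(json_dir, csv_path):
    """Flatten Supervisely-style JSON annotations into one CSV row per object.
    The JSON layout assumed here (objects -> points -> exterior) is illustrative."""
    with open(csv_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["filename", "class", "xmin", "ymin", "xmax", "ymax"])
        for json_file in sorted(Path(json_dir).glob("*.json")):
            ann = json.loads(json_file.read_text())
            for obj in ann.get("objects", []):
                (x0, y0), (x1, y1) = obj["points"]["exterior"]
                writer.writerow([json_file.stem, obj["classTitle"], x0, y0, x1, y1])
```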


At the moment, the only hyperparameter you can change is the number of training steps. The hyperparameter dict specified in the notebook is written to /opt/ml/input/config/hyperparameters.json in the training instance, where the train script parses it and passes the value along when it invokes the retraining script.
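Reading that file back might be sketched as below. Note that SageMaker serializes all hyperparameter values as JSON strings, so the step count has to be cast back to an int; the key name `num_training_steps` and the default are assumptions:

```python
import json
from pathlib import Path

# Where SageMaker writes the hyperparameter dict from the notebook
HYPERPARAMS_PATH = Path("/opt/ml/input/config/hyperparameters.json")

def read_num_training_steps(path=HYPERPARAMS_PATH, default=500):
    """Return the training-step count from SageMaker's hyperparameter file.
    The key 'num_training_steps' and the default value are assumptions."""
    path = Path(path)
    if not path.exists():
        return default
    params = json.loads(path.read_text())
    # SageMaker stores every hyperparameter value as a string.
    return int(params.get("num_training_steps", default))
```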

Training calls the train script inside the training instance. It downloads checkpoints, creates the records, trains, converts the result to .tflite, and uploads it to S3.
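The train script's flow can be pictured as a simple ordered pipeline. The step names and placeholder commands below are assumptions based on the description above, not the script's real invocations:

```python
import subprocess

# Ordered stages the train script runs through; each command here is a
# placeholder standing in for the real invocation.
STEPS = [
    ("download checkpoints", ["echo", "fetch pretrained checkpoints"]),
    ("create records",       ["echo", "build .record files and .pbtxt"]),
    ("train",                ["echo", "run retraining for N steps"]),
    ("convert to tflite",    ["echo", "export the trained graph to output.tflite"]),
    ("stage model",          ["echo", "copy output.tflite to /opt/ml/model"]),
]

def run_pipeline(steps=STEPS):
    """Run each stage in order; a failure in any stage aborts the whole job."""
    completed = []
    for name, cmd in steps:
        subprocess.run(cmd, check=True, capture_output=True)
        completed.append(name)
    return completed
```

Running the stages sequentially with `check=True` means a broken conversion step fails the SageMaker job immediately instead of uploading a stale model.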


The output, output.tflite, is moved to /opt/ml/model/output.tflite. SageMaker then automatically uploads it to an S3 bucket it generates. You can find exactly where it was uploaded by opening the completed training job in the SageMaker console. The file will be inside a tar, which is itself inside another tar.
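Pulling output.tflite out of that nested archive can be done with Python's tarfile module. This is a sketch; the inner archive's exact name varies, so it simply extracts whatever tar the outer archive contains:

```python
import tarfile
from pathlib import Path

def extract_nested_tflite(outer_tar, dest):
    """Extract output.tflite from a tar nested inside another tar,
    which is how the SageMaker training output is packaged."""
    dest = Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    # First layer: unpack the outer archive downloaded from S3.
    with tarfile.open(outer_tar) as outer:
        outer.extractall(dest)
    # Second layer: unpack any inner tar the outer archive contained.
    for inner in dest.glob("*.tar*"):
        with tarfile.open(inner) as tar:
            tar.extractall(dest)
    return dest / "output.tflite"
```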