Hardware - NVIDIA Jetson Xavier NX
Software - JetPack 4.6 / TensorRT 8
TensorRT sample - /usr/src/tensorrt/samples/sampleOnnxMNIST
ONNX model - https://github.com/PINTO0309/PINTO_model_zoo/tree/main/081_MiDaS_v2
sampleOnnxMNIST recognizes handwritten digits from 0 to 9. I will modify this sample so that it runs MiDasV2 depth inference instead.
1. MiDasV2 Model
Get PINTO_model_zoo and download the MiDasV2 ONNX model:
git clone https://github.com/PINTO0309/PINTO_model_zoo.git
cd PINTO_model_zoo/081_MiDaS_v2
After a successful download, PINTO_model_zoo/081_MiDaS_v2/saved_model/model_float32.onnx is the custom ONNX model we will use.
Next we need the input and output dimensions of the model. The netron tool can show them:
pip install netron
export PATH=$PATH:${HOME}/.local/bin
netron PINTO_model_zoo/081_MiDaS_v2/saved_model/model_float32.onnx
netron displays all layers of the model in the browser. Open localhost:8080 in a browser.
The input layer is named inputs:0 and its dimensions are 1 x 256 x 256 x 3.
Since the model takes an image as input, I assume the four dimensions are batch x height x width x channel.
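To make the assumed batch x height x width x channel layout concrete, here is a minimal sketch of how a pixel maps to a flat buffer offset and how an 8-bit image could be packed into such a buffer. It uses only the standard library. nhwcOffset and packImageNHWC are hypothetical helper names, not part of TensorRT or OpenCV, and the layout itself is still the guess made above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Flat offset of element (n, h, w, c) in a tensor laid out as
// [batch, height, width, channel] (NHWC).
std::size_t nhwcOffset(std::size_t n, std::size_t h, std::size_t w, std::size_t c,
                       std::size_t height, std::size_t width, std::size_t channels)
{
    return ((n * height + h) * width + w) * channels + c;
}

// Copy one interleaved 8-bit image (height x width x channels) into batch
// slot n of an NHWC float tensor, scaling values from [0, 255] to [0, 1].
void packImageNHWC(const std::vector<std::uint8_t>& pixels, std::vector<float>& tensor,
                   std::size_t n, std::size_t height, std::size_t width, std::size_t channels)
{
    assert(pixels.size() == height * width * channels);
    assert(tensor.size() >= (n + 1) * height * width * channels);
    for (std::size_t h = 0; h < height; ++h)
    {
        for (std::size_t w = 0; w < width; ++w)
        {
            for (std::size_t c = 0; c < channels; ++c)
            {
                const std::size_t src = (h * width + w) * channels + c;
                tensor[nhwcOffset(n, h, w, c, height, width, channels)] = pixels[src] / 255.0f;
            }
        }
    }
}
```

For example, with height 256, width 256 and 3 channels, the element at h = 1, w = 2, c = 1 of the first image sits at offset (1 * 256 + 2) * 3 + 1 = 775.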
Scroll to the bottom of the page. The output is named Identity:0 and its dimensions are 1 x 256 x 256. Since the model outputs a depth map, I assume the three dimensions are batch x height x width.
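A float depth map cannot be shown directly as an 8-bit image, so it has to be rescaled first. The sketch below shows one common option, min-max normalization to 0..255, using only the standard library. depthToGray is a hypothetical helper, and the value range of Identity:0 is not documented here, which is why the sketch derives the range from the data instead of assuming one.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Map a float depth map to 8-bit gray values: the smallest input becomes 0,
// the largest becomes 255, everything else is scaled linearly in between.
std::vector<std::uint8_t> depthToGray(const std::vector<float>& depth)
{
    std::vector<std::uint8_t> gray(depth.size(), 0);
    if (depth.empty())
    {
        return gray;
    }
    const auto minMax = std::minmax_element(depth.begin(), depth.end());
    const float lowest = *minMax.first;
    const float range = *minMax.second - lowest;
    if (range <= 0.0f)
    {
        return gray; // Constant map: nothing to scale, leave it all zero.
    }
    for (std::size_t i = 0; i < depth.size(); ++i)
    {
        // Adding 0.5 before truncation rounds to the nearest integer.
        gray[i] = static_cast<std::uint8_t>((depth[i] - lowest) / range * 255.0f + 0.5f);
    }
    return gray;
}
```

With this approach the smallest and largest values of each map always land on the ends of the gray range, at the cost of the brightness scale changing from image to image.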
That's all we need to know about the model.
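These shapes also determine how large the input and output buffers are, which is handy as a sanity check when debugging. A minimal sketch, assuming both tensors hold 32-bit floats (the file name model_float32.onnx suggests this, but it is an assumption); elementCount and byteSize are hypothetical helpers:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Number of elements in a tensor with the given dimensions.
std::size_t elementCount(const std::vector<int>& dims)
{
    std::size_t count = 1;
    for (int d : dims)
    {
        assert(d >= 0); // Dynamic (negative) dimensions are not handled here.
        count *= static_cast<std::size_t>(d);
    }
    return count;
}

// Size in bytes of a float tensor with the given dimensions.
std::size_t byteSize(const std::vector<int>& dims)
{
    return elementCount(dims) * sizeof(float);
}
```

For the input, elementCount({1, 256, 256, 3}) is 196608 elements; for the output, elementCount({1, 256, 256}) is 65536 elements.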
2. Sample code
sudo -s
Copy the ONNX model file to the TensorRT sample data folder:
mkdir /usr/src/tensorrt/data/midas
cp PINTO_model_zoo/081_MiDaS_v2/saved_model/model_float32.onnx /usr/src/tensorrt/data/midas/
Copy the source image:
cp PINTO_model_zoo/081_MiDaS_v2/openvino/midasv2_small_256x256/FP16/dog.jpg /usr/src/tensorrt/bin
Create a new sample from sampleOnnxMNIST:
cd /usr/src/tensorrt/samples
cp -a sampleOnnxMNIST sampleOnnxMiDasV2
cd sampleOnnxMiDasV2
mv sampleOnnxMNIST.cpp sampleOnnxMiDasV2.cpp
Modify the Makefile:
--- ../sampleOnnxMNIST/Makefile 2021-06-26 08:17:31.000000000 +0800
+++ Makefile 2021-09-27 17:10:13.212404761 +0800
@@ -1,6 +1,8 @@
-OUTNAME_RELEASE = sample_onnx_mnist
-OUTNAME_DEBUG = sample_onnx_mnist_debug
+OUTNAME_RELEASE = sample_onnx_midasv2
+OUTNAME_DEBUG = sample_onnx_midasv2_debug
SAMPLE_DIR_NAME = $(shell basename $(dir $(abspath $(firstword $(MAKEFILE_LIST)))))
+COMMON_FLAGS = -I/usr/include/opencv4/opencv -I/usr/include/opencv4
+EXTRA_LIBS = -L/usr/lib/aarch64-linux-gnu/ -lopencv_dnn -lopencv_gapi -lopencv_highgui -lopencv_ml -lopencv_objdetect -lopencv_photo -lopencv_stitching -lopencv_video -lopencv_calib3d -lopencv_features2d -lopencv_flann -lopencv_videoio -lopencv_imgcodecs -lopencv_imgproc -lopencv_core
MAKEFILE ?= ../Makefile.config
include $(MAKEFILE)
Modify ../Makefile.config so that OpenCV is linked correctly:
$(ECHO) Linking: $@
- $(AT)$(CC) -o $@ $(LFLAGS) -Wl,--start-group $(LIBS) $^ -Wl,--end-group
+ $(AT)$(CC) -o $@ $(LFLAGS) -Wl,--start-group $(LIBS) $^ -Wl,--end-group $(EXTRA_LIBS)
$(ECHO) Linking: $@
- $(AT)$(CC) -o $@ $(LFLAGSD) -Wl,--start-group $(DLIBS) $^ -Wl,--end-group
+ $(AT)$(CC) -o $@ $(LFLAGSD) -Wl,--start-group $(DLIBS) $^ -Wl,--end-group $(EXTRA_LIBS)
The sample reads dog.jpg as the input for depth inference, then displays both dog.jpg and the resulting depth map on screen.
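One detail to watch in the read step: cv::imread returns pixels in BGR order. Whether this ONNX export expects RGB input is an assumption here, not something the model page confirms; if it does, the channels need to be swapped before they are copied into the input buffer. In OpenCV this can be done with cv::cvtColor and cv::COLOR_BGR2RGB. The dependency-free sketch below shows the same operation on a raw interleaved buffer; swapRedBlue is a hypothetical helper.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Swap the first and third channel of every pixel in an interleaved
// 3-channel 8-bit image, turning BGR into RGB (or RGB into BGR).
void swapRedBlue(std::vector<std::uint8_t>& pixels)
{
    assert(pixels.size() % 3 == 0);
    for (std::size_t i = 0; i + 2 < pixels.size(); i += 3)
    {
        std::swap(pixels[i], pixels[i + 2]);
    }
}
```

If the depth map looks plausible without the swap, the difference may be subtle, so comparing both variants against a reference output is the safer check.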
Source code of sampleOnnxMiDasV2.cpp
3. Build & run
make
cd ../../bin
./sample_onnx_midasv2
4. Diff from sampleOnnxMNIST.cpp
--- ../sampleOnnxMNIST/sampleOnnxMNIST.cpp 2021-06-26 08:17:31.000000000 +0800
+++ sampleOnnxMiDasV2.cpp 2021-09-27 16:49:44.045143887 +0800
@@ -15,11 +15,11 @@
-//! sampleOnnxMNIST.cpp
-//! This file contains the implementation of the ONNX MNIST sample. It creates the network using
-//! the MNIST onnx model.
+//! sampleOnnxMiDasV2.cpp
+//! This file contains the implementation of the ONNX MiDasV2 sample. It creates the network using
+//! the MiDasV2 onnx model.
//! It can be run with the following command line:
-//! Command: ./sample_onnx_mnist [-h or --help] [-d=/path/to/data/dir or --datadir=/path/to/data/dir]
+//! Command: ./sample_onnx_midasv2 [-h or --help] [-d=/path/to/data/dir or --datadir=/path/to/data/dir]
//! [--useDLACore=<int>]
@@ -37,18 +37,21 @@
#include <iostream>
#include <sstream>
+#include <opencv2/opencv.hpp>
using samplesCommon::SampleUniquePtr;
-const std::string gSampleName = "TensorRT.sample_onnx_mnist";
+const std::string gSampleName = "TensorRT.sample_onnx_midas";
-//! \brief The SampleOnnxMNIST class implements the ONNX MNIST sample
+//! \brief The SampleOnnxMiDasV2 class implements the ONNX MiDasV2 sample
//! \details It creates the network using an ONNX model
-class SampleOnnxMNIST
+class SampleOnnxMiDasV2
- SampleOnnxMNIST(const samplesCommon::OnnxSampleParams& params)
+ SampleOnnxMiDasV2(const samplesCommon::OnnxSampleParams& params)
: mParams(params)
, mEngine(nullptr)
@@ -74,7 +77,7 @@
std::shared_ptr<nvinfer1::ICudaEngine> mEngine; //!< The TensorRT engine used to run the network
- //! \brief Parses an ONNX model for MNIST and creates a TensorRT network
+ //! \brief Parses an ONNX model for MiDasV2 and creates a TensorRT network
bool constructNetwork(SampleUniquePtr<nvinfer1::IBuilder>& builder,
SampleUniquePtr<nvinfer1::INetworkDefinition>& network, SampleUniquePtr<nvinfer1::IBuilderConfig>& config,
@@ -83,23 +86,23 @@
//! \brief Reads the input and stores the result in a managed buffer
- bool processInput(const samplesCommon::BufferManager& buffers);
+ bool processInput(const samplesCommon::BufferManager& buffers, cv::Mat & image);
//! \brief Classifies digits and verify result
- bool verifyOutput(const samplesCommon::BufferManager& buffers);
+ bool verifyOutput(const samplesCommon::BufferManager& buffers, cv::Mat & originImage);
//! \brief Creates the network, configures the builder and creates the network engine
-//! \details This function creates the Onnx MNIST network by parsing the Onnx model and builds
-//! the engine that will be used to run MNIST (mEngine)
+//! \details This function creates the Onnx MiDasV2 network by parsing the Onnx model and builds
+//! the engine that will be used to run MiDasV2 (mEngine)
//! \return Returns true if the engine was created successfully and false otherwise
-bool SampleOnnxMNIST::build()
+bool SampleOnnxMiDasV2::build()
auto builder = SampleUniquePtr<nvinfer1::IBuilder>(nvinfer1::createInferBuilder(sample::gLogger.getTRTLogger()));
if (!builder)
@@ -162,24 +165,24 @@
ASSERT(network->getNbInputs() == 1);
mInputDims = network->getInput(0)->getDimensions();
- ASSERT(mInputDims.nbDims == 4);
+ ASSERT(mInputDims.nbDims == 4); // Input is 1 x 256 x 256 x 3
ASSERT(network->getNbOutputs() == 1);
mOutputDims = network->getOutput(0)->getDimensions();
- ASSERT(mOutputDims.nbDims == 2);
+ ASSERT(mOutputDims.nbDims == 3); // Output is 1 x 256 x 256
return true;
-//! \brief Uses a ONNX parser to create the Onnx MNIST Network and marks the
+//! \brief Uses a ONNX parser to create the Onnx MiDasV2 Network and marks the
//! output layers
-//! \param network Pointer to the network that will be populated with the Onnx MNIST network
+//! \param network Pointer to the network that will be populated with the Onnx MiDasV2 network
//! \param builder Pointer to the engine builder
-bool SampleOnnxMNIST::constructNetwork(SampleUniquePtr<nvinfer1::IBuilder>& builder,
+bool SampleOnnxMiDasV2::constructNetwork(SampleUniquePtr<nvinfer1::IBuilder>& builder,
SampleUniquePtr<nvinfer1::INetworkDefinition>& network, SampleUniquePtr<nvinfer1::IBuilderConfig>& config,
SampleUniquePtr<nvonnxparser::IParser>& parser)
@@ -212,9 +215,9 @@
//! \details This function is the main execution function of the sample. It allocates the buffer,
//! sets inputs and executes the engine.
-bool SampleOnnxMNIST::infer()
+bool SampleOnnxMiDasV2::infer()
- // Create RAII buffer manager object
+ // Create RAII buffer manager object
samplesCommon::BufferManager buffers(mEngine);
auto context = SampleUniquePtr<nvinfer1::IExecutionContext>(mEngine->createExecutionContext());
@@ -222,28 +225,29 @@
return false;
+ cv::Mat image = cv::imread("dog.jpg");
+ if (image.empty()) // imread returns an empty Mat when the file cannot be read
+ {
+ printf("failed to read dog.jpg\n");
+ return false;
+ }
// Read the input data into the managed buffers
ASSERT(mParams.inputTensorNames.size() == 1);
- if (!processInput(buffers))
+ if (!processInput(buffers, image))
return false;
// Memcpy from host input buffers to device input buffers
bool status = context->executeV2(buffers.getDeviceBindings().data());
if (!status)
return false;
// Memcpy from device output buffers to host output buffers
// Verify results
- if (!verifyOutput(buffers))
+ if (!verifyOutput(buffers, image))
return false;
@@ -254,31 +258,30 @@
//! \brief Reads the input and stores the result in a managed buffer
-bool SampleOnnxMNIST::processInput(const samplesCommon::BufferManager& buffers)
+bool SampleOnnxMiDasV2::processInput(const samplesCommon::BufferManager& buffers, cv::Mat & image)
- const int inputH = mInputDims.d[2];
- const int inputW = mInputDims.d[3];
+ const int inputChannels = mInputDims.d[3];
+ const int inputH = mInputDims.d[1];
+ const int inputW = mInputDims.d[2];
- // Read a random digit file
- srand(unsigned(time(nullptr)));
- std::vector<uint8_t> fileData(inputH * inputW);
- mNumber = rand() % 10;
- readPGMFile(locateFile(std::to_string(mNumber) + ".pgm", mParams.dataDirs), fileData.data(), inputH, inputW);
- // Print an ascii representation
- sample::gLogInfo << "Input:" << std::endl;
- for (int i = 0; i < inputH * inputW; i++)
- {
- sample::gLogInfo << (" .:-=+*#%@"[fileData[i] / 26]) << (((i + 1) % inputW) ? "" : "\n");
- }
- sample::gLogInfo << std::endl;
+ printf("inputs:0 - %d x %d x %d x %d\n", mInputDims.d[0], mInputDims.d[1], mInputDims.d[2], mInputDims.d[3]);
- float* hostDataBuffer = static_cast<float*>(buffers.getHostBuffer(mParams.inputTensorNames[0]));
- for (int i = 0; i < inputH * inputW; i++)
- {
- hostDataBuffer[i] = 1.0 - float(fileData[i] / 255.0);
- }
+ cv::Mat resized_image;
+ cv::resize(image, resized_image, cv::Size(inputW, inputH));
+ int batchIndex = 0;
+ int batchOffset = batchIndex * inputW * inputH * inputChannels;
+ float* hostDataBuffer = static_cast<float*>(buffers.getHostBuffer(mParams.inputTensorNames[0]));
+ // input shape [B,H,W,C]; channels are copied in the order OpenCV stores them (BGR after imread)
+ // inputs:0 - 1 x 256 x 256 x 3
+ for (int h = 0; h < inputH; h++) {
+ for (int w = 0; w < inputW; w++) {
+ for (int c = 0; c < inputChannels; c++) {
+ hostDataBuffer[batchOffset + (h * inputW + w) * inputChannels + c] =
+ float(resized_image.at<cv::Vec3b>(h, w)[c]) / 255.0f; // Divide by 255 to scale uint8_t color values to floats in [0, 1]
+ }
+ }
+ }
return true;
@@ -287,39 +290,27 @@
//! \return whether the classification output matches expectations
-bool SampleOnnxMNIST::verifyOutput(const samplesCommon::BufferManager& buffers)
+bool SampleOnnxMiDasV2::verifyOutput(const samplesCommon::BufferManager& buffers, cv::Mat & originImage )
- const int outputSize = mOutputDims.d[1];
float* output = static_cast<float*>(buffers.getHostBuffer(mParams.outputTensorNames[0]));
- float val{0.0f};
- int idx{0};
- // Calculate Softmax
- float sum{0.0f};
- for (int i = 0; i < outputSize; i++)
- {
- output[i] = exp(output[i]);
- sum += output[i];
- }
- sample::gLogInfo << "Output:" << std::endl;
- for (int i = 0; i < outputSize; i++)
- {
- output[i] /= sum;
- val = std::max(val, output[i]);
- if (val == output[i])
- {
- idx = i;
- }
- sample::gLogInfo << " Prob " << i << " " << std::fixed << std::setw(5) << std::setprecision(4) << output[i]
- << " "
- << "Class " << i << ": " << std::string(int(std::floor(output[i] * 10 + 0.5f)), '*')
- << std::endl;
- }
- sample::gLogInfo << std::endl;
- return idx == mNumber && val > 0.9f;
+ const int output0_row = mOutputDims.d[1];
+ const int output0_col = mOutputDims.d[2];
+ printf("Identity:0 - %d x %d x %d\n", mOutputDims.d[0], mOutputDims.d[1], mOutputDims.d[2]);
+ cv::Mat image = cv::Mat::zeros(cv::Size(output0_col, output0_row), CV_8U); // cv::Size takes (width, height)
+ for (int row = 0; row < output0_row; row++) {
+ for (int col = 0;col < output0_col; col++) {
+ image.at<uint8_t>(row, col) = cv::saturate_cast<uint8_t>(output[row * output0_col + col] / 8); // scale by 1/8 and clamp to 0..255
+ }
+ }
+ cv::imshow("img", image);
+ cv::imshow("orgimg", originImage);
+ cv::waitKey(0); // Block until a key is pressed
+ cv::destroyAllWindows();
+ return true;
@@ -330,16 +321,15 @@
samplesCommon::OnnxSampleParams params;
if (args.dataDirs.empty()) //!< Use default directories if user hasn't provided directory paths
- params.dataDirs.push_back("data/mnist/");
- params.dataDirs.push_back("data/samples/mnist/");
+ params.dataDirs.push_back("data/midas/");
else //!< Use the data directory provided by the user
params.dataDirs = args.dataDirs;
- params.onnxFileName = "mnist.onnx";
- params.inputTensorNames.push_back("Input3");
- params.outputTensorNames.push_back("Plus214_Output_0");
+ params.onnxFileName = "model_float32.onnx";
+ params.inputTensorNames.push_back("inputs:0");
+ params.outputTensorNames.push_back("Identity:0");
params.dlaCore = args.useDLACore;
params.int8 = args.runInInt8;
params.fp16 = args.runInFp16;
@@ -353,12 +343,12 @@
void printHelpInfo()
- << "Usage: ./sample_onnx_mnist [-h or --help] [-d or --datadir=<path to data directory>] [--useDLACore=<int>]"
+ << "Usage: ./sample_onnx_midasv2 [-h or --help] [-d or --datadir=<path to data directory>] [--useDLACore=<int>]"
<< std::endl;
std::cout << "--help Display help information" << std::endl;
std::cout << "--datadir Specify path to a data directory, overriding the default. This option can be used "
"multiple times to add multiple directories. If no data directories are given, the default is to use "
- "(data/samples/mnist/, data/mnist/)"
+ "(data/midas/)"
<< std::endl;
std::cout << "--useDLACore=N Specify a DLA engine for layers that support DLA. Value can range from 0 to n-1, "
"where n is the number of DLA engines on the platform."
@@ -387,9 +377,9 @@
- SampleOnnxMNIST sample(initializeSampleParams(args));
+ SampleOnnxMiDasV2 sample(initializeSampleParams(args));
- sample::gLogInfo << "Building and running a GPU inference engine for Onnx MNIST" << std::endl;
+ sample::gLogInfo << "Building and running a GPU inference engine for Onnx MiDasV2" << std::endl;
if (!sample.build())