Monday, February 7, 2022

Install TensorRT on Ubuntu 20.04



It took me a lot of time to get TensorRT working with Ubuntu 20.04 on my laptop.
A few issues made it even harder:

1. With the default NVIDIA driver from Ubuntu 20.04, the laptop failed to resume after suspend (hibernate).
The solution is to reinstall an older driver version, which limits CUDA to version 10.2.

sudo apt purge nvidia-* 
sudo apt autoremove
sudo apt install nvidia-driver-450-server

2. Ubuntu 20.04 defaults to Python 3.8, but this TensorRT release works with Python 3.6.

3. TensorRT doesn't support Ubuntu 20.04 with CUDA 10.2.

The solution is to install TensorRT inside a Python 3.6 virtualenv:

sudo apt install python3.6-venv

python3.6 -m venv tensorrt

source tensorrt/bin/activate

pip install --upgrade pip

python3 -m pip install numpy onnx

### Download & extract TensorRT-7.2.3.4.Ubuntu-18.04.x86_64-gnu.cuda-10.2.cudnn8.1.tar.gz

cd Downloads/

sudo cp -a TensorRT-7.2.3.4 /usr/local/

export LD_LIBRARY_PATH=/usr/local/TensorRT-7.2.3.4/lib

cd TensorRT-7.2.3.4/python/

python3 -m pip install tensorrt-7.2.3.4-cp36-none-linux_x86_64.whl

cd ../uff

python3 -m pip install uff-0.6.9-py2.py3-none-any.whl

which convert-to-uff

cd ../graphsurgeon/

python3 -m pip install graphsurgeon-0.4.5-py2.py3-none-any.whl

cd ../onnx_graphsurgeon/

python3 -m pip install onnx_graphsurgeon-0.2.6-py2.py3-none-any.whl

cd ../..

### Download libcudnn8_8.2.1.32-1+cuda10.2_amd64.deb & libcudnn8-dev_8.2.1.32-1+cuda10.2_amd64.deb

sudo dpkg -i ./libcudnn8_8.2.1.32-1+cuda10.2_amd64.deb
sudo dpkg -i ./libcudnn8-dev_8.2.1.32-1+cuda10.2_amd64.deb

pip3 install torch
pip3 install torchvision
pip3 install matplotlib

pip3 install --global-option=build_ext --global-option="-I/usr/local/cuda-10.2/targets/x86_64-linux/include/" --global-option="-L/usr/local/cuda-10.2/targets/x86_64-linux/lib/" pycuda

pip3 install opencv-python
pip3 install albumentations==0.5.2


Monday, September 27, 2021

TensorRT custom ONNX model in C++

This article describes how to run a custom ONNX model with TensorRT by modifying the TensorRT sample code sampleOnnxMNIST.

Hardware - nVidia Jetson NX Xavier
Software - Jetpack 4.6 / TensorRT 8
TensorRT sample - /usr/src/tensorrt/samples/sampleOnnxMNIST
ONNX model - https://github.com/PINTO0309/PINTO_model_zoo/tree/main/081_MiDaS_v2

The sampleOnnxMNIST sample detects handwritten digits from 0 to 9. I will make some changes to this sample to get it working with MiDasV2 depth inference.

1. MiDasV2 Model

Get PINTO_model_zoo and download the MiDasV2 ONNX model:

git clone https://github.com/PINTO0309/PINTO_model_zoo.git
cd PINTO_model_zoo/081_MiDaS_v2
./download_256x256.sh
cd

After the download succeeds, the file PINTO_model_zoo/081_MiDaS_v2/saved_model/model_float32.onnx is the custom ONNX model.

Now we need to know the input and output dimensions of the model. The netron tool will help with this.

pip install netron
export PATH=$PATH:${HOME}/.local/bin
netron PINTO_model_zoo/081_MiDaS_v2/saved_model/model_float32.onnx

netron will display all layers of the model in the browser. Open the URL localhost:8080 in a browser.


Now we know the input layer name is inputs:0 and its dimensions are 1 x 256 x 256 x 3.
Since the model takes an image input, I guess the four dimensions mean batch x height x width x channel.


Go to the bottom of the page: the output name is Identity:0 and its dimensions are 1 x 256 x 256. Since the model outputs a depth map, I guess the three dimensions mean batch x height x width.

That's all we need to know about the model.
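The guessed NHWC layout determines how pixel values must be flattened into the input buffer later. Here is a minimal sketch of that index arithmetic, assuming the 1 x 256 x 256 x 3 layout above (the function name is my own, not part of the sample):

```cpp
#include <cstddef>

// Flat-buffer offset for an NHWC tensor [batch, height, width, channels].
// Under this layout the channel values of one pixel sit next to each other
// in memory: walking c is a stride of 1, w a stride of C, h a stride of W*C.
inline std::size_t nhwcIndex(std::size_t b, std::size_t h, std::size_t w, std::size_t c,
                             std::size_t H, std::size_t W, std::size_t C)
{
    return ((b * H + h) * W + w) * C + c;
}
```

For the 1 x 256 x 256 x 3 input, pixel (h, w) channel c lands at (h * 256 + w) * 3 + c, which is the addressing the modified processInput uses later in this article.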

2. Sample code

sudo -s

Copy the ONNX model file to the TensorRT sample data folder:

mkdir /usr/src/tensorrt/data/midas
cp PINTO_model_zoo/081_MiDaS_v2/saved_model/model_float32.onnx /usr/src/tensorrt/data/midas/

Copy the source image:

cp PINTO_model_zoo/081_MiDaS_v2/openvino/midasv2_small_256x256/FP16/dog.jpg /usr/src/tensorrt/bin

Create a new sample from sampleOnnxMNIST:

cd /usr/src/tensorrt/samples
cp -a sampleOnnxMNIST sampleOnnxMiDasV2
cd sampleOnnxMiDasV2
mv sampleOnnxMNIST.cpp sampleOnnxMiDasV2.cpp

Modify the Makefile:

--- ../sampleOnnxMNIST/Makefile 2021-06-26 08:17:31.000000000 +0800
+++ Makefile 2021-09-27 17:10:13.212404761 +0800
@@ -1,6 +1,8 @@
-OUTNAME_RELEASE = sample_onnx_mnist
-OUTNAME_DEBUG   = sample_onnx_mnist_debug
+OUTNAME_RELEASE = sample_onnx_midasv2
+OUTNAME_DEBUG   = sample_onnx_midasv2_debug
 EXTRA_DIRECTORIES = ../common
 SAMPLE_DIR_NAME = $(shell basename $(dir $(abspath $(firstword $(MAKEFILE_LIST)))))
+COMMON_FLAGS = -I/usr/include/opencv4/opencv -I/usr/include/opencv4
+EXTRA_LIBS = -L/usr/lib/aarch64-linux-gnu/ -lopencv_dnn -lopencv_gapi -lopencv_highgui -lopencv_ml -lopencv_objdetect -lopencv_photo -lopencv_stitching -lopencv_video -lopencv_calib3d -lopencv_features2d -lopencv_flann -lopencv_videoio -lopencv_imgcodecs -lopencv_imgproc -lopencv_core
 MAKEFILE ?= ../Makefile.config
 include $(MAKEFILE)

Modify ../Makefile.config to get OpenCV correctly linked:

$(OUTDIR)/$(OUTNAME_RELEASE) : $(OBJS) $(CUOBJS)
        $(ECHO) Linking: $@
-         $(AT)$(CC) -o $@ $(LFLAGS) -Wl,--start-group $(LIBS) $^ -Wl,--end-group
+        $(AT)$(CC) -o $@ $(LFLAGS) -Wl,--start-group $(LIBS) $^ -Wl,--end-group $(EXTRA_LIBS)

$(OUTDIR)/$(OUTNAME_DEBUG) : $(DOBJS) $(CUDOBJS)
        $(ECHO) Linking: $@
-        $(AT)$(CC) -o $@ $(LFLAGSD) -Wl,--start-group $(DLIBS) $^ -Wl,--end-group
+        $(AT)$(CC) -o $@ $(LFLAGSD) -Wl,--start-group $(DLIBS) $^ -Wl,--end-group $(EXTRA_LIBS)

The whole story is to read dog.jpg as the input for depth inference, then display both dog.jpg and the resulting depth map on screen.
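Displaying the result means squashing the float depth map into 8-bit grayscale; the sample simply divides each value by 8. A standalone sketch of that conversion, with an added clamp so out-of-range values don't wrap in the cast (the function name and the clamp are my additions, not in the sample):

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Convert a row-major float depth map (rows x cols) to 8-bit grayscale.
// The divisor 8.0f mirrors the sample's choice, which assumes depth values
// stay below about 2040; anything larger is clamped instead of wrapping.
std::vector<std::uint8_t> depthToGray(const float* depth, int rows, int cols,
                                      float scale = 8.0f)
{
    std::vector<std::uint8_t> gray(static_cast<std::size_t>(rows) * cols);
    for (int i = 0; i < rows * cols; i++)
    {
        float v = depth[i] / scale;
        gray[i] = static_cast<std::uint8_t>(std::min(std::max(v, 0.0f), 255.0f));
    }
    return gray;
}
```

The resulting buffer can be wrapped in a cv::Mat of type CV_8U for cv::imshow, which is what verifyOutput does below.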

Source code of sampleOnnxMiDasV2.cpp

3. Build & run

make
cd ../../bin
./sample_onnx_midasv2




4. Diff from sampleOnnxMNIST.cpp

--- ../sampleOnnxMNIST/sampleOnnxMNIST.cpp 2021-06-26 08:17:31.000000000 +0800
+++ sampleOnnxMiDasV2.cpp 2021-09-27 16:49:44.045143887 +0800
@@ -15,11 +15,11 @@
  */
 
 //!
-//! sampleOnnxMNIST.cpp
-//! This file contains the implementation of the ONNX MNIST sample. It creates the network using
-//! the MNIST onnx model.
+//! sampleOnnxMiDasV2.cpp
+//! This file contains the implementation of the ONNX MiDasV2 sample. It creates the network using
+//! the MiDasV2 onnx model.
 //! It can be run with the following command line:
-//! Command: ./sample_onnx_mnist [-h or --help] [-d=/path/to/data/dir or --datadir=/path/to/data/dir]
+//! Command: ./sample_onnx_MiDasV2 [-h or --help] [-d=/path/to/data/dir or --datadir=/path/to/data/dir]
 //! [--useDLACore=<int>]
 //!
 
@@ -37,18 +37,21 @@
 #include <iostream>
 #include <sstream>
 
+#include <opencv2/opencv.hpp>
+
+
 using samplesCommon::SampleUniquePtr;
 
-const std::string gSampleName = "TensorRT.sample_onnx_mnist";
+const std::string gSampleName = "TensorRT.sample_onnx_midas";
 
-//! \brief  The SampleOnnxMNIST class implements the ONNX MNIST sample
+//! \brief  The SampleOnnxMiDasV2 class implements the ONNX MiDasV2 sample
 //!
 //! \details It creates the network using an ONNX model
 //!
-class SampleOnnxMNIST
+class SampleOnnxMiDasV2
 {
 public:
-    SampleOnnxMNIST(const samplesCommon::OnnxSampleParams& params)
+    SampleOnnxMiDasV2(const samplesCommon::OnnxSampleParams& params)
         : mParams(params)
         , mEngine(nullptr)
     {
@@ -74,7 +77,7 @@
     std::shared_ptr<nvinfer1::ICudaEngine> mEngine; //!< The TensorRT engine used to run the network
 
     //!
-    //! \brief Parses an ONNX model for MNIST and creates a TensorRT network
+    //! \brief Parses an ONNX model for MiDasV2 and creates a TensorRT network
     //!
     bool constructNetwork(SampleUniquePtr<nvinfer1::IBuilder>& builder,
         SampleUniquePtr<nvinfer1::INetworkDefinition>& network, SampleUniquePtr<nvinfer1::IBuilderConfig>& config,
@@ -83,23 +86,23 @@
     //!
     //! \brief Reads the input  and stores the result in a managed buffer
     //!
-    bool processInput(const samplesCommon::BufferManager& buffers);
+    bool processInput(const samplesCommon::BufferManager& buffers, cv::Mat & image);
 
     //!
     //! \brief Classifies digits and verify result
     //!
-    bool verifyOutput(const samplesCommon::BufferManager& buffers);
+    bool verifyOutput(const samplesCommon::BufferManager& buffers, cv::Mat & originImage);
 };
 
 //!
 //! \brief Creates the network, configures the builder and creates the network engine
 //!
-//! \details This function creates the Onnx MNIST network by parsing the Onnx model and builds
-//!          the engine that will be used to run MNIST (mEngine)
+//! \details This function creates the Onnx MiDasV2 network by parsing the Onnx model and builds
+//!          the engine that will be used to run MiDasV2 (mEngine)
 //!
 //! \return Returns true if the engine was created successfully and false otherwise
 //!
-bool SampleOnnxMNIST::build()
+bool SampleOnnxMiDasV2::build()
 {
     auto builder = SampleUniquePtr<nvinfer1::IBuilder>(nvinfer1::createInferBuilder(sample::gLogger.getTRTLogger()));
     if (!builder)
@@ -162,24 +165,24 @@
 
     ASSERT(network->getNbInputs() == 1);
     mInputDims = network->getInput(0)->getDimensions();
-    ASSERT(mInputDims.nbDims == 4);
+    ASSERT(mInputDims.nbDims == 4); // Input is 1 x 256 x 256 x 3 
 
     ASSERT(network->getNbOutputs() == 1);
     mOutputDims = network->getOutput(0)->getDimensions();
-    ASSERT(mOutputDims.nbDims == 2);
+    ASSERT(mOutputDims.nbDims == 3); // Output is 1 x 256 x 256
 
     return true;
 }
 
 //!
-//! \brief Uses a ONNX parser to create the Onnx MNIST Network and marks the
+//! \brief Uses a ONNX parser to create the Onnx MiDasV2 Network and marks the
 //!        output layers
 //!
-//! \param network Pointer to the network that will be populated with the Onnx MNIST network
+//! \param network Pointer to the network that will be populated with the Onnx MiDasV2 network
 //!
 //! \param builder Pointer to the engine builder
 //!
-bool SampleOnnxMNIST::constructNetwork(SampleUniquePtr<nvinfer1::IBuilder>& builder,
+bool SampleOnnxMiDasV2::constructNetwork(SampleUniquePtr<nvinfer1::IBuilder>& builder,
     SampleUniquePtr<nvinfer1::INetworkDefinition>& network, SampleUniquePtr<nvinfer1::IBuilderConfig>& config,
     SampleUniquePtr<nvonnxparser::IParser>& parser)
 {
@@ -212,9 +215,9 @@
 //! \details This function is the main execution function of the sample. It allocates the buffer,
 //!          sets inputs and executes the engine.
 //!
-bool SampleOnnxMNIST::infer()
+bool SampleOnnxMiDasV2::infer()
 {
-    // Create RAII buffer manager object
+    // Create RAII buffer manager object
     samplesCommon::BufferManager buffers(mEngine);
 
     auto context = SampleUniquePtr<nvinfer1::IExecutionContext>(mEngine->createExecutionContext());
@@ -222,28 +225,29 @@
     {
         return false;
     }
-
+    cv::Mat image = cv::imread("dog.jpg");
+    if (image.cols == 0 || image.rows == 0)
+    {
+        printf("image is empty\n");
+        return false;
+    }
     // Read the input data into the managed buffers
     ASSERT(mParams.inputTensorNames.size() == 1);
-    if (!processInput(buffers))
+    if (!processInput(buffers, image))
     {
         return false;
     }
-
     // Memcpy from host input buffers to device input buffers
     buffers.copyInputToDevice();
-
     bool status = context->executeV2(buffers.getDeviceBindings().data());
     if (!status)
     {
         return false;
     }
-
     // Memcpy from device output buffers to host output buffers
     buffers.copyOutputToHost();
-
     // Verify results
-    if (!verifyOutput(buffers))
+    if (!verifyOutput(buffers, image))
     {
         return false;
     }
@@ -254,31 +258,30 @@
 //!
 //! \brief Reads the input and stores the result in a managed buffer
 //!
-bool SampleOnnxMNIST::processInput(const samplesCommon::BufferManager& buffers)
+bool SampleOnnxMiDasV2::processInput(const samplesCommon::BufferManager& buffers, cv::Mat & image)
 {
-    const int inputH = mInputDims.d[2];
-    const int inputW = mInputDims.d[3];
+    const int inputChannels = mInputDims.d[3];
+    const int inputH = mInputDims.d[1];
+    const int inputW = mInputDims.d[2];
 
-    // Read a random digit file
-    srand(unsigned(time(nullptr)));
-    std::vector<uint8_t> fileData(inputH * inputW);
-    mNumber = rand() % 10;
-    readPGMFile(locateFile(std::to_string(mNumber) + ".pgm", mParams.dataDirs), fileData.data(), inputH, inputW);
-
-    // Print an ascii representation
-    sample::gLogInfo << "Input:" << std::endl;
-    for (int i = 0; i < inputH * inputW; i++)
-    {
-        sample::gLogInfo << (" .:-=+*#%@"[fileData[i] / 26]) << (((i + 1) % inputW) ? "" : "\n");
-    }
-    sample::gLogInfo << std::endl;
+    printf("inputs:0 - %d x %d x %d x %d\n", mInputDims.d[0], mInputDims.d[1], mInputDims.d[2], mInputDims.d[3]);
 
-    float* hostDataBuffer = static_cast<float*>(buffers.getHostBuffer(mParams.inputTensorNames[0]));
-    for (int i = 0; i < inputH * inputW; i++)
-    {
-        hostDataBuffer[i] = 1.0 - float(fileData[i] / 255.0);
-    }
+    cv::Mat resized_image;
+    cv::resize(image, resized_image, cv::Size(inputW, inputH));
 
+    int batchIndex = 0;
+    int batchOffset = batchIndex * inputW * inputH * inputChannels;
+    float* hostDataBuffer = static_cast<float*>(buffers.getHostBuffer(mParams.inputTensorNames[0]));
+    // input shape [B,H,W,C]
+    // inputs:0 - 1 x 256 x 256 x 3
+    for (size_t h = 0; h < inputH; h++) {
+        for (size_t w = 0; w < inputW; w++) {
+            for (size_t c = 0; c < inputChannels; c++) {
+                hostDataBuffer[batchOffset + (h * inputW + w) * inputChannels + c] =
+                    float(resized_image.at<cv::Vec3b>(h, w)[c]) / 255.0f; // Divide by 255.0 to convert uint8_t color to float
+            }
+        }
+    }
     return true;
 }
 
@@ -287,39 +290,27 @@
 //!
 //! \return whether the classification output matches expectations
 //!
-bool SampleOnnxMNIST::verifyOutput(const samplesCommon::BufferManager& buffers)
+bool SampleOnnxMiDasV2::verifyOutput(const samplesCommon::BufferManager& buffers, cv::Mat & originImage )
 {
-    const int outputSize = mOutputDims.d[1];
     float* output = static_cast<float*>(buffers.getHostBuffer(mParams.outputTensorNames[0]));
-    float val{0.0f};
-    int idx{0};
-
-    // Calculate Softmax
-    float sum{0.0f};
-    for (int i = 0; i < outputSize; i++)
-    {
-        output[i] = exp(output[i]);
-        sum += output[i];
-    }
-
-    sample::gLogInfo << "Output:" << std::endl;
-    for (int i = 0; i < outputSize; i++)
-    {
-        output[i] /= sum;
-        val = std::max(val, output[i]);
-        if (val == output[i])
-        {
-            idx = i;
-        }
-
-        sample::gLogInfo << " Prob " << i << "  " << std::fixed << std::setw(5) << std::setprecision(4) << output[i]
-                         << " "
-                         << "Class " << i << ": " << std::string(int(std::floor(output[i] * 10 + 0.5f)), '*')
-                         << std::endl;
-    }
-    sample::gLogInfo << std::endl;
-
-    return idx == mNumber && val > 0.9f;
+    const int output0_row = mOutputDims.d[1];
+    const int output0_col = mOutputDims.d[2];
+    
+    printf("Identity:0 - %d x %d x %d\n", mOutputDims.d[0], mOutputDims.d[1], mOutputDims.d[2]);
+    
+    cv::Mat image = cv::Mat::zeros(cv::Size(output0_col, output0_row), CV_8U); // cv::Size is (width, height)
+    for (int row = 0; row < output0_row; row++) {
+        for (int col = 0; col < output0_col; col++) {
+            image.at<uint8_t>(row, col) = (uint8_t)(*(output + (row * output0_col) + col) / 8);
+        }
+    }
+    
+    cv::imshow("img", image);
+    cv::imshow("orgimg", originImage);
+    int key = cv::waitKey(0);
+    cv::destroyAllWindows();
+    
+    return true;
 }
 
 //!
@@ -330,16 +321,15 @@
     samplesCommon::OnnxSampleParams params;
     if (args.dataDirs.empty()) //!< Use default directories if user hasn't provided directory paths
     {
-        params.dataDirs.push_back("data/mnist/");
-        params.dataDirs.push_back("data/samples/mnist/");
+        params.dataDirs.push_back("data/midas/");
     }
     else //!< Use the data directory provided by the user
     {
         params.dataDirs = args.dataDirs;
     }
-    params.onnxFileName = "mnist.onnx";
-    params.inputTensorNames.push_back("Input3");
-    params.outputTensorNames.push_back("Plus214_Output_0");
+    params.onnxFileName = "model_float32.onnx";
+    params.inputTensorNames.push_back("inputs:0");
+    params.outputTensorNames.push_back("Identity:0");
     params.dlaCore = args.useDLACore;
     params.int8 = args.runInInt8;
     params.fp16 = args.runInFp16;
@@ -353,12 +343,12 @@
 void printHelpInfo()
 {
     std::cout
-        << "Usage: ./sample_onnx_mnist [-h or --help] [-d or --datadir=<path to data directory>] [--useDLACore=<int>]"
+        << "Usage: ./sample_onnx_MiDasV2 [-h or --help] [-d or --datadir=<path to data directory>] [--useDLACore=<int>]"
         << std::endl;
     std::cout << "--help          Display help information" << std::endl;
     std::cout << "--datadir       Specify path to a data directory, overriding the default. This option can be used "
                  "multiple times to add multiple directories. If no data directories are given, the default is to use "
-                 "(data/samples/mnist/, data/mnist/)"
+                 "(data/samples/MiDasV2/, data/MiDasV2/)"
               << std::endl;
     std::cout << "--useDLACore=N  Specify a DLA engine for layers that support DLA. Value can range from 0 to n-1, "
                  "where n is the number of DLA engines on the platform."
@@ -387,9 +377,9 @@
 
     sample::gLogger.reportTestStart(sampleTest);
 
-    SampleOnnxMNIST sample(initializeSampleParams(args));
+    SampleOnnxMiDasV2 sample(initializeSampleParams(args));
 
-    sample::gLogInfo << "Building and running a GPU inference engine for Onnx MNIST" << std::endl;
+    sample::gLogInfo << "Building and running a GPU inference engine for Onnx MiDasV2" << std::endl;
 
     if (!sample.build())
     {