
So far, our camera is able to create a depth map and determine the distance of a pixel upon the click of a mouse.

But that’s just half a Terminator. We need our ‘terminator’ to be able to say: “There is a person standing there. The person is x meters away. Attack!”

If you haven’t read through Part 1 and Part 2 of this series, please do so to better understand this part!

How this will work

  1. Run SSD-Mobilenet-v2 Object Detection model using TensorRT.
  2. Combine the object detection with our Depth Map.
  3. Determine the centroid of the object detection bounding box.
  4. Map the (x…
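To make steps 2 and 3 concrete, here is a minimal sketch of taking a detection's bounding box, finding its centroid, and reading the distance at that pixel from the depth map. The depth map and box below are made-up stand-ins, not real detector or camera output:

```python
# Sketch of combining a detection with a depth map (all values are illustrative).
import numpy as np

depth_map = np.full((480, 640), 3.5)   # pretend every pixel is 3.5 m away
box = (100, 150, 300, 350)             # (x1, y1, x2, y2) from the detector (made up)

cx = (box[0] + box[2]) // 2            # centroid of the bounding box
cy = (box[1] + box[3]) // 2
distance = depth_map[cy, cx]           # depth maps index as (row, col), so (y, x)

print(cx, cy, distance)                # -> 200 250 3.5
```

Note the (y, x) indexing into the depth map: image arrays are row-major, so the vertical coordinate comes first.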

End product!

Welcome to Part 2 of this series! Now that we understand the concept of Stereo Vision (Part 1), let’s finally move on to creating our depth map.
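As a quick refresher on Part 1, the relationship a depth map is built on is depth = focal length × baseline / disparity. A tiny sketch with illustrative numbers (not this camera's actual calibration values):

```python
# Depth from disparity: Z = f * B / d.
# All numbers below are illustrative assumptions, not real calibration data.
f = 700.0   # focal length in pixels (assumption)
B = 0.06    # baseline between the two lenses in meters (assumption)
d = 21.0    # disparity of a matched pixel, in pixels (assumption)

Z = f * B / d   # depth in meters
print(Z)        # -> 2.0
```

The smaller the disparity between the two views, the farther away the point is, which is exactly the cue the depth map encodes.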

Let’s start this camera!

All the scripts we will be using for this tutorial are written in Python.

To get started, clone my StereoVision repository.


If you have not set up your Jetson Nano yet, please visit this site.

Assuming your Jetson Nano is up and running, let's install all the requirements for this project.

Since we will be working with Python 3.6, let's install

pip for Python 3: sudo apt-get install python3-pip


Have you ever wondered how Terminator robots actually see the world? How do they tell if an enemy is approaching?

The answer is in the two red demonic eyes you see in the picture above. Terminators have two cameras facing the same direction, which enables Binocular Vision.

In fact, the sci-fi robots in James Cameron's classic films are based almost entirely on the human body…and this includes the eyes.

Credits: Matterport, Inc

See that picture above? That is what this part is all about.

We will use Matterport’s pre-trained weights (mask_rcnn_balloon.h5) and annotated images to test how the model works and what to expect.

To test the model, we will be using a Jupyter Notebook!

  1. Head to your Anaconda Navigator and open a Terminal window in the SAME environment that we installed all the requirements in (in the previous part).
  2. Navigate to your ‘Mask-R-CNN’ directory that was cloned from GitHub in the previous part.
  3. Type in jupyter-notebook. This will open up our Jupyter Notebook!
  4. Navigate to /samples/balloon/

This part focuses on setting up our machine to run this model. We will learn how to use Anaconda’s virtual environments and how to clone the repository from GitHub.

Setting up your environment

It is ALWAYS better to use a virtual environment for projects.

That is something I have heard numerous times in the past month since I started my journey in Computer Vision & Machine Learning, and I have to say it is true.

Step 1 — Installing Anaconda on your machine

For those of us who are unaware, Anaconda is a Python (and R) distribution that makes package management and deployment easier.

To install Anaconda for your system, head to this…

No. Image Detection and Segmentation do not require a PhD.

Yes. It truly is amazing that we can access and use numerous libraries for free without understanding the complexities behind them.

This series mainly focuses on getting started with Image Segmentation using Mask Regional Convolutional Neural Network (MRCNN) using Python!

Before anything, I would like to credit Matterport, Inc for making this model so easily accessible. You can check out their repository and tutorial on how to get started with this model.

Although their tutorial is well-detailed, starting off as a complete beginner (like myself) can be extremely challenging. Therefore…

Aryan Vij

Machine Learning. CNN. Computer Vision. Looking to share simple and in-depth tutorials with the world! Sharing knowledge to help kick-start people’s interest!
