
Start Python for Image Processing with OpenCV and ML - Part 3: Mask & ROI

In this part, I am going to introduce masks and the ROI (Region of Interest).

To start with, it is necessary to understand image thresholding. Image thresholding transforms the original grayscale image into a new one by modifying each pixel value according to how it compares with a threshold. In most cases we want a binary image, which means that each pixel takes one of only two values: 0, or a maximum value (conceptually 1, and usually 255 for 8-bit images in OpenCV). There are several methods we could use to do so. Here are some common, classic ones that are widely used in OpenCV.

Let’s see the example.

As you can see, cv2.THRESH_TRUNC, cv2.THRESH_TOZERO, and cv2.THRESH_TOZERO_INV do not binarize the image; of these, cv2.THRESH_TOZERO gives the result most similar to the original grayscale image.

The binary methods, cv2.THRESH_BINARY and cv2.THRESH_BINARY_INV, are the classic choices in practice, and their threshold value can be tuned for each image. In this example, however, the grassland at the bottom of the original image disappears with both of them.

To retain those elements, we can instead choose the last two methods, cv2.THRESH_BINARY + cv2.THRESH_OTSU and cv2.THRESH_BINARY + cv2.THRESH_TRIANGLE, which pick the threshold automatically from the image histogram. They are comparatively stable in this case, and in fact cv2.THRESH_BINARY + cv2.THRESH_OTSU is widely used in text detection because it performs well there.

(Note: you may want to try different threshold values to get different results in your own cases!)

Now let’s see a simple application of image thresholding. Suppose we have a background picture and another, smaller image. What if we want to paste the small image onto the bigger one so that it shows in front? This is easy with off-the-shelf image editing software such as Photoshop, but what if we want to do it in Python?

This can be separated into two steps. First, we need to get the mask of the small picture to extract the element we are interested in. Then we get the ROI in the large picture and apply the mask to set the pixels that the element will cover to 0. Look at the source below.

Now we have the two digital pictures ready.

Point 1: Get the mask of the element in the small picture.

In this step, we isolate the element we want to handle by turning the picture into a binary one: 0 for the background that we are not interested in, and a nonzero value (conceptually 1, usually 255 in an 8-bit OpenCV mask) for the foreground element we are going to extract.

Point 2: Get the ROI in the large picture and modify its pixel values.

To get the ROI, we slice out the pixel array at the location in the large picture where the small picture should go. Since the mask contains only zero and nonzero values, we use its nonzero area to black out the matching pixels of the ROI, which gives the first picture below. Then we add the original pixel values of the small picture to those of the ROI.

Finally, let’s see the result! The modified image is shown below.

Please feel free to contact us if you have any comments.

Stay tuned for the next part, coming next month!


