Introduction to OpenCV: The Basics You Need to Know

Shubham Gupta, September 27, 2024

What is OpenCV?

OpenCV (Open Source Computer Vision Library) is a powerful and widely used open-source library for computer vision tasks. Originally developed by Intel, it provides tools and functions for image processing, object detection, facial recognition, machine learning, and much more. OpenCV supports multiple programming languages, including Python, C++, and Java, making it accessible to developers from various backgrounds.

In this article, we’ll cover the basics of OpenCV, including installation, reading and displaying images, performing basic image processing operations, and an introduction to real-time object detection.

Getting Started with OpenCV

Installation

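OpenCV can be installed using Python’s package manager, pip. The commands below create a virtual environment, activate it, upgrade pip, and install OpenCV. The activation command shown is for Windows; on macOS or Linux, run source venv/bin/activate instead.

python -m venv venv
venv\Scripts\activate
python -m pip install --upgrade pip
pip install opencv-python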

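If you need the extra modules from the opencv_contrib repository, install opencv-contrib-python instead. It contains everything in opencv-python plus the contrib modules, so install only one of the two packages in the same environment: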

pip install opencv-contrib-python

Once installed, you can start using OpenCV by importing the library:

import cv2

Basic Operations with OpenCV

1. Reading and Displaying an Image

To begin working with OpenCV, you need to load an image into memory. You can use the cv2.imread() function to read an image from your system, and cv2.imshow() to display it in a new window.

Example:

import cv2

# Load an image
image = cv2.imread('example.jpg')

# Display the image
cv2.imshow('Image', image)

# Wait for a key press and close the window
cv2.waitKey(0)
cv2.destroyAllWindows()

cv2.imread(): Reads the image from the given path and returns it as a NumPy array in BGR channel order, or None if the file cannot be read.

cv2.imshow(): Opens a window to display the image.

cv2.waitKey(0): Waits indefinitely until a key is pressed.

cv2.destroyAllWindows(): Closes all OpenCV windows.

2. Converting Images to Grayscale

Many computer vision algorithms work on pixel intensity alone, so they are typically run on grayscale images, which also need only one channel per pixel instead of three. You can convert an image to grayscale using cv2.cvtColor().

import cv2

# Load an image
image = cv2.imread('tesla-roadster-new.jpeg')

# Convert the image to grayscale
gray_image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

# Display the grayscale image
cv2.imshow('Grayscale Image', gray_image)
cv2.waitKey(0)
cv2.destroyAllWindows()

Here, cv2.COLOR_BGR2GRAY is used to specify the conversion from a BGR (Blue, Green, Red) image to grayscale.

3. Saving an Image

You can save images after processing them with the cv2.imwrite() function.

import cv2

# Load an image
image = cv2.imread('tesla-roadster-new.jpeg')

# Convert the image to grayscale
gray_image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

# Save the grayscale image to disk
cv2.imwrite('gray_example.jpg', gray_image)

# Display the grayscale image
cv2.imshow('Grayscale Image', gray_image)
cv2.waitKey(0)
cv2.destroyAllWindows()

This saves the processed image in the current working directory under the specified filename. The file format is chosen from the filename extension.

4. Drawing Shapes on Images

OpenCV provides functions to draw basic shapes like rectangles, circles, and lines. For example, to draw a rectangle:

import cv2

# Load an image
image = cv2.imread('tesla-roadster-new.jpeg')

# Draw a blue rectangle on the image
cv2.rectangle(image, (50, 50), (200, 200), (255, 0, 0), 3)

# Display the image with the rectangle
cv2.imshow('Rectangle', image)
cv2.waitKey(0)
cv2.destroyAllWindows()

Parameters:

The first argument is the image where the rectangle will be drawn.
(50, 50) are the coordinates of the top-left corner.
(200, 200) are the coordinates of the bottom-right corner.
(255, 0, 0) is the color of the rectangle (in BGR format).
3 is the thickness of the rectangle’s border.

Other shapes like circles (cv2.circle()) and lines (cv2.line()) can also be drawn similarly.

Image Processing with OpenCV

OpenCV provides a variety of image processing tools. Let’s cover a few essential operations:

1. Blurring an Image

Blurring is often used to reduce noise or detail in an image. You can apply Gaussian blurring with cv2.GaussianBlur().

import cv2

# Load an image
image = cv2.imread('tesla-roadster-new.jpeg')

# Apply Gaussian blur to the image
blurred_image = cv2.GaussianBlur(image, (15, 15), 0)

# Display the blurred image
cv2.imshow('Blurred Image', blurred_image)
cv2.waitKey(0)
cv2.destroyAllWindows()

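In this case, (15, 15) is the kernel size (the larger the size, the stronger the blur). The third argument, 0, is the standard deviation in the X direction; passing 0 tells OpenCV to compute it from the kernel size.

Both dimensions of the kernel size must be positive odd numbers, such as (3, 3), (5, 5), or (7, 7).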

2. Edge Detection

One of the most common image processing tasks is edge detection. OpenCV implements the Canny edge detection algorithm as cv2.Canny(). The example below applies it to a single image:

import cv2

# Load an image
image = cv2.imread('tesla-roadster-new.jpeg')

# Apply Canny edge detection with thresholds 100 and 200
edges = cv2.Canny(image, 100, 200)

# Display the detected edges
cv2.imshow('Edges', edges)
cv2.waitKey(0)
cv2.destroyAllWindows()

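The same call can be applied to each frame of a video file, with the result written to a new video:

import cv2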

# Load the video
video = cv2.VideoCapture('car_video.mp4')

# Get video properties
frame_width = int(video.get(cv2.CAP_PROP_FRAME_WIDTH))
frame_height = int(video.get(cv2.CAP_PROP_FRAME_HEIGHT))
fps = video.get(cv2.CAP_PROP_FPS)  # keep as a float; rates such as 29.97 are common

# Define the codec and create VideoWriter object to save the output video
out = cv2.VideoWriter('car_video_edges.mp4', cv2.VideoWriter_fourcc(*'mp4v'), fps, (frame_width, frame_height), isColor=False)

while video.isOpened():
    # Read a frame from the video
    ret, frame = video.read()
    
    # Check if the frame was successfully read
    if not ret:
        break
    
    # Apply Canny edge detection to the frame
    edges = cv2.Canny(frame, 100, 200)
    
    # Write the frame to the output video
    out.write(edges)
    
    # Display the edge-detected frame (optional)
    cv2.imshow('Edge Detection', edges)
    
    # Exit when 'q' is pressed
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

# Release the video and close the display window
video.release()
out.release()
cv2.destroyAllWindows()

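In both examples, the arguments 100 and 200 are the lower and upper thresholds used for hysteresis. Pixels whose gradient magnitude exceeds the upper threshold are accepted as edges, pixels below the lower threshold are discarded, and pixels in between are kept only if they are connected to an accepted edge.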

Real-Time Object Detection with OpenCV

1. Face Detection

OpenCV includes pre-trained models like Haar Cascades, which can be used for face detection. First, download the haarcascade_frontalface_default.xml file from OpenCV’s GitHub repository. Then, use it to detect faces:

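import cv2

# Load the image to search for faces (replace 'example.jpg' with your own file)
image = cv2.imread('example.jpg')

# Load the Haar Cascade for face detection
face_cascade = cv2.CascadeClassifier('haarcascade_frontalface_default.xml')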

# Convert the image to grayscale for detection
gray_image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

# Detect faces in the image
faces = face_cascade.detectMultiScale(gray_image, scaleFactor=1.1, minNeighbors=5)

# Draw rectangles around detected faces
for (x, y, w, h) in faces:
    cv2.rectangle(image, (x, y), (x + w, y + h), (255, 0, 0), 2)

# Display the result
cv2.imshow('Face Detection', image)
cv2.waitKey(0)
cv2.destroyAllWindows()

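This will detect faces and draw rectangles around them. scaleFactor=1.1 shrinks the image by 10% at each step of the scale search, and minNeighbors=5 sets how many overlapping candidate detections are required before a face is accepted; raising it reduces false positives at the cost of missed faces.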

Real-Time Video Processing

You can also perform image processing in real time by accessing your webcam using OpenCV.

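import cv2

# Capture video from the default webcam (device index 0)
cap = cv2.VideoCapture(0)

while True:
    # Read each frame
    ret, frame = cap.read()

    # Stop if the frame could not be read (e.g., the camera is unavailable)
    if not ret:
        break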

    # Convert the frame to grayscale
    gray_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    # Display the resulting frame
    cv2.imshow('Video Feed', gray_frame)

    # Break the loop on pressing 'q'
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

# Release the capture and close windows
cap.release()
cv2.destroyAllWindows()

In this example, the video feed is converted to grayscale and displayed in real time. You can apply additional image processing steps (e.g., edge detection, face detection) within the loop.

Conclusion

OpenCV is a versatile and powerful tool for computer vision, offering functionality that ranges from basic image manipulation to object detection. With its pre-trained models and easy-to-use functions, OpenCV is an excellent resource for beginners and experienced developers alike.