Python OpenCV & ESP32 Cam based DIY Security Surveillance Camera

Python OpenCV & ESP32 Cam:


Python OpenCV & ESP32 Cam based DIY Security Surveillance Camera: Using Python, OpenCV, MediaPipe, and an ESP32 Cam module, we are going to build an advanced DIY security camera (surveillance camera). Because intruders are detected in software from the live video, this system eliminates the need to physically install laser sensors and motion sensors.

The idea is to draw a virtual laser, which is simply a line overlaid on the camera image. If anyone crosses this virtual laser, a buzzer connected to an Arduino board is activated. Currently, I have used just one virtual laser, but you can use multiple virtual lasers if you wish. You can also draw rectangular, circular, and even irregular shapes to define a specific area. Then all you need to do is track the landmarks within those particular areas.
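An irregular area can be described as a list of corner points, and a tracked landmark can then be tested against it. Here is a minimal sketch of that idea using the standard ray-casting test in plain Python. The function name point_in_polygon and the example coordinates are my own assumptions for illustration and are not taken from the project code; OpenCV's cv2.pointPolygonTest offers a comparable check.

```python
def point_in_polygon(x, y, polygon):
    """Return True if the pixel (x, y) lies inside the polygon.

    polygon is a list of (x, y) corner points in drawing order.
    Ray-casting rule: count how many edges a horizontal ray starting
    at the point crosses; an odd count means the point is inside.
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through y?
        if (y1 > y) != (y2 > y):
            # x-coordinate where the edge meets that horizontal line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Hypothetical irregular zone in pixel coordinates, for illustration only
zone = [(100, 100), (400, 120), (450, 300), (250, 380), (80, 250)]
print(point_in_polygon(250, 200, zone))  # True: the point is inside the zone
print(point_in_polygon(20, 20, zone))    # False: the point is outside the zone
```

With a check like this, the buzzer decision becomes "is the landmark inside the zone?" instead of "is the landmark past the line?".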

You can also place this virtual laser above a wall, so if an intruder comes over the wall, the buzzer will be activated. Not only that, you can define a specific area, and whenever someone enters or exits that area, the buzzer will be activated. The same concept can be applied to different objects and animals as well.
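Detecting that someone enters or exits an area comes down to remembering whether the tracked point was inside the area on the previous frame. The following is a small sketch of that idea for a rectangular area. The class name ZoneMonitor, the zone coordinates, and the sample positions are hypothetical and only serve as an illustration.

```python
class ZoneMonitor:
    """Track whether a point is inside a rectangular zone and report changes."""

    def __init__(self, left, top, right, bottom):
        self.left, self.top, self.right, self.bottom = left, top, right, bottom
        self.inside = False  # assumption: the zone is empty at start-up

    def update(self, x, y):
        """Return "enter", "exit", or None for the latest point position."""
        now_inside = self.left <= x <= self.right and self.top <= y <= self.bottom
        event = None
        if now_inside and not self.inside:
            event = "enter"
        elif not now_inside and self.inside:
            event = "exit"
        self.inside = now_inside
        return event


monitor = ZoneMonitor(left=100, top=50, right=300, bottom=250)  # hypothetical zone
for position in [(20, 60), (150, 100), (200, 120), (350, 120)]:
    print(position, monitor.update(*position))
```

In this run, the second position produces "enter" and the last one produces "exit"; either event could be used to trigger the buzzer.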

The system is based on pose landmark detection, which identifies and tracks distinctive features of a person’s body pose, such as the positions of joints and body parts. By using this approach, we can detect and track people with a higher level of accuracy, even under challenging conditions such as low light. This helps to minimize false positives and false negatives, which reduces unnecessary alarms.

Because it tracks people rather than relying on physical sensors, the pose landmark based security system adds an extra layer of protection and makes the security system more reliable. In this article, we will integrate the ESP32 Cam with Python and OpenCV to build it.

Later in this article, I will talk about pose landmarks and their keypoints, which represent the positions of various joints and body parts within a human pose.

Anyway, this project is entirely based on my previous two tutorials. In the first tutorial, “ESP32 Object detection and Identification”, I explained the most basic things, such as:

  1. How to perform wireless live video streaming using the ESP32 Cam module.
  2. How to install Python, OpenCV, and Yolo V3.
  3. How to detect and identify different objects.

In my studio, I detected and identified various objects, and not only did I identify and track birds and cats, but I also displayed alert messages on the screen.

In the second tutorial “ESP32 Cam based Car Barrier/Gate control system”, I created an automatic car barrier/gate opening and closing system. I used the ESP32 Camera module along with Python OpenCV Yolo V3 for car identification and tracking. In this project, I used two lines to control the car barrier. When the car crossed the first line, the barrier would open, and when the car crossed the second line, the barrier would close.

So, I have already explained all of these things, and I won’t repeat them today. Today, I will only explain the new things, including:

  1. Pose landmarks.
  2. How to install MediaPipe, a powerful library that provides easy-to-use Python APIs for various tasks, including landmark detection.
  3. The Arduino circuit diagram and programming.
  4. Python programming.

So, without any further delay let’s get started!!!

Amazon Links:

ESP32 Camera Module

ESP32 CAM W-BT Board

Types of Landmarks in Python:

First, let’s start with the types of landmarks. There are mainly three types:

  1. Pose Landmarks.
  2. Facial Landmarks.
  3. Hand Landmarks.

I will explain and use the Facial and Hand Landmarks in one of my upcoming videos. In this particular project, we will focus only on Pose Landmarks.

Pose Landmarks:

Pose landmarks are the specific points, or keypoints, that represent the positions of various joints and body parts within a human pose. These landmarks are typically detected and tracked using computer vision techniques and libraries, many of which are available in Python.

Pose estimation involves identifying and tracking the positions of joints, such as shoulders, elbows, wrists, hips, knees, and ankles, to infer the overall body pose and its orientation. By detecting and analyzing pose landmarks, it becomes possible to understand the structural configuration and movement of a person’s body.

Python provides several libraries and frameworks that enable pose landmark detection and analysis. One popular library for this purpose is OpenCV (Open Source Computer Vision Library). OpenCV offers a range of functionalities, including running pre-trained pose estimation models through its dnn module.

Another widely used library for pose estimation in Python is MediaPipe. MediaPipe provides a set of pre-trained models and tools specifically designed for landmark detection and tracking in real-time applications, including pose estimation.

Pose landmarks can be used for various applications, such as activity recognition, motion analysis, human-computer interaction, sports analytics, and augmented reality. By understanding the body pose and its movements, it becomes possible to develop applications that can respond to or analyze human actions and gestures.

Using Python’s libraries and tools for pose landmark detection, developers and researchers can build sophisticated applications that rely on understanding and interpreting human poses. These applications have the potential to revolutionize areas such as healthcare, gaming, sports, animation, and human-computer interaction.

Anyway, we have got multiple keypoints, and the good thing is that we can detect and track any of them. You are free to use all the keypoints, some of them, or a single keypoint; it’s totally up to you. In my case, I am going to detect one specific landmark, landmark 31 (the left foot index in MediaPipe’s numbering), on a person’s body using the MediaPipe library in Python. So, whenever this landmark crosses a line, or virtual laser, the buzzer is turned ON.
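For reference, the keypoint numbers can be written down as a simple lookup list. The names below follow MediaPipe's pose landmark numbering (33 landmarks, indexed 0 to 32), written in lowercase; the list name POSE_LANDMARK_NAMES and the constant TRACKED_LANDMARK are just labels I chose for this sketch.

```python
# Names of the 33 MediaPipe pose landmarks, listed in index order (0 to 32).
POSE_LANDMARK_NAMES = [
    "nose",
    "left_eye_inner", "left_eye", "left_eye_outer",
    "right_eye_inner", "right_eye", "right_eye_outer",
    "left_ear", "right_ear",
    "mouth_left", "mouth_right",
    "left_shoulder", "right_shoulder",
    "left_elbow", "right_elbow",
    "left_wrist", "right_wrist",
    "left_pinky", "right_pinky",
    "left_index", "right_index",
    "left_thumb", "right_thumb",
    "left_hip", "right_hip",
    "left_knee", "right_knee",
    "left_ankle", "right_ankle",
    "left_heel", "right_heel",
    "left_foot_index", "right_foot_index",
]

TRACKED_LANDMARK = 31  # the landmark used in this project
print(len(POSE_LANDMARK_NAMES))
print(TRACKED_LANDMARK, POSE_LANDMARK_NAMES[TRACKED_LANDMARK])
```

Changing TRACKED_LANDMARK to another index, for example 15 for the left wrist, is all it takes to follow a different body part.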

Once you have learned how to track the X-axis and Y-axis location of a single keypoint on the body, you can do the same for all the other keypoints, and then you will be able to detect and identify any pose.
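MediaPipe reports the x and y of each landmark as a fraction of the image width and height (values between 0.0 and 1.0), so they have to be scaled before they can be compared with pixel positions such as a line drawn on the image. A minimal sketch of that conversion is shown here; the function name to_pixel and the sample values are my own and used only for illustration.

```python
def to_pixel(x_norm, y_norm, image_width, image_height):
    """Convert normalized landmark coordinates (0.0 to 1.0) to integer pixels."""
    return int(x_norm * image_width), int(y_norm * image_height)


# Hypothetical landmark at 25% of the width and 50% of the height of an 800x600 frame
print(to_pixel(0.25, 0.5, 800, 600))  # (200, 300)
```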

Let’s say you want to turn ON the buzzer when the person’s hand is up in the air and turn it OFF when the hand is down. For this, we track the X-axis and Y-axis location of any of the keypoints on the hand. When that landmark’s Y-axis location is above keypoint 7 (the left ear) or any other keypoint on the head, the buzzer turns ON, and when its Y-axis location is below keypoint 23 (the left hip), the buzzer turns OFF. Keep in mind that in image coordinates the Y value increases downward, so “above” means a smaller Y value.
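The hand-up example can be sketched in a few lines. I am assuming the left wrist (landmark 15) as the hand keypoint and pixel Y values that are already available for each landmark; the function name update_buzzer and the sample heights are hypothetical.

```python
LEFT_EAR, LEFT_WRIST, LEFT_HIP = 7, 15, 23  # MediaPipe pose landmark indices


def update_buzzer(buzzer_on, y):
    """Return the new buzzer state from a mapping of landmark index -> y pixel.

    Image y grows downward, so "above" means a smaller y value.
    """
    if y[LEFT_WRIST] < y[LEFT_EAR]:
        return True    # hand above keypoint 7: buzzer ON
    if y[LEFT_WRIST] > y[LEFT_HIP]:
        return False   # hand below keypoint 23: buzzer OFF
    return buzzer_on   # in between: keep the previous state


state = False
for wrist_y in (400, 90, 250, 420):  # hypothetical wrist heights in pixels
    state = update_buzzer(state, {LEFT_EAR: 120, LEFT_WRIST: wrist_y, LEFT_HIP: 350})
    print(wrist_y, state)
```

Using two different thresholds (ear and hip) means the buzzer does not flicker while the hand is somewhere in between.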

Using a similar technique, you can find out whether the person is standing, sitting, walking, or has their arms stretched out, and you can even count biceps reps, and so on.
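One common way to count biceps reps is to compute the elbow angle from the shoulder, elbow, and wrist keypoints (landmarks 11, 13, and 15 for the left arm) and count each move from an extended arm to a flexed arm. This is only a sketch of that approach: the function names and the 160 and 40 degree thresholds are assumptions that would need tuning for a real camera setup.

```python
import math


def joint_angle(a, b, c):
    """Angle in degrees at point b formed by the segments b->a and b->c."""
    angle = math.degrees(
        math.atan2(c[1] - b[1], c[0] - b[0]) - math.atan2(a[1] - b[1], a[0] - b[0])
    )
    angle = abs(angle)
    return 360 - angle if angle > 180 else angle


def count_reps(elbow_angles, down_angle=160, up_angle=40):
    """Count curls: one rep is a move from extended (down) to flexed (up)."""
    reps, stage = 0, None
    for angle in elbow_angles:
        if angle > down_angle:
            stage = "down"
        elif angle < up_angle and stage == "down":
            stage = "up"
            reps += 1
    return reps


# Hypothetical shoulder, elbow, and wrist pixels forming a right angle at the elbow
print(round(joint_angle((100, 100), (100, 200), (200, 200))))  # 90
# Hypothetical sequence of elbow angles over several frames
print(count_reps([170, 120, 35, 90, 165, 30]))  # 2
```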

Now, let’s go ahead and take a look at the Arduino circuit diagram.

Buzzer interfacing with Arduino:

The 5V buzzer is controlled from the Arduino digital pin D8. I am using a 2N2222 NPN transistor and a 10K ohm resistor as a driver to control the buzzer. Now, let’s go ahead and take a look at the Arduino programming.

Arduino Programming:

For this project, you don’t need to add any library. The buzzer is connected to the Arduino digital pin D8.

In the void setup() function, I set the buzzer pin as an output using the pinMode() function, and I also start serial communication at a baud rate of 9600.

In the void loop() function, we constantly check the serial port. If data has been received from Python and is available on the serial port, we read it and store the received character in the variable signal. Then, using two if conditions, we check whether the received character is 1 or 0. If it is 1, someone has crossed the line (the virtual laser) and the buzzer is turned ON. If no one has crossed the line, Python sends 0 and the Arduino turns the buzzer OFF. So, that’s all about the Arduino programming.
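To try out this receiving logic without any hardware, it can be modelled in Python. This is only a simulation of the behaviour described above and not the Arduino sketch itself; the class name BuzzerModel and its receive() method are hypothetical, and ignoring characters other than 1 and 0 is my assumption.

```python
class BuzzerModel:
    """Python stand-in for the Arduino loop: '1' turns the buzzer on, '0' off."""

    def __init__(self):
        self.buzzer_on = False  # mirrors the state of digital pin D8

    def receive(self, data):
        """Process incoming serial bytes one character at a time."""
        for byte in data:
            signal = chr(byte)
            if signal == "1":
                self.buzzer_on = True
            elif signal == "0":
                self.buzzer_on = False
            # any other character is ignored (assumption)
        return self.buzzer_on


arduino = BuzzerModel()
print(arduino.receive(b"1"))  # True: buzzer ON
print(arduino.receive(b"0"))  # False: buzzer OFF
```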

ESP32 Cam with Python:

For the live video streaming, I am using the ESP32 Cam module. As I said earlier, I am not going to explain how to set up your ESP32 camera for live video streaming, because I have already explained it in my previous article on object detection and identification using the ESP32 Cam, Python, OpenCV, and Yolo V3. I am using the same setup and nothing has changed. In that article, I also explained how to install Python and OpenCV.

The only thing that I didn’t cover is the MediaPipe library installation, because in the previous two projects I used Yolo V3. This time around, I am not using Yolo V3; I am just using Python, OpenCV, and MediaPipe. So, if you have installed Python and OpenCV, you can continue reading this article. If not, go back and read that article, and once you have installed Python and OpenCV, resume from here. Anyway, let’s go ahead and install the MediaPipe library.

MediaPipe Library installation:

MediaPipe is a powerful library for building multimodal (audio, video, and sensor) applied machine learning pipelines. It provides easy-to-use Python APIs for various tasks, including landmark detection. You can install MediaPipe using pip.

Simply open the command prompt on your PC or laptop, type the following command, and press the Enter key:

pip install mediapipe

In my case, pip reports “Requirement already satisfied” because I have already installed it. Now, let’s go ahead and take a look at the Python program that detects and tracks a specific landmark and sends commands to the Arduino to control a buzzer.
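Before running that program, it can be worth confirming from Python itself that the required packages are importable. The following is a small sketch of such a check; the helper name is_installed is my own, and the list of import names assumes the libraries described in this article (pyserial is imported as serial).

```python
import importlib.util


def is_installed(module_name):
    """Return True if the named top-level module can be found for import."""
    return importlib.util.find_spec(module_name) is not None


# Import names used by this project (pyserial is imported as "serial")
for name in ("cv2", "mediapipe", "numpy", "serial"):
    print(name, "installed" if is_installed(name) else "missing")
```

Any package reported as missing can be installed with pip in the same way as MediaPipe.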

Python Landmarks programming:

Code Explanation:

This code is designed to detect a specific landmark (landmark 31) on a person’s body using the MediaPipe library in Python. The purpose is to activate a buzzer when the landmark crosses a defined line, which I call a virtual laser. Let’s go through the code step by step.

First, the necessary libraries are imported: cv2 for computer vision operations, mediapipe for pose estimation, numpy for numerical operations, urllib.request for opening a URL, and serial for establishing a serial connection with an Arduino.

Next, the URL for the video feed is specified. In this case, it is set to the URL ‘’, which means that the code is accessing a video stream from an IP camera (here, the ESP32 Cam).

A video capture object is created using cv2.VideoCapture(0) to access the default camera of the device.

The code then initializes the mpPose object for pose estimation using MediaPipe and creates an instance of the pose estimation model.

The coordinates of the line that needs to be crossed are then defined. In this case, the end points are (200, 0) and (200, 600), which gives a vertical line at x-coordinate 200 running from y = 0 to y = 600 pixels.

A serial connection is established with an Arduino board using the serial.Serial function. Make sure you select the correct communication port and baud rate. You can check in the Device Manager which port your Arduino board is connected to. In my case, it’s connected to COM5.

A boolean variable buzzer_active is initialized as False. This variable keeps track of whether the landmark has crossed the line or not.

Inside the main loop, the code retrieves an image from the specified URL using urllib.request.urlopen and converts it to a NumPy array.

The captured frame is then read from the video feed. If the frame is not successfully captured, the loop breaks.

The RGB image is obtained by converting the captured frame from BGR to RGB using cv2.cvtColor.

The pose estimation model processes the RGB image to detect the landmarks using pose.process. The detected landmarks are stored in the results variable.

The line is drawn on the image using cv2.line based on the defined line coordinates.

If there are pose landmarks detected in the results, the code proceeds to draw the landmarks on the image using mpDraw.draw_landmarks.

The specific landmark of interest, landmark 31, is extracted, and its x and y coordinates on the image are calculated. A circle is then drawn on the image at the location of landmark 31.

If the visibility of landmark 31 is greater than 0.5, it means the landmark is clearly visible. The code checks if the x-coordinate of the landmark is less than both x-coordinates of the line. If this condition is met, it means the landmark has crossed the line.
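The crossing test described above fits in one small function. This is a sketch of the stated conditions (visibility greater than 0.5 and an x-coordinate less than both x-coordinates of the line) and not the original listing; the function name has_crossed and its parameter names are my own.

```python
def has_crossed(x, visibility, line_start, line_end, min_visibility=0.5):
    """Return True if a clearly visible landmark lies left of the line.

    x is the landmark's pixel x-coordinate; line_start and line_end are the
    (x, y) end points of the virtual laser.
    """
    if visibility <= min_visibility:
        return False  # landmark not reliably visible
    return x < line_start[0] and x < line_end[0]


line_start, line_end = (200, 0), (200, 600)  # the vertical line described above
print(has_crossed(150, 0.9, line_start, line_end))  # True: left of the line
print(has_crossed(320, 0.9, line_start, line_end))  # False: right of the line
print(has_crossed(150, 0.2, line_start, line_end))  # False: not clearly visible
```

Comparing against both x-coordinates also covers a slanted line, where the two end points have different x values.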

If the buzzer_active variable is False, the code sends the signal ‘1’ to the Arduino board to turn ON the buzzer, and the buzzer_active flag is set to True.

If the visibility of landmark 31 is below the threshold or the x-coordinate is not less than both x-coordinates of the line, it means the landmark has not crossed the line. In this case, the buzzer_active variable is set to False, and the code sends the signal ‘0’ to the Arduino board to turn OFF the buzzer.
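The signalling rules from the last few paragraphs can be sketched as follows. To keep the example runnable without an Arduino, I use a hypothetical RecordingPort class in place of the real serial connection; with pyserial, an object created by serial.Serial provides the same write() method. The function name signal_buzzer is also my own.

```python
class RecordingPort:
    """Hypothetical stand-in for serial.Serial that records what is written."""

    def __init__(self):
        self.sent = []

    def write(self, data):
        self.sent.append(data)


def signal_buzzer(port, crossed, buzzer_active):
    """Send '1' or '0' as described above and return the new buzzer_active flag."""
    if crossed:
        if not buzzer_active:
            port.write(b"1")   # sent once, when the crossing is first seen
        return True
    port.write(b"0")           # sent on every frame without a crossing
    return False


port = RecordingPort()
active = False
for crossed in (False, True, True, False):
    active = signal_buzzer(port, crossed, active)
print(port.sent)  # [b'0', b'1', b'0']
```

Note that ‘1’ is written only on the first frame of a crossing, while ‘0’ is written on every frame in which the line is not crossed.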

The resulting image with drawn landmarks and the line is displayed using cv2.imshow.

The loop continues until the ‘q’ key is pressed, at which point the video capture is released using cap.release() and all windows are closed using cv2.destroyAllWindows().
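Putting the steps together, the whole loop could look roughly like the sketch below. Please treat it as a reconstruction from the explanation above and not as the original listing: the function name run, the window title, the drawing colours, and the variable names are my own, the camera URL is passed in as a parameter, and I assume that the URL returns a single JPEG image per request. Because the frames come from the URL, the sketch leaves out the cv2.VideoCapture object, and it also sends ‘0’ when no person is detected at all.

```python
def run(url, port_name="COM5", baud_rate=9600, landmark_id=31):
    """Watch the camera at `url` and signal the Arduino when the line is crossed."""
    # Imported here so the function can be defined without the packages present
    import urllib.request

    import cv2
    import mediapipe as mp
    import numpy as np
    import serial

    mp_pose = mp.solutions.pose
    mp_draw = mp.solutions.drawing_utils
    pose = mp_pose.Pose()
    line_start, line_end = (200, 0), (200, 600)  # the virtual laser
    ser = serial.Serial(port_name, baud_rate)
    buzzer_active = False

    while True:
        # Fetch one JPEG frame from the camera and decode it to a BGR image
        response = urllib.request.urlopen(url)
        data = np.array(bytearray(response.read()), dtype=np.uint8)
        img = cv2.imdecode(data, cv2.IMREAD_COLOR)
        if img is None:
            break

        # MediaPipe expects RGB input, OpenCV delivers BGR
        results = pose.process(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
        cv2.line(img, line_start, line_end, (0, 0, 255), 2)

        crossed = False
        if results.pose_landmarks:
            mp_draw.draw_landmarks(img, results.pose_landmarks, mp_pose.POSE_CONNECTIONS)
            lm = results.pose_landmarks.landmark[landmark_id]
            h, w = img.shape[:2]
            cx, cy = int(lm.x * w), int(lm.y * h)  # normalized -> pixels
            cv2.circle(img, (cx, cy), 10, (0, 255, 0), cv2.FILLED)
            crossed = lm.visibility > 0.5 and cx < line_start[0] and cx < line_end[0]

        if crossed:
            if not buzzer_active:
                ser.write(b"1")  # turn the buzzer ON
                buzzer_active = True
        else:
            buzzer_active = False
            ser.write(b"0")      # turn the buzzer OFF

        cv2.imshow("Security Camera", img)
        if cv2.waitKey(1) & 0xFF == ord("q"):
            break

    ser.close()
    cv2.destroyAllWindows()
```

To use it, call run() with the address of your own ESP32 Cam image and the port your Arduino is connected to.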

So, that’s all about the programming. For the practical demonstration, watch the video tutorial, and don’t forget to like, share, and subscribe if you don’t want to miss any of my upcoming videos and articles.

Engr Fahad

My name is Shahzada Fahad and I am an Electrical Engineer. I have worked in the UAE as a site engineer in an electrical construction company. Currently, I am running my own YouTube channel, "Electronic Clinic", and managing this website. My hobbies are watching movies, music, martial arts, photography, travelling, making sketches, and so on.
