ESP32 CAM with Python OpenCV Yolo V3 for object detection and Identification

ESP32 CAM OpenCV Yolo V3:

 

In this article, I am going to use the ESP32 Camera module with Python OpenCV and YOLO v3 for object detection and identification. I am only using the ESP32 Camera module for the live video streaming; the image processing is done in Python with OpenCV and YOLO v3.

I will test it on three different machines and you will be amazed with the end results. First, I will test it on a Raspberry Pi 4 with 8GB RAM. Then I will test it on a Core i3 laptop. And finally, I will test it on my MSI Intel Core i7 laptop with an Nvidia GeForce GTX 1660 Ti GPU and 16GB RAM. I purchased this laptop specially for video editing and image processing.

Anyway, after performing the initial tests, I will share the final code, which can be used for the detection and identification of specific objects. Let’s say you want to send an alert message when a specific object is detected.

In my case, I send an alert message when a bird and a cat are detected at the same time, while all the other objects are ignored.

We have a long list of objects that we can detect. So, after reading this article or watching my video you will be able to detect all the objects at the same time or you can select one or multiple objects of your choice, and this way you can build amazing image processing-based projects.

So, without any further delay let’s get started!!!

Note: Read my article on ESP32 Cam and Arduino-based Car Parking Barrier control system.




Amazon Links:

ESP32 Camera Module

ESP32 CAM W-BT Board

MSI Intel Core i7 Laptop check this out.

*Please Note: These are affiliate links. I may make a commission if you buy the components through these links. I would appreciate your support in this way!

About ESP32Cam Module:

The ESP32-CAM is a small, low-cost development board based on the ESP32 microcontroller and a camera module. It combines Wi-Fi and Bluetooth connectivity with a camera, making it suitable for projects requiring image capture and wireless communication capabilities.

Here are some key features of the ESP32-CAM:

  • ESP32 Microcontroller: The board is built around the ESP32, a powerful and versatile microcontroller that supports both Wi-Fi and Bluetooth connectivity. It has a dual-core processor, ample RAM, and various peripherals.
  • Camera Module: The ESP32-CAM integrates a small camera module, usually an OV2640, capable of capturing images and video. The camera can be used to capture still images or stream video to a host device.
  • GPIO Pins: The ESP32-CAM features a set of general-purpose input/output (GPIO) pins that allow you to connect additional sensors, actuators, or other components to expand the functionality of your project.
  • Storage Options: The board offers different storage options for storing images and other data. It includes a microSD card slot for external storage, as well as built-in flash memory for storing firmware and other files.
  • Programming: The ESP32-CAM can be programmed using the Arduino IDE, which provides a user-friendly development environment for writing and uploading code. There are also alternative programming options available, such as MicroPython or Espressif’s ESP-IDF (IoT Development Framework).
  • Power Supply: The board is powered from an external 5V source through its power pins, or over USB when it is plugged into a programmer/development board. It has a voltage regulator to provide a stable power supply to the ESP32 and camera module.

The ESP32-CAM is commonly used in applications such as surveillance systems, home automation, robotics, and IoT projects that require image capture and wireless connectivity. Its compact size and affordable price make it a popular choice for hobbyists and developers.

Please note that specific details about the ESP32-CAM’s features and specifications may vary depending on the manufacturer or version of the board.



What is YOLO v3?

YOLO (You Only Look Once) v3 is an object detection algorithm that is widely used in computer vision and image recognition tasks. It is an improvement over its predecessors, YOLO and YOLO v2, and offers better accuracy and performance.

The key idea behind YOLO v3 is to divide an input image into a grid and predict bounding boxes and class probabilities directly on the grid cells. Instead of sliding a window or using a region proposal network, YOLO v3 performs detection in a single pass. This makes it extremely fast and efficient compared to other object detection algorithms.

YOLO v3 uses a deep convolutional neural network (CNN) to process the input image and predict the bounding boxes and class probabilities. The network is fully convolutional: it is built on the Darknet-53 backbone and contains no fully connected layers. It also incorporates skip (residual) connections, which allow information from earlier layers to be used in later layers, enhancing the detection performance.

One of the significant improvements in YOLO v3 is the introduction of multiple detection scales. It applies detection at three different scales to detect objects of varying sizes in the image. This multi-scale approach helps improve detection accuracy, particularly for small objects.
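To make the three scales concrete, here is a small sketch that works out how many candidate boxes each scale produces. It assumes the standard YOLO v3 settings (strides of 32, 16, and 8 pixels, three anchor boxes per grid cell, and the 80 COCO classes). The function name and its parameters are my own illustration; they are not part of OpenCV or Darknet.

```python
def yolo_v3_output_shapes(input_size=320, num_classes=80, anchors_per_cell=3):
    """Return one (rows, columns) pair per detection scale.

    Assumption: the standard YOLO v3 strides of 32, 16 and 8 pixels.
    """
    strides = (32, 16, 8)
    shapes = []
    for stride in strides:
        grid = input_size // stride               # grid cells along one side
        rows = grid * grid * anchors_per_cell     # one row per candidate box
        cols = 4 + 1 + num_classes                # box (4) + objectness (1) + class scores
        shapes.append((rows, cols))
    return shapes


if __name__ == "__main__":
    for rows, cols in yolo_v3_output_shapes(320):
        print(rows, "candidate boxes x", cols, "values")
```

For a 320x320 input this gives 300, 1200, and 4800 candidate boxes, each described by 85 values. The finest grid (stride 8) is the one that helps most with small objects.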

YOLO v3 is capable of detecting and localizing multiple objects within an image in real time. It has been widely used in various applications, including autonomous vehicles, surveillance systems, and video analysis.

Why ESP32 CAM & Yolo V3?

The combination of the ESP32 camera module and Python YOLOv3 (You Only Look Once version 3) can be a powerful solution for various computer vision applications. Here’s why:

ESP32 Camera Module: The ESP32 is a versatile microcontroller with built-in Wi-Fi and Bluetooth capabilities. It also has sufficient processing power to handle basic image-processing tasks. The ESP32 camera module integrates a camera sensor with the microcontroller, allowing you to capture images or videos directly. This makes it convenient for applications that require real-time image processing or analysis.

Python: Python is a popular programming language for machine learning and computer vision. It has a rich ecosystem of libraries and frameworks that simplify the development of complex applications. By using Python, you can leverage the extensive support available for computer vision tasks and easily integrate with other libraries or tools.

YOLOv3: YOLOv3 is a widely used object detection algorithm that can detect and classify objects in real time on suitable hardware. It operates by dividing the input image into a grid and predicting bounding boxes and class probabilities for each grid cell. YOLOv3 is known for its balance of speed and accuracy, making it suitable for applications that require real-time object detection, such as surveillance, robotics, or smart home systems.

Combining the ESP32 camera module and YOLOv3 in Python allows you to perform real-time object detection on images or video streams captured by the camera. The ESP32 can capture the images, send them to a computer or a server running the YOLOv3 algorithm, and receive the object detection results back to take further actions.

This combination is particularly useful for resource-constrained environments where running complex computer vision algorithms on the microcontroller itself may not be feasible due to memory or processing limitations. Instead, offloading the heavy computation to a more powerful machine running YOLOv3 in Python can provide better performance and accuracy.



Python and OpenCV installation:

To install Python and OpenCV, please follow the steps outlined below:

Python Installation:

Visit the official Python website’s releases page.

Scroll down to the “Files” section and download the appropriate installer for your operating system (Windows, macOS, or Linux) based on your system architecture (32-bit or 64-bit). In my case, I am going to download and install the Windows x86-64 executable installer.

Run the installer and follow the instructions to install Python 3.6.1. Make sure to select the option to add Python to the system PATH during the installation process.

If you want to check whether Python is installed on your system, run the command python --version in your command prompt or terminal. In my case, it reports that Python version 3.6.1 is installed.



Install OpenCV:

Open a command prompt or terminal, paste the following command, and press Enter:

pip install opencv-python==4.5.3.56

In my case, pip reports that the requirement is already satisfied, because I had already installed it.

If you want to check whether OpenCV is installed on your system, import it (import cv2) in your Python IDLE shell. If the import finishes without an error, OpenCV is installed successfully.
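The same check can also be run as a small script. This is only a sketch: module_version is a helper name I made up for this example, and it simply tries to import the module and reports its version string if it has one.

```python
import importlib


def module_version(name):
    """Return the module's version string, "unknown" if it has none,
    or None if the module is not installed."""
    try:
        module = importlib.import_module(name)
    except ImportError:
        return None
    return getattr(module, "__version__", "unknown")


if __name__ == "__main__":
    version = module_version("cv2")
    if version is None:
        print("OpenCV is not installed")
    else:
        print("OpenCV version:", version)
```

If you installed the version given above, the script should report 4.5.3 (the exact string may carry extra build digits).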




Download yolov3 weight and cfg files:

Step 1: Visit the Darknet website. Head over to the official Darknet website. Darknet is the open-source framework developed by Joseph Redmon, the creator of YOLO.

Step 2: Download the YOLO weights. On the Darknet website, scroll down to the “YOLO” section. You’ll find a link to download the YOLO weights file. Click on the link to start the download. The weights file is typically named “yolov3.weights”.

Make sure you download the cfg and weights files of the YOLOv3-320 model.

Step 3: Download the YOLO configuration file (cfg). While still on the Darknet website, navigate to the “Configuration” section. You’ll find a link to download the YOLO configuration files (cfg) there. The configuration file contains the architecture and settings for the YOLO model. Click on the link to begin the download. The configuration file for YOLOv3 is usually named “yolov3.cfg”.

Step 4: Get the class names. Open the Darknet GitHub repository, copy all the class names from the coco.names file, and save them in your project directory in a file named coco.names. The code in this article reads this file, so you will need it.

Now you have successfully downloaded the coco.names classes list from the Darknet GitHub repository. This file contains the names of the object classes used in the COCO dataset, which can be useful for object detection and recognition tasks. Make sure the file is saved with the extension .names. If there is a .txt extension, remove it. Wrong: “coco.names.txt”. Correct: “coco.names”.

Step 5: Keep the coco.names, yolov3.cfg, and yolov3.weights files in the same folder as the main programming file. The catAndBirdDetection.py is the main programming file.
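A missing or misnamed file is an easy mistake to make at this point, so here is a minimal sketch of a check you could run before starting the main program. The helper name missing_model_files is my own; the three file names are the ones used in this article.

```python
from pathlib import Path

# File names used by the code in this article
REQUIRED_FILES = ("coco.names", "yolov3.cfg", "yolov3.weights")


def missing_model_files(folder, required=REQUIRED_FILES):
    """Return the required file names that are not present in folder."""
    folder = Path(folder)
    return [name for name in required if not (folder / name).is_file()]


if __name__ == "__main__":
    # Look in the folder that contains this script
    project_dir = Path(__file__).resolve().parent
    missing = missing_model_files(project_dir)
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        print("All model files found in", project_dir)
```

If coco.names shows up as missing even though you saved it, check whether it was saved as coco.names.txt.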

Next, we are going to start with the ESP32 Camera module.



ESP32 Cam Live Video Streaming in Python OpenCV:

You will need to upload the following program into the ESP32 Camera module for the Live Video streaming.

#include <WebServer.h>
#include <WiFi.h>
#include <esp32cam.h>
 
const char* WIFI_SSID = "Fawad";
const char* WIFI_PASS = "computer007";
 
WebServer server(80);
 
 
static auto loRes = esp32cam::Resolution::find(320, 240);
static auto midRes = esp32cam::Resolution::find(350, 530);
static auto hiRes = esp32cam::Resolution::find(800, 600);
void serveJpg()
{
  auto frame = esp32cam::capture();
  if (frame == nullptr) {
    Serial.println("CAPTURE FAIL");
    server.send(503, "", "");
    return;
  }
  Serial.printf("CAPTURE OK %dx%d %db\n", frame->getWidth(), frame->getHeight(),
                static_cast<int>(frame->size()));
 
  server.setContentLength(frame->size());
  server.send(200, "image/jpeg");
  WiFiClient client = server.client();
  frame->writeTo(client);
}
 
void handleJpgLo()
{
  if (!esp32cam::Camera.changeResolution(loRes)) {
    Serial.println("SET-LO-RES FAIL");
  }
  serveJpg();
}
 
void handleJpgHi()
{
  if (!esp32cam::Camera.changeResolution(hiRes)) {
    Serial.println("SET-HI-RES FAIL");
  }
  serveJpg();
}
 
void handleJpgMid()
{
  if (!esp32cam::Camera.changeResolution(midRes)) {
    Serial.println("SET-MID-RES FAIL");
  }
  serveJpg();
}
 
 
void  setup(){
  Serial.begin(115200);
  Serial.println();
  {
    using namespace esp32cam;
    Config cfg;
    cfg.setPins(pins::AiThinker);
    cfg.setResolution(hiRes);
    cfg.setBufferCount(2);
    cfg.setJpeg(80);
 
    bool ok = Camera.begin(cfg);
    Serial.println(ok ? "CAMERA OK" : "CAMERA FAIL");
  }
  WiFi.persistent(false);
  WiFi.mode(WIFI_STA);
  WiFi.begin(WIFI_SSID, WIFI_PASS);
  while (WiFi.status() != WL_CONNECTED) {
    delay(500);
  }
  Serial.print("http://");
  Serial.println(WiFi.localIP());
  Serial.println("  /cam-lo.jpg");
  Serial.println("  /cam-hi.jpg");
  Serial.println("  /cam-mid.jpg");
 
  server.on("/cam-lo.jpg", handleJpgLo);
  server.on("/cam-hi.jpg", handleJpgHi);
  server.on("/cam-mid.jpg", handleJpgMid);
 
  server.begin();
}
 
void loop()
{
  server.handleClient();
}

But first, you will need to download the esp32cam library, which provides esp32cam.h. For this, go to GitHub and download the esp32cam Zip.

Then go back to the Arduino IDE, click on the Sketch menu > Include Library, and click on Add .ZIP Library.

Browse to the location and select the esp32cam-main.zip file. If you face any difficulty, you can watch my video tutorial given at the end of this article.

For uploading the program, I am using the ESP32 Camera development board. This way I don’t need to use an Arduino. But if you don’t have this development board, you can use the Arduino Uno for uploading the program. For this, you can read my getting started article on the ESP32 Camera module.

Simply insert the ESP32 Camera module into the development board and connect it to your laptop or computer. Now, select the ESP32 Cam board from the boards list in the Arduino IDE.

Then check the communication port and click on the upload button.

ESP32 Camera Video Streaming Test in Python OpenCV:

After uploading the program, restart your ESP32 Camera module, then open the Serial monitor and wait for the ESP32 Camera module to connect. Copy the IP address.

For testing the live video streaming in Python, you will need the following Python OpenCV code.



Python Code for Video Streaming using ESP32 CAM:

import cv2
import urllib.request
import numpy as np

# Replace the URL with the address printed on the Serial monitor
url = 'http://192.168.43.219/cam-hi.jpg'
cv2.namedWindow("live Cam Testing", cv2.WINDOW_AUTOSIZE)

# Read and display frames until q is pressed
while True:
    # Each request to the ESP32-CAM returns a single JPEG image
    img_resp=urllib.request.urlopen(url)
    imgnp=np.array(bytearray(img_resp.read()),dtype=np.uint8)
    im = cv2.imdecode(imgnp,-1)
    if im is None:
        # imdecode returns None when the data is not a valid image
        print("Failed to decode the frame")
        continue

    cv2.imshow('live Cam Testing',im)
    key=cv2.waitKey(5)
    if key==ord('q'):
        break

cv2.destroyAllWindows()

In the code, you can see this line:

url = 'http://192.168.43.219/cam-hi.jpg'

This is the IP address that I copied from the Serial monitor, followed by the image path.

On the Serial monitor, under the IP address, you will also see three different image resolutions: lo, hi, and mid. Use the one that suits your needs. Remove any extra spaces from the URL and run the program.
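If you switch between the three resolutions often, the URL can be built by a small helper instead of being edited by hand. This is only a sketch: snapshot_url is a name I made up, and the three paths are the ones registered by the ESP32 program above.

```python
# Paths served by the ESP32 program in this article
RESOLUTION_PATHS = {
    "lo": "/cam-lo.jpg",
    "mid": "/cam-mid.jpg",
    "hi": "/cam-hi.jpg",
}


def snapshot_url(ip_address, resolution="hi"):
    """Build the image URL for the given IP address and resolution name."""
    if resolution not in RESOLUTION_PATHS:
        raise ValueError("resolution must be one of: lo, mid, hi")
    # strip() removes stray spaces picked up when copying from the Serial monitor
    return "http://" + ip_address.strip() + RESOLUTION_PATHS[resolution]


if __name__ == "__main__":
    print(snapshot_url("192.168.43.219", "lo"))
```

The lower resolutions transfer faster over Wi-Fi, which can help on slower machines.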

It is working. I can use my ESP32 Camera module for live video streaming, so it is ready for object detection and identification using YOLO v3.

Let me tell you, we are only using the ESP32 Camera module for the live video streaming; we are not doing image processing on it. The image processing, object detection, and identification will be done on a laptop or Raspberry Pi. So, let’s go ahead and do it.



Python OpenCV YoLo V3 Testing on different systems:

I am going to test the final project code on a Raspberry Pi 4 with 8GB RAM, an Acer Core i3 laptop, and the award-winning MSI Intel Core i7 9th Generation laptop with an Nvidia GeForce GTX 1660 Ti GPU and 16GB RAM.

Python OpenCV Yolo V3 Code for Laptops/PCs:

import cv2
import numpy as np
import urllib.request

url = 'http://192.168.43.219/cam-hi.jpg'

cap = cv2.VideoCapture(url)
whT=320
confThreshold = 0.5
nmsThreshold = 0.3
classesfile='coco.names'
classNames=[]
with open(classesfile,'rt') as f:
    classNames=f.read().rstrip('\n').split('\n')


modelConfig = 'yolov3.cfg'
modelWeights= 'yolov3.weights'
net = cv2.dnn.readNetFromDarknet(modelConfig,modelWeights)
net.setPreferableBackend(cv2.dnn.DNN_BACKEND_OPENCV)
net.setPreferableTarget(cv2.dnn.DNN_TARGET_CPU)
def findObject(outputs,im):
    hT,wT,cT = im.shape
    bbox = []
    classIds = []
    confs = []
    found_cat = False
    found_bird = False
    for output in outputs:
        for det in output:
            scores = det[5:]
            classId = np.argmax(scores)
            confidence = scores[classId]
            if confidence > confThreshold:
                w,h = int(det[2]*wT), int(det[3]*hT)
                x,y = int((det[0]*wT)-w/2), int((det[1]*hT)-h/2)
                bbox.append([x,y,w,h])
                classIds.append(classId)
                confs.append(float(confidence))
    
    indices = cv2.dnn.NMSBoxes(bbox,confs,confThreshold,nmsThreshold)
    print(indices)
   
    # NMSBoxes returns an Nx1 array in older OpenCV releases and a flat array in newer ones
    for i in np.array(indices).flatten():
        box = bbox[i]
        x,y,w,h = box[0],box[1],box[2],box[3]
        if classNames[classIds[i]] == 'bird':
            found_bird = True
        elif classNames[classIds[i]] == 'cat':
            found_cat = True
            
        cv2.rectangle(im,(x,y),(x+w,y+h),(255,0,255),2)
        cv2.putText(im, f'{classNames[classIds[i]].upper()} {int(confs[i]*100)}%', (x,y-10), cv2.FONT_HERSHEY_SIMPLEX, 0.6, (255,0,255), 2)
       


# The output layer names do not change between frames, so look them up once
outputNames = net.getUnconnectedOutLayersNames()

while True:
    # Fetch one JPEG frame from the ESP32-CAM and decode it
    img_resp=urllib.request.urlopen(url)
    imgnp=np.array(bytearray(img_resp.read()),dtype=np.uint8)
    im = cv2.imdecode(imgnp,-1)
    if im is None:
        continue  # skip frames that failed to decode
    blob=cv2.dnn.blobFromImage(im,1/255,(whT,whT),[0,0,0],1,crop=False)
    net.setInput(blob)
    outputs = net.forward(outputNames)

    findObject(outputs,im)

    cv2.imshow('IMage',im)
    # Press q in the image window to quit
    if cv2.waitKey(1)==ord('q'):
        break

cv2.destroyAllWindows()

So, first, let’s go ahead and check this test code written for the detection of all the objects. By all objects I mean only those objects which are available in the coco.names list. Make sure you keep the coco.names, yolov3.cfg, and yolov3.weights files in the same folder as the main programming file, as I have already explained. Let’s start with the Raspberry Pi.
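The part of findObject that usually needs some explanation is the box arithmetic. Each detection row starts with the box centre, width, and height as fractions of the frame, followed by an objectness value and one score per class. The sketch below repeats the same arithmetic on a plain Python list so it can be tried without a camera or a model. decode_detection is my own helper name, not an OpenCV function.

```python
def decode_detection(det, frame_w, frame_h, conf_threshold=0.5):
    """Turn one YOLO output row into (class_id, confidence, [x, y, w, h]).

    det is laid out as [cx, cy, w, h, objectness, score_0, score_1, ...]
    with the box values given as fractions of the frame size.
    Returns None when the best class score is not above the threshold.
    """
    scores = det[5:]
    class_id = max(range(len(scores)), key=lambda i: scores[i])
    confidence = scores[class_id]
    if confidence <= conf_threshold:
        return None
    w = int(det[2] * frame_w)
    h = int(det[3] * frame_h)
    # Convert the box centre into the top-left corner expected by cv2.rectangle
    x = int(det[0] * frame_w - w / 2)
    y = int(det[1] * frame_h - h / 2)
    return class_id, float(confidence), [x, y, w, h]


if __name__ == "__main__":
    # A made-up row with three classes, centred in an 800x600 frame
    row = [0.5, 0.5, 0.25, 0.5, 0.9, 0.1, 0.8, 0.2]
    print(decode_detection(row, 800, 600))
```

Because one object usually produces several overlapping rows, the code above passes the surviving boxes through cv2.dnn.NMSBoxes to keep only the best one for each object.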




Python OpenCV Yolo V3 Code for Raspberry Pi:

So, guys, this is the tiny Raspberry Pi 4 PC and it has 8GB RAM.

I got it from SunFounder. The reason I am doing this test is to find out whether it is powerful enough to handle image processing using Python OpenCV and YOLO v3. I already have a camera connected to my Raspberry Pi, so for this test I am not going to use the code above; I am going to use the code below, which reads from that camera. It’s just to check if the Raspberry Pi can handle it.

Raspberry Pi Yolo V3 Code:

import cv2
import numpy as np
cap = cv2.VideoCapture(0)
whT=320
confThreshold = 0.5
nmsThreshold = 0.3
classesfile='coco.names'
classNames=[]
with open(classesfile,'rt') as f:
    classNames=f.read().rstrip('\n').split('\n')
#print(classNames)

modelConfig = 'yolov3.cfg'
modelWeights= 'yolov3.weights'
net = cv2.dnn.readNetFromDarknet(modelConfig,modelWeights)
net.setPreferableBackend(cv2.dnn.DNN_BACKEND_OPENCV)
net.setPreferableTarget(cv2.dnn.DNN_TARGET_CPU)
def findObject(outputs,img):
    hT,wT,cT = img.shape
    bbox = []
    classIds = []
    confs = []
    for output in outputs:
        for det in output:
            scores = det[5:]
            classId = np.argmax(scores)
            confidence = scores[classId]
            if confidence > confThreshold:
                w,h = int(det[2]*wT), int(det[3]*hT)
                x,y = int((det[0]*wT)-w/2), int((det[1]*hT)-h/2)
                bbox.append([x,y,w,h])
                classIds.append(classId)
                confs.append(float(confidence))
    #print(len(bbox))
    indices = cv2.dnn.NMSBoxes(bbox,confs,confThreshold,nmsThreshold)
    print(indices)
    
    # NMSBoxes returns an Nx1 array in older OpenCV releases and a flat array in newer ones
    for i in np.array(indices).flatten():
        box = bbox[i]
        x,y,w,h = box[0],box[1],box[2],box[3]
        cv2.rectangle(img,(x,y),(x+w,y+h),(255,0,255),2)
        cv2.putText(img, f'{classNames[classIds[i]].upper()} {int(confs[i]*100)}%', (x,y-10), cv2.FONT_HERSHEY_SIMPLEX, 0.6, (255,0,255), 2)


# The output layer names do not change between frames, so look them up once
outputNames = net.getUnconnectedOutLayersNames()

while True:
    sucess, img= cap.read()
    if not sucess:
        print("Failed to read a frame from the camera")
        break
    blob=cv2.dnn.blobFromImage(img,1/255,(whT,whT),[0,0,0],1,crop=False)
    net.setInput(blob)
    outputs = net.forward(outputNames)
    findObject(outputs,img)

    cv2.imshow('IMage',img)
    # Press q in the image window to quit
    if cv2.waitKey(1)==ord('q'):
        break

cap.release()
cv2.destroyAllWindows()

The Raspberry Pi 4 detects all the objects correctly, but it is really slow, so it isn’t a good choice for this kind of image processing. For the practical demonstration, watch my video tutorial available on my YouTube channel “Electronic Clinic”. The 8GB variant of the Raspberry Pi 4 is quite popular, and you can even play games on it, but when it comes to high-end image processing it falls short unless you add some kind of external hardware to it.
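If you want to put a number on “slow” for your own machine, a simple frame counter is enough. The class below is a sketch of my own (FpsMeter is not part of OpenCV); it only averages the time between calls.

```python
import time


class FpsMeter:
    """Average frames per second over all frames seen so far."""

    def __init__(self):
        self.first = None   # time of the first frame
        self.last = None    # time of the most recent frame
        self.frames = 0

    def tick(self, now=None):
        """Call once per processed frame."""
        if now is None:
            now = time.perf_counter()
        if self.first is None:
            self.first = now
        self.last = now
        self.frames += 1

    def fps(self):
        """Return the average rate, or 0.0 until two frames have been seen."""
        if self.frames < 2 or self.last == self.first:
            return 0.0
        return (self.frames - 1) / (self.last - self.first)


if __name__ == "__main__":
    meter = FpsMeter()
    for _ in range(5):
        time.sleep(0.1)   # stands in for one pass of the detection loop
        meter.tick()
    print("average FPS:", round(meter.fps(), 1))
```

To use it with the code above, create the meter before the while loop, call meter.tick() once per iteration, and print meter.fps() every few frames. This lets you compare the Raspberry Pi and the laptops on the same terms.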



Yolo V3 on Core i3 Laptop:

Next, I am going to test this on the Core i3 laptop. For this, I am going to use the Python OpenCV Yolo V3 Code for Laptops/PCs given above, because from now on we will use the ESP32 Camera module.

I was able to detect all the objects. Image processing on the Core i3 laptop is better than on the Raspberry Pi 4, but it’s still slow. As a beginner, though, you can use a similar laptop, since the Raspberry Pi 4 with 8GB RAM is more expensive than the Core i3 laptop.

Yolo V3 on Core i7, 9th generation:

Next, I am going to test it on my MSI Intel Core i7 9th Generation Gaming Laptop with award-winning Nvidia Geforce GTX 1660 Ti GPU. This is one of the most expensive laptops. Anyway, let’s see if it will make any difference.


Image processing on this machine is quite impressive. It’s not very fast, but it is still acceptable for me, and I can use it in my future image processing-based projects. And by the way, while recording the video, I forgot to turn on the GPU :(
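For readers who want to try the GPU, OpenCV's DNN module can be asked to use CUDA instead of the CPU. The following is a minimal sketch, assuming your OpenCV was built with CUDA support (a CPU-only build will report zero CUDA devices). The helper name `select_dnn_target` is hypothetical; it takes the `cv2` module as a parameter so the function itself stays easy to test.

```python
def select_dnn_target(net, cv2_module):
    """Ask OpenCV to run `net` on an Nvidia GPU when one is usable.

    Returns "cuda" or "cpu" so the caller can log which path was taken.
    """
    try:
        gpu_count = cv2_module.cuda.getCudaEnabledDeviceCount()
    except Exception:
        gpu_count = 0  # treat any failure as "no usable GPU"

    if gpu_count > 0:
        net.setPreferableBackend(cv2_module.dnn.DNN_BACKEND_CUDA)
        net.setPreferableTarget(cv2_module.dnn.DNN_TARGET_CUDA)
        return "cuda"

    # Fall back to the same CPU settings used elsewhere in this article
    net.setPreferableBackend(cv2_module.dnn.DNN_BACKEND_OPENCV)
    net.setPreferableTarget(cv2_module.dnn.DNN_TARGET_CPU)
    return "cpu"
```

A typical call would be `print(select_dnn_target(net, cv2))` right after `cv2.dnn.readNetFromDarknet(...)`, replacing the two `setPreferable...` lines.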



Final ESP32 and Yolo V3 Code:

Now, let’s look at the final code, which is written only to detect and identify birds and cats. It ignores all the other objects.

import cv2
import numpy as np
import urllib.request

# Still-image endpoint served by the ESP32-CAM; change the IP to match your module
url = 'http://192.168.43.219/cam-hi.jpg'

whT = 320              # network input size (width and height)
confThreshold = 0.5    # minimum class score to keep a detection
nmsThreshold = 0.3     # overlap threshold for non-maximum suppression

# Load the COCO class names, one per line
classesfile = 'coco.names'
with open(classesfile, 'rt') as f:
    classNames = f.read().rstrip('\n').split('\n')

# Load the YOLOv3 network and run it on the CPU
modelConfig = 'yolov3.cfg'
modelWeights = 'yolov3.weights'
net = cv2.dnn.readNetFromDarknet(modelConfig, modelWeights)
net.setPreferableBackend(cv2.dnn.DNN_BACKEND_OPENCV)
net.setPreferableTarget(cv2.dnn.DNN_TARGET_CPU)

# Names of the output layers; this call avoids OpenCV-version-specific indexing
outputNames = net.getUnconnectedOutLayersNames()


def findObject(outputs, im):
    hT, wT, cT = im.shape
    bbox = []
    classIds = []
    confs = []
    found_cat = False
    found_bird = False

    # Collect every detection whose best class score passes the threshold
    for output in outputs:
        for det in output:
            scores = det[5:]
            classId = np.argmax(scores)
            confidence = scores[classId]
            if confidence > confThreshold:
                w, h = int(det[2]*wT), int(det[3]*hT)
                x, y = int((det[0]*wT) - w/2), int((det[1]*hT) - h/2)
                bbox.append([x, y, w, h])
                classIds.append(classId)
                confs.append(float(confidence))

    # Non-maximum suppression removes overlapping boxes for the same object
    indices = cv2.dnn.NMSBoxes(bbox, confs, confThreshold, nmsThreshold)

    # Older OpenCV returns nested indices and newer releases return flat ones;
    # flattening handles both, as well as the empty case
    for i in np.array(indices).flatten():
        i = int(i)
        name = classNames[classIds[i]]
        if name not in ('bird', 'cat'):
            continue  # ignore every other class
        if name == 'bird':
            found_bird = True
        else:
            found_cat = True

        x, y, w, h = bbox[i]
        cv2.rectangle(im, (x, y), (x+w, y+h), (255, 0, 255), 2)
        cv2.putText(im, f'{name.upper()} {int(confs[i]*100)}%', (x, y-10),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.6, (255, 0, 255), 2)
        print(name)

    # Alert once per frame when both a bird and a cat are present
    if found_cat and found_bird:
        print('alert')


while True:
    # Fetch one JPEG frame from the ESP32-CAM and decode it
    img_resp = urllib.request.urlopen(url)
    imgnp = np.array(bytearray(img_resp.read()), dtype=np.uint8)
    im = cv2.imdecode(imgnp, -1)
    if im is None:
        continue  # skip frames that fail to decode

    blob = cv2.dnn.blobFromImage(im, 1/255, (whT, whT), [0, 0, 0], 1, crop=False)
    net.setInput(blob)
    outputs = net.forward(outputNames)
    findObject(outputs, im)

    cv2.imshow('Image', im)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break  # press q to quit

cv2.destroyAllWindows()
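The box arithmetic inside `findObject` is easier to follow in isolation. Each row that YOLOv3 returns through OpenCV holds a normalised box centre, width and height, then an objectness score, then one score per class. The sketch below restates that step in plain Python; the function names `yolo_to_pixel_box` and `best_class` are illustrative and do not appear in the code above.

```python
def yolo_to_pixel_box(det, frame_w, frame_h):
    """Convert a normalised (cx, cy, w, h) row to a pixel [x, y, w, h] box.

    x and y are the top-left corner, which is what cv2.rectangle expects.
    """
    cx, cy, bw, bh = det[0], det[1], det[2], det[3]
    w = int(bw * frame_w)
    h = int(bh * frame_h)
    x = int(cx * frame_w - w / 2)
    y = int(cy * frame_h - h / 2)
    return [x, y, w, h]


def best_class(det):
    """Return (class_id, score) for the highest class score in the row."""
    scores = det[5:]  # entries 0-3 are the box, entry 4 is objectness
    class_id = max(range(len(scores)), key=lambda k: scores[k])
    return class_id, float(scores[class_id])
```

For example, a detection centred in an 800x600 frame that covers a quarter of the width and half of the height becomes a 200x300 pixel box whose top-left corner is at (300, 150).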

The 5V, 3A power supply I designed and the 4S lithium-ion battery pack I built make the ESP32 Camera module completely portable. I can move around freely with it, or I can place it somewhere and wirelessly monitor a specific area.

An image showing bird detection using the ESP32-CAM module and Python with YOLO v3.

As you can see, it detects both birds and cats without any problem.



An image showing cat detection using the ESP32, Python, OpenCV, and YOLO v3.

When both a bird and a cat are detected at the same time, it generates an alert. Now, you might be wondering why birds and cats?

An image showing the cat detection system built with the ESP32, Python, OpenCV, and YOLO v3.

Well, in our house, this particular area is a favorite spot for birds, and there are nests in those trees. So, when a cat comes, the birds start chirping and making noise. My idea is that when birds are eating and, during that time, a cat comes, I should receive an alert.
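The "bird and cat at the same time" condition can be separated from the drawing code so it is easy to change later. Since a cat may stay in view for many frames, it also helps to limit how often the alert fires. The sketch below is one possible approach; the `PairAlert` class and the 30-second cooldown are assumptions for illustration, not part of the original code.

```python
import time


class PairAlert:
    """Fire when all required labels are seen together, at most once per cooldown."""

    def __init__(self, required=("bird", "cat"), cooldown_s=30.0):
        self.required = set(required)
        self.cooldown_s = cooldown_s
        self._last_alert = None  # time of the previous alert, if any

    def update(self, labels, now=None):
        """Return True if an alert should be sent for this frame's labels."""
        now = time.monotonic() if now is None else now
        if not self.required.issubset(labels):
            return False  # at least one required object is missing
        if self._last_alert is not None and now - self._last_alert < self.cooldown_s:
            return False  # still inside the cooldown window
        self._last_alert = now
        return True
```

In the detection loop, you would collect the class names that survive non-maximum suppression for one frame and pass them to `update()`. A `True` result is the point at which to send the email or SMS.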


I can send the alert to myself via email, or I can use an Arduino and a GSM module to send myself an SMS. Once the alert is generated, we can take any necessary action.
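As a rough idea of the email route, Python's standard library can build and send the message without extra packages. This is a minimal sketch under stated assumptions: the function names are illustrative, and the SMTP host, port, and credentials are placeholders you must replace with your own provider's values (many providers require an app-specific password).

```python
import smtplib
import ssl
from email.message import EmailMessage


def build_alert_email(sender, recipient, labels):
    """Create the alert message listing the detected object names."""
    found = ", ".join(sorted(labels))
    msg = EmailMessage()
    msg["Subject"] = "ESP32-CAM alert: " + found
    msg["From"] = sender
    msg["To"] = recipient
    msg.set_content("The camera detected: " + found + ".")
    return msg


def send_alert_email(msg, host, port, username, password):
    """Send the message over SMTP with STARTTLS (placeholder server details)."""
    with smtplib.SMTP(host, port, timeout=10) as server:
        server.starttls(context=ssl.create_default_context())
        server.login(username, password)
        server.send_message(msg)
```

Keeping the message construction separate from the sending step means you can check the message contents on your laptop before involving a real mail server.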

You can use the same technique for any other object. You can build a high-level security system, and there are many other ways to use it. In my upcoming video, I will explain how to train the model on your own object that is not available in the coco.names list. So, that’s all for now.




Watch the Video Tutorial:

Engr Fahad

My name is Shahzada Fahad and I am an Electrical Engineer. I have worked in the UAE as a site engineer for an electrical construction company. Currently, I run my own YouTube channel, "Electronic Clinic", and manage this website. My hobbies include watching movies, music, martial arts, photography, travelling, sketching, and so on.

4 Comments

  1. Hello, I am an electrical engineering student from Iran. I had a few questions about Python programming for the ESP32-CAM module and was hoping you could guide me, if possible. Thank you.
    Also, I have only recently come across your work.

  2. can you help modify the codes (both the c++ and python code) to help the image transmission to be done without internet connection access, but with the hotspot credentials of your raspberry pi?

  3. Thank you so much for the enlightenment, Engr. Fahad. It was really worth it. But I am trying to modify the code to be able to transmit this image data wirelessly (not making use of an internet access, but the network credentials of the raspberry pi) to the raspberry pi. If it can be written in such way that both the C++ and the python code will be written, I will really appreciate it, because I am currently working on a project with this regard. Thanks in advance.
