Object Detection Using ImageAI

Ojash Shrestha

This article walks through several aspects of computer vision, introduces the libraries and tools involved, and offers hands-on experience with a short programming task that identifies the object in an image. Computer vision has become widely accessible thanks to advances in machine learning algorithms and easy-to-use libraries. Let us take a look and work through the example.

Object Detection

Object detection is the process of detecting, locating, and identifying objects in an image or a video. It is a branch of image processing and computer vision, implemented mainly with machine learning and deep learning.
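To make the three parts of that definition concrete, here is a minimal sketch of what a single detection result might look like as a data record. The Detection class, its field names, and the sample values are hypothetical and chosen only for illustration; they are not the output format of any library used in this article.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    """One detected object (hypothetical structure, for illustration only)."""
    label: str         # what the object is (identifying)
    confidence: float  # how sure the model is, from 0.0 to 1.0 (detecting)
    box: tuple         # (x_min, y_min, x_max, y_max) in pixels (locating)

    def area(self):
        """Area of the bounding box in square pixels."""
        x_min, y_min, x_max, y_max = self.box
        return max(0, x_max - x_min) * max(0, y_max - y_min)


# Made-up values, not produced by a real model.
car = Detection(label="sports car", confidence=0.59, box=(40, 60, 360, 240))
print(car.label, car.confidence, car.area())
```

An image classifier, by contrast, returns only labels and probabilities for the whole image, without a bounding box.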

OpenCV

OpenCV (Open Source Computer Vision Library) is a cross-platform library for developing computer vision applications, with the capability to perform real-time detection. It is widely used for object detection, face detection, and many other image processing and video analysis use cases.

NumPy

NumPy is a Python library for numerical computation. It adds support for large, multi-dimensional arrays and matrices, along with a large collection of high-level mathematical functions that operate on these arrays.
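As a small illustration of multi-dimensional arrays and the functions that operate on them, the sketch below builds a matrix and a dummy image-shaped array. The array contents are made up for the example.

```python
import numpy as np

# A 2 x 3 matrix stored as a two-dimensional array.
matrix = np.array([[1.0, 2.0, 3.0],
                   [4.0, 5.0, 6.0]])

# A dummy 4 x 4 RGB "image": height x width x colour channels.
image = np.zeros((4, 4, 3), dtype=np.uint8)
image[..., 0] = 255  # set the red channel of every pixel

column_means = matrix.mean(axis=0)  # mean of each column
scaled = np.sqrt(matrix) * 2        # element-wise maths, no explicit loop

print(matrix.shape, image.shape)
print(column_means)
```

Images loaded for computer vision are typically handled as arrays of this height x width x channels shape.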

Keras

Keras is a powerful library for developing and evaluating deep learning models. It is used mainly for artificial neural networks and acts as a high-level interface to another library, TensorFlow.

TensorFlow

TensorFlow is an open-source library developed by Google that provides an end-to-end platform for machine learning, with a particular focus on the training and inference of deep neural networks.

Jupyter Notebook

Jupyter Notebook is a web-based interactive environment that combines a code editor with a tool for teaching and presentation. It is used extensively for scientific computing.

Python

Python is one of the easiest to learn and most widely used programming languages across the globe:

It is often taught to students as a first programming language
Its clear syntax and mandatory indentation make code easy to read and understand
It has active communities of library and module developers

Anaconda

Anaconda is a free, easy-to-install distribution for scientific computing that includes a package manager and an environment manager. It comes with a collection of over 720 open-source packages for the R and Python programming languages, with free community support. It supports Windows, Linux, and macOS, and ships with Jupyter Notebook.

For this hands-on exercise, using the Anaconda environment will make it easier to follow the programming below.

Managing setup for this experiment

First, install Anaconda from its official site. It provides a free individual edition.

After installation, open the Anaconda Prompt terminal and check for any updates:

conda update --all

Now, install TensorFlow:

pip install tensorflow==2.4.0

Install Keras:

pip install keras==2.4.3

Install the ImageAI dependencies:

pip install numpy==1.19.3 pillow==7.0.0 scipy==1.4.1 h5py==2.10.0 matplotlib==3.3.2 opencv-python keras-resnet==0.2.0

Install ImageAI:

pip install imageai
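Once the installs finish, the versions can be double-checked from Python. The following is a minimal sketch using the standard library's importlib.metadata (Python 3.8 or later). The EXPECTED table and the helper names are illustrative, with the version pins copied from the install commands in this article.

```python
from importlib import metadata

# Version pins copied from the install commands in this article.
# Packages installed without a pin are mapped to None.
EXPECTED = {
    "tensorflow": "2.4.0",
    "keras": "2.4.3",
    "numpy": "1.19.3",
    "pillow": "7.0.0",
    "scipy": "1.4.1",
    "h5py": "2.10.0",
    "matplotlib": "3.3.2",
    "opencv-python": None,
    "keras-resnet": "0.2.0",
    "imageai": None,
}


def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def check_environment(expected=EXPECTED):
    """Return a list of (package, expected, found) for every mismatch."""
    problems = []
    for package, wanted in expected.items():
        found = installed_version(package)
        if found is None or (wanted is not None and found != wanted):
            problems.append((package, wanted, found))
    return problems


for package, wanted, found in check_environment():
    print(f"{package}: expected {wanted or 'any version'}, "
          f"found {found or 'not installed'}")
```

If nothing is printed, every listed package is installed at the expected version.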

Now, launch Jupyter Notebook from Anaconda and open a new Python 3 notebook.

First, import TensorFlow and check its version:

import tensorflow as tf
print(tf.__version__)

Import os, keras, and sys:

import os
from tensorflow import keras
import sys

Now, import ImageClassification from ImageAI:

from imageai.Classification import ImageClassification

To obtain the execution path of the current notebook (its working directory):

executionpath = os.getcwd()
print(executionpath)

ResNet50

ResNet stands for Residual Network, a classic neural network architecture used extensively for computer vision tasks and as a backbone for many CV-related problems. ResNet-50 is a variant of the ResNet model that is 50 layers deep; it is commonly described as 48 convolutional layers along with 1 MaxPool and 1 Average Pool layer.
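The "residual" in the name refers to skip connections: each block adds its input back to the output of its layers. The sketch below shows that idea in NumPy. It is a simplification under stated assumptions: real ResNet blocks use convolutions and batch normalisation, whereas this example uses two small dense layers with random, made-up weights.

```python
import numpy as np


def relu(x):
    """Element-wise rectified linear unit."""
    return np.maximum(x, 0.0)


def residual_block(x, w1, w2):
    """Compute relu(F(x) + x), where F is two small dense layers.

    Dense layers stand in for the convolutions of a real ResNet block
    to keep the sketch short.
    """
    fx = relu(x @ w1) @ w2  # the residual function F(x)
    return relu(fx + x)     # the skip connection adds the input back


rng = np.random.default_rng(0)
x = rng.normal(size=(1, 8))
w1 = rng.normal(size=(8, 8)) * 0.1
w2 = rng.normal(size=(8, 8)) * 0.1

y = residual_block(x, w1, w2)
print(y.shape)

# With all-zero weights F(x) is zero, so the block passes relu(x) through.
zeros = np.zeros((8, 8))
identity_like = residual_block(x, zeros, zeros)
```

Because the input is always added back, a block can fall back to passing its input through, which is part of what makes very deep networks such as ResNet-50 trainable.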

The variable detection holds an ImageClassification instance, which classifies our input image, and the deep learning model type is set to ResNet50:

detection = ImageClassification()
detection.setModelTypeAsResNet50()
detection.setModelPath(executionpath + "/resnet50_imagenet_tf.2.0.h5")

The model file “resnet50_imagenet_tf.2.0.h5” must be downloaded from the source code on GitHub and saved in the execution path location.
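Since a missing model file is an easy mistake to make, it can help to check for it before loading. This is a minimal sketch using only the standard library; find_model and MODEL_FILENAME are hypothetical names introduced for this example, not part of ImageAI.

```python
import os

MODEL_FILENAME = "resnet50_imagenet_tf.2.0.h5"


def find_model(directory, filename=MODEL_FILENAME):
    """Return the full path to the model file, or raise if it is not there."""
    model_path = os.path.join(directory, filename)
    if not os.path.isfile(model_path):
        raise FileNotFoundError(
            f"{filename} was not found in {directory}; download it there first."
        )
    return model_path


try:
    print(find_model(os.getcwd()))
except FileNotFoundError as error:
    print(error)
```

The returned path could then be passed to detection.setModelPath in place of the hand-built string.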

Now, load the model:

detection.loadModel()

Save the image you want to classify in your execution path location, then classify it and request the five most likely labels:

detections, percentprobabs = detection.classifyImage("image1.png", result_count=5)

Now, print each predicted label with its percentage probability:

for index in range(len(detections)):
    print(detections[index], ":", percentprobabs[index])
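If several images need to be classified, the classify-and-collect steps can be wrapped in a small helper. This is a sketch that relies only on the classifyImage call used in this article; classify_images is a hypothetical helper name, and any file name other than image1.png would be your own.

```python
def classify_images(classifier, image_paths, result_count=5):
    """Classify several images and return {path: [(label, probability), ...]}.

    `classifier` is any object exposing classifyImage(path, result_count=...)
    that returns two parallel lists, as `detection` does in this article.
    """
    results = {}
    for path in image_paths:
        labels, probabilities = classifier.classifyImage(
            path, result_count=result_count
        )
        # Pair each label with its probability for easier handling later.
        results[path] = list(zip(labels, probabilities))
    return results
```

For example, classify_images(detection, ["image1.png"]) would return a dictionary with one entry holding the five label and probability pairs for that image.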

The image used can be displayed in the notebook with:

from IPython.display import display, Image
display(Image(filename='image1.png'))

The output shows a 59% probability that the image is of a sports car, 22.9% that it is a racer, 10.06% that it is a car wheel, 2.30% that it is a convertible, and 1.23% that it is a go-kart.
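Raw probabilities are easier to read once rounded and filtered. The sketch below formats the results reported above; format_predictions and its min_percent and digits parameters are hypothetical names for this example, and the labels are written as they appear in this article rather than as exact model output.

```python
# Labels and probabilities as reported in this article,
# rounded to at most two decimals.
labels = ["sports car", "racer", "car wheel", "convertible", "go-kart"]
probabilities = [59.0, 22.9, 10.06, 2.30, 1.23]


def format_predictions(labels, probabilities, min_percent=0.0, digits=1):
    """Return 'label: xx.x%' strings for predictions at or above min_percent."""
    lines = []
    for label, probability in zip(labels, probabilities):
        if probability >= min_percent:
            lines.append(f"{label}: {probability:.{digits}f}%")
    return lines


# Show only the predictions with at least 2% probability.
for line in format_predictions(labels, probabilities, min_percent=2.0):
    print(line)
```

The same function can be applied directly to the detections and percentprobabs lists returned by classifyImage.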

In this article, we gained hands-on experience using libraries to identify objects, programming in Python with Jupyter Notebook on the Anaconda platform. With very little code, we could identify the object in an arbitrary image. This is possible thanks to years of research and investment by many scientists and engineers who created the libraries and software described above. The example shows how easy it is to start experimenting with computer vision tasks.