Data Preprocessing

What is Data Preprocessing?

Data preprocessing is an important step in machine learning: it improves the quality of the data before the data is fed into a model, so that the model can extract meaningful output from it.

Why Data Preprocessing?

Data preprocessing makes raw data cleaner and more meaningful, so that the model we feed it to can understand the data, learn the patterns in it, and give us the best possible output.

Missing values are handled using techniques such as imputation, or by removing the affected rows when the dataset is large enough to spare them. This ensures that analytical models are not skewed by the absence of crucial data points.
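As a rough illustration of the two options above, here is a minimal sketch in plain Python. The helper names (`impute_mean`, `drop_missing_rows`) and the toy age/income data are made up for this example, and mean imputation is only one of several possible imputation strategies.

```python
from statistics import mean


def impute_mean(values):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute a column with no observed values")
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def drop_missing_rows(rows):
    """Keep only the rows in which no field is None."""
    return [row for row in rows if all(v is not None for v in row)]


# Hypothetical toy data: each row is (age, income).
ages = [20, None, 30, 40]
rows = [(20, 50000), (None, 62000), (30, None), (40, 58000)]

imputed_ages = impute_mean(ages)         # missing age filled with the mean
complete_rows = drop_missing_rows(rows)  # incomplete rows discarded

print(imputed_ages)
print(complete_rows)
```

Imputation keeps every row but introduces estimated values, while row removal keeps only real values but shrinks the dataset, which is why removal is usually reserved for larger datasets.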

The data that is fed into machine learning or deep learning models comes in different types, such as:

  1. Numerical data
  2. Categorical data
  3. Text data
  4. Image data
  5. Audio data

Each of the above data types needs to be preprocessed differently, according to its own requirements.
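To make the idea of type-specific preprocessing concrete, here is a small sketch for the first three data types. The choice of techniques (min-max scaling for numerical data, one-hot encoding for categorical data, lowercasing and whitespace tokenization for text) is an assumption made for illustration and is not prescribed by this article; the function names are hypothetical.

```python
def min_max_scale(values):
    """Numerical data: rescale values linearly into the range [0.0, 1.0]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical, so there is no range to scale over.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def one_hot(labels):
    """Categorical data: turn each label into a 0/1 indicator vector."""
    categories = sorted(set(labels))
    return [[1 if label == c else 0 for c in categories] for label in labels]


def tokenize(text):
    """Text data: lowercase the text and split it on whitespace."""
    return text.lower().split()


scaled = min_max_scale([10, 20, 30])
encoded = one_hot(["cat", "dog", "cat"])
tokens = tokenize("Cats are GREAT")

print(scaled)
print(encoded)
print(tokens)
```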

Introduction to image preprocessing.

Image preprocessing is the essential first step in teaching computers to understand pictures. In AI, it’s like cleaning and organizing images before the computer learns from them. This process is crucial because it helps the computer recognize patterns more effectively, for example by making sure all pictures are in the same format before learning. The clearer the data, the better the computer performs.

Without Image Preprocessing: If you give these pictures directly to the computer, it might get confused. It may not understand that a small, dark cat is the same as a big, bright cat. It’s like trying to teach someone with blurry glasses; they might not see things clearly.

With Image Preprocessing: Now, imagine before teaching, you make all the pictures the same size, adjust the colors, and make sure they are not too bright or dark. This is like cleaning your glasses before learning. Image preprocessing helps the computer see things more clearly and understand, “Oh, all these pictures are cats, even if they look a bit different.”

Challenges with raw data

Raw images often differ in size, color, and brightness, and these inconsistencies make it harder for AI applications to make accurate predictions.

Types of images.

  1. Grayscale image
  2. RGB image
  3. Floating point image
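The three image types can be pictured as nested lists of pixel values. The sketch below assumes the common conventions of 8-bit intensities (0 to 255) for grayscale and RGB images and values in [0.0, 1.0] for floating point images. It uses plain Python lists to stay dependency-free (real projects would normally use an array or imaging library), and `describe` is a hypothetical helper.

```python
# 2x2 grayscale image: one intensity per pixel, 0 (black) to 255 (white).
gray = [
    [0, 128],
    [200, 255],
]

# 2x2 RGB image: one (red, green, blue) tuple per pixel, each channel 0-255.
rgb = [
    [(255, 0, 0), (0, 255, 0)],
    [(0, 0, 255), (255, 255, 255)],
]

# Floating point image: the same grayscale picture stored as floats in [0.0, 1.0].
float_img = [[p / 255 for p in row] for row in gray]


def describe(image):
    """Return (height, width, channels) for a nested-list image."""
    height = len(image)
    width = len(image[0])
    first = image[0][0]
    channels = len(first) if isinstance(first, tuple) else 1
    return height, width, channels


print(describe(gray))
print(describe(rgb))
print(describe(float_img))
```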

Common image preprocessing techniques.

  1. Rescaling and Normalization
  2. Grayscale Conversion
  3. Image Resizing and Cropping
  4. Edge Detection
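The four techniques listed above can be sketched in a few lines each. This is a simplified, dependency-free illustration on nested-list images, not a production implementation. The specific choices are assumptions: dividing by 255 for normalization, the ITU-R BT.601 luma weights for grayscale conversion, nearest-neighbor sampling for resizing, and the Sobel operator for edge detection. All function names are hypothetical.

```python
def normalize(gray):
    """Rescale 8-bit intensities (0-255) to floats in [0.0, 1.0]."""
    return [[p / 255 for p in row] for row in gray]


def to_grayscale(rgb):
    """Convert (r, g, b) pixels to one intensity using BT.601 luma weights."""
    return [[round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
            for row in rgb]


def crop(image, top, left, height, width):
    """Cut out a height x width window whose top-left corner is (top, left)."""
    return [row[left:left + width] for row in image[top:top + height]]


def resize_nearest(image, new_h, new_w):
    """Resize by picking, for each output pixel, the nearest source pixel."""
    old_h, old_w = len(image), len(image[0])
    return [[image[y * old_h // new_h][x * old_w // new_w]
             for x in range(new_w)]
            for y in range(new_h)]


def sobel_edges(gray):
    """Sobel gradient magnitude; border pixels are left at 0.0."""
    h, w = len(gray), len(gray[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Horizontal gradient: right column minus left column.
            gx = ((gray[y - 1][x + 1] + 2 * gray[y][x + 1] + gray[y + 1][x + 1])
                  - (gray[y - 1][x - 1] + 2 * gray[y][x - 1] + gray[y + 1][x - 1]))
            # Vertical gradient: bottom row minus top row.
            gy = ((gray[y + 1][x - 1] + 2 * gray[y + 1][x] + gray[y + 1][x + 1])
                  - (gray[y - 1][x - 1] + 2 * gray[y - 1][x] + gray[y - 1][x + 1]))
            out[y][x] = (gx * gx + gy * gy) ** 0.5
    return out


# Toy 4x4 image: dark left half, bright right half (a vertical edge).
image = [[0, 0, 255, 255] for _ in range(4)]

normalized = normalize(image)
small = resize_nearest(image, 2, 2)
patch = crop(image, 0, 1, 2, 2)
edges = sobel_edges(image)
gray_pixel = to_grayscale([[(255, 0, 0)]])

print(small)
print(patch)
print(edges[1])
```

In the toy image the edge response is large at the interior pixels next to the dark-to-bright boundary and zero on a uniformly colored image, which is the behavior edge detection relies on.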