Supervised and Unsupervised Learning in Machine Learning
To put it simply, Machine Learning (ML) is a way for a computer to learn to predict an output from data. To achieve this, an algorithm (called an ML algorithm) is trained on a specific dataset.
There are two main types of Machine Learning: supervised learning and unsupervised learning.
In supervised learning, the model is trained on data that maps inputs to their desired outputs. You first give your algorithm the inputs together with the desired outputs and train your model on them. After the model has learned from your data, you give it a completely new input and it tries to produce the matching output.
Spam filtering and speech recognition are among the most common applications of supervised learning.
Supervised learning itself has two subcategories: regression and classification.
In regression, the algorithm outputs a number from a continuous range of possible values. Let's take house price prediction as an example.
You will need a sample dataset of houses that records the number of rooms, washrooms, and floors of each house along with its price. This data is then used to train your ML model.
If you then need to predict the price of a house with 10 rooms, 3 washrooms, and 3 floors, the trained model will give you an estimated price.
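The house price example can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the houses and prices are made up, the function names are hypothetical, and ordinary least-squares linear regression is just one of many regression algorithms that could be used here.

```python
# Minimal linear regression sketch (ordinary least squares, standard library only).
# All data and names below are hypothetical, for illustration.

def solve_linear_system(a, b):
    """Solve a * x = b using Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x


def fit_linear_regression(features, targets):
    """Return [intercept, w1, w2, ...] minimizing the squared prediction error."""
    rows = [[1.0] + list(f) for f in features]  # leading 1.0 is the intercept term
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict(weights, feature):
    return weights[0] + sum(w * x for w, x in zip(weights[1:], feature))


# Training data: (rooms, washrooms, floors) -> price. Made-up numbers.
houses = [(3, 1, 1), (4, 2, 1), (5, 2, 2), (6, 3, 2), (8, 3, 3), (2, 1, 1)]
prices = [135000, 165000, 200000, 230000, 285000, 115000]

weights = fit_linear_regression(houses, prices)

# Ask the trained model about a house it has never seen.
predicted_price = predict(weights, (10, 3, 3))
print(round(predicted_price))
```

The made-up prices follow an exact linear rule, so the model recovers it and predicts roughly 325000 for the new house. Real housing data is noisy, and the prediction would only be an estimate.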
In classification, your algorithm returns a category (also called a class) as its output. One of the classic examples of classification is spam filtering.
The algorithm is trained on a large dataset of sample emails. The inputs are the sender, the subject line, and the content of each email, and the correct output for each email (spam or not spam) is already present in the dataset.
Then, when a new email is sent to a user, the algorithm tries to classify it as spam or not spam based on the same parameters: the sender, the subject line, and the content of the email.
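The spam filter described above could be sketched with a small naive Bayes classifier. The article does not name a specific algorithm, so this choice is an assumption, and the emails, addresses, and function names below are invented for illustration.

```python
# Minimal naive Bayes spam classifier sketch (standard library only).
# The training emails and all names are hypothetical.
import math
from collections import Counter


def tokenize(email):
    """Combine sender, subject, and content into a list of lowercase words."""
    text = " ".join([email["sender"], email["subject"], email["content"]])
    return text.lower().split()


def train(emails, labels):
    """Count how often each word appears in each class."""
    word_counts = {label: Counter() for label in set(labels)}
    label_counts = Counter(labels)
    for email, label in zip(emails, labels):
        word_counts[label].update(tokenize(email))
    return word_counts, label_counts


def classify(email, word_counts, label_counts):
    """Return the class with the highest log-probability for this email."""
    vocabulary = set()
    for counts in word_counts.values():
        vocabulary.update(counts)
    total_emails = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in label_counts.items():
        score = math.log(count / total_emails)  # prior for this class
        words_in_class = sum(word_counts[label].values())
        for word in tokenize(email):
            # Add-one (Laplace) smoothing avoids log(0) for unseen words.
            likelihood = (word_counts[label][word] + 1) / (words_in_class + len(vocabulary))
            score += math.log(likelihood)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


training_emails = [
    {"sender": "promo@deals.example", "subject": "win a free prize now",
     "content": "click here to claim your free prize"},
    {"sender": "offers@deals.example", "subject": "free money offer",
     "content": "claim your free money now"},
    {"sender": "alice@work.example", "subject": "meeting agenda for monday",
     "content": "please review the agenda before the meeting"},
    {"sender": "bob@work.example", "subject": "project report",
     "content": "the report is attached for your review"},
]
training_labels = ["spam", "spam", "not spam", "not spam"]

word_counts, label_counts = train(training_emails, training_labels)

new_email = {"sender": "promo@deals.example", "subject": "claim your free prize",
             "content": "win free money now"}
print(classify(new_email, word_counts, label_counts))
```

With only four training emails this is a toy. A real filter would be trained on a far larger dataset, as the article notes.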
In supervised learning, we used data made up of inputs paired with their correct outputs. In unsupervised learning, those outputs are not present in the data. Instead, the algorithm finds patterns in the data on its own, and the data gets grouped according to rules the algorithm itself discovers.
Clustering is commonly used for tasks such as customer segmentation or grouping similar documents. In customer segmentation, the algorithm groups customers based on certain parameters, producing groups that might correspond to "paying users", "members", "blog readers", and so on. An algorithm that groups data in this way is known as a clustering algorithm.
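As an illustration of grouping customers without any labels, here is a minimal k-means sketch. K-means is one common clustering algorithm and is an assumption here, as are the two made-up features (monthly spend and articles read), the data, and the naive choice of starting centroids.

```python
# Minimal k-means clustering sketch (standard library only).
# Customer data and feature choices are hypothetical.

def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def nearest_centroid(point, centroids):
    """Index of the centroid closest to the given point."""
    return min(range(len(centroids)), key=lambda i: squared_distance(point, centroids[i]))


def kmeans(points, k, iterations=10):
    # Naive deterministic start: use the first k points as centroids.
    centroids = [list(p) for p in points[:k]]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest_centroid(p, centroids)].append(p)
        for i, cluster in enumerate(clusters):
            if cluster:  # keep the old centroid if a cluster ends up empty
                centroids[i] = [sum(dim) / len(cluster) for dim in zip(*cluster)]
    labels = [nearest_centroid(p, centroids) for p in points]
    return labels, centroids


# Each customer: (monthly spend, articles read per month). Made-up numbers.
customers = [(50, 2), (1, 30), (55, 3), (0, 25), (60, 1), (2, 28)]

labels, centroids = kmeans(customers, k=2)
print(labels)
```

The algorithm only returns group numbers. Deciding that one group looks like "paying users" and the other like "blog readers" is an interpretation a person makes afterwards.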
Anomaly detection aims to identify unusual data points that don't fit the patterns found in the rest of the data. It is used, for example, in transaction fraud detection.
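One simple way to flag unusual data points can be sketched with a robust score based on the median. This is an assumed technique, not one the article prescribes: the transaction amounts are made up, and the constant 0.6745 and threshold 3.5 are a commonly used rule of thumb for this "modified z-score", not fixed requirements.

```python
# Minimal anomaly detection sketch using the median absolute deviation (MAD).
# Transaction amounts and the threshold are hypothetical.
import statistics


def find_anomalies(values, threshold=3.5):
    """Return the values whose modified z-score exceeds the threshold."""
    median = statistics.median(values)
    deviations = [abs(v - median) for v in values]
    mad = statistics.median(deviations)
    if mad == 0:
        return []  # all values are (nearly) identical; nothing stands out
    anomalies = []
    for v in values:
        score = 0.6745 * abs(v - median) / mad
        if score > threshold:
            anomalies.append(v)
    return anomalies


# Typical purchases plus one unusually large transaction.
amounts = [20, 25, 22, 19, 24, 21, 23, 500]

flagged = find_anomalies(amounts)
print(flagged)
```

The median is used instead of the mean because a single extreme value pulls the mean and standard deviation towards itself, which can hide the very point you are trying to find.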
Dimensionality reduction involves reducing the number of parameters (features) in the data while retaining as much of the information in the dataset as possible. It is used in tasks like data compression.
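To make the idea of keeping most of the information with fewer parameters concrete, here is a minimal sketch that compresses two-feature data down to one feature. It uses principal component analysis (PCA) restricted to the two-dimensional case, which is an assumed technique chosen for illustration; the data points and names are made up.

```python
# Minimal dimensionality reduction sketch: PCA from 2 features down to 1
# (standard library only). Data and names are hypothetical.
import math


def pca_2d_to_1d(points):
    """Project 2-D points onto their direction of largest variance."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    centered = [(x - mean_x, y - mean_y) for x, y in points]

    var_x = sum(x * x for x, _ in centered) / n
    var_y = sum(y * y for _, y in centered) / n
    cov_xy = sum(x * y for x, y in centered) / n

    # For a 2x2 covariance matrix, the main axis has this closed-form angle.
    angle = 0.5 * math.atan2(2 * cov_xy, var_x - var_y)
    direction = (math.cos(angle), math.sin(angle))

    # Each point is now described by one number instead of two.
    projected = [x * direction[0] + y * direction[1] for x, y in centered]

    kept_variance = sum(v * v for v in projected) / n
    kept_ratio = kept_variance / (var_x + var_y)
    return projected, direction, kept_ratio


# Two strongly related features, so one number can describe most of the data.
points = [(1.0, 1.1), (2.0, 1.9), (3.0, 3.2), (4.0, 3.8), (5.0, 5.1)]

projected, direction, kept_ratio = pca_2d_to_1d(points)
print(len(projected), round(kept_ratio, 3))
```

Here `kept_ratio` is the share of the original variance that survives the reduction. When the two features are strongly related, as in this made-up data, almost nothing is lost by storing one value per point instead of two.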