[editorskit display=”wordcount” before=”Reading Time: ” after=” min”]
In most of the practical Machine Learning cases, we use Supervised Learning category. As the name suggests, supervised learning category is concentrated on mapping patterns by establishing the relationship between variables and known outcomes while working with labelled datasets. Simply saying, it’s like having an input variable (x) and output variable (y) and use an algorithm to make it learn to establish a mapping function between the input and output.
y = f(x)
In Supervised learning, we provide the machine with sample data with various features (called as input variables, x) and the correct output of data (called as output variable, y). Since the output and feature values are known, we call the dataset as “labelled”. The algorithm then establishes a pattern that exists in the data and creates a model.
For example, to predict the market rate for the purchase of a house, a supervised algorithm can make predictions by analysing the relationship between the house attributes like year of construction, number of storeys, surrounding neighbourhood etc. and the selling price of other houses based on historical data available. Since in the supervised algorithms, we already know the prices of houses sold, we can work backward in it to determine the relationship between the characteristics of the house and its value.
Once the machine establishes the rules and patterns of the data provided, a model is created which is an algorithmic equation for producing an outcome with new data based on the rules derived from the training data. After the model is prepared, it is applied to a new data and tested for its accuracy. The model must pass both the training and test data stages successfully to be ready to be used in the real world.
Our goal is to establish a mapping function so well that it can later predict the output variable (y) for the new input variable (x). This method is called supervised learning because the process of algorithmic learning is from training by the known datasets and can be thought of as a teacher supervising the learning process. We already know the correct answer in advance and make algorithm iteratively predict on the training data which is similar to teacher correcting the answer and stop training once the algorithm achieves an acceptable or satisfactory level of performance.
Supervised Learning is classified into two categories of algorithms:
Classification: A classification problem is when the output variable is a category such as “Red” or “Blue” or “disease” and “no disease”.
Regression: A regression problem is when the output variable is a real value, such as “dollars” or “weight”.