Data Science

A simple guide to the Linear Regression Algorithm

Linear regression algorithm

The simplest machine learning algorithm

In this algorithm we use labeled data (e.g. the size of a house and its price).

Using this data and linear regression, we can predict the price of a house from its size.

How does this algorithm work?

1. We start by plotting our data in a scatter plot, with the size on the x-axis and the price on the y-axis.

2. Our work now is to draw a line that best fits these points.

We know that the equation of a straight line is h(x) = theta0 + x*theta1. We call this equation the hypothesis function.

We need to find the values of theta0 and theta1 that make the line fit the data.
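
The hypothesis can be written as a small function. This is only a sketch: the name `hypothesis` and the parameter values in the example are made up for illustration.

```python
def hypothesis(x, theta0, theta1):
    """Straight-line hypothesis h(x) = theta0 + x*theta1."""
    return theta0 + x * theta1


# Hypothetical parameters: a base price of 50 plus 2 per unit of size.
predicted_price = hypothesis(100, theta0=50.0, theta1=2.0)
print(predicted_price)  # 250.0
```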

Cost Function :

It shows how close our predicted values are to the actual values.
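
A common way to measure this is half the mean squared error, J = (1/(2m)) * sum((h(x_i) - y_i)^2) over the m training examples. That choice is an assumption here, not something stated in the text above. A minimal sketch, with `compute_cost` and the sample data invented for illustration:

```python
def compute_cost(xs, ys, theta0, theta1):
    """Half the mean squared error between predictions and actual values."""
    m = len(xs)
    total = 0.0
    for x, y in zip(xs, ys):
        error = (theta0 + x * theta1) - y  # prediction minus actual value
        total += error ** 2
    return total / (2 * m)


# Hypothetical data that lies exactly on the line y = 2x.
sizes = [1.0, 2.0, 3.0]
prices = [2.0, 4.0, 6.0]

perfect = compute_cost(sizes, prices, 0.0, 2.0)  # the line passes through every point
worse = compute_cost(sizes, prices, 0.0, 1.0)    # the line is too flat
print(perfect, worse)
```

The better the line fits, the smaller the cost. A line that passes through every point has a cost of zero.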

Now we have to minimize the cost, and we can do that using gradient descent.

Gradient Descent :
Gradient descent is an algorithm that minimizes the cost function by repeatedly adjusting theta0 and theta1 until it finds their best values.

Each gradient descent step is scaled by alpha, the learning rate. We should choose it carefully: if alpha is too large, gradient descent can fail to converge, and if it is too small, it converges slowly.
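
As a rough sketch, gradient descent for the straight-line hypothesis could look like the code below. It assumes the half mean squared error cost, whose partial derivatives are the average error (for theta0) and the average of error times x (for theta1). The function name, starting values, and sample data are illustrative only.

```python
def gradient_descent(xs, ys, alpha=0.1, iterations=2000):
    """Fit theta0 and theta1 by batch gradient descent (assumed MSE cost)."""
    m = len(xs)
    theta0, theta1 = 0.0, 0.0  # fixed starting point, chosen for reproducibility
    for _ in range(iterations):
        errors = [(theta0 + x * theta1) - y for x, y in zip(xs, ys)]
        grad0 = sum(errors) / m                               # dJ/dtheta0
        grad1 = sum(e * x for e, x in zip(errors, xs)) / m    # dJ/dtheta1
        # Update both parameters at the same time, scaled by alpha.
        theta0, theta1 = theta0 - alpha * grad0, theta1 - alpha * grad1
    return theta0, theta1


# Hypothetical data generated from the line y = 1 + 2x.
sizes = [1.0, 2.0, 3.0, 4.0]
prices = [3.0, 5.0, 7.0, 9.0]

theta0, theta1 = gradient_descent(sizes, prices)
print(round(theta0, 4), round(theta1, 4))
```

On this small data set, a much larger learning rate such as alpha=0.5 makes every step overshoot, and the parameters grow without bound instead of settling.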

After finding theta0 and theta1, we can draw our fitted line over the scatter plot.

Our linear regression algorithm is now done. Given the size of any house (x), we can predict its price (h(x)).

But what if our data looks more like a quadratic curve? Will a straight line still be the best choice?
If our data follows a quadratic curve, we can add an extra term with its own parameter to the hypothesis function.
Our hypothesis function will then look like this:

h(x) = theta0 + x*theta1 + x^2*theta2

Then we apply the same steps as before.
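
A quadratic hypothesis is usually written with one extra parameter multiplying x squared. The sketch below assumes that form and calls the extra parameter theta2. The function name and example values are hypothetical.

```python
def quadratic_hypothesis(x, theta0, theta1, theta2):
    """Quadratic hypothesis: theta0 + x*theta1 + x^2*theta2 (assumed form)."""
    return theta0 + x * theta1 + (x ** 2) * theta2


# Hypothetical parameters, only to show the shape of the calculation.
value = quadratic_hypothesis(3.0, theta0=1.0, theta1=2.0, theta2=0.5)
print(value)  # 11.5
```

The function is still linear in the parameters, so x and x^2 can be treated as two separate features and fitted with the same cost function and gradient descent steps.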

Feature Scaling :
Idea: make sure all of our features (like price and size) are on a similar scale.
If we ignore this, gradient descent may take longer to converge, or may even fail to converge.

Steps to apply feature scaling
1. For each feature, subtract the mean from its values
2. Divide the new values by the feature's range (max - min)
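
The two steps above can be sketched as follows. The function name is illustrative, and returning zeros for a constant feature (where the range is zero) is an assumption added to avoid dividing by zero.

```python
def scale_feature(values):
    """Subtract the mean from each value, then divide by the range (max - min)."""
    mean = sum(values) / len(values)
    value_range = max(values) - min(values)
    if value_range == 0:
        # Assumption: a constant feature carries no information, so map it to zeros.
        return [0.0 for _ in values]
    return [(v - mean) / value_range for v in values]


# Hypothetical house sizes.
sizes = [50.0, 100.0, 150.0]
scaled = scale_feature(sizes)
print(scaled)  # [-0.5, 0.0, 0.5]
```

After scaling, the values are centered on zero and span a range of exactly 1.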

Summary
1. Apply feature scaling
2. Draw the data using a scatter plot
3. Choose the best hypothesis function based on this scatter plot
4. Start with random values of theta0 and theta1
5. Calculate the cost function
6. Use gradient descent to adjust the values of theta0 and theta1 and minimize the cost
7. Now you are ready to predict the price of a house from its size
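
Put together, the whole procedure might look like the sketch below. It assumes a straight-line hypothesis and the half mean squared error cost. Plotting is left out, and all names and data are made up for illustration. The random start is seeded so the run is repeatable.

```python
import random


def fit_linear_regression(xs, ys, alpha=0.1, iterations=5000, seed=0):
    """Scale the feature, run gradient descent, and return a predict function."""
    # Feature scaling: subtract the mean, then divide by the range.
    mean = sum(xs) / len(xs)
    value_range = max(xs) - min(xs)
    scaled = [(x - mean) / value_range for x in xs]

    # Start with random values of theta0 and theta1.
    rng = random.Random(seed)
    theta0, theta1 = rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)

    m = len(xs)
    for _ in range(iterations):
        errors = [(theta0 + s * theta1) - y for s, y in zip(scaled, ys)]
        grad0 = sum(errors) / m
        grad1 = sum(e * s for e, s in zip(errors, scaled)) / m
        theta0, theta1 = theta0 - alpha * grad0, theta1 - alpha * grad1

    def predict(x):
        # New inputs must be scaled the same way as the training data.
        return theta0 + ((x - mean) / value_range) * theta1

    return predict


# Hypothetical data generated from price = 50 + 2 * size.
sizes = [50.0, 100.0, 150.0, 200.0]
prices = [150.0, 250.0, 350.0, 450.0]

predict = fit_linear_regression(sizes, prices)
estimate = predict(120.0)
print(round(estimate, 2))  # 290.0
```

The mean and range from the training data are reused when predicting, because theta0 and theta1 were learned on the scaled sizes.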

Mohamed Gad
Mar 27, 2022