In this section, I am going to explain how to use scikit-learn (sklearn), a machine learning package for Python, to do linear regression on a set of data points.
Please go through the previous section, Linear Regression theory, for better understanding.
I am not going to explain training and testing data or model evaluation here, but they are important concepts.
We know that the equation of a line is given by:
y = mx + b
where 'm' is the slope and 'b' is the intercept.
Our goal is to find the best values of the slope (m) and intercept (b) to fit our data.
Linear Regression in scikit-learn uses the Ordinary Least Squares method to fit our data points.
Import statement for the linear models module:
from sklearn import linear_model
I have height and weight data for some people. Let's use this data to do linear regression and try to predict the weight of other people.
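# Each height is wrapped in its own list because scikit-learn expects one row per sample.
height = [[4.0], [4.5], [5.0], [5.2], [5.4], [5.8], [6.1], [6.2], [6.4], [6.8]]
weight = [42, 44, 49, 55, 53, 58, 60, 64, 66, 69]
print("height weight")
for h, w in zip(height, weight):
    # h is a one-element list, so print its single value next to the weight
    print(h[0], "->", w)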
height weight
4.0 -> 42
4.5 -> 44
5.0 -> 49
5.2 -> 55
5.4 -> 53
5.8 -> 58
6.1 -> 60
6.2 -> 64
6.4 -> 66
6.8 -> 69
Import statement to plot graphs using matplotlib:
import matplotlib.pyplot as plt
Plotting the height and weight data:
plt.scatter(height, weight, color='black')
plt.xlabel("height")
plt.ylabel("weight")
plt.show()
Creating the Linear Regression model and calling the fit method to learn from the data:
reg = linear_model.LinearRegression()
reg.fit(height, weight)
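Slope and intercept:
# coef_ holds one coefficient per feature; we have a single feature, so take the first entry
m = reg.coef_[0]
b = reg.intercept_
print("slope=", m, "intercept=", b)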
slope= 10.1936218679 intercept= -0.4726651480
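As a cross-check on these values, the slope and intercept can also be computed by hand from the closed-form least-squares formulas for a single feature. This is a minimal sketch in plain Python, not a description of what scikit-learn does internally; the names xs, ys, s_xy and s_xx are my own.

```python
# Closed-form Ordinary Least Squares for one feature (illustrative sketch).
xs = [4.0, 4.5, 5.0, 5.2, 5.4, 5.8, 6.1, 6.2, 6.4, 6.8]  # heights
ys = [42, 44, 49, 55, 53, 58, 60, 64, 66, 69]             # weights

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Sum of cross-deviations and sum of squared deviations of x
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
s_xx = sum((x - mean_x) ** 2 for x in xs)

m = s_xy / s_xx          # slope
b = mean_y - m * mean_x  # intercept: the fitted line passes through the means

print("slope=", m, "intercept=", b)
```

The printed slope and intercept should agree with the values reported by the model above to several decimal places.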
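Plotting the fitted line over the data:
plt.scatter(height, weight, color='black')
# Apply y = mx + b to every height to get the points on the fitted line
predicted_values = [reg.coef_[0] * h[0] + reg.intercept_ for h in height]
plt.plot(height, predicted_values, 'b')
plt.xlabel("height")
plt.ylabel("weight")
plt.show()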
Now we can go ahead and predict the weight of people whose data we do not have:
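A minimal sketch of the prediction step, refitting the model on the same data so that the snippet runs on its own. The query heights (5.5 and 6.0) are made-up values for illustration, and new_heights and predicted are hypothetical names.

```python
from sklearn import linear_model

height = [[4.0], [4.5], [5.0], [5.2], [5.4], [5.8], [6.1], [6.2], [6.4], [6.8]]
weight = [42, 44, 49, 55, 53, 58, 60, 64, 66, 69]

reg = linear_model.LinearRegression()
reg.fit(height, weight)

# predict expects 2D input: one row per person, one column per feature
new_heights = [[5.5], [6.0]]  # illustrative values, not from the original data
predicted = reg.predict(new_heights)

for h, w in zip(new_heights, predicted):
    print(h[0], "->", round(float(w), 2))
```

Predictions for heights far outside the 4.0 to 6.8 range of the training data are extrapolations and should be treated with caution.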