Linear Regression: How to find the distance between the points and the prediction line?

2024/9/16 22:58:55

I'm looking to find the distance between the points and the prediction line. Ideally I would like the results to be displayed in a new column which contains the distance, called 'Distance'.

My Imports:

import os.path
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn import preprocessing
from sklearn.linear_model import LinearRegression
%matplotlib inline 

Sample of my data:

idx  Exam Results  Hours Studied
0       93          8.232795
1       94          7.879095
2       92          6.972698
3       88          6.854017
4       91          6.043066
5       87          5.510013
6       89          5.509297

My code so far:

x = df['Hours Studied'].values[:,np.newaxis]
y = df['Exam Results'].values
model = LinearRegression()
model.fit(x, y)
plt.scatter(x, y,color='r')
plt.plot(x, model.predict(x),color='k')
plt.show()

My plot

Any help would be greatly appreciated. Thanks

Answer

You simply need to assign the difference between y and model.predict(x) to a new column (or take the absolute value if you just want the magnitude of the difference):

#df["Distance"] = abs(y - model.predict(x))  # if you only want magnitude
df["Distance"] = y - model.predict(x)
print(df)
#   Exam Results  Hours Studied  Distance
#0            93       8.232795 -0.478739
#1            94       7.879095  1.198511
#2            92       6.972698  0.934043
#3            88       6.854017 -2.838712
#4            91       6.043066  1.714063
#5            87       5.510013 -1.265269
#6            89       5.509297  0.736102

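This works because your model predicts a y value (the dependent variable) for each x value (the independent variable). Each point and its prediction share the same x coordinate, so the difference in y is the vertical distance from the point to the line, also called the residual.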
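
For reference, here is a minimal self-contained sketch of the same idea, assuming the seven sample rows from the question are the whole dataset. It also computes the perpendicular (shortest) distance to the line, in case that is what is meant by "distance"; for a line with slope m this is the absolute vertical distance divided by sqrt(1 + m^2). The column name 'Perpendicular Distance' is a hypothetical name chosen for this example and does not come from the original post.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Sample rows copied from the question (assumed here to be the full dataset)
df = pd.DataFrame({
    "Exam Results": [93, 94, 92, 88, 91, 87, 89],
    "Hours Studied": [8.232795, 7.879095, 6.972698, 6.854017,
                      6.043066, 5.510013, 5.509297],
})

x = df[["Hours Studied"]].to_numpy()  # 2-D array of shape (n_samples, 1)
y = df["Exam Results"].to_numpy()     # 1-D array of shape (n_samples,)

model = LinearRegression().fit(x, y)

# Signed vertical distance (residual): positive means the point lies above the line
df["Distance"] = y - model.predict(x)

# Perpendicular distance to the fitted line y = m*x + b.
# 'Perpendicular Distance' is a hypothetical column name for illustration.
slope = model.coef_[0]
df["Perpendicular Distance"] = df["Distance"].abs() / np.sqrt(1 + slope**2)

print(df)
```

The vertical distance is what a regression minimizes, so it is usually the quantity of interest. The perpendicular distance depends on the units of both axes and is never larger than the absolute vertical distance.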
