Python: sklearn.linear_model.LinearRegression behaving strangely

2024/11/17 14:29:37

I am trying to do multiple-variable linear regression, but sklearn.linear_model is behaving very strangely. Here's my code:

import numpy as np
from sklearn import linear_model

b = np.array([3,5,7]).transpose() ## the right answer I am expecting
x = np.array([[1,6,9],   ## 1*3 + 6*5 + 9*7 = 96
              [2,7,7],   ## 2*3 + 7*5 + 7*7 = 90
              [3,4,5]])  ## 3*3 + 4*5 + 5*7 = 64
y = np.array([96,90,64]).transpose()

clf = linear_model.LinearRegression()
clf.fit([[1,6,9],[2,7,7],[3,4,5]], [96,90,64])
print(clf.coef_) ## <== it gives me [-2.2  5  4.4] NOT [3, 5, 7]
print(np.dot(x, clf.coef_)) ## <== it gives me [ 67.4  61.4  35.4]

Answer

To recover your original coefficients, pass the keyword fit_intercept=False when constructing the LinearRegression object.

import numpy as np
from sklearn import linear_model

b = np.array([3,5,7])
x = np.array([[1,6,9],
              [2,7,7],
              [3,4,5]])
y = np.array([96,90,64])

clf = linear_model.LinearRegression(fit_intercept=False)
clf.fit(x, y)
print(clf.coef_)
print(np.dot(x, clf.coef_))

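With the default fit_intercept=True, the LinearRegression object centers the data (it works with x - x.mean(axis=0)) and captures the mean with a constant offset, y = xb + c, which is what you would otherwise get by adding a column of ones to x. Here that leaves four unknowns for only three equations, so the fit is exact but not unique, and [3, 5, 7] is just one of the solutions. Using fit_intercept=False removes the offset, and the three equations then determine b uniquely.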
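One way to see where the "missing" part of the prediction went is to look at intercept_ next to coef_. The predictions printed in the question fall short of y by the same amount, 28.6, in every row, which is what a constant offset would produce. The sketch below reuses the question's data; the names with_c, no_c, pred_with_c and pred_no_c are only illustrative.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

x = np.array([[1, 6, 9],
              [2, 7, 7],
              [3, 4, 5]])
y = np.array([96, 90, 64])

# Default model: y = x @ coef_ + intercept_
with_c = LinearRegression().fit(x, y)
pred_with_c = x @ with_c.coef_ + with_c.intercept_

# No offset: y = x @ coef_
no_c = LinearRegression(fit_intercept=False).fit(x, y)
pred_no_c = x @ no_c.coef_

print(with_c.coef_, with_c.intercept_)  # coefficients plus a nonzero offset
print(pred_with_c)                      # matches y once the offset is added
print(no_c.coef_, no_c.intercept_)      # offset is fixed at 0.0
print(pred_no_c)
```

Both models reproduce y on these three points. They differ only in how the fit is split between the coefficients and the offset.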

As a side remark, calling transpose on a 1D array doesn't have any effect (it reverses the order of your axes, and you only have one).
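A quick check of the transpose remark, as a minimal sketch. The reshape call is one way to get an explicit column vector if that is what was intended; the name col is only illustrative.

```python
import numpy as np

b = np.array([3, 5, 7])

# A 1D array has a single axis, so reversing the axes changes nothing.
print(b.shape, b.transpose().shape)  # (3,) (3,)

# An explicit column vector needs a second axis.
col = b.reshape(-1, 1)
print(col.shape)  # (3, 1)
```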
