How to create a simple Gradient Descent algorithm

2024/10/4 9:34:09

I'm studying simple machine learning algorithms, beginning with a simple gradient descent, but I've got some trouble trying to implement it in python.

Here is the example I'm trying to reproduce, I've got data about houses with the (living area (in feet2), and number of bedrooms) with the resulting price :

Living area (feet2) : 2104

#bedrooms : 3

Price (1000$s) : 400

I'm trying to do a simple regression using the gradient descent method, but my algorithm won't work... The form of the algorithm is not using vectors on purpose (I'm trying to understand it step by step).

i = 1
import sys
derror=sys.maxint
error = 0
step = 0.0001
dthresh = 0.1
import randomtheta1 = random.random()
theta2 = random.random()
theta0 = random.random()
while derror>dthresh:diff = 400 - theta0 - 2104 * theta1 - 3 * theta2theta0 = theta0 + step * diff * 1theta1 = theta1 + step * diff * 2104theta2 = theta2 + step * diff * 3hserror = diff**2/2derror = abs(error - hserror)error = hserrorprint 'iteration : %d, error : %s' % (i, error)i+=1

I understand the math, I'm constructing a predicting function $$h_{\theta}(x) = \theta_0 + \theta_1 x_1 + \theta_2 x_2$$with $x_1$and $x_2$being the variables (living area, number of bedrooms) and $h_{\theta}(x)$the estimated price.

I'm using the cost function ($hserror$ ) (for one point) : $$hserror = \frac{1}{2} (h_{\theta}(x) - y)^2$$This is a usual problem, but I'm more of a software engineer and I'm learning one step at a time, can you tell me what's wrong ?

I got it working with this code :

data = {(2104, 3) : 400, (1600,3) : 330, (2400, 3) : 369, (1416, 2) : 232, (3000, 4) : 540}
for x in range(10):i = 1import sysderror=sys.maxinterror = 0step = 0.00000001dthresh = 0.0000000001import randomtheta1 = random.random()*100theta2 = random.random()*100theta0 = random.random()*100while derror>dthresh:diff = 400 - (theta0 + 2104 * theta1 + 3 * theta2)theta0 = theta0 + step * diff * 1theta1 = theta1 + step * diff * 2104theta2 = theta2 + step * diff * 3hserror = diff**2/2derror = abs(error - hserror)error = hserror#print 'iteration : %d, error : %s, derror : %s' % (i, error, derror)i+=1print ' theta0 : %f, theta1 : %f, theta2 : %f' % (theta0, theta1, theta2)print ' done : %f' %(theta0 + 2104 * theta1 + 3*theta2)

which ends up with answers like this :

 theta0 : 48.412337, theta1 : 0.094492, theta2 : 50.925579done : 400.000043theta0 : 0.574007, theta1 : 0.185363, theta2 : 3.140553done : 400.000042theta0 : 28.588457, theta1 : 0.041746, theta2 : 94.525769done : 400.000043theta0 : 42.240593, theta1 : 0.096398, theta2 : 51.645989done : 400.000043theta0 : 98.452431, theta1 : 0.136432, theta2 : 4.831866done : 400.000043theta0 : 18.022160, theta1 : 0.148059, theta2 : 23.487524done : 400.000043theta0 : 39.461977, theta1 : 0.097899, theta2 : 51.519412done : 400.000042theta0 : 40.979868, theta1 : 0.040312, theta2 : 91.401406done : 400.000043theta0 : 15.466259, theta1 : 0.111276, theta2 : 50.136221done : 400.000043theta0 : 72.380926, theta1 : 0.013814, theta2 : 99.517853done : 400.000043
Answer

First issue is that running this with only one piece of data gives you an underdetermined system... this means it may have an infinite number of solutions. With three variables, you'd expect to have at least 3 data points, preferably much higher.

Secondly using gradient descent where the step size is a scaled version of the gradient is not guaranteed to converge except in a small neighbourhood of the solution. You can fix that by switching to either a fixed size step in the direction of the negative gradient (slow) or a linesearch in the direction of the negative gradient ( faster, but slightly more complicated)

So for fixed step size instead of

theta0 = theta0 - step * dEdtheta0
theta1 = theta1 - step * dEdtheta1
theta2 = theta2 - step * dEdtheta2

You do this

n = max( [ dEdtheta1, dEdtheta1, dEdtheta2 ] )    
theta0 = theta0 - step * dEdtheta0 / n
theta1 = theta1 - step * dEdtheta1 / n
theta2 = theta2 - step * dEdtheta2 / n

It also looks like you may have a sign error in your steps.

I'm also not sure that derror is a good stopping criteria. (But stopping criteria are notoriously hard to get "right")

My final point is that gradient descent is horribly slow for parameter fitting. You probably want to use conjugate-gradient or Levenberg-Marquadt methods instead. I suspect that both of these methods already exist for python in the numpy or scipy packages (which aren't part of python by default but are pretty easy to install)

https://en.xdnf.cn/q/70619.html

Related Q&A

login_required decorator on a class based view in django

I have a working class based view. But when adding @login_required I get the error:AttributeError: function object has no attribute as_viewSomething is happening to the ResultListView here:from django.…

Generic way to get primary key from declaratively defined instance in SQLAlchemy

Does SQLAlchemy offer a generic way to get the primary key from a declaratively defined instance, so that if:Base = declarative_base()class MyClass(Base):__tablename__ = mytablekey = Column(Integer, pr…

Add column after another column

How can I add a column after another column to a database using Alembic or SQLAlchemy? That would be equivalent to this SQL clause: ALTER TABLE foo CHANGE COLUMN bar bar COLUMN_DEFINITION_HERE AFTER …

Scrapy Images Downloading

My spider runs without displaying any errors but the images are not stored in the folder here are my scrapy files:Spider.py:import scrapy import re import os import urlparse from scrapy.spiders import …

A full and minimal example for a class (not method) with Python C Extension?

Everywhere, I can easily find an example about writing a method with Python C Extensions and use it in Python. Like this one: Python 3 extension example$ python3 >>> import hello >>> …

Python: Grouping into timeslots (minutes) for days of data

I have a list of events that occur at mS accurate intervals, that spans a few days. I want to cluster all the events that occur in a per-n-minutes slot (can be twenty events, can be no events). I have …

signal.alarm not triggering exception on time

Ive slightly modified the signal example from the official docs (bottom of page).Im calling sleep 10 but I would like an alarm to be raised after 1 second. When I run the following snippet it takes way…

Execute Python (selenium) script in crontab

I have read most of the python/cron here in stackoverflow and yet couldnt make my script run. I understood that I need to run my script through shell (using zsh & ipython by the way), but really I …

Get post data from ajax post request in python file

Im trying to post some data with an ajax post request and execute a python file, retrieving the data in the python file, and return a result.I have the following ajax code$(function () {$("#upload…

How to implement maclaurin series in keras?

I am trying to implement expandable CNN by using maclaurin series. The basic idea is the first input node can be decomposed into multiple nodes with different orders and coefficients. Decomposing singl…