I have been following this tutorial: https://kb.objectrocket.com/postgresql/scrape-a-website-to-postgres-with-python-938
My app.py file looks like this (taken from the above tutorial):
from flask import Flask # needed for flask-dependent libraries below
from flask import render_template # to render the error page
from selenium import webdriver # to grab source from URL
from bs4 import BeautifulSoup # for searching through HTML
import psycopg2 # for database access

# set up Postgres database connection and cursor.
t_host = "localhost" # either "localhost", a domain name, or an IP address.
t_port = "5432" # default postgres port
t_dbname = "scrape"
t_user = "postgres"
t_pw = "********"
db_conn = psycopg2.connect(host=t_host, port=t_port, dbname=t_dbname, user=t_user, password=t_pw)
db_cursor = db_conn.cursor()

app = Flask(__name__)

@app.route('/import_temp')
def import_temp():
    # set up your webdriver to use Chrome web browser
    my_web_driver = webdriver.Chrome("/usr/lib/chromium-browser/chromedriver")

    # designate the URL we want to scrape
    # NOTE: the long string of characters at the end of this URL below is a clue that
    # maybe this page is so dynamic, like maybe refers to a specific web session and/or day/time,
    # that we can't necessarily count on it to be the same more than one time.
    # Which means... we may want to find another source for our data; one that is more
    # dependable. That said, whatever URL you use, the methodology in this lesson stands.
    t_url = "https://weather.com/weather/today/l/7ebb344012f0c5ff88820d763da89ed94306a86c770fda50c983bf01a0f55c0d"

    # initiate scrape of website page data
    my_web_driver.get("<a href='" + t_url + "'>" + t_url + "</a>")

    # return entire page into "t_content"
    t_content = my_web_driver.page_source

    # use soup to make page content easily searchable
    soup_in_bowl = BeautifulSoup(t_content)

    # search for the UNIQUE span and class for the data we are looking for:
    o_temp = soup_in_bowl.find('span', attrs={'class': 'deg-feels'})

    # from the resulting object, "o_temp", get the text parameter and assign it to "n_temp"
    n_temp = o_temp.text

    # Build SQL for purpose of:
    # saving the temperature data to a new row
    s = ""
    s += "INSERT INTO tbl_temperatures"
    s += "("
    s += "n_temp"
    s += ") VALUES ("
    s += "(%n_temp)"
    s += ")"

    # Trap errors for opening the file
    try:
        db_cursor.execute(s, [n_temp, n_temp])
        db_conn.commit()
    except psycopg2.Error as e:
        t_msg = "Database error: " + e + "/n open() SQL: " + s
        return render_template("error_page.html", t_msg = t_msg)

    # Success!
    # Show a message to user.
    t_msg = "Successful scrape!"
    return render_template("progress.html", t_msg = t_msg)

    # Clean up the cursor and connection objects
    db_cursor.close()
    db_conn.close()
When I run the code and go to http://127.0.0.1:5000, I get a 404 error:
Not Found
The requested URL was not found on the server. If you entered the URL manually please check your spelling and try again.
Here is the output from the command line:
FLASK_APP = app.py
FLASK_ENV = development
FLASK_DEBUG = 0
In folder /home/lloyd/PycharmProjects/flaskProject
/home/lloyd/PycharmProjects/flaskProject/venv/bin/python -m flask run
 * Serving Flask app 'app.py' (lazy loading)
 * Environment: development
 * Debug mode: off
 * Running on http://127.0.0.1:5000/ (Press CTRL+C to quit)
127.0.0.1 - - [28/Dec/2021 08:25:01] "GET / HTTP/1.1" 404 -
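To narrow this down, here is a minimal sketch that leaves out Selenium and Postgres entirely. It is not my actual app: the view body is a placeholder, and it only assumes that Flask is installed. It registers the same single route as app.py (`/import_temp`) and requests both `/` and `/import_temp` through Flask's built-in test client, so the status codes can be compared without starting a server.

```python
from flask import Flask

app = Flask(__name__)


@app.route('/import_temp')
def import_temp():
    # Placeholder body (assumption): the real view does the scrape
    # and the database insert, which are left out here.
    return "Successful scrape!"


# The test client sends requests to the app without running a server.
client = app.test_client()

root_status = client.get('/').status_code               # path from the log line above
import_status = client.get('/import_temp').status_code  # the only registered route

print(root_status, import_status)  # prints: 404 200
```

If this behaves the same way in the real project, the 404 would seem to come from requesting `/` when the only registered route is `/import_temp`, rather than from the scraping or database code.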
I did run a test 'Hello World' project which was successful.
Any insight as to why I'm receiving this error would be greatly appreciated.
Lloyd