Trouble getting the page source with BeautifulSoup

2024/10/6 11:24:27

I am trying to get the HTML source of a web page using BeautifulSoup.

import bs4 as bs
import requests
import urllib.request

sourceUrl = 'https://www.pakwheels.com/forums/t/planing-a-trip-from-karachi-to-lahore-by-road-in-feb-2017/414115/2.html'
source = urllib.request.urlopen(sourceUrl).read()
soup = bs.BeautifulSoup(source, 'html.parser')
print(soup)

I want the HTML source of the page. This is what I am getting now:

'ps.store("siteSettings", {"title":"PakWheels Forums","contact_email":"[email protected]","contact_url":"https://www.pakwheels.com/main/contact_us","logo_url":"https://www.pakwheels.com/assets/logo.png","logo_small_url":"/images/d-logo-sketch-small.png","mobile_logo_url":"data:image/svg+xml;base64,PD94bWwgdmVyc2lvbj0iMS4wIiBlbmNvZGluZz0idXRmLTgiPz4NCjwhLS0gR2VuZXJhdG9yOiBBZG9iZSBJbGx1c3RyYXRvciAxNi4wLjAsIFNWRyBFeHBvcnQgUGx1Zy1JbiAuIFNWRyBWZXJzaW9uOiA2LjAwIEJ1aWxkIDApICAtLT4NCjwhRE9DVFlQRSBzdmcgUFVCTElDICItLy9XM0MvL0RURCBTVkcgMS4xLy9FTiIgImh0dHA6Ly93d3cudzMub3JnL0dyYXBoaWNzL1NWRy8xLjEvRFREL3N2ZzExLmR0ZCI+DQo8c3ZnIHZlcnNpb249IjEuMSIgaWQ9IkxheWVyXzEiIHhtbG5zPSJodHRwOi8vd3d3LnczLm9yZy8yMDAwL3N2ZyIgeG1sbnM6eGxpbms9Imh0dHA6Ly93d3cudzMub3JnLzE5OTkveGxpbmsiIHg9IjBweCIgeT0iMHB4Ig0KCSB3aWR0aD0iMjQwcHgiIGhlaWdodD0iNjBweCIgdmlld0JveD0iMCAwIDI0MCA2MCIgZW5hYmxlLWJhY2tncm91bmQ9Im5ldyAwIDAgMjQwIDYwIiB4bWw6c3BhY2U9InByZXNlcnZlIj4NCjxwYXRoIGZpbGw9IiNGRkZGRkYiIGQ9Ik02LjkwMiwyMy4yODZDMzQuNzc3LDIwLjI2Miw1Ny4yNC'
Answer

Have a look at this code:

from urllib import request
from bs4 import BeautifulSoup

url_1 = "http://www.google.com"
page = request.urlopen(url_1)
# Name the parser explicitly so bs4 does not have to guess one
soup = BeautifulSoup(page, "html.parser")
print(soup.prettify())

Make sure you import everything you need correctly; the code in the question imports requests without ever using it.
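
The output quoted in the question appears to be the body of an inline script on the page, which suggests the download worked and the printed document is simply very long. A minimal sketch of one way to handle that, assuming you only need a few elements rather than the whole dump: the helper names fetch_html and summarize, the User-Agent value, and the SAMPLE markup are all made up for illustration and are not part of the original answer.

```python
from urllib import request

from bs4 import BeautifulSoup


def fetch_html(url: str, timeout: float = 10.0) -> bytes:
    """Download the raw bytes of a page (hypothetical helper)."""
    # Assumption: some sites reject urllib's default User-Agent,
    # so a browser-like one is sent instead.
    req = request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with request.urlopen(req, timeout=timeout) as response:
        return response.read()


def summarize(html) -> dict:
    """Parse HTML and pull out a few elements instead of printing everything."""
    soup = BeautifulSoup(html, "html.parser")
    return {
        "title": soup.title.get_text(strip=True) if soup.title else None,
        # Only anchors that actually carry an href attribute
        "links": [a["href"] for a in soup.find_all("a", href=True)],
        "script_count": len(soup.find_all("script")),
    }


# Inline sample so the sketch runs without network access.
SAMPLE = b"""<html><head><title>Demo page</title>
<script>ps.store("siteSettings", {"title": "Demo"});</script></head>
<body><a href="/forums">Forums</a><a name="top">no href</a></body></html>"""

info = summarize(SAMPLE)
print(info)
```

To try it against a live page, pass the result of fetch_html(sourceUrl) to summarize. Whether a given site accepts that request is not something this sketch can guarantee.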

