Python BeautifulSoup: get all href values in the children of a div

2024/10/5 19:15:34

I am new to Python and I've been trying to get the links and inner text from this HTML code:

    <div class="someclass">
      <ul class="listing">
        <li><a href="http://link1.com" title="">title1</a></li>
        <li><a href="http://link2.com" title="">title2</a></li>
        <li><a href="http://link3.com" title="">title3</a></li>
        <li><a href="http://link4.com" title="">title4</a></li>
      </ul>
    </div>

I want all of the links from the href attributes (for example http://link1.com), and only those, together with the inner text (for example title1).

I tried this code:

    div = soup.find_all('ul',{'class':'listing'})
    for li in div:
        all_li = li.find_all('li')
        for link in all_li.find_all('a'):
            print(link.get('href'))

but had no luck. Can someone help me?

Answer

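The problem is in your second for loop: all_li is the list of results returned by find_all('li'), and a list of results has no find_all method of its own, so all_li.find_all('a') raises an AttributeError. Loop over the individual li elements instead and call find('a') on each one: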

>>> for ul in soup.find_all('ul', class_='listing'):
...     for li in ul.find_all('li'):
...         a = li.find('a')
...         print(a['href'], a.get_text())
... 
http://link1.com title1
http://link2.com title2
http://link3.com title3
http://link4.com title4
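
To make the failure mode concrete, here is a minimal sketch that reproduces it on a shortened, one-item version of the markup. The html string and the variable names are assumptions for illustration, and html.parser is used only because it ships with Python.

```python
from bs4 import BeautifulSoup

# Shortened, illustrative version of the markup from the question.
html = '<ul class="listing"><li><a href="http://link1.com">title1</a></li></ul>'
soup = BeautifulSoup(html, "html.parser")

ul = soup.find("ul", class_="listing")
all_li = ul.find_all("li")  # a list-like ResultSet of Tag objects, not a Tag

# Calling find_all on the whole result list fails, as in the question.
try:
    all_li.find_all("a")
    failed = False
except AttributeError:
    failed = True

# Calling find on each individual Tag works.
hrefs = [li.find("a")["href"] for li in all_li]
print(failed, hrefs)
```

The distinction to keep in mind is that find returns a single Tag (or None), while find_all always returns a collection that has to be iterated.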

You can also use a CSS selector instead of the nested for loops:

>>> for a in soup.select('.listing li a'):
...     print(a['href'], a.get_text(strip=True))
... 
http://link1.com title1
http://link2.com title2
http://link3.com title3
http://link4.com title4
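
For completeness, a self-contained script version of the selector approach might look like the sketch below. The extract_links helper is a hypothetical name, the markup is trimmed to two items, and the a[href] part of the selector is an added assumption that skips any anchor without an href attribute.

```python
from bs4 import BeautifulSoup

# Trimmed copy of the markup from the question (two items instead of four).
html = """
<div class="someclass">
  <ul class="listing">
    <li><a href="http://link1.com" title="">title1</a></li>
    <li><a href="http://link2.com" title="">title2</a></li>
  </ul>
</div>
"""


def extract_links(markup):
    """Return (href, text) pairs for links inside div.someclass ul.listing.

    Hypothetical helper, not part of BeautifulSoup itself.
    """
    soup = BeautifulSoup(markup, "html.parser")
    # a[href] matches only anchors that actually carry an href attribute,
    # so a["href"] below cannot raise KeyError.
    anchors = soup.select("div.someclass ul.listing a[href]")
    return [(a["href"], a.get_text(strip=True)) for a in anchors]


links = extract_links(html)
for href, text in links:
    print(href, text)
```

Returning a list of pairs rather than printing inside the loop keeps the parsing step reusable, for example if the links later need to be written to a file.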