How can I stop find_next_sibling() once I reach a certain tag?

2024/11/19 13:40:57

I am scraping athletic.net, a website that stores track and field times. So far I can print event titles and times, but the output under each event contains every remaining time from that season rather than only the times for that event. I currently use a for loop with an arbitrary number of iterations. Instead, I would like to keep calling find_next_sibling() until the sibling is an h5 tag, because the h5 tags are the titles of each event. In short, how can I stop my loop when find_next_sibling() returns an h5 tag? I think this should be a simple while loop, but I have struggled to implement it.

for text in soup.find_all('h5'):
    if "Season" in str(text):
        text_file.write(('\n' + '\n' + str(text.contents[0])) + '\n')
    else:
        text_file.write(str(text.contents[0]) + '\n')
    block = ""
    for i in range(0,100):
        try:
            text = text.find_next_sibling()
            block = block + str(text) + '\n'
        except:
            print("miss")
    soupBlock = BeautifulSoup(block)
    for t in soupBlock.select('tr td:nth-of-type(2) [href^="/result"]'):
        text_file.write(str(t.contents[0]) + '\n')

Output:

2021 Outdoor Season 800 Meters
2:14.81
2:12.32
4:43.62
4:44.21
4:42.11
10:26.85
10:09.89
10:21.49
1600 Meters
4:43.62
4:44.21
4:42.11
10:26.85
10:09.89
10:21.49
3200 Meters
10:26.85
10:09.89
10:21.49

Desired output:

2021 Outdoor Season 800 Meters
2:14.81
2:12.32
1600 Meters
4:43.62
4:44.21
4:42.11
3200 Meters
10:26.85
10:09.89
10:21.49

Answer

This turned out to be a simple problem; I was overthinking it. I only had to check for an h5 tag while stepping through the siblings and break out of the loop as soon as I found one.

for i in range(0,100):
    try:
        text = text.find_next_sibling()
        block = block + str(text) + '\n'
        if text.name == 'h5':
            break
    except:
        print("miss")
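
For comparison, here is a minimal sketch of the while-loop version the question was reaching for. It is not the original poster's code: the SAMPLE_HTML markup and the times_by_event helper are hypothetical stand-ins, since the real page structure is not shown, and the sketch assumes each event's results sit in sibling elements between consecutive h5 tags. The loop stops either at the next h5 or when find_next_sibling() returns None, so no arbitrary iteration count or bare except is needed.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the real page (assumption: each
# event's results are siblings that follow its <h5> title).
SAMPLE_HTML = """
<div>
  <h5>800 Meters</h5>
  <table><tr><td>1</td><td><a href="/result/1">2:14.81</a></td></tr></table>
  <table><tr><td>2</td><td><a href="/result/2">2:12.32</a></td></tr></table>
  <h5>1600 Meters</h5>
  <table><tr><td>1</td><td><a href="/result/3">4:43.62</a></td></tr></table>
</div>
"""


def times_by_event(soup):
    """Map each h5 title to the result times found before the next h5."""
    results = {}
    for heading in soup.find_all("h5"):
        times = []
        sibling = heading.find_next_sibling()
        # Stop at the next event title, or when there are no more siblings.
        while sibling is not None and sibling.name != "h5":
            # Same selector as in the question, applied to one sibling at a time.
            for link in sibling.select('tr td:nth-of-type(2) [href^="/result"]'):
                times.append(link.get_text(strip=True))
            sibling = sibling.find_next_sibling()
        results[heading.get_text(strip=True)] = times
    return results


soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
events = times_by_event(soup)
for event, times in events.items():
    print(event)
    for t in times:
        print(t)
```

Running the selector on each sibling directly also avoids rebuilding a string and parsing it a second time, and the loop variable no longer overwrites the heading being processed.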