Python BS: Fetching rows with and without color attribute

2024/11/17 19:24:10

I have some HTML that looks like this. It represents rows of data in a table, i.e., the data between <tr> and </tr> is one row:

<tr bgcolor="#f4f4f4">
<td height="25" nowrap="NOWRAP">&nbsp;CME_ES&nbsp;</td>
<td height="25" nowrap="NOWRAP">&nbsp;07:58:46&nbsp;</td>
<td height="25" nowrap="NOWRAP">&nbsp;Connected&nbsp;</td>
<td height="25" nowrap="NOWRAP">&nbsp;0&nbsp;</td>
<td height="25" nowrap="NOWRAP">&nbsp;0&nbsp;</td>
<td height="25" nowrap="NOWRAP">&nbsp;0&nbsp;</td>
<td height="25" nowrap="NOWRAP">&nbsp;0&nbsp;</td>
<td height="25" nowrap="NOWRAP">&nbsp;07:58:00&nbsp;</td>
**<td height="25" nowrap="NOWRAP" bgcolor="#55aa2a">&nbsp;--:--:--&nbsp;</td>**
<td height="25" nowrap="NOWRAP">&nbsp;0&nbsp;</td>
<td height="25" nowrap="NOWRAP">&nbsp;0&nbsp;</td>
<td height="25" nowrap="NOWRAP">&nbsp;01:25:00 &nbsp;</td>
<td height="25" nowrap="NOWRAP">&nbsp; 22:00:00&nbsp;</td>
</tr>
.
.
.
<tr bgcolor="#ffffff">
<td height="25" nowrap="NOWRAP">&nbsp;CME_NQ&nbsp;</td>
<td height="25" nowrap="NOWRAP">&nbsp;07:58:46&nbsp;</td>
<td height="25" nowrap="NOWRAP">&nbsp;Connected&nbsp;</td>
<td height="25" nowrap="NOWRAP">&nbsp;0&nbsp;</td>
<td height="25" nowrap="NOWRAP">&nbsp;0&nbsp;</td>
<td height="25" nowrap="NOWRAP">&nbsp;191&nbsp;</td>
<td height="25" nowrap="NOWRAP">&nbsp;0&nbsp;</td>
<td height="25" nowrap="NOWRAP">&nbsp;07:58:01&nbsp;</td>
**<td height="25" nowrap="NOWRAP">&nbsp;--:--:--&nbsp;</td>**
<td height="25" nowrap="NOWRAP">&nbsp;0&nbsp;</td>
<td height="25" nowrap="NOWRAP">&nbsp;0&nbsp;</td>
<td height="25" nowrap="NOWRAP">&nbsp;01:25:00 &nbsp;</td>
<td height="25" nowrap="NOWRAP">&nbsp; 22:00:00&nbsp;</td>
</tr>

I have code that grabs the color from each row set:

mrkt_stat = []
for td in site.findAll('td'):
    if 'bgcolor' in td.attrs:
        mrkt_stat.append(td.attrs['bgcolor'])

The issue is that when a row has no cell with a bgcolor attribute, nothing is added to the mrkt_stat list for that row.

How do I scrape this so that even if a row has no bgcolor attr, it will still be added to the list as NULL or N/A?

It is useful to know that the bgcolor attr (which may or may not be present) always belongs to the 9th td of a row, whether or not that row has the attr (look at the HTML lines enclosed with **).

EDIT: Output should look like the following (a list with the color attr from the 9th td of each row, or 'N/A' if there is no color attr present):

['#55aa2a',...,'N/A']

Answer

You could add an else clause to your if statement, so that a td without bgcolor appends 'N/A':

mrkt_stat = []
for td in site.findAll('td'):
    if 'bgcolor' in td.attrs:
        mrkt_stat.append(td.attrs['bgcolor'])
    else:
        mrkt_stat.append('N/A')
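
Note that a loop over every td appends one entry per cell, so a 13-cell row without any color would contribute thirteen 'N/A' values rather than one. If the goal is one entry per row, as in the expected output in the question, a per-row variant may be closer to what is wanted. The following is only a minimal sketch under stated assumptions: the sample HTML, the helper names make_row and ninth_cell_colors, and the index/default parameters are hypothetical, and it assumes the color always sits on the 9th td, as the question states.

```python
from bs4 import BeautifulSoup


def make_row(ninth_cell_attrs):
    """Build a hypothetical 13-cell row; only the 9th cell's attrs vary."""
    cells = ['<td height="25">&nbsp;0&nbsp;</td>'] * 13
    cells[8] = '<td height="25"%s>&nbsp;--:--:--&nbsp;</td>' % ninth_cell_attrs
    return '<tr>' + ''.join(cells) + '</tr>'


# Hypothetical stand-in for the real page: one colored row, one plain row.
html = '<table>' + make_row(' bgcolor="#55aa2a"') + make_row('') + '</table>'
site = BeautifulSoup(html, 'html.parser')


def ninth_cell_colors(soup, index=8, default='N/A'):
    """Return one bgcolor value (or the default) per <tr> in the soup."""
    colors = []
    for tr in soup.find_all('tr'):
        cells = tr.find_all('td')
        if len(cells) > index:
            # Tag.get returns the default when the attribute is absent.
            colors.append(cells[index].get('bgcolor', default))
        else:
            # Row is too short to have a 9th cell at all.
            colors.append(default)
    return colors


mrkt_stat = ninth_cell_colors(site)
print(mrkt_stat)
```

Because the lookup goes through the td cells only, the bgcolor on the tr itself is ignored, which matches the original loop. If the real page nests tables inside cells, tr.find_all('td') would also pick up the inner cells, so the indexing would need adjusting.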