BeautifulSoup results to pandas DataFrame

2024/11/16 14:58:44

The code below returns a table with the following results:

import bs4
import requests
r = requests.get(url)
soup = bs4.BeautifulSoup(r.text, 'lxml')
mylist = soup.find(attrs={'class': 'table_grey_border'})
print(mylist)

Results (the output runs on for about 1700 rows):

<table cellpadding="0" cellspacing="2" class="table_grey_border" width="100%">
<tr valign="top">
<td class="verd_black12" width="18%"><b>STOCK CODE</b></td>
<td class="verd_black12" width="42%"><b>NAME OF LISTED SECURITIES</b></td>
<td class="verd_black12" width="19%"><b>BOARD LOT</b></td>
<td class="verd_black12" colspan="4" width="12%"><b>REMARK</b></td>
</tr>
<tr class="tr_normal">
<td class="verd_black12" width="18%">00001</td>
<td class="verd_black12" width="42%"><a href="../../../invest/company/profile_page_e.asp?WidCoID=00001&amp;WidCoAbbName=&amp;Month=&amp;langcode=e" target="_parent">CKH HOLDINGS</a></td>
<td class="verd_black12" width="19%">500</td>
<td align="center" class="verd_black12" width="3%">#</td>
<td align="center" class="verd_black12" width="3%">H</td>
<td align="center" class="verd_black12" width="3%">O</td>
<td align="center" class="verd_black12" width="3%">F</td>
</tr>
<tr class="tr_normal">
<td class="verd_black12" width="18%">00002</td>
<td class="verd_black12" width="42%"><a href="../../../invest/company/profile_page_e.asp?WidCoID=00002&amp;WidCoAbbName=&amp;Month=&amp;langcode=e" target="_parent">CLP HOLDINGS</a></td>
<td class="verd_black12" width="19%">500</td>
<td align="center" class="verd_black12" width="3%">#</td>
<td align="center" class="verd_black12" width="3%">H</td>
<td align="center" class="verd_black12" width="3%">O</td>
<td align="center" class="verd_black12" width="3%">F</td>
</tr>
...

My question is: how do I put each of these rows into a pandas DataFrame? I tried the code below, but it raises an error.

a = pandas.read_html(mylist)
print(a)

Error:

TypeError: 'NoneType' object is not callable

Answer

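According to the pandas documentation, read_html accepts a URL directly and can filter tables by attribute, so BeautifulSoup is not needed here. Note that it returns a list of DataFrames, one per matching table:

pandas.read_html(url, attrs={'class': 'table_grey_border'})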
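
As a minimal sketch of the call above, the snippet below runs read_html on an inline copy of the first rows of the table instead of the live URL, which the question does not give. The `html` string, the `header=0` setting, and the `converters` entry are illustrative assumptions, and pandas needs a parser such as lxml (or bs4 with html5lib) installed. The original TypeError most likely occurs because read_html was handed a BeautifulSoup Tag rather than text, a URL, or a file-like object; converting the tag to a string first should avoid it.

```python
from io import StringIO

import pandas as pd

# Inline sample standing in for the real page (assumption: abbreviated
# copy of the table shown in the question, not the live site).
html = """
<table class="table_grey_border">
<tr valign="top">
<td><b>STOCK CODE</b></td>
<td><b>NAME OF LISTED SECURITIES</b></td>
<td><b>BOARD LOT</b></td>
<td colspan="4"><b>REMARK</b></td>
</tr>
<tr class="tr_normal">
<td>00001</td><td><a href="#">CKH HOLDINGS</a></td><td>500</td>
<td>#</td><td>H</td><td>O</td><td>F</td>
</tr>
<tr class="tr_normal">
<td>00002</td><td><a href="#">CLP HOLDINGS</a></td><td>500</td>
<td>#</td><td>H</td><td>O</td><td>F</td>
</tr>
</table>
"""

# read_html returns a list of DataFrames, one per matching <table>.
tables = pd.read_html(
    StringIO(html),                        # wrap literal HTML in a file-like object
    attrs={"class": "table_grey_border"},  # only tables with this class
    header=0,                              # first row holds the column names
    converters={"STOCK CODE": str},        # keep leading zeros such as 00001
)
df = tables[0]
print(df)

# With an existing BeautifulSoup tag, convert it to a string first:
#   df = pd.read_html(StringIO(str(mylist)), header=0)[0]
```

Without the converter, the stock codes would be parsed as integers and lose their leading zeros.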
