I have a small project that scrapes Google search results for a list of keywords. I built a nested for loop to scrape the search results. The problem is that the loop over the keyword list does not work as I intended, which is to scrape the data from each search result page. The output contains only the result for the last keyword; the results for the first two keywords are missing.
Here is the code:

    from selenium import webdriver
    from bs4 import BeautifulSoup
    import pandas as pd

    browser = webdriver.Chrome(r"C:\...\chromedriver.exe")
    df = pd.DataFrame(columns = ['ceo', 'value'])
    baseUrl = 'https://www.google.com/search?q='
    html = browser.page_source
    soup = BeautifulSoup(html)
    ceo_list = ["Bill Gates", "Elon Musk", "Warren Buffet"]
    values =[]
    for ceo in ceo_list:
        browser.get(baseUrl + ceo)
        r = soup.select('div.g.rhsvw.kno-kp.mnr-c.g-blk')
        df = pd.DataFrame()
        for i in r:
            value = i.select_one('div.Z1hOCe').text
            ceo = i.select_one('.kno-ecr-pt.PZPZlf.gsmt.i8lZMc').text
            values = [ceo, value]
            s = pd.Series(values)
            df = df.append(s,ignore_index=True)
    print(df)

The output:

       0              1
    0  Warren Buffet  Born: October 28, 1955 (age 64 years), Seattle...

The output I am expecting looks like this:

       0               1
    0  Bill Gates      Born:..........
    1  Elon Musk       Born:...........
    2  Warren Buffett  Born: August 30, 1930 (age 89 years), Omaha, N...

Any suggestions or comments are welcome here.
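
For reference, here is a minimal sketch of the loop shape I think I need, based on my own guess at the cause. In the code above, `soup` is built once before the loop, so it never sees the page loaded by `browser.get`. `df` also appears to be reset to an empty frame on every pass, and the loop variable `ceo` is reused for the scraped name. The sketch below fetches inside the loop and collects the rows in a list that is created once. `scrape_all` and `fake_fetch_panel` are hypothetical names, and `fake_fetch_panel` is only a stand-in for the browser so the snippet runs without Selenium.

```python
from typing import Callable, List, Tuple

Row = Tuple[str, str]


def scrape_all(keywords: List[str],
               fetch_panel: Callable[[str], Row]) -> List[Row]:
    """Collect one (name, value) row per keyword."""
    rows: List[Row] = []           # created once, before the loop
    for keyword in keywords:
        # Fetch and parse inside the loop so each pass sees its own page.
        # In the real script this is where browser.get(...), then
        # browser.page_source and BeautifulSoup(...) would be called.
        name, value = fetch_panel(keyword)
        rows.append((name, value))  # accumulate instead of overwriting
    return rows


def fake_fetch_panel(keyword: str) -> Row:
    """Hypothetical stand-in for the browser: returns placeholder text."""
    return keyword, "panel text for " + keyword


rows = scrape_all(["Bill Gates", "Elon Musk", "Warren Buffet"],
                  fake_fetch_panel)
for row in rows:
    print(row)
```

With pandas, the frame could then be built once after the loop with `pd.DataFrame(rows, columns=['ceo', 'value'])`, which also avoids `DataFrame.append` (removed in pandas 2.0).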