Scrape text in Python from https://brainly.co.id/tugas/148

2024/10/5 23:31:05

I want to scrape the text "Jawaban terverifikasi ahli" shown in the green box at https://brainly.co.id/tugas/148, and if possible the color of the green tick icon to its left (tag <use xlink:href="#icon-verified"></use>).

code

from pydash import get
from bs4 import BeautifulSoup
import requests

r = requests.get('https://brainly.co.id/tugas/148')
r = r.content  # or r.text
bsoup = BeautifulSoup(r, 'html.parser')

# Print every <use> tag and flag the verified icon if it is present
for xlink_href in bsoup.find_all('use'):
    if xlink_href.has_attr('xlink:href'):
        print(xlink_href)
        icon = get(xlink_href, 'xlink:href')
        green = (icon == '#icon-verified')
        if green:
            print('verified found ', green)
# target: <use xlink:href="#icon-verified"></use>

# Print the text of every <h3> tag
l = bsoup.find_all('h3')
print([i.text for i in l])
# target: <h3 class="sg-headline">Jawaban terverifikasi ahli

output

<use xlink:href="#icon-search"></use>
<use xlink:href="#icon-menu"></use>
<use xlink:href="#icon-messages"></use>
<use xlink:href="#icon-plus"></use>
<use xlink:href="#icon-points"></use>
<use xlink:href="#icon-check"></use>
<use xlink:href="#icon-arrow_left"></use>
<use xlink:href="#icon-arrow_right"></use>
<use xlink:href="#icon-plus"></use>
<use xlink:href="#icon-close"></use>
<use xlink:href="#icon-plus"></use>
<use xlink:href="#icon-plus"></use>
<use xlink:href="#icon-plus"></use>
<use xlink:href="#icon-arrow_down"></use>
<use xlink:href="#icon-arrow_up"></use>
['Pertanyaan baru di Biologi', '\nTentang kami\n', '\nBantuan\n', '\nDapatkan App Brainly\n']

I am not able to get the #icon-verified <use> tag or the "Jawaban terverifikasi ahli" text from the <h3> tag.

Answer

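The SVG icon data is loaded dynamically by JavaScript, so the HTML returned by a plain requests call never contains it. Selenium can render the page in a real browser before the source is scraped.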
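One way to confirm that diagnosis is to run the same check against both the raw response text and the browser-rendered source. The sketch below is only an illustration: the VerifiedProbe class, the has_verified_badge helper and the two sample HTML strings are hypothetical stand-ins, not content fetched from Brainly, and it uses the standard library's html.parser so it runs without extra packages.

```python
from html.parser import HTMLParser


class VerifiedProbe(HTMLParser):
    """Collects <use> icon references and <h3> text from an HTML string."""

    def __init__(self):
        super().__init__()
        self.icons = []       # values of xlink:href on <use> tags
        self.headlines = []   # text content of each <h3>
        self._in_h3 = False

    def handle_starttag(self, tag, attrs):
        if tag == "use":
            href = dict(attrs).get("xlink:href")
            if href:
                self.icons.append(href)
        elif tag == "h3":
            self._in_h3 = True
            self.headlines.append("")

    def handle_endtag(self, tag):
        if tag == "h3":
            self._in_h3 = False

    def handle_data(self, data):
        if self._in_h3:
            self.headlines[-1] += data


def has_verified_badge(html):
    """Return (icon_found, text_found) for the given page source."""
    probe = VerifiedProbe()
    probe.feed(html)
    probe.close()
    icon_found = "#icon-verified" in probe.icons
    text_found = any(
        h.strip() == "Jawaban terverifikasi ahli" for h in probe.headlines
    )
    return icon_found, text_found


# Hypothetical samples: what requests might return vs. what a browser renders.
static_html = (
    '<svg><use xlink:href="#icon-search"></use></svg>'
    "<h3>Pertanyaan baru di Biologi</h3>"
)
rendered_html = static_html + (
    '<svg><use xlink:href="#icon-verified"></use></svg>'
    '<h3 class="sg-headline">Jawaban terverifikasi ahli</h3>'
)

print(has_verified_badge(static_html))    # (False, False)
print(has_verified_badge(rendered_html))  # (True, True)
```

Passing requests.get(...).text and then driver.page_source to the same helper should show whether the badge only appears after rendering.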

Install it with: pip install selenium.

Download the ChromeDriver build that matches your installed Chrome version.

from time import sleep

from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.chrome.service import Service

URL = "https://brainly.co.id/tugas/148"

# Selenium 4 takes the driver path through a Service object
driver = webdriver.Chrome(service=Service(r"C:\path\to\chromedriver.exe"))
driver.get(URL)
sleep(5)  # give the page's JavaScript time to render

soup = BeautifulSoup(driver.page_source, "html.parser")
print(soup.find('h3', class_='sg-headline').text)
print(soup.find(id='icon-verified').find('path'))

driver.quit()

Output:

Jawaban terverifikasi ahli
<path d="M458.68 177.07l-51.58-25.6-8.93-53.33a24.5 24.5 0 0 0-26.66-19.2l-55.99 8.53L275 49.07a22.52 22.52 0 0 0-31.45 0l-44.66 38.4-55.98-8.53a23.1 23.1 0 0 0-26.93 19.2l-8.93 53.33-51.45 25.6a23.92 23.92 0 0 0-11.2 29.86L71.08 256 46.8 305.07a25.59 25.59 0 0 0 8.94 29.86l51.45 25.6 8.93 53.34a24.6 24.6 0 0 0 26.92 19.2l55.99-8.54 40.25 38.4a22.37 22.37 0 0 0 31.32 0l40-38.4 55.98 8.54a23.1 23.1 0 0 0 27.05-19.2l8.93-53.34 51.46-25.6a21 21 0 0 0 8.93-29.86L443.08 256h-.13l24.66-49.07a25.77 25.77 0 0 0-8.93-29.86zm-112.17 48.76L256 316.34a21.4 21.4 0 0 1-30.17 0L165.5 256a21.33 21.33 0 1 1 30.17-30.17l45.26 45.25 75.42-75.42a21.33 21.33 0 0 1 30.17 30.17z"></path>
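The fixed sleep(5) above may be longer than needed on a fast connection and too short on a slow one. A possible alternative is to poll until the element appears. The wait_until helper and the fake probe below are hypothetical names used for illustration; Selenium also ships its own WebDriverWait class for this purpose, which is likely the better choice in real code.

```python
import time


def wait_until(probe, timeout=10.0, interval=0.5):
    """Call probe() until it returns something other than None, or time out."""
    deadline = time.monotonic() + timeout
    while True:
        result = probe()
        if result is not None:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(interval)


# Stand-in for a real probe: returns None twice, then a value on the third call.
attempts = {"count": 0}


def fake_probe():
    attempts["count"] += 1
    return "found" if attempts["count"] >= 3 else None


result = wait_until(fake_probe, timeout=2.0, interval=0.01)
print(result, attempts["count"])  # found 3
```

With a live driver, the probe could be a lambda that parses driver.page_source and returns soup.find(id='icon-verified'), which is None until the icon has been rendered.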