Question 1

I'm trying to scrape this page, https://rarity.tools/thecryptodads, using Selenium in Python.

At the top right of each card there is the owner's name, which is a link; clicking it takes you to that owner's page.

When I inspect the element, I can clearly see the a tag with its href link.

However, when I try to scrape it, I get neither the text within the a tag nor the href. I also tried to get the parent div, which contains this a tag along with another div holding the number shown at the top left of the card, but its innerText only returns the text of the first div, i.e. the number (it prints 1 for the first card).

Here's the code I'm using to try to get the link:

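from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

PATH = r"C:\Program Files (x86)\chromedriver"
driver = webdriver.Chrome(PATH)
driver.implicitly_wait(10)
driver.get("https://rarity.tools/thecryptodads")
try:
    click = WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.XPATH, "/html/body/div/div/div/div[2]/div[2]/div[8]/div[1]/div[1]/div/div[1]/a")))
    print(click.text)
except:
    print()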

I tried locating the element by class name, CSS selector, XPath, and full XPath, and I still can't get the href. But when I run in debug mode and step through line by line, I can see that the object holds the text I want, and it is printed at the end of the execution, which is so weird to me. I assume this text uses some sort of encryption that prevents me from scraping it!

Answer

The website takes time to load all the attribute values, so the a tag can be present before its href and text are filled in.

There are two ways to get the expected output.

1: Call time.sleep(10) after driver.get(URL).

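driver.get("https://rarity.tools/thecryptodads")
time.sleep(10)  # fixed pause so the page can fill in the attribute values
# find_element(By.XPATH, ...) also works on Selenium 4, where find_element_by_xpath was removed
ele = driver.find_element(By.XPATH, "//div[contains(@class,'flex-1')]/div[2]/div[8]/div[1]/div[1]/div/div[1]/a")
print(ele.get_attribute("href"))
print(ele.text)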

2: Use an explicit wait until the href attribute has a value containing https.

driver.get("https://rarity.tools/thecryptodads")
wait = WebDriverWait(driver, 30)
# Block until the link's href has actually been filled in
wait.until(EC.presence_of_element_located((By.XPATH, "//div[contains(@class,'flex-1')]/div[2]/div[8]/div[1]/div[1]/div/div[1]/a[contains(@href,'https')]")))
ele = driver.find_element(By.XPATH, "//div[contains(@class,'flex-1')]/div[2]/div[8]/div[1]/div[1]/div/div[1]/a")
print(ele.get_attribute("href"))
print(ele.text)
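If you would rather not encode the "href is filled in" check inside the XPath, the same idea from option 2 can be sketched as a custom wait condition. WebDriverWait.until accepts any callable that takes the driver, so something like the hypothetical href_populated helper below (my name, not part of Selenium) should work. It assumes the href starts out empty or missing and later begins with https.

```python
def href_populated(locator, prefix="https"):
    """Build a wait condition that succeeds once a located <a> has a real href.

    `locator` is a (strategy, value) pair, e.g. (By.XPATH, "//a").
    `href_populated` is a hypothetical helper name, not a Selenium API.
    """
    def condition(driver):
        # find_elements returns an empty list instead of raising when nothing matches
        for element in driver.find_elements(*locator):
            href = element.get_attribute("href") or ""
            if href.startswith(prefix):
                return element  # truthy value ends the wait
        return False  # falsy value tells the caller to keep polling
    return condition
```

It would be used as ele = WebDriverWait(driver, 30).until(href_populated((By.XPATH, "...your xpath..."))), which returns the element itself, so the second find_element call is not needed.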

Output for both 1 and 2:

https://opensea.io/accounts/0x8f612b1a1afcc4a55879bb02212454ae79cab04b?ref=0x5c5321ae45550685308a405827575e3d6b4a84aa
0x8f61

It's better to use a relative XPath than an absolute one, since an absolute path breaks as soon as the page layout changes.
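As a rough sketch of how far a relative locator can go: if every owner link points at opensea.io/accounts/<address>, the way the one href in the output above does (that is an assumption, I only checked the first card), you can match on the href itself and skip the div[2]/div[8]/... chain entirely. The names OWNER_LINKS_XPATH, owner_address and collect_owner_addresses are made up for this example.

```python
from urllib.parse import urlparse

# Assumption: all owner links share the opensea.io/accounts/ prefix
# seen in the output above.
OWNER_LINKS_XPATH = "//a[contains(@href, 'opensea.io/accounts/')]"


def owner_address(href):
    """Return the wallet address from an owner href, or None if it does not match."""
    parsed = urlparse(href)
    parts = [p for p in parsed.path.split("/") if p]
    if parsed.netloc.endswith("opensea.io") and len(parts) == 2 and parts[0] == "accounts":
        return parts[1]
    return None


def collect_owner_addresses(driver):
    """Collect the addresses of all matching links on the already loaded page."""
    addresses = []
    # "xpath" is the string value of By.XPATH
    for link in driver.find_elements("xpath", OWNER_LINKS_XPATH):
        address = owner_address(link.get_attribute("href") or "")
        if address:
            addresses.append(address)
    return addresses
```

You would still need one of the two waits above before calling collect_owner_addresses(driver), otherwise the links may not have their href yet.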

Web scraping, cant get the href of a tag
