Web scraping, can't get the href of an a tag

2024/7/7 3:03:46

I'm trying to scrape this page, https://rarity.tools/thecryptodads, using Selenium in Python.

At the top right of each card there is the owner's name, which is a link. Clicking it takes you to that owner's page.

When I inspect the element, I can clearly see the a tag with its href link.

However, when I try to scrape it, I get neither the text inside the a tag nor the href. I also tried getting the parent div, which contains this a tag along with another div holding the number shown at the top left of the card, but the innerText of that parent div only returns the text of the first div, i.e. the number (it prints 1 for the first card).

Here's the code I'm using to try to get the link:

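from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

# Raw string so the backslashes in the Windows path are taken literally
PATH = r"C:\Program Files (x86)\chromedriver"
driver = webdriver.Chrome(PATH)
driver.implicitly_wait(10)
driver.get("https://rarity.tools/thecryptodads")
try:
    click = WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.XPATH, "/html/body/div/div/div/div[2]/div[2]/div[8]/div[1]/div[1]/div/div[1]/a")))
    print(click.text)
except:
    print()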

I tried locating the item by class name, CSS selector, XPath, and full XPath, and I still can't get the href. But when I step through the code line by line in debug mode, I can see that the object holds the text I want, and it is printed at the end of execution, which seems weird to me. I assume the text is protected by some sort of encryption that prevents me from scraping it!

Answer

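The website takes time to load all the attribute values, so the a tag is present before its href and text are filled in. That is also why stepping through in the debugger works: the pauses give the page time to finish loading.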

There are two ways to get that output.

1: Apply time.sleep(10) after driver.get(URL).

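import time
from selenium.webdriver.common.by import By
driver.get("https://rarity.tools/thecryptodads")
time.sleep(10)  # give the page time to fill in the attribute values
# find_element(By.XPATH, ...) is used because find_element_by_xpath was deprecated and later removed in Selenium 4
ele = driver.find_element(By.XPATH, "//div[contains(@class,'flex-1')]/div[2]/div[8]/div[1]/div[1]/div/div[1]/a")
print(ele.get_attribute("href"))
print(ele.text)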

2: Apply an explicit wait until the href attribute has a value containing https.

driver.get("https://rarity.tools/thecryptodads")
wait = WebDriverWait(driver, 30)
# Wait until the a tag's href has been filled in with a URL
wait.until(EC.presence_of_element_located((By.XPATH, "//div[contains(@class,'flex-1')]/div[2]/div[8]/div[1]/div[1]/div/div[1]/a[contains(@href,'https')]")))
ele = driver.find_element(By.XPATH, "//div[contains(@class,'flex-1')]/div[2]/div[8]/div[1]/div[1]/div/div[1]/a")
print(ele.get_attribute("href"))
print(ele.text)

Output for both 1 and 2:

https://opensea.io/accounts/0x8f612b1a1afcc4a55879bb02212454ae79cab04b?ref=0x5c5321ae45550685308a405827575e3d6b4a84aa
0x8f61
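
The explicit wait in option 2 can also be written as a custom condition that checks the attribute directly instead of encoding the check in the XPath. The following is only a sketch: href_populated and wait_until are hypothetical names, and wait_until is a simplified stand-in for Selenium's WebDriverWait.until so that the polling logic can be read and run without a browser. Unlike WebDriverWait, it does not swallow "element not found" errors while polling.

```python
import time


def href_populated(locator):
    """Return a wait condition that is truthy once the element's href is non-empty."""
    def condition(driver):
        element = driver.find_element(*locator)  # locator is a (by, value) tuple
        href = element.get_attribute("href")
        # Return the element itself on success so the caller can use it directly.
        return element if href else False
    return condition


def wait_until(driver, condition, timeout=30.0, poll=0.5):
    """Hypothetical, simplified stand-in for WebDriverWait(driver, timeout).until(condition)."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition(driver)
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError("condition not met within %.1f s" % timeout)
        time.sleep(poll)
```

With a real driver, the same condition should be usable as WebDriverWait(driver, 30).until(href_populated((By.XPATH, "//div[contains(@class,'flex-1')]/div[2]/div[8]/div[1]/div[1]/div/div[1]/a"))), since until accepts any callable that takes the driver and returns a truthy value when done.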

It's better to use a relative XPath than an absolute one, since a relative XPath is less likely to break when the page layout changes.
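
To illustrate the point about relative versus absolute paths, here is a small sketch using Python's standard-library ElementTree rather than Selenium, so it runs without a browser. The markup, the class name owner, and the example.com URL are made up for the illustration and are not taken from the real page. It shows an absolute path failing once a wrapper element is added, while a relative path anchored on an attribute keeps working.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup before and after the site wraps the card in a <section>.
BEFORE = (
    "<html><body><div><div>"
    "<a class='owner' href='https://example.com/u/1'>0x8f61</a>"
    "</div></div></body></html>"
)
AFTER = (
    "<html><body><div><section><div>"
    "<a class='owner' href='https://example.com/u/1'>0x8f61</a>"
    "</div></section></div></body></html>"
)

ABSOLUTE = "./body/div/div/a"      # spells out every ancestor from the root
RELATIVE = ".//a[@class='owner']"  # anchors on an attribute, at any depth


def find_hrefs(markup, path):
    """Return the href of every element matched by `path` in `markup`."""
    root = ET.fromstring(markup)
    return [a.get("href") for a in root.findall(path)]


print(find_hrefs(BEFORE, ABSOLUTE))  # ['https://example.com/u/1']
print(find_hrefs(AFTER, ABSOLUTE))   # []  (the extra wrapper breaks the path)
print(find_hrefs(BEFORE, RELATIVE))  # ['https://example.com/u/1']
print(find_hrefs(AFTER, RELATIVE))   # ['https://example.com/u/1']
```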
