Scrape latitude and longitude (Google Maps) inside Script type=text/javascript

2024/11/19 17:20:29

I'm a beginner in web scraping. I'm trying to get the latitude and longitude from this page:

https://urbania.pe/inmueble/proyecto/ememhvin-proyecto-mariscal-castilla-lima-santiago-de-surco-tale-inmobiliaria-65659522

The part of the page source that contains this data is:

<script type="text/javascript">const POSTING = {{[...] "locationId":"V1-B-4368","name":"Lima","label":"PROVINCIA","depth":1,"parent":{"locationId":"V1-A-111","name":"Peru urbania","label":"PAIS","depth":0,"parent":null,"acronym":null},"acronym":null},"acronym":null},"acronym":null},"postingGeolocation":{"geolocation":{"latitude":-12.133920500000000,"longitude":-77.014942900000000},[...]</script>

This is what I have tried, but it does not work:

import requests
import pandas as pd
import re
from bs4 import BeautifulSoup
import time
from selenium.webdriver.common.keys import Keys
from selenium import webdriver
import urllib.parse

sa_key = 'ea69223fa47f72fac0907759'  # TOKEN from a web
sa_api = 'https://api.scrapingant.com/v2/general'
page = 'https://urbania.pe/inmueble/proyecto/ememhvin-proyecto-mariscal-castilla-lima-santiago-de-surco-tale-inmobiliaria-65659522'
qParams = {'url': page, 'x-api-key': sa_key}  # NOTE: be careful here with /proyecto/ and /clasificado/, structure for the 1st
reqUrl = f'{sa_api}?{urllib.parse.urlencode(qParams)}'
r = requests.get(reqUrl)
soup = BeautifulSoup(r.content, 'html.parser')
list_geolocalization = []

# trying to get latitude and longitude
geolocalization = soup.find_all('script', {'type': 'text/javascript'})
for tag in geolocalization:
    list_geolocalization.append(tag.find('latitude'))
df_geolocalization = pd.DataFrame(list_geolocalization, columns=["geolocalization"])

# other
lat, long = re.findall(r'(?is)("latitude":|"longitude":)([0-9.]+)', geolocalization)

Can someone help me please? Thanks in advance!

Answer

In this situation, you can use a regular expression to pull the POSTING object out of the page source and then parse it as JSON:

import requests
from bs4 import BeautifulSoup
import re
import json

headers = {'User-Agent': 'Mozilla/5.0'}
url = "https://urbania.pe/inmueble/proyecto/ememhvin-proyecto-mariscal-castilla-lima-santiago-de-surco-tale-inmobiliaria-65659522"
response = requests.get(url, headers=headers)
soup = BeautifulSoup(response.text, 'html.parser')
print(response)

# Find the "const POSTING = {...}" assignment in the page source
r = re.search(r'const POSTING = {.*}', str(soup))
if r:
    # Drop the JavaScript prefix so that only the JSON object is parsed
    j = json.loads(r.group(0).replace('const POSTING = ', ''))
    lat = j.get('postingLocation', {}).get('postingGeolocation', {}).get('geolocation', {}).get('latitude')
    print(lat)
    long = j.get('postingLocation', {}).get('postingGeolocation', {}).get('geolocation', {}).get('longitude')
    print(long)
else:
    print("No match found.")

Output:

-12.1339205
-77.0149429
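
As a side note on the regex in the question: it likely fails for two reasons. The character class [0-9.]+ cannot match the leading minus sign of values such as -12.13, and re.findall expects a string, not the list returned by find_all. Below is a minimal sketch of a pattern that handles both points. It runs on a shortened sample string modeled on the snippet quoted in the question, and the names SAMPLE_SCRIPT, COORD_PATTERN and find_coordinates are illustrative, not part of the original code.

```python
import re

# Shortened sample modeled on the script snippet quoted in the question.
SAMPLE_SCRIPT = (
    'const POSTING = {"postingGeolocation":{"geolocation":'
    '{"latitude":-12.133920500000000,"longitude":-77.014942900000000}}}'
)

# -? allows a leading minus sign; \d+(?:\.\d+)? matches the digits
# with an optional fractional part.
COORD_PATTERN = re.compile(r'"(latitude|longitude)"\s*:\s*(-?\d+(?:\.\d+)?)')


def find_coordinates(text):
    """Return a dict with the first latitude and longitude found in text."""
    coords = {}
    for key, value in COORD_PATTERN.findall(text):
        # setdefault keeps the first occurrence of each key only
        coords.setdefault(key, float(value))
    return coords


coords = find_coordinates(SAMPLE_SCRIPT)
print(coords["latitude"], coords["longitude"])
```

With the real page, the same function could be called on response.text from the answer's code instead of the sample string. This assumes the page still embeds the coordinates in that form.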