Geocoding using Geopy and Python

2024/9/8 10:40:25

I am trying to geocode a CSV file that contains the name of each location and a parsed address (address number, street name, city, zip, country). I want to use the ArcGIS geocoding service through Geopy. I want code that loops through my CSV of 5000+ entries and writes the latitude and longitude into separate columns of the CSV. Can anyone provide some code to get me started? Thanks!

Here is my script:

import csv
from geopy.geocoders import ArcGIS

geolocator = ArcGIS()  # here some parameters are needed

with open('C:/Users/v-albaut/Desktop/Test_Geo.csv', newline='') as csvinput:
    with open('output.csv', 'w', newline='') as csvoutput:
        output_fieldnames = ['Name', 'Address', 'Latitude', 'Longitude']
        writer = csv.DictWriter(csvoutput, delimiter=',', fieldnames=output_fieldnames)
        writer.writeheader()
        reader = csv.DictReader(csvinput)
        for row in reader:
            # here you have to replace the dict item by your csv column names
            query = ','.join(str(x) for x in (row['Name'], row['Address']))
            location = geolocator.geocode(query)  # None when nothing matches
            # here is the writing section
            output_row = {}
            output_row['Name'] = row['Name']
            output_row['Address'] = location.address if location else row['Address']
            output_row['Latitude'] = location.latitude if location else ''
            output_row['Longitude'] = location.longitude if location else ''
            writer.writerow(output_row)

Answer

I've been using this script for batch geocoding from a .csv. It requires one column containing the complete text address to geocode (named 'Address' in the script) and one column titled 'UniqueID' holding a unique identifier for each row. It prints a list of any addresses it failed to geocode. When a lookup fails, it also retries with the last 7 characters of the address removed, as a quick check for whether the zip code is incorrect and throwing off the geocoding:

import os

import numpy as np
import pandas as pd
from geopy.geocoders import ArcGIS


def main(path, filename):
    # path to where your .csv lives, and the name of the csv.
    Target_Addresses = pd.read_csv(os.path.join(path, filename))
    Target_Addresses['Lat'] = np.nan
    Target_Addresses['Long'] = np.nan
    Indexed_Targets = Target_Addresses.set_index('UniqueID')

    geolocator = ArcGIS()  # some parameters here
    Fails = []

    for index, row in Indexed_Targets.iterrows():
        Address = row['Address']
        Result = geolocator.geocode(Address)
        if Result is None:
            # retry without the last 7 characters (presumably a trailing zip code)
            Result = geolocator.geocode(Address[:-7])
        if Result is None:
            Fails.append(Address)
        else:
            Indexed_Targets.at[index, 'Lat'] = Result.latitude
            Indexed_Targets.at[index, 'Long'] = Result.longitude

    # report every address that could not be geocoded
    for address in Fails:
        print(address)
    Indexed_Targets.to_csv(filename[:-4] + "_RESULTS.csv")


if __name__ == '__main__':
    main(path, filename)  # whatever these are for you...

This will output a new csv, written to the current working directory, with "_RESULTS" appended to the input name (e.g., an input of 'addresses.csv' will output 'addresses_RESULTS.csv') and two new columns, 'Lat' and 'Long'.
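
With 5000+ rows it can help to keep the retry logic separate from the geocoder, so it can be tried out without network access. The following is a minimal sketch and not part of the original answer: geocode_with_fallback, geocode_rows, FakeLocation and fake_geocode are hypothetical names, and geocode is assumed to be any callable that returns an object with latitude and longitude attributes, or None when nothing matches (which is how geopy's geocode method reports a miss).

```python
import time
from collections import namedtuple


def geocode_with_fallback(geocode, address, trim=7):
    """Try the full address, then retry without its last `trim` characters."""
    result = geocode(address)
    if result is None and len(address) > trim:
        result = geocode(address[:-trim])
    return result


def geocode_rows(rows, geocode, address_field="Address", delay=0.0):
    """Return (rows with Lat/Long added, list of addresses that failed)."""
    out, fails = [], []
    for row in rows:
        address = row[address_field]
        result = geocode_with_fallback(geocode, address)
        new_row = dict(row, Lat="", Long="")  # copy; blank when not found
        if result is None:
            fails.append(address)
        else:
            new_row["Lat"] = result.latitude
            new_row["Long"] = result.longitude
        out.append(new_row)
        time.sleep(delay)  # optional pause between requests
    return out, fails


# Hypothetical stand-in for a real geocoder, for offline testing only.
FakeLocation = namedtuple("FakeLocation", "latitude longitude")


def fake_geocode(address):
    known = {"1 Main St, Springfield": FakeLocation(1.0, 2.0)}
    return known.get(address)


rows = [
    {"UniqueID": "a", "Address": "1 Main St, Springfield, 00000"},
    {"UniqueID": "b", "Address": "nowhere at all"},
]
results, failed = geocode_rows(rows, fake_geocode)
```

With geopy installed, ArcGIS().geocode can be passed in place of fake_geocode. geopy also ships geopy.extra.rate_limiter.RateLimiter, which can wrap that method (for example with min_delay_seconds=1) to space out requests; which delay suits the ArcGIS service is an assumption to check against its terms of use.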
