How to measure pairwise distances between two sets of points?

2024/7/8 7:06:29

I have two datasets (CSV files). Each contains the latitudes and longitudes of a set of points (220 in one, 4400 in the other). I want to compute the pairwise distances in miles between the two sets (a 220 x 4400 matrix). How can I do that in Python? This is similar to the following problem: https://gist.github.com/rochacbruno/2883505

Example of one dataset

Answer

The simplest option is scikit-learn (sklearn), which provides exactly what you are asking for.

Say we have some sample data:

import pandas as pd

towns = pd.DataFrame({
    "name": ["Merry Hill", "Spring Valley", "Nesconset"],
    "lat": [36.01, 41.32, 40.84],
    "long": [-76.7, -89.20, -73.15],
})
museum = pd.DataFrame({
    "name": [
        "Motte Historical Car Museum, Menifee",
        "Crocker Art Museum, Sacramento",
        "World Chess Hall Of Fame, St.Louis",
        "National Atomic Testing Museum, Las",
        "National Air and Space Museum, Washington",
        "The Metropolitan Museum of Art",
        "Museum of the American Military Family & Learning Center",
    ],
    "lat": [33.743511, 38.576942, 38.644302, 36.114269, 38.887806, 40.778965, 35.083359],
    "long": [-117.165161, -121.504997, -90.261154, -115.148315, -77.019844, -73.962311, -106.381531],
})

You can use sklearn's distance metrics, which include a haversine implementation:

# scikit-learn >= 1.0; on older versions use: from sklearn.neighbors import DistanceMetric
from sklearn.metrics import DistanceMetric

dist = DistanceMetric.get_metric("haversine")

After you extract the numpy array values with

places_gps = towns[["lat", "long"]].values
museum_gps = museum[["lat", "long"]].values

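you simply compute

import numpy as np

EARTH_RADIUS = 6371.009  # mean Earth radius in km
# the haversine metric expects [lat, long] pairs in radians
haversine_distances = dist.pairwise(np.radians(places_gps), np.radians(museum_gps))
haversine_distances *= EARTH_RADIUS

to get the distances in km. If you need miles, multiply the result by 0.621371 (or use an Earth radius of about 3958.76 miles in place of the kilometre value).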
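As an independent cross-check of the sklearn result, the same haversine computation can be sketched in plain NumPy, returning miles directly. The function name `haversine_miles` and the constant `EARTH_RADIUS_MILES` are illustrative choices, not part of sklearn, and only two of the museums are used to keep the example short.

```python
import numpy as np

# Assumption: 6371.009 km converted at 1.609344 km per mile.
EARTH_RADIUS_MILES = 3958.76


def haversine_miles(a_deg, b_deg):
    """Pairwise great-circle distances in miles.

    a_deg: (n, 2) array-like of [lat, long] in degrees
    b_deg: (m, 2) array-like of [lat, long] in degrees
    Returns an (n, m) array.
    """
    a = np.radians(np.asarray(a_deg, dtype=float))
    b = np.radians(np.asarray(b_deg, dtype=float))
    # Broadcast the first set down the rows and the second across the columns.
    lat1, lon1 = a[:, 0][:, None], a[:, 1][:, None]
    lat2, lon2 = b[:, 0][None, :], b[:, 1][None, :]
    h = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)
    # Clip guards against rounding pushing h slightly outside [0, 1].
    return 2 * EARTH_RADIUS_MILES * np.arcsin(np.sqrt(np.clip(h, 0.0, 1.0)))


towns_gps = np.array([[36.01, -76.7], [41.32, -89.20], [40.84, -73.15]])
museum_gps = np.array([[38.887806, -77.019844], [40.778965, -73.962311]])

miles = haversine_miles(towns_gps, museum_gps)
print(miles.shape)  # (3, 2): one row per town, one column per museum
```

For the 220 x 4400 case in the question, the result is a single array of 968,000 values, which fits in memory without difficulty.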

If you are only interested in the closest few points, or in all points within a given radius, check out sklearn's BallTree, which also supports the haversine metric. It avoids computing the full distance matrix, so it is much faster for these queries.
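A minimal sketch of the BallTree approach, using the sample coordinates from this answer. The value `k=2`, the 300-mile radius, and the constant `EARTH_RADIUS_MILES` are assumptions chosen for illustration. As with the pairwise metric, inputs are [lat, long] in radians and the returned distances are in radians on the unit sphere, so they are scaled by the Earth radius afterwards.

```python
import numpy as np
from sklearn.neighbors import BallTree

# Assumption: mean Earth radius expressed in miles.
EARTH_RADIUS_MILES = 3958.76

towns_gps = np.array([[36.01, -76.7], [41.32, -89.20], [40.84, -73.15]])
museum_gps = np.array([
    [33.743511, -117.165161],
    [38.576942, -121.504997],
    [38.644302, -90.261154],
    [36.114269, -115.148315],
    [38.887806, -77.019844],
    [40.778965, -73.962311],
    [35.083359, -106.381531],
])

# Build the tree once over the set you want to search in.
tree = BallTree(np.radians(museum_gps), metric="haversine")

# The two nearest museums for every town; idx holds row numbers of museum_gps.
dist_rad, idx = tree.query(np.radians(towns_gps), k=2)
nearest_miles = dist_rad * EARTH_RADIUS_MILES

# All museums within 300 miles of every town; the radius must be in radians too.
within = tree.query_radius(np.radians(towns_gps), r=300 / EARTH_RADIUS_MILES)

print(idx)
print(nearest_miles.round(1))
```

`query_radius` returns one index array per town, and the arrays can have different lengths.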


Edit: To convert the output to a DataFrame, use for instance

pd_distances = pd.DataFrame(haversine_distances, columns=museum["name"], index=towns["name"])
pd_distances
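Once the distances are in a labelled table, the nearest museum for each town can be read off with `idxmin`. The sketch below is self-contained and makes a few assumptions: it uses `haversine_distances` from `sklearn.metrics.pairwise` rather than `DistanceMetric`, works in miles via an assumed `EARTH_RADIUS_MILES` constant, and includes only three of the museums.

```python
import numpy as np
import pandas as pd
from sklearn.metrics.pairwise import haversine_distances

# Assumption: mean Earth radius expressed in miles.
EARTH_RADIUS_MILES = 3958.76

towns = pd.DataFrame({
    "name": ["Merry Hill", "Spring Valley", "Nesconset"],
    "lat": [36.01, 41.32, 40.84],
    "long": [-76.7, -89.20, -73.15],
})
museum = pd.DataFrame({
    "name": [
        "Crocker Art Museum, Sacramento",
        "National Air and Space Museum, Washington",
        "The Metropolitan Museum of Art",
    ],
    "lat": [38.576942, 38.887806, 40.778965],
    "long": [-121.504997, -77.019844, -73.962311],
})

# haversine_distances expects [lat, long] in radians and returns radians.
miles = haversine_distances(
    np.radians(towns[["lat", "long"]].to_numpy()),
    np.radians(museum[["lat", "long"]].to_numpy()),
) * EARTH_RADIUS_MILES

# Rows are towns, columns are museums.
table = pd.DataFrame(miles, index=towns["name"], columns=museum["name"])

# For each town: the column label of the smallest distance, and that distance.
nearest = pd.DataFrame({
    "nearest_museum": table.idxmin(axis=1),
    "miles": table.min(axis=1),
})
print(nearest)
```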