🐍 An updated notebook is available for this demonstration on GitHub.
Also look at my demonstration using the KDTree method (
scipy.spatial.KDTree.query) to find nearest neighbor for
many points.
I deal a lot with weather model output (specifically HRRR model data) and often need to verify a data point with observational data. Weather stations we get observations from are not located at the exact grid points as the model's grid points. The hard part is finding the index value of a latitude and longitude grid point nearest to the observation location. This is made easy with numpy's
maximum function (not max).
First, you need to have three arrays, one for latitude, one for longitude, and one for the environment variable such as temperature. Load these values into a numpy array. I use pygrib to read model output in grib2 format.
lat = latitude # a 2D numpy array of your latitudes
lon = longitude # a 2D numpy array of your longitudes
temp = temperature # a 2D numpy array of your temperatures, or other variable
Next you need to know the latitude and longitude for the observation point. We are looking for the nearest grid point in the lat and lon arrays for that grid point.
stn_lat = station's latitude # eg. 40
stn_lon = station's longitude # eg. -110
Now find the absolute value of the difference between the station's lat/lon with every point in the grid. This tells us how close a point is to the particular latitude and longitude.
abslat = np.abs(lat-stn_lat)
abslon= np.abs(lon-stn_lon)
If you plot abslat and abslon, you'll see what we have done...we have created two new arrays with that tell us how close each latitude and longitude is to the grid point we are interested in.
|
abslat: the blue line represents the nearest grid latitude line our station is located. |
|
abslon: the blue line represents the nearest grid longitude our station is located. |
Now we need to combine these two results.
We will use numpy.maximum, which takes two arrays and finds the local maximum.
c = np.maximum(abslon,abslat)
Which is an array that looks like this...
|
c: an array expressing the local maximums between the abslat and abslon arrays |
The dark blue point is the nearest grid location to our station. You can find the index location on the grid of this by using the min function.
latlon_idx = np.argmin(c)
Now, this latitude/longitude index value is the index for a flattened array, so when you look for that same index value in, say, your temperature array, you should flatten the array to pluck out the value at that point.
grid_temp = temp.flat[latlon_idx]
If you don't like flattened arrays, you can also get the row/column index like this
x, y = np.where(c == np.min(c))
grid_temp = temp[x[0], y[0]
And there you have it!