How to select specific data variables from an xarray dataset

2024/10/9 20:27:28

BACKGROUND

I am trying to download GFS weather data (netCDF4 files) via xarray and OPeNDAP. Big thanks to Vorticity0123 for their prior post, which allowed me to get the bones of the Python script sorted (see below).

PROBLEM

The GFS dataset has 195 data variables, but I don't need most of them. I only need these ten:

  • ugrd100m, vgrd100m, dswrfsfc, tcdcclm, tcdcblcll, tcdclcll, tcdcmcll, tcdchcll, tmp2m, gustsfc

HELP REQUESTED

I've gone through the xarray readthedocs page and elsewhere, but I couldn't figure out a way to narrow down my dataset to only the ten data variables. Does anyone know how to narrow down the list of variables in a dataset?

PYTHON SCRIPT

import numpy as np
import xarray as xr

# File Details
dt = '20201124'
res = 25
step = '1hr'
run = '{:02}'.format(18)

# URL
URL = f'http://nomads.ncep.noaa.gov:80/dods/gfs_0p{res}_{step}/gfs{dt}/gfs_0p{res}_{step}_{run}z'

# Load data
dataset = xr.open_dataset(URL)
time = dataset.variables['time']
lat = dataset.variables['lat'][:]
lon = dataset.variables['lon'][:]
lev = dataset.variables['lev'][:]

# Narrow Down Selection
time_toplot = time
lat_toplot = np.arange(-43, -17, 0.5)
lon_toplot = np.arange(135, 152, 0.5)
lev_toplot = np.array([1000])

# Select required data via xarray
dataset = dataset.sel(time=time_toplot, lon=lon_toplot, lat=lat_toplot)
print(dataset)

ANSWER

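You can use the dict-like syntax of xarray. Indexing the dataset with a list of variable names returns a new dataset containing only those variables, along with their coordinates: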

variables = ['ugrd100m', 'vgrd100m', 'dswrfsfc', 'tcdcclm', 'tcdcblcll',
             'tcdclcll', 'tcdcmcll', 'tcdchcll', 'tmp2m', 'gustsfc']
dataset[variables]

Gives you:

<xarray.Dataset>
Dimensions:    (lat: 721, lon: 1440, time: 121)
Coordinates:
  * time       (time) datetime64[ns] 2020-11-24T18:00:00 ... 2020-11-29T18:00:00
  * lat        (lat) float64 -90.0 -89.75 -89.5 -89.25 ... 89.25 89.5 89.75 90.0
  * lon        (lon) float64 0.0 0.25 0.5 0.75 1.0 ... 359.0 359.2 359.5 359.8
Data variables:
    ugrd100m   (time, lat, lon) float32 ...
    vgrd100m   (time, lat, lon) float32 ...
    dswrfsfc   (time, lat, lon) float32 ...
    tcdcclm    (time, lat, lon) float32 ...
    tcdcblcll  (time, lat, lon) float32 ...
    tcdclcll   (time, lat, lon) float32 ...
    tcdcmcll   (time, lat, lon) float32 ...
    tcdchcll   (time, lat, lon) float32 ...
    tmp2m      (time, lat, lon) float32 ...
    gustsfc    (time, lat, lon) float32 ...
Attributes:
    title:        GFS 0.25 deg starting from 18Z24nov2020, downloaded Nov 24 ...
    Conventions:  COARDS\nGrADS
    dataType:     Grid
    history:      Sat Nov 28 05:52:44 GMT 2020 : imported by GrADS Data Serve...
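
If you want to try this without hitting the NOMADS server, here is a minimal sketch on a small synthetic dataset. The grid, the zero-filled values and the extra variable name `unwanted_var` are all made up for illustration; only the variable names in `wanted` come from the question. It picks the variables first and then applies the same kind of lat/lon selection as the script in the question. As far as I know xarray loads remote data lazily, so selecting before reading any values should keep the transfer small, but treat that as an assumption to check against your own setup.

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for the GFS dataset: a small 0.25-degree grid with
# zero-filled data (hypothetical values, not real GFS output).
time = np.array(["2020-11-24T18:00", "2020-11-24T19:00"], dtype="datetime64[ns]")
lat = np.arange(-44.0, -16.0, 0.25)
lon = np.arange(134.0, 153.0, 0.25)
shape = (time.size, lat.size, lon.size)

# "unwanted_var" is a made-up name standing in for the variables you don't need.
names = ["ugrd100m", "vgrd100m", "tmp2m", "gustsfc", "unwanted_var"]
ds = xr.Dataset(
    {name: (("time", "lat", "lon"), np.zeros(shape, dtype="float32")) for name in names},
    coords={"time": time, "lat": lat, "lon": lon},
)

wanted = ["ugrd100m", "vgrd100m", "tmp2m", "gustsfc"]

# Indexing with a list of names returns a new Dataset with only those variables.
subset = ds[wanted]

# Same idea from the other direction: drop what you don't want.
dropped = ds.drop_vars(["unwanted_var"])

# Then narrow down the region, as in the question's script (exact label matches).
lat_toplot = np.arange(-43, -17, 0.5)
lon_toplot = np.arange(135, 152, 0.5)
region = subset.sel(lat=lat_toplot, lon=lon_toplot)

print(region)
```

`ds[wanted]` is the shorter option when you know the handful of names you need, as here. `drop_vars` is handier when the list of variables to discard is the short one.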