Combining files that contain different variables with xarray open_mfdataset and the cfgrib engine



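I have a folder with several .grib2 files. Some of them contain the tcc variable (cloud cover) and others do not. I want to open all the files that have this variable as a single array, but doing so raises an error; I can only open the files that contain tcc one at a time. How can I change the code below so that it opens only the files that contain tcc and concatenates them?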

#!/usr/bin/env python
# coding: utf-8
# In[2]:

import os, sys
import xarray as xr
import pygrib
import pandas as pd
import windpowerlib
import numpy as np
from datetime import datetime, timedelta  # a later "import datetime" would rebind this name to the module
import warnings
warnings.filterwarnings('ignore')
import metpy
import metpy.calc as mpcalc
from metpy.units import units
from eccodes import *

# Put the path to the GFS files here

path_list = '/media/william/PhD/DownloadRadiation/GFS/20200716/gfs*.grib2'

low_cloud  = xr.open_mfdataset(path_list, concat_dim='valid_time', decode_times=False, combine='nested', engine='cfgrib', backend_kwargs={ 'filter_by_keys':{ 'cfVarName': 'tcc', 'typeOfLevel': 'lowCloudLayer'},'indexpath':''})

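But this gives me an empty array. How can I read all the files correctly?

If I change the code above to filter only on the level type, I get the following error: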

low_cloud  = xr.open_mfdataset(path_list, concat_dim='valid_time', decode_times=False, combine='nested', engine='cfgrib', backend_kwargs={ 'filter_by_keys':{'typeOfLevel': 'lowCloudLayer'},'indexpath':''})
Unexpected exception formatting exception. Falling back to standard exception
Traceback (most recent call last):
File "/home/william/anaconda3/envs/WRF/lib/python3.9/site-packages/IPython/core/interactiveshell.py", line 3378, in run_code
else 0x0
File "/tmp/ipykernel_3037918/493588173.py", line 1, in <module>
low_cloud  = xr.open_mfdataset(path_list, concat_dim='valid_time', decode_times=False, combine='nested', engine='cfgrib',
File "/home/william/anaconda3/envs/WRF/lib/python3.9/site-packages/xarray/backends/api.py", line 1003, in open_mfdataset
if parallel:
File "/home/william/anaconda3/envs/WRF/lib/python3.9/site-packages/xarray/core/combine.py", line 365, in _nested_combine
combined = _combine_nd(
File "/home/william/anaconda3/envs/WRF/lib/python3.9/site-packages/xarray/core/combine.py", line 239, in _combine_nd
combined_ids = _combine_all_along_first_dim(
File "/home/william/anaconda3/envs/WRF/lib/python3.9/site-packages/xarray/core/combine.py", line 275, in _combine_all_along_first_dim
new_combined_ids[new_id] = _combine_1d(
File "/home/william/anaconda3/envs/WRF/lib/python3.9/site-packages/xarray/core/combine.py", line 298, in _combine_1d
combined = concat(
File "/home/william/anaconda3/envs/WRF/lib/python3.9/site-packages/xarray/core/concat.py", line 243, in concat
fill_value=fill_value,
File "/home/william/anaconda3/envs/WRF/lib/python3.9/site-packages/xarray/core/concat.py", line 504, in _dataset_concat
grouped = {
File "/home/william/anaconda3/envs/WRF/lib/python3.9/site-packages/xarray/core/merge.py", line 302, in merge_collected
merged_vars[name] = unique_variable(
File "/home/william/anaconda3/envs/WRF/lib/python3.9/site-packages/xarray/core/merge.py", line 156, in unique_variable
raise MergeError(
xarray.core.merge.MergeError: conflicting values for variable 'time' on objects to be combined. You can skip this check by specifying compat='override'.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/william/anaconda3/envs/WRF/lib/python3.9/site-packages/IPython/core/interactiveshell.py", line 1997, in showtraceback
sys.last_traceback = tb
File "/home/william/anaconda3/envs/WRF/lib/python3.9/site-packages/IPython/core/ultratb.py", line 1112, in structured_traceback
etype, value, tb = sys.exc_info()
File "/home/william/anaconda3/envs/WRF/lib/python3.9/site-packages/IPython/core/ultratb.py", line 1006, in structured_traceback
File "/home/william/anaconda3/envs/WRF/lib/python3.9/site-packages/IPython/core/ultratb.py", line 859, in structured_traceback
evalue: Optional[BaseException],
File "/home/william/anaconda3/envs/WRF/lib/python3.9/site-packages/IPython/core/ultratb.py", line 793, in format_exception_as_a_whole
pass
File "/home/william/anaconda3/envs/WRF/lib/python3.9/site-packages/IPython/core/ultratb.py", line 848, in get_records
formatter = None
File "/home/william/anaconda3/envs/WRF/lib/python3.9/site-packages/stack_data/core.py", line 597, in stack_data
yield from collapse_repeated(
File "/home/william/anaconda3/envs/WRF/lib/python3.9/site-packages/stack_data/utils.py", line 84, in collapse_repeated
else:
File "/home/william/anaconda3/envs/WRF/lib/python3.9/site-packages/stack_data/core.py", line 587, in mapper
return cls(f, options)
File "/home/william/anaconda3/envs/WRF/lib/python3.9/site-packages/stack_data/core.py", line 551, in __init__
self.executing = Source.executing(frame_or_tb)
File "/home/william/anaconda3/envs/WRF/lib/python3.9/site-packages/executing/executing.py", line 328, in executing
# type: (Union[types.TracebackType, types.FrameType]) -> "Executing"
File "/home/william/anaconda3/envs/WRF/lib/python3.9/site-packages/executing/executing.py", line 250, in for_frame
self._nodes_by_line = defaultdict(list)
File "/home/william/anaconda3/envs/WRF/lib/python3.9/site-packages/executing/executing.py", line 278, in for_filename
File "/home/william/anaconda3/envs/WRF/lib/python3.9/site-packages/executing/executing.py", line 288, in _for_filename_and_lines
filename = str(filename)
File "/home/william/anaconda3/envs/WRF/lib/python3.9/site-packages/stack_data/core.py", line 97, in __init__
return [
File "/home/william/anaconda3/envs/WRF/lib/python3.9/site-packages/executing/executing.py", line 392, in asttokens
# classes have a mappingproxy preventing us from using setdefault
File "/home/william/anaconda3/envs/WRF/lib/python3.9/site-packages/asttokens/asttokens.py", line 73, in __init__
self._line_numbers.line_to_offset(*start),
File "/home/william/anaconda3/envs/WRF/lib/python3.9/site-packages/asttokens/asttokens.py", line 86, in mark_tokens
return self._text[start: end]
File "/home/william/anaconda3/envs/WRF/lib/python3.9/site-packages/asttokens/mark_tokens.py", line 61, in visit_tree
util.visit_tree(node, self._visit_before_children, self._visit_after_children)
File "/home/william/anaconda3/envs/WRF/lib/python3.9/site-packages/asttokens/util.py", line 246, in visit_tree
``par_value`` is as returned from ``previsit()`` of the parent, and ``value`` is as
File "/home/william/anaconda3/envs/WRF/lib/python3.9/site-packages/asttokens/mark_tokens.py", line 87, in _visit_after_children
if util.is_empty_astroid_slice(child):
AttributeError: module 'asttokens.util' has no attribute 'is_empty_astroid_slice'

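open_mfdataset essentially performs two steps for you: it first opens every dataset, then does whatever merging/concatenation is needed. Because this can make it hard to see what is happening behind the scenes, here is some general advice for building complex multi-file datasets in xarray.

  1. Start small and increase the complexity gradually.
  2. Open the datasets individually and try to concatenate/merge them by hand before handing them to open_mfdataset.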
import glob
import xarray as xr
files = glob.glob('path/to/files/*.grib2')
# try merging or concatenating the first two datasets
ds0 = xr.open_dataset(files[0])
ds1 = xr.open_dataset(files[1])
# merge, see options here: https://docs.xarray.dev/en/stable/generated/xarray.merge.html
ds_merged = xr.merge([ds0, ds1], ...)
# or concat, see options here: https://docs.xarray.dev/en/stable/generated/xarray.concat.html
ds_concat = xr.concat([ds0, ds1], dim='time', ... )
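
Since only some of the files contain tcc, it can also help to list which variables each file exposes before trying any merge. The following is a minimal sketch under stated assumptions: `variables_by_file` and `open_low_cloud_layer` are hypothetical helper names, and the opener simply reuses the keyword arguments from the question for a single file.

```python
import glob


def variables_by_file(paths, open_one):
    """Map each path to the sorted names of the data variables found in it."""
    inventory = {}
    for path in sorted(paths):
        ds = open_one(path)
        try:
            inventory[path] = sorted(ds.data_vars)
        finally:
            ds.close()  # only the names are needed, so release the file
    return inventory


def open_low_cloud_layer(path):
    """Assumed opener: the question's keyword arguments applied to one file."""
    import xarray as xr  # local import; needs xarray and cfgrib installed

    return xr.open_dataset(
        path,
        engine="cfgrib",
        backend_kwargs={
            "filter_by_keys": {"typeOfLevel": "lowCloudLayer"},
            "indexpath": "",
        },
    )


if __name__ == "__main__":
    # 'path/to/files/*.grib2' is the placeholder pattern used in the answer
    files = glob.glob("path/to/files/*.grib2")
    for path, names in variables_by_file(files, open_low_cloud_layer).items():
        print(path, names)
```

Files whose list lacks tcc are the ones to leave out of the combination step.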

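Then, once you have found a pattern that combines your data, try open_mfdataset with the same arguments. For example:

# concat_dim is only accepted together with combine='nested'
xr.open_mfdataset(files, concat_dim='time', combine='nested', ...)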
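
Applied to the question, the manual route might look like the sketch below: open each file on its own, keep only the datasets that actually contain tcc, and concatenate those. This is an illustration under assumptions, not a tested fix for the GFS files in question. `keep_datasets_with`, `open_low_cloud_layer` and `combine_low_cloud` are hypothetical names, the opener reuses the question's keyword arguments, and sorting the paths by name is assumed to put the files in forecast order.

```python
import glob


def keep_datasets_with(paths, var_name, open_one):
    """Open each path with open_one and keep only datasets containing var_name."""
    kept = []
    for path in sorted(paths):
        try:
            ds = open_one(path)
        except Exception as err:  # broad on purpose: skip anything the opener rejects
            print(f"skipping {path}: {err}")
            continue
        if var_name in ds.data_vars:
            kept.append(ds)
        else:
            ds.close()  # release files that lack the variable
    return kept


def open_low_cloud_layer(path):
    """Assumed opener: the question's keyword arguments applied to one file."""
    import xarray as xr  # local import; needs xarray and cfgrib installed

    return xr.open_dataset(
        path,
        engine="cfgrib",
        decode_times=False,
        backend_kwargs={
            "filter_by_keys": {"typeOfLevel": "lowCloudLayer"},
            "indexpath": "",
        },
    )


def combine_low_cloud(pattern, var_name="tcc", dim="valid_time"):
    """Concatenate var_name from every matching file that contains it."""
    import xarray as xr

    datasets = keep_datasets_with(glob.glob(pattern), var_name, open_low_cloud_layer)
    if not datasets:
        raise ValueError(f"no file matching {pattern!r} contains {var_name!r}")
    # ds[[var_name]] keeps only that variable (and its coordinates) from each file
    return xr.concat([ds[[var_name]] for ds in datasets], dim=dim)
```

With the question's setup this would be called as `low_cloud = combine_low_cloud(path_list)`. If the concatenation still reports conflicting values for time, the error message in the question points to `compat='override'`; whether skipping that check is appropriate depends on the data.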
