I'm working with JSON data loaded into Python dictionaries. Many of them have optional fields, which may in turn contain dictionaries and the like.
dictionary1 = {
    "required": {"value1": "one", "value2": "two"},
    "optional": {"value1": "one"},
}
dictionary2 = {"required": {"value1": "one", "value2": "two"}}
If I do this,
dictionary1.get("required").get("value1")
it obviously works, because the "required" field is always present.
However, when I use the same line on dictionary2 (to get the optional field), it raises an AttributeError:
dictionary2.get("optional").get("value1")
AttributeError: 'NoneType' object has no attribute 'get'
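This makes sense, because the first .get() returns None, and the second .get() cannot be called on a None object.
I can work around this by supplying defaults for when the optional field is missing, but this gets annoying as the data becomes more complex, so I call it the "naive fix":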
dictionary2.get("optional", {}).get("value1", " ")
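So the first .get() returns an empty dictionary {}, on which the second .get() can be called; since that dictionary contains nothing, the second call returns the empty string, as defined by the second default.
This no longer produces an error, but I'd like to know whether there is a better solution, especially for more complex cases (value1 containing an array or another dictionary, etc.).
I could also solve this with try/except AttributeError, but that is not my preferred approach either.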
try:
    value1 = dictionary2.get("optional").get("value1")
except AttributeError:
    value1 = " "
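I also don't like checking whether the optional field exists, because it produces junk lines of code such as
optional = dictionary2.get("optional")
if optional:
    value1 = optional.get("value1")
else:
    value1 = " "
This looks very un-Pythonic...
I'm wondering whether my approach of chaining .get() is wrong in the first place?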
In your code:
try:
    value1 = dictionary2.get("optional").get("value1")
except AttributeError:
    value1 = " "
you can use bracket indexing and except KeyError instead:
try:
    value1 = dictionary2["optional"]["value1"]
except KeyError:
    value1 = " "
If this is too verbose for the caller, add a helper:
def get_or_default(d, *keys, default=None):
    try:
        for k in keys:
            d = d[k]
    except (KeyError, IndexError):
        return default
    return d

if __name__ == "__main__":
    d = {"a": {"b": {"c": [41, 42]}}}
    print(get_or_default(d, "a", "b", "c", 1))  # => 42
    print(get_or_default(d, "a", "b", "d", default=43))  # => 43
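One case the helper above does not cover: when the JSON contains an explicit null, the intermediate value is None, and indexing it raises TypeError rather than KeyError. Here is a minimal sketch of a variant that also catches TypeError. The name get_or_default_safe and the sample data are made up for illustration.

```python
def get_or_default_safe(d, *keys, default=None):
    """Walk nested containers; return `default` if any step fails.

    Hypothetical variant of get_or_default that also catches TypeError,
    which is what indexing None (a JSON null) or a str with a str key raises.
    """
    try:
        for k in keys:
            d = d[k]
    except (KeyError, IndexError, TypeError):
        return default
    return d


# Sample data (assumed): "optional" is present but null in the JSON.
data = {"required": {"value1": "one"}, "optional": None}

found = get_or_default_safe(data, "required", "value1", default="")
missing = get_or_default_safe(data, "optional", "value1", default="")
print(repr(found), repr(missing))
```

Whether swallowing TypeError is desirable depends on how strictly you want to validate the incoming data; it can also hide genuine type mistakes in the key path.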
You can also subclass dict and use tuple indexing in brackets, as NumPy and Pandas do:
class DeepDict(dict):
    def __init__(self, d, default=None):
        self.d = d
        self.default = default

    def __getitem__(self, keys):
        d = self.d
        try:
            for k in keys:
                d = d[k]
        except (KeyError, IndexError):
            return self.default
        return d

    def __setitem__(self, keys, x):
        d = self.d
        for k in keys[:-1]:
            d = d[k]
        d[keys[-1]] = x

if __name__ == "__main__":
    dd = DeepDict({"a": {"b": {"c": [42, 43]}}}, default="foo")
    print(dd["a", "b", "c", 1])  # => 43
    print(dd["a", "b", "c", 11])  # => "foo"
    dd["a", "b", "c", 1] = "banana"
    print(dd["a", "b", "c", 1])  # => "banana"
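However, this may incur an engineering cost if it confuses other developers, or if you want to flesh out the other expected methods as described in How to "perfectly" override a dict? (Consider this a proof-of-concept sketch.) It is probably best not to be too clever.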
You can use toolz.dicttoolz.get_in():
from toolz.dicttoolz import get_in
dictionary1 = {"required": {"value1": "one", "value2": "two"}, "optional": {"value1": "one"}}
dictionary2 = {"required": {"value1": "one", "value2": "two"}}
get_in(("optional", "value1"), dictionary1)
# 'one'
get_in(("optional", "value1"), dictionary2)
# None
If you don't want to install the whole library, you can copy the source code, which is licensed under BSD:
import operator
from functools import reduce

def get_in(keys, coll, default=None, no_default=False):
    try:
        return reduce(operator.getitem, keys, coll)
    except (KeyError, IndexError, TypeError):
        if no_default:
            raise
        return default
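A short usage sketch for the copied function, showing a list index inside the key path and the no_default flag. The function body just repeats the snippet above so the example runs on its own; the sample data is invented for illustration.

```python
import operator
from functools import reduce


def get_in(keys, coll, default=None, no_default=False):
    # Same logic as the snippet above, repeated so this example is self-contained.
    try:
        return reduce(operator.getitem, keys, coll)
    except (KeyError, IndexError, TypeError):
        if no_default:
            raise
        return default


# Invented sample data mixing dicts and a list.
order = {"items": [{"name": "apple"}, {"name": "pear"}]}

second_name = get_in(["items", 1, "name"], order)           # list index in the path
fallback = get_in(["items", 5, "name"], order, default="")  # index out of range

# With no_default=True the original exception propagates to the caller.
try:
    get_in(["items", 5, "name"], order, no_default=True)
    raised = False
except IndexError:
    raised = True
```

Because the path is walked with operator.getitem, anything that supports indexing (dicts, lists, tuples) can appear along the way.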
Since you like one-liners such as dictionary2["optional"]["value1"] if "optional" in dictionary2 else " " and dictionary2.get("optional", {}).get("value1", " "), I'd also like to suggest
getattr(dictionary2.get("optional"), "get", {}.get)("value1", " ")
Using getattr also covers the case where dictionary2['optional'] is not a dictionary: it returns " " instead of raising AttributeError or TypeError as the other two approaches would.
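To make that difference concrete, here is a small sketch comparing the getattr one-liner with the chained-default version when "optional" is present but not a dict. The wrapper name get_optional_value1 and the sample inputs are assumptions for illustration only.

```python
def get_optional_value1(d, default=" "):
    # Hypothetical wrapper around the getattr one-liner.
    # If d.get("optional") has no .get attribute (None, str, list, ...),
    # getattr falls back to the .get of an empty dict, which returns the default.
    return getattr(d.get("optional"), "get", {}.get)("value1", default)


with_dict = get_optional_value1({"optional": {"value1": "one"}})
with_none = get_optional_value1({"optional": None})
with_str = get_optional_value1({"optional": "not a dict"})
with_missing = get_optional_value1({})

# For comparison: the chained-default version fails when the key exists
# but holds None, because .get("optional", {}) then returns None.
try:
    {"optional": None}.get("optional", {}).get("value1", " ")
    naive_raised = False
except AttributeError:
    naive_raised = True
```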
Wrapped as a function, it would look like
# get_v2 = lambda d, k1, k2, vDef=None: getattr(d.get(k1), 'get', {}.get)(k2, vDef)  ## OR
def get_v2(d, k1, k2, vDef=None):
    return getattr(d.get(k1), 'get', {}.get)(k2, vDef)

a = get_v2(dictionary1, 'optional', 'value1', vDef=' ')  ## --> a='one'
b = get_v2(dictionary2, 'optional', 'value1', vDef=' ')  ## --> b=' '
However, if you want to be able to call it with any number of keys, you need recursion
def getVal(obj, k1, *keys, vDef=None):
    nxtVal = getattr(obj, 'get', {}.get)(k1, vDef)
    return getVal(nxtVal, *keys, vDef=vDef) if keys else nxtVal
or a loop
def getVal(obj, *keys, vDef=None):
    for k in keys:
        obj = getattr(obj, 'get', {}.get)(k, vDef)
    return obj
That said, I think using try..except, as some have suggested, would be more efficient.
def getVal(obj, k1, *keys, vDef=None):
    try:
        return getVal(obj[k1], *keys, vDef=vDef) if keys else obj[k1]
    except (KeyError, IndexError, TypeError):
        return vDef
or
def getVal(obj, *keys, vDef=None):
    try:
        for k in keys:
            obj = obj[k]
    except (KeyError, IndexError, TypeError):
        obj = vDef
    return obj
You could also write a function that returns a function [a bit like operator.itemgetter], which can be used like valGetter("optional", "value1")(dictionary2, " ")
def valGetter(k1, *keys):
    if keys:
        def rFunc(obj, vDef=None):
            try:
                for k in (k1, *keys):
                    obj = obj[k]
            except (KeyError, IndexError, TypeError):
                obj = vDef
            return obj
    else:
        def rFunc(obj, vDef=None):
            try:
                return obj[k1]
            except (KeyError, IndexError, TypeError):
                return vDef
    return rFunc
But note that this may be considerably slower than the other approaches.
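First, you refer to " " as the empty string. That is incorrect; "" is the empty string.
Second, if you are checking membership anyway, I see no reason to use the get method in the first place. I would go with something like the following.
if "optional" in dictionary2:
    value1 = dictionary2["optional"].get("value1")
else:
    value1 = ""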
Another alternative to consider (since you use the get method so often) is switching to the defaultdict class. For example:
from collections import defaultdict
dictionary2 = {"required": {"value1": "one", "value2": "two"}}
ddic2 = defaultdict(dict,dictionary2)
value1 = ddic2["optional"].get("value1")
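One thing to be aware of with this approach: a bracket lookup of a missing key on a defaultdict stores that key, so reads can change the mapping. A small sketch of that side effect, reusing the variable names from the snippet above:

```python
from collections import defaultdict

dictionary2 = {"required": {"value1": "one", "value2": "two"}}
ddic2 = defaultdict(dict, dictionary2)

before = "optional" in ddic2                  # key is not present yet
value1 = ddic2["optional"].get("value1", "")  # bracket lookup calls the factory
after = "optional" in ddic2                   # an empty dict has now been stored

# The original plain dict is untouched, because defaultdict copied its items.
original_changed = "optional" in dictionary2

# .get() on a defaultdict does not call the factory, so it still returns None.
via_get = ddic2.get("another_missing_key")
```

Note also that only the top level becomes a defaultdict here; the nested dictionaries stay plain dicts.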
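The Pythonic way to handle this is a try/except block:
dictionary2 = {"required": {"value1": "one", "value2": "two"}}
try:
    value1 = dictionary2["optional"]["value1"]
except (KeyError, TypeError):
    value1 = ""
KeyError catches the missing key, and TypeError catches the case where the value is a list or str rather than a dict (with bracket indexing, those raise TypeError, not AttributeError).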
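As a quick check of which exception each situation raises: with bracket indexing, a missing dict key raises KeyError, while indexing a list, str or None with a string key raises TypeError. AttributeError only shows up when calling .get() on something that has no such method. The helper name exception_name is made up for this sketch.

```python
def exception_name(container, key):
    """Return the name of the exception raised by container[key], or None."""
    try:
        container[key]
    except Exception as exc:
        return type(exc).__name__
    return None


missing_key = exception_name({"a": 1}, "optional")  # dict without the key
on_list = exception_name([1, 2, 3], "value1")       # list indexed by a str
on_str = exception_name("text", "value1")           # str indexed by a str
on_none = exception_name(None, "value1")            # JSON null

# Calling .get() on None is the case that raises AttributeError.
try:
    None.get("value1")
    on_none_get = None
except AttributeError as exc:
    on_none_get = type(exc).__name__
```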
If you don't like having lots of try/except blocks in your code, consider a helper function:
def get_val(data, keys):
    try:
        for k in keys:
            data = data[k]
        return data
    except (KeyError, TypeError):
        return ""

dictionary2 = {"required": {"value1": "one", "value2": "two"}}
print(get_val(dictionary2, ("required", "value2")))
print(get_val(dictionary2, ("optional", "value1")))
Output (the second call prints an empty line):
two
I use reduce to implement JavaScript-like optional chaining in Python:
from functools import reduce

data_dictionary = {
    'foo': {
        'bar': {
            'buzz': 'lightyear'
        },
        'baz': {
            'asd': 2023,
            'zxc': [
                {'patrick': 'star'},
                {'spongebob': 'squarepants'}
            ],
            'qwe': ['john', 'sarah']
        }
    },
    'hello': {
        'world': 'hello world',
    },
}
def optional_chaining_v1(dictionary=None, *property_list):
    def reduce_callback(current_result, key):
        # Stop descending as soon as the current value is not a dict.
        if not isinstance(current_result, dict):
            return None
        return current_result.get(key)
    # Start from the dictionary itself, so a missing key yields None
    # instead of restarting the lookup at the top level.
    return reduce(reduce_callback, property_list, dictionary)

# or in one line
optional_chaining_v1 = lambda dictionary=None, *property_list: reduce(lambda current_result, key: current_result.get(key) if isinstance(current_result, dict) else None, property_list, dictionary)

# usage
optional_chaining_v1_result1 = optional_chaining_v1(data_dictionary, 'foo', 'bar', 'baz')
print('optional_chaining_v1_result1:', optional_chaining_v1_result1)
optional_chaining_v1_result2 = optional_chaining_v1(data_dictionary, 'foo', 'bar', 'buzz')
print('optional_chaining_v1_result2:', optional_chaining_v1_result2)
# optional_chaining_v1_result1: None
# optional_chaining_v1_result2: lightyear
def optional_chaining_v2(dictionary=None, list_of_property_string_separated_by_dot=''):
    property_list = list_of_property_string_separated_by_dot.split('.')
    def reduce_callback(current_result, key):
        # Stop descending as soon as the current value is not a dict.
        if not isinstance(current_result, dict):
            return None
        return current_result.get(key)
    # Start from the dictionary itself, so a missing key yields None
    # instead of restarting the lookup at the top level.
    return reduce(reduce_callback, property_list, dictionary)

# or in one line
optional_chaining_v2 = lambda dictionary=None, list_of_property_string_separated_by_dot='': reduce(lambda current_result, key: current_result.get(key) if isinstance(current_result, dict) else None, list_of_property_string_separated_by_dot.split('.'), dictionary)

# usage
optional_chaining_v2_result1 = optional_chaining_v2(data_dictionary, 'foo.bar.baz')
print('optional_chaining_v2_result1:', optional_chaining_v2_result1)
optional_chaining_v2_result2 = optional_chaining_v2(data_dictionary, 'foo.bar.buzz')
print('optional_chaining_v2_result2:', optional_chaining_v2_result2)
# optional_chaining_v2_result1: None
# optional_chaining_v2_result2: lightyear
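If the data also contains lists (like 'zxc' in the sample above), the same reduce idea can be extended to index lists by integer. This is only a sketch: the name optional_chaining_v3 and the rule that int keys are treated as list indices are my own assumptions, not part of the functions above.

```python
from functools import reduce


def optional_chaining_v3(data, *path):
    """Follow `path` through nested dicts and lists; return None on any miss."""

    def step(current, key):
        if isinstance(current, dict):
            return current.get(key)
        if isinstance(current, list) and isinstance(key, int):
            # Guard the index so an out-of-range lookup yields None.
            return current[key] if -len(current) <= key < len(current) else None
        return None  # None, str, numbers, ... cannot be descended into

    # Starting from `data` means a None in the middle of the path stays None.
    return reduce(step, path, data)


sample = {"foo": {"baz": {"zxc": [{"patrick": "star"}, {"spongebob": "squarepants"}]}}}

hit = optional_chaining_v3(sample, "foo", "baz", "zxc", 1, "spongebob")
miss_index = optional_chaining_v3(sample, "foo", "baz", "zxc", 7, "spongebob")
miss_key = optional_chaining_v3(sample, "foo", "nope", "zxc")
```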