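I have a use case where I need to traverse a dict (which may have strings, dicts and lists as nested values) and create a new one based on a mapping predefined by my business team. My first implementation was straightforward when the requirements were:
- 1:1 transformation
- dropping some key-value pairs
My code looked something like this: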
def recursively_transform(parent_keys='', current_key='', container=None):
    container_class = container.__class__
    new_container_value = None
    if container is not None:
        if isinstance(container, basestring):
            new_container_value = do_something_and_return(parent_keys, current_key, container)
            if current_key in mapping:
                populate(parent_keys + current_key, new_container_value)
        elif isinstance(container, collections.Mapping):
            if parent_keys:
                parent_keys = ''.join([parent_keys, ":"])
            new_container_value = container_class(
                (x, recursively_transform(parent_keys + x, x, container[x])) for x in container if key_required(parent_keys, current_key))
        elif isinstance(container, collections.Iterable):
            new_container_value = container_class(recursively_transform(
                parent_keys + "[]", current_key, x) for x in container)
        else:
            raise Exception("")
    return new_container_value
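As you can see, inside the method do_something_and_return, using the parameters parent_keys and current_key, I apply some transformation to the value and return the new one. The steps for each parent_keys plus current_key combination are specified in an external mapping database.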
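But now the requirements have changed to allow complex transformations (no longer 1:1). That is, my mapping database will specify a new path for each key, and that path can have any structure. For example, sometimes key/value pairs have to be flattened, often the reverse has to happen, and sometimes there is no direct correspondence between the old and new structure at all.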
For example:
key1:key2:[]:key3 => key2:[]:key3
key1:key2:[]:key4 => key2:[]:key5
This means that an input like this:
{key1:{key2:[{key3: "value3", key4: "value4"}, {key3:None}]}}
would become
{key2:[{key3:"value3_after_transformation", key5:"value4_after_transformation"}, {key3:None}]}
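Here ":" is the separator between parent and child keys in my descriptive language, and "[]" indicates that the parent key has a list as its value.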
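To make the notation concrete, here is a minimal sketch of how such a path string could be split into segments. The helper name parse_path and the choice to represent the wildcard as the string "[]" are assumptions made for illustration only; they are not part of the code above.

```python
import re

# Hypothetical helper: split a path expression into segments.
# A segment is a dict key (str), a concrete list index (int),
# or the wildcard "[]" meaning "every element of the list".
_SEGMENT = re.compile(r"\[(\d*)\]|([^:\[\]]+)")


def parse_path(path):
    segments = []
    for index, key in _SEGMENT.findall(path):
        if key:                      # plain dict key
            segments.append(key)
        elif index:                  # "[3]" -> concrete list index
            segments.append(int(index))
        else:                        # "[]" -> wildcard over the list
            segments.append("[]")
    return segments


print(parse_path("key1:key2:[]:key3"))
```

This sketch assumes that keys never contain ":", "[" or "]"; real keys with those characters would need an escaping rule.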
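In this situation I am unsure what the approach should be. The only way I can think of to handle all these cases is to traverse all the keys recursively and fill another global dict on the fly, checking the target key for each value and populating it accordingly. But that is not easy when dealing with nested lists. It also does not sound as elegant as the solution above, which works with containers and their children. What is the best way to do this in a general and elegant manner?
Thanks!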
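OK, I got it working. This passes your given test case, but it is quite long. It finds all possible paths for the given template and then fills a new dict according to the new paths.
import re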
def prepare_path(path):
    # split path into keys and "[n]" list indexes
    path = re.findall(r"[^:]+?(?=\[|:|$)|\[\d*?\]", path)
    # prepare path: convert "[n]" into the integer n
    for i, element in enumerate(path):
        if element[0] == "[" and element[-1] == "]":
            element = int(element[1:-1])
        path[i] = element
    return path
def prepare_template(template):
    # split path template into keys, "[n]" list indexes and "[]" wildcards
    template = re.findall(r"[^:]+?(?=\[|:|$)|\[\d*?\]", template)
    # prepare path template
    counter = 0
    for i, element in enumerate(template):
        if element[0] == "[" and element[-1] == "]":
            if len(element) > 2:
                element = int(element[1:-1])
            else:
                # "[]" wildcard: remember which list level it refers to
                element = ("ListIndex", counter)
                counter += 1  # so that nested "[]" levels get distinct slots
        template[i] = element
    return template
def fill_template(template, list_indexes):
    out = []
    for element in template:
        if isinstance(element, tuple):
            element = f"[{list_indexes[element[1]]}]"
        out.append(element)
    return ":".join(out)
def populate(result_dict, target_path, value):
    target_path = prepare_path(target_path)
    current = result_dict
    for i, element in enumerate(target_path[:-1]):
        if isinstance(element, str):  # dict index
            if element not in current:  # create new entry
                if isinstance(target_path[i + 1], str):  # next is a dict
                    current[element] = {}
                else:  # next is a list
                    current[element] = []
        elif isinstance(element, int):  # list index
            if element >= len(current):  # create new entry
                current.extend(None for _ in range(element - len(current) + 1))
            if current[element] is None:
                if isinstance(target_path[i + 1], str):  # next is a dict
                    current[element] = {}
                else:  # next is a list
                    current[element] = []
        current = current[element]
    if isinstance(target_path[-1], int):
        current.append(value)
    else:
        current[target_path[-1]] = value
def get_value(container, target_path):
    target_path = prepare_path(target_path)
    current = container
    for key in target_path:
        current = current[key]
    return current
def transform(old_path, new_path, old_container, new_container, transform_value=lambda *args: ' '.join(args)):
    value = get_value(old_container, old_path)
    new_value = transform_value(old_path, new_path, value)
    populate(new_container, new_path, new_value)
def get_all_paths(prepared_template, container):
    if not prepared_template:
        return [("", ())]
    key, *rest = prepared_template
    if isinstance(key, tuple):
        if not isinstance(container, list):
            raise ValueError(container, key)
        paths = [(f"[{i}]:" + path, (i,) + per) for i, child in enumerate(container) for path, per in get_all_paths(rest, child)]
    elif isinstance(key, str):
        if key not in container:
            return []
        child = container[key]
        paths = [(f"{key}:" + path, per) for path, per in get_all_paths(rest, child)]
    elif isinstance(key, int):
        child = container[key]
        paths = [(f"[{key}]:" + path, per) for path, per in get_all_paths(rest, child)]
    else:
        raise ValueError
    return paths
def transform_all(old_template, new_template, old_container, new_container, transform_value=lambda op, np, value: value):
    new_template = prepare_template(new_template)
    old_template = prepare_template(old_template)
    all_paths = get_all_paths(old_template, old_container)
    for path, per in all_paths:
        transform(path, fill_template(new_template, per), old_container, new_container, transform_value)
input_dict = {"key1": {"key2": [{"key3": "value3", "key4": "value4"}, {"key3": None}]}}
output_dict = {}
transform_all("key1:key2:[]:key3", "key2:[]:key3", input_dict, output_dict)
transform_all("key1:key2:[]:key4", "key2:[]:key5", input_dict, output_dict)
print(output_dict)
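The same idea (collect every concrete path, then rebuild under the new path) can also be written more compactly by flattening the input first. The following is only a sketch under assumptions, not a drop-in replacement: the names flatten, to_template, remap, insert and transform_with_mapping are made up for illustration, the mapping is assumed to be a plain dict from old template to new template, keys are assumed not to contain ":", and the new template is assumed to have no more "[]" slots than the old one.

```python
def flatten(node, prefix=()):
    """Yield (path, leaf) pairs; a path is a tuple of keys (str) and list indexes (int)."""
    if isinstance(node, dict):
        for key, child in node.items():
            yield from flatten(child, prefix + (key,))
    elif isinstance(node, list):
        for i, child in enumerate(node):
            yield from flatten(child, prefix + (i,))
    else:
        yield prefix, node


def to_template(path):
    """Replace concrete list indexes with "[]" so the path can be looked up in the mapping."""
    return ":".join("[]" if isinstance(p, int) else p for p in path)


def remap(path, new_template):
    """Fill the "[]" slots of new_template with the list indexes from path, in order."""
    indexes = iter([p for p in path if isinstance(p, int)])
    return tuple(next(indexes) if seg == "[]" else seg
                 for seg in new_template.split(":"))


def insert(root, path, value):
    """Store value at path, creating intermediate dicts and lists as needed."""
    current = root
    for seg, nxt in zip(path, path[1:]):
        empty = [] if isinstance(nxt, int) else {}
        if isinstance(seg, int):
            while len(current) <= seg:      # pad the list up to the index
                current.append(None)
            if current[seg] is None:
                current[seg] = empty
        elif seg not in current:
            current[seg] = empty
        current = current[seg]
    last = path[-1]
    if isinstance(last, int):
        while len(current) <= last:
            current.append(None)
    current[last] = value


def transform_with_mapping(data, mapping, convert=lambda value: value):
    result = {}
    for path, value in flatten(data):
        new_template = mapping.get(to_template(path))
        if new_template is not None:        # unmapped paths are dropped
            insert(result, remap(path, new_template), convert(value))
    return result


mapping = {
    "key1:key2:[]:key3": "key2:[]:key3",
    "key1:key2:[]:key4": "key2:[]:key5",
}
data = {"key1": {"key2": [{"key3": "value3", "key4": "value4"}, {"key3": None}]}}
print(transform_with_mapping(data, mapping))
```

With a single pass over the flattened leaves, each value is looked up once in the mapping, which may be convenient when the mapping has many entries. Empty dicts and lists produce no leaves in this sketch, so they are not carried over.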
If you have any questions or other cases, just ask! These are interesting challenges you are giving us.