Python: recursively transforming a dict based on a predefined mapping



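I have a use case where I need to traverse a dict (which may have strings, dicts and lists as nested values) and create a new one based on a mapping predefined by my business team. My first implementation was straightforward when the requirements were:

  1. 1:1 transformations
  2. Dropping some key/value pairs

My code looked something like this: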

def recursively_transform(parent_keys='', current_key='', container=None):
    container_class = container.__class__
    new_container_value = None
    if container is not None:
        if isinstance(container, basestring):
            new_container_value = do_something_and_return(parent_keys, current_key, container)
            if current_key in mapping:
                populate(parent_keys + current_key, new_container_value)
        elif isinstance(container, collections.Mapping):
            if parent_keys:
                parent_keys = ''.join([parent_keys, ":"])
            new_container_value = container_class(
                (x, recursively_transform(parent_keys + x, x, container[x])) for x in container if key_required(parent_keys, current_key))
        elif isinstance(container, collections.Iterable):
            new_container_value = container_class(recursively_transform(
                parent_keys + "[]", current_key, x) for x in container)
        else:
            raise Exception("")
    return new_container_value

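As you can see, inside the method do_something_and_return, using the parameters parent_keys and current_key, I apply some transformation to the value and return the new one. The steps for each combination of parent_keys plus current_key are specified in an external mapping database.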

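But now the requirements have changed to include complex transformations (no longer 1:1). That is, the mapping database will specify a new path for each key, and that path can have any structure. For example, sometimes key/value pairs have to be flattened, often the reverse has to happen, and sometimes there is no direct correspondence between the two structures at all.

Examples: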

key1:key2:[]:key3 => key2:[]:key3
key1:key2:[]:key4 => key2:[]:key5 

This means that an input like this:

{key1:{key2:[{key3: "value3", key4: "value4"}, {key3:None}]}}

will become

{key2:[{key3:"value3_after_transformation", key5:"value4_after_transformation"}, {key3:None}]}

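Here : is the separator between parent and child keys in my descriptive language, and [] indicates that the parent key has a list as its value.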
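To make the notation concrete, here is a minimal sketch of how such a path string could be split into tokens. The names tokenize and WILDCARD are hypothetical and not part of my actual code; the sketch also assumes that keys never contain a colon and that a numeric index such as [2] may appear alongside the bare [] marker.

```python
# Hypothetical sentinel standing for "every element of the list" ("[]").
WILDCARD = object()

def tokenize(path):
    """Split a path like "key1:key2:[]:key3" into a list of tokens.

    Dict keys stay strings, "[]" becomes WILDCARD, and "[n]" becomes int n.
    Assumes keys themselves never contain ":".
    """
    tokens = []
    for part in path.split(":"):
        if part == "[]":
            tokens.append(WILDCARD)
        elif part.startswith("[") and part.endswith("]"):
            tokens.append(int(part[1:-1]))
        else:
            tokens.append(part)
    return tokens

tokens = tokenize("key1:key2:[]:key3")
```

With this, both sides of a rule such as key1:key2:[]:key3 => key2:[]:key3 can be compared token by token.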

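I am confused about what the approach should be in this case. The only way I can think of that handles all of these cases is to traverse all the keys recursively and populate another global dict on the fly, checking for the target key and filling it in appropriately. But that is not easy when dealing with nested lists. It also does not sound as elegant as my solution above, which works with containers and their children. What is the best way to do this in a general and elegant way?

Thanks!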
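For illustration, the "traverse all the keys recursively" idea could look roughly like the sketch below, which flattens a nested container into (path, value) pairs. The function name iter_leaf_paths is hypothetical, and the sketch assumes that only dict and list count as containers; note that empty dicts and lists yield nothing.

```python
def iter_leaf_paths(container, prefix=()):
    """Yield (path_tuple, value) for every leaf in nested dicts and lists.

    Dict keys appear in the path as-is, list positions as ints.
    """
    if isinstance(container, dict):
        for key, child in container.items():
            yield from iter_leaf_paths(child, prefix + (key,))
    elif isinstance(container, list):
        for index, child in enumerate(container):
            yield from iter_leaf_paths(child, prefix + (index,))
    else:
        # Anything that is not a dict or list is treated as a leaf value.
        yield prefix, container

data = {"key1": {"key2": [{"key3": "value3", "key4": "value4"}, {"key3": None}]}}
leaves = list(iter_leaf_paths(data))
```

Each source path could then be matched against the mapping rules to decide where its value goes in the target structure. The hard part, as noted, is rebuilding nested lists on the target side.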

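OK, I got it working. This passes your given test case, but it is quite long. It finds all possible paths for the given template and then populates a new dict according to the new paths.

import re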

def prepare_path(path):
    # split path
    path = re.findall(r"[^:]+?(?=\[|:|$)|\[\d*?\]", path)
    # prepare path
    for i, element in enumerate(path):
        if element[0] == "[" and element[-1] == "]":
            element = int(element[1:-1])
        path[i] = element
    return path

def prepare_template(template):
    # split path template
    template = re.findall(r"[^:]+?(?=\[|:|$)|\[\d*?\]", template)
    # prepare path template
    counter = 0
    for i, element in enumerate(template):
        if element[0] == "[" and element[-1] == "]":
            if len(element) > 2:
                element = int(element[1:-1])
            else:
                element = ("ListIndex", counter)
                counter += 1  # give each "[]" placeholder its own slot
        template[i] = element
    return template

def fill_template(template, list_indexes):
    out = []
    for element in template:
        if isinstance(element, tuple):
            element = f"[{list_indexes[element[1]]}]"
        out.append(element)
    return ":".join(out)

def populate(result_dict, target_path, value):
    target_path = prepare_path(target_path)
    current = result_dict
    for i, element in enumerate(target_path[:-1]):
        if isinstance(element, str):  # dict index
            if element not in current:  # create new entry
                if isinstance(target_path[i + 1], str):  # next is a dict
                    current[element] = {}
                else:  # next is a list
                    current[element] = []
        elif isinstance(element, int):  # list index
            if element >= len(current):  # create new entry
                current.extend(None for _ in range(element - len(current) + 1))
            if current[element] is None:
                if isinstance(target_path[i + 1], str):  # next is a dict
                    current[element] = {}
                else:  # next is a list
                    current[element] = []
        current = current[element]
    if isinstance(target_path[-1], int):
        current.append(value)
    else:
        current[target_path[-1]] = value

def get_value(container, target_path):
    target_path = prepare_path(target_path)
    current = container
    for key in target_path:
        current = current[key]
    return current

def transform(old_path, new_path, old_container, new_container, transform_value=lambda *args: ' '.join(args)):
    value = get_value(old_container, old_path)
    new_value = transform_value(old_path, new_path, value)
    populate(new_container, new_path, new_value)

def get_all_paths(prepared_template, container):
    if not prepared_template:
        return [("",())]
    key, *rest = prepared_template
    if isinstance(key, tuple):
        if not isinstance(container, list):
            raise ValueError(container, key)
        paths = [(f"[{i}]:" + path, (i,) + per) for i, child in enumerate(container) for path, per in get_all_paths(rest, child)]
    elif isinstance(key, str):
        if key not in container:
            return []
        child = container[key]
        paths = [(f"{key}:" + path, per) for path, per in get_all_paths(rest, child)]
    elif isinstance(key, int):
        child = container[key]
        paths = [(f"[{key}]:" + path, per) for path, per in get_all_paths(rest, child)]
    else:
        raise ValueError
    return paths

def transform_all(old_template, new_template, old_container, new_container, transform_value=lambda op, np, value: value):
    new_template = prepare_template(new_template)
    old_template = prepare_template(old_template)
    all_paths = get_all_paths(old_template, old_container)
    for path, per in all_paths:
        transform(path, fill_template(new_template, per), old_container, new_container, transform_value)

# Example from the question
input_dict = {"key1": {"key2": [{"key3": "value3", "key4": "value4"}, {"key3": None}]}}
output_dict = {}
transform_all("key1:key2:[]:key3", "key2:[]:key3", input_dict, output_dict)
transform_all("key1:key2:[]:key4", "key2:[]:key5", input_dict, output_dict)
print(output_dict)

If you have any questions or other cases, please ask! These are interesting challenges you are giving us.
