Recursive json to csv with python



I have a JSON file that I need to convert to CSV, but in this case the JSON structure is nested:

[ { "node":[ { "node":[ { "node":[ { "node":[ { "valBool":false, "valStr1":[ "true" ], "valStr2":[ "Industry is in bank exclusion list" ] }, { "valBool":false, "valStr1":[ "true" ], "valStr2":[ "Borrower is currently under bankruptcy law" ] }, { "valBool":false, "valStr1":[ "true" ], "valStr2":[ "Borrower is flagged as Unwilling" ] }, { "valBool":false, "valStr1":[ "true" ], "valStr2":[ "Borrower is flagged as non-viable" ] }, { "valBool":false, "valStr1":[ "true" ], "valStr2":[ "Borrower has blocked access of bank to Tiresias" ] }, { "valBool":false, "valStr1":[ "true" ], "valStr2":[ "Borrower is default (NPE/NPF eba status) " ] }, { "valBool":true, "valStr1":[ "false" ], "valStr2":[ "Default value" ] } ] } ] }, { "node":[ { "node":[ { "node":[ { "node":[ { "node":[ { "valBool":false, "valStr1":[ "1" ], "valStr2":[ "There are less \nthan 10 employees" ] }, { "valBool":false, "valStr1":[ "1" ], "valStr2":[ "Annual turnover is \nlower than annual \nturnover threshold" ] }, { "valBool":false, "valStr1":[ "1" ], "valStr2":[ "Total assets are \nlower than total \nassets threshold" ] }, { "valBool":true, "valStr1":[ "0" ], "valStr2":[ "Default" ] } ] } ] } ] } ] } ] } ] } ] } ]

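As you can see, "node" can be found at any level. I tried some recursive approaches, but the output is not what we want. We need to get every node that contains the three values and write them as one row in the CSV.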

The expected output should be:

valBool,valStr1,valStr2
false,"true","Industry is in bank exclusion list"
false,"true","Borrower is currently under bankruptcy law"

I have already tried this, but the output just puts each value on a new line, with the whole path written in the key.

Any ideas?

Thanks!

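In your recursive function, you have to consider whether the data being processed is a list or a dictionary. If it is a list, you simply call the function recursively on each of its items. If it is a dictionary, you try to print the values associated with 'valBool', 'valStr1' and 'valStr2' if they exist, and then call the function recursively on the value associated with 'node', if there is one.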

data = [ { "node":[ { "node":[ { "node":[ { "node":[ { "valBool":False, "valStr1":[ "true" ], "valStr2":[ "Industry is in bank exclusion list" ] }, { "valBool":False, "valStr1":[ "true" ], "valStr2":[ "Borrower is currently under bankruptcy law" ] }, { "valBool": False, "valStr1":[ "true" ], "valStr2":[ "Borrower is flagged as Unwilling" ] }, { "valBool": False, "valStr1":[ "true" ], "valStr2":[ "Borrower is flagged as non-viable" ] }, { "valBool": False, "valStr1":[ "true" ], "valStr2":[ "Borrower has blocked access of bank to Tiresias" ] }, { "valBool":False, "valStr1":[ "true" ], "valStr2":[ "Borrower is default (NPE/NPF eba status) " ] }, { "valBool":True, "valStr1":[ "false" ], "valStr2":[ "Default value" ] } ] } ] }, { "node":[ { "node":[ { "node":[ { "node":[ { "node":[ { "valBool":False, "valStr1":[ "1" ], "valStr2":[ "There are less \nthan 10 employees" ] }, { "valBool":False, "valStr1":[ "1" ], "valStr2":[ "Annual turnover is \nlower than annual \nturnover threshold" ] }, { "valBool":False, "valStr1":[ "1" ], "valStr2":[ "Total assets are \nlower than total \nassets threshold" ] }, { "valBool":True, "valStr1":[ "0" ], "valStr2":[ "Default" ] } ] } ] } ] } ] } ] } ] } ] } ]
result = list()
def loop(data):
    if isinstance(data, list):
        for item in data:  # data is a list => recursive call on all its items
            loop(item)
    elif isinstance(data, dict):  # data is a dictionary
        try:
            row = f"{data['valBool']};{data['valStr1'][0]};{data['valStr2'][0]}"
            print(row)
            result.append(row)
        except KeyError:  # dictionary does not have all valXXX keys
            pass
        if 'node' in data:  # recursive call if the dictionary has a "node" key
            loop(data['node'])
print('valBool;valStr1;valStr2')
loop(data)

This is not exactly the output you expect, but you will figure out how to modify it.

[EDIT] Modified the code to put the rows in the list result
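
For readers who want to get closer to the comma-separated output in the question, here is one possible sketch using the standard csv module. The names iter_rows, to_csv and sample are made up for this illustration and are not part of the answer above. It assumes valStr1 and valStr2 are always non-empty lists and keeps only their first element. It also uses the csv module's default quoting, so strings are quoted only when they contain a comma, a quote or a newline, which differs slightly from the always-quoted strings shown in the question.

```python
import csv
import io
import json

KEYS = ("valBool", "valStr1", "valStr2")


def iter_rows(data):
    """Yield one (valBool, valStr1, valStr2) tuple per dictionary holding all three keys."""
    if isinstance(data, list):
        # A list: recurse into every item.
        for item in data:
            yield from iter_rows(item)
    elif isinstance(data, dict):
        if all(key in data for key in KEYS):
            yield (
                json.dumps(data["valBool"]),  # True/False -> "true"/"false"
                data["valStr1"][0],           # assumption: non-empty list
                data["valStr2"][0],           # assumption: non-empty list
            )
        if "node" in data:
            # A dictionary with children: recurse into them.
            yield from iter_rows(data["node"])


def to_csv(data):
    """Return the CSV text (header plus one line per matching dictionary)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(KEYS)
    writer.writerows(iter_rows(data))
    return buffer.getvalue()


# Small hypothetical sample with the same shape as the data in the question.
sample = [{"node": [{"node": [
    {"valBool": False, "valStr1": ["true"],
     "valStr2": ["Industry is in bank exclusion list"]},
    {"valBool": True, "valStr1": ["false"], "valStr2": ["Default value"]},
]}]}]

print(to_csv(sample))
```

To write a file instead of a string, the same writer can be pointed at a file opened with open(path, "w", newline=""), where path is whatever output file name you choose.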
