How to search for a keyword in a JSON file with nested keys and values



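I'm a beginner in Python and I'm trying to search for a specific keyword in a JSON file. I've been reading about how dictionaries and lists work in Python, and I found this: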

complex_list = [["a",["b",["c","x"]]],42]
complex_list[0][1]
output: ['b', ['c', 'x']]
complex_list[0][1][1][0]
output: 'c'

As far as I understand, in complex_list[0][1], [0] selects the whole first bracketed list, and [1] accesses the second item of that list: [, [this one] ]

Now, complex_list = [["a",["b",["c","x"]]],42] has 2 elements, right? a, b, c and x belong to the first element, and 42 is the second. What I don't understand is how complex_list[0][1][1][0] ends up accessing 'c'.

Can someone break it down? I'm asking because I think this is what I need in order to solve the problem I explain below.
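One way to read a chained index such as complex_list[0][1][1][0] is to apply one index at a time, from left to right: each step returns a smaller object, and the next index works on that object. The intermediate names below (step1 to step4) are made up purely for illustration.

```python
complex_list = [["a", ["b", ["c", "x"]]], 42]

# The outer list has two elements: a nested list and the integer 42.
step1 = complex_list[0]   # ['a', ['b', ['c', 'x']]]
step2 = step1[1]          # ['b', ['c', 'x']]
step3 = step2[1]          # ['c', 'x']
step4 = step3[0]          # 'c'

print(len(complex_list))
print(step4 == complex_list[0][1][1][0])
```

Each index therefore peels off exactly one level of nesting, which is why four indices are needed to reach 'c'.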

Here is a small sample of the file I'm currently working with:

{ (white)
    "results": [
        { (black)
            "Fruit": "Apple",
            "Nested fruit": [
                "Orange"
            ],
            "Title1": "Some text",
            "Contents": { (yellow)
                "Name 1": [
                    "John Smith"
                ],
                "Name 2": [
                    "Tyler"
                ],
                "Name 3": [
                    "Bob",
                    "Rob"
                ],
                "Name 4": [
                    "Linda"
                ],
                "Name 5": [
                    "Mark",
                    "Matt"
                ],
                "Some boolean": [
                    true
                ]
            }, (yellow)
            "More stuff": "More random text",
            "Confusing": [
                { (red)
                    "Some info": "456",
                    "Info I want": "849456"
                } (red)
            ],
            "Not important": [
                { (blue)
                    "random text": "bla",
                    "random text2": "bla bla"
                } (blue)
            ],
            "Not important 2": "000",
            "Not important3": [
                "whatever",
                "whatever"
            ],
            "Not important 4": "16",
            "Not important 5": "0058"
        } (black)
    ]
} (white)

I put a color in parentheses next to each pair of matching curly braces so they are easy to tell apart. Here is an example I found online:

import json

with open('searchingKeywords.json') as f:
    data = json.load(f)

print(data.keys())
for k in data:
    for v in data[k]:
        if 'More stuff' in v:
            print("yes")

This prints:

dict_keys(['results'])
yes
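For context on why this prints "yes": each v in the loop is a dict taken from the "results" list, and the `in` operator on a dict checks only that dict's own keys. A minimal sketch, using a hypothetical `record` dict cut down from the sample:

```python
# Hypothetical, trimmed-down stand-in for one element of data["results"].
record = {
    "More stuff": "More random text",
    "Confusing": [{"Some info": "456", "Info I want": "849456"}],
}

found_top = "More stuff" in record       # a key of record itself
found_nested = "Info I want" in record   # a key of a dict one level deeper

print(found_top, found_nested)
```

The second check is False because "Info I want" belongs to the dict inside the "Confusing" list, not to `record` itself.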

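There is only one key, but what about "Contents"? Isn't that another key inside "results"? I'm confused. What I'm interested in is "Info I want", inside "Confusing". How can I search through so much nesting for the key "Info I want"? Initially, once I had parsed the JSON file into a Python object, I tried reading it line by line to see whether "Info I want" appeared in each line, but I kept getting errors. Also, the file I'm working with is large, and "Info I want" may be nested in different ways.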
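On the question about keys: data.keys() lists only the keys of the outermost dict, so "Contents" does not show up there; it is a key of the dict stored inside the "results" list. The sketch below assumes an abbreviated copy of the sample (the `raw` string is a stand-in for the real file) and shows access by a known, fixed path.

```python
import json

# Abbreviated stand-in for the sample file; json.loads parses a string
# the same way json.load parses an open file object.
raw = '''
{
  "results": [
    {
      "Fruit": "Apple",
      "Contents": {"Name 1": ["John Smith"]},
      "More stuff": "More random text",
      "Confusing": [{"Some info": "456", "Info I want": "849456"}]
    }
  ]
}
'''
data = json.loads(raw)

top_keys = list(data.keys())        # keys of the outermost dict only
first = data["results"][0]          # "results" is a list holding one dict
inner_keys = list(first.keys())     # "Contents" lives at this level
wanted = first["Confusing"][0]["Info I want"]

print(top_keys)
print(inner_keys)
print(wanted)
```

Direct indexing like this only works while the path to the key stays the same. A parsed JSON document is also a tree of dicts and lists rather than lines of text, so reading it line by line does not apply once it has been loaded.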

As mentioned in the comments, the non-accepted answer to the linked question fits your case well:

data = {
    "results": [
        {
            "Fruit": "Apple",
            "Nested fruit": [
                "Orange"
            ],
            "Title1": "Some text",
            "Contents": {
                "Name 1": [
                    "John Smith"
                ],
                "Name 2": [
                    "Tyler"
                ],
                "Name 3": [
                    "Bob",
                    "Rob"
                ],
                "Name 4": [
                    "Linda"
                ],
                "Name 5": [
                    "Mark",
                    "Matt"
                ],
                "Some boolean": [
                    True
                ]
            },
            "More stuff": "More random text",
            "Confusing": [
                {
                    "Some info": "456",
                    "Info I want": "849456"
                }
            ],
            "Not important": [
                {
                    "random text": "bla",
                    "random text2": "bla bla"
                }
            ],
            "Not important 2": "000",
            "Not important3": [
                "whatever",
                "whatever"
            ],
            "Not important 4": "16",
            "Not important 5": "0058"
        }
    ]
}

def item_generator(json_input, lookup_key):
    """Recursively yield every value stored under lookup_key."""
    if isinstance(json_input, dict):
        for k, v in json_input.items():
            if k == lookup_key:
                yield v
            else:
                # Not a match: keep searching inside this value.
                yield from item_generator(v, lookup_key)
    elif isinstance(json_input, list):
        # Lists have no keys of their own, so search each element.
        for item in json_input:
            yield from item_generator(item, lookup_key)


res = item_generator(data, 'More stuff')
print(list(res))
res = item_generator(data, 'Info I want')
print(list(res))

Output:

['More random text']
['849456']
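If it also matters where in the structure the key was found, the same recursive idea can carry the path along. This is a sketch and not part of the linked answer: `find_with_path` and `sample` are hypothetical names, and unlike the generator above it keeps descending into a matched value in case the key appears again deeper down.

```python
def find_with_path(node, lookup_key, path=()):
    """Yield (path, value) pairs for every occurrence of lookup_key."""
    if isinstance(node, dict):
        for k, v in node.items():
            if k == lookup_key:
                yield path + (k,), v
            # Descend regardless of a match, so nested repeats are found too.
            yield from find_with_path(v, lookup_key, path + (k,))
    elif isinstance(node, list):
        for i, item in enumerate(node):
            yield from find_with_path(item, lookup_key, path + (i,))


# Trimmed-down sample, for illustration only.
sample = {
    "results": [
        {
            "More stuff": "More random text",
            "Confusing": [{"Some info": "456", "Info I want": "849456"}],
        }
    ]
}

matches = list(find_with_path(sample, "Info I want"))
print(matches)
```

Each path is a tuple of dict keys and list indices, so it can be replayed as sample["results"][0]["Confusing"][0]["Info I want"].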
