How to parse multi-index values from JSON data in Python and create a CSV file



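I have a few static key columns, EmployeeId and type, plus several columns that come from the first for loop.

In the second for loop, if a particular key is present, its values should be appended to the existing DataFrame columns; otherwise the columns obtained from the first for loop should stay unchanged.

Output of the first for loop: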

"EmployeeId","type","KeyColumn","Start","End","Country","Target","CountryId","TargetId"
"Emp1","Metal","1212121212","2000-06-17","9999-12-31","","","",""

After the second for loop, I have the following output:

"EmployeeId","type","KeyColumn","Start","End","Country","Target","CountryId","TargetId"
"Emp1","Metal","1212121212","2000-06-17","9999-12-31","","AMAZON","1",""
"Emp1","Metal","1212121212","2000-06-17","9999-12-31","","FLIPKART","2",""

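With this code, when the Employee tag is available I get multiple records (two in the example above). However, some of my JSON files may not have the Employee tag, and in that case the output should be the same as the output of the first loop.

With my code, though, I get 0 records. If my approach is wrong, please help me.

I am really sorry if the question is unclear; I am new to Python.

Please find the code below: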

# json_file, df and columns_list are defined earlier (not shown)
for i in range(len(json_file['enty'])):
    temp = {}
    temp['EmployeeId'] = json_file['enty'][i]['id']
    temp['type'] = json_file['enty'][i]['type']
    # First loop: take the first value of every attribute
    for key in json_file['enty'][i]['data']['attributes'].keys():
        try:
            temp[key] = json_file['enty'][i]['data']['attributes'][key]['values'][0]['value']
        except:
            temp[key] = None
    # Second loop: expand the Employee groups, one row per group entry
    for key in json_file['enty'][i]['data']['attributes'].keys():
        if(key == 'Employee'):
            for j in range(len(json_file['enty'][i]['data']['attributes']['Employee']['group'])):
                for key in json_file['enty'][i]['data']['attributes']['Employee']['group'][j].keys():
                    try:
                        temp[key] = json_file['enty'][i]['data']['attributes']['Employee']['group'][j][key]['values'][0]['value']
                    except:
                        temp[key] = None
                temp_df = pd.DataFrame([temp])
                df = pd.concat([df, temp_df], sort=True)
# Rearranging columns
df = df[['EmployeeId', 'type'] + [col for col in df.columns if col not in ['EmployeeId', 'type']]]
# Writing the dataset
df[columns_list].to_csv("Test22.csv", index=False, quotechar='"', quoting=1)

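If the Employee tag is not available, I get 0 records as output, but I expect 1 record from each iteration of the outer for loop. If the Employee tag is available, I expect 2 records along with my static columns EmployeeId, type, KeyColumn, Start and End. If the tag is not available, I expect all the static columns EmployeeId, type, KeyColumn, Start and End, with the remaining columns blank.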


A rather long solution that modifies your code: add one more loop, change the indexing, and adjust the range argument:

import pandas as pd

df = pd.DataFrame()
# Number of output rows = length of the longest list under data1
num = max([len(v) for k,v in json_file['data'][0]['data1'].items()])
for i in range(num):
    temp = {}
    temp['Empid'] = json_file['data'][0]['Empid']
    temp['Empname'] = json_file['data'][0]['Empname']
    for key in json_file['data'][0]['data1'].keys():
        if key not in temp:
            temp[key] = []
        try:
            # Collect every relative id found under this key
            for j in range(len(json_file['data'][0]['data1'][key])):
                temp[key].append(json_file['data'][0]['data1'][key][j]['relative']['id'])
        except Exception:
            temp[key] = None
    temp_df = pd.DataFrame([temp])
    df = pd.concat([df, temp_df],ignore_index=True)
# Flatten the per-row lists and keep one distinct value per row
for i in json_file['data'][0]['data1'].keys():
    df[i] = pd.Series([x for y in df[i].tolist() for x in y]).drop_duplicates()

Now:

print(df)

gives:

   Empid  Empname    XXXX   YYYYY
0   1234      ABC  Naveen   Kumar
1   1234      ABC     NaN  Rajesh
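
For the structure used in the question itself, a minimal sketch along the following lines might be easier to follow. It builds the static columns once per entity, then emits one row per Employee group entry, or a single row with only the static columns when the Employee tag is missing. The helper names first_value and entity_rows and the sample json_file (including the second entity, Emp2) are hypothetical; the JSON shape is only inferred from how the question's code indexes it, so adjust the keys to your real files.

```python
import csv

import pandas as pd


def first_value(attribute):
    """Return attribute['values'][0]['value'], or None if that path is missing."""
    try:
        return attribute['values'][0]['value']
    except (KeyError, IndexError, TypeError):
        return None


def entity_rows(entity):
    """One row per Employee group entry, or a single row if there is none."""
    attributes = entity.get('data', {}).get('attributes', {})
    base = {'EmployeeId': entity.get('id'), 'type': entity.get('type')}
    for key, attribute in attributes.items():
        if key != 'Employee':
            base[key] = first_value(attribute)

    groups = attributes.get('Employee', {}).get('group', [])
    if not groups:
        return [base]  # no Employee tag: keep the static columns only
    rows = []
    for group in groups:
        row = dict(base)  # copy so every group entry gets its own row
        for key, attribute in group.items():
            row[key] = first_value(attribute)
        rows.append(row)
    return rows


# Column order taken from the CSV header shown in the question.
columns = ['EmployeeId', 'type', 'KeyColumn', 'Start', 'End',
           'Country', 'Target', 'CountryId', 'TargetId']

# Hypothetical sample data, shaped the way the question's code indexes it.
json_file = {'enty': [
    {'id': 'Emp1', 'type': 'Metal', 'data': {'attributes': {
        'KeyColumn': {'values': [{'value': '1212121212'}]},
        'Start': {'values': [{'value': '2000-06-17'}]},
        'End': {'values': [{'value': '9999-12-31'}]},
        'Employee': {'group': [
            {'Target': {'values': [{'value': 'AMAZON'}]},
             'CountryId': {'values': [{'value': '1'}]}},
            {'Target': {'values': [{'value': 'FLIPKART'}]},
             'CountryId': {'values': [{'value': '2'}]}},
        ]},
    }}},
    {'id': 'Emp2', 'type': 'Metal', 'data': {'attributes': {
        'KeyColumn': {'values': [{'value': '3434343434'}]},
        'Start': {'values': [{'value': '2001-01-01'}]},
        'End': {'values': [{'value': '9999-12-31'}]},
    }}},
]}

rows = [row for entity in json_file['enty'] for row in entity_rows(entity)]
# reindex adds any missing columns (filled with NaN) and fixes the order.
df = pd.DataFrame(rows).reindex(columns=columns)
# csv.QUOTE_ALL is the named constant for quoting=1 used in the question.
csv_text = df.to_csv(index=False, quoting=csv.QUOTE_ALL)
print(csv_text)
```

Collecting plain dictionaries and building the DataFrame once avoids calling pd.concat inside the loop, and the early return for a missing Employee tag is what keeps entities without that tag from disappearing from the output. To write a file instead of a string, pass a path such as "Test22.csv" as the first argument to to_csv.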
