How do I extract two values from a dictionary in Python?



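I'm using Python 3 and I have a dataset containing the data below. I'm trying to get the desired output from this list of dictionaries. I've tried many approaches but can't figure out how to do it.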

slots_data = [
    {
        "id":551,
        "user_id":1,
        "time":"199322002",
        "expire":"199322002"
    },
    {
        "id":552,
        "user_id":1,
        "time":"199322002",
        "expire":"199322002"
    },
    {
        "id":525,
        "user_id":3,
        "time":"199322002",
        "expire":"199322002"
    },
    {
        "id":524,
        "user_id":3,
        "time":"199322002",
        "expire":"199322002"
    },
    {
        "id":553,
        "user_id":1,
        "time":"199322002",
        "expire":"199322002"
    },
    {
        "id":550,
        "user_id":2,
        "time":"199322002",
        "expire":"199322002"
    }
]


# Desired output 
# [
# {"user_id":1,"slots_ids":[551,552,553]}
# {"user_id":2,"slots_ids":[550]}
# {"user_id":3,"slots_ids":[524,525]}
# ]

I tried the following, which is obviously incorrect. I can't work out a solution to this problem:

final_list = []
for item in slots_data:
    obj = obj.dict()
    obj = {
        "user_id":item["user_id"],
        "slot_ids":item["id"]
    }
    final_list.append(obj)
print(set(final_list))

Another answer added here has a good solution using pandas, but here is one that does not use pandas:

users = {}
for item in slots_data:
    # Check if we've seen this user before,
    if item['user_id'] not in users:
        # if not, create a new entry for them
        users[item['user_id']] = {'user_id': item['user_id'], 'slot_ids': []}
    # Add their slot ID to their dictionary
    users[item['user_id']]['slot_ids'].append(item['id'])
# We only need the values (dicts)
output_list = list(users.values())

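There are a lot of good answers here.

If I were doing this, I would base my answer on setdefault and/or collections.defaultdict, which can be used in a similar way. I think the defaultdict version is very readable, but if you are not already importing collections you can do without it.

Given your data: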

slots_data = [
    {
        "id":551,
        "user_id":1,
        "time":"199322002",
        "expire":"199322002"
    },
    {
        "id":552,
        "user_id":1,
        "time":"199322002",
        "expire":"199322002"
    },
    #....
]

You can reshape it into your desired output via:

## -------------------
## get the value for the key user_id if it exists
## if it does not, set the value for that key to a default
## use the value to append the current id to the sub-list
## -------------------
reshaped = {}
for slot in slots_data:
    user_id = slot["user_id"]
    slot_id = slot["id"]  # named slot_id to avoid shadowing the built-in id()
    reshaped.setdefault(user_id, []).append(slot_id)
## -------------------

## -------------------
## take a second pass to finish the shaping in a sorted manner
## -------------------
reshaped = [
    {
        "user_id": user_id,
        "slots_ids": sorted(reshaped[user_id])
    }
    for user_id
    in sorted(reshaped)
]
## -------------------
print(reshaped)

That will give you:

[
{'user_id': 1, 'slots_ids': [551, 552, 553]},
{'user_id': 2, 'slots_ids': [550]},
{'user_id': 3, 'slots_ids': [524, 525]}
]
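
The collections.defaultdict variant mentioned above is not spelled out in this answer, so here is a minimal sketch of what it might look like. It assumes records shaped like the question's slots_data (shortened to a few entries here), and the name grouped is only illustrative.

```python
from collections import defaultdict

# Sample records in the same shape as the question's slots_data (shortened)
slots_data = [
    {"id": 551, "user_id": 1, "time": "199322002", "expire": "199322002"},
    {"id": 525, "user_id": 3, "time": "199322002", "expire": "199322002"},
    {"id": 524, "user_id": 3, "time": "199322002", "expire": "199322002"},
    {"id": 553, "user_id": 1, "time": "199322002", "expire": "199322002"},
    {"id": 550, "user_id": 2, "time": "199322002", "expire": "199322002"},
]

# A missing key starts out as an empty list, so no setdefault call is needed
grouped = defaultdict(list)
for slot in slots_data:
    grouped[slot["user_id"]].append(slot["id"])

# Second pass: build the list of dicts, sorted by user id and by slot id
reshaped = [
    {"user_id": user_id, "slots_ids": sorted(slot_ids)}
    for user_id, slot_ids in sorted(grouped.items())
]
print(reshaped)
```

The only real difference from the setdefault version is where the default list comes from; the second, sorting pass is the same idea.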

I would suggest using pandas to group the user ids together and then convert the result back to a list of dictionaries:

import pandas as pd

pd.DataFrame(slots_data).groupby('user_id')['id'].agg(list).reset_index().to_dict('records')
# [{'user_id': 1, 'id': [551, 552, 553]},
#  {'user_id': 2, 'id': [550]},
#  {'user_id': 3, 'id': [525, 524]}]

You can also do it with just a simple loop:

>>> result = {}
>>> for i in slots_data:
...     if i['user_id'] not in result:
...             result[i['user_id']] = []
...     result[i['user_id']].append(i['id'])
... 
>>> output = []
>>> for i in result:
...     dict_obj = dict(user_id=i, slots_id=result[i])
...     output.append(dict_obj)
... 
>>> output
[{'user_id': 1, 'slots_id': [551, 552, 553]}, {'user_id': 3, 'slots_id': [525, 524]}, {'user_id': 2, 'slots_id': [550]}]

You can do it with the following approach. Pure Python, no dependencies.

slots_data = [
    {
        "id":551,
        "user_id":1,
        "time":"199322002",
        "expire":"199322002"
    },
    {
        "id":552,
        "user_id":1,
        "time":"199322002",
        "expire":"199322002"
    },
    {
        "id":525,
        "user_id":3,
        "time":"199322002",
        "expire":"199322002"
    },
    {
        "id":524,
        "user_id":3,
        "time":"199322002",
        "expire":"199322002"
    },
    {
        "id":553,
        "user_id":1,
        "time":"199322002",
        "expire":"199322002"
    },
    {
        "id":550,
        "user_id":2,
        "time":"199322002",
        "expire":"199322002"
    }
]
user_wise_slots = {}
for slot_detail in slots_data:
    # Create an entry the first time a user_id is seen
    if slot_detail["user_id"] not in user_wise_slots:
        user_wise_slots[slot_detail["user_id"]] = {
            "user_id": slot_detail["user_id"],
            "slot_ids": []
        }
    user_wise_slots[slot_detail["user_id"]]["slot_ids"].append(slot_detail["id"])
# Wrap in list() so a plain list is printed rather than a dict_values view
print(list(user_wise_slots.values()))

This can be done in a single list comprehension statement:

final_list = [{"user_id": user_id, "id":sorted([slot["id"] for slot in slots_data if slot["user_id"] == user_id])} for user_id in sorted(set([slot["user_id"] for slot in slots_data]))]

A more verbose, better-formatted version of the same code:

all_user_ids = [slot["user_id"] for slot in slots_data]
unique_user_ids = sorted(set(all_user_ids))
final_list = [
    {
        "user_id": user_id,
        "id": sorted([slot["id"] for slot in slots_data if slot["user_id"] == user_id])
    }
    for user_id in unique_user_ids]

Explanation:

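  1. Get all the user ids with a list comprehension.
  2. Get the unique user ids by creating a set.
  3. Create the final list of dictionaries with a list comprehension.
  4. Each id field is itself a list built with a list comprehension. We take the slot's id and add it to the list only when the user id matches.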
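
One thing to keep in mind is that the comprehension above rescans slots_data once for every user. If that ever matters for larger inputs, one possible alternative (not part of the answer above, just a sketch under the same data assumptions) is to sort once and group with itertools.groupby. The name by_user is illustrative, and the sample data is shortened.

```python
from itertools import groupby
from operator import itemgetter

# Sample records in the same shape as the question's slots_data (shortened)
slots_data = [
    {"id": 551, "user_id": 1, "time": "199322002", "expire": "199322002"},
    {"id": 525, "user_id": 3, "time": "199322002", "expire": "199322002"},
    {"id": 524, "user_id": 3, "time": "199322002", "expire": "199322002"},
    {"id": 553, "user_id": 1, "time": "199322002", "expire": "199322002"},
    {"id": 550, "user_id": 2, "time": "199322002", "expire": "199322002"},
]

# groupby only merges adjacent items, so sort by user_id first
by_user = sorted(slots_data, key=itemgetter("user_id"))

# One dict per user; the "id" key matches the comprehension version above
final_list = [
    {"user_id": user_id, "id": sorted(slot["id"] for slot in group)}
    for user_id, group in groupby(by_user, key=itemgetter("user_id"))
]
print(final_list)
```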

This result is easy to achieve with pandas. If you don't have it yet, install pandas first:

pip install pandas

import pandas as pd
df = pd.DataFrame(slots_data)  # create a DataFrame
# group by user_id, collect each group's id values into a list, and name that column slots_ids
df1 = df.groupby("user_id")['id'].apply(list).reset_index(name="slots_ids")
final_slots_data = df1.to_dict('records')  # convert the DataFrame into a list of dictionaries
final_slots_data

Output:

[{'user_id': 1, 'slots_ids': [551, 552, 553]},
{'user_id': 2, 'slots_ids': [550]},
{'user_id': 3, 'slots_ids': [525, 524]}]
