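I have multiple files (CSV) with foreign key relationships. One file refers to a page by a number such as 123, and a separate file maps that number to "/homepage". The mapping file is not in order, and its keys are not zero-based.
I can't work out from the documentation how to use astype('category') together with a lookup dict or something similar.
Any help?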
import pandas as pd

# the file with the foreign keys (the lookup table)
lookup_df = pd.DataFrame({'page': [123, 2, 3], 'name': ['/homepage', '/search', '/checkout']})
# the file with the pages
df1 = pd.DataFrame({'pages': [2, 3, 123]})
# wanted: df1 with a categorical 'pages' column
#        pages
# 0    /search
# 1  /checkout
# 2  /homepage
# but instead, of course, I get
#    pages
# 0      2
# 1      3
# 2    123
You could try this:
from functools import cache

import pandas as pd

lookup_df = pd.DataFrame(
    {"page": [123, 2, 3], "name": ["/homepage", "/search", "/checkout"]}
)
df1 = pd.DataFrame({"pages": [2, 3, 123, 4, 3, 123, 2]})


@cache
def match(value):
    # Look up the name for a page id; @cache avoids repeating the search
    # for ids that have already been seen.
    try:
        return lookup_df.loc[lookup_df["page"] == value, "name"].values[0]
    except IndexError:
        # No row matched, so .values is empty.
        return "/unknown"


df1["name"] = df1["pages"].apply(match)
print(df1)
# Output
   pages       name
0      2    /search
1      3  /checkout
2    123  /homepage
3      4   /unknown
4      3  /checkout
5    123  /homepage
6      2    /search
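
Since the question asks for a categorical column, a vectorized alternative may also be worth trying. The following is only a sketch, assuming the ids in lookup_df["page"] are unique; page_to_name is a name introduced here for illustration and is not part of the original code. It turns the lookup table into a Series indexed by page id, maps df1["pages"] through it, fills unmatched ids with "/unknown", and then converts the result with astype("category").

```python
import pandas as pd

lookup_df = pd.DataFrame(
    {"page": [123, 2, 3], "name": ["/homepage", "/search", "/checkout"]}
)
df1 = pd.DataFrame({"pages": [2, 3, 123, 4, 3, 123, 2]})

# Hypothetical helper name: a Series indexed by page id, used as the lookup table.
# Assumes page ids in lookup_df are unique.
page_to_name = lookup_df.set_index("page")["name"]

# Series.map looks up each id in page_to_name; ids that are missing become NaN,
# so they are filled with "/unknown" before converting to a categorical dtype.
df1["name"] = (
    df1["pages"].map(page_to_name).fillna("/unknown").astype("category")
)

print(df1)
print(df1["name"].dtype)
```

This avoids the per-row Python function call, which may matter for large files. If the ids should be replaced rather than kept alongside the names, the result can be assigned back to df1["pages"] instead of a new column.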