The DataFrame contains two array columns, holding 1D and 2D arrays respectively.
| 1d_col | 2d_col |
| --------------------------| ---------------------------|
| ["negative", "positive"] | [["zero", ""], ["one", ""]]|
| ["positive", "negative"] | [["two", ""], ["zero", ""]]|
| ["negative"] | [["minus", ""]] |
| ["positive"] | [["three", ""]] |
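I need to convert 1d_col into a 2D array with the same dimensions as 2d_col, repeating each value as needed. The output DataFrame should look like this: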
| 2d_col |new_col |
| ---------------------------| ----------------------------------------------------|
| [["zero", ""], ["one", ""]]| [["negative", "negative"], ["positive", "positive"]]|
| [["two", ""], ["zero", ""]]| [["positive", "positive"], ["negative", "negative"]]|
| [["minus", ""]] | [["negative", "negative"]] |
| [["three", ""]] | [["positive", "positive"]] |
With the DataFrame you provided:

    import pandas as pd

    df = pd.DataFrame(
        {
            "1d_col": [
                ["negative", "positive"],
                ["positive", "negative"],
                ["negative"],
                ["positive"],
            ],
            "2d_col": [
                [["zero", ""], ["one", ""]],
                [["two", ""], ["zero", ""]],
                [["minus", ""]],
                [["three", ""]],
            ],
        }
    )
Here is one way to do it:

    df["new_col"] = df.apply(
        lambda x: [
            [item] * len(x["2d_col"]) if len(x["2d_col"]) >= 2 else [item] * 2
            for item in x["1d_col"]
        ],
        axis=1,
    )
    df = df.drop(columns=["1d_col"])
    print(df)
    # Output
                    2d_col                                       new_col
    0  [[zero, ], [one, ]]  [[negative, negative], [positive, positive]]
    1  [[two, ], [zero, ]]  [[positive, positive], [negative, negative]]
    2          [[minus, ]]                        [[negative, negative]]
    3          [[three, ]]                        [[positive, positive]]
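Note that the lambda above takes its repeat count from the number of inner lists in 2d_col and falls back to a hard-coded 2. That matches the sample data, but it would not follow inner lists of a different length. As a possible alternative, here is a minimal sketch that repeats each label to the length of its corresponding inner list. The helper name `broadcast_labels` is hypothetical. The sketch assumes each 1d_col entry has as many labels as the matching 2d_col entry has inner lists, since `zip` silently truncates otherwise.

```python
def broadcast_labels(labels, nested):
    """Repeat each label to match the length of the corresponding inner list.

    Assumes len(labels) == len(nested); zip() truncates to the shorter one.
    """
    return [[label] * len(inner) for label, inner in zip(labels, nested)]


# Plain-list stand-ins for the two columns from the question.
col_1d = [
    ["negative", "positive"],
    ["positive", "negative"],
    ["negative"],
    ["positive"],
]
col_2d = [
    [["zero", ""], ["one", ""]],
    [["two", ""], ["zero", ""]],
    [["minus", ""]],
    [["three", ""]],
]

# Build the new column row by row.
new_col = [broadcast_labels(labels, nested) for labels, nested in zip(col_1d, col_2d)]

for row in new_col:
    print(row)
```

Applied to the DataFrame itself, the same helper could be used as `df["new_col"] = [broadcast_labels(a, b) for a, b in zip(df["1d_col"], df["2d_col"])]`, followed by `df.drop(columns=["1d_col"])`.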