Convert a 1d array column in a pandas dataframe to a 2d array matching the length of another column



The dataframe contains two array columns: one holds 1d arrays and the other holds 2d arrays.

| 1d_col                    | 2d_col                     |
| --------------------------| ---------------------------|
| ["negative", "positive"]  | [["zero", ""], ["one", ""]]|
| ["positive", "negative"]  | [["two", ""], ["zero", ""]]|
| ["negative"]              | [["minus", ""]]            |
| ["positive"]              | [["three", ""]]            |

I need to convert 1d_col into a 2d array with the same shape as 2d_col. The output dataframe should look like this.

| 2d_col                     | new_col                                             |
| ---------------------------| ----------------------------------------------------|
| [["zero", ""], ["one", ""]]| [["negative", "negative"], ["positive", "positive"]]|
| [["two", ""], ["zero", ""]]| [["positive", "positive"], ["negative", "negative"]]|
| [["minus", ""]]            | [["negative", "negative"]]                          |
| [["three", ""]]            | [["positive", "positive"]]                          |

Using the dataframe you provided:

    import pandas as pd

    df = pd.DataFrame(
        {
            "1d_col": [
                ["negative", "positive"],
                ["positive", "negative"],
                ["negative"],
                ["positive"],
            ],
            "2d_col": [
                [["zero", ""], ["one", ""]],
                [["two", ""], ["zero", ""]],
                [["minus", ""]],
                [["three", ""]],
            ],
        }
    )

Here is one way to do it:

    df["new_col"] = df.apply(
        lambda x: [
            [item] * len(x["2d_col"]) if len(x["2d_col"]) >= 2 else [item] * 2
            for item in x["1d_col"]
        ],
        axis=1,
    )
    df = df.drop(columns=["1d_col"])
    print(df)

    # Output
    #                 2d_col                                       new_col
    # 0  [[zero, ], [one, ]]  [[negative, negative], [positive, positive]]
    # 1  [[two, ], [zero, ]]  [[positive, positive], [negative, negative]]
    # 2          [[minus, ]]                        [[negative, negative]]
    # 3          [[three, ]]                        [[positive, positive]]
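
Note that the approach above sizes each inner list by the number of entries in `2d_col` (falling back to 2 for single-entry rows), which happens to match the sample data because every inner list there has two elements. If the inner lists could have other lengths, a variant that pairs each label with its matching inner list may be safer. The following is a minimal sketch under that assumption; the helper name `expand_like` and the three-element sample rows are hypothetical and not part of the original question.

```python
import pandas as pd

# Hypothetical sample: the first row has inner lists of length 3,
# which the outer-length rule above would not reproduce.
df = pd.DataFrame(
    {
        "1d_col": [["negative", "positive"], ["positive"]],
        "2d_col": [[["zero", "", "x"], ["one", "", "y"]], [["three", ""]]],
    }
)


def expand_like(labels, template):
    """Repeat each label once per element of the matching inner list."""
    if len(labels) != len(template):
        raise ValueError("1d and 2d entries must have the same outer length")
    return [[label] * len(inner) for label, inner in zip(labels, template)]


# Build the new column row by row with a plain list comprehension.
df["new_col"] = [
    expand_like(labels, template)
    for labels, template in zip(df["1d_col"], df["2d_col"])
]
print(df["new_col"].tolist())
```

Because each label is paired with its own inner list through `zip`, the result follows the shape of `2d_col` without the special case for single-entry rows.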
