Python Pandas: replace NaN in one column with the value from the next row of another column


name  ID    gender
John  123   male
Scot  na    na
124   male  na
Jill  231   female

I think you are looking for bfill.

Here is an example: https://www.geeksforgeeks.org/python-pandas-series-bfill/

So this should do it:

df['ID'] = df['ID'].bfill()
df['gender'] = df['gender'].bfill()

Or, if you don't need to be selective, you can run it on the whole DataFrame:

df = df.bfill()
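
For reference, `bfill` fills each NaN with the next valid value further down the same column. The sketch below runs it on the sample data (rebuilt here from the test data given further down, so the frame itself is an assumption about the asker's real data). On this input the `124` that sits in the `name` column is not moved into `ID`, which is why a realignment step may still be needed.

```python
import numpy as np
import pandas as pd

# Sample frame rebuilt from the test data in this thread (assumed layout).
df = pd.DataFrame({
    "name": ["John", "Scot", "124", "Jill"],
    "ID": ["123", np.nan, "male", "231"],
    "gender": ["male", np.nan, np.nan, "female"],
})

# Backward fill: every NaN takes the next non-missing value in its own column.
filled = df.bfill()
print(filled)
```

Here the missing `ID` for Scot becomes `male` and the missing `gender` values become `female`, because those are the next valid entries in their columns.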

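It would probably be easier to fix this by changing how the data is loaded in the first place, since there seem to be stray line breaks in it. However, you can also repair it after the fact like this:

Test data: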

import pandas as pd
import numpy as np
df = pd.DataFrame({'name': {0: 'John', 1: 'Scot', 2: '124', 3: 'Jill'},
'ID': {0: '123', 1: np.nan, 2: 'male', 3: '231'},
'gender': {0: 'male', 1: np.nan, 2: np.nan, 3: 'female'}})

Code:

# m marks the rows whose ID is missing; m2 marks the row right after each of
# them, which holds the displaced values (shifted one column to the left)
m = df['ID'].isna()
m2 = m.shift(fill_value=False)
# take the displaced rows, using every column except the last
df2 = df[df.columns[:-1]][m2].copy()
# move the column labels one to the right and the index one row up so the
# values line up with the gaps in the original dataframe
df2.columns = df.columns[1:]
df2.index = df2.index - 1
# fill the gaps from df2, then drop the now redundant displaced rows
result = df.fillna(df2)[~m2]
print(result)

Output:

#    name   ID  gender
# 0  John  123    male
# 1  Scot  124    male
# 3  Jill  231  female
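
As for fixing the problem at load time: the question does not show the raw file, so the following is only a sketch under stated assumptions. It assumes the source is whitespace-separated text, that no field is empty or contains spaces, and that every record has exactly three fields, with a stray line break splitting one record in two. The `raw` string is a hypothetical stand-in for the real file contents.

```python
import pandas as pd

# Hypothetical raw text: the "Scot" record is broken across two lines.
raw = """name ID gender
John 123 male
Scot
124 male
Jill 231 female
"""

# Ignore line breaks entirely and treat the text as one stream of tokens.
tokens = raw.split()
n_cols = 3  # assumed number of fields per record
header, values = tokens[:n_cols], tokens[n_cols:]

# Regroup the tokens into records of n_cols fields each.
rows = [values[i:i + n_cols] for i in range(0, len(values), n_cols)]
df = pd.DataFrame(rows, columns=header)
print(df)
```

If the real data has empty fields or values containing spaces, this regrouping would misalign records, and the post-load repair above is the safer route.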
