name | ID | gender
---|---|---
John | 123 | male
Scot | na | na
124 | male | na
Jill | 231 | female
I think you are looking for bfill.
Here is an example: https://www.geeksforgeeks.org/python-pandas-series-bfill/
So this should do it:
df['ID'] = df['ID'].bfill()
df['gender'] = df['gender'].bfill()
Or, if you don't need to be selective, you can run it on the whole dataframe:
df = df.bfill()
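To see what bfill does on its own, here is a minimal sketch. The values are made up for illustration and are not taken from the question. Each NaN takes the next valid value below it, and a trailing NaN with nothing after it stays NaN:

```python
import numpy as np
import pandas as pd

# hypothetical values, chosen only to show the fill direction
s = pd.Series([np.nan, 1.0, np.nan, np.nan, 4.0, np.nan])

# every NaN is replaced by the next non-NaN value further down
filled = s.bfill()
print(filled)
```

Note that bfill only copies values upward within a column. It does not move values between columns, so it is worth checking the result against your data.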
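It would probably be easier to fix this by changing how the data is loaded in the first place, since there seem to be stray line breaks in the source. However, you can do it like this:
Test data: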
import pandas as pd
import numpy as np
df = pd.DataFrame({'name': {0: 'John', 1: 'Scot', 2: '124', 3: 'Jill'},
'ID': {0: '123', 1: np.nan, 2: 'male', 3: '231'},
'gender': {0: 'male', 1: np.nan, 2: np.nan, 3: 'female'}})
Code:
# find out which rows are valid (m) and which contain the offset data (m2)
m = df['ID'].isna()
m2 = m.shift(fill_value=False)
# create a separate dataframe, only containing the relevant row and columns for filling nan values
df2 = df[df.columns[:-1]][m2].copy()
# harmonize the index and column names so it fits the original dataframe
df2.columns = df.columns[1:]
df2.index = df2.index-1
# fill the NaN values from the newly created dataframe, then drop the leftover offset rows
result = df.fillna(df2)[~m2]
print(result)
Output:
# name ID gender
# 0 John 123 male
# 1 Scot 124 male
# 3 Jill 231 female
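If you need this more than once, the same steps could be wrapped in a function. This is only a sketch: the name realign_wrapped_rows and the key parameter are my own, and it assumes each broken row spills over into exactly one following row, shifted one column to the left, as in the test data above. It also resets the index at the end, which the code above does not do.

```python
import numpy as np
import pandas as pd


def realign_wrapped_rows(df: pd.DataFrame, key: str = "ID") -> pd.DataFrame:
    """Hypothetical helper: merge a spilled-over row back into the row above it."""
    # rows with a missing key are assumed to continue on the next row
    broken = df[key].isna()
    # the row directly after a broken row holds the shifted values
    spill = broken.shift(fill_value=False)
    # take the spilled values (all but the last column) and relabel them
    # one column to the right and one row up
    patch = df.loc[spill, df.columns[:-1]].copy()
    patch.columns = df.columns[1:]
    patch.index = patch.index - 1
    # fill the gaps, drop the spill rows, and renumber the result
    return df.fillna(patch)[~spill].reset_index(drop=True)


df = pd.DataFrame({'name': ['John', 'Scot', '124', 'Jill'],
                   'ID': ['123', np.nan, 'male', '231'],
                   'gender': ['male', np.nan, np.nan, 'female']})
fixed = realign_wrapped_rows(df)
print(fixed)
```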