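I have a pandas DataFrame with a date index. I can update one of its columns over a given date range, but once I add a new column, the same update no longer takes effect (the third printout below still shows 20.0 instead of 40.0).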
import pandas as pd
df = pd.read_csv('https://pylie.com/dl/readings/bikes-nyc-pylie.csv', index_col=0)
df.index = pd.to_datetime(df.index)
print(df.tail(5))
df.loc['2016-10-29':]['temperature'] = 20.0
print(df.tail(5))
df['temperature_f'] = df['temperature'].copy()
df.loc['2016-10-29':]['temperature'] = 40.0
print(df.tail(5))
Output:
            temperature  precipitation  brooklyn  manhattan  williamsburg  queensboro
date
2016-10-27          8.6          35.81       651       1558          2137        1902
2016-10-28          7.5           0.00      2021       3872          4271        3202
2016-10-29         10.6           0.00      1639       3160          4027        2920
2016-10-30         19.1          14.22      1702       2971          3531        2547
2016-10-31          9.4           0.00      2648       4876          5440        3720

            temperature  precipitation  brooklyn  manhattan  williamsburg  queensboro
date
2016-10-27          8.6          35.81       651       1558          2137        1902
2016-10-28          7.5           0.00      2021       3872          4271        3202
2016-10-29         20.0           0.00      1639       3160          4027        2920
2016-10-30         20.0          14.22      1702       2971          3531        2547
2016-10-31         20.0           0.00      2648       4876          5440        3720

            temperature  precipitation  brooklyn  manhattan  williamsburg  queensboro  temperature_f
date
2016-10-27          8.6          35.81       651       1558          2137        1902            8.6
2016-10-28          7.5           0.00      2021       3872          4271        3202            7.5
2016-10-29         20.0           0.00      1639       3160          4027        2920           20.0
2016-10-30         20.0          14.22      1702       2971          3531        2547           20.0
2016-10-31         20.0           0.00      2648       4876          5440        3720           20.0
The pandas version is 0.24.1.
Instead, do this:
df.loc['2016-10-29':, 'temperature'] = 20.0
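Put the index selector and the column you are trying to update in the same .loc call. The chained form df.loc[...]['temperature'] first builds an intermediate object that may be a copy, so there is no guarantee that the original DataFrame will be updated. This also applies to updating existing columns.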
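To check the single-call form without downloading the CSV, here is a minimal sketch. It assumes a small stand-in DataFrame holding only the temperature column, with values copied from the output in the question; everything else follows the code above.

```python
import pandas as pd

# Stand-in for the bike-count data: an assumption for illustration,
# using the temperature values shown in the question's first printout.
idx = pd.to_datetime(['2016-10-27', '2016-10-28', '2016-10-29',
                      '2016-10-30', '2016-10-31'])
df = pd.DataFrame({'temperature': [8.6, 7.5, 10.6, 19.1, 9.4]}, index=idx)

# Row slice and column label in one .loc call: the assignment targets df itself.
df.loc['2016-10-29':, 'temperature'] = 20.0

# Add the new column, then update the original column again.
df['temperature_f'] = df['temperature']
df.loc['2016-10-29':, 'temperature'] = 40.0

print(df)
```

With this form the second update should go through as well: temperature ends at 40.0 for the last three days, while temperature_f keeps the 20.0 values it was given when it was created.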
You don't need to use .copy() when adding a new column, so df['temperature_f'] = df['temperature'] is enough.
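A quick way to convince yourself of this is the sketch below. It uses a made-up three-row DataFrame (an assumption for illustration) and relies on column assignment giving the new column its own values, which is how it behaves as far as I know.

```python
import pandas as pd

# Tiny made-up frame, only for illustration.
df = pd.DataFrame({'temperature': [8.6, 7.5, 10.6]})

# Plain column assignment, no .copy().
df['temperature_f'] = df['temperature']

# Overwrite the source column; the new column should be unaffected.
df.loc[:, 'temperature'] = 0.0

print(df)
```

If temperature_f still shows the original readings after the overwrite, the two columns do not share data and the extra .copy() adds nothing.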