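I have a pandas DataFrame consisting of a date column covering one year and a column of daily data to go with it. I want to update the year only for the rows that fall in January. I can select the January subset of the DataFrame, and I tried to change the year of that subset following the answer given here, but when I try to update the subset's values by adding an offset, I get an error.
Setup: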
import pandas as pd
df = pd.DataFrame({'Date': pd.date_range(start = "01-01-2023", end = "12-31-2023"), 'data': 25})
Select the January subset:
df[df['Date'].dt.month == 1]
This works as expected:
Date data
0 2023-01-01 25
1 2023-01-02 25
2 2023-01-03 25
3 2023-01-04 25
4 2023-01-05 25
5 2023-01-06 25
6 2023-01-07 25
7 2023-01-08 25
8 2023-01-09 25
9 2023-01-10 25
10 2023-01-11 25
11 2023-01-12 25
12 2023-01-13 25
13 2023-01-14 25
14 2023-01-15 25
15 2023-01-16 25
16 2023-01-17 25
17 2023-01-18 25
18 2023-01-19 25
19 2023-01-20 25
20 2023-01-21 25
21 2023-01-22 25
22 2023-01-23 25
23 2023-01-24 25
24 2023-01-25 25
25 2023-01-26 25
26 2023-01-27 25
27 2023-01-28 25
28 2023-01-29 25
29 2023-01-30 25
30 2023-01-31 25
Attempting the change:
df[df['Date'].dt.month == 1] = df[df['Date'].dt.month == 1] + pd.offsets.DateOffset(years=1)
TypeError: Concatenation operation is not implemented for NumPy arrays, use np.concatenate() instead. Please do not rely on this error; it may not be given on all Python implementations.
I have tried a few variations, but there seems to be a problem with modifying the data of a DataFrame subset.
You have to select the Date column (solution improved by @mozway, thanks):
df.loc[df['Date'].dt.month == 1, 'Date'] += pd.offsets.DateOffset(years=1)
print(df)
# Output
Date data
0 2024-01-01 25
1 2024-01-02 25
2 2024-01-03 25
3 2024-01-04 25
4 2024-01-05 25
.. ... ...
360 2023-12-27 25
361 2023-12-28 25
362 2023-12-29 25
363 2023-12-30 25
364 2023-12-31 25
[365 rows x 2 columns]
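For anyone who wants to check the result, here is a minimal self-contained sketch of the same approach. The original attempt presumably fails because the boolean selection returns both columns, so the offset is also added to the integer `data` column; restricting the assignment to `Date` with `.loc` avoids that. The names `is_january`, `january_years` and `other_years` are my own and not part of the original code.

```python
import pandas as pd

# Same setup as in the question, with ISO-formatted dates.
df = pd.DataFrame({
    "Date": pd.date_range(start="2023-01-01", end="2023-12-31"),
    "data": 25,
})

# Boolean mask for the January rows (hypothetical helper name).
is_january = df["Date"].dt.month == 1

# Shift only the Date column of those rows by one year.
df.loc[is_january, "Date"] += pd.DateOffset(years=1)

# Collect the years on each side of the mask to confirm the update.
january_years = df.loc[is_january, "Date"].dt.year.unique().tolist()
other_years = df.loc[~is_january, "Date"].dt.year.unique().tolist()
print(january_years, other_years)
```

Computing the mask once and reusing it keeps the selection and the check consistent, since the month of each row does not change when a year is added.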