I have the following DataFrame:
    date    wind (°)   wind (kt)  temp (C°)  humidity (%)  current (°)  current (kt)  stemp (C°)  sea_temp_diff  wind_distance_diff  wind_speed_diff    temp_diff  humidity_diff  current_distance_diff  current_speed_diff
 8 12018  175.000000   16.333333  25.500000     82.500000    60.000000      0.100000   25.400000      -1.066667           23.333333        -0.500000    -0.333333     -12.000000             160.000000        6.666667e-02
 9 12019  180.000000   17.000000  23.344828     79.724138   230.000000      0.100000   23.827586      -0.379310           22.068966         1.068966     0.827586      -7.275862             315.172414        3.449034e+02
   12020  365.000000  208.653846  24.192308     79.346154   355.769231    192.500000   24.730769     574.653846         1121.923077      1151.153846  1149.346154     -19.538462            1500.000000        1.538454e+03
14 22019  530.357143  372.964286  23.964286     81.964286  1270.714286   1071.560714  735.642857    -533.642857         -327.500000      -356.892857     1.857143     -10.321429            -873.571429       -8.928107e+02
15 22020  216.551724   12.689655  24.517241     81.137931   288.275862    172.565517  196.827586    -171.379310           -8.965517         3.724138     1.413793      -7.137931            -105.517241       -1.722724e+02
   32019  323.225806  174.709677  25.225806     80.741935   260.000000    161.451613   25.709677     480.709677          486.451613       483.967742     0.387097     153.193548            1044.516129        9.677065e+02
   32020  351.333333  178.566667  25.533333     78.800000   427.666667    166.666667   26.600000     165.533333         -141.000000      -165.766667   166.633333     158.933333               8.333333        1.500000e-01
18 42017  180.000000   14.000000  27.000000   5000.000000   200.000000      0.400000   25.400000       2.600000           20.000000        -4.000000     0.000000       0.000000             -90.000000       -1.000000e-01
19 42019  694.230769  589.769231  24.038462     69.461538   681.153846    577.046154   26.884615      -1.346154           37.307692        -1.692308     1.500000       4.769231              98.846154        1.538462e-01
   42020  306.666667  180.066667  24.733333     75.166667   427.666667    166.666667   26.800000     165.066667          205.333333       165.200000     1.100000      -4.066667             360.333333        3.334233e+02
21 52017  146.333333   11.966667  22.900000   5000.000000   116.333333      0.410000   26.066667      -1.553333            8.666667         0.833333    -0.766667       0.000000              95.000000       -1.300000e-01
22 52019  107.741935   12.322581  23.419355     63.032258   129.354839      0.332258   25.935484      -1.774194           14.838710         0.096774    -0.612903     -14.451613             130.967742
Starting from a DataFrame shaped like yours:
>>> import pandas as pd
>>> df = pd.DataFrame({'id': [1, 2, 3, 4],
...                    'date': ['1 42018', '12 32019', '8 112020', '23 42021']},
...                   index=[0, 1, 2, 3])
>>> df
   id      date
0   1   1 42018
1   2  12 32019
2   3  8 112020
3   4  23 42021
We can split the date column on the space and take the first piece to get the day:
>>> df['day'] = df['date'].str.split(' ', expand=True)[0]
>>> df
   id      date day
0   1   1 42018   1
1   2  12 32019  12
2   3  8 112020   8
3   4  23 42021  23
Then take the last 4 characters of the date column to get the year, which gives the expected result:
>>> df['year'] = df['date'].str[-4:].astype(int)
>>> df
   id      date day  year
0   1   1 42018   1  2018
1   2  12 32019  12  2019
2   3  8 112020   8  2020
3   4  23 42021  23  2021
Bonus: as asked in the comments, you can even get the month using the same principle:
>>> df['month'] = df['date'].str.split(' ', expand=True)[1].str[:-4].astype(int)
>>> df
   id      date day  year  month
0   1   1 42018   1  2018      4
1   2  12 32019  12  2019      3
2   3  8 112020   8  2020     11
3   4  23 42021  23  2021      4
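
As a possible alternative, the three pieces can also be pulled out in one pass with a regular expression. This is only a sketch, not part of the approach above: it assumes every date string has the form "<day> <month><year>" with a 4-digit year, and the names parts and parsed are made up for illustration.

```python
import pandas as pd

# Same sample data as above; each 'date' string is "<day> <month><year>".
df = pd.DataFrame({'id': [1, 2, 3, 4],
                   'date': ['1 42018', '12 32019', '8 112020', '23 42021']})

# One pattern with named groups: 1-2 day digits, a space,
# 1-2 month digits, then exactly 4 year digits (assumed format).
pattern = r'^(?P<day>\d{1,2}) (?P<month>\d{1,2})(?P<year>\d{4})$'
parts = df['date'].str.extract(pattern)

# str.extract returns strings, so convert to integers before joining.
df = df.join(parts.astype(int))

# Optionally assemble a real datetime column from year/month/day
# ('parsed' is a hypothetical column name).
df['parsed'] = pd.to_datetime(df[['year', 'month', 'day']])
```

Unlike the split version, day comes out as an integer here. Note that a string which does not match the pattern yields NaN from str.extract, and astype(int) would then raise, so irregular rows would need to be handled first.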