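How can I repeat the first row n times before a DataFrame, and repeat the last row at the end of it, where n is the length of the pandas DataFrame?
I have a pandas DataFrame: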
PT011 PT012 PT013 PT014 PT015 PT021 PT022 PT023 PT024 PT025
0 -0.160 -0.10 -0.10 -0.09 -0.10 -0.10 -0.10 -0.10 -0.10 -0.20
1 -0.162 -0.12 -0.12 -0.10 -0.12 -0.12 -0.12 -0.12 -0.12 -0.12
2 -0.164 -0.14 -0.14 -0.11 -0.14 -0.14 -0.14 -0.14 -0.14 -0.14
3 -0.166 -0.16 -0.16 -0.11 -0.16 -0.16 -0.16 -0.16 -0.16 -0.16
4 -0.167 -0.15 -0.15 -0.13 -0.15 -0.15 -0.15 -0.15 -0.15 -0.15
I tried:
import pandas as pd
import numpy as np
probes = {'PT011': [-0.16,-0.162,-0.164,-0.166,-0.167],
'PT012': [-0.1,-0.12,-0.14,-0.16,-0.15],
'PT013': [-0.1,-0.12,-0.14,-0.16,-0.15],
'PT014': [-0.09,-0.10,-0.11,-0.11,-0.13],
'PT015': [-0.1,-0.12,-0.14,-0.16,-0.15],
'PT021': [-0.1,-0.12,-0.14,-0.16,-0.15],
'PT022': [-0.1,-0.12,-0.14,-0.16,-0.15],
'PT023': [-0.1,-0.12,-0.14,-0.16,-0.15],
'PT024': [-0.1,-0.12,-0.14,-0.16,-0.15],
'PT025': [-0.2,-0.12,-0.14,-0.16,-0.15]
}
df = pd.DataFrame(probes,columns= ['PT011', 'PT012','PT013','PT014','PT015','PT021','PT022','PT023','PT024','PT025'])
print(df)
new_df=df.iloc[np.arange(len(df)).repeat([5,1,1,1,1])]
print("Repeated dataframe:\n", new_df)
This gives the output:
Repeated dataframe:
PT011 PT012 PT013 PT014 PT015 PT021 PT022 PT023 PT024 PT025
0 -0.160 -0.10 -0.10 -0.09 -0.10 -0.10 -0.10 -0.10 -0.10 -0.20
0 -0.160 -0.10 -0.10 -0.09 -0.10 -0.10 -0.10 -0.10 -0.10 -0.20
0 -0.160 -0.10 -0.10 -0.09 -0.10 -0.10 -0.10 -0.10 -0.10 -0.20
0 -0.160 -0.10 -0.10 -0.09 -0.10 -0.10 -0.10 -0.10 -0.10 -0.20
0 -0.160 -0.10 -0.10 -0.09 -0.10 -0.10 -0.10 -0.10 -0.10 -0.20
1 -0.162 -0.12 -0.12 -0.10 -0.12 -0.12 -0.12 -0.12 -0.12 -0.12
2 -0.164 -0.14 -0.14 -0.11 -0.14 -0.14 -0.14 -0.14 -0.14 -0.14
3 -0.166 -0.16 -0.16 -0.11 -0.16 -0.16 -0.16 -0.16 -0.16 -0.16
4 -0.167 -0.15 -0.15 -0.13 -0.15 -0.15 -0.15 -0.15 -0.15 -0.15
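However, this solution is not flexible, because the length of the data varies and is not always 5. Do you have any ideas for a better, more flexible one-liner?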
Use:
a = np.ones(len(df), dtype=int)
# to repeat only the first row, so that it appears len(df) times in total:
# a[0] = len(df)
# to repeat both the first and the last row, so that each appears len(df) times in total:
a[[0, -1]] = len(df)
print(a)
[5 1 1 1 5]
new_df = df.iloc[np.arange(len(df)).repeat(a)]
print("Repeated dataframe:\n", new_df)
PT011 PT012 PT013 PT014 PT015 PT021 PT022 PT023 PT024 PT025
0 -0.160 -0.10 -0.10 -0.09 -0.10 -0.10 -0.10 -0.10 -0.10 -0.20
0 -0.160 -0.10 -0.10 -0.09 -0.10 -0.10 -0.10 -0.10 -0.10 -0.20
0 -0.160 -0.10 -0.10 -0.09 -0.10 -0.10 -0.10 -0.10 -0.10 -0.20
0 -0.160 -0.10 -0.10 -0.09 -0.10 -0.10 -0.10 -0.10 -0.10 -0.20
0 -0.160 -0.10 -0.10 -0.09 -0.10 -0.10 -0.10 -0.10 -0.10 -0.20
1 -0.162 -0.12 -0.12 -0.10 -0.12 -0.12 -0.12 -0.12 -0.12 -0.12
2 -0.164 -0.14 -0.14 -0.11 -0.14 -0.14 -0.14 -0.14 -0.14 -0.14
3 -0.166 -0.16 -0.16 -0.11 -0.16 -0.16 -0.16 -0.16 -0.16 -0.16
4 -0.167 -0.15 -0.15 -0.13 -0.15 -0.15 -0.15 -0.15 -0.15 -0.15
4 -0.167 -0.15 -0.15 -0.13 -0.15 -0.15 -0.15 -0.15 -0.15 -0.15
4 -0.167 -0.15 -0.15 -0.13 -0.15 -0.15 -0.15 -0.15 -0.15 -0.15
4 -0.167 -0.15 -0.15 -0.13 -0.15 -0.15 -0.15 -0.15 -0.15 -0.15
4 -0.167 -0.15 -0.15 -0.13 -0.15 -0.15 -0.15 -0.15 -0.15 -0.15
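The counts-array idea above can be wrapped in a small helper so it works for any length. This is only a sketch: the name `repeat_edges` and its `n` parameter are hypothetical, and it assumes that "n times" means the first and last rows each appear n times in total (original row included), defaulting to `len(df)`.

```python
import numpy as np
import pandas as pd


def repeat_edges(df, n=None):
    """Return df with its first and last rows each appearing n times in total.

    Hypothetical helper; n defaults to len(df).
    """
    if df.empty:
        # Nothing to repeat in an empty frame.
        return df.copy()
    if n is None:
        n = len(df)
    # One repeat count per row: n for the two edge rows, 1 for the rest.
    counts = np.ones(len(df), dtype=int)
    counts[[0, -1]] = n
    # Repeat the row positions, then select them positionally.
    return df.iloc[np.arange(len(df)).repeat(counts)]


df = pd.DataFrame({"PT011": [-0.160, -0.162, -0.164],
                   "PT012": [-0.10, -0.12, -0.14]})
out = repeat_edges(df)
print(out.index.tolist())  # [0, 0, 0, 1, 2, 2, 2]
```

Because the selection goes through `iloc`, the original index labels are kept, so the repeated rows share the labels of the first and last rows.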
If all values are numeric, you can also use the following, which builds a new DataFrame with a default index:
new_df = pd.DataFrame(np.repeat(df.values, a, axis=0), columns=df.columns)
print("Repeated dataframe:\n", new_df)
PT011 PT012 PT013 PT014 PT015 PT021 PT022 PT023 PT024 PT025
0 -0.160 -0.10 -0.10 -0.09 -0.10 -0.10 -0.10 -0.10 -0.10 -0.20
1 -0.160 -0.10 -0.10 -0.09 -0.10 -0.10 -0.10 -0.10 -0.10 -0.20
2 -0.160 -0.10 -0.10 -0.09 -0.10 -0.10 -0.10 -0.10 -0.10 -0.20
3 -0.160 -0.10 -0.10 -0.09 -0.10 -0.10 -0.10 -0.10 -0.10 -0.20
4 -0.160 -0.10 -0.10 -0.09 -0.10 -0.10 -0.10 -0.10 -0.10 -0.20
5 -0.162 -0.12 -0.12 -0.10 -0.12 -0.12 -0.12 -0.12 -0.12 -0.12
6 -0.164 -0.14 -0.14 -0.11 -0.14 -0.14 -0.14 -0.14 -0.14 -0.14
7 -0.166 -0.16 -0.16 -0.11 -0.16 -0.16 -0.16 -0.16 -0.16 -0.16
8 -0.167 -0.15 -0.15 -0.13 -0.15 -0.15 -0.15 -0.15 -0.15 -0.15
9 -0.167 -0.15 -0.15 -0.13 -0.15 -0.15 -0.15 -0.15 -0.15 -0.15
10 -0.167 -0.15 -0.15 -0.13 -0.15 -0.15 -0.15 -0.15 -0.15 -0.15
11 -0.167 -0.15 -0.15 -0.13 -0.15 -0.15 -0.15 -0.15 -0.15 -0.15
12 -0.167 -0.15 -0.15 -0.13 -0.15 -0.15 -0.15 -0.15 -0.15 -0.15
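The "all values are numeric" condition matters because `df.values` has to fit every column into a single array dtype. As a hedged illustration (the `label` column is made up for this example), the sketch below compares the `np.repeat` route with the `iloc` route followed by `reset_index(drop=True)`, which should give the same 0..N-1 index while keeping each column's dtype.

```python
import numpy as np
import pandas as pd

# Mixed-type frame; the "label" column is invented for illustration only.
df = pd.DataFrame({"PT011": [-0.160, -0.162, -0.164],
                   "label": ["a", "b", "c"]})

a = np.ones(len(df), dtype=int)
a[[0, -1]] = len(df)  # first and last rows appear len(df) times in total

# Route 1: goes through a single NumPy array, so the float column loses its dtype.
via_values = pd.DataFrame(np.repeat(df.values, a, axis=0), columns=df.columns)

# Route 2: positional selection keeps per-column dtypes; reset_index renumbers rows.
via_iloc = df.iloc[np.arange(len(df)).repeat(a)].reset_index(drop=True)

print(via_values.dtypes["PT011"])
print(via_iloc.dtypes["PT011"])  # float64
```

For purely numeric data both routes give the same values, so the choice mostly comes down to whether the original index labels should be kept.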