I have the following data in Excel:
category size1 size2 size3
cat1 10 20 30
cat2 20 10 15
cat3 30 20 10
I want two reports/Excel outputs, as shown below.
#1)
Category-sizetype-value
cat1 size1 10
cat1 size2 20
cat1 size3 30
cat2 size1 20
...
#2)
Category-size-value-value counts (i.e. how many times a specific size value appears)
cat1 size1 10 3 times
cat1 size2 20 2 times
cat1 size3 30 1 time
cat2 size1 20 4 times
...
Here is the code I have written so far. I would appreciate some pointers on why pd.concat does not work here and cannot ...
import pandas as pd
path_to_file = r'C:\Users\Niru\Desktop\cat-sizes.xlsx'
xl = pd.ExcelFile(path_to_file)
print(xl.sheet_names)
df = xl.parse('Sheet1')
#print(df.head())
print(df.columns)
frames = []
for i in df.columns:
    dfd = "df.loc[:,['Category','" +i+"']]"
    frames.append(dfd)
print(pd.concat(frames))
Your sample data and output confused me a bit, but I think this is what you want.
#Q1:
df1=pd.melt(df, id_vars=['category'], value_vars=['size1','size2','size3'])
Out[66]:
category variable value
0 cat1 size1 10
1 cat2 size1 20
2 cat3 size1 30
3 cat1 size2 20
4 cat2 size2 10
5 cat3 size2 20
6 cat1 size3 30
7 cat2 size3 15
8 cat3 size3 10
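To try the Q1 reshape without the Excel file, here is a minimal sketch that rebuilds the sample table inline. The inline DataFrame, the name `long_df`, and the `sizetype` column name are my assumptions for illustration, not part of the question's workbook. It also sorts the result so each category's rows sit together, which is closer to the layout asked for in report #1.

```python
import pandas as pd

# Sample data from the question, built inline instead of read from Excel.
df = pd.DataFrame({
    "category": ["cat1", "cat2", "cat3"],
    "size1": [10, 20, 30],
    "size2": [20, 10, 20],
    "size3": [30, 15, 10],
})

# Wide -> long: one row per (category, size column) pair.
long_df = pd.melt(
    df,
    id_vars=["category"],
    value_vars=["size1", "size2", "size3"],
    var_name="sizetype",   # hypothetical column name for the report
    value_name="value",
)

# melt orders rows by size column; sort so each category's rows are adjacent.
long_df = long_df.sort_values(["category", "sizetype"]).reset_index(drop=True)
print(long_df)
```

The sort is optional; without it the rows come out in the order shown in Out[66].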
#Q2:
df1['counts']=df1.groupby(['variable','value'])['category'].transform('count')
Out[69]:
category variable value counts
0 cat1 size1 10 1
1 cat2 size1 20 1
2 cat3 size1 30 1
3 cat1 size2 20 2
4 cat2 size2 10 1
5 cat3 size2 20 2
6 cat1 size3 30 1
7 cat2 size3 15 1
8 cat3 size3 10 1
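If a compact table is more useful than a per-row count, the same grouping can be summarised with one row per distinct (size column, value) pair. This is only a sketch under the assumption that "value counts" means occurrences within the same size column; the inline data and the names `long_df` and `summary` are illustrative.

```python
import pandas as pd

# Sample data from the question, built inline instead of read from Excel.
df = pd.DataFrame({
    "category": ["cat1", "cat2", "cat3"],
    "size1": [10, 20, 30],
    "size2": [20, 10, 20],
    "size3": [30, 15, 10],
})

long_df = df.melt(id_vars="category", var_name="variable", value_name="value")

# Per-row count: how often this value occurs within the same size column.
long_df["counts"] = (
    long_df.groupby(["variable", "value"])["category"].transform("count")
)

# Compact summary: one row per distinct (size column, value) pair.
summary = (
    long_df.groupby(["variable", "value"])
    .size()
    .reset_index(name="counts")
)
print(summary)
```

`transform` keeps the original row count, so it suits adding a column; `size` collapses each group to a single row, so it suits a separate report.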
#Or, for Q2:
df1['counts']=df1.groupby(['variable'])['value'].transform('count')
Out[71]:
category variable value counts
0 cat1 size1 10 3
1 cat2 size1 20 3
2 cat3 size1 30 3
3 cat1 size2 20 3
4 cat2 size2 10 3
5 cat3 size2 20 3
6 cat1 size3 30 3
7 cat2 size3 15 3
8 cat3 size3 10 3
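As for why pd.concat failed in the question: the loop appends strings that merely look like code ("df.loc[:,[...]]"), and pd.concat only accepts Series or DataFrame objects, so it raises a TypeError. A loop-based version that stacks real DataFrame slices might look like the sketch below. The inline data and the `sizetype`/`value` column names are my assumptions, and the key column is spelled `category` in lower case, as in the sample data.

```python
import pandas as pd

# Sample data from the question, built inline instead of read from Excel.
df = pd.DataFrame({
    "category": ["cat1", "cat2", "cat3"],
    "size1": [10, 20, 30],
    "size2": [20, 10, 20],
    "size3": [30, 15, 10],
})

frames = []
for col in df.columns.drop("category"):
    # Take an actual two-column slice (not a string of code) and give the
    # size column a common name so the pieces line up when stacked.
    part = df.loc[:, ["category", col]].rename(columns={col: "value"})
    part.insert(1, "sizetype", col)  # remember which size column this was
    frames.append(part)

# Every element of frames is now a DataFrame, so concat can stack them.
stacked = pd.concat(frames, ignore_index=True)
print(stacked)
```

This gives the same rows as the pd.melt call, which remains the shorter way to do it.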