我正在阅读一个 csv 文件,其中包含数周内按商店按产品划分的每周销售数据,并尝试部分转置数据,以便每一行代表按产品、商店、周划分的每周交易
从这里出发:
Product,Store,9/1/18,9/8/18,9/15/18,9/22/18
vacuum,123,1,5,3,3
toaster,456,5,7,4,10
对此:
Product,Store,Week,Sales
vacuum,123,9/1/18,1
vacuum,123,9/8/18,5
vacuum,123,9/15/18,3
vacuum,123,9/22/18,3
toaster,456,9/1/18,5
toaster,456,9/8/18,7
toaster,456,9/15/18,4
toaster,456,9/22/18,10...
我是 Python 的新手(2 天大(,我半懂了使用 zip_longest/阅读器/编写器的完整转置,但无法弄清楚如何做部分版本
另外,
df.set_index(['Product','Store']).stack().reset_index()
输出:
Product Store level_2 0
0 vacuum 123 9/1/18 1
1 vacuum 123 9/8/18 5
2 vacuum 123 9/15/18 3
3 vacuum 123 9/22/18 3
4 toaster 456 9/1/18 5
5 toaster 456 9/8/18 7
6 toaster 456 9/15/18 4
7 toaster 456 9/22/18 10
使用清理的列命名,
(df.set_index(['Product','Store'])
.rename_axis('Week', axis=1)
.stack()
.rename('Sales')
.reset_index())
输出:
Product Store Week Sales
0 vacuum 123 9/1/18 1
1 vacuum 123 9/8/18 5
2 vacuum 123 9/15/18 3
3 vacuum 123 9/22/18 3
4 toaster 456 9/1/18 5
5 toaster 456 9/8/18 7
6 toaster 456 9/15/18 4
7 toaster 456 9/22/18 10
您可以使用 melt 来执行此操作:
df.melt(id_vars=['Product', 'Store',],
value_vars=['9/1/18', '9/8/18','9/15/18','9/22/18'],
var_name='week', value_name='Sales')
Product Store week Sales
0 vacuum 123 9/1/18 1
1 toaster 456 9/1/18 5
2 vacuum 123 9/8/18 5
3 toaster 456 9/8/18 7
4 vacuum 123 9/15/18 3
5 toaster 456 9/15/18 4
6 vacuum 123 9/22/18 3
7 toaster 456 9/22/18 10
您可以使用df.pivot()
:
df.pivot(index='Product', columns='Store').stack(level=[1,0]).reset_index()
例如:
import pandas as pd
df = pd.read_csv('test.csv')
df = df.pivot(index='Product', columns='Store').stack(level=[1,0]).reset_index()
df.columns = ['Product','Store','Week','Sales']
这给了:
Product Store Week Sales
0 toaster 456 9/1/18 5.0
1 toaster 456 9/8/18 7.0
2 toaster 456 9/15/18 4.0
3 toaster 456 9/22/18 10.0
4 vacuum 123 9/1/18 1.0
5 vacuum 123 9/8/18 5.0
6 vacuum 123 9/15/18 3.0
7 vacuum 123 9/22/18 3.0