Pivot a single-row DataFrame where groupBy cannot be applied



I have a DataFrame like this:

+-------------------+----------------+-------------------+
|inputRecordSetCount|inputRecordCount|suspenseRecordCount|
+-------------------+----------------+-------------------+
|                166|            1216|                 10|
+-------------------+----------------+-------------------+

How can I pivot it so that each column name becomes a row, given that groupBy/pivot cannot be applied to a single row?

You can use the stack() operation, as mentioned in this tutorial.

Since there are 3 columns to unpivot, pass the count followed by pairs of label and column name:

stack(3, "inputRecordSetCount", inputRecordSetCount, "inputRecordCount", inputRecordCount, "suspenseRecordCount", suspenseRecordCount) as (operation, value)

Full example:

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

df = spark.createDataFrame(data=[[166, 1216, 10]], schema=['inputRecordSetCount', 'inputRecordCount', 'suspenseRecordCount'])
# Build one '"label", column' pair per column for the stack() expression.
cols = [f'"{c}", {c}' for c in df.columns]
# stack(n, label1, col1, ..., labelN, colN) unpivots the n columns into (operation, value) rows.
exprs = f"stack({len(cols)}, {', '.join(cols)}) as (operation, value)"
df = df.selectExpr(exprs)
df.show()
+-------------------+-----+
|          operation|value|
+-------------------+-----+
|inputRecordSetCount|  166|
|   inputRecordCount| 1216|
|suspenseRecordCount|   10|
+-------------------+-----+
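As a side note (not part of the original answer), the same unpivot can also be sketched with create_map and explode instead of a stack() SQL expression; the DataFrame name src below is hypothetical, and the data and column names are assumed to be the same as above:

from itertools import chain
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.getOrCreate()
src = spark.createDataFrame(data=[[166, 1216, 10]], schema=['inputRecordSetCount', 'inputRecordCount', 'suspenseRecordCount'])

# Map each column name to its value, then explode the map into one (operation, value) row per column.
# Note: create_map requires the value columns to share a common type (all integers here).
kv = F.create_map(*chain.from_iterable((F.lit(c), F.col(c)) for c in src.columns))
src.select(F.explode(kv).alias('operation', 'value')).show()

This produces the same (operation, value) rows as the stack() version.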
