I have a DataFrame like this:
+---+--------------------+
|idn|     recommendations|
+---+--------------------+
|463|[[10955,0.0086656...|
|496|[[12767,0.0209305...|
|148|[[9813,0.00673213...|
|471|[[8537,0.00546676...|
|243|[[10846,0.0044064...|
|623|[[10955,0.3857911...|
|540|[[11463,0.0250675...|
|392|[[7177,0.01615425...|
|737|[[7994,0.12720428...|
|516|[[10955,0.4047550...|
+---+--------------------+
and a schema like this:
dataFrame.printSchema()
root
 |-- idn: long (nullable = true)
 |-- recommendations: array (nullable = true)
 |    |-- element: struct (containsNull = true)
 |    |    |-- id_usn: long (nullable = true)
 |    |    |-- rating: double (nullable = true)
Now I want to convert id_usn and rating in the recommendations column to String.

You can cast the nested struct column as shown below:
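from pyspark.sql.types import ArrayType, StructType, StructField, StringType

# Target type: the same array of structs, with both fields as strings
col_schema = ArrayType(StructType([StructField('id_usn', StringType(), True), StructField('rating', StringType(), True)]))
df = dataFrame.select('idn', dataFrame.recommendations.cast(col_schema))
df.printSchema()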
Give it a try and let me know.