我在pyspark中有这个模式:
root
|-- SortedLenders: array (nullable = true)
| |-- element: struct (containsNull = true)
| | |-- LenderID: string (nullable = true)
| | |-- MaxProfit: string (nullable = true)
|-- FilteredOutDecisions: array (nullable = true)
| |-- element: struct (containsNull = true)
| | |-- ApprovedAmount: integer (nullable = true)
| | |-- Reasons: array (nullable = true)
| | | |-- element: integer (containsNull = true)
如何强制转换FilteredOutDecisions。列变成双列?谢谢你,提前!
试试这个:
df = (
df
.withColumn('newFilteredOutDecisions', f.expr('transform(FilteredOutDecisions, element -> struct(element.ApprovedAmount as ApprovedAmount, transform(element.Reasons, value -> cast (value as double)) as Reasons))'))
)