I have a question about UDF functions. I have some data: [raw data]
I am writing code to capitalize each word in the name column (input: john doe -> output: John Doe):
@F.udf
def coverCase(Str):
    resStr=""
    arr=str.split(" ")
    for x in arr:
        resStr=resStr+ x[0:1].upper()+x[1:len(x)]+" "
    return resStr

df.select(coverCase("name")).show()
I get no output: [output screenshot]
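Going only by the code as posted, one likely cause is a name mismatch: the function parameter is `Str`, but the body calls `str.split(" ")`, where `str` is Python's built-in type. That call splits the string `" "` itself and never touches the input, so the loop has nothing to iterate over and the function returns an empty string for every row. A minimal plain-Python sketch of this, without Spark (the name `cover_case_as_posted` is made up for illustration):

```python
def cover_case_as_posted(Str):
    # Same body as the posted UDF: note `str` (the built-in), not `Str`.
    resStr = ""
    arr = str.split(" ")  # equivalent to " ".split(), which is []
    for x in arr:
        resStr = resStr + x[0:1].upper() + x[1:len(x)] + " "
    return resStr

result = cover_case_as_posted("john doe")
print(repr(result))  # prints ''
```

If that is what happened, the column is not missing; it is just filled with empty strings.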
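from pyspark.sql import functions as F
from pyspark.sql.functions import col

# Assumes an active SparkSession named `spark` (as in the pyspark shell).
df = spark.createDataFrame([(1, "john doe", 21)], ("id", "name", "age"))

@F.udf
def convertCase(s):
    # The parameter name `s` matches the name used in the body
    # and does not shadow the built-in `str`.
    resStr = ""
    arr = s.split(" ")
    for x in arr:
        # Upper-case the first character of each word and keep the rest.
        resStr = resStr + x[0:1].upper() + x[1:len(x)] + " "
    return resStr

df.select(convertCase(col("name"))).show()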
+-----------------+
|convertCase(name)|
+-----------------+
|        John Doe |
+-----------------+
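Note that the result above keeps a trailing space after the last word, because a space is appended after every word. If that is unwanted, one small variation is to join the words instead. This is only a plain-Python sketch: `capitalize_words` is a hypothetical helper name, and the `None` check is an added assumption about how null values should be handled.

```python
def capitalize_words(s):
    # Pass null values through unchanged instead of raising on None.
    if s is None:
        return None
    # Upper-case only the first character of each space-separated word,
    # then join with single spaces so no trailing space is added.
    return " ".join(w[:1].upper() + w[1:] for w in s.split(" "))

print(repr(capitalize_words("john doe")))  # prints 'John Doe'
```

It can be wrapped the same way as before, for example `F.udf(capitalize_words)(col("name"))`. Spark also has a built-in `initcap` in `pyspark.sql.functions`, which may avoid the UDF altogether, though it also lower-cases the remaining letters of each word.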