PySpark: creating a Row whose field names contain non-alphanumeric characters



Is there a way in PySpark to create a Row in which a field name contains non-alphanumeric characters?

For example:

from pyspark.sql import Row
Row(my-field='myvalue') # does not work because my-field can't be parsed by python
Row(**{'my-field':'myvalue'}) # I was expecting this workaround to work but 
# it gives "TypeError: Can not infer schema for type: <class 'str'>"

Yes, it is possible:

>>> from pyspark.sql import Row
>>> P = Row("foo-bar", "date")  # use it as a class factory
>>> P("a", "b")
Row(foo-bar='a', date='b')

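Note that not every serialization format (for example Parquet or ORC) correctly handles certain special characters in column names. It is best to stick to plain ASCII names.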
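
If the names come from an external source, one option is to normalize them before building rows or writing files. This is only a sketch: `to_safe_name` is a hypothetical helper, not part of PySpark, and the rule it applies (keep ASCII letters, digits and underscores, replace everything else) is an assumption about what counts as "safe".

```python
import re


def to_safe_name(name: str) -> str:
    """Replace each run of characters outside [A-Za-z0-9_] with one underscore.

    Hypothetical helper for illustration; not a PySpark function.
    """
    return re.sub(r"[^A-Za-z0-9_]+", "_", name)


original_names = ["my-field", "foo-bar", "date"]
safe_names = [to_safe_name(n) for n in original_names]
```

Different inputs can collapse to the same output (for instance `a-b` and `a b` both become `a_b`), so it is worth checking the result for duplicates. On an existing DataFrame, the cleaned names could then be applied with something like `df.toDF(*safe_names)`.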
