Is there a way to create a Row in PySpark whose field name contains non-alphanumeric characters?
For example:
from pyspark.sql import Row
Row(my-field='myvalue') # does not work because my-field can't be parsed by python
Row(**{'my-field':'myvalue'}) # I was expecting this workaround to work but
# it gives "TypeError: Can not infer schema for type: <class 'str'>"
Yes, it is possible:
>>> from pyspark.sql import Row
>>> P = Row("foo-bar", "date") # use it as a class factory
>>> P("a", "b")
Row(foo-bar='a', date='b')
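Note that not every serialization format (for example Parquet or ORC) correctly handles certain special characters in column names. It is best to stick to ASCII.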
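If the names come from an external source, one option is to normalize them before writing. The sketch below is plain Python and makes a stricter assumption than the note above: it keeps only ASCII letters, digits, and underscores. The helper `sanitize_column_name` is hypothetical, not a PySpark function.

```python
import re


def sanitize_column_name(name: str) -> str:
    """Replace every character outside [A-Za-z0-9_] with an underscore."""
    return re.sub(r"[^0-9A-Za-z_]", "_", name)


# Example column names; these are made up for illustration.
columns = ["foo-bar", "date", "unit price"]
safe_columns = [sanitize_column_name(c) for c in columns]

print(safe_columns)
```

With an existing DataFrame `df`, something like `df.toDF(*safe_columns)` should apply the new names. Note that two different names can collapse to the same sanitized name (for example `a-b` and `a b`), so it is worth checking for duplicates first.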