to_date() takes 1 positional argument but 2 were given


+----------+
|      date|
+----------+
|  2/3/1994|
|  3/4/1994|
|  4/5/1994|
|  5/3/1994|
|  6/9/1994|
|  7/8/1994|
|  8/9/1994|
| 9/10/1994|
|10/10/1994|
| 11/4/1994|
| 12/3/1994|
|  2/4/1996|
|  4/9/1996|
|    5/7/96|
|  6/8/1996|
| 7/10/1996|
| 9/11/1996|
| 10/3/1996|
|  6/2/2000|
|  7/2/2000|
+----------+

from pyspark.sql.functions import to_date
newdate=df6.withColumn(to_date(df6.date, 'yyyy-MM-dd').alias('dt')).show()
TypeError: to_date() takes 1 positional argument but 2 were given

Your withColumn syntax looks wrong: withColumn expects the new column's name as its first argument. Can you try this:

newdate = df6.withColumn("new_date", to_date("date", "MM/dd/yyyy"))
newdate.show()
>>> from pyspark.sql.functions import *
>>> df.show()
+----------+
|      date|
+----------+
|  2/3/1994|
|  3/4/1994|
|  4/5/1994|
|  5/3/1994|
|  6/9/1994|
|  7/8/1994|
|  8/9/1994|
| 9/10/1994|
|10/10/1994|
| 11/4/1994|
| 12/3/1994|
|  2/4/1996|
|  4/9/1996|
|    5/7/96|
|  6/8/1996|
| 7/10/1996|
| 9/11/1996|
| 10/3/1996|
|  6/2/2000|
|  7/2/2000|
+----------+
>>> df.withColumn("dt", to_date(col("date"), "MM/dd/yyyy")).show()
+----------+----------+
|      date|        dt|
+----------+----------+
|  2/3/1994|1994-02-03|
|  3/4/1994|1994-03-04|
|  4/5/1994|1994-04-05|
|  5/3/1994|1994-05-03|
|  6/9/1994|1994-06-09|
|  7/8/1994|1994-07-08|
|  8/9/1994|1994-08-09|
| 9/10/1994|1994-09-10|
|10/10/1994|1994-10-10|
| 11/4/1994|1994-11-04|
| 12/3/1994|1994-12-03|
|  2/4/1996|1996-02-04|
|  4/9/1996|1996-04-09|
|    5/7/96|0096-05-07|
|  6/8/1996|1996-06-08|
| 7/10/1996|1996-07-10|
| 9/11/1996|1996-09-11|
| 10/3/1996|1996-10-03|
|  6/2/2000|2000-06-02|
|  7/2/2000|2000-07-02|
+----------+----------+
to_date

to_date was reworked in Spark 2.2.0 to accept a format string; on Spark < 2.2.0 it takes only a single argument (the column), which is exactly the TypeError you are seeing.

See pyspark.sql.functions.to_date in the Spark 2.2.0 docs versus the Spark 2.1.0 docs.
