Hi, I'm trying out Spark window functions. I need row_number to start from 0. Here is my code:
val target2 = target1.select("id","name","mark1","mark2","version").withColumn("rank", row_number().over(Window.partitionBy("name","mark1","mark2").orderBy("id")))
The row numbers start from 1. I tried the following:
val target2 = target1.select("id","name","mark1","mark2","version").withColumn("rank", row_number().over(Window.partitionBy("name","mark1","mark2").orderBy("id") -1))
val target2 = target1.select("id","name","mark1","mark2","version").withColumn("rank", row_number().over(Window.partitionBy("name","mark1","mark2").orderBy("id"))) -1
Neither works for me. I need my row_number to start from zero. Any help would be appreciated.
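With Spark 2.4 it is:
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.row_number
val target2 = target1.select("id","name","mark1","mark2","version")
  .withColumn("rank",
    row_number().over(Window.partitionBy("name","mark1","mark2")
      .orderBy("id")) - 1)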
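Here the - 1 is applied to the Column returned by over(...), inside the withColumn call, so the row number within each partition is reduced by 1 and starts at 0. In the two attempts from the question, the - 1 ends up on the window specification and on the DataFrame returned by withColumn, and neither of those supports subtraction.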
Try this:
from pyspark.sql import Window
from pyspark.sql.functions import row_number
w = Window.partitionBy("name","mark1","mark2").orderBy("id")
target2 = target1.select("id","name","mark1","mark2","version").withColumn("rank",
    row_number().over(w) - 1)
This works with PySpark.
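To check what the expression is expected to produce without a Spark session, here is a minimal pure-Python sketch of the same idea: group rows by the partition columns, sort each group by id, and number the rows starting at 0. The helper name zero_based_row_numbers and the sample rows are made up for illustration; only the column names come from the question. This is not how Spark computes it internally, just an emulation of the result.

```python
from collections import defaultdict

def zero_based_row_numbers(rows, partition_keys, order_key):
    """Emulate row_number().over(w) - 1 on a list of dicts (no Spark needed)."""
    # Group rows by their partition key values, like Window.partitionBy(...)
    partitions = defaultdict(list)
    for row in rows:
        partitions[tuple(row[k] for k in partition_keys)].append(row)

    ranked = []
    for part in partitions.values():
        # Order within the partition, like .orderBy(order_key);
        # enumerate() starts at 0, which matches row_number() - 1
        for rank, row in enumerate(sorted(part, key=lambda r: r[order_key])):
            ranked.append({**row, "rank": rank})
    return ranked

# Hypothetical sample data using the question's column names
rows = [
    {"id": 3, "name": "a", "mark1": 1, "mark2": 1},
    {"id": 1, "name": "a", "mark1": 1, "mark2": 1},
    {"id": 2, "name": "b", "mark1": 5, "mark2": 6},
]
result = zero_based_row_numbers(rows, ["name", "mark1", "mark2"], "id")
```

In each partition the row with the smallest id gets rank 0, which is what the rank column of target2 should look like as long as the id values within a partition are distinct.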