Make row_number start from 0



Hi, I'm experimenting with Spark window functions. I need row_number to start from 0. Here is my code.

val target2 = target1.select("id","name","mark1","mark2","version").withColumn("rank", row_number().over(Window.partitionBy("name","mark1","mark2").orderBy("id")))

The row numbers start from 1. I tried the following two variants.

val target2 = target1.select("id","name","mark1","mark2","version").withColumn("rank", row_number().over(Window.partitionBy("name","mark1","mark2").orderBy("id") -1))
val target2 = target1.select("id","name","mark1","mark2","version").withColumn("rank", row_number().over(Window.partitionBy("name","mark1","mark2").orderBy("id"))) -1

Neither worked for me. I need my row_number to start from zero. Any help would be greatly appreciated.

With Spark 2.4 it is:

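import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.row_number

// Subtract 1 from the Column returned by over(...), inside withColumn
val target2 = target1.select("id","name","mark1","mark2","version")
             .withColumn("rank",
              row_number().over(Window.partitionBy("name","mark1","mark2")
             .orderBy("id")) - 1)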

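Here the - 1 is applied to the Column returned by over(...), so each row's number within its partition is reduced by 1 and the numbering starts at 0. In the two attempts from the question, the - 1 was applied to the WindowSpec and to the DataFrame returned by withColumn, and neither of those supports subtraction.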

Try this:

from pyspark.sql import Window
from pyspark.sql.functions import row_number

w = Window.partitionBy("name","mark1","mark2").orderBy("id")
target2 = target1.select("id","name","mark1","mark2","version").withColumn("rank",
                         row_number().over(w)-1)

It works with PySpark.
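For anyone who wants to check the expected result without a Spark session, here is a minimal plain-Python sketch of what row_number().over(w) - 1 should produce. It is only an emulation, not Spark code: the sample rows and the helper name zero_based_rank are made up for illustration, and only the column names come from the question.

```python
from collections import defaultdict

# Hypothetical sample rows; the column names follow the question.
rows = [
    {"id": 3, "name": "a", "mark1": 1, "mark2": 2},
    {"id": 1, "name": "a", "mark1": 1, "mark2": 2},
    {"id": 2, "name": "b", "mark1": 5, "mark2": 6},
    {"id": 5, "name": "a", "mark1": 1, "mark2": 2},
]


def zero_based_rank(rows, partition_keys, order_key):
    """Return copies of rows with a 'rank' that starts at 0 in each partition."""
    # Group rows by the partition key values (the partitionBy step).
    partitions = defaultdict(list)
    for row in rows:
        partitions[tuple(row[k] for k in partition_keys)].append(row)

    ranked = []
    for group in partitions.values():
        # Sort inside the partition (the orderBy step); enumerate() counts
        # from 0, which corresponds to row_number() - 1.
        for rank, row in enumerate(sorted(group, key=lambda r: r[order_key])):
            ranked.append({**row, "rank": rank})
    return ranked


result = zero_based_rank(rows, ["name", "mark1", "mark2"], "id")
for row in result:
    print(row["id"], row["name"], row["rank"])
```

Within the partition ("a", 1, 2) the ids 1, 3, 5 get ranks 0, 1, 2, and the single row in partition ("b", 5, 6) gets rank 0, which is the numbering the Spark answers above are aiming for.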
