For example, my DataFrame has 2 columns and 1 row:
val df = Seq(("1,2,3", "tom")).toDF("ids", "name")
ids(String) name(String)
1,2,3 tom
After the transform, I want df to be equal to:
Seq(("1", "tom"), ("2", "tom"), ("3", "tom")).toDF("ids", "name")
ids name
1 tom
2 tom
3 tom
I see there is an explode() function with the following signature:
public <A extends scala.Product> DataFrame explode(scala.collection.Seq<Column> input,
scala.Function1<Row,scala.collection.TraversableOnce<A>> f,
scala.reflect.api.TypeTags.TypeTag<A> evidence$1)
- I want to use the Scala API
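Try this (PySpark). The ids column is a string, so split it into an array first, because explode expects an array or map column:
from pyspark.sql.functions import explode, split
df.select(explode(split(df.ids, ',')).alias('ids'), df.name).collect()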
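Since the question asks for the Scala API, here is a minimal Scala sketch of the same idea: split the string column into an array, then explode the array into one row per element. It assumes Spark 2.x or later (for SparkSession; on 1.x a SQLContext would be needed instead) and that the column is named ids as in the tables above. The object name ExplodeIds and the helper explodeIds are hypothetical names used only for illustration.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, explode, split}

object ExplodeIds {
  // Hypothetical helper: turns the comma-separated "ids" string into an array
  // with split, then emits one output row per array element with explode.
  // The "name" value is repeated on every generated row.
  def explodeIds(df: DataFrame): DataFrame =
    df.select(explode(split(col("ids"), ",")).as("ids"), col("name"))

  def main(args: Array[String]): Unit = {
    // Assumption: a local session is enough for trying this out.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("explode-ids")
      .getOrCreate()
    import spark.implicits._

    val df = Seq(("1,2,3", "tom")).toDF("ids", "name")
    explodeIds(df).show()

    spark.stop()
  }
}
```

This uses the column function explode from org.apache.spark.sql.functions rather than the DataFrame.explode method quoted in the question. The method form has been deprecated since Spark 2.0 in favor of select with functions.explode or flatMap, so the function form is likely the safer choice on current versions.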