I am reading a CSV file with:
data = sc.textFile("filename")
Df = Sparksql.createDataFrame()
Pdf = Df.toPandas()
Is Pdf now distributed across the Spark cluster, or does it reside in the host (driver) environment?
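No, it is not distributed. Pdf is an ordinary pandas DataFrame that lives entirely in the driver's memory.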
As the PySpark source code for DataFrame says:
.. note:: This method should only be used if the resulting Pandas's DataFrame is expected
to be small, as all the data is loaded into the driver's memory.
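
Since everything ends up on the driver, one common precaution is to cap the number of rows before converting. Here is a minimal sketch of that idea. The helper name `to_pandas_capped` and the default of 10,000 rows are my own illustrative choices, not part of PySpark; the helper relies only on `DataFrame.limit` and `DataFrame.toPandas`. The usage part at the bottom assumes a working Spark installation and reuses the "filename" placeholder from the question.

```python
def to_pandas_capped(spark_df, max_rows=10_000):
    """Collect at most max_rows rows into a local pandas DataFrame.

    Hypothetical helper: limit() keeps the work on the cluster, and
    toPandas() then pulls the (now bounded) result into driver memory.
    """
    return spark_df.limit(max_rows).toPandas()


if __name__ == "__main__":
    # Assumes pyspark is installed and "filename" points at a real CSV file.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()
    df = spark.read.csv("filename", header=True)  # distributed DataFrame
    pdf = to_pandas_capped(df, max_rows=1000)     # local pandas DataFrame
    print(type(pdf), len(pdf))
    spark.stop()
```

After the call, `pdf` is a plain pandas object on the driver, so any further operations on it run on a single machine rather than on the cluster.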