I have a DataFrame in which two columns contain JSON data, and I want to parse that JSON into columns of the DataFrame.
+------------+---------+--------------------+--------------------+
| firstname| lastname| travellerdetails| bookjson|
+------------+---------+--------------------+--------------------+
| K| Gupta|[{FlierNumber:","...|[{origin:DEL","Et...|
| K| Gupta|[{FlierNumber:","...|[{origin:DEL","Et...|
|Jana Ranjani|Raghu Raj|[{BaggageTypeRetu...|[{origin:AMD","De...|
+------------+---------+--------------------+--------------------+
Two of the columns contain JSON data, and I want to parse them.
The first row of travellerdetails is:
""[{""""FlierNumber"""":""""""""","BaggageTypeReturn"""":""""""""","FirstName"""":""""K""""","Title"""":""""1""""","MiddleName"""":""""D""""","LastName"""":""""Gupta""""","MealTypeOnward"""":""""""""","DateOfBirth"""":""""""""","BaggageTypeOnward"""":""""""""","SeatTypeOnward"""":""""""""","MealTypeReturn"""":""""""""","FrequentAirline"""":null","Type"""":""""A""""","SeatTypeReturn"""":""""""""}","{""""FlierNumber"""":""""""""","BaggageTypeReturn"""":""""""""","FirstName"""":""""Sweety""""","Title"""":""""2""""","MiddleName"""":""""""""","LastName"""":""""Gupta""""","MealTypeOnward"""":""""""""","DateOfBirth"""":""""""""","BaggageTypeOnward"""":""""""""","SeatTypeOnward"""":""""""""","MealTypeReturn"""":""""""""","FrequentAirline"""":null","Type"""":""""A""""","SeatTypeReturn"""":""""""""}]""
The first row of bookjson is:
""[{""""origin"""":""""DEL""""","EticketFlag"""":""""false""""","flightcode"""":""""251""""","farebasis"""":""""L0IP""""","spicestatus"""":""""Canceled""""","deptime"""":""""07:20""""","codeshare"""":""""""""","ibibopartner"""":""""indigonew""""","productclass"""":""""R""""","duration"""":""""2h 5m""""","ruleno"""":""""4910""""","qtype"""":""""fbs""""","tickettype"""":""""e""""","flightno"""":""""251""""","servicetype"""":""""""""","fareclass"""":""""L""""","faresequence"""":""""1""""","destination"""":""""GAU""""","carrierid"""":""""6E""""","stops"""":""""0""""","state"""":""""New""""","fare"""":{""""adultphf"""":50","adultttf"""":75","adultdf"""":115","totalsurcharge"""":0","indigonewgrossamount"""":10202","adulttotalfare"""":5101","totalcommission"""":0","adultbasefare"""":4150","totalpassengerhandlingfee"""":0","adultudf"""":562","adultpassengerservicefee"""":149","totalpassengerservicefee"""":0","totalothers"""":0","childtotalfare"""":0","totalbasefare"""":8300","totalfare"""":101...
Please help me parse these columns.
What you are looking for is F.from_json().
You would use it like this:
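from pyspark.sql import functions as F
# from_json needs a schema as its second argument. This DDL string is an assumption
# based on the sample rows: a JSON array of objects, with every value read as a string.
schema = "array<map<string,string>>"
df = df.withColumn("travellerdetails", F.from_json(F.col("travellerdetails"), schema))
df = df.withColumn("bookjson", F.from_json(F.col("bookjson"), schema))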
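Note, however, that the JSON you gave in the question is not valid, so parsing it will produce null.
Also note that from_json takes the schema as its second argument. In PySpark this argument is required, not optional, and it lets you specify the data type you want for each field.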
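To see why from_json returns null for a given cell, it can help to test the raw string with Python's standard json module outside Spark. The following is a minimal sketch, not part of the original answer: the helper name is_valid_json and both sample strings are illustrative assumptions, loosely modeled on the travellerdetails value shown in the question.

```python
import json


def is_valid_json(text):
    """Return True if text parses as JSON, False otherwise (hypothetical helper)."""
    try:
        json.loads(text)
        return True
    except (TypeError, ValueError):  # json.JSONDecodeError is a subclass of ValueError
        return False


# Assumed sample: what a well-formed traveller record could look like.
good = '[{"FlierNumber": "", "FirstName": "K", "FrequentAirline": null}]'
# Assumed sample: doubled quotes, similar to the escaping visible in the question.
bad = '[{""FlierNumber"":"""",""FirstName"":""K""}]'

print(is_valid_json(good))  # True
print(is_valid_json(bad))   # False
```

A string that fails this check is likely to come back as null from from_json as well, so the extra quote escaping would need to be cleaned up before parsing.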