> I have the following column containing locations (street name, x and y coordinates):
Location
"1139 57 STREET New York (40.632653207600001, -74.000244990799999)"
What I would like to do is split it into three columns: "Location", "Latitude", and "Longitude". Something like this:
Location                   Latitude            Longitude
"1139 57 STREET New York"  40.632653207600001  -74.000244990799999
How would I do this?
Use str.extract:
df.Location.str.extract(
    r'^(?P<Location>.*?)\s*\((?P<Latitude>[^,]*),\s*(?P<Longitude>\S*)\).*$',
    expand=True
)
Location Latitude Longitude
0 1139 57 STREET New York 40.632653207600001 -74.000244990799999
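Note that str.extract returns the captured groups as strings. Below is a minimal sketch, assuming the single sample row from the question, of running the extraction end to end and then converting the coordinates to floats. The names `pattern` and `parts` are illustrative and not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Location": [
        "1139 57 STREET New York (40.632653207600001, -74.000244990799999)"
    ]
})

# Named groups become the column names of the result.
pattern = r"^(?P<Location>.*?)\s*\((?P<Latitude>[^,]*),\s*(?P<Longitude>\S*)\).*$"
parts = df["Location"].str.extract(pattern, expand=True)

# The captured values are strings; convert the coordinates to numbers.
parts["Latitude"] = pd.to_numeric(parts["Latitude"])
parts["Longitude"] = pd.to_numeric(parts["Longitude"])

print(parts.dtypes)
```

After the conversion, Latitude and Longitude can be used directly in numeric operations such as distance calculations.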
Another idea that doesn't use regular expressions, assuming your original data is consistently formatted:
import pandas as pd

def split_location(row):
    # Text before " (" is the address
    Location = row[:row.find('(')-1]
    # Text between "(" and the first "," is the latitude
    Latitude = row[row.find('(')+1 : row.find(',')]
    # Text after ", " up to the closing ")" is the longitude
    Longitude = row[row.find(',')+2 : -1]
    return {'Location': Location,
            'Latitude': Latitude,
            'Longitude': Longitude}

# original_df is a one-column DataFrame of Location strings that you want to split
split_df = original_df['Location'].apply(split_location)
split_df = pd.DataFrame(list(split_df))
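Because this approach depends on every row having the same format, it may be worth checking for rows that don't fit before splitting. This is a rough sketch, not part of the original answer: the second sample row is hypothetical, and "well-formed" is assumed to mean the string ends with a parenthesized, comma-separated pair.

```python
import pandas as pd

# Hypothetical sample: the second row has no coordinates.
original_df = pd.DataFrame({
    "Location": [
        "1139 57 STREET New York (40.632653207600001, -74.000244990799999)",
        "UNKNOWN ADDRESS",
    ]
})

# Assumed rule: a well-formed row ends with "(<text>, <text>)".
well_formed = original_df["Location"].str.contains(
    r"\([^,()]+,\s*[^,()]+\)\s*$", regex=True
)

# Rows that would break the find()-based slicing.
bad_rows = original_df[~well_formed]
print(bad_rows)
```

Rows listed in `bad_rows` can then be fixed or dropped before applying the splitting function.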