How do I take a pandas Series and split it so that it gives me the full name?


0
2   ['name:', 'Atlanta', 'GA:', 'Hartsfield-Jackson', 'Atlanta', 'International']
35  ['name:', 'Boston', 'MA:', 'Logan', 'International']
68  ['name:', 'Baltimore', 'MD:', 'Baltimore/Washington', 'International', 'Thurgood', 'Marshall']
101 ['name:', 'Charlotte', 'NC:', 'Charlotte', 'Douglas', 'International']
134 ['name:', 'Washington', 'DC:', 'Ronald', 'Reagan', 'Washington', 'National']
167 ['name:', 'Denver', 'CO:', 'Denver', 'International']
200 ['name:', 'Dallas/Fort', 'Worth', 'TX:', 'Dallas/Fort', 'Worth', 'International']
233 ['name:', 'Detroit', 'MI:', 'Detroit', 'Metro', 'Wayne', 'County']
266 ['name:', 'Newark', 'NJ:', 'Newark', 'Liberty', 'International']
299 ['name:', 'Fort', 'Lauderdale', 'FL:', 'Fort', 'Lauderdale-Hollywood', 'International']
332 ['name:', 'Washington', 'DC:', 'Washington', 'Dulles', 'International']

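I have the Series above, and I want to split each row so that it reads like this: Atlanta GA: Hartsfield-Jackson Atlanta International. This will be a column in a DataFrame. Essentially, I just want to remove the "name:" at the start of each row and then split it so that I end up with the form I want. I tried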

my_series.str.split("'").str[7].reset_index(drop=True).astype(str)

but my output is

0
0   Hartsfield-Jackson
1   Logan
2   Baltimore/Washington
3   Charlotte
4   Ronald
5   Denver
6   TX:
7   Detroit
8   Newark
9   FL:
10  Washington

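Hello Gunner10,

As Barmar suggested, you are not dealing with strings; instead, I suspect that each cell holds a list as its value.

Run a quick test yourself to see whether the result is a list: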

type(df.iloc[0, index_of_this_column])  # iloc takes [row, column]; this inspects the first row
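
To check the whole column rather than a single cell, one option is to count the Python types it contains. This is only a minimal sketch: the DataFrame `df` and its two sample cells are hypothetical (the tokens are copied from the question), and one cell is deliberately a string that merely looks like a list so that both outcomes are visible.

```python
import pandas as pd

# Hypothetical sample: the first cell is a real list,
# the second is a string that only looks like a list.
df = pd.DataFrame({
    "Column_Split": [
        ["name:", "Boston", "MA:", "Logan", "International"],
        "['name:', 'Denver', 'CO:', 'Denver', 'International']",
    ]
})

# Map each cell to the name of its Python type and count the occurrences.
type_counts = df["Column_Split"].map(lambda v: type(v).__name__).value_counts()
print(type_counts)
```

If every cell is reported as `list`, the join-based approach applies directly; if they are reported as `str`, the cells would need to be parsed first.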

If you are indeed dealing with lists, here is how I would handle it.

def pandas_list_str_spliter(input_value):
    # Use join to turn the list into a single string.
    # From here, you should be able to do any transformation you need.
    # I only removed the "name:" prefix and the leading and trailing whitespace,
    # but you can modify the function to suit your needs.
    return ' '.join(input_value).replace('name:', '').strip()

df['Column_Split'] = df['Column_Split'].apply(pandas_list_str_spliter)
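
As a self-contained illustration of the same idea, here is a minimal sketch using rows copied from the question. The DataFrame `df`, the helper name `to_full_name`, and the output column `Full_Name` are assumptions made for this example. The `ast.literal_eval` branch is also an assumption: it only covers the case where the type check shows that the cells are strings that look like lists rather than real lists.

```python
import ast
import pandas as pd

def to_full_name(value):
    # Assumption: a cell is either a list of tokens or the string form of such a list.
    if isinstance(value, str):
        value = ast.literal_eval(value)  # safely parse "['name:', ...]" into a list
    # Drop the leading "name:" token if present, then join the rest with spaces.
    tokens = value[1:] if value and value[0] == "name:" else value
    return " ".join(tokens)

# Hypothetical sample: two real lists and one string that looks like a list.
df = pd.DataFrame({"Column_Split": [
    ["name:", "Atlanta", "GA:", "Hartsfield-Jackson", "Atlanta", "International"],
    ["name:", "Dallas/Fort", "Worth", "TX:", "Dallas/Fort", "Worth", "International"],
    "['name:', 'Boston', 'MA:', 'Logan', 'International']",
]})

df["Full_Name"] = df["Column_Split"].apply(to_full_name)
print(df["Full_Name"].tolist())
```

Writing the result to a new column keeps the original values available for comparison; assigning back to the same column, as in the snippet above, works just as well.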

I hope this helps.
