Pandas: add a column from duplicate or list values in a DataFrame column



Suppose a DataFrame looks like this:

   one  two
a  1.0  1.0
b  2.0  2.0
c  3.0  3.0
d  NaN  4.0

A new column, three, can be added as follows:

df["three"] = df["one"] * df["two"]

Result:

   one  two  three
a  1.0  1.0    1.0
b  2.0  2.0    4.0
c  3.0  3.0    9.0
d  NaN  4.0    NaN

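But what if a column's values contain lists or duplicated lists? I need to create a new column that holds the highest number in each cell.

Example: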
   one  two
a  1.0  [12,1]
        [12,1]
b  2.0  2.0
c  3.0  3.0
d  NaN  4.0

So I would like this:

   one  two        flag
a  1.0  [12,1]     12
        [12,1]
b  2.0  [200,400]  400
c  3.0  3.0        3.0
d  NaN  4.0        4.0

Thanks

If the values are lists, nested lists, or floats, you can flatten the lists and then use max:
from collections.abc import Iterable

import pandas as pd

df = pd.DataFrame({"two": [[[12, 1], [12, 1]], [200, 400], 3.0, 4.0]})

# https://stackoverflow.com/a/40857703/2901002
def flatten(items):
    """Yield items from any nested iterable; see Reference."""
    for x in items:
        if isinstance(x, Iterable) and not isinstance(x, (str, bytes)):
            # Recurse into nested iterables; strings and bytes count as scalars.
            yield from flatten(x)
        else:
            yield x

# Take the max of each flattened list; leave scalar values unchanged.
df['new'] = [max(flatten(x)) if isinstance(x, list) else x for x in df['two']]
print(df)
                  two    new
0  [[12, 1], [12, 1]]   12.0
1          [200, 400]  400.0
2                 3.0    3.0
3                 4.0    4.0
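
The same idea can be applied to a frame shaped like the one in the question. This is only a sketch: the helper name max_of_cell is hypothetical, the sample values are modelled on the desired output shown in the question, and Series.map is used here in place of the list comprehension.

```python
from collections.abc import Iterable

import pandas as pd


def flatten(items):
    """Yield scalars from arbitrarily nested iterables (strings stay whole)."""
    for x in items:
        if isinstance(x, Iterable) and not isinstance(x, (str, bytes)):
            yield from flatten(x)
        else:
            yield x


def max_of_cell(cell):
    """Hypothetical helper: max of a (possibly nested) list, else the scalar itself."""
    if isinstance(cell, (list, tuple)):
        return max(flatten(cell))
    return cell


# Sample frame modelled on the question; the values are illustrative only.
df = pd.DataFrame(
    {
        "one": [1.0, 2.0, 3.0, float("nan")],
        "two": [[[12, 1], [12, 1]], [200, 400], 3.0, 4.0],
    },
    index=list("abcd"),
)

# Apply the helper cell by cell to build the new column.
df["flag"] = df["two"].map(max_of_cell)
print(df)
```

Note that max raises ValueError on an empty list, so empty cells would need separate handling.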

EDIT: To get the max of every column in the new DataFrame, use the aggregate function max:

df = df_orig.pivot_table(index=['keyword_name','volume'],
                         columns='asin',
                         values='rank',
                         aggfunc=list)
df1 = df_orig.pivot_table(index=['keyword_name','volume'],
                          columns='asin',
                          values='rank',
                          aggfunc='max')
out = pd.concat([df, df1.add_suffix('_max')], axis=1)
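
Since df_orig is not shown, here is a self-contained sketch of the same two pivots on made-up data. Only the column names (keyword_name, volume, asin, rank) come from the snippet above; the rows and the variable names lists and maxima are assumptions for illustration.

```python
import pandas as pd

# Hypothetical input; only the column names are taken from the answer.
df_orig = pd.DataFrame(
    {
        "keyword_name": ["kw1", "kw1", "kw1", "kw2"],
        "volume": [100, 100, 100, 50],
        "asin": ["A1", "A1", "A2", "A1"],
        "rank": [12, 1, 7, 3],
    }
)

index_cols = ["keyword_name", "volume"]

# One pivot collects every rank per (keyword, asin) into a list ...
lists = df_orig.pivot_table(index=index_cols, columns="asin",
                            values="rank", aggfunc=list)
# ... the other keeps only the largest rank per (keyword, asin).
maxima = df_orig.pivot_table(index=index_cols, columns="asin",
                             values="rank", aggfunc="max")

# Put the list columns and the *_max columns side by side.
out = pd.concat([lists, maxima.add_suffix("_max")], axis=1)
print(out)
```

Combinations that never occur in the input (here kw2 with A2) come out as NaN in both the list column and the _max column.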
