Efficiently computing a symmetric function



I have a DataFrame, and I need to apply a function to every possible pair of its rows.

from itertools import product
L = product(df.iterrows(), df.iterrows())
res = map(myfunc, L)

where myfunc(r1, r2) -> float takes two rows as input and returns a value. Now, myfunc is symmetric, so

myfunc(f1,f2)=myfunc(f2,f1)

for every possible pair of inputs.

With product/map, I evaluate the function about twice as many times as needed. How can I elegantly avoid these redundant computations?

IIUC, you can use itertools.combinations with the DataFrame index:

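import numpy as np
import pandas as pd
from itertools import combinations

np.random.seed(0)
df = pd.DataFrame(np.random.randint(0, 100, (10, 10)), index=[*'abcdefghij'], columns=[*'ABCDEFGHIJ'])

def addTwoRows(r1, r2):
    # Example symmetric function: the sum of all values in both rows
    return r1.sum() + r2.sum()
# Each unordered pair of index labels is visited exactly once
[(addTwoRows(df.loc[i], df.loc[j]), (i, j)) for i, j in combinations(df.index, 2)]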

Output:

[(1166, ('a', 'b')),
(1074, ('a', 'c')),
(1035, ('a', 'd')),
(922, ('a', 'e')),
(849, ('a', 'f')),
(920, ('a', 'g')),
(968, ('a', 'h')),
(1046, ('a', 'i')),
(1043, ('a', 'j')),
(1190, ('b', 'c')),
(1151, ('b', 'd')),
(1038, ('b', 'e')),
(965, ('b', 'f')),
(1036, ('b', 'g')),
(1084, ('b', 'h')),
(1162, ('b', 'i')),
(1159, ('b', 'j')),
(1059, ('c', 'd')),
(946, ('c', 'e')),
(873, ('c', 'f')),
(944, ('c', 'g')),
(992, ('c', 'h')),
(1070, ('c', 'i')),
(1067, ('c', 'j')),
(907, ('d', 'e')),
(834, ('d', 'f')),
(905, ('d', 'g')),
(953, ('d', 'h')),
(1031, ('d', 'i')),
(1028, ('d', 'j')),
(721, ('e', 'f')),
(792, ('e', 'g')),
(840, ('e', 'h')),
(918, ('e', 'i')),
(915, ('e', 'j')),
(719, ('f', 'g')),
(767, ('f', 'h')),
(845, ('f', 'i')),
(842, ('f', 'j')),
(838, ('g', 'h')),
(916, ('g', 'i')),
(913, ('g', 'j')),
(964, ('h', 'i')),
(961, ('h', 'j')),
(1039, ('i', 'j'))]
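
Note that combinations(df.index, 2) skips the pairs of a row with itself, such as ('a', 'a'), which product does include. If those are needed too, or if the results should come back as a full square table, something like the following might work. This is only a sketch: symmetric_pairwise and add_two_rows are hypothetical names introduced here for illustration, and it assumes the function really is symmetric, so each value can be mirrored instead of recomputed.

```python
from itertools import combinations_with_replacement

import numpy as np
import pandas as pd


def add_two_rows(r1, r2):
    # Stand-in symmetric function (hypothetical), same idea as addTwoRows.
    return r1.sum() + r2.sum()


def symmetric_pairwise(df, func):
    """Call func once per unordered pair of rows and mirror the result."""
    n = len(df)
    out = np.empty((n, n), dtype=float)
    rows = [row for _, row in df.iterrows()]  # materialize each row once
    # Yields (i, j) with i <= j, so the diagonal pairs (i, i) are included.
    for i, j in combinations_with_replacement(range(n), 2):
        out[i, j] = out[j, i] = func(rows[i], rows[j])
    return pd.DataFrame(out, index=df.index, columns=df.index)


np.random.seed(0)
df = pd.DataFrame(np.random.randint(0, 100, (10, 10)),
                  index=[*'abcdefghij'], columns=[*'ABCDEFGHIJ'])
result = symmetric_pairwise(df, add_two_rows)
```

For n rows this calls the function n(n+1)/2 times instead of the n^2 calls made by product; result.loc['a', 'b'] and result.loc['b', 'a'] then hold the same value.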
