How do I create a new column from a column of lists in pandas?

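I want to create a new column based on another column. Column C holds a list of words in each row, and I want to list every pair of words within each row. For example, consider these two rows of column C: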

Existing column "C" 
['a', 'b', 'c']
['g', 'h', 'j']

I would like the new column to look like this:

New column
['a', 'b'], ['a', 'c'], ['b', 'c']
['g', 'h'], ['g', 'j'], ['h', 'j']

I know how to do this with plain Python lists. For example:

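List_words = df['C'].tolist()
pairs_of_skills = []
for l in List_words:
    # Collect every unordered pair of words from this row
    Each = []
    for i in range(len(l) - 1):
        for j in range(i + 1, len(l)):  # j starts at i + 1, so j != i always holds
            Each.append(sorted([l[i], l[j]]))
    # Append this row's pairs, not the outer list itself
    pairs_of_skills.append(Each)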

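The resulting list of lists then becomes the new column. But I am looking for a more efficient approach, because my dataset is huge. Is there a faster way?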

Use itertools.combinations:

>>> import itertools
>>> df['C'].apply(lambda x: list(itertools.combinations(x, 2)))
0    [(a, b), (a, c), (b, c)]
1    [(g, h), (g, j), (h, j)]
Name: C, dtype: object
>>>
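
To try this end to end, here is a minimal sketch that rebuilds the two-row example from the question and stores the result as a column on the same DataFrame. The column name `pairs` and the helper `sorted_pairs` are made up for illustration. Sorting each list first mirrors the `sorted(...)` call in the question's loop and assumes the items in a row are mutually comparable.

```python
import itertools

import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({"C": [["a", "b", "c"], ["g", "h", "j"]]})


def sorted_pairs(words):
    """Return all 2-element combinations of `words` as lists, each pair in sorted order."""
    return [list(pair) for pair in itertools.combinations(sorted(words), 2)]


# "pairs" is a hypothetical column name chosen for this example.
df["pairs"] = df["C"].apply(sorted_pairs)
print(df["pairs"].tolist())
```

Each row ends up holding a list of two-element lists, the same shape as the "New column" shown in the question.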

As a new DataFrame:

>>> pd.DataFrame(df['C'].apply(lambda x: list(itertools.combinations(x, 2))).to_numpy(), columns=['New Column'])
                 New Column
0  [(a, b), (a, c), (b, c)]
1  [(g, h), (g, j), (h, j)]
>>>

Or with assign and pop (note that pop removes column C from df in place):

>>> df.assign(**{'New Column': df.pop('C').apply(lambda x: list(itertools.combinations(x, 2)))})
                 New Column
0  [(a, b), (a, c), (b, c)]
1  [(g, h), (g, j), (h, j)]
>>>
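
Since the question is about speed: `Series.apply` on an object column still calls the Python function once per row, so a plain list comprehension does the same work and is worth timing against it. The sketch below is one such variant, not a benchmark. Whether it is actually faster on your data is an assumption to check with `timeit`.

```python
from itertools import combinations

import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({"C": [["a", "b", "c"], ["g", "h", "j"]]})

# Build the pairs row by row in plain Python, then assign the result as a column.
df["New Column"] = [list(combinations(words, 2)) for words in df["C"]]
print(df)
```

Unlike the pop variant, this keeps column C in place next to the new column.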
