I have this CSV with the following columns:
ID | Dim | Word
---|---|---
a1 | U | Happy
a1 | X | Love
a1 | H | Fat
a1 | H | Ugly
y2 | U | Happy
y2 | X | Trust
y2 | X | Love
pd3 | H | Ugly
ed4 | X | Trust
ed4 | H | Ugly
You can do:
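import numpy as np
import pandas as pd

df = pd.read_csv('your_file.csv')
# Rows are IDs, columns are words; each cell counts that word for that ID.
A = pd.crosstab(df.ID, df.Word)
# Word-by-word matrix: entry (w1, w2) sums the products of counts over all IDs.
df2 = A.T @ A
# Zero the diagonal (a word paired with itself). Fill a copy of the values,
# because df2.values can be read-only when pandas copy-on-write is enabled.
counts = df2.to_numpy(copy=True)
np.fill_diagonal(counts, 0)
df2 = pd.DataFrame(counts, index=df2.index, columns=df2.columns)
df2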
Word   Fat  Happy  Love  Trust  Ugly
Word
Fat      0      1     1      0     1
Happy    1      0     2      1     1
Love     1      2     0      1     1
Trust    0      1     1      0     1
Ugly     1      1     1      1     0
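To see why the matrix product yields co-occurrence counts, here is a minimal self-contained sketch that rebuilds the sample rows inline instead of reading a file. The comma-separated layout and the names `csv_text`, `cooc`, and `word_totals` are assumptions made for illustration, not part of the original data or code.

```python
import io

import numpy as np
import pandas as pd

# Sample rows from the question, inlined as comma-separated text (an assumed
# layout; pass sep= to read_csv if your file uses another delimiter).
csv_text = """ID,Dim,Word
a1,U,Happy
a1,X,Love
a1,H,Fat
a1,H,Ugly
y2,U,Happy
y2,X,Trust
y2,X,Love
pd3,H,Ugly
ed4,X,Trust
ed4,H,Ugly
"""
df = pd.read_csv(io.StringIO(csv_text))

# Incidence matrix: one row per ID, one column per word.
A = pd.crosstab(df["ID"], df["Word"])

# (A.T @ A)[w1, w2] = sum over IDs of A[id, w1] * A[id, w2].
# When each (ID, Word) pair appears at most once, as here, this is the
# number of IDs that contain both words.
cooc = A.T @ A

# Before zeroing, the diagonal holds how many rows each word appears in.
word_totals = pd.Series(np.diag(cooc.to_numpy()), index=cooc.index)

# Zero the diagonal on a writable copy, then rebuild the labelled frame.
counts = cooc.to_numpy(copy=True)
np.fill_diagonal(counts, 0)
cooc = pd.DataFrame(counts, index=cooc.index, columns=cooc.columns)

print(A)
print(cooc)
```

If the same word can appear more than once for one ID, the products are no longer 0 or 1, so the entries stop being plain "number of shared IDs" counts. In that case you may want to reduce `A` to 0/1 first, for example with `A.clip(upper=1)`.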
You can then write df2 to a CSV file with df2.to_csv('your_output_file.csv').
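If you later need to load the matrix again, the word labels are stored in the first column of the file. The sketch below assumes a temporary directory and hard-codes the matrix shown above so it runs on its own; `restored` is a hypothetical name. It writes the file and reads it back with `index_col=0` so that the words become the index again.

```python
import os
import tempfile

import pandas as pd

words = ["Fat", "Happy", "Love", "Trust", "Ugly"]
# The co-occurrence matrix from above, hard-coded so the example is standalone.
df2 = pd.DataFrame(
    [
        [0, 1, 1, 0, 1],
        [1, 0, 2, 1, 1],
        [1, 2, 0, 1, 1],
        [0, 1, 1, 0, 1],
        [1, 1, 1, 1, 0],
    ],
    index=pd.Index(words, name="Word"),
    columns=pd.Index(words, name="Word"),
)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "your_output_file.csv")
    df2.to_csv(path)  # the index (the words) is written as the first column
    restored = pd.read_csv(path, index_col=0)  # use that column as the index

print(restored)
```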