我使用python将两个excel数据表连接在一起。
我想查看我添加的列表作为行而不是列表,因为我拥有的实际数据非常大。
我最好想将列表移动到不同的excel选项卡(使用python)。我可以将列表中的数据视为行而不是列表。有python代码吗?
季度1
团队国家账户PRI 10-k日期名称家庭美国1002 205调整累计3/31/2022杰西国际象棋巴西1001 109翻译综合2022年4月1日迈克House肯尼亚1003 303累计收入2022年2月4日南非1004 404翻译其他4/3/2022约翰俱乐部土耳其1005 605调整累计4/4/2022尼克复活节法国1006 808翻译其他4/5/2022乔关键属性综合2022年6月4日里克累计收入2022年7月4日汤姆穿越埃塞俄比亚1010405翻译累计4/8/2022吉尔足球古巴1011 808调整其他4/9/2022马特新加坡棒球1012 304调整综合2022年4月10日篮球卡塔尔1013 102翻译收入2022年4月11日简
季度2
团队国家账户PRI 10-k日期名称家庭美国10002 205调整累计3/31/2022杰西国际象棋巴西1001 109翻译综合2022年4月1日迈克House肯尼亚1003 303累计收入2022年2月4日南非1004 404翻译其他4/3/2022约翰俱乐部土耳其1005 605调整累计4/4/2022尼克复活节法国1006 808翻译其他4/5/2022乔关键属性综合2022年6月4日里克累计收入2022年7月4日汤姆鹰眼荷兰1015 208调整综合2022年4月12日克里斯
这是我到目前为止的代码和我要加入的数据。
输入图片描述
输入图片描述
输入图片描述
在这里输入图像描述输入图片描述
import csv
import os
import sys
path_quarter_1 = os.path.join(sys.path[0], "Quarter_1.csv") # path to the csv file
path_quarter_2 = os.path.join(sys.path[0], "Quarter_2.csv") # path to the csv file
# coulmns are:- Team, Country, Account, PRI, 10-k, Date, Name
with open(path_quarter_1, "r") as f: # opening the csv file
reader = csv.reader(f) # creating a reader object
data_q1 = list(reader) # converting the reader object to a list for Quarter 1
with open(path_quarter_2, "r") as f: # opening the csv file
reader = csv.reader(f) # creating a reader object
data_q2 = list(reader) # converting the reader object to a list for Quarter 2
# Account that are in both quarters
account_in_both = [] # creating a list to store the accounts that are in both quarters
for i in range(1, len(data_q1)): # iterating through the list for Quarter 1
for j in range(1, len(data_q2)): # iterating through the list for Quarter 2
if data_q1[i][2] == data_q2[j][2]: # checking if the accounts are same
account_in_both.append(data_q1[i][2]) # appending the account to the list
print("Account in both", account_in_both) # printing the list
print("nn") # printing a new line
# In Q1 but not in Q2
account_in_q1_not_q2 = (
[]
) # creating a list to store the accounts that are in Q1 but not in Q2
for i in range(1, len(data_q1)): # iterating through the list for Quarter 1
if (
data_q1[i][2] not in account_in_both
): # checking if the account is not in both quarters
account_in_q1_not_q2.append(data_q1[i][2]) # appending the account to the list
print("Account in Q1 but not in Q2", account_in_q1_not_q2) # printing the list
print("nn") # printing a new line
# In Q2 but not in Q1
account_in_q2_not_q1 = (
[]
) # creating a list to store the accounts that are in Q2 but not in Q1
for i in range(1, len(data_q2)): # iterating through the list for Quarter 2
if (
data_q2[i][2] not in account_in_both
): # checking if the account is not in both quarters
account_in_q2_not_q1.append(data_q2[i][2]) # appending the account to the list
print("Account in Q2 but not in Q1", account_in_q2_not_q1) # printing the list
print("nn") # printing a new line
# In which both list are same
all_rows_are_same_in_both = (
[]
) # creating a list to store the rows that are in both quarters
for i in range(1, len(data_q1)): # iterating through the list for Quarter 1
for j in range(1, len(data_q2)): # iterating through the list for Quarter 2
if data_q1[i] == data_q2[j]: # checking if the rows are same
all_rows_are_same_in_both.append(
data_q1[i]
) # appending the row to the list
print("All rows are same in both", all_rows_are_same_in_both) # printing the list
print("nn") # printing a new line
# In which rows are in q1 but not in q2
rows_in_q1_not_q2 = [] # creating a list to store the rows that are in Q1 but not in Q2
for i in range(1, len(data_q1)): # iterating through the list for Quarter 1
if (
data_q1[i] not in all_rows_are_same_in_both
): # checking if the row is not in both quarters
rows_in_q1_not_q2.append(data_q1[i]) # appending the row to the list
print("Rows in Q1 but not in Q2", rows_in_q1_not_q2) # printing the list
print("nn") # printing a new line
# In which rows are in q2 but not in q1
rows_in_q2_not_q1 = [] # creating a list to store the rows that are in Q2 but not in Q1
for i in range(1, len(data_q2)): # iterating through the list for Quarter 2
if (
data_q2[i] not in all_rows_are_same_in_both
): # checking if the row is not in both quarters
rows_in_q2_not_q1.append(data_q2[i]) # appending the row to the list
print("Rows in Q2 but not in Q1", rows_in_q2_not_q1) # printing the list
print("nn") # printing a new line
你用嵌套循环把这复杂化了。
简单地循环遍历一个列表,并检查元素是否在另一个列表中:
account_in_q1 = [x for x in list(data_q1['Account']) if x not in list(data_q2['Account'])]
account_in_q2 = [x for x in list(data_q2['Account']) if x not in list(data_q1['Account'])]
account_in_both = [x for x in list(data_q1['Account']) if x in list(data_q2['Account'])]
但即使这样,因为它听起来像你想把它放入一个表也许,我只是使用pandas
,并使用'outer'
连接合并,包括合并的数据框上的指示符,并过滤掉。
import pandas as pd
import os
import sys
path_quarter_1 = os.path.join(sys.path[0], "Quarter_1.csv") # path to the csv file
path_quarter_2 = os.path.join(sys.path[0], "Quarter_2.csv") # path to the csv file
data_q1 = pd.read_csv(path_quarter_1)
data_q2 = pd.read_csv(path_quarter_2)
cols = list(data_q1.columns)
merged = data_q1.merge(data_q2, how='outer', on=cols, indicator=True)
# In Q1 but not in Q2
in_q1 = merged[merged['_merge'] == 'left_only']
q1_accounts = list(in_q1['Account'])
print("Rows in Q1 but not in Q2", q1_accounts)
# In Q2 but not in Q1
in_q2 = merged[merged['_merge'] == 'right_only']
q2_accounts = list(in_q2['Account'])
print("Rows in Q2 but not in Q1", q2_accounts)
# In which both list are same
in_both = merged[merged['_merge'] == 'both']
both_accounts = list(in_both['Account'])
print("All rows are same in both", both_accounts)
输出:
Rows in Q1 but not in Q2 [1002, 1010, 1011, 1012, 1013]
Rows in Q2 but not in Q1 [10002, 1015]
All rows are same in both [1001, 1003, 1004, 1005, 1006, 1007, 1008]
查看不同帐户的"链接"位置在一起:
import pandas as pd
import os
import sys
import numpy as np
path_quarter_1 = os.path.join('D:/test/', "Quarter_1.csv") # path to the csv file
path_quarter_2 = os.path.join('D:/test/', "Quarter_2.csv") # path to the csv file
data_q1 = pd.read_csv(path_quarter_1)
data_q2 = pd.read_csv(path_quarter_2)
cols = [x for x in data_q1.columns if x != 'Account']
merged = data_q1.merge(data_q2, how='outer', on=cols, indicator=True)
for col in ['Account_x', 'Account_y']:
merged[col] = merged[col].fillna('')
def combine_columns(row):
combined = list(set([row['Account_x'], row['Account_y']]))
combined = [x for x in combined if x != '']
combined = [str(int(x)) for x in combined]
return combined
merged['Account'] = merged.apply(lambda row: combine_columns(row), axis=1)
# In Q1 but not in Q2
in_q1 = merged[merged['_merge'] == 'left_only']
q1_accounts = list(in_q1['Account'])
print("Rows in Q1 but not in Q2", q1_accounts)
# In Q2 but not in Q1
in_q2 = merged[merged['_merge'] == 'right_only']
q2_accounts = list(in_q2['Account'])
print("Rows in Q2 but not in Q1", q2_accounts)
# In which both list are same
in_both = merged[merged['_merge'] == 'both']
both_accounts = list(in_both['Account'])
print("All rows are same in both", both_accounts)