I have the following PostgreSQL table:
   cust   prod    day  month  year  state  quant
0  Bloom  Pepsi     2     12  2011  NY      4232
1  Bloom  Bread    23      5  2015  PA      4167
2  Bloom  Pepsi    22      1  2016  CT      4404
3  Bloom  Fruits   11      1  2010  NJ      4369
4  Bloom  Milk      7     11  2016  CT       210
I need to compute Bloom's average sales for each state and display them like this:
CUST AVG_NY AVG_CT AVG_NJ
Bloom 28923 3241 1873
I fetched the data in the following form:
[('Bloom', 'Pepsi', 2, 12, 2011, 'NY', 4232), ('Bloom', 'Eggs', 30, 11, 2010, 'NJ', 559), ('Bloom', 'Yogurt', 25, 7, 2014, 'PA', 17), ('Bloom', 'Yogurt', 3, 4, 2011, 'NJ', 1203), ('Bloom', 'Coke', 7, 2, 2010, 'NY', 1229), ('Bloom', 'Coke', 6, 10, 2018, 'PA', 2867), ('Bloom', 'Soap', 6, 1, 2015, 'CT', 4623), ('Bloom', 'Milk', 8, 9, 2010, 'NJ', 1106), ('Bloom', 'Milk', 19, 4, 2013, 'NY', 3516), ('Bloom', 'Soap', 7, 6, 2015, 'PA', 3404)]
Below is my code, which is probably the worst possible way to do it:
import psycopg2

connection = psycopg2.connect(user="postgres",
                              password="ss",
                              host="127.0.0.1",
                              port="8800",
                              database="postgres")
cursor = connection.cursor()
postgreSQL_select_Query = "select * from sales"
cursor.execute(postgreSQL_select_Query)
mobile_records = cursor.fetchall()

def takeSecond(elem):
    # Despite the name, this returns the first character of the customer name.
    return elem[0][0]

mobile_records.sort(key=takeSecond)

# Average for NY
Bloom1 = []
for i in mobile_records:
    if i[5] == 'NY' and i[0] == 'Bloom':
        Bloom1.append(i)
s1 = 0
for j in Bloom1:
    s1 += j[6]
avg1 = s1/len(Bloom1)

# Average for CT
Bloom2 = []
for i in mobile_records:
    if i[5] == 'CT' and i[0] == 'Bloom':
        Bloom2.append(i)
s2 = 0
for j in Bloom2:
    s2 += j[6]
avg2 = s2/len(Bloom2)

# Average for NJ
Bloom3 = []
for i in mobile_records:
    if i[5] == 'NJ' and i[0] == 'Bloom':
        Bloom3.append(i)
s3 = 0
for j in Bloom3:
    s3 += j[6]
avg3 = s3/len(Bloom3)
How should I go about implementing this?
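You should take a closer look at SQL. There is no need to do any of that in Python; let the database compute the averages with a GROUP BY clause.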
statement = "SELECT state, AVG(quant) FROM sales WHERE cust = 'Bloom' GROUP BY state"
After executing it, you can simply iterate over the returned rows and check each state.
cursor.execute(statement)
data = cursor.fetchall()
for dataset in data:
    if dataset[0] == 'NJ':
        # do something with dataset[1], for example print it
        print(dataset[1])
Note: dataset[0] holds the state and dataset[1] holds the average.
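To make the GROUP BY approach concrete, here is a minimal end-to-end sketch. It is not the asker's actual setup: it uses Python's built-in sqlite3 with an in-memory table as a stand-in for the PostgreSQL `sales` table, loaded with the ten sample rows from the question, so it runs without a server. The helper name `average_by_state` is made up for illustration. The customer name is passed as a query parameter rather than written into the SQL string; sqlite3 uses `?` as the placeholder, whereas psycopg2 uses `%s`.

```python
import sqlite3

# Sample rows copied from the list shown in the question.
rows = [
    ('Bloom', 'Pepsi', 2, 12, 2011, 'NY', 4232),
    ('Bloom', 'Eggs', 30, 11, 2010, 'NJ', 559),
    ('Bloom', 'Yogurt', 25, 7, 2014, 'PA', 17),
    ('Bloom', 'Yogurt', 3, 4, 2011, 'NJ', 1203),
    ('Bloom', 'Coke', 7, 2, 2010, 'NY', 1229),
    ('Bloom', 'Coke', 6, 10, 2018, 'PA', 2867),
    ('Bloom', 'Soap', 6, 1, 2015, 'CT', 4623),
    ('Bloom', 'Milk', 8, 9, 2010, 'NJ', 1106),
    ('Bloom', 'Milk', 19, 4, 2013, 'NY', 3516),
    ('Bloom', 'Soap', 7, 6, 2015, 'PA', 3404),
]

# Assumption: an in-memory SQLite table stands in for the PostgreSQL one.
connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE sales (cust TEXT, prod TEXT, day INT, month INT, "
    "year INT, state TEXT, quant INT)"
)
connection.executemany("INSERT INTO sales VALUES (?, ?, ?, ?, ?, ?, ?)", rows)


def average_by_state(conn, cust):
    """Return {state: average quant} for one customer (hypothetical helper)."""
    cursor = conn.execute(
        "SELECT state, AVG(quant) FROM sales WHERE cust = ? GROUP BY state",
        (cust,),
    )
    # Each returned row is a (state, average) pair, so dict() maps state -> average.
    return dict(cursor.fetchall())


averages = average_by_state(connection, "Bloom")

# Build the one-line report in the layout requested in the question.
states = ["NY", "CT", "NJ"]
header = "CUST " + " ".join(f"AVG_{s}" for s in states)
line = "Bloom " + " ".join(f"{averages[s]:.0f}" for s in states)
print(header)
print(line)
```

Turning the result into a dictionary keyed by state replaces the chain of `if dataset[0] == ...` checks with a plain lookup, and the same dictionary works for any other customer passed to the helper.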