Why does PostgreSQL not accept grouping on one table while selecting from another?

I am using PostgreSQL 9.1.9 on x86_64-unknown-linux-gnu, compiled by gcc (Ubuntu/Linaro 4.7.2-22ubuntu5) 4.7.2, 64-bit. My problem involves joining two tables, call them temp1 and temp2; I need to join these two tables.

The table structure is:

marks_map
    marks     int
    stud_id   int

student
    stud_id   int
    class_id  int

This is my query:

select class_id,stud_id,count(marks) 
from student as s 
inner join marks_map as m on (s.stud_id=m.stud_id) group by stud_id

Here I get this error:

ERROR:  column "s.class_id" must appear in the GROUP BY clause or be used in an aggregate function

Why does this error occur? If I add class_id to the GROUP BY, the query runs successfully.

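You have to add the class_id attribute to your GROUP BY clause, because the select part of the statement applies no aggregate function to that attribute.

In a statement that uses GROUP BY, every selected attribute that is not aggregated has to be listed after GROUP BY.

For example: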

SELECT
    attr1, attr2, attr3,  -- non-aggregated attributes
    sum(attr4)            -- aggregated attribute
FROM
    some_table
GROUP BY
    attr1, attr2, attr3   -- every non-aggregated attribute again

That is how GROUP BY works.
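Applied to the query in the question, the rule could look like the sketch below. The `s.` and `m.` qualifiers and the `marks_count` alias are additions for illustration; the qualifiers are there because stud_id exists in both tables.

```sql
-- Every non-aggregated column of the SELECT list is repeated in GROUP BY.
SELECT s.class_id,
       s.stud_id,
       count(m.marks) AS marks_count   -- illustrative alias, not from the question
FROM student AS s
INNER JOIN marks_map AS m ON s.stud_id = m.stud_id
GROUP BY s.class_id, s.stud_id;
```

This returns one row per (class_id, stud_id) pair, which is only the same as one row per student if each student belongs to a single class.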

You can inspect your data like this:

select
    array_agg(s.class_id) as arr_class_id,
    s.stud_id, count(m.marks)
from student as s
   inner join marks_map as m on (s.stud_id=m.stud_id)
group by s.stud_id

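to see which class_id values fall into each group. Sometimes class_id depends on stud_id (each group's array then holds a single distinct value), in which case you can use a dummy aggregate such as: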

select
    max(s.class_id) as class_id,
    s.stud_id, count(m.marks)
from student as s
   inner join marks_map as m on (s.stud_id=m.stud_id)
group by s.stud_id
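Whether class_id really depends on stud_id can also be checked directly instead of by reading the arrays. The following is a sketch against the student table from the question; `class_count` is an illustrative alias. It lists the students that have more than one distinct class_id, so an empty result suggests that max(class_id) is a safe stand-in.

```sql
-- Students for which class_id is NOT determined by stud_id.
SELECT s.stud_id,
       count(DISTINCT s.class_id) AS class_count
FROM student AS s
GROUP BY s.stud_id
HAVING count(DISTINCT s.class_id) > 1;
```

If stud_id were declared as the primary key of student (the question does not say that it is), PostgreSQL 9.1 and later would accept grouping by s.stud_id alone, because it treats the other columns of that table as functionally dependent on the key.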

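You should be able to understand the problem from a simplified case that does not even involve a JOIN.

The query SELECT x,[other columns] GROUP BY x expresses the fact that, for each distinct value of x, the [other columns] must be output with only one row per x.

Now consider a simplified example in which the student table has two entries: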

stud_id=1, class_id=1
stud_id=1, class_id=2

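We ask for SELECT stud_id,class_id FROM student GROUP BY stud_id.

There is only one distinct value of stud_id, namely 1.

So we are telling the SQL engine: give me one row with stud_id=1 and the class_id value that goes with it. The problem is that there is not one such value but two, 1 and 2. Which one should it pick? Rather than choosing at random, the SQL engine raises an error saying that the question is conceptually flawed, because no rule says that each distinct value of stud_id has a single class_id of its own.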

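On the other hand, if the non-GROUP'ed output columns are aggregate functions that turn a series of values into just one, such as min, max, or count, then they supply the missing rule for getting a single value out of several. That is why the SQL engine is fine with, for example: SELECT stud_id,count(class_id) FROM student GROUP BY stud_id;.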
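The two-row example can be tried without touching real data. In this sketch the rows are supplied inline through a common table expression that is named student only for readability; within these statements it shadows any real table of that name, and `class_count` is an illustrative alias.

```sql
-- Rejected: class_id is neither grouped nor aggregated, and the single
-- group stud_id = 1 has two candidate values for it.
WITH student(stud_id, class_id) AS (VALUES (1, 1), (1, 2))
SELECT stud_id, class_id
FROM student
GROUP BY stud_id;

-- Accepted: count() reduces the two values to one value per group.
WITH student(stud_id, class_id) AS (VALUES (1, 1), (1, 2))
SELECT stud_id, count(class_id) AS class_count
FROM student
GROUP BY stud_id;
-- one row: stud_id = 1, class_count = 2
```

The first statement should fail with the same "must appear in the GROUP BY clause or be used in an aggregate function" error as in the question.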

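Also, when you hit the error column "somecolumn" must appear in the GROUP BY clause, you do not want to just keep adding columns to the GROUP BY until the error goes away, as if it were purely a syntax problem. It is a semantic problem, and every column added to the GROUP BY changes the meaning of the question submitted to the SQL engine.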

That is, GROUP BY x,y means: for each distinct value of the (x,y) pair. It does not mean: GROUP BY x, and hey, since that leads to an error, let's throw in y as well!
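A short sketch of that difference, using the same two rows as the simplified example. The inline student rows and the alias `n` are illustrative only.

```sql
-- One group per stud_id: a single row.
WITH student(stud_id, class_id) AS (VALUES (1, 1), (1, 2))
SELECT stud_id, count(*) AS n
FROM student
GROUP BY stud_id;
-- (1, 2)

-- One group per (stud_id, class_id) pair: two rows.
WITH student(stud_id, class_id) AS (VALUES (1, 1), (1, 2))
SELECT stud_id, class_id, count(*) AS n
FROM student
GROUP BY stud_id, class_id
ORDER BY class_id;
-- (1, 1, 1) and (1, 2, 1)
```

Both statements are valid, but they answer different questions: the first counts rows per student, the second counts rows per student and class.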
