ID FirstName LastName Gender Salary
1 Ben Hoskins Male 70000
2 Mark Hastings Male 60000
4 Ben Hoskins Male 70000
8 John Stanmore Male 80000
运行查询时:
select *
from Employees
where Salary > (SELECT AVG(distinct SALARY) FROM employees)
它显示了2条7000条记录,应该显示一条。如果我使用了distinct,为什么它显示2条记录?
您在子查询中使用了distinct
,而不是在外部查询中,因此外部查询仍然可以看到重复的数据。
您可以使用解决此问题
select e.FirstName, e.LastName, e.Gender, e.Salary
from Employees e
where e.Salary > (SELECT AVG(distinct e2.SALARY) FROM employees e2);
但是,如果您的行具有这样的重复项,那么数据模型似乎会出现严重问题。数据应该是固定的。
与此同时,你可以解决这个问题。您可以使用消除重复的CTE来表达您的查询:
with e as (
select e.*
from (select e.*,
row_number(). over (partition by firstname, lastname, gender order by id desc) as seqnum
from employees e
)
where seqnum = 1
)
select e.*
from e
where e.salary > (select avg(salary) from e)
表中似乎有重复的行(它们的id
不同(,所以我假设您希望distinct
覆盖姓名、性别和工资,而不仅仅是工资。
您似乎希望distinct
同时出现在外部和内部查询中:
select distinct firstname, lastname, gender, salary
from employees
where salary > (
select avg(salary)
from (
select distinct firstname, lastname, gender, salary
from employees
) e
)
如果你的数据库支持窗口功能,这可以缩短:
select *
from (
select e.*, avg(salary) over() as avg_salary
from (
select distinct firstname, lastname, gender, salary
from employees
) e
) e
where salary > avg_salary