查找SQL中基于另一个变量的变量的总出现次数



我试图提供一个SQL查询来查找哪个XNumbers拥有最多的YNumbers。以下是我的数据:


-- Create Tables
-- Create X Table
CREATE TABLE X (
XID int identity,
XNumber varchar(100) not null,
XDescription varchar(100) not null,
XCode varchar(100) not null,
VIT varchar(100) not null,
XLocation varchar(100) not null,
CONSTRAINT PK_X PRIMARY KEY (XID),
CONSTRAINT U1_X UNIQUE (XNumber),
CONSTRAINT U2_X UNIQUE (XCode)
)

-- Create Y Table
CREATE TABLE Y (
YID int identity,
YNumber varchar(100) not null,
YDescription varchar(100) not null,
YDate datetime not null default GetDate(),
CONSTRAINT PK_Y PRIMARY KEY(YID),
CONSTRAINT U1_Y UNIQUE(YNumber)
)

-- Create XY Table
CREATE TABLE XY (
XYID int identity,
XID int not null,
YID int not null,
CONSTRAINT PK_XY PRIMARY KEY(XYID),
CONSTRAINT FK1_XY FOREIGN KEY(XID) REFERENCES X(XID),
CONSTRAINT FK2_XY FOREIGN KEY(YID) REFERENCES Y(YID)
)

-- Insert Values in X
insert into X (XNumber, XDescription, XCode, VIT, XLocation) values ('ASDLKVNAEQRIUNBVLAXKD', 'THDSDF', 'Violet', 2, 'FROP');
insert into X (XNumber, XDescription, XCode, VIT, XLocation) values ('GBOGJMBNNUTHDFHJG', 'TFGSDFG',    'Pink', 20, 'DOOO');
insert into X (XNumber, XDescription, XCode, VIT, XLocation) values ('GKHL', 'SDFGSDF', 'Yellow', 21, 'DSUIO');
insert into X (XNumber, XDescription, XCode, VIT, XLocation) values ('DELKMDTKJGBT', 'HNYDGH', 'Green', 20, 'KABBA');
insert into X (XNumber, XDescription, XCode, VIT, XLocation) values ('WRTOUBHNPSDFIVKM', 'OYIU', 'Aquamarine', 19, 'SANFRIO');
insert into X (XNumber, XDescription, XCode, VIT, XLocation) values ('FGBKMOSEKFVKF', 'DCGTH', 'Red', 8, 'BOB');
insert into X (XNumber, XDescription, XCode, VIT, XLocation) values ('GBIMGIK', 'FTGHJM', 'Mauv', 10, 'CHURCH');
insert into X (XNumber, XDescription, XCode, VIT, XLocation) values ('JLJLH', 'FCTGHKII', 'Goldenrod', 5, 'ZAN');
insert into X (XNumber, XDescription, XCode, VIT, XLocation) values ('YITORKRKG', 'ITRTGJJ', 'Maroon', 7, 'ZERR');
insert into X (XNumber, XDescription, XCode, VIT, XLocation) values ('QWYTWEGD', 'PRTHJHJUKKM', 'Khaki', 9, 'ZAB');
-- Insert Values in Y
insert into Y (YNumber, YDescription, YDate) values ('8e3ac22d07ee88efa9e91fc0830cdee0f1263d82', 'POSTY', '12/22/2019');
insert into Y (YNumber, YDescription, YDate) values ('735c653fec21726598b47b8c59f7b4a4a8ee3618', 'PUTH 3500', '1/14/2021');
insert into Y (YNumber, YDescription, YDate) values ('ef3b991a287db1c3bfd9c568972bc35e5c7c6218', 'BTS', '4/4/2021');
insert into Y (YNumber, YDescription, YDate) values ('a45cec89ba00b77439802b342ce16be4c8ddbee4', 'JLO', '12/3/2019');
insert into Y (YNumber, YDescription, YDate) values ('747a92666a65947e9f03fd7cf7ad55fdd803a2ab', 'JT', '5/1/2020');
insert into Y (YNumber, YDescription, YDate) values ('19a5e1d0c75abea249e50997dd1252285c36404c', 'M5', '1/13/2022');
insert into Y (YNumber, YDescription, YDate) values ('1fe285c2249937f9c5c7608f46be736840ef6007', 'Laroi', '12/12/2021');
insert into Y (YNumber, YDescription, YDate) values ('e6d3614571648dceae07a4813824904f70bc4bef', 'Biebs', '5/1/2022');
insert into Y (YNumber, YDescription, YDate) values ('2a7401fd3f68e09f45564ff29fc4b16c1f7e88ba', 'The Weeknd', '2/21/2022');
insert into Y (YNumber, YDescription, YDate) values ('20a451d4be1894e8f585af47807df6cbe55edfe4', 'Taylor', '2/1/2022');
-- Insert Values in XY
insert into XY (XID, YID) values (2, 6);
insert into XY (XID, YID) values (6, 3);
insert into XY (XID, YID) values (3, 3);
insert into XY (XID, YID) values (9, 8);
insert into XY (XID, YID) values (5, 7);
insert into XY (XID, YID) values (4, 1);
insert into XY (XID, YID) values (3, 7);
insert into XY (XID, YID) values (3, 2);
insert into XY (XID, YID) values (5, 9);
insert into XY (XID, YID) values (6, 5);

这是我尝试过的:

SELECT 
X.XNumber,
COUNT(XNumber) AS XFreq,
Y.YNumber
--COUNT(YNumber) AS YFreq
FROM XY
JOIN X ON X.XID = XY.XID
JOIN Y ON Y.YID = XY.YID
GROUP BY XNumber, YNumber

结果返回为一行仅1s,这是不正确的。

期望的输出将是回答以下问题的SELECT语句:哪个XNumbers拥有最多的YNumber?

对于每个"Xnumber",您想要计数"YNumber",所以只需计数"YNumber"并按"xnumber"分组。

因为您只需要最多的"ynumber",所以使用RANK()函数。

,看到db&lt的在小提琴

WITH tmp AS (
SELECT 
X.XNumber,
COUNT(Y.YNumber) AS YFreq,
RANK() OVER(ORDER BY COUNT(Y.YNumber) DESC) AS rnk
FROM XY
JOIN X ON X.XID = XY.XID
JOIN Y ON Y.YID = XY.YID
GROUP BY X.XNumber
)

SELECT tmp.*
FROM tmp
WHERE rnk = 1

两个列分组。所有行都是唯一的,这意味着每个组只有一行。如果要对y进行计数,请仅按X分组:

SELECT 
X.XNumber,
COUNT(*) AS XFreq
FROM XY
JOIN X ON X.XID = XY.XID
GROUP BY XNumber
ORDER BY Count(*) desc
-------------------------
XNumber             XFreq
GKHL                3
WRTOUBHNPSDFIVKM    2
FGBKMOSEKFVKF       2
GBOGJMBNNUTHDFHJG   1
YITORKRKG           1
DELKMDTKJGBT        1

您不需要与Y JOIN,因为您没有使用该表中的任何信息。

您也可以使用ROW_NUMBER()添加排名,以按计数对行进行排名:

SELECT 
X.XNumber,
COUNT(*) AS XFreq,
ROW_NUMBER() OVER (ORDER BY COUNT(*) desc) as RN
FROM XY
JOIN X ON X.XID = XY.XID
GROUP BY XNumber
ORDER BY Count(*) desc
-------------------------
XNumber             XFreq   RN
GKHL                3       1
WRTOUBHNPSDFIVKM    2       2
FGBKMOSEKFVKF       2       3
GBOGJMBNNUTHDFHJG   1       4
YITORKRKG           1       5
DELKMDTKJGBT        1       6

你不能在WHERE子句中使用排序函数。要按行号列(RN)进行过滤,需要将原始查询包装为CTE或子查询

;with ranks as (
SELECT 
X.XNumber,
COUNT(*) AS XFreq,
ROW_NUMBER() OVER (ORDER BY COUNT(*) desc) as RN
FROM XY
JOIN X ON X.XID = XY.XID
GROUP BY XNumber
)
SELECT * 
FROM ranks 
WHERE RN=1
-------------------
XNumber XFreq   RN
GKHL    3       1

或等价的

SELECT * 
FROM (
SELECT 
X.XNumber,
COUNT(*) AS XFreq,
ROW_NUMBER() OVER (ORDER BY COUNT(*) desc) as RN
FROM XY
JOIN X ON X.XID = XY.XID
GROUP BY XNumber
) ranks
WHERE RN=1

您可以将其简化一些。您不需要加入Y,也不需要使用ROW_NUMBER。您可以使用TOP (1)ORDER BY代替。

先取GROUP BY XNumber,再取COUNT(*),取最高值

SELECT TOP (1)
X.XNumber,
COUNT(*) AS YFreq
FROM XY
JOIN X ON X.XID = XY.XID
GROUP BY X.XNumber
ORDER BY
YFreq DESC;

,db&lt的在小提琴

最新更新