How to find duplicate images



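Using the table below, I need to find records that have duplicate images. My query finds records, but the results are not accurate. Once the duplicate-image records are identified, we want only those whose FM_NAME_EN, LASTNAME_EN, RLN_TYPE, RLN_FM_NM_EN, RLN_L_NM_EN, GENDER, AGE and PHOTO are all identical. The table has the following columns: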

CREATE TABLE [dbo].[ERoll](
[AC_NO] [int] NULL,
[PART_NO] [int] NULL,
[SLNOINPART] [int] NULL,
[FM_NAME_EN] [nvarchar](250) NULL,
[LASTNAME_EN] [nvarchar](250) NULL,
[RLN_TYPE] [nvarchar](1) NULL,
[RLN_FM_NM_EN] [nvarchar](250) NULL,
[RLN_L_NM_EN] [nvarchar](250) NULL,
[EPIC_NO] [nvarchar](25) NULL,
[GENDER] [nvarchar](1) NULL,
[AGE] [int] NULL,
[PHOTO] [image] NULL
);

Some records in this table have duplicate images. How can we find them?

Only about 50% of the records returned by the following query are correct:

SELECT e.AC_NO, e.PART_NO, e.SLNOINPART, e.FM_NAME_EN, e.LASTNAME_EN, 
e.RLN_FM_NM_EN, e.RLN_L_NM_EN, e.RLN_TYPE, e.GENDER, e.AGE
FROM S21_EROLL_MANAGEMENT_PROD.dbo.ERoll e
JOIN (
select FM_NAME_EN, LASTNAME_EN, RLN_FM_NM_EN, RLN_L_NM_EN, RLN_TYPE, GENDER, AGE, hashbytes('md5', cast([PHOTO] as varbinary)) PHOTO ,count(*) PHOTO_COUNT
from S21_EROLL_MANAGEMENT_PROD.dbo.ERoll
group by FM_NAME_EN, LASTNAME_EN, RLN_FM_NM_EN, RLN_L_NM_EN, RLN_TYPE, GENDER, AGE, hashbytes('md5', cast([PHOTO] as varbinary)) 
having count(*) > 1
) d ON e.FM_NAME_EN = d.FM_NAME_EN and e.LASTNAME_EN = d.LASTNAME_EN and e.RLN_FM_NM_EN=d.RLN_FM_NM_EN 
and e.RLN_L_NM_EN = d.RLN_L_NM_EN and e.RLN_TYPE = d.RLN_TYPE and e.GENDER = d.GENDER and e.AGE = d.AGE
order by AC_NO, PART_NO, SLNOINPART

I think you could try finding duplicate photos by hashing them, like this:

SELECT HASHBYTES('SHA2_256', cast(photo as varbinary(max))) from dbo.ERoll

Using GROUP BY, HAVING and COUNT():

select hashbytes('md5', cast([imagecolumn] as varbinary(max))),count(*)
from yourtable
group by hashbytes('md5', cast([imagecolumn] as varbinary(max)))
having count(*) > 1
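
One likely reason the query in the question over-reports duplicates is that it casts [PHOTO] as varbinary with no length. In a CAST, varbinary without a length defaults to 30 bytes, so only the first 30 bytes of each image are hashed, and many image files share the same leading header bytes. The snippet below is a minimal sketch of that effect; the variables @a and @b are hypothetical stand-ins for two images that differ only after byte 30.

```sql
-- Two hypothetical 101-byte values that are identical except for the last byte.
DECLARE @a varbinary(max) = CAST(REPLICATE(CAST('A' AS varchar(max)), 100) + 'X' AS varbinary(max));
DECLARE @b varbinary(max) = CAST(REPLICATE(CAST('A' AS varchar(max)), 100) + 'Y' AS varbinary(max));

SELECT
    -- CAST(... AS varbinary) silently truncates to the default length of 30 bytes
    DATALENGTH(CAST(@a AS varbinary))      AS truncated_len,
    -- CAST(... AS varbinary(max)) keeps every byte
    DATALENGTH(CAST(@a AS varbinary(max))) AS full_len,
    -- After truncation both values hash the same, which looks like a duplicate
    CASE WHEN HASHBYTES('MD5', CAST(@a AS varbinary))
            = HASHBYTES('MD5', CAST(@b AS varbinary))
         THEN 1 ELSE 0 END                 AS same_hash_truncated,
    -- Hashing the full value tells them apart
    CASE WHEN HASHBYTES('MD5', CAST(@a AS varbinary(max)))
            = HASHBYTES('MD5', CAST(@b AS varbinary(max)))
         THEN 1 ELSE 0 END                 AS same_hash_full;
```

This is why the examples here cast to varbinary(max). Note that on SQL Server 2014 and earlier, HASHBYTES input is limited to 8000 bytes, so hashing whole images this way assumes SQL Server 2016 or later.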

Using a window function:

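select e.*
from (select e.*,
             -- count the rows that share every matching column and the same photo hash
             count(*) over (partition by FM_NAME_EN, LASTNAME_EN, RLN_FM_NM_EN, RLN_L_NM_EN, RLN_TYPE, GENDER, AGE,
                                         -- varbinary(max): a bare varbinary cast keeps only the first 30 bytes
                                         hashbytes('md5', cast([PHOTO] as varbinary(max)))) as cnt
      from S21_EROLL_MANAGEMENT_PROD.dbo.ERoll e
     ) e
where cnt > 1;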
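
A variation on the window-function approach, offered as a sketch rather than a definitive answer: compute the photo hash once in a CTE, then count rows per full key. It assumes SQL Server 2016 or later (so HASHBYTES can take inputs over 8000 bytes) and swaps in SHA2_256 instead of MD5. The names hashed, counted and PHOTO_HASH are illustrative, and the PHOTO IS NOT NULL filter is an assumption that rows without a photo should not count as duplicates of each other.

```sql
WITH hashed AS (
    SELECT e.AC_NO, e.PART_NO, e.SLNOINPART,
           e.FM_NAME_EN, e.LASTNAME_EN, e.RLN_TYPE,
           e.RLN_FM_NM_EN, e.RLN_L_NM_EN, e.GENDER, e.AGE,
           -- Hash the whole image; varbinary(max) avoids the 30-byte default
           HASHBYTES('SHA2_256', CAST(e.PHOTO AS varbinary(max))) AS PHOTO_HASH
    FROM dbo.ERoll AS e
    WHERE e.PHOTO IS NOT NULL          -- assumption: ignore rows with no photo
),
counted AS (
    SELECT h.*,
           -- Rows are duplicates only if every column, including the hash, matches
           COUNT(*) OVER (PARTITION BY h.FM_NAME_EN, h.LASTNAME_EN, h.RLN_TYPE,
                                       h.RLN_FM_NM_EN, h.RLN_L_NM_EN,
                                       h.GENDER, h.AGE, h.PHOTO_HASH) AS cnt
    FROM hashed AS h
)
SELECT AC_NO, PART_NO, SLNOINPART,
       FM_NAME_EN, LASTNAME_EN, RLN_TYPE,
       RLN_FM_NM_EN, RLN_L_NM_EN, GENDER, AGE
FROM counted
WHERE cnt > 1
ORDER BY AC_NO, PART_NO, SLNOINPART;
```

Unlike the join in the question, which matches on the name, relation, gender and age columns but not on the photo hash, this keeps the hash in the grouping key. PARTITION BY also puts NULL values in the same group, whereas an equality join drops rows where any compared column is NULL.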
