I have strings stored in a table, for example:
1. "the quick brown fox"
2. "the quick brown fox jumps"
3. "the quick brown fox jumps over the lazy dog"
4. "the quick potato does nothing"
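Given three input words, I want to return an entry when all three words are found in its string. I wrote it like this:
WHERE word1 IN stringfield AND word2 IN stringfield AND word3 IN stringfield
However, I would like to optionally supply additional input words, so that the results are ranked by how many of the input words each entry contains. Every returned entry must still match at least three words.
For example, the input: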
"the", "quick", "brown", "fox", "jumps", "over"
returns:
3.
2.
1.
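because 3 has the most matches, then 2, then 1. Entry 4 is not selected because it does not contain at least three matching words.
Is this possible? Is this the fastest way, or would a join table be better? If so, how would that work? Thanks a lot.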
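First, you would be better off using MySQL's full-text search functionality, which is described in the MySQL reference manual.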
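A rough sketch of what the full-text route could look like follows. The table name `phrases_ft` and the index name `ft_stringfield` are assumptions made for illustration; only `stringfield` comes from the question. Note that natural-language mode ranks by a relevance score rather than by a plain count of matched words, and that very common words such as "the" are normally ignored as stopwords, so this does not by itself enforce the "at least three words" rule.

```sql
-- Hypothetical table for illustration; only the column name stringfield is from the question.
CREATE TABLE phrases_ft (
  id          INT NOT NULL PRIMARY KEY,
  stringfield VARCHAR(255) NOT NULL,
  FULLTEXT INDEX ft_stringfield (stringfield)
) ENGINE = InnoDB;

INSERT INTO phrases_ft (id, stringfield) VALUES
  (1, 'the quick brown fox'),
  (2, 'the quick brown fox jumps'),
  (3, 'the quick brown fox jumps over the lazy dog'),
  (4, 'the quick potato does nothing');

-- Rank rows by relevance to the search words (natural-language mode is the default).
SELECT id,
       stringfield,
       MATCH(stringfield) AGAINST('quick brown fox jumps over') AS score
FROM phrases_ft
WHERE MATCH(stringfield) AGAINST('quick brown fox jumps over')
ORDER BY score DESC;
```

On a table this small the scores may not be very informative, so it is worth trying on realistic data before committing to this approach.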
I assume you are constructing your WHERE clause dynamically, so with five words you could build:
WHERE stringfield LIKE '%word1%' OR
stringfield LIKE '%word2%' OR
stringfield LIKE '%word3%' OR
stringfield LIKE '%word4%' OR
stringfield LIKE '%word5%'
The IN operator does not do what you think it does at all.
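As a quick illustration of the difference, the following can be run on its own (the literals are taken from the question): `IN` tests whether the left-hand value equals one of the listed values, whereas `LIKE` with `%` wildcards tests for a substring.

```sql
-- IN compares the left operand for equality against each value in the list.
SELECT 'fox' IN ('the quick brown fox')   AS in_result,    -- 0: 'fox' does not equal the whole string
       'the quick brown fox' LIKE '%fox%' AS like_result;  -- 1: the string contains 'fox'
```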
Given that, the full query would also filter and order on the number of matches:
WHERE ((stringfield LIKE '%word1%') +
(stringfield LIKE '%word2%') +
(stringfield LIKE '%word3%') +
(stringfield LIKE '%word4%') +
(stringfield LIKE '%word5%')
) >= 3
ORDER BY ((stringfield LIKE '%word1%') +
(stringfield LIKE '%word2%') +
(stringfield LIKE '%word3%') +
(stringfield LIKE '%word4%') +
(stringfield LIKE '%word5%')
) DESC
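MySQL treats boolean expressions as integers in a numeric context, which makes counting the number of matches particularly easy. But, as I said, a full-text index is probably what you really need.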
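Putting the pieces together, here is a minimal end-to-end sketch using the sample rows and the six input words from the question. The table name `phrases` and the alias `matches` are assumptions for illustration. Computing the sum once in a derived table avoids repeating the expression in both the filter and the ordering.

```sql
-- Hypothetical table holding the sample rows from the question.
CREATE TABLE phrases (
  id          INT NOT NULL PRIMARY KEY,
  stringfield VARCHAR(255) NOT NULL
);

INSERT INTO phrases (id, stringfield) VALUES
  (1, 'the quick brown fox'),
  (2, 'the quick brown fox jumps'),
  (3, 'the quick brown fox jumps over the lazy dog'),
  (4, 'the quick potato does nothing');

-- Each LIKE evaluates to 1 or 0, so the sum is the number of input words found.
SELECT id, stringfield, matches
FROM (
  SELECT id,
         stringfield,
         (stringfield LIKE '%the%')   +
         (stringfield LIKE '%quick%') +
         (stringfield LIKE '%brown%') +
         (stringfield LIKE '%fox%')   +
         (stringfield LIKE '%jumps%') +
         (stringfield LIKE '%over%')  AS matches
  FROM phrases
) AS scored
WHERE matches >= 3        -- row 4 matches only 'the' and 'quick', so it is dropped
ORDER BY matches DESC;    -- rows come back as 3 (6 matches), 2 (5), 1 (4)
```

One caveat: `LIKE '%the%'` matches substrings, so it would also count a word such as "other". If whole-word matching matters, the patterns would need to account for word boundaries.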
When something starts to look hard, I make a point of getting it done... Here is a solution (the name of the database is 'Test'):
First, create this function:
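DELIMITER $$
CREATE DEFINER = 'root'@'%'
FUNCTION Test.countOccurence (LineTocheck nvarchar(255), criteriaToMatch nvarchar(15))
RETURNS int(11)
DETERMINISTIC -- same inputs always give the same result; avoids error 1418 when binary logging is enabled
BEGIN
-- Bytes removed by deleting every occurrence, divided by the bytes in one occurrence
DECLARE Occurences int DEFAULT 0;
SELECT
(LENGTH(LineTocheck) - LENGTH(REPLACE(LineTocheck, criteriaToMatch, ''))) / LENGTH(criteriaToMatch) INTO Occurences;
RETURN Occurences;
END
$$
DELIMITER ;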
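To see why that formula counts occurrences, the same expression can be evaluated directly on literals. This is only a sketch using a sentence and a word from the question; it does not require the function to be installed.

```sql
-- Removing every 'the' shortens the 43-byte sentence by 6 bytes;
-- 6 bytes / 3 bytes per occurrence = 2 occurrences.
SELECT (LENGTH('the quick brown fox jumps over the lazy dog')
        - LENGTH(REPLACE('the quick brown fox jumps over the lazy dog', 'the', '')))
       / LENGTH('the') AS occurrences;
```

Keep in mind that `REPLACE` is case-sensitive and works on substrings, so 'The' would not be counted while the 'the' inside 'other' would be.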
Second, run the query:
SELECT Generic.id
,Description
,SUM(countOccurence(Description, c.criteria))
FROM Generic
,criteria c
GROUP BY Description
,Generic.id
ORDER BY SUM(countOccurence(Description, c.criteria)) desc
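The query above ranks rows by total occurrences but does not enforce the "at least three words" rule from the question. One possible variant, offered as a sketch rather than as part of the original answer, counts how many distinct criteria appear at least once and filters on that. The alias `matched_words` is an assumption; the function and table names are the ones used in this answer.

```sql
-- Count each criterion at most once per row, then keep rows matching three or more.
SELECT g.id,
       g.Description,
       SUM(Test.countOccurence(g.Description, c.criteria) > 0) AS matched_words
FROM Test.Generic AS g
CROSS JOIN Test.criteria AS c          -- every row paired with every criterion
GROUP BY g.id, g.Description
HAVING matched_words >= 3
ORDER BY matched_words DESC;
```

With the three sample criteria in this answer ('fox', 'brown', 'over'), only the row containing all three words can pass the filter; supplying more criteria rows gives the ranking behaviour described in the question.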
Note: the table structures are as follows. For the criteria:
CREATE TABLE Test.criteria (
id int(11) NOT NULL AUTO_INCREMENT,
criteria varchar(15) NOT NULL,
PRIMARY KEY (id)
)
ENGINE = INNODB
AUTO_INCREMENT = 1
CHARACTER SET utf8
COLLATE utf8_general_ci;
For the table to be searched for occurrences:
CREATE TABLE Test.Generic (
id int(11) NOT NULL AUTO_INCREMENT,
Description varchar(255) NOT NULL,
PRIMARY KEY (id)
)
ENGINE = INNODB
AUTO_INCREMENT = 1
CHARACTER SET utf8
COLLATE utf8_general_ci;
SET NAMES 'utf8';
INSERT INTO Test.criteria(id, criteria) VALUES
(1, 'fox');
INSERT INTO Test.criteria(id, criteria) VALUES
(2, 'brown');
INSERT INTO Test.criteria(id, criteria) VALUES
(3, 'over');
SET NAMES 'utf8';
INSERT INTO Test.Generic(id, Description) VALUES
(1, 'the quick brown fox');
INSERT INTO Test.Generic(id, Description) VALUES
(2, 'the quick brown fox jumps');
INSERT INTO Test.Generic(id, Description) VALUES
(3, 'the quick brown fox jumps over the lazy dog');
INSERT INTO Test.Generic(id, Description) VALUES
(4, 'the quick potato does nothing');
Use dbForge MySQL Studio Express (free) to connect to MySQL and run the statements: http://www.devart.com/login.html?returnToUrl=/dbforge/mysql/studio/download.html%3Ffd=dbforgemysqlfree.exe
Test it and let me know.