Forcing proximity search for multi-word wordforms



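I use proximity search with Sphinx. For example, Twain NEAR/1 Mark will return

Mark Twain

Twain, Mark

But say I have a wordform such as:

weekday > week day

How can I make any given search use proximity, NEAR/3 (or NEAR/X), so that it finds

weekday

day of week

In this particular case there are other ways to skin the cat, but in general I'm looking for a way to keep a multi-word mapping from being expanded to plain 'Word1 Word2', i.e. 'Week Day', because otherwise I get documents like

'I worked for one entire day before realizing it was going to take a full week'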

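There is no easy way to do this out of the box. You could handle it in your application, so that it rewrites each word in the search query as "word"~N, or, even better, does so only for the words that Sphinx maps through wordforms. Here is an example: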

mysql> select *, weight() from idx_min where match('weekday');
+------+-------------------------------------------------------------------------------+------+----------+
| id   | doc                                                                           | a    | weight() |
+------+-------------------------------------------------------------------------------+------+----------+
|    1 | Weekday                                                                       |    1 |     2319 |
|    2 | day of week                                                                   |    2 |     1319 |
|    3 | I worked for one entire day before realizing it was going to take a full week |    3 |     1319 |
+------+-------------------------------------------------------------------------------+------+----------+
3 rows in set (0.00 sec)
mysql> select *, weight() from idx_min where match('"weekday"');
+------+---------+------+----------+
| id   | doc     | a    | weight() |
+------+---------+------+----------+
|    1 | Weekday |    1 |     2319 |
+------+---------+------+----------+
1 row in set (0.00 sec)
mysql> select *, weight() from idx_min where match('"weekday"~2');
+------+-------------+------+----------+
| id   | doc         | a    | weight() |
+------+-------------+------+----------+
|    1 | Weekday     |    1 |     2319 |
|    2 | day of week |    2 |     1319 |
+------+-------------+------+----------+
2 rows in set (0.00 sec)
mysql> select *, weight() from idx_min where match('"entire"~2 "day"~2');
+------+-------------------------------------------------------------------------------+------+----------+
| id   | doc                                                                           | a    | weight() |
+------+-------------------------------------------------------------------------------+------+----------+
|    3 | I worked for one entire day before realizing it was going to take a full week |    3 |     1500 |
+------+-------------------------------------------------------------------------------+------+----------+
1 row in set (0.00 sec)
mysql> select *, weight() from idx_min where match('weekday full week');
+------+-------------------------------------------------------------------------------+------+----------+
| id   | doc                                                                           | a    | weight() |
+------+-------------------------------------------------------------------------------+------+----------+
|    3 | I worked for one entire day before realizing it was going to take a full week |    3 |     2439 |
+------+-------------------------------------------------------------------------------+------+----------+
1 row in set (0.01 sec)
mysql> select *, weight() from idx_min where match('"weekday"~2 full week');
Empty set (0.00 sec)

The last query is the best approach, but you have to:

1) Parse your query, e.g. like this:

mysql> call keywords('weekday full week', 'idx_min');
+------+-----------+------------+
| qpos | tokenized | normalized |
+------+-----------+------------+
| 1    | weekday   | week       |
| 2    | weekday   | day        |
| 3    | full      | full       |
| 4    | week      | week       |
+------+-----------+------------+
4 rows in set (0.00 sec)

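If the same tokenized word yields 2 different normalized words, that can be the signal for your application to wrap the tokenized word in "word"~N.

2) Run the query. In this case, "weekday"~2 full week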
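
The two steps above could be sketched on the application side roughly as follows. This is only a minimal illustration, not a tested integration: the function name `rewrite_query` and the `distance` parameter are made up for this example, and it assumes you have already fetched the (tokenized, normalized) pairs from CALL KEYWORDS with whatever MySQL-protocol client you use. It also does no escaping of special query characters.

```python
def rewrite_query(keyword_rows, distance=2):
    """Rebuild a query string from CALL KEYWORDS rows.

    keyword_rows: (tokenized, normalized) pairs in query order.
    A tokenized word that expands to several different normalized
    words (a multi-word wordform) is wrapped as "word"~distance.
    """
    # Group consecutive rows that share the same tokenized word.
    groups = []
    for tokenized, normalized in keyword_rows:
        if groups and groups[-1][0] == tokenized:
            groups[-1][1].append(normalized)
        else:
            groups.append((tokenized, [normalized]))

    parts = []
    for tokenized, normalized in groups:
        if len(set(normalized)) > 1:
            # One token, several different normalized words:
            # treat it as a multi-word wordform.
            parts.append('"%s"~%d' % (tokenized, distance))
        else:
            # Ordinary word (possibly typed several times in a row).
            parts.extend([tokenized] * len(normalized))
    return " ".join(parts)


# Rows taken from the CALL KEYWORDS output shown above.
rows = [
    ("weekday", "week"),
    ("weekday", "day"),
    ("full", "full"),
    ("week", "week"),
]
rewritten = rewrite_query(rows)
print(rewritten)
```

The resulting string is what you would then pass to MATCH() in step 2.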
