Searching a multivalued field in Solr is not working



I have a multivalued field in Solr that holds user names like this:

{
    "counsel_for_department": [
      "mr  a g  srivastava with mr xyz doe",
      " mr  johh david and mr john deo",
      " mr  n p  smith and mr  ng smith"
    ]
  },

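When I query with fq=counsel_for_department:a g srivastava, no results are returned. I am using the standard tokenizer for this field.

The field type is text_general.

Please let me know if multivalued fields need a different configuration.

I get the following JSON object: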

{
  "responseHeader": {
    "status": 0,
    "QTime": 20,
    "params": {
      "q": "*:*",
      "indent": "true",
      "fl": "counsel_for_department",
      "fq": [
        "doc_type:source_analysis",
        "counsel_for_department:*g*c*Srivastava*"
      ],
      "rows": "100",
      "wt": "json",
      "debugQuery": "true",
      "_": "1459351342391"
    }
  },
  "response": {
    "numFound": 0,
    "start": 0,
    "docs": []
  },
  "debug": {
    "rawquerystring": "*:*",
    "querystring": "*:*",
    "parsedquery": "MatchAllDocsQuery(*:*)",
    "parsedquery_toString": "*:*",
    "explain": {},
    "QParser": "LuceneQParser",
    "filter_queries": [
      "doc_type:source_analysis",
      "counsel_for_department:*g*c*Srivastava*"
    ],
    "parsed_filter_queries": [
      "doc_type:source_analysis",
      "counsel_for_department:*g*c*srivastava*"
    ],
    "timing": {
      "time": 20,
      "prepare": {
        "time": 16,
        "query": {
          "time": 16
        },
        "facet": {
          "time": 0
        },
        "facet_module": {
          "time": 0
        },
        "mlt": {
          "time": 0
        },
        "highlight": {
          "time": 0
        },
        "stats": {
          "time": 0
        },
        "expand": {
          "time": 0
        },
        "debug": {
          "time": 0
        }
      },
      "process": {
        "time": 3,
        "query": {
          "time": 3
        },
        "facet": {
          "time": 0
        },
        "facet_module": {
          "time": 0
        },
        "mlt": {
          "time": 0
        },
        "highlight": {
          "time": 0
        },
        "stats": {
          "time": 0
        },
        "expand": {
          "time": 0
        },
        "debug": {
          "time": 0
        }
      }
    }
  }
}

Thanks in advance.

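Wildcard queries are not analyzed, so in most cases it is better to stay away from them and use term matching instead. That way you can match documents regardless of the order of the terms, so "john oliver" will also match "oliver john", with "john oliver" boosted because it matches as a phrase.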

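To expand: a wildcard can only match against the actual tokens in the underlying index. If you have a tokenizer and filter chain, which is usually the case, the match fails as soon as you add a space into the mix.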
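The point about tokens can be illustrated outside Solr. The following is only a rough sketch in Python: `tokenize` is a hypothetical stand-in for a standard tokenizer plus lowercasing (not Solr's actual analysis chain), and `fnmatchcase` stands in for wildcard matching against a single token.

```python
import re
from fnmatch import fnmatchcase


def tokenize(value):
    # Rough approximation of tokenizing + lowercasing:
    # split on anything that is not a letter or digit.
    return [t for t in re.split(r"[^0-9a-z]+", value.lower()) if t]


def wildcard_matches(pattern, values):
    # A wildcard term is compared against single indexed tokens,
    # never against the whole original string.
    pattern = pattern.lower()
    return any(
        fnmatchcase(token, pattern)
        for value in values
        for token in tokenize(value)
    )


values = [
    "mr  a g  srivastava with mr xyz doe",
    " mr  johh david and mr john deo",
    " mr  n p  smith and mr  ng smith",
]

print(tokenize(values[0]))
print(wildcard_matches("*g*c*Srivastava*", values))  # pattern from the question
print(wildcard_matches("*a g srivastava*", values))  # pattern containing spaces
print(wildcard_matches("srivastava*", values))       # pattern within one token
```

In this sketch only the last pattern finds a match, because no single token contains a space or the sequence "g ... c ... srivastava".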

Drop the wildcards and use proper matching (this is what Solr is really good at).
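As a minimal sketch of what "proper matching" could look like, the Python below builds the filter query strings. It assumes the standard Lucene query syntax, where a field prefix applies only to the term directly after it, so the terms are grouped in parentheses or quoted as a phrase. The helper names `terms_filter` and `phrase_filter` are made up for illustration, and the choice of `AND` between terms is an assumption about the desired behavior.

```python
from urllib.parse import urlencode

FIELD = "counsel_for_department"


def terms_filter(field, text):
    # Group the terms so every one of them is searched in `field`;
    # without the parentheses only the first term is tied to the field.
    terms = text.split()
    return f"{field}:({' AND '.join(terms)})"


def phrase_filter(field, text):
    # Quote the text to require the terms next to each other, in order.
    phrase = " ".join(text.split())
    return f'{field}:"{phrase}"'


# Request parameters modelled on the ones shown in the question.
params = [
    ("q", "*:*"),
    ("fq", "doc_type:source_analysis"),
    ("fq", terms_filter(FIELD, "a g srivastava")),
    ("wt", "json"),
]

print(terms_filter(FIELD, "a g srivastava"))
print(phrase_filter(FIELD, "a g  srivastava"))
print(urlencode(params))  # query string to append to the select URL
```

The grouped form matches the terms in any order, while the quoted form is stricter; which one fits depends on how exact the name match needs to be.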

For plain-text search, you should go with:

fq=counsel_for_department:*a g  srivastava* 
//OR you can also use : 
fq=counsel_for_department:*a*g*srivastava*

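Start with this. It is, however, a relatively expensive (slow) query in Solr. If it turns out to take too much time, an improvement is to convert the multivalued field into a single merged field and query that field instead of the multivalued one.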
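One way the merge could be done is on the client side, before the document is sent for indexing. This is only a sketch: the target field name `counsel_for_department_merged` and the separator are hypothetical, and the field would still need to be defined in the schema with a suitable type.

```python
def add_merged_field(doc, source="counsel_for_department",
                     target="counsel_for_department_merged", sep=" | "):
    # Return a copy of the document with all values of `source`
    # collapsed into one whitespace-normalized string under `target`.
    values = doc.get(source, [])
    cleaned = [" ".join(value.split()) for value in values]
    merged = dict(doc)
    merged[target] = sep.join(cleaned)
    return merged


doc = {
    "counsel_for_department": [
        "mr  a g  srivastava with mr xyz doe",
        " mr  johh david and mr john deo",
        " mr  n p  smith and mr  ng smith",
    ]
}

print(add_merged_field(doc)["counsel_for_department_merged"])
```

The original multivalued field is left untouched, so existing queries against it keep working while the merged field is evaluated.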
