Grouping results in Solr?

I have a Solr index with a schema like the following:

{
"responseHeader": {
"status": 0,
"QTime": 0,
"params": {
"q": "*:*",
"q.op": "OR",
"_": "1673422604341"
}
},
"response": {
"numFound": 1206,
"start": 0,
"numFoundExact": true,
"docs": [
{
"material_name_s":"MaterialName1",
"company_name_s": "CompanyName1",
"price_per_lb_value_f": 1.11,
"received_date_dt": "2015-01-01T00:00:00Z"
},
{
"material_name_s":"MaterialName1",
"company_name_s": "CompanyName2",
"price_per_lb_value_f": 2.22,
"received_date_dt": "2020-01-01T00:00:00Z"
},
{
"material_name_s":"MaterialName1",
"company_name_s": "CompanyName3",
"price_per_lb_value_f": 3.33,
"received_date_dt": "2021-01-01T00:00:00Z"
},
{
"material_name_s":"MaterialName2",
"company_name_s": "CompanyName1",
"price_per_lb_value_f": 4.44,
"received_date_dt": "2016-01-01T00:00:00Z"
},
{
"material_name_s":"MaterialName2",
"company_name_s": "CompanyName2",
"price_per_lb_value_f": 5.55,
"received_date_dt": "2021-01-01T00:00:00Z"
},
{
"material_name_s":"MaterialName2",
"company_name_s": "CompanyName3",
"price_per_lb_value_f": 6.66,
"received_date_dt": "2022-01-01T00:00:00Z"
}
]
}
}

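These are historical prices for different materials from different companies.

I want to get the lowest price_per_lb_value_f for each material_name_s within the last 2 years, so the result would look like this: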

{
"response": {
"numFound": 2,
"start": 0,
"numFoundExact": true,
"docs": [
{
"material_name_s":"MaterialName1",
"company_name_s": "CompanyName3",
"price_per_lb_value_f": 3.33,
"received_date_dt": "2021-01-01T00:00:00Z"
},
{
"material_name_s":"MaterialName2",
"company_name_s": "CompanyName2",
"price_per_lb_value_f": 5.55,
"received_date_dt": "2021-01-01T00:00:00Z"
}
]
}
}

Is this kind of grouping possible in Solr? I'm new to Solr, so any help would be appreciated.

Grouping is possible in Solr. You can get the result you want with either of the following queries:

  1. Field collapsing approach (recommended in your case): https://solr.apache.org/guide/solr/latest/query-guide/collapse-and-expand-results.html
http://localhost:8983/solr/test/select?indent=true&q.op=OR&q=received_date_dt:[NOW-3YEAR%20TO%20*]&fq={!collapse%20field=material_name_s%20min=price_per_lb_value_f}

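q: received_date_dt:[NOW-3YEAR TO *] // Range query that keeps only documents received within the last 3 years; otherwise I would not get the documents received on 2021-01-01
fq: {!collapse field=material_name_s min=price_per_lb_value_f} // Returns only one document out of all documents that share the same material_name_s value: the one with the lowest price_per_lb_value_f
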
  2. Grouping approach: https://solr.apache.org/guide/solr/latest/query-guide/result-grouping.html
http://localhost:8983/solr/test/select?indent=true&q.op=OR&q=received_date_dt:[NOW-3YEAR%20TO%20*]&group=true&group.field=material_name_s&group.sort=price_per_lb_value_f%20asc

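q: received_date_dt:[NOW-3YEAR TO *] // Same filter as before
group: true // Enables grouping
group.field: material_name_s // Groups by material_name_s
group.sort: price_per_lb_value_f asc // Sorts the documents within each group by price_per_lb_value_f in ascending order
group.limit: not specified, so the default of 1 applies // Number of results returned per group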
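
If you are calling Solr from code rather than pasting URLs into a browser, the parameters of both queries can be assembled and URL-encoded programmatically. The following is only a minimal sketch in Python: the host, port, and core name (`test`) are copied from the URLs in this answer and are assumptions about your setup, and the helper names `collapse_params`, `grouping_params`, and `build_url` are made up for illustration.

```python
from urllib.parse import urlencode

# Assumed endpoint, taken from the example URLs in the answer; adjust as needed.
SOLR_SELECT_URL = "http://localhost:8983/solr/test/select"

# Same date range as in the answer (last 3 years, open-ended upper bound).
DATE_FILTER = "received_date_dt:[NOW-3YEAR TO *]"


def collapse_params():
    """Collapse approach: one document per material, the one with the lowest price."""
    return {
        "q": DATE_FILTER,
        "fq": "{!collapse field=material_name_s min=price_per_lb_value_f}",
    }


def grouping_params():
    """Grouping approach: group by material, cheapest document first in each group."""
    return {
        "q": DATE_FILTER,
        "group": "true",
        "group.field": "material_name_s",
        "group.sort": "price_per_lb_value_f asc",
        # group.limit is left out on purpose so Solr's default of 1 applies.
    }


def build_url(params):
    """Percent-encode the parameters (braces, brackets, spaces) into a query string."""
    return SOLR_SELECT_URL + "?" + urlencode(params)


collapse_url = build_url(collapse_params())
grouping_url = build_url(grouping_params())
print(collapse_url)
print(grouping_url)
```

Either URL can then be fetched with any HTTP client. Note that the two approaches return differently shaped responses: the collapse query keeps the flat `response.docs` list shown in the question, while the grouping query returns its results under a `grouped` key with one entry per material_name_s.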
