Solr: copy data from a "crawler" core to a "search" core

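We would like a Solr 4.9 setup in which a very simple crawler wipes and reloads a "crawler" core, then triggers a copy of the data over to the "search" core once the crawl is finished. The reason is that our crawler is very simple and does not track documents in a way that would support updates and deletes. Basically, the crawler will wipe the entire "crawler" core, work through about 50,000 documents (committing every 1,000 or so), and then trigger something that copies the data over to the other "search" core.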

Assuming we would have to restart the "search" core, how could this be done from the command line or from code?

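Create a third core as a replica of the search core. Then use the mergeindexes command in CoreAdmin to merge the two existing indexes into that third core. Once the merge is complete, swap the third core with the old search core. Then unload the swapped-out core (with deleteInstanceDir=true if you are okay with permanently deleting the old data).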

Something like:

http://localhost:8983/solr/admin/cores?action=CREATE&name=core0&instanceDir=path_to_instance_directory&config=config_file_name.xml&schema=schema_file_name.xml&dataDir=data
http://localhost:8983/solr/admin/cores?action=mergeindexes&core=core0&indexDir=/opt/solr/crawl/data/index&indexDir=/opt/solr/index/data/index
http://localhost:8983/solr/admin/cores?action=SWAP&core=search&other=core0
http://localhost:8983/solr/admin/cores?action=UNLOAD&core=core0
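
Since the question asks how to drive this from code, here is a minimal Python sketch of the same four CoreAdmin calls. It only uses the actions and parameters shown in the URLs above; the helper names (admin_url, refresh_urls, run) are made up for illustration, and the host, core names, and index paths are simply the ones from this answer, so adjust them to your installation. The CREATE call here passes only name and instanceDir; add config, schema, and dataDir as in the first URL if you need them.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Same host and port as in the URLs above; change for your installation.
SOLR_ADMIN = "http://localhost:8983/solr/admin/cores"


def admin_url(action, **params):
    """Build a CoreAdmin URL. A list value becomes a repeated parameter."""
    query = [("action", action)]
    for key, value in params.items():
        values = value if isinstance(value, (list, tuple)) else [value]
        query.extend((key, v) for v in values)
    # safe="/" keeps filesystem paths readable instead of encoding them as %2F.
    return SOLR_ADMIN + "?" + urlencode(query, safe="/")


def refresh_urls(tmp_core, instance_dir, index_dirs, live_core="search"):
    """Return the four CoreAdmin URLs in the order they should be called."""
    return [
        admin_url("CREATE", name=tmp_core, instanceDir=instance_dir),
        admin_url("mergeindexes", core=tmp_core, indexDir=list(index_dirs)),
        admin_url("SWAP", core=live_core, other=tmp_core),
        # After the swap, tmp_core is the name that holds the old search data.
        admin_url("UNLOAD", core=tmp_core),
    ]


def run(urls):
    """Call each URL in turn; urlopen raises HTTPError on a 4xx/5xx reply."""
    for url in urls:
        with urlopen(url) as response:
            response.read()
```

The crawler could call something like run(refresh_urls("core0", "path_to_instance_directory", ["/opt/solr/crawl/data/index", "/opt/solr/index/data/index"])) once its final commit has finished. Because run stops at the first failed request, a failed merge should leave the live search core untouched.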
