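I followed the steps described at: http://wiki.apache.org/solr/DataImportHandler
I also tried other solutions from Stack Overflow, but it still doesn't work.
The problem: even though I have configured the delta import handler, every run indexes all the records in the DB. I have 30 records in the DB, and each time I run delta-import it indexes all 30 of them. I want only the records that were changed or deleted to be indexed.
Any quick help or pointers toward resolving this would be appreciated.
data-config.xml: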
<dataConfig>
<dataSource type="JdbcDataSource" name="ds-books" driver="com.mysql.jdbc.Driver" url="jdbc:mysql://localhost/test" user="root" password=""/>
<document name="books">
<entity name="books" pk="id" query="select * from books" deltaImportQuery="SELECT * FROM books WHERE id = '${dataimporter.delta.id}'" deltaQuery="SELECT id FROM books WHERE last_modified > '${dataimporter.last_index_time}'" >
<field column="id" name="id" indexed="true" stored="true"/>
<field column="NAME" name="name" />
<field column="PRICE" name="price" />
<field column="last_modified" name="last_modified" />
</entity>
</document>
</dataConfig>
The command I use to run it is:
http://localhost:8983/solr/dataimport?command=delta-import
The dataimport.properties file:
#Fri May 10 17:13:18 IST 2013
last_index_time=2013-05-10 17:13:18
books.last_index_time=2013-05-10 17:13:18
dataimporter.last_index_time=2013-05-10 17:11:42
The XML response I get is as follows:
<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">4</int>
</lst>
<lst name="initArgs">
<lst name="defaults">
<str name="config">data-config.xml</str>
</lst>
</lst>
<str name="command">delta-import</str>
<str name="status">idle</str>
<str name="importResponse"/>
<lst name="statusMessages">
<str name="Total Requests made to DataSource">1</str>
<str name="Total Rows Fetched">30</str>
<str name="Total Documents Skipped">0</str>
<str name="Delta Dump started">2013-05-10 17:13:17</str>
<str name="Identifying Delta">2013-05-10 17:13:17</str>
<str name="Deltas Obtained">2013-05-10 17:13:17</str>
<str name="Building documents">2013-05-10 17:13:17</str>
<str name="Total Changed Documents">30</str>
<str name="">Indexing completed. Added/Updated: 30 documents. Deleted 0 documents.</str>
<str name="Committed">2013-05-10 17:13:17</str>
<str name="Total Documents Processed">30</str>
<str name="Time taken">0:0:0.303</str></lst>
<str name="WARNING">This response format is experimental. It is likely to change in the future.</str>
</response>
In the log file I get the following:
INFO: Read dataimport.properties
May 10, 2013 5:13:18 PM org.apache.solr.handler.dataimport.DocBuilder doDelta
INFO: Starting delta collection.
May 10, 2013 5:13:18 PM org.apache.solr.handler.dataimport.DocBuilder collectDelta
INFO: Running ModifiedRowKey() for Entity: books
May 10, 2013 5:13:18 PM org.apache.solr.handler.dataimport.JdbcDataSource$1 call
INFO: Creating a connection for entity books with URL: jdbc:mysql://localhost/test
May 10, 2013 5:13:18 PM org.apache.solr.handler.dataimport.JdbcDataSource$1 call
INFO: Time taken for getConnection(): 9
May 10, 2013 5:13:18 PM org.apache.solr.handler.dataimport.DocBuilder collectDelta
INFO: Completed ModifiedRowKey for Entity: books rows obtained : 30
May 10, 2013 5:13:18 PM org.apache.solr.handler.dataimport.DocBuilder collectDelta
INFO: Completed DeletedRowKey for Entity: books rows obtained : 0
May 10, 2013 5:13:18 PM org.apache.solr.handler.dataimport.DocBuilder collectDelta
INFO: Completed parentDeltaQuery for Entity: books
May 10, 2013 5:13:18 PM org.apache.solr.handler.dataimport.DocBuilder doDelta
INFO: Delta Import completed successfully
May 10, 2013 5:13:18 PM org.apache.solr.handler.dataimport.DocBuilder finish
INFO: Import completed successfully
May 10, 2013 5:13:18 PM org.apache.solr.update.DirectUpdateHandler2 commit
INFO: start commit{flags=0,_version_=0,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=false}
May 10, 2013 5:13:18 PM org.apache.solr.core.SolrDeletionPolicy onCommit
INFO: SolrDeletionPolicy.onCommit: commits:num=2
Changing the following values in data-config.xml solved the problem:
${dih.last_index_time} instead of ${dataimporter.last_index_time}
${dih.delta.id} instead of ${dataimporter.delta.id}
I am using Solr 4.0.
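For reference, here is a sketch of how the data-config.xml from the question might look with those two substitutions applied. It assumes the same data source, table, and column names as in the question and changes only the variable prefixes in deltaImportQuery and deltaQuery. This is the fix reported for Solr 4.0; other versions may behave differently.

```xml
<dataConfig>
  <dataSource type="JdbcDataSource" name="ds-books" driver="com.mysql.jdbc.Driver"
              url="jdbc:mysql://localhost/test" user="root" password=""/>
  <document name="books">
    <!-- Only the two ${...} variables differ from the config in the question:
         the dih. prefix replaces the dataimporter. prefix. -->
    <entity name="books" pk="id"
            query="select * from books"
            deltaImportQuery="SELECT * FROM books WHERE id = '${dih.delta.id}'"
            deltaQuery="SELECT id FROM books WHERE last_modified > '${dih.last_index_time}'">
      <field column="id" name="id" indexed="true" stored="true"/>
      <field column="NAME" name="name"/>
      <field column="PRICE" name="price"/>
      <field column="last_modified" name="last_modified"/>
    </entity>
  </document>
</dataConfig>
```

If the substitution takes effect, a delta-import run with no changed rows should report 0 rows obtained on the "Completed ModifiedRowKey" log line instead of 30.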
Passing the clean parameter as false when you invoke delta-import should help you.
For example: domainname:8983/solr/dataimport?command=delta-import&clean=false