"Currently in a locked region": Fuseki + full-text search + inference



I recently started using full-text search in a Fuseki 0.2.8 snapshot.

I have an InfModel backed by a TDB dataset, to which I have added a Lucene text index. I query it like this:

prefix text: <http://jena.apache.org/text#>
select distinct ?s where { ?s text:query ('stu' 16) }

This works well until I send two or more queries to Fuseki at the same time; then I occasionally get:

Error 500: Currently in a locked region Fuseki - version 0.2.8-SNAPSHOT (Build date: 20130820-0755). 

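I have load-tested the endpoint with 10 concurrent users sending queries at random intervals. Over a two-minute period, about 30% of the queries returned the 500 error above.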
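A test of this shape can be sketched in a few lines of Java. This is only an illustration, not the tool used above: `ConcurrentQueryProbe`, `run`, and the `sendQuery` callback are hypothetical names, and the callback is stubbed here so the sketch runs without a server. In a real run it would issue an HTTP GET against the query endpoint and return the response status code.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.IntSupplier;

// Hypothetical load-test harness; not part of Fuseki or Jena.
class ConcurrentQueryProbe {

    /** Number of requests sent and number of HTTP 500 responses seen. */
    static final class Result {
        final int total;
        final int serverErrors;

        Result(int total, int serverErrors) {
            this.total = total;
            this.serverErrors = serverErrors;
        }

        double errorRate() {
            return total == 0 ? 0.0 : (double) serverErrors / total;
        }
    }

    /**
     * Runs `users` worker threads. Each sends `requestsPerUser` queries through
     * `sendQuery` (which returns an HTTP status code) and pauses for a random
     * interval of up to `maxPauseMillis` between requests.
     */
    static Result run(int users, int requestsPerUser, int maxPauseMillis, IntSupplier sendQuery) {
        ExecutorService pool = Executors.newFixedThreadPool(users);
        AtomicInteger total = new AtomicInteger();
        AtomicInteger errors = new AtomicInteger();
        for (int u = 0; u < users; u++) {
            pool.submit(() -> {
                for (int i = 0; i < requestsPerUser; i++) {
                    int status = sendQuery.getAsInt();
                    total.incrementAndGet();
                    if (status == 500) {
                        errors.incrementAndGet();
                    }
                    try {
                        Thread.sleep(ThreadLocalRandom.current().nextInt(maxPauseMillis + 1));
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                        return;
                    }
                }
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(5, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new Result(total.get(), errors.get());
    }

    public static void main(String[] args) {
        // Stub standing in for a real HTTP request: every third call reports 500.
        AtomicInteger calls = new AtomicInteger();
        Result r = run(10, 5, 20, () -> calls.incrementAndGet() % 3 == 0 ? 500 : 200);
        System.out.println(r.serverErrors + " of " + r.total + " requests returned 500");
    }
}
```

Keeping the request behind a callback means the same harness can be pointed at the inference-backed and the plain TDB configuration in turn, so the error rates can be compared directly.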

I also tried disabling inference by replacing this part (the full assembler file is below):

<#dataset_fulltext> rdf:type     text:TextDataset ;
  text:dataset   <#dataset_inf> ;
  ##text:dataset   <#tdbDataset> ;
  text:index     <#indexLucene> .

with this:

<#dataset_fulltext> rdf:type     text:TextDataset ;
  ##text:dataset   <#dataset_inf> ;
  text:dataset   <#tdbDataset> ;
  text:index     <#indexLucene> .

No exceptions are thrown when the TextDataset uses #tdbDataset instead of #dataset_inf.

Is there something wrong with my setup, or is this a bug in Fuseki?

Here is my current assembler file:

@prefix :        <#> .
@prefix fuseki:  <http://jena.apache.org/fuseki#> .
@prefix rdf:     <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#> .
@prefix tdb:     <http://jena.hpl.hp.com/2008/tdb#> .
@prefix ja:      <http://jena.hpl.hp.com/2005/11/Assembler#> .
@prefix text:    <http://jena.apache.org/text#> .
@prefix dc:      <http://purl.org/dc/terms/> .
[] rdf:type fuseki:Server ;
  # Timeout - server-wide default: milliseconds.
  # Format 1: "1000" -- 1 second timeout
  # Format 2: "10000,60000" -- 10s timeout to first result, then 60s timeout for the rest of the query.
  # See java doc for ARQ.queryTimeout
  ja:context [ ja:cxtName "arq:queryTimeout" ;  ja:cxtValue "12000,50000" ] ;
  fuseki:services (
    <#service1>
  ) .
# Custom code.
[] ja:loadClass "com.hp.hpl.jena.tdb.TDB" .
# TDB
tdb:DatasetTDB  rdfs:subClassOf  ja:RDFDataset .
tdb:GraphTDB    rdfs:subClassOf  ja:Model .
## Initialize text query
[] ja:loadClass       "org.apache.jena.query.text.TextQuery" .
# A TextDataset is a regular dataset with a text index.
text:TextDataset      rdfs:subClassOf   ja:RDFDataset .
# Lucene index
text:TextIndexLucene  rdfs:subClassOf   text:TextIndex .
## ---------------------------------------------------------------
## Service with only SPARQL query on an inference model.
## Inference model base data in TDB.
<#service1>  rdf:type fuseki:Service ;
  rdfs:label               "TDB/text service" ;
  fuseki:name              "dataset" ;         # http://host/dataset
  fuseki:serviceQuery      "query" ;
  fuseki:serviceUpdate     "update" ;
  fuseki:serviceUpload     "upload" ;
  fuseki:serviceReadWriteGraphStore "data" ;
  fuseki:serviceReadGraphStore "get" ;
  fuseki:dataset           <#dataset_fulltext> ;
    .
<#dataset_inf> rdf:type ja:RDFDataset ;
  ja:defaultGraph       <#model_inf> .
<#model_inf> rdf:type ja:Model ;
  ja:baseModel <#tdbGraph> ;
  ja:reasoner [ ja:reasonerURL <http://jena.hpl.hp.com/2003/OWLMicroFBRuleReasoner> ] .
<#tdbDataset> rdf:type tdb:DatasetTDB ;
  tdb:location "Data" .
<#tdbGraph> rdf:type tdb:GraphTDB ;
  tdb:dataset <#tdbDataset> .
# Dataset with full text index.
<#dataset_fulltext> rdf:type     text:TextDataset ;
  text:dataset   <#dataset_inf> ;
  ##text:dataset   <#tdbDataset> ;
  text:index     <#indexLucene> .
# Text index description
<#indexLucene> a text:TextIndexLucene ;
  text:directory <file:Lucene> ;
  ##text:directory "mem" ;
  text:entityMap <#entMap> ;
  .
# Mapping in the index
# URI stored in field "uri"
# dc:title and dc:description are mapped to field "text"
<#entMap> a text:EntityMap ;
  text:entityField      "uri" ;
  text:defaultField     "text" ;
  text:map (
    [ text:field "text" ; text:predicate dc:title ]
    [ text:field "text" ; text:predicate dc:description ]
  ) .

Here is the full stack trace of one of the exceptions from the Fuseki log:

16:27:01 WARN  Fuseki               :: [2484] RC = 500 : Currently in a locked region
com.hp.hpl.jena.sparql.core.DatasetGraphWithLock$JenaLockException: Currently in a locked region
    at com.hp.hpl.jena.sparql.core.DatasetGraphWithLock.checkNotActive(DatasetGraphWithLock.java:72)
    at com.hp.hpl.jena.sparql.core.DatasetGraphTrackActive.begin(DatasetGraphTrackActive.java:44)
    at org.apache.jena.query.text.DatasetGraphText.begin(DatasetGraphText.java:102)
    at org.apache.jena.fuseki.servlets.HttpAction.beginRead(HttpAction.java:117)
    at org.apache.jena.fuseki.servlets.SPARQL_Query.execute(SPARQL_Query.java:236)
    at org.apache.jena.fuseki.servlets.SPARQL_Query.executeWithParameter(SPARQL_Query.java:195)
    at org.apache.jena.fuseki.servlets.SPARQL_Query.perform(SPARQL_Query.java:80)
    at org.apache.jena.fuseki.servlets.SPARQL_ServletBase.executeLifecycle(SPARQL_ServletBase.java:185)
    at org.apache.jena.fuseki.servlets.SPARQL_ServletBase.executeAction(SPARQL_ServletBase.java:166)
    at org.apache.jena.fuseki.servlets.SPARQL_ServletBase.execCommonWorker(SPARQL_ServletBase.java:154)
    at org.apache.jena.fuseki.servlets.SPARQL_ServletBase.doCommon(SPARQL_ServletBase.java:73)
    at org.apache.jena.fuseki.servlets.SPARQL_Query.doGet(SPARQL_Query.java:61)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:735)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:848)
    at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:684)
    at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1448)
    at org.eclipse.jetty.servlets.UserAgentFilter.doFilter(UserAgentFilter.java:82)
    at org.eclipse.jetty.servlets.GzipFilter.doFilter(GzipFilter.java:294)
    at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)
    at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)
    at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:229)
    at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075)
    at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384)
    at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
    at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
    at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
    at org.eclipse.jetty.server.Server.handle(Server.java:370)
    at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489)
    at org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:53)
    at org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:949)
    at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:1011)
    at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:644)
    at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)
    at org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:72)
    at org.eclipse.jetty.server.nio.BlockingChannelConnector$BlockingChannelEndPoint.run(BlockingChannelConnector.java:298)
    at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
    at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
    at java.lang.Thread.run(Thread.java:722)

Any advice would be appreciated.

Thanks, Stuart.

This looks like it may be a bug. I have filed it as JENA-522; if you have more details about the bug, please add them as a comment there.

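The problem is that a dataset with inference implicitly uses ARQ's standard in-memory Dataset implementation, which does not support transactions.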

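However, a text dataset, which corresponds to DatasetGraphText internally (and in your stack trace), requires the dataset it wraps to support transactions; where it does not, the wrapped dataset is itself wrapped in DatasetGraphWithLock. That wrapper is what appears to be running into trouble with the lock. The documentation states that it should support multiple readers, but having followed the logic of the code, I am not sure that it actually allows this.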
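To make the suspicion concrete, here is a toy model in Java of how a "not already active" check could reject a second concurrent reader. It is an assumption for illustration only: `SharedFlagGuard`, `PerThreadGuard`, and `overlappingReadersAllowed` are hypothetical names, and none of this is taken from the Jena source. The first guard keeps one flag for all threads, so a second reader that begins while the first is still active is refused. The second guard tracks activity per thread and takes a shared read lock, so overlapping readers are admitted.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Toy model only: these classes are hypothetical and are not the Jena implementation.
class LockedRegionModel {

    /** Guard with a single "active" flag shared by every thread. */
    static final class SharedFlagGuard {
        private boolean active = false;

        synchronized void begin() {
            if (active) {
                throw new IllegalStateException("Currently in a locked region");
            }
            active = true;
        }

        synchronized void end() {
            active = false;
        }
    }

    /** Guard that tracks activity per thread and takes a shared read lock. */
    static final class PerThreadGuard {
        private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
        private final ThreadLocal<Boolean> active = ThreadLocal.withInitial(() -> false);

        void beginRead() {
            if (active.get()) {
                throw new IllegalStateException("Currently in a locked region");
            }
            lock.readLock().lock();
            active.set(true);
        }

        void endRead() {
            active.set(false);
            lock.readLock().unlock();
        }
    }

    /** Returns true if a second reader may begin while the first is still active. */
    static boolean overlappingReadersAllowed(Runnable begin, Runnable end) {
        begin.run(); // reader 1 starts on the calling thread
        boolean[] admitted = {false};
        Thread second = new Thread(() -> {
            try {
                begin.run(); // reader 2 starts while reader 1 is still active
                admitted[0] = true;
                end.run();
            } catch (IllegalStateException rejected) {
                // reader 2 was refused; admitted[0] stays false
            }
        });
        second.start();
        try {
            second.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        end.run(); // reader 1 finishes
        return admitted[0];
    }

    public static void main(String[] args) {
        SharedFlagGuard shared = new SharedFlagGuard();
        PerThreadGuard perThread = new PerThreadGuard();
        System.out.println("shared flag admits second reader: "
                + overlappingReadersAllowed(shared::begin, shared::end));
        System.out.println("per-thread flag admits second reader: "
                + overlappingReadersAllowed(perThread::beginRead, perThread::endRead));
    }
}
```

If the real wrapper behaves like the first guard, that would be consistent with the symptoms in the question: single queries succeed, and overlapping read requests intermittently fail in `begin`.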
