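I am trying to set up a DBpedia Live mirror on my personal Mac. Here is some technical information about the host:
- Operating system: OS X 10.9.3
- Processor: 2.6 GHz Intel Core i7
- Memory: 16 GB 1600 MHz DDR3
- Database server: OpenLink Virtuoso (Open Source Edition)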
Below is a summary of the steps I have followed so far:
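- Downloaded the initial data seed from DBpedia Live: dbpedia_2013_07_18.nt.bz2
- Downloaded the synchronization tool from http://sourceforge.net/projects/dbpintegrator/files/.
- Executed the virtload.sh script. I had to adjust some of its commands to make them compatible with OS X.
- Adapted the synchronization tool's configuration files according to README.txt, as follows:
  a) Set the start date in the file "lastDownloadDate.dat" to the date of the dump (2013-07-18-00-000000).
  b) Set the configuration information in the file "dbpedia_updates_downloader.ini", such as the Virtuoso login credentials and the GraphURI.
- Ran "java -jar dbpintegrator-1.1.jar" from the command line. The program repeatedly shows the following errors: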
INFO - Options file read successfully
INFO - File : http://live.dbpedia.org/changesets/lastPublishedFile.txt has been successfully downloaded
INFO - File : http://live.dbpedia.org/changesets/2014/06/16/13/000001.removed.nt.gz has been successfully downloaded
WARN - File /Users/shruti/virtuoso/dbpedia-live/UpdatesDownloadFolder/000001.removed.nt.gz cannot be decompressed due to Unexpected end of ZLIB input stream
ERROR - Error: (No such file or directory)
INFO - File : http://live.dbpedia.org/changesets/2014/06/16/13/000001.added.nt.gz has been successfully downloaded
WARN - File /Users/shruti/virtuoso/dbpedia-live/UpdatesDownloadFolder/000001.added.nt.gz cannot be decompressed due to Unexpected end of ZLIB input stream
ERROR - Error: (No such file or directory)
INFO - File : http://live.dbpedia.org/changesets/lastPublishedFile.txt has been successfully downloaded
INFO - File : http://live.dbpedia.org/changesets/2014/06/16/13/000002.removed.nt.gz has been successfully downloaded
INFO - File : /Users/shruti/virtuoso/dbpedia-live/UpdatesDownloadFolder/000002.removed.nt.gz decompressed successfully to /Users/shruti/virtuoso/dbpedia-live/UpdatesDownloadFolder/000002.removed.nt
WARN - null Function executeStatement
WARN - null Function executeStatement
WARN - null Function executeStatement
WARN - null Function executeStatement
WARN - null Function executeStatement
...
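For reference, the changeset URLs in the log look like they are built from the same year-month-day-hour-counter fields that appear in the "lastDownloadDate.dat" marker. The Python sketch below shows that mapping as I understand it; the pattern is inferred from the log above rather than taken from the tool's source, and the function name `changeset_urls` is my own.

```python
# Assumption: a marker such as "2014-06-16-13-000001" maps onto the URL layout
# seen in the log (year/month/day/hour/counter). Inferred, not documented.
BASE_URL = "http://live.dbpedia.org/changesets"


def changeset_urls(marker, base_url=BASE_URL):
    """Return the removed/added changeset URLs for one marker string."""
    year, month, day, hour, counter = marker.strip().split("-")
    prefix = f"{base_url}/{year}/{month}/{day}/{hour}/{counter}"
    # The log shows the "removed" file being fetched before the "added" one.
    return [f"{prefix}.removed.nt.gz", f"{prefix}.added.nt.gz"]


if __name__ == "__main__":
    for url in changeset_urls("2014-06-16-13-000001"):
        print(url)
```

Under this assumption, a marker can be checked by hand against the files the tool reports downloading.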
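Why do I repeatedly see the following error when running the Java program "dbpintegrator-1.1.jar"? Does it mean that the triples from these files were not applied to my live mirror?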
WARN - File /Users/shruti/virtuoso/dbpedia-live/UpdatesDownloadFolder/000001.removed.nt.gz cannot be decompressed due to Unexpected end of ZLIB input stream
ERROR - Error: (No such file or directory)
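"Unexpected end of ZLIB input stream" is the message Java usually gives when a compressed stream stops before its end marker, so one possible cause is that the downloaded .gz file is truncated or empty. I have not confirmed that this is what happens here. A minimal Python sketch for checking a downloaded file independently of the tool (the helper name `is_complete_gzip` is hypothetical):

```python
import gzip
import os
import zlib


def is_complete_gzip(path):
    """Return True only if the gzip file at `path` can be read to its end."""
    try:
        # A zero-byte file reads as empty without raising, so reject it first.
        if os.path.getsize(path) == 0:
            return False
        with gzip.open(path, "rb") as fh:
            # Read in 1 MiB chunks; a truncated stream raises EOFError.
            while fh.read(1024 * 1024):
                pass
        return True
    except (EOFError, OSError, zlib.error):
        # OSError also covers a missing file and a bad gzip header.
        return False
```

Running it on UpdatesDownloadFolder/000001.removed.nt.gz would show whether the file on disk is intact. If it returns False, the later "No such file or directory" error would be consistent with the tool looking for a decompressed .nt file that was never written.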
How can I verify that the data loaded into my mirror is up to date? Is there a SPARQL query I can use to check this?
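To make clear what kind of check I have in mind, here is a rough Python sketch that asks the local endpoint for the most recently modified resources. It rests on several assumptions I have not verified: that the live data carries dcterms:modified timestamps, that the endpoint is Virtuoso's default http://localhost:8890/sparql, and that the graph URI is whatever was set as GraphURI in the .ini file (shown here as http://live.dbpedia.org purely as a placeholder).

```python
import json
import urllib.parse
import urllib.request


def build_freshness_query(graph_uri, limit=5):
    """Build a SPARQL query listing the most recently modified resources."""
    # Assumption: resources in the graph have dcterms:modified values.
    return (
        "PREFIX dcterms: <http://purl.org/dc/terms/>\n"
        "SELECT ?resource ?modified\n"
        f"FROM <{graph_uri}>\n"
        "WHERE { ?resource dcterms:modified ?modified }\n"
        "ORDER BY DESC(?modified)\n"
        f"LIMIT {int(limit)}"
    )


def run_query(endpoint, query, timeout=60):
    """Send the query over HTTP GET and return the JSON result bindings."""
    params = urllib.parse.urlencode(
        {"query": query, "format": "application/sparql-results+json"}
    )
    with urllib.request.urlopen(f"{endpoint}?{params}", timeout=timeout) as resp:
        return json.load(resp)["results"]["bindings"]


# Example call (endpoint and graph URI are assumptions, adjust to your setup):
# rows = run_query("http://localhost:8890/sparql",
#                  build_freshness_query("http://live.dbpedia.org"))
# for row in rows:
#     print(row["modified"]["value"], row["resource"]["value"])
```

If the newest timestamps are close to the date in lastPublishedFile.txt, the mirror is presumably keeping up. Sorting on the full graph may be slow, so a narrower pattern could be needed in practice.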
I see that the data in my live mirror is missing wikiPageID (http://dbpedia.org/ontology/wikiPageID) and wikiPageRevisionID. Why is that? Are these properties missing from the DBpedia Live data dumps?
This should be fixed now. You can try again from here: https://github.com/dbpedia/dbpedia-live-mirror