我尝试运行在Google appengine上运行的GWTP爬虫。
https://github.com/ArcBees/GWTP/wiki/Crawler-Support
https://github.com/ArcBees/GWTP-Samples/tree/master/gwtp-samples/gwtp-sample-crawler-service
由于文档很短,我无法运行它。
我可以部署我的 gwtp-cralwer,但是当我尝试抓取应用程序时出现以下错误。
我尝试了这样的网址:
http://testcrawler.appspot.com/?key=123456&url=http://google.com
并收到以下错误
80.171.208.157 - - [11/Feb/2014:08:59:18 -0800] "GET /?key=123456&url=http://google.com HTTP/1.1" 200 41 - "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/32.0.1700.107 Safari/537.36" "testcrawler.appspot.com" ms=7082 cpu_ms=2209 cpm_usd=0.000005 pending_ms=3689 app_engine_release=1.9.0 instance=00c61b117ca91589d80617271db80bb7cba3a3
W 2014-02-11 08:59:18.623
Error for /
java.lang.NoSuchFieldError: FIREFOX_17
at com.gwtplatform.crawlerservice.server.guice.CrawlServiceModule.getWebClient(CrawlServiceModule.java:40)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:45)
at com.google.inject.internal.ProviderMethod.get(ProviderMethod.java:104)
at com.google.inject.internal.InternalFactoryToProviderAdapter.get(InternalFactoryToProviderAdapter.java:40)
at com.google.inject.internal.ProviderToInternalFactoryAdapter$1.call(ProviderToInternalFactoryAdapter.java:46)
at com.google.inject.internal.InjectorImpl.callInContext(InjectorImpl.java:1031)
at com.google.inject.internal.ProviderToInternalFactoryAdapter.get(ProviderToInternalFactoryAdapter.java:40)
at com.google.inject.Scopes$1$1.get(Scopes.java:65)
at com.google.inject.internal.InternalFactoryToProviderAdapter.get(InternalFactoryToProviderAdapter.java:40)
at com.google.inject.internal.InjectorImpl$4$1.call(InjectorImpl.java:978)
at com.google.inject.internal.InjectorImpl.callInContext(InjectorImpl.java:1024)
at com.google.inject.internal.InjectorImpl$4.get(InjectorImpl.java:974)
at com.gwtplatform.crawlerservice.server.CrawlServiceServlet.renderPage(CrawlServiceServlet.java:226)
at com.gwtplatform.crawlerservice.server.CrawlServiceServlet.renderResponse(CrawlServiceServlet.java:161)
at com.gwtplatform.crawlerservice.server.CrawlServiceServlet.doGet(CrawlServiceServlet.java:106)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:617)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:717)
at com.google.inject.servlet.ServletDefinition.doService(ServletDefinition.java:263)
at com.google.inject.servlet.ServletDefinition.service(ServletDefinition.java:178)
at com.google.inject.servlet.ManagedServletPipeline.service(ManagedServletPipeline.java:91)
at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:62)
at com.google.inject.servlet.ManagedFilterPipeline.dispatch(ManagedFilterPipeline.java:118)
at com.google.inject.servlet.GuiceFilter.doFilter(GuiceFilter.java:113)
at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1157)
at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1157)
at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1157)
at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1157)
at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1157)
at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:388)
at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:765)
at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:418)
at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
at org.mortbay.jetty.Server.handle(Server.java:326)
at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:923)
at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
at com.google.tracing.TraceContext$TraceContextRunnable.runInContext(TraceContext.java:437)
at com.google.tracing.TraceContext$TraceContextRunnable$1.run(TraceContext.java:444)
at com.google.tracing.CurrentContext.runInContext(CurrentContext.java:188)
at com.google.tracing.TraceContext$AbstractTraceContextCallback.runInInheritedContextNoUnref(TraceContext.java:308)
at com.google.tracing.TraceContext$AbstractTraceContextCallback.runInInheritedContext(TraceContext.java:300)
at com.google.tracing.TraceContext$TraceContextRunnable.run(TraceContext.java:441)
at java.lang.Thread.run(Thread.java:724)
我必须如何解决它?
在ArcBees GWTP-Samples问题中提出此问题。 您的问题中没有足够的信息供互联网上的程序员重现问题。 如果没有其他人会帮助您,请将搜索重点放在文本FIREFOX_17的来源上 - 它很可能是一个值,而不是字段名称。