Nokogiri - XML encoding issue



I wrote a simple Ruby script that talks to the Google Search suggestion API.

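By changing the `query` variable, you can define what to ask the API. It works well for English, but German umlauts seem to cause encoding problems. In the example below, I use the word "Tür" (door) to illustrate the problem.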

#!/usr/bin/env ruby
# encoding: UTF-8
require 'nokogiri'
require 'open-uri'
query = 'Tür'
uri = URI.encode("http://suggestqueries.google.com/complete/search?output=toolbar&hl=de&q=#{query}")
puts uri
puts '----------'
xml_doc = Nokogiri::XML(open(uri)) 
puts xml_doc
puts '----------'
xml_doc.xpath('.//suggestion').each do |suggestion| 
  puts suggestion.attr('data')
end

Output:

http://suggestqueries.google.com/complete/search?output=toolbar&hl=de&q=T%C3%BCr
----------
element suggestion: output error : invalid character value
<?xml version="1.0"?>
<toplevel>
  <CompleteSuggestion>
    <suggestion data="t&#xFC;rkei"/>
  </CompleteSuggestion>
  <CompleteSuggestion>
    <suggestion data="t?rkis"/>
  </CompleteSuggestion>
  <CompleteSuggestion>
    <suggestion data="t?rkei news"/>
  </CompleteSuggestion>
  <CompleteSuggestion>
    <suggestion data="t?rkiye"/>
  </CompleteSuggestion>
  <CompleteSuggestion>
    <suggestion data="t?ren"/>
  </CompleteSuggestion>
  <CompleteSuggestion>
    <suggestion data="t?rstopper"/>
  </CompleteSuggestion>
  <CompleteSuggestion>
    <suggestion data="t?rschloss"/>
  </CompleteSuggestion>
  <CompleteSuggestion>
    <suggestion data="t?rkisch deutsch"/>
  </CompleteSuggestion>
  <CompleteSuggestion>
    <suggestion data="t?renheld"/>
  </CompleteSuggestion>
  <CompleteSuggestion>
    <suggestion data="t?rkisch"/>
  </CompleteSuggestion>
</toplevel>
----------
t?rkei
t?rkis
t?rkei news
t?rkiye
t?ren
t?rstopper
t?rschloss
t?rkisch deutsch
t?renheld
t?rkisch

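As you can see, the URI is valid and the API returns XML data. But the printed document already contains these encoding errors. I suspect Nokogiri is misconfigured, because the same URL works fine in Chrome. It also prints:

element suggestion: output error : invalid character value

Does anybody know how to solve this? That would be great!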

Try the following:

xml_doc = open(uri) { |io| Nokogiri::XML(io.read.encode('UTF-8')) }
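
To see why transcoding before parsing can help, here is a minimal sketch that uses only Ruby's built-in String methods. It assumes the API answers in ISO-8859-1, which the `&#xFC;` and `?` characters in the output suggest but the post does not confirm. The `raw` string is a hand-written stand-in for one line of the response, not real API output.

```ruby
# Stand-in for the raw response body: "türkei" as ISO-8859-1 bytes,
# where ü is the single byte 0xFC.
# Assumption: the API responds in ISO-8859-1 (not confirmed by the post).
raw = "<suggestion data=\"t\xFCrkei\"/>".b

# Treated as UTF-8, the lone byte 0xFC is not a valid sequence,
# which is the kind of input a UTF-8 parser or serializer rejects.
as_utf8 = raw.dup.force_encoding(Encoding::UTF_8)
utf8_valid = as_utf8.valid_encoding?

# Label the bytes with their actual encoding, then transcode to UTF-8.
fixed = raw.dup.force_encoding(Encoding::ISO_8859_1).encode(Encoding::UTF_8)

puts utf8_valid
puts fixed
```

In the one-liner above, `io.read` is expected to return a string already labelled with the charset from the response headers, so `encode('UTF-8')` does the same transcoding step before Nokogiri sees the data.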
