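I wrote a simple Ruby script that talks to the Google Search suggestion API.
By changing the `query` variable, you can define what to ask the API. It works well for English, but German umlauts seem to cause encoding problems. In the example below I use the word "Tür" (door) to illustrate the problem.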
#!/usr/bin/env ruby
# encoding: UTF-8
require 'nokogiri'
require 'open-uri'
query = 'Tür'
uri = URI.encode("http://suggestqueries.google.com/complete/search?output=toolbar&hl=de&q=#{query}")
puts uri
puts '----------'
xml_doc = Nokogiri::XML(open(uri))
puts xml_doc
puts '----------'
xml_doc.xpath('.//suggestion').each do |suggestion|
  puts suggestion.attr('data')
end
Output:
http://suggestqueries.google.com/complete/search?output=toolbar&hl=de&q=T%C3%BCr
----------
element suggestion: output error : invalid character value
<?xml version="1.0"?>
<toplevel>
<CompleteSuggestion>
<suggestion data="türkei"/>
</CompleteSuggestion>
<CompleteSuggestion>
<suggestion data="t?rkis"/>
</CompleteSuggestion>
<CompleteSuggestion>
<suggestion data="t?rkei news"/>
</CompleteSuggestion>
<CompleteSuggestion>
<suggestion data="t?rkiye"/>
</CompleteSuggestion>
<CompleteSuggestion>
<suggestion data="t?ren"/>
</CompleteSuggestion>
<CompleteSuggestion>
<suggestion data="t?rstopper"/>
</CompleteSuggestion>
<CompleteSuggestion>
<suggestion data="t?rschloss"/>
</CompleteSuggestion>
<CompleteSuggestion>
<suggestion data="t?rkisch deutsch"/>
</CompleteSuggestion>
<CompleteSuggestion>
<suggestion data="t?renheld"/>
</CompleteSuggestion>
<CompleteSuggestion>
<suggestion data="t?rkisch"/>
</CompleteSuggestion>
</toplevel>
----------
t?rkei
t?rkis
t?rkei news
t?rkiye
t?ren
t?rstopper
t?rschloss
t?rkisch deutsch
t?renheld
t?rkisch
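As you can see, the URI is valid and the API returns XML data. But the printed data already contains these encoding errors. I suspect Nokogiri is misconfigured, because the same URL works fine in Chrome. It also reports:
element suggestion: output error : invalid character value
Does anyone know how to fix this? That would be great!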
Try the following:
xml_doc = open(uri) { |io| Nokogiri::XML(io.read.encode('UTF-8')) }
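Why this probably helps: the XML in the question has no encoding declaration, so the parser has to guess. If the server actually sends ISO-8859-1 bytes (an assumption here, the question does not show the response headers), a lone 0xFC byte for "ü" is not valid UTF-8. open-uri tags the string returned by `io.read` with the charset from the Content-Type header, so `.encode('UTF-8')` can transcode it before Nokogiri sees it. The sketch below reproduces the effect without any network access; `latin1_to_utf8` is a hypothetical helper for illustration, not part of Nokogiri or open-uri.

```ruby
# Hypothetical helper (not part of Nokogiri or open-uri): treat the raw bytes
# as ISO-8859-1 and transcode them to UTF-8.
def latin1_to_utf8(bytes)
  bytes.dup.force_encoding(Encoding::ISO_8859_1).encode(Encoding::UTF_8)
end

# "türkei" as ISO-8859-1 bytes: ü is the single byte 0xFC.
# Assumption: this mirrors what the server sends.
raw = "t\xFCrkei".b

# Interpreted as UTF-8, the byte sequence is invalid.
puts raw.dup.force_encoding(Encoding::UTF_8).valid_encoding?  # false

# After transcoding, the string is valid UTF-8 and prints correctly.
fixed = latin1_to_utf8(raw)
puts fixed                  # türkei
puts fixed.valid_encoding?  # true
```

Note that on Ruby 3.0 and later, calling `open` on a URL and `URI.encode` are no longer available; `URI.open` and `URI.encode_www_form` are the usual replacements.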