JSON parse error in script text (Ruby)

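I'm trying to parse JSON from script text that contains the store data. It sits inside the page http://www.buildbase.co.uk/storefinder. The script text I'm working with is http://pastebin.com/embed_js/3cnewiSh, and my code is as follows: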

require 'mechanize'
require 'json'

stores_url = "http://www.buildbase.co.uk/storefinder"
mechanize = Mechanize.new
stores_page = mechanize.get(stores_url)
stores_script_txt = stores_page.search("//script[contains(text(), 'storeLocator.initialize(')]")[0].text
stores_jsons = stores_script_txt.split("storeLocator.initialize( $.parseJSON('{\"all\":")[-1].split(",\"selected\":0}') ,\tfalse);\n        });")[0]
puts stores_jsons
stores_result = JSON.parse(stores_jsons)

JSON.parse gives me this error:

from /home/private/.rvm/gems/ruby-2.1.5/gems/json-1.8.3/lib/json/common.rb:155:in `parse'
from /home/private/.rvm/gems/ruby-2.1.5/gems/json-1.8.3/lib/json/common.rb:155:in `parse'
from (irb):240
from /home/private/.rvm/rubies/ruby-2.1.5/bin/irb:11:in `<main>'

I can't see where I'm going wrong, because the JSON string looks valid to me.

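There are a couple of problems. First, the text you end up with is not well-formed JSON, because its quotes are still backslash-escaped: it contains \" where JSON expects a plain ".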

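Second, it contains HTML tags whose attributes carry quotes of their own, and those break the quoting of the actual JSON. I grabbed a snippet and simply stripped the tags out.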
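To make the two problems concrete, here is a minimal sketch on a made-up sample string (the store name, the link, and the variable names are assumptions for illustration, not data from the real page). It shows the raw text failing to parse, and the same text parsing once the quotes are unescaped and the tags are removed.

```ruby
require 'json'

# Hypothetical sample mimicking the embedded script text: quotes are
# backslash-escaped and one value holds an HTML tag with its own quotes.
raw = '[{\"name\":\"Demo Store\",\"info\":\"<a href=\"/demo\">Details</a>\"}]'

# The raw text is rejected, because \" is not valid outside a JSON string.
raw_parses = begin
  JSON.parse(raw)
  true
rescue JSON::ParserError
  false
end

# Unescape the quotes, then strip the tags (and the quotes inside them).
cleaned = raw.gsub('\"', '"').gsub(/<\/?[^>]*>/, '')
stores = JSON.parse(cleaned)

puts raw_parses
puts stores.first['name']
```

Note that stripping the tags also discards whatever markup the page stored in those fields, so this only suits cases where the plain text is enough.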

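I don't know how much of the data you need, but the code below does work. I'm also not sure how robust it is (for example, I simply replaced every \" with ").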

require 'mechanize'
require 'json'

stores_url = "http://www.buildbase.co.uk/storefinder"
mechanize = Mechanize.new
stores_page = mechanize.get(stores_url)
stores_script_txt = stores_page.search("//script[contains(text(), 'storeLocator.initialize(')]")[0].text
stores_jsons = stores_script_txt.split("storeLocator.initialize( $.parseJSON('{\"all\":")[-1].split(",\"selected\":0}') ,\tfalse);\n        });")[0]
# Unescape \" to ", strip HTML tags, collapse blank lines, trim leading and trailing newlines
stores_jsons = stores_jsons.gsub('\"', '"').gsub(/<\/?[^>]*>/, '').gsub(/\n\n+/, "\n").gsub(/^\n|\n$/, '')
stores_result = JSON.parse(stores_jsons)
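If the exact whitespace in the split strings turns out to be fragile, one possible alternative is to capture the argument of $.parseJSON('...') with a regular expression and parse the whole object. This is only a sketch: `extract_stores` is a hypothetical helper, the sample script text is made up, and the non-greedy match assumes the data never contains the sequence `')`.

```ruby
require 'json'

# Hypothetical helper: pulls the argument of $.parseJSON('...') out of the
# script text, cleans it as in the answer, and returns the "all" array.
def extract_stores(script_text)
  match = script_text.match(/\$\.parseJSON\('(.*?)'\)/m)
  return [] unless match

  # Unescape \" to " and strip HTML tags before parsing.
  cleaned = match[1].gsub('\"', '"').gsub(/<\/?[^>]*>/, '')
  JSON.parse(cleaned).fetch('all', [])
end

# Made-up script text standing in for the real page.
sample = <<'JS'
storeLocator.initialize( $.parseJSON('{\"all\":[{\"name\":\"Demo Store\"}],\"selected\":0}') , false);
JS

stores = extract_stores(sample)
puts stores.length
```

With the real page, the text passed in would be `stores_script_txt` from the code above; whether the cleaned text then parses still depends on what the fields actually contain.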
