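I'm trying to convert some working Ruby code to Clojure. It calls a paginated REST API and accumulates the data. The Ruby code makes an initial API call, checks whether the response has a pagination.hasNextPage key, and uses pagination.endCursor as a query-string parameter for the next API call, all inside a while loop. Here is the simplified Ruby code (logging, error handling, etc. removed):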
def request_paginated_data(url)
  results = []
  response = # ... http get url
  response_data = response['data']
  results << response_data
  while !response_data.nil? && response.has_key?('pagination') && response['pagination'] && response['pagination'].has_key?('hasNextPage') && response['pagination']['hasNextPage'] && response['pagination'].has_key?('endCursor') && response['pagination']['endCursor']
    response = # ... http get url + response['pagination']['endCursor']
    response_data = response['data']
    results << response_data
  end
  results
end
Here is the start of my Clojure code:
(defn get-paginated-data [url options]
  {:pre [(some? url) (some? options)]}
  (let [body (:body @(client/get url options))]
    (log/debug (str "body size =" (count body)))
    (let [json (json/read-str body :key-fn keyword)]
      (log/debug (str "json =" json))))
  ;; ???
  )
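I know I can use contains? to look up keywords in the json map (a clojure.lang.PersistentArrayMap), but I'm not sure how to write the rest of the code...
You probably want something like this: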
(let [data (json/read-str body :key-fn keyword)
      hnp (get-in data [:pagination :hasNextPage])
      ec (get-in data [:pagination :endCursor])
      continue? (and hnp ec)]
  (println :hnp hnp)
  (println :ec ec)
  (println :cont continue?)
  ...)
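This pulls out the nested bits and prints some debugging info. Double-check that the JSON-to-Clojure conversion gives you the "camelCase" keywords you expect, and adjust the keys to match if necessary.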
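As a minimal sketch of why get-in is convenient here: it returns nil when any key along the path is missing, so the long chain of has_key? checks in the Ruby version collapses into a single and. The helper name continue? and the sample maps below are made up for illustration, not taken from the real API.

```clojure
;; Hypothetical helper: true only when the response says there is
;; another page AND supplies a cursor for it.
(defn continue? [resp]
  (let [has-next (get-in resp [:pagination :hasNextPage])
        cursor   (get-in resp [:pagination :endCursor])]
    (boolean (and has-next cursor))))

;; Made-up sample responses, as they would look after parsing
;; the JSON body with :key-fn keyword.
(def more-pages {:data [1 2] :pagination {:hasNextPage true :endCursor "abc"}})
(def last-page  {:data [3]   :pagination {:hasNextPage false :endCursor nil}})
(def no-paging  {:data [4]})

(println (continue? more-pages))
(println (continue? last-page))
(println (continue? no-paging))
```

A missing :pagination key behaves the same as an explicit false, so no contains? calls are needed.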
You may find my favorite template project helpful, especially the list of documentation at the end. Be sure to read the Clojure CheatSheet!
Clojure 1.11 introduced a new function, iteration, which was built precisely for pagination.
This article also explains it well: https://www.juxt.pro/blog/new-clojure-iteration
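A rough sketch of how iteration could be applied to the cursor-style API from the question, assuming Clojure 1.11 or later. The in-memory fake-pages map and the fetch-page and next-cursor functions are hypothetical stand-ins for the real HTTP call and JSON parsing; only the :pagination shape comes from the question.

```clojure
;; Made-up in-memory "API": each cursor maps to one page of the response.
;; The nil key stands for the first request, which has no cursor yet.
(def fake-pages
  {nil {:data [1 2] :pagination {:hasNextPage true  :endCursor "a"}}
   "a" {:data [3 4] :pagination {:hasNextPage true  :endCursor "b"}}
   "b" {:data [5]   :pagination {:hasNextPage false :endCursor nil}}})

;; Hypothetical stand-in for the real HTTP GET plus JSON parsing.
(defn fetch-page [cursor]
  (get fake-pages cursor))

;; Returns the cursor for the next page, or nil to stop iterating.
(defn next-cursor [resp]
  (when (get-in resp [:pagination :hasNextPage])
    (get-in resp [:pagination :endCursor])))

(defn all-items []
  (into []
        cat                                ; flatten the per-page vectors
        (iteration fetch-page
                   :vf :data               ; what to keep from each response
                   :kf next-cursor)))      ; how to find the next cursor

(println (all-items))
```

iteration calls the step function with nil first (the default :initk), then keeps calling it with whatever :kf returns until that is nil.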
In the past, I used loop and recur for this kind of thing.
Here is an example that queries the Jira API:
(defn get-jira-data [from to url hdrs]
  (loop [result []
         n 0
         api-offset 0]
    (println n " " (count result) " " api-offset)
    (let [body (jql-body from to api-offset)
          resp (client/post url
                            {:headers hdrs
                             :body body})
          issues (-> resp
                     :body
                     (json/read-str :key-fn keyword)
                     :issues)
          returned-count (count issues)
          intermediate-res (into result issues)]
      (if (and (pos? returned-count)
               (< (inc n) MAX-PAGED-PAGES))
        (recur intermediate-res
               (inc n)
               (+ api-offset returned-count))
        intermediate-res))))
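I recommend limiting the recursion to a maximum number of pages to avoid unforeseen and unpleasant surprises in production. With the Jira API, you send the offset or page needed for the next iteration in the request body. If you were using the GitHub API, for example, you would need a local binding for the URL in the loop call instead.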
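To illustrate what the page limit buys you, here is a small self-contained sketch in which the "API" never runs out of data. fetch-issues and max-paged-pages are hypothetical names chosen for this example; the guard is the same (< (inc n) ...) check used above.

```clojure
(def max-paged-pages 3)

;; Hypothetical endpoint that never returns an empty page: every call
;; yields two items starting at the requested offset.
(defn fetch-issues [offset]
  [offset (inc offset)])

(defn get-all-issues []
  (loop [result [] n 0 offset 0]
    (let [issues (fetch-issues offset)
          acc    (into result issues)]
      (if (and (seq issues)
               (< (inc n) max-paged-pages))   ; stop after max-paged-pages requests
        (recur acc (inc n) (+ offset (count issues)))
        acc))))

(println (get-all-issues))
```

Without the page guard this loop would never terminate; with it, exactly max-paged-pages requests are made.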
Speaking of the GitHub API: it publishes the relevant URLs as HTTP headers in the response. You can use them like this:
(loop [result []
       u url
       n 0]
  (log/debugf "Get JSON paged (%s previous results) from %s"
              (count result) u)
  (let [resp (http-get-with-retry u {:headers auth-hdr})
        data (-> resp :body
                 (json/read-str :key-fn keyword))
        intermediate-res (into result data)
        next-url (-> resp :links :next :href)]
    (if (and next-url
             data
             (pos? (count data))
             (<= n MAX-PAGED-PAGES))
      (recur intermediate-res next-url (inc n))
      intermediate-res)))
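You will need to infer the missing functions and other vars here. http-get-with-retry is essentially just an HTTP GET with retry handling added. The pattern is the same as the one above; it just uses the corresponding link from the response and a local url binding.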
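If the HTTP client in use does not expose already-parsed links on the response, a minimal sketch of pulling the next URL out of a raw Link header value might look like this. parse-link-header is a hypothetical helper, the URLs are made up, and the regex only handles the simple <url>; rel="name" form.

```clojure
;; Hypothetical helper: turns a Link header value into a map such as
;; {:next {:href "..."} :last {:href "..."}}.
(defn parse-link-header [header]
  (into {}
        (for [[_ url rel] (re-seq #"<([^>]+)>;\s*rel=\"([^\"]+)\"" (or header ""))]
          [(keyword rel) {:href url}])))

;; Made-up header value in the style GitHub uses.
(def sample-header
  (str "<https://api.example.com/items?page=2>; rel=\"next\", "
       "<https://api.example.com/items?page=5>; rel=\"last\""))

(println (-> (parse-link-header sample-header) :next :href))
(println (-> (parse-link-header nil) :next :href))
```

A missing header yields an empty map, so the (-> resp :links :next :href) style lookup returns nil and the loop stops.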
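I hereby place all of the code above under the Apache Software License 2.0, in addition to the standard license on StackOverflow.
Here is the final result after applying the suggestions from Stefan Kamphausen and Alan Thompson: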
;; pagination-delay (milliseconds) is assumed to be defined elsewhere.
(defn get-paginated-data [^String url ^clojure.lang.PersistentArrayMap options ^clojure.lang.Keyword data-key]
  {:pre [(some? url) (some? options)]}
  (loop [results [] u url page 1]
    (log/debugf "Requesting data from API url=%s page=%d" u page)
    (let [body (:body @(client/get u options))
          body-map (json/read-str body :key-fn keyword)
          data (get-in body-map [data-key])
          has-next-page (get-in body-map [:pagination :hasNextPage])
          end-cursor (get-in body-map [:pagination :endCursor])
          accumulated-results (into results data)
          continue? (and has-next-page (> (count end-cursor) 0))]
      (log/debugf "count body=%d" (count body))
      (log/debugf "count results=%s" (count results))
      (log/debugf "has-next-page=%s" has-next-page)
      (log/debugf "end-cursor=%s" end-cursor)
      (log/debugf "continue?=%s" continue?)
      (if continue?
        ;; Always build the next URL from the original url, not from u.
        (let [next-url (str url "?after=" end-cursor)]
          (log/info (str "Sleeping for " (/ pagination-delay 1000) " seconds..."))
          (Thread/sleep pagination-delay)
          (recur accumulated-results next-url (inc page)))
        accumulated-results))))
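One thing to watch in the final version: (str url "?after=" end-cursor) assumes the base URL has no query string and that the cursor needs no escaping. A hedged sketch of a slightly more defensive helper follows. next-page-url is a hypothetical name and the example URLs are made up; the after parameter name is the one used above.

```clojure
(require '[clojure.string :as str])

;; Hypothetical helper: appends the cursor as an "after" query parameter,
;; picking ? or & depending on whether base-url already has a query string,
;; and URL-encoding the cursor value.
(defn next-page-url [base-url end-cursor]
  (let [separator (if (str/includes? base-url "?") "&" "?")]
    (str base-url separator "after="
         (java.net.URLEncoder/encode (str end-cursor) "UTF-8"))))

(println (next-page-url "https://api.example.com/items" "abc=="))
(println (next-page-url "https://api.example.com/items?limit=10" "abc=="))
```

Cursors are often base64-like strings, so characters such as = or + would otherwise end up unescaped in the query string.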