My simple Python script creates a dataset and adds a PDF file to it as a resource, but it fails with "{file} is not JSON serializable".
# coding=utf-8
# import base64
import ckanapi
import requests
import csv
import json
import pprint
import socket
import netifaces as ni
# UPDATE THESE AND ONLY THESE.
api_token = '***'
the_hostname = socket.gethostname()
the_ipaddress = ni.ifaddresses('eth0')[ni.AF_INET][0]['addr']
site_url = 'http://' + the_ipaddress + ':5000'
endpoint_p = '{}/api/3/action/package_create'.format(site_url)
endpoint_r = '{}/api/3/action/resource_create'.format(site_url)
headers = {'Authorization': api_token}
payload_p = {
"name": "test01",
"private": "true",
"state": "active",
"owner_org": "b15a6f45-e2ed-4587-8c5e-a92dbc9f157d",
"maintainer" : "Forms Management",
"maintainer_email" : "forms.management@province.ca",
"author" : "Test Author",
"author_email" : "hughj@province.ca"
}
payload_r = {
"package_id": "null",
"name": "English - test01 - Test Description",
"url": "upload",
"upload": open('/var/www/upload/2nd/unzipped/002-33-5098E/33-5098E.pdf', 'r'),
"description": "This is a test resource attached to dataset test01",
"notes": "This is a longer block of text that is for the resource test01e which is attached to the dataset test01"
}
filepaths = {
"thepath": "/var/www/upload/2nd/unzipped/002-33-5098E/33-5098E.pdf"
}
req_p = requests.post(endpoint_p, json=payload_p, headers=headers)
theLastResponse = req_p.json()
theLastPackageCreated = theLastResponse['result']['id']
payload_r["package_id"] = theLastPackageCreated
req_r = requests.post(endpoint_r, json = payload_r, headers = headers) # resource_create()
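This throws the error "{file} is not JSON serializable". The file is a PDF, so it is binary, but I'm not sure whether it needs some kind of encoding. (Note the commented-out "base64" module. I didn't want to go down that road without first asking whether it's the right approach.)
The CKAN API documentation is here: https://docs.ckan.org/en/2.9/api/#ckan.logic.action.create.resource_create
It says "upload" should be "(FieldStorage (optional) needs multipart/form-data) - (optional)", but every example script I've seen for uploading a file to CKAN shows only code, and that code does exactly what I'm doing here, with no extra preprocessing of the uploaded file. So I'm not sure what the problem actually is. Please help if you can!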
I copied your code and ran a modified version against a local development copy of CKAN. With my modifications, which are included below, it works.
Most notably:
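- payload_r -> None of the extra fields are required, but you can include other resource metadata such as description, name, etc. if you want them.
- req_r -> 1) Pass the payload as data rather than as json, so that the request is sent as multipart/form-data. 2) Send the file here with the files parameter.
Docs: https://docs.ckan.org/en/2.9/maintaining/filestore.html#filestore-api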
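The two changes listed above can be checked without a running CKAN instance by preparing the request and inspecting it instead of sending it. The following is only a minimal sketch: the endpoint URL, the package id, and the in-memory stand-in for the PDF are assumptions made for illustration, and nothing is sent over the network.

```python
import io

import requests

# Hypothetical values for illustration only; no request is actually sent.
endpoint_r = "http://localhost:5000/api/3/action/resource_create"
headers = {"Authorization": "***"}
payload_r = {"package_id": "example-package-id", "name": "English - test01"}

# In-memory stand-in for the PDF, so the sketch runs without the real file.
fake_pdf = io.BytesIO(b"%PDF-1.4 placeholder bytes")

# data= carries the ordinary form fields, files= carries the upload.
# Together they make requests build a multipart/form-data body.
request = requests.Request(
    "POST",
    endpoint_r,
    headers=headers,
    data=payload_r,
    files={"upload": ("33-5098E.pdf", fake_pdf, "application/pdf")},
)
prepared = request.prepare()

# The Content-Type header now names multipart/form-data plus a boundary.
print(prepared.headers["Content-Type"])
```

When sending for real, the same data= and files= arguments go to requests.post, with the file opened in binary mode.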
IMO this is less a CKAN problem than a matter of understanding the chosen library (i.e. requests). There are many ways to do this with different tools.
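As a rough illustration of where the original error comes from: passing json= asks requests to serialize the whole payload as JSON, and a file object has no JSON representation. The sketch below reproduces that with the standard library alone. The payload values are placeholders, and io.BytesIO stands in for the opened PDF.

```python
import io
import json

# Placeholder payload; the BytesIO object stands in for the opened PDF file.
payload = {
    "package_id": "example-package-id",
    "upload": io.BytesIO(b"%PDF-1.4 placeholder bytes"),
}

try:
    json.dumps(payload)
    error_message = None
except TypeError as exc:
    # The JSON encoder rejects file-like objects, which is the failure
    # that surfaces when the payload is passed with json=.
    error_message = str(exc)

print(error_message)
```

Base64-encoding the file would make the payload serializable, but the CKAN documentation quoted in the question asks for multipart/form-data, so sending the file separately from the form fields is the closer match.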
I also had to update the payloads to match my schema, but assuming they are correct for your schema, this should work.
# coding=utf-8
# import base64
import ckanapi
import requests
import csv
import json
import pprint
import socket
import netifaces as ni
# UPDATE THESE AND ONLY THESE.
api_token = '***'
the_hostname = socket.gethostname()
the_ipaddress = ni.ifaddresses('eth0')[ni.AF_INET][0]['addr']
site_url = 'http://' + the_ipaddress + ':5000'
endpoint_p = '{}/api/3/action/package_create'.format(site_url)
endpoint_r = '{}/api/3/action/resource_create'.format(site_url)
headers = {'Authorization': api_token}
payload_p = {
"name": "test01",
"private": "true",
"state": "active",
"owner_org": "b15a6f45-e2ed-4587-8c5e-a92dbc9f157d",
"maintainer" : "Forms Management",
"maintainer_email" : "forms.management@province.ca",
"author" : "Test Author",
"author_email" : "hughj@province.ca"
}
payload_r = {
"package_id": "null"
}
filepaths = {
"thepath": "/var/www/upload/2nd/unzipped/002-33-5098E/33-5098E.pdf"
}
req_p = requests.post(endpoint_p, json=payload_p, headers=headers)
theLastResponse = req_p.json()
theLastPackageCreated = theLastResponse['result']['id']
payload_r["package_id"] = theLastPackageCreated
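# Open the PDF in binary mode. The file() builtin exists only in Python 2, so open() is used instead.
with open('/var/www/upload/2nd/unzipped/002-33-5098E/33-5098E.pdf', 'rb') as pdf_file:
    req_r = requests.post(endpoint_r, data=payload_r, headers=headers, files=[('upload', pdf_file)])  # resource_create()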