Cannot send a well-formed query string with Python requests



I'm trying to make a correct GET request, with some filters applied, to this URL using Python and the requests library:

https://www.efast.dol.gov/5500search/

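The only filters I need to get the right data from the search page are: plan year, ein and pn. When I try to make the request I get the wrong data, because the values in the dictionary I pass as "q" are dropped. Here is an example: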

import requests
args = {'q.parser': 'lucene', 'q': {'ein': '814699012', 'planyear': '2020', 'pn': '001'}}
url = "https://www.efast.dol.gov/services/afs"
response = requests.get(url, params=args)

When I check response.url:

https://www.efast.dol.gov/services/afs?q.parser=lucene&q=ein&q=planyear&q=pn

None of the keys has a value.

This is the closest I have gotten:

args = {"q.parser":"lucene","q":{"ein":"814699012"}, "planyear":"2020","pn":"001"}

But if I check response.url:

https://www.efast.dol.gov/services/afs?q.parser=lucene&q=ein&planyear=2020&pn=001

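The value of ein is gone, and if I put planyear or pn inside q alongside it, the result is the same.

What am I doing wrong?

The correct result would be the data for 2020 with the right ein and pn numbers, whether I get several results or just one.

The correct request URL should be: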

https://www.efast.dol.gov/services/afs?q.parser=lucene&size=200&sort=planname%20asc&q=(((planyear:2020))%20AND%20((ein:814699012))%20AND%20((pn:001)))&facet.planyear=%7Bsize:30%7D%26facet.plancode=%7Bsize:100%7D&facet.plancode=%7Bsize:100%7D&facet.assetseoy=%7Bbuckets:%5B%22%7B,100000%5D%22,%22%5B100001,500000%5D%22,%22%5B500001,1000000%5D%22,%22%5B1000001,10000000%5D%22,%22%5B10000001,%7D%22%5D%7D&facet.plantype=%7Bsize:20%7D&facet.businesscodecat=%7Bsize:30%7D&facet.businesscode=%7Bsize:30%7D&facet.state=%7Bsize:100%7D&facet.countrycode=%7Bbuckets:%5B%22CA%22,%22GB%22,%22BM%22,%22KY%22%5D%7D&facet.formyear=%7Bsize:30%7D

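Python's requests package does not support dict-like values in params. Each value in the params dictionary must be a string or a list of strings (a nested dict gets iterated like a list, which is why only its keys show up in your URL). From the documentation:

Requests allows you to provide these arguments as a dictionary of strings, using the params keyword argument.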

https://docs.python-requests.org/en/latest/user/quickstart/passing-parameters-in-urls
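To see why the values disappear, here is a minimal sketch using only the standard library. It assumes requests ends up treating each non-string value as a sequence of values for that key, much like urllib.parse.urlencode with doseq=True; that assumption matches the URL shown in the question, but it is an illustration rather than the library's actual code path.

```python
from urllib.parse import urlencode

# The params dictionary from the question, with a nested dict under "q".
args = {"q.parser": "lucene",
        "q": {"ein": "814699012", "planyear": "2020", "pn": "001"}}

# doseq=True expands any non-string sequence into repeated key=value pairs.
# Iterating over a dict yields its keys, so the values never reach the URL.
nested = urlencode(args, doseq=True)
print(nested)

# A list of strings is the supported way to repeat a key.
flat = urlencode({"q.parser": "lucene", "q": ["a", "b"]}, doseq=True)
print(flat)
```

The first line printed has the same query string as the response.url in the question: each key of the nested dict becomes a separate q value.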

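Your website does not take the filters as separate parameters. It packs them into a single string for the q parameter, using its own encoding (given q.parser=lucene, presumably Lucene query syntax).

If you look at your example: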

(((planyear:2020))%20AND%20((ein:814699012))%20AND%20((pn:001)))

you can deduce that the format is:

(((KEY:VALUE)) AND ((KEY:VALUE)) AND <...>)

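So the whole expression is wrapped in (), each key:value pair is wrapped in (()), and spaces are percent-encoded as %20.

We can replicate this encoding in our own code: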

>>> params = {"planyear": "2020", "ein": 814699012, "pn": "001"}
>>> encoded = '%20AND%20'.join(f"(({k}:{v}))" for k, v in params.items())
>>> f"({encoded})"
'(((planyear:2020))%20AND%20((ein:814699012))%20AND%20((pn:001)))'

Then pass this as your q parameter.

Edit: I have put this together for your specific case:
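One caveat when reusing this, sketched below with the standard library only. The helper name build_q is hypothetical, and the sketch assumes the q string is later passed through a standard URL encoder (as the params argument of requests does). In that case it is safer to build the string with plain spaces, because an already-encoded %20 would be encoded a second time into %2520.

```python
from urllib.parse import quote, urlencode


def build_q(filters):
    """Build the '(((key:value)) AND ((key:value)))' string (hypothetical helper)."""
    clauses = [f"(({key}:{value}))" for key, value in filters.items()]
    return "(" + " AND ".join(clauses) + ")"


# Plain-space form: suitable for handing to an encoder afterwards.
q = build_q({"planyear": "2020", "ein": "814699012", "pn": "001"})
print(q)

# Encoding once reproduces the form seen in the browser URL.
encoded_once = quote(q, safe="():")
print(encoded_once)

# Encoding an already-encoded string escapes the '%' again (%20 -> %2520).
encoded_twice = urlencode({"q": encoded_once})
print(encoded_twice)
```

Building the query with plain spaces and letting the encoder run exactly once avoids the %2520 problem.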

import requests
q = {'ein': '814699012', 'planyear': '2020', 'pn': '001'}
# convert Q parameter to website's encoding: 
q = ' AND '.join(f"(({k}:{v}))" for k, v in q.items())
q = f"({q})"
# put all params together: normal key:value parameters + special Q parameter: 
params = {'q.parser': 'lucene', 'size': 200, 'sort': 'planname asc', 'q': q}
url = "https://www.efast.dol.gov/services/afs"
response = requests.get(url, params=params)
print(response.status_code)
print(response.url)
# 200
# 'https://www.efast.dol.gov/services/afs?q.parser=lucene&size=200&sort=planname+asc&q=%28%28%28ein%3A814699012%29%29+AND+%28%28planyear%3A2020%29%29+AND+%28%28pn%3A001%29%29%29'

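You seem to be confusing the request with the response. In your example, the long URL is what you should send as the request; you then parse the JSON data in the response. So the code below should work for you, and you still need to parse the response: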

import requests
url = "https://www.efast.dol.gov/services/afs?q.parser=lucene&size=200&sort=planname%20asc&q=(((planyear:2020))%20AND%20((ein:814699012))%20AND%20((pn:001)))&facet.planyear=%7Bsize:30%7D%26facet.plancode=%7Bsize:100%7D&facet.plancode=%7Bsize:100%7D&facet.assetseoy=%7Bbuckets:%5B%22%7B,100000%5D%22,%22%5B100001,500000%5D%22,%22%5B500001,1000000%5D%22,%22%5B1000001,10000000%5D%22,%22%5B10000001,%7D%22%5D%7D&facet.plantype=%7Bsize:20%7D&facet.businesscodecat=%7Bsize:30%7D&facet.businesscode=%7Bsize:30%7D&facet.state=%7Bsize:100%7D&facet.countrycode=%7Bbuckets:%5B%22CA%22,%22GB%22,%22BM%22,%22KY%22%5D%7D&facet.formyear=%7Bsize:30%7D"
payload={}
headers = {}
response = requests.request("GET", url, headers=headers, data=payload)
print(response.text)

Take a look at this:

import requests
parser='lucene'
query='(((planyear:2020))%20AND%20((ein:814699012))%20AND%20((pn:001)))'
url = "https://www.efast.dol.gov/services/afs?q.parser=" + parser + "&size=200&sort=planname%20asc&q=" + query + "&facet.planyear=%7Bsize:30%7D%26facet.plancode=%7Bsize:100%7D&facet.plancode=%7Bsize:100%7D&facet.assetseoy=%7Bbuckets:%5B%22%7B,100000%5D%22,%22%5B100001,500000%5D%22,%22%5B500001,1000000%5D%22,%22%5B1000001,10000000%5D%22,%22%5B10000001,%7D%22%5D%7D&facet.plantype=%7Bsize:20%7D&facet.businesscodecat=%7Bsize:30%7D&facet.businesscode=%7Bsize:30%7D&facet.state=%7Bsize:100%7D&facet.countrycode=%7Bbuckets:%5B%22CA%22,%22GB%22,%22BM%22,%22KY%22%5D%7D&facet.formyear=%7Bsize:30%7D"
payload={}
headers = {}
response = requests.request("GET", url, headers=headers, data=payload)
print(response.text)
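As an alternative to concatenating the URL by hand, the following sketch builds the first part of the same URL with the standard library. It covers only the q.parser, size, sort and q parameters from the question (the facet.* parameters are left out for brevity), and it assumes the server accepts unescaped parentheses and colons in q, as the browser-generated URL suggests.

```python
from urllib.parse import quote, urlencode

base_url = "https://www.efast.dol.gov/services/afs"

params = {
    "q.parser": "lucene",
    "size": 200,
    "sort": "planname asc",
    "q": "(((planyear:2020)) AND ((ein:814699012)) AND ((pn:001)))",
}

# quote_via=quote turns spaces into %20 instead of '+';
# safe="():" leaves parentheses and colons unescaped, as in the target URL.
query_string = urlencode(params, quote_via=quote, safe="():")
url = f"{base_url}?{query_string}"
print(url)
```

The resulting string can then be passed to requests.get(url) without a params argument, so the query is not encoded a second time.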
