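YouTube API search list_next() throws UnicodeEncodeError

When I pass a non-English string to a search through the YouTube API library, it only works for the initial search. If I call list_next(), it throws a UnicodeEncodeError.

When I use a plain ASCII string, everything works fine.

Any suggestions on what I should do?

Here is a simplified version of what I'm doing: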

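# -*- coding: utf-8 -*-
import apiclient.discovery


def test(query):
    # 'xxx' stands in for a real API key
    youtube = apiclient.discovery.build('youtube', 'v3', developerKey='xxx')
    ys = youtube.search()
    req = ys.list(
        q=query.encode('utf-8'),
        type='video',
        part='id,snippet',
        maxResults=50
    )
    # Page through the results until list_next() returns None
    while req:
        res = req.execute()
        for i in res['items']:
            print(i['id']['videoId'])
        # The UnicodeEncodeError is raised here for non-ASCII queries
        req = ys.list_next(req, res)


# The same two queries, first as literals, then as \u escapes
test(u'한글')
test(u'日本語')
test(u'\uD55C\uAE00')
test(u'\u65E5\u672C\u8A9E')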

Error message:

Traceback (most recent call last):
  File "E:\prj\scripts\ytsearch.py", line 316, in _search
    req = ys.list_next(req, res)
  File "D:\Apps\Python\lib\site-packages\googleapiclient\discovery.py", line 966, in methodNext
    parsed[4] = urlencode(newq)
  File "D:\Apps\Python\lib\urllib.py", line 1343, in urlencode
    v = quote_plus(str(v))
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-9: ordinal not in range(128)

Versions:

google-api-python-client (1.6.2)
Python 2.7.13 (Win32)

Edit: I posted a workaround below.

In case anyone else is interested, here is a workaround that works for me:

googleapiclient/discovery.py:
(old) q = parse_qsl(parsed[4])
(new) q = parse_qsl(parsed[4].encode('ascii'))

Explanation

In discovery.py, list_next() parses and unescapes the previous URL, then builds a new URL from it:

pageToken = previous_response['nextPageToken']
parsed = list(urlparse(request.uri))
q = parse_qsl(parsed[4])
# Find and remove old 'pageToken' value from URI
newq = [(key, value) for (key, value) in q if key != 'pageToken']
newq.append(('pageToken', pageToken))
parsed[4] = urlencode(newq)
uri = urlunparse(parsed)
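
For anyone who wants to poke at this outside the library, here is a rough standalone sketch of the rebuild step above. It is not the library's code: with_page_token is a name I made up, the example.com URL is a placeholder, and the Python 2 branch simply applies the same encode('ascii') idea as the workaround.

```python
import sys

try:  # Python 3
    from urllib.parse import urlparse, urlunparse, parse_qsl, urlencode
except ImportError:  # Python 2
    from urlparse import urlparse, urlunparse, parse_qsl
    from urllib import urlencode


def with_page_token(uri, page_token):
    """Hypothetical helper: return `uri` with its pageToken replaced."""
    # On Python 2, hand parse_qsl a byte string so the unescaped values
    # stay UTF-8 bytes that urlencode can quote again.
    if sys.version_info[0] == 2 and isinstance(uri, unicode):  # noqa: F821
        uri = uri.encode('ascii')
    parsed = list(urlparse(uri))
    # Drop any old pageToken, then append the new one.
    query = [(k, v) for (k, v) in parse_qsl(parsed[4]) if k != 'pageToken']
    query.append(('pageToken', page_token))
    parsed[4] = urlencode(query)
    return urlunparse(parsed)


# Placeholder URL; the q value is the percent-encoded UTF-8 query.
print(with_page_token(
    'https://example.com/search?q=%ED%95%9C%EA%B8%80&pageToken=OLD',
    'NEW'))
```

The percent-encoded q value should come out of the round trip unchanged, with only the pageToken swapped.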

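The problem seems to be that when parse_qsl unescapes parsed[4], which is a unicode object, it returns the UTF-8 encoded bytes wrapped in a unicode type. urlencode does not like this: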

q = urlparse.parse_qsl(u'q=%ED%95%9C%EA%B8%80')
[(u'q', u'\xed\x95\x9c\xea\xb8\x80')]
urllib.urlencode(q)
UnicodeEncodeError

If parse_qsl is given a plain ASCII byte string instead, it returns a plain UTF-8 encoded byte string, which urlencode accepts:

q = urlparse.parse_qsl(u'q=%ED%95%9C%EA%B8%80'.encode('ascii'))
[('q', '\xed\x95\x9c\xea\xb8\x80')]
urllib.urlencode(q)
'q=%ED%95%9C%EA%B8%80'
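
If you would rather not edit the installed discovery.py, the same idea might be applied from the calling side. This is an untested sketch: it assumes the request object exposes its URL as a writable uri attribute, as the excerpt above suggests. force_byte_uri and FakeRequest are made-up names, and FakeRequest only stands in for a real request object so the snippet runs on its own.

```python
import sys


def force_byte_uri(request):
    """Hypothetical helper: on Python 2, turn a unicode request.uri into str."""
    if sys.version_info[0] == 2 and isinstance(request.uri, unicode):  # noqa: F821
        # The URI is already percent-encoded, so it should be pure ASCII.
        request.uri = request.uri.encode('ascii')
    return request


class FakeRequest(object):
    """Stand-in for a real request object; only carries a uri attribute."""

    def __init__(self, uri):
        self.uri = uri


req = force_byte_uri(FakeRequest(u'https://example.com/search?q=%ED%95%9C%EA%B8%80'))
print(type(req.uri).__name__)
```

In the loop from the question this would be called as req = ys.list_next(force_byte_uri(req), res). On Python 3 the helper leaves the request untouched.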
