Decoding/encoding href links



How can I get expected to hold readable strings? In other words, given /wiki/Cookbook:Cao_l%E1%BA%A7u, it should return /wiki/Cookbook:Cao_lầu

Note: I am running Python 2.7.2

import urllib
test_array = [
    '/wiki/Cookbook:Bulgarian_Meatball_Soup_(Supa_Topcheta)',
    '/wiki/Cookbook:Campfire_S%27mores',
    '/wiki/Cookbook:Candied_Almonds_(Br%C3%A4nda_mandlar)',
    '/wiki/Cookbook:Chicken_%26_Pasta_Alfredo',   
    '/wiki/Cookbook:Cozido_%C3%A0_Portuguesa'
]
expected = [urllib.unquote(i).decode('utf-8') for i in test_array]
assert '/wiki/Cookbook:Bulgarian_Meatball_Soup_(Supa_Topcheta)' == expected[0]
assert "/wiki/Cookbook:Campfire_S'mores" == expected[1]
assert '/wiki/Cookbook:Candied_Almonds_(Brända_mandlar)' == expected[2]
assert '/wiki/Cookbook:Chicken_&_Pasta_Alfredo' == expected[3]
assert '/wiki/Cookbook:Cozido_à_Portuguesa' == expected[4]

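You need to use unicode literals (prefixed with u) instead of plain string literals, because str.decode returns a unicode object. In Python 2, a byte string containing non-ASCII characters never compares equal to a unicode object, so the asserts for the names with ä and à fail even though the decoding itself worked.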

assert u'/wiki/Cookbook:Bulgarian_Meatball_Soup_(Supa_Topcheta)' == expected[0]
assert u"/wiki/Cookbook:Campfire_S'mores" == expected[1]
assert u'/wiki/Cookbook:Candied_Almonds_(Brända_mandlar)' == expected[2]
assert u'/wiki/Cookbook:Chicken_&_Pasta_Alfredo' == expected[3]
assert u'/wiki/Cookbook:Cozido_à_Portuguesa' == expected[4]

By the way, I would give expected a different name, such as actual or got. (The string literals are the expected results, right?)
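Putting both suggestions together, here is a minimal sketch rather than a definitive version. The helper name decode_href is made up for illustration, and the import fallback is an assumption added so the same file also runs on Python 3, where unquote lives in urllib.parse and already returns text. On Python 2 it assumes the href is a byte string. The non-ASCII characters are written as escapes so that no source encoding declaration is needed.

```python
try:
    from urllib import unquote  # Python 2
except ImportError:
    from urllib.parse import unquote  # Python 3


def decode_href(href):
    """Percent-decode href and return it as a unicode string."""
    text = unquote(href)
    # Python 2 returns a byte string here; Python 3 already returns text.
    if isinstance(text, bytes):
        text = text.decode('utf-8')
    return text


test_array = [
    '/wiki/Cookbook:Campfire_S%27mores',
    '/wiki/Cookbook:Candied_Almonds_(Br%C3%A4nda_mandlar)',
    '/wiki/Cookbook:Cao_l%E1%BA%A7u',
]
# "actual" holds what the code produced; the literals are the expected values.
actual = [decode_href(i) for i in test_array]

assert actual[0] == u"/wiki/Cookbook:Campfire_S'mores"
assert actual[1] == u'/wiki/Cookbook:Candied_Almonds_(Br\xe4nda_mandlar)'
assert actual[2] == u'/wiki/Cookbook:Cao_l\u1ea7u'
```

Comparing against u'...' literals keeps both sides of each assert the same type, which is the point of the fix above.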
