How to convert a link containing Arabic words into a link without Arabic words



I ran into a problem with some links while scraping certain websites:

'https://www.booking.com/searchresults.en-gb.html?aid=304142&label=gen173nr-1FCAEoggI46AdIM1gEaMQBiAECmAEBuAEJyAER2AEB6AEB-AELiAIBqAIDuAKkxcbwBcACAQ&sid=ccf3ed025a0327718f1b96dae900f584&tmpl=searchresults&checkin=2020-01-15&checkin_monthday=15&checkin_year_month=2020-1&checkout=2020-01-16&checkout_monthday=16&checkout_year_month=2020-1&class_interval=1&dest_id=-3096949&dest_type=city&group_adults=2&group_children=0&label_click=undef&lang=en-gb&nflt=class%3D5%3B&no_rooms=1&offset=0&order=distance_from_search&raw_dest_type=city&room1=A%2CA&sb_price_type=total&search_form_id=e7c33dd2f07c015a&shw_aparth=1&slp_r_match=0&soz=1&srpvid=82633e8bdf320042&ss=مكة%20المكرمة&ssb=empty&ssne=مكة%20المكرمة&ssne_untouched=مكة%20المكرمة&top_ufis=1&lang_click=top&cdl=en-us&lang_changed=1'

As you can see, this link contains Arabic words, whereas this one does not:

"https://www.booking.com/searchresults.html?aid=304142&label=gen173nr-1FCAEoggI46AdIM1gEaMQBiAECmAEBuAEJyAER2AEB6AEB-AELiAIBqAIDuAKkxcbwBcACAQ&sid=b9b7a61ae544dc90ed678f4d6474c47c&checkin=2020-01-15&checkin_monthday=15&checkin_year_month=2020-1&checkout=2020-01-16&checkout_monthday=16&checkout_year_month=2020-1&class_interval=1&dest_id=-3096949&dest_type=city&dtdisc=0&group_adults=2&group_children=0&inac=0&index_postcard=0&label_click=undef&nflt=class%3D5%3B&no_rooms=1&offset=0&order=distance_from_search&postcard=0&raw_dest_type=city&room1=A%2CA&sb_price_type=total&search_form_id=e7c33dd2f07c015a&shw_aparth=1&slp_r_match=0&srpvid=2bed3e3ab8640088&ss=%D9%85%D9%83%D8%A9%20%D8%A7%D9%84%D9%85%D9%83%D8%B1%D9%85%D8%A9&ss_all=0&ssb=empty&sshis=0&ssne=%D9%85%D9%83%D8%A9%20%D8%A7%D9%84%D9%85%D9%83%D8%B1%D9%85%D8%A9&ssne_untouched=%D9%85%D9%83%D8%A9%20%D8%A7%D9%84%D9%85%D9%83%D8%B1%D9%85%D8%A9&top_ufis=1&lang_changed=1&lang=en-us&selected_currency=SAR"

The second link leads to the same page as the one with the Arabic words. How can I convert the first form into the second?

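This seems to work:

>>> import urllib.parse
>>> urllib.parse.quote('s=مكة%20المكرمة', safe='=%')
's=%D9%85%D9%83%D8%A9%20%D8%A7%D9%84%D9%85%D9%83%D8%B1%D9%85%D8%A9'

(Passing safe='=%' leaves the "=" separator and the existing "%20" untouched. With the default arguments, quote() would also turn "=" into "%3D" and "%20" into "%2520"; inside a query string those are not guaranteed to be treated the same as the originals.)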
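To apply the same idea to a whole URL rather than a single parameter, something along these lines might work. It is only a sketch: the helper name `encode_non_ascii` and the shortened example URL are made up for illustration, and it assumes the hostname is already ASCII (internationalized domain names would need separate handling).

```python
from urllib.parse import quote, unquote, urlsplit, urlunsplit


def encode_non_ascii(url: str) -> str:
    """Percent-encode non-ASCII characters in the path and query of a URL.

    Delimiters such as ?, & and = and existing escapes such as %20
    are left untouched. (Hypothetical helper, not part of any library.)
    """
    parts = urlsplit(url)
    # '%' is listed as safe so already-escaped sequences are not double-encoded.
    path = quote(parts.path, safe="/%:@!$&'()*+,;=")
    query = quote(parts.query, safe="/%:@!$&'()*+,;=?")
    return urlunsplit((parts.scheme, parts.netloc, path, query, parts.fragment))


# Shortened, illustrative version of the URL from the question.
url = ("https://www.booking.com/searchresults.en-gb.html"
       "?dest_type=city&ss=مكة%20المكرمة&nflt=class%3D5%3B")
encoded = encode_non_ascii(url)
print(encoded)

# Decoding both forms gives the same text, so they name the same resource.
print(unquote(encoded) == unquote(url))
```

Because "%" is treated as safe, running the helper on an already encoded URL should leave it unchanged, so it can be applied to scraped links without checking first whether they contain Arabic text.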
