Python URL parsing: 'function' object has no attribute 'urlparse'



I am trying to parse a URL that I scraped with scrapy:

def parse_info_has_id(self, css_path):
    profileID = ""
    for div in css_path.xpath('div'):
        url = "".join(div.css('div > a::attr(href)').extract())
        if "add_friend.php?id" in url:
            print(url)
            #parsed = urlparse.urlparse(url)
            #print urlparse.parse_qs(parsed.query)['id']
    return profileID

This prints:

/a/mobile/friends/add_friend.php?id=100003669247258&hf=search&sld=eyJzZWFyY2hfc2lkIjoiNGYxMmNhZGJhZDVkOGQ5ZGFkN2RkZTdhYjc3MTMwNTQiLCJxdWVyeSI6IjIwMjM2MDg3OTciLCJzZWFyY2hfdHlwZSI6IlNlYXJjaCIsInNlcXVlbmNlX2lkIjoxOTg2MTg0OTIzLCJwYWdlX251bWJlciI6MSwiZmlsdGVyX3R5cGUiOiJTZWFyY2giLCJlbnRfaWQiOjEwMDAwMzY2OTI0NzI1OCwicG9zaXRpb24iOjAsInJlc3VsdF90eXBlIjoyMDQ4fQ%3D%3D&gfid=AQB03j5V7CqqGQSD/graphsearch/100003669247258/photos-of?ent=100003669247258&refid=0&query=2023608797&sld=eyJzZWFyY2hfc2lkIjoiNGYxMmNhZGJhZDVkOGQ5ZGFkN2RkZTdhYjc3MTMwNTQiLCJxdWVyeSI6IjIwMjM2MDg3OTciLCJzZWFyY2hfdHlwZSI6IlNlYXJjaCIsInNlcXVlbmNlX2lkIjoxOTg2MTg0OTIzLCJwYWdlX251bWJlciI6MSwiZmlsdGVyX3R5cGUiOiJTZWFyY2giLCJlbnRfaWQiOjEwMDAwMzY2OTI0NzI1OCwicG9zaXRpb24iOjAsInJlc3VsdF90eXBlIjoyMDQ4fQ%3D%3D&source=pivot

I want to get the ID (100003669247258) from this string, but when I try

#parsed = urlparse.urlparse(url)
#print urlparse.parse_qs(parsed.query)['id']

I get the error 'function' object has no attribute 'urlparse'. How can I parse this URL string to get the ID from add_friend.php?id=100003669247258 or /graphsearch/100003669247258/?

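On Python 3, you can use

import urllib.parse as urlparse

so that urlparse.urlparse(url) and urlparse.parse_qs(...) work as written. The error means the name urlparse is bound to the function rather than to the module, which is what happens when the library is imported like this (Python 2):

from urlparse import urlparse

In that case, call the function as:

urlparse(url)

instead of:

urlparse.urlparse(url)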
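
As a minimal sketch of the whole lookup on Python 3, the snippet below imports the two functions directly and reads the `id` query parameter. The helper name `extract_profile_id` and the shortened sample URL are assumptions made for illustration; they are not part of the question's code.

```python
from urllib.parse import urlparse, parse_qs


def extract_profile_id(url):
    """Return the first `id` query parameter of `url`, or None if it is absent."""
    query = urlparse(url).query          # everything after the first "?"
    values = parse_qs(query).get("id")   # parse_qs maps each key to a list of values
    return values[0] if values else None


# Shortened, hypothetical version of the URL printed in the question.
sample_url = "/a/mobile/friends/add_friend.php?id=100003669247258&hf=search"
print(extract_profile_id(sample_url))
```

Because `parse_qs` returns a list per key, the helper takes the first element and falls back to `None` instead of raising `KeyError` when the parameter is missing.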

If your goal is only to get the ID out of the string, you can do it with re.

import re

# \d+ matches the run of digits that follows "id="
match_object = re.search(r"id=(\d+)", "/a/mobile/friends/add_friend.php?id=100003669247258&hf=search&sld=eyJzZWFyY2hfc2lkIjoiNGYxMmNhZGJhZDVkOGQ5ZGFkN2RkZTdhYjc3MTMwNTQiLCJxdWVyeSI6IjIwMjM2MDg3OTciLCJzZWFyY2hfdHlwZSI6IlNlYXJjaCIsInNlcXVlbmNlX2lkIjoxOTg2MTg0OTIzLCJwYWdlX251bWJlciI6MSwiZmlsdGVyX3R5cGUiOiJTZWFyY2giLCJlbnRfaWQiOjEwMDAwMzY2OTI0NzI1OCwicG9zaXRpb24iOjAsInJlc3VsdF90eXBlIjoyMDQ4fQ%3D%3D&gfid=AQB03j5V7CqqGQSD/graphsearch/100003669247258/photos-of?ent=100003669247258&refid=0&query=2023608797&sld=eyJzZWFyY2hfc2lkIjoiNGYxMmNhZGJhZDVkOGQ5ZGFkN2RkZTdhYjc3MTMwNTQiLCJxdWVyeSI6IjIwMjM2MDg3OTciLCJzZWFyY2hfdHlwZSI6IlNlYXJjaCIsInNlcXVlbmNlX2lkIjoxOTg2MTg0OTIzLCJwYWdlX251bWJlciI6MSwiZmlsdGVyX3R5cGUiOiJTZWFyY2giLCJlbnRfaWQiOjEwMDAwMzY2OTI0NzI1OCwicG9zaXRpb24iOjAsInJlc3VsdF90eXBlIjoyMDQ4fQ%3D%3D&source=pivot")
profile_id = match_object.group(1)
print(profile_id)
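
A slightly more defensive variant of the regex approach might look like the sketch below. It anchors the match on `?` or `&` so that parameters whose names merely end in "id" (such as `gfid`) are not picked up, and it returns `None` when nothing matches. The names `ID_PATTERN` and `find_id` are hypothetical.

```python
import re

# Match "id=<digits>" only when "id" is a whole parameter name,
# i.e. directly preceded by "?" or "&".
ID_PATTERN = re.compile(r"[?&]id=(\d+)")


def find_id(url):
    """Return the digits of the `id` parameter, or None if there is no match."""
    match = ID_PATTERN.search(url)
    return match.group(1) if match else None


print(find_id("/a/mobile/friends/add_friend.php?id=100003669247258&hf=search"))
```

Checking the match before calling `group(1)` avoids an `AttributeError` on URLs that carry no `id` parameter.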

On Python 2, you can import the urlparse function as:

from urlparse import urlparse

or, on Python 3, import the module from urllib as:

from urllib import parse as urlparse
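
If the same spider has to run on both Python 2 and Python 3, one common pattern is to try the Python 3 import first and fall back to the Python 2 module. This is only a sketch; the sample URL is a shortened, hypothetical version of the one in the question.

```python
try:
    # Python 3: the functions live in urllib.parse
    from urllib.parse import urlparse, parse_qs
except ImportError:
    # Python 2: the functions live in the top-level urlparse module
    from urlparse import urlparse, parse_qs

parsed = urlparse("/a/mobile/friends/add_friend.php?id=100003669247258&hf=search")
ids = parse_qs(parsed.query)["id"]  # list of values for the "id" key
print(ids[0])
```

Importing the functions by name keeps the call sites identical on both versions, so `urlparse(url)` never turns into an attribute lookup on a function.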
