I have a text file (links.txt) in the following format:
www.independent.co.uk www.bbc.co.uk www.theguardian.com www.telegraph.co.uk
www.dailymail.co.uk en.wikipedia.org www.huffingtonpost.co.uk www.bbc.co.uk
www.newsnow.co.uk www.express.co.uk
I have another text file (keys.txt) in the following format:
www.independent.co.uk www.bbc.co.uk www.theguardian.com
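I want to compare the two text files and print every URL that appears in both of them.
I tried using the urltools package in Python, but I couldn't get it to work on multiple URLs.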
How about this:
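# Read links.txt and keep its URLs as a set of whole tokens
with open('links.txt', 'r') as links:
    links_data = set(links.read().split())
# Read keys.txt; split() breaks on any whitespace, including newlines
with open('keys.txt', 'r') as keys:
    keys_split = keys.read().split()
# Print each URL from keys.txt that also appears in links.txt
for url in keys_split:
    if url in links_data:
        print(url)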
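Just make sure links.txt and keys.txt are in the current working directory and everything should work. I'm assuming your URLs will always be separated by whitespace.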
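As a small illustration of that whitespace assumption, here is a sketch (the helper name `tokens` and the sample string are made up for this example). It shows that `str.split()` with no arguments treats spaces, tabs, and newlines alike. It also shows that checking membership in the resulting set matches whole URLs only, whereas `in` on the raw file text would also match a fragment of a URL.

```python
def tokens(text):
    """Split text on any run of whitespace and return the set of unique tokens."""
    return set(text.split())

# Hypothetical sample mixing double spaces, a newline, and a tab as separators
sample = "www.bbc.co.uk  www.theguardian.com\nen.wikipedia.org\twww.express.co.uk"
url_set = tokens(sample)

# Whole-token membership: only complete URLs match
exact_match = "www.bbc.co.uk" in url_set
partial_in_set = "bbc.co.uk" in url_set
# Substring search on the raw text also matches part of a URL
partial_in_text = "bbc.co.uk" in sample

print(sorted(url_set))
print(exact_match, partial_in_set, partial_in_text)
```

If your files might ever contain a shortened form of a URL that is also present in full, comparing whole tokens is probably the safer choice.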
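To print only the URLs that are unique to keys.txt (those that do not appear in links.txt) instead of the common ones, just change the condition to not in. Here is the complete code: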
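# Read links.txt and keep its URLs as a set of whole tokens
with open('links.txt', 'r') as links:
    links_data = set(links.read().split())
# Read keys.txt; split() breaks on any whitespace, including newlines
with open('keys.txt', 'r') as keys:
    keys_split = keys.read().split()
# Print each URL from keys.txt that does not appear in links.txt
for url in keys_split:
    if url not in links_data:
        print(url)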
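If you want both lists at once, the two loops can be folded into one pass. The following is only a sketch under a few assumptions: the function name `compare_urls` is made up, the file contents are passed in as strings so it runs without any files on disk, and `www.example.org` is a hypothetical extra key added so that the "unique" list is not empty.

```python
def compare_urls(links_text, keys_text):
    """Return (common, only_in_keys) as lists, in the order URLs appear in keys_text."""
    links_set = set(links_text.split())
    common, only_in_keys = [], []
    seen = set()  # skip keys that are repeated in keys_text
    for url in keys_text.split():
        if url in seen:
            continue
        seen.add(url)
        if url in links_set:
            common.append(url)
        else:
            only_in_keys.append(url)
    return common, only_in_keys

# Sample data from the question; www.example.org is a hypothetical addition
links_text = """www.independent.co.uk www.bbc.co.uk www.theguardian.com www.telegraph.co.uk
www.dailymail.co.uk en.wikipedia.org www.huffingtonpost.co.uk www.bbc.co.uk
www.newsnow.co.uk www.express.co.uk"""
keys_text = "www.independent.co.uk www.bbc.co.uk www.theguardian.com www.example.org"

common, only_in_keys = compare_urls(links_text, keys_text)
print("common:", common)
print("only in keys.txt:", only_in_keys)
```

To use it with the real files, you would read each one with open(...).read() and pass the two strings in.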