"unknown url type" error when using urllib2.urlopen()



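I am trying to do the following:

  • Open a CSV file that contains a list of URLs (GET requests)
  • Read the CSV file and write the entries to a list
  • Open each URL and read the response
  • Write the responses back to a new CSV file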

I get the following error:

Traceback (most recent call last):
  File "C:\Users\l.buinui\Desktop\request2.py", line 16, in <module>
    req = urllib2.urlopen(url)
  File "C:\Python27\lib\urllib2.py", line 127, in urlopen
    return _opener.open(url, data, timeout)
  File "C:\Python27\lib\urllib2.py", line 404, in open
    response = self._open(req, data)
  File "C:\Python27\lib\urllib2.py", line 427, in _open
    'unknown_open', req)
  File "C:\Python27\lib\urllib2.py", line 382, in _call_chain
    result = func(*args)
  File "C:\Python27\lib\urllib2.py", line 1247, in unknown_open
    raise URLError('unknown url type: %s' % type)
URLError: <urlopen error unknown url type: ['http>

Here is the code I am using:

import urllib2
import urllib
import csv
# Open and read the source file and write entries to a list called link_list
source_file=open("source_new.csv", "rb")
source = csv.reader(source_file, delimiter=";")
link_list = [row for row in source]
source_file.close()
# Create an output file which contains the answer of the GET-Request
out=open("output.csv", "wb")
output = csv.writer(out, delimiter=";")
for row in link_list:
    url = str(row)
    req = urllib2.urlopen(url)
    output.writerow(req.read())
out.close()

What is going wrong?

Thanks in advance for any hints.

Cheers

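Using the row variable hands urlopen a list (one that contains a single element, the URL), whereas row[0] passes the string that contains the URL. In your code, str(row) turns that list into the text "['http://...']", brackets and quote included, which is why the error reports the URL type as ['http.

csv.reader returns a list for every line it reads, no matter how many items the line contains.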
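
To make the point above concrete, here is a small sketch of what csv.reader hands back and what str(row) does to it. It is written for Python 3 and uses an in-memory io.StringIO with made-up example.com URLs as a stand-in for source_new.csv; both are assumptions for illustration, not part of the original script.

```python
import csv
import io

# Hypothetical stand-in for source_new.csv: one URL per line, ';' as delimiter.
sample = io.StringIO("http://example.com/a\nhttp://example.com/b\n")

rows = list(csv.reader(sample, delimiter=";"))

first_row = rows[0]        # a list, even though the line holds a single field
as_text = str(first_row)   # what the question's code passed to urlopen
url = first_row[0]         # the actual URL string

print(type(first_row).__name__)  # list
print(as_text)                   # ['http://example.com/a']
print(url)                       # http://example.com/a
```

The leading `['` in the second line of output is the same text that shows up in the "unknown url type" message.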

It works now. If I reference row[0] directly in the loop, there is no problem.

import urllib2
import urllib
import csv
# Open and read the source file and write entries to a list called link_list
source_file=open("source.csv", "rb")
source = csv.reader(source_file, delimiter=";")
link_list = [row for row in source]
source_file.close()
# Create an output file which contains the answer of the GET-Request
out=open("output.csv", "wb")
output = csv.writer(out)
for row in link_list:
    req = urllib2.urlopen(row[0])
    answer = req.read()
    output.writerow([answer])

out.close()
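
For anyone on Python 3, where urllib2 no longer exists, the same approach could look roughly like the sketch below. It swaps in urllib.request.urlopen, and the helper names fetch_body and copy_responses, the UTF-8 decoding, and the skipping of blank lines are all assumptions added for illustration rather than part of the original script.

```python
import csv
import urllib.request


def fetch_body(url):
    """Return the body of a GET request as text (assumes UTF-8 content)."""
    with urllib.request.urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")


def copy_responses(source_path, output_path, fetch=fetch_body):
    """Read one URL per row from source_path and write each response
    as a single-field row to output_path."""
    with open(source_path, newline="") as src, \
            open(output_path, "w", newline="") as dst:
        reader = csv.reader(src, delimiter=";")
        writer = csv.writer(dst)
        for row in reader:
            if not row:  # csv.reader yields an empty list for a blank line
                continue
            # row is a list; row[0] is the URL string itself.
            # writerow expects a sequence of fields, hence the one-item list.
            writer.writerow([fetch(row[0])])
```

Calling copy_responses("source.csv", "output.csv") would mirror the script above. The fetch parameter is only there so the function can be tried out without network access by passing a stand-in function.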
