Using the wikipedia module in Python



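I am using the wikipedia module in my Python code. I want to take input from the user, search Wikipedia for it, and print 2 sentences from the summary. Since many topics can share the same name, I wrote it like this: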

import wikipedia
value=input("Enter what u want to search")
m=wikipedia.search(value,3)
print(wikipedia.summary(m[0],sentences=2))

When I run this, it shows about 3 pages of exceptions. What is wrong with it? Edit: as @Ruperto suggested, I changed the code like this:

import wikipedia
import random
value=input("Enter the words: ")
try:
    p=wikipedia.page(value)
    print(p)
except wikipedia.exceptions.DisambiguationError as e:
    s=random.choice(e.options)
    p=wikipedia.summary(s,sentences=2)
    print(p)

Now the error I get is:

Traceback (most recent call last):
  File "C:\Users\vdhan\AppData\Local\Programs\Python\Python37-32\lib\site-packages\urllib3\connection.py", line 160, in _new_conn
    (self._dns_host, self.port), self.timeout, **extra_kw
  File "C:\Users\vdhan\AppData\Local\Programs\Python\Python37-32\lib\site-packages\urllib3\util\connection.py", line 84, in create_connection
    raise err
  File "C:\Users\vdhan\AppData\Local\Programs\Python\Python37-32\lib\site-packages\urllib3\util\connection.py", line 74, in create_connection
    sock.connect(sa)
TimeoutError: [WinError 10060] A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\vdhan\AppData\Local\Programs\Python\Python37-32\lib\site-packages\urllib3\connectionpool.py", line 677, in urlopen
    chunked=chunked,
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPConnection object at 0x03AEEAF0>: Failed to establish a new connection: [WinError 10060] A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond

What should I do now?

This is probably caused by a missing or poor internet connection, as your error says:

A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond

You can check or change your internet connection and try again. If that does not help, the problem is with your Python environment. My implementation is:

import warnings
warnings.filterwarnings("ignore")
import wikipedia
import random

value=input("Enter the words: ")
try:
    # Search first, then summarize the top search result
    m=wikipedia.search(value,3)
    print(wikipedia.summary(m[0],sentences=2))
except wikipedia.exceptions.DisambiguationError as e:
    # Ambiguous title: fall back to one of the suggested options, chosen at random
    s=random.choice(e.options)
    p=wikipedia.summary(s,sentences=2)
    print(p)

Output:

Enter the words: programming
Program management or programme management is the process of managing several related projects, often with the intention of improving an organization's performance. In practice and in its aims, program management is often closely related to systems engineering, industrial engineering, change management, and business transformation.

It runs fine in Google Colab; you can find my Colab notebook with this implementation here.

The error above is caused by an internet connection problem. However, the code below works:

import random
import wikipedia

value=input("Enter the words: ")
try:
    m=wikipedia.search(value,3)
    print(wikipedia.summary(m[0],sentences=2))
except wikipedia.exceptions.DisambiguationError as e:
    s=random.choice(e.options)
    p=wikipedia.summary(s,sentences=2)
    print(p)

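One thing to note, though: since this is part of a larger block of code, it would be better to use an NLP library for abstractive or extractive summarization, because the wikipedia package only uses beautifulsoup and soupsieve for web scraping and returns the first few lines of the page rather than a true summary. Also, content on Wikipedia can change as often as every 2 hours.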
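To illustrate what extractive summarization means here, below is a deliberately naive sketch in plain Python. It is not an NLP library and not what the wikipedia package does: `split_sentences`, `extractive_summary`, and the tiny `STOPWORDS` set are all hypothetical, and the scoring (average word frequency per sentence) is only one simple heuristic among many.

```python
import re
from collections import Counter

# A tiny, hand-picked stopword list (an assumption; real libraries ship larger ones)
STOPWORDS = {"a", "an", "the", "is", "was", "it", "of", "for", "by", "in",
             "on", "and", "or", "to"}


def split_sentences(text):
    """Split on whitespace that follows '.', '!' or '?' (naive; breaks on abbreviations)."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def tokenize(text):
    """Lowercase words with stopwords removed."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]


def extractive_summary(text, max_sentences=2):
    """Keep the highest-scoring sentences, in their original order."""
    sentences = split_sentences(text)
    freq = Counter(tokenize(text))

    def score(sentence):
        tokens = tokenize(sentence)
        if not tokens:
            return 0.0
        # Average document-wide frequency of the words in this sentence
        return sum(freq[w] for w in tokens) / len(tokens)

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    chosen = sorted(ranked[:max_sentences])  # restore document order
    return " ".join(sentences[i] for i in chosen)


sample = ("Python is a programming language. It was created by Guido van Rossum. "
          "Python is popular for data analysis. The weather was nice yesterday.")
summary = extractive_summary(sample, max_sentences=2)
print(summary)
```

Unlike taking the first two sentences, this selects sentences by content, which is closer to what a dedicated summarization library would do, although a real library will handle tokenization and scoring far better.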

I ran into a similar problem and, after a lot of head-scratching and googling, found this solution:

import wikipediaapi as api
import wikipedia as wk
# Wikipediaapi 'initialization'
# (newer wikipedia-api releases may also require a user_agent argument)
wiki_wiki = api.Wikipedia('en')

# Getting fixed number of sentences from summary
def summary(pg, sentences=5):
    summ = pg.summary.split('. ')
    summ = '. '.join(summ[:sentences])
    summ += '.'
    return summ

s_term = 'apple'  # Any term, ambiguous or not
wk_res = wk.search(s_term)
page = wiki_wiki.page(wk_res[0])
print("Page summary", summary(page))

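Basically, from what I have seen, using the wikipedia module alone does not give a good solution. For example, if I search for "India", I can never get the page for India the country, which is what I want. This happens because the title of the Wikipedia page for India (the country) is simply "India". However, since that title can refer to so many things, the module treats it as ambiguous and rejects it. The same applies to many other topics.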

However, wiki_wiki.page can fetch a page even when its title is ambiguous, and that is what this code relies on.
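One caveat about the `summary` helper above: it splits on '. ' and always appends a period, so when the text has fewer sentences than requested the result ends in a double period. A possible alternative is sketched below; `first_sentences` is a hypothetical name, and the regex-based splitting is a simplifying assumption that will still trip over abbreviations such as "e.g.".

```python
import re


def first_sentences(text, sentences=5):
    """Return the first `sentences` sentences of `text`, keeping their punctuation."""
    # Split on whitespace that follows sentence-ending punctuation
    parts = [p for p in re.split(r"(?<=[.!?])\s+", text.strip()) if p]
    return " ".join(parts[:sentences])


sample = "Apple is a fruit. It grows on trees. It is often red."
print(first_sentences(sample, 2))   # Apple is a fruit. It grows on trees.
print(first_sentences(sample, 10))  # whole text, with no extra trailing period
```

With the code above, this could be called as `first_sentences(page.summary, 2)` in place of `summary(page)`.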
