Creating a DataFrame using web scraping



I'm trying to scrape a website called WikiCFP and return the information in its table as a DataFrame. This is the code I have so far:

import requests
from bs4 import BeautifulSoup
import pandas as pd
import re

df = pd.DataFrame(columns=["abbreviation", "name", "dates", "place", "deadline"])
url = "http://www.wikicfp.com/cfp/call?conference=computer%20science&page=1"
response = requests.get(url)
soup = BeautifulSoup(response.content, "html.parser")
# Locate the results table by its layout attributes
table = soup.find("table", align="center", cellpadding="3", cellspacing="1", width="100%")
# Skip the header row, then print the first line of the first cell in each row
for row in table.find_all("tr")[1:]:
    values = row.find_all("td")
    print(values[0].text.split("\n")[0])

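In particular, I don't know how to turn the text in each row into a usable list, or into something else that a DataFrame can be built from. Thanks in advance.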

You can use read_html directly:

dfs = pd.read_html(url, header=0)  # returns every table on the page as a list of DataFrames
df = dfs[4]  # 4 is the index of the DataFrame you want

Output:

>>> df
Event                                               When                                              Where                                           Deadline  Unnamed: 4
0                              DSA 2021  2nd International Conference on Data Science a...  2nd International Conference on Data Science a...  2nd International Conference on Data Science a...         NaN
1                              DSA 2021                        Nov 27, 2021 - Nov 28, 2021                                         Dubai, UAE                                       Oct 24, 2021         NaN
2                            NCWMC 2022  7th International Conference on Networks, Comm...  7th International Conference on Networks, Comm...  7th International Conference on Networks, Comm...         NaN
3                            NCWMC 2022                        Jul 30, 2022 - Jul 31, 2022                             London, United Kingdom                                       Oct 24, 2021         NaN
4                            ICAIT 2022  11th International Conference on Advanced Comp...  11th International Conference on Advanced Comp...  11th International Conference on Advanced Comp...         NaN
5                            ICAIT 2022                        Jul 23, 2022 - Jul 24, 2022                                    Toronto, Canada                                       Oct 24, 2021         NaN
6                             CNDC 2021  8th International Conference on Computer Netwo...  8th International Conference on Computer Netwo...  8th International Conference on Computer Netwo...         NaN
7                             CNDC 2021                        Nov 27, 2021 - Nov 28, 2021                                         Dubai, UAE                                       Oct 24, 2021         NaN
8                            CAIML 2022  3rd International Conference on Artificial Int...  3rd International Conference on Artificial Int...  3rd International Conference on Artificial Int...         NaN
9                            CAIML 2022                        Jul 23, 2022 - Jul 24, 2022                                    Toronto, Canada                                       Oct 24, 2021         NaN
10                           ITCSE 2022  11th International Conference on Information T...  11th International Conference on Information T...  11th International Conference on Information T...         NaN
11                           ITCSE 2022                        Jul 23, 2022 - Jul 24, 2022                                    Toronto, Canada                                       Oct 24, 2021         NaN
12                           CCSIT 2022  12th International Conference on Computer Scie...  12th International Conference on Computer Scie...  12th International Conference on Computer Scie...         NaN
13                           CCSIT 2022                        Jul 30, 2022 - Jul 31, 2022                             London, United Kingdom                                       Oct 24, 2021         NaN
14                            SOEN 2022  7th International Conference on Software Engin...  7th International Conference on Software Engin...  7th International Conference on Software Engin...         NaN
15                            SOEN 2022                        Jul 30, 2022 - Jul 31, 2022                             London, United Kingdom                                       Oct 24, 2021         NaN
16                            AIAA 2021  11th International Conference on Artificial In...  11th International Conference on Artificial In...  11th International Conference on Artificial In...         NaN
17                            AIAA 2021                        Nov 27, 2021 - Nov 28, 2021                                         Dubai, UAE                                       Oct 24, 2021         NaN
18                           NLPTA 2021  2nd International Conference on NLP Techniques...  2nd International Conference on NLP Techniques...  2nd International Conference on NLP Techniques...         NaN
19                           NLPTA 2021                        Nov 27, 2021 - Nov 28, 2021                                         Dubai, UAE                                       Oct 24, 2021         NaN
20                            CSTY 2021  7th International Conference on Computer Scien...  7th International Conference on Computer Scien...  7th International Conference on Computer Scien...         NaN
21                            CSTY 2021                        Dec 18, 2021 - Dec 19, 2021                                         Dubai, UAE                                       Oct 24, 2021         NaN
22                          KG@SAC 2022             ACM SAC 2022 Track on Knowledge Graphs             ACM SAC 2022 Track on Knowledge Graphs             ACM SAC 2022 Track on Knowledge Graphs         NaN
23                          KG@SAC 2022                        Apr 25, 2022 - Apr 29, 2022                               Brno, Czech Republic                                       Oct 24, 2021         NaN
24                             E&C 2021  5th International Conference on Electrical & C...  5th International Conference on Electrical & C...  5th International Conference on Electrical & C...         NaN
25                             E&C 2021                        Nov 27, 2021 - Nov 28, 2021                                         Dubai, UAE                                       Oct 24, 2021         NaN
26                            IoTE 2021  2nd International Conference on Internet of Th...  2nd International Conference on Internet of Th...  2nd International Conference on Internet of Th...         NaN
27                            IoTE 2021                        Nov 27, 2021 - Nov 28, 2021                                         Dubai, UAE                                       Oct 24, 2021         NaN
28                            EEEN 2021  5th International Conference on Electrical and...  5th International Conference on Electrical and...  5th International Conference on Electrical and...         NaN
29                            EEEN 2021                        Nov 27, 2021 - Nov 28, 2021                                         Dubai, UAE                                       Oct 24, 2021         NaN
30                CVIE--EI, Scopus 2022  2022 2nd International Conference on Computer ...  2022 2nd International Conference on Computer ...  2022 2nd International Conference on Computer ...         NaN
31                CVIE--EI, Scopus 2022                        Feb 18, 2022 - Feb 20, 2022                                       Sanya, China                                       Oct 25, 2021         NaN
32               ICCDA--Ei, Scopus 2022  2022 The 6th International Conference on Compu...  2022 The 6th International Conference on Compu...  2022 The 6th International Conference on Compu...         NaN
33               ICCDA--Ei, Scopus 2022                        Feb 18, 2022 - Feb 20, 2022                                       Sanya, China                                       Oct 25, 2021         NaN
34       ACM--ICMLC--Ei and Scopus 2022  ACM--2022 14th International Conference on Mac...  ACM--2022 14th International Conference on Mac...  ACM--2022 14th International Conference on Mac...         NaN
35       ACM--ICMLC--Ei and Scopus 2022                        Feb 18, 2022 - Feb 20, 2022                                   Guangzhou, China                                       Oct 25, 2021         NaN
36  IEEE CSP--EI Compendex, Scopus 2022  2022 IEEE 6th International Conference on Cryp...  2022 IEEE 6th International Conference on Cryp...  2022 IEEE 6th International Conference on Cryp...         NaN
37  IEEE CSP--EI Compendex, Scopus 2022                        Jan 14, 2022 - Jan 16, 2022                                     Tianjin, China                                       Oct 25, 2021         NaN
38       ACM--ICMIP--Ei and Scopus 2022  ACM--2022 7th International Conference on Mult...  ACM--2022 7th International Conference on Mult...  ACM--2022 7th International Conference on Mult...         NaN
39       ACM--ICMIP--Ei and Scopus 2022                        Jan 14, 2022 - Jan 16, 2022                                     Tianjin, China                                       Oct 25, 2021         NaN
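
In the output above each event takes up two rows: one carrying the full name (repeated across the columns) and one carrying the dates, place and deadline. Below is a minimal sketch of folding each pair into a single row. It assumes the rows strictly alternate as in the output shown. The helper name merge_row_pairs is hypothetical, the target column names are borrowed from the question, and the two-event sample is copied from the output above, truncated names included.

```python
import pandas as pd


def merge_row_pairs(df):
    """Collapse each (name row, details row) pair into one record.

    Assumes even-positioned rows hold the full name and odd-positioned
    rows hold dates, place and deadline, as in the output shown above.
    """
    # Drop the empty trailing column(s) that read_html labels "Unnamed: N"
    df = df.drop(columns=[c for c in df.columns if str(c).startswith("Unnamed")])
    names = df.iloc[0::2].reset_index(drop=True)
    details = df.iloc[1::2].reset_index(drop=True)
    return pd.DataFrame(
        {
            "abbreviation": names["Event"],
            "name": names["When"],  # the name is repeated in every column
            "dates": details["When"],
            "place": details["Where"],
            "deadline": details["Deadline"],
        }
    )


# Sample rows copied from the output above (names are truncated there too)
name1 = "2nd International Conference on Data Science a..."
name2 = "7th International Conference on Networks, Comm..."
sample = pd.DataFrame(
    {
        "Event": ["DSA 2021", "DSA 2021", "NCWMC 2022", "NCWMC 2022"],
        "When": [name1, "Nov 27, 2021 - Nov 28, 2021", name2, "Jul 30, 2022 - Jul 31, 2022"],
        "Where": [name1, "Dubai, UAE", name2, "London, United Kingdom"],
        "Deadline": [name1, "Oct 24, 2021", name2, "Oct 24, 2021"],
        "Unnamed: 4": [float("nan")] * 4,
    }
)

merged = merge_row_pairs(sample)
print(merged)
```

With the real page you would pass in the DataFrame selected from the read_html result instead of sample.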

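Try calling getText() on each cell:

[td.getText() for td in values]

getText() (the same method as get_text()) returns the text content of a bs4 element. Because values is the list returned by find_all("td"), it has to be called on each cell rather than on the list itself.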
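
To get from cell text to a DataFrame, one option is to collect a list of strings per row and hand the list of rows to pandas. This is only a sketch: SAMPLE_HTML is a hypothetical, simplified table with one row per event (the real WikiCFP markup is more involved), and table_to_rows and the column names are illustrative.

```python
import pandas as pd
from bs4 import BeautifulSoup

# Hypothetical miniature table standing in for the real page
SAMPLE_HTML = """
<table>
  <tr><td>Event</td><td>When</td><td>Where</td><td>Deadline</td></tr>
  <tr><td>DSA 2021</td><td>Nov 27, 2021 - Nov 28, 2021</td><td>Dubai, UAE</td><td>Oct 24, 2021</td></tr>
  <tr><td>NCWMC 2022</td><td>Jul 30, 2022 - Jul 31, 2022</td><td>London, United Kingdom</td><td>Oct 24, 2021</td></tr>
</table>
"""


def table_to_rows(table):
    """Return one list of cell strings per <tr>, skipping the header row."""
    rows = []
    for tr in table.find_all("tr")[1:]:
        cells = [td.get_text(strip=True) for td in tr.find_all("td")]
        if cells:  # ignore rows that contain no <td> cells
            rows.append(cells)
    return rows


soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
rows = table_to_rows(soup.find("table"))
df = pd.DataFrame(rows, columns=["abbreviation", "dates", "place", "deadline"])
print(df)
```

Building the full list first and constructing the DataFrame once is usually simpler than appending rows to an empty DataFrame inside the loop.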