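I'm scraping names and phone numbers and exporting them to a CSV file.
Column A = name, column B = number.
How can I make column C contain "Phoenix" in every row, so that every row that has a name or a number also says Phoenix in column C?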
import csv
import requests
from bs4 import BeautifulSoup

realtor_data = []

for page in range(1, 6):
    print(f"Scraping page {page}...")
    url = f"https://www.realtor.com/realestateagents/phoenix_az/pg-{page}"
    soup = BeautifulSoup(requests.get(url).text, "html.parser")

    for agent_card in soup.find_all("div", {"class": "agent-list-card clearfix"}):
        name = agent_card.find("div", {"class": "agent-name text-bold"}).find("a")
        number = agent_card.find("div", {"itemprop": "telephone"})
        realtor_data.append(
            [
                name.getText().strip(),
                number.getText().strip() if number is not None else "N/A"
            ],
        )

with open("data.csv", "w") as output:
    w = csv.writer(output)
    w.writerow(["NAME:", "PHONE NUMBER:", "CITY:"])
    w.writerows(realtor_data)
This gives you a column C filled with "phoenix":
import csv
import requests
from bs4 import BeautifulSoup

realtor_data = []

for page in range(1, 6):
    print(f"Scraping page {page}...")
    url = f"https://www.realtor.com/realestateagents/phoenix_az/pg-{page}"
    soup = BeautifulSoup(requests.get(url).text, "html.parser")

    for agent_card in soup.find_all("div", {"class": "agent-list-card clearfix"}):
        name = agent_card.find("div", {"class": "agent-name text-bold"}).find("a")
        number = agent_card.find("div", {"itemprop": "telephone"})
        realtor_data.append(
            [
                name.getText().strip(),
                number.getText().strip() if number is not None else "N/A"
            ],
        )

with open("data.csv", "w") as output:
    w = csv.writer(output)
    w.writerow(["NAME:", "PHONE NUMBER:", "CITY:"])
    w.writerows(realtor_data)

import pandas as pd

# Re-read the file; the CITY: column is empty (NaN) because each row has only two fields
a = pd.read_csv("data.csv")
a2 = a.iloc[:, [0, 1]]       # name and phone number columns
a3 = a.iloc[:, [2]]          # city column
a3 = a3.fillna("phoenix")    # fill every empty city cell
b = pd.concat([a2, a3], axis=1)
# index=False keeps pandas from writing its row index as an extra first column
b.to_csv("data.csv", index=False)
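The split/fill/concat steps above can probably be collapsed, since assigning a single value to a DataFrame column fills every row. A minimal sketch, assuming pandas is installed; the two sample rows are made up, and `io.StringIO` stands in for `data.csv` so the snippet runs on its own:

```python
import io

import pandas as pd

# Stand-in for data.csv: a three-column header but only two fields per row,
# which is the shape the scraper writes. The sample rows are invented.
sample_csv = io.StringIO(
    "NAME:,PHONE NUMBER:,CITY:\n"
    "Jane Doe,602-555-0101\n"
    "John Roe,602-555-0102\n"
)

df = pd.read_csv(sample_csv)

# Assigning a scalar to a column broadcasts it to every row.
df["CITY:"] = "Phoenix"

# index=False omits the row index from the output.
csv_text = df.to_csv(index=False)
print(csv_text)
```

With a real file you would pass `"data.csv"` to `pd.read_csv` and to `df.to_csv` instead of using the in-memory buffer.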
You can change the code like this so that every row gets Phoenix:
import csv
import requests
from bs4 import BeautifulSoup

realtor_data = []

for page in range(1, 6):
    print(f"Scraping page {page}...")
    url = f"https://www.realtor.com/realestateagents/phoenix_az/pg-{page}"
    soup = BeautifulSoup(requests.get(url).text, "html.parser")

    for agent_card in soup.find_all("div", {"class": "agent-list-card clearfix"}):
        name = agent_card.find("div", {"class": "agent-name text-bold"}).find("a")
        number = agent_card.find("div", {"itemprop": "telephone"})
        data_list = []

        if name:
            data_list.append(name.getText().strip())
        else:
            data_list.append("N/A")

        if number:
            data_list.append(number.getText().strip())
        else:
            data_list.append("N/A")

        data_list.append('Phoenix')
        realtor_data.append(data_list)

with open("data.csv", "w") as output:
    w = csv.writer(output)
    w.writerow(["NAME:", "PHONE NUMBER:", "CITY:"])
    w.writerows(realtor_data)
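If you would rather leave the scraping loop as it was in the question (two items per row), another option is to add the city at write time. This is only a sketch: `write_rows` and `CITY` are hypothetical names, the rows are made up, and `io.StringIO` replaces the file so it runs without the network:

```python
import csv
import io

CITY = "Phoenix"  # constant value for column C

# Made-up rows standing in for the scraped [name, number] pairs.
realtor_data = [
    ["Jane Doe", "602-555-0101"],
    ["John Roe", "N/A"],
]


def write_rows(output, rows, city=CITY):
    """Write a header plus one row per agent, adding the city as a third field."""
    w = csv.writer(output)
    w.writerow(["NAME:", "PHONE NUMBER:", "CITY:"])
    # Each two-item row is extended with the city as it is written.
    w.writerows(row + [city] for row in rows)


# For a real file, open it with open("data.csv", "w", newline=""),
# as the csv module documentation recommends.
buffer = io.StringIO()
write_rows(buffer, realtor_data)
print(buffer.getvalue())
```

Keeping the city out of the loop means the same rows can be written with a different city just by passing another `city` argument.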
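You may also want to look at the pandas.DataFrame object and add a dictionary holding each field's value for every agent, because that makes the data easier to access than iterating over the CSV file.
*Edited to make sure each row always has 3 items even when there is no valid number or name, and replaced the one-line if/else syntax.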
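The DataFrame suggestion could look roughly like the sketch below. It assumes pandas is installed and uses made-up records. Because `DataFrame.append` was removed in pandas 2.0, the dictionaries are collected in a plain list and the DataFrame is built once at the end:

```python
import pandas as pd

# Made-up records; in the scraper, one dict would be appended per agent card.
records = [
    {"NAME:": "Jane Doe", "PHONE NUMBER:": "602-555-0101"},
    {"NAME:": "John Roe", "PHONE NUMBER:": "N/A"},
]

df = pd.DataFrame(records)
df["CITY:"] = "Phoenix"  # scalar assignment fills every row

# Columns are addressable by name instead of by position.
print(df["NAME:"].tolist())

# index=False omits the row index from the output.
csv_text = df.to_csv(index=False)
print(csv_text)
```

In the real script, `df.to_csv("data.csv", index=False)` would write the file directly.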