The code snippet below works, but as an improvement I want to join the item results into a single comma-separated string. I have been trying, but no luck.
from bs4 import BeautifulSoup
from urllib import request
from urllib.request import Request, urlopen
url = 'https://bscscan.com/tx/0xb9044e77ae66b6f128866e049d55f09b3501de6fc75478e406e4c32d1de4bd6a'
headers = {'User-Agent': 'Mozilla/5.0'}
req = Request(url, headers=headers)
html = urlopen(req).read()
soup = BeautifulSoup(html, 'html.parser')
main_data = soup.select("ul#wrapperContent div.media-body")
for item in main_data:
    all_span = item.find_all("span", class_='mr-1')
    last_span = all_span[-1]
    all_a = item.find_all("a")
    last_a = all_a[-1]
    print("{:>35} | {:18} | https://bscscan.com{}".format(last_span.get_text(strip=True), last_a.get_text(strip=True), last_a['href']))
Current output:
2 ($598.51) | Wrapped BNB (WBNB) | https://bscscan.com/token/0xbb4cdb9cbd36b01bd1cbaebf2de08d9173bc095c
13.684565595242991082 | MoMo KEY (KEY) | https://bscscan.com/token/0x85c128ee1feeb39a59490c720a9c563554b51d33
4 | Chi Gastoken...(CHI) | https://bscscan.com/token/0x0000000000004946c0e9f43f4dee607b0ef1fa1c
Desired improvement:
2 ($598.51) | Wrapped BNB (WBNB) | https://bscscan.com/token/0xbb4cdb9cbd36b01bd1cbaebf2de08d9173bc095c
13.684565595242991082 | MoMo KEY (KEY) | https://bscscan.com/token/0x85c128ee1feeb39a59490c720a9c563554b51d33
4 | Chi Gastoken...(CHI) | https://bscscan.com/token/0x0000000000004946c0e9f43f4dee607b0ef1fa1c
-> Wrapped BNB (WBNB) , MoMo KEY (KEY) , Chi Gastoken...(CHI) #-- Concatenated String
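First, the strings you are trying to join appear to be the text of the links, not of the spans.
Second: initialize a string (not an empty one in your case, since you want it to start with '->'), append the desired text to it on each iteration, and you have the final answer once the loop ends. Try the following: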
from bs4 import BeautifulSoup
from urllib import request
from urllib.request import Request, urlopen
url = 'https://bscscan.com/tx/0xb9044e77ae66b6f128866e049d55f09b3501de6fc75478e406e4c32d1de4bd6a'
headers = {'User-Agent': 'Mozilla/5.0'}
req = Request(url, headers=headers)
html = urlopen(req).read()
soup = BeautifulSoup(html, 'html.parser')
main_data = soup.select("ul#wrapperContent div.media-body")
link_texts = '->' # initialize a new string
for item in main_data:
    all_span = item.find_all("span", class_='mr-1')
    last_span = all_span[-1]
    all_a = item.find_all("a")
    last_a = all_a[-1]
    print("{:>35} | {:18} | https://bscscan.com{}".format(last_span.get_text(strip=True), last_a.get_text(strip=True), last_a['href']))
    link_texts += last_a.get_text(strip=True) + "," # on each iteration, append the link text to the string initialized above
link_texts = link_texts[:-1] # slice off the extra comma left at the end
print(link_texts)
Output:
2 ($597.04) | Wrapped BNB (WBNB) | https://bscscan.com/token/0xbb4cdb9cbd36b01bd1cbaebf2de08d9173bc095c
13.684565595242991082 | MoMo KEY (KEY) | https://bscscan.com/token/0x85c128ee1feeb39a59490c720a9c563554b51d33
4 | Chi Gastoken...(CHI) | https://bscscan.com/token/0x0000000000004946c0e9f43f4dee607b0ef1fa1c
->Wrapped BNB (WBNB),MoMo KEY (KEY),Chi Gastoken...(CHI)
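One thing to watch with the append-then-slice approach: if the selector matches nothing, the loop never runs and `link_texts[:-1]` would cut the '>' off the '->' prefix instead of a comma. Below is a minimal sketch of a guarded version. The helper name `build_link_text` is hypothetical, and the names are hard-coded from the output above so it runs without a network request.

```python
def build_link_text(names, prefix="->"):
    """Append each name plus a comma, then drop the trailing comma if one was added."""
    result = prefix
    for name in names:
        result += name + ","
    # Slice only when at least one name (and therefore one comma) was appended
    if names:
        result = result[:-1]
    return result


# Link texts copied from the output above (stand-ins for the scraped values)
names = ["Wrapped BNB (WBNB)", "MoMo KEY (KEY)", "Chi Gastoken...(CHI)"]
print(build_link_text(names))
print(build_link_text([]))  # the prefix survives intact when nothing matched
```

In the scraping loop, the same guard would amount to checking that `main_data` is non-empty before slicing.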
You should store the values in a list (declared before the for loop) and join them with ', '.join(list_variable), like this:
temp_list = []
main_data = soup.select("ul#wrapperContent div.media-body")
for item in main_data:
    all_span = item.find_all("span", class_='mr-1')
    last_span = all_span[-1]
    all_a = item.find_all("a")
    last_a = all_a[-1]
    print("{:>35} | {:18} | https://bscscan.com{}".format(last_span.get_text(strip=True), last_a.get_text(strip=True), last_a['href']))
    temp_list.append(last_a.get_text(strip=True))
print(', '.join(temp_list))
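To reproduce the exact format from the question (a leading '->' and ' , ' between entries), the joined string can simply be prefixed. The sketch below assumes the link texts have already been collected into a list; the helper name `join_names` is hypothetical and the values are hard-coded from the output shown earlier.

```python
def join_names(names, prefix="-> ", sep=" , "):
    """Join names with the separator and put the prefix in front."""
    # str.join places the separator only between items, so there is
    # no trailing separator to remove afterwards
    return prefix + sep.join(names)


# Link texts copied from the output above (stand-ins for temp_list)
temp_list = ["Wrapped BNB (WBNB)", "MoMo KEY (KEY)", "Chi Gastoken...(CHI)"]
print(join_names(temp_list))
```

Because `join` never emits a trailing separator, this version needs no slicing step and also behaves sensibly for an empty or single-element list.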