I'm trying to build my scraping results from the following call:
authors = item.find('ol', 'Authors')
The result is:
<ol class="Authors">
<li><span class="author">Author 1</span></li>
<li><span class="author">Author 2</span></li>
<li><span class="author">Author 3</span></li>
</ol>
When I add .text, I get:
Author 1Author 2Author 3
How can I turn that into:
Author 1, Author 2, Author 3
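To add a comma as the separator, call the .get_text() method instead of .text and pass ", " (a comma followed by a space) as the separator argument. Passing strip=True as well trims each piece of text and drops the whitespace-only strings between the <li> elements, so no stray separators appear: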
# `soup` is assumed to be the parsed BeautifulSoup document
print(
    ''.join(
        tag.get_text(strip=True, separator=", ")
        for tag in soup.find_all("ol", class_="Authors")
    )
)
Output:
Author 1, Author 2, Author 3
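
For completeness, here is a self-contained sketch of an alternative: collect each author name into a list first and then join the list. It assumes the markup is exactly the snippet from the question and that the built-in "html.parser" backend is used; the names `html`, `author_names`, `names`, and `joined` are illustrative and not from the original post.

```python
from bs4 import BeautifulSoup

# Sample markup copied from the question (assumed to be the whole document here)
html = """
<ol class="Authors">
<li><span class="author">Author 1</span></li>
<li><span class="author">Author 2</span></li>
<li><span class="author">Author 3</span></li>
</ol>
"""

soup = BeautifulSoup(html, "html.parser")


def author_names(ol):
    """Return the stripped text of every <span class="author"> inside `ol`."""
    return [
        span.get_text(strip=True)
        for span in ol.find_all("span", class_="author")
    ]


authors = soup.find("ol", class_="Authors")
names = author_names(authors)   # a plain list, handy for further processing
joined = ", ".join(names)       # join only when a display string is needed
print(joined)
```

Keeping the names in a list can be useful if they are later written to separate CSV columns or deduplicated; if only the display string is needed, the get_text(separator=", ", strip=True) call above is shorter.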