How do you find the text between <img> and <span> tags with Beautiful Soup in Python?



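I am writing a program that fetches data from a stock market website. I am taking the data from the marquee (the scrolling ticker) at the top of the site. However, to separate and organize the data, I need the exact text that sits between the img tag and the span.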

HTML: (screenshot of the markup; image not included)

To get the text you want, you can use this example:

import requests
from bs4 import BeautifulSoup

url = 'http://nepalstock.com/todaysprice'
soup = BeautifulSoup(requests.get(url).content, 'html.parser')

# Join every text node inside the <marquee> ticker with '|'
text = soup.select_one('marquee').get_text(strip=True, separator='|')

# Take every third piece and keep only the part before the first '('
for t in text.split('|')[::3]:
    print(t.split('(')[0].strip())

Prints:

LLBS 1,139.00
NTC 706.00
SCB 659.00
AKPL 194.00
EIC 567.00
NCCB 209.00
UPPER 260.00
BOKL 247.00
NABIL 853.00
SBL 302.00
PICL 568.00
NICA 579.00
RADHI 259.00
RBCLPO 10,140.00
CHCL 491.00
NLICL 780.00
...and so on.
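
If the ticker really is laid out as an <img> followed by loose text and then <span> elements, another option is to read the text node that comes right after each image. The sketch below is only an illustration: SAMPLE_HTML is made-up markup (the real page may be structured differently), and text_after_images is a hypothetical helper name.

```python
from bs4 import BeautifulSoup, NavigableString

# Hypothetical markup for illustration; the real page structure may differ.
SAMPLE_HTML = """
<marquee>
  <img src="up.png"> NTC 706.00 <span>( 120 )</span> <span>( 6.00 )</span>
  <img src="down.png"> RBCLPO 10,140.00 <span>( 15 )</span> <span>( -20.00 )</span>
</marquee>
"""


def text_after_images(html):
    """Return the stripped text node that directly follows each <img>."""
    soup = BeautifulSoup(html, "html.parser")
    results = []
    for img in soup.find_all("img"):
        sibling = img.next_sibling
        # <img> is a void element, so the following text is its sibling.
        # Skip the case where the sibling is another tag or is missing.
        if isinstance(sibling, NavigableString):
            text = sibling.strip()
            if text:
                results.append(text)
    return results


quotes = text_after_images(SAMPLE_HTML)
print(quotes)
```

Reading next_sibling avoids counting on a fixed number of pieces per entry, at the cost of assuming the text always follows the image directly.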

You will probably need to parse the output after this:

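from bs4 import BeautifulSoup

# html is assumed to hold the page source as a string
soup = BeautifulSoup(html, "html.parser")
image = soup.find("img")  # find the image here
the_text_you_need = image.text
# or, for the markup inside the tag (Beautiful Soup has no innerHTML attribute)
the_text_you_need1 = image.decode_contents()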

After that you can parse it with re, or simply use the .split() method, like this:

result = your_output.split()
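
As a rough sketch of that parsing step, each "SYMBOL PRICE" line from the ticker can be split once from the right and the price converted to a number. The helper name parse_quote is hypothetical, and the input lines are taken from the output shown earlier.

```python
def parse_quote(entry):
    """Split 'SYMBOL PRICE' into (symbol, price as float)."""
    # Split once from the right so the price is always the last field.
    symbol, price_text = entry.rsplit(maxsplit=1)
    # Remove the thousands separator before converting.
    return symbol, float(price_text.replace(",", ""))


lines = ["LLBS 1,139.00", "NTC 706.00", "RBCLPO 10,140.00"]
parsed = [parse_quote(line) for line in lines]
print(parsed)
```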

See if this helps:

from bs4 import BeautifulSoup
import requests
import re

hdr = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.105 Safari/537.36'}
html_page = requests.get("http://nepalstock.com/", headers=hdr, timeout=15)
soup = BeautifulSoup(html_page.content, 'html.parser')

number = []
for img in soup.find_all("img"):
    if img.text != '':  # compare by value; "is not" tests identity
        # Drop anything in parentheses
        number.append(re.sub(r'\([^)]*\)', '', img.text))
# Replace non-breaking spaces with ordinary spaces
number = [el.replace('\xa0', ' ') for el in number]
# Turn runs of two or more spaces into commas, then split on commas
my_list = re.sub('  +', ',', " ".join(str(x) for x in number)).split(",")
print(my_list)

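Result (prices that contain a thousands separator, such as 10,140.00, end up split in two because the list is split on commas):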

['', 'NTC 706.00', 'SCB 659.00', 'AKPL 194.00', 'EIC 567.00', 'NCCB 209.00', 'UPPER 260.00', 'BOKL 247.00', 'NABIL 853.00', 'SBL 302.00', 'PICL 568.00', 'NICA 579.00', 'RADHI 259.00', 'RBCLPO 10', '140.00', 'CHCL 491.00', 'NLICL 780.00', 'NIL 834.00', 'RHPL 179.00', 'API 156.00', 'NBL 308.00', 'HDHPC 100.00', 'NRN 406.00', 'HPPL 195.00', 'CCBL 172.00', 'SHPC 300.00', 'SMFBS 1', '000.00', 'SHINE 241.00', 'BPCL 403.00', 'CIT 3', '400.00', 'PLIC 683.00', 'KBL 194.00', 'GBIME 263.00', 'RRHP 136.00', 'CZBIL 196.00', 'NRIC 884.00', 'RSDC 577.00', 'JBBL 169.00', 'NSEWA 735.00', 'CBL 145.00', 'JOSHI 103.00', 'GLBSL 776.00', 'HGI 544.00', 'CBBL 1', '115.00', 'SDLBSL 858.00', 'SKBBL 1', '377.00', 'IGI 567.00', 'NMB 419.00', 'BFC 106.00', 'PROFL 107.00', 'CMF1 11.33', 'GMFBS 830.00', 'MERO 732.00', 'VLBS 924.00', 'HIDCL 156.00', 'RLI 600.00', 'LBL 235.00', 'MLBBL 785.00', 'NICL 575.00', 'KMCDB 763.00', 'GLICL 654.00', 'MLBL 197.00', 'SHIVM 722.00', 'PRVU 228.00', 'LICN 1', '536.00', 'JFL 178.00', 'AIL 524.00', 'MNBBL 326.00', 'LBBL 184.00', 'SPDL 165.00', 'GHL 88.00', 'SLBBL 868.00', 'NFS 190.00', 'CLBSL 715.00', 'NIB 420.00', 'ADBL 430.00', 'NUBL 980.00', 'SWBBL 1', '272.00', 'NGPL 158.00', 'NBB 203.00', 'SJCL 163.00', 'SAPDBL 123.00', 'FOWAD 1', '590.00', 'MMFDB 950.00', 'NMBMF 835.00', 'OHL 399.00', 'SLICL 642.00', 'NIBLPF 8.85', 'ALICL 760.00', 'PMHPL 105.00', 'SABSL 868.00', 'NLIC 1', '404.00', 'SPARS 965.00', 'ICFC 174.00', 'PPCL 149.00', 'LGIL 560.00', 'SICL 1', '368.00', 'MHNL 107.00', 'PCBL 282.00', 'SFCL 112.00', 'PFL 149.00', 'GBLBS 549.00', 'ALBSL 891.00', 'PIC 816.00', 'DHPL 72.00', 'NIBPO 362.00', 'NHDL 199.00', 'MBL 227.00', 'MDB 364.00', 'UIC 454.00', 'UPCL 108.00', 'ACLBSL 745.00', 'NADEP 739.00', 'GFCL 159.00', 'EDBL 298.00', 'SRBL 233.00', 'SANIMA 343.00', 'EBL 732.00', 'SMB 866.00', 'NMBHF1 10.15', 'SLBS 979.00', 'SADBL 144.00', 'FMDBL 596.00', 'USLB 1', '142.00', 'GIMES1 9.09', 'ILBS 844.00', 'CFCL 127.00', 'HURJA 129.00', 'RBCL 11', '720.00', 'KPCL 
141.00', 'SLBSL 809.00', 'UNHPL 93.00', 'PRIN 627.00', 'MFIL 299.00', 'UMHL 134.00', 'DDBL 876.00', 'NLBBL 775.00', 'SBI 422.00', 'AKJCL 82.00', 'BARUN 132.00', 'TMDBL 178.00', 'GMFIL 132.00', 'MSMBS 710.00', 'NICGF 9.32', 'GBBL 236.00', 'RMDC 862.00', 'SHL 180.00', 'SIL 781.00', 'GRDBL 116.00', 'KKHC 73.00', 'STC 3', '295.00', 'CORBL 131.00', 'RLFL 132.00', 'KRBL 105.00', 'SAEF 10.30', 'AHPC 162.00', 'GGBSL 817.00', 'LUK 9.26', 'NLG 770.00', 'MEGA 221.00', 'RHPC 129.00', 'CHL 123.00', 'NEF 8.75', 'NMFBS 1', '584.00', 'GILB 1', '217.00', 'JSLBB 1', '384.00', 'GUFL 125.00', 'BNT 6', '680.00', 'KSBBL 143.00', 'SINDU 124.00', 'NLBSL 1', '292.00', 'HBL 565.00', 'HDL 1', '815.00', 'SIC 965.00', 'NHPC 72.00', 'SIFC 162.00', 'SEF 9.66', 'PROFLP 101.00', 'TRH 240.00', 'BBC 1', '650.00', 'LEMF 8.56', 'NICBF 9.45', 'NBF2 9.00', 'SBLD2082 1', '026.00', 'NIBSF1 9.31', 'MPFL 113.00', 'NMB50 9.50', 'MDBPO 182.00', 'SRBLD83 1', '022.00', 'UFL 191.00', 'MMFDBP 403.00', 'EBLCP 694.00', 'NICAD8283 1', '058.00', 'GRDBLP 100.00', 'SBIBD86 1', '050.00', 'SCB 659.00', 'AKPL 194.00', 'EIC 567.00', 'NCCB 209.00', 'UPPER 260.00', 'BOKL 247.00', 'NABIL 853.00', 'SBL 302.00', 'PICL 568.00', 'NICA 579.00', 'RADHI 259.00', 'RBCLPO 10', '140.00', 'CHCL 491.00', 'NLICL 780.00', 'NIL 834.00', 'RHPL 179.00', 'API 156.00', 'NBL 308.00', 'HDHPC 100.00', 'NRN 406.00', 'HPPL 195.00', 'CCBL 172.00', 'SHPC 300.00', 'SMFBS 1', '000.00', 'SHINE 241.00', 'BPCL 403.00', 'CIT 3', '400.00', 'PLIC 683.00', 'KBL 194.00', 'GBIME 263.00', 'RRHP 136.00', 'CZBIL 196.00', 'NRIC 884.00', 'RSDC 577.00', 'JBBL 169.00', 'NSEWA 735.00', 'CBL 145.00', 'JOSHI 103.00', 'GLBSL 776.00', 'HGI 544.00', 'CBBL 1', '115.00', 'SDLBSL 858.00', 'SKBBL 1', '377.00', 'IGI 567.00', 'NMB 419.00', 'BFC 106.00', 'PROFL 107.00', 'CMF1 11.33', 'GMFBS 830.00', 'MERO 732.00', 'VLBS 924.00', 'HIDCL 156.00', 'RLI 600.00', 'LBL 235.00', 'MLBBL 785.00', 'NICL 575.00', 'KMCDB 763.00', 'GLICL 654.00', 'MLBL 197.00', 'SHIVM 722.00', 'PRVU 228.00', 
'LICN 1', '536.00', 'JFL 178.00', 'AIL 524.00', 'MNBBL 326.00', 'LBBL 184.00', 'SPDL 165.00', 'GHL 88.00', 'SLBBL 868.00', 'NFS 190.00', 'CLBSL 715.00', 'NIB 420.00', 'ADBL 430.00', 'NUBL 980.00', 'SWBBL 1', '272.00', 'NGPL 158.00', 'NBB 203.00', 'SJCL 163.00', 'SAPDBL 123.00', 'FOWAD 1', '590.00', 'MMFDB 950.00', 'NMBMF 835.00', 'OHL 399.00', 'SLICL 642.00', 'NIBLPF 8.85', 'ALICL 760.00', 'PMHPL 105.00', 'SABSL 868.00', 'NLIC 1', '404.00', 'SPARS 965.00', 'ICFC 174.00', 'PPCL 149.00', 'LGIL 560.00', 'SICL 1', '368.00', 'MHNL 107.00', 'PCBL 282.00', 'SFCL 112.00', 'PFL 149.00', 'GBLBS 549.00', 'ALBSL 891.00', 'PIC 816.00', 'DHPL 72.00', 'NIBPO 362.00', 'NHDL 199.00', 'MBL 227.00', 'MDB 364.00', 'UIC 454.00', 'UPCL 108.00', 'ACLBSL 745.00', 'NADEP 739.00', 'GFCL 159.00', 'EDBL 298.00', 'SRBL 233.00', 'SANIMA 343.00', 'EBL 732.00', 'SMB 866.00', 'NMBHF1 10.15', 'SLBS 979.00', 'SADBL 144.00', 'FMDBL 596.00', 'USLB 1', '142.00', 'GIMES1 9.09', 'ILBS 844.00', 'CFCL 127.00', 'HURJA 129.00', 'RBCL 11', '720.00', 'KPCL 141.00', 'SLBSL 809.00', 'UNHPL 93.00', 'PRIN 627.00', 'MFIL 299.00', 'UMHL 134.00', 'DDBL 876.00', 'NLBBL 775.00', 'SBI 422.00', 'AKJCL 82.00', 'BARUN 132.00', 'TMDBL 178.00', 'GMFIL 132.00', 'MSMBS 710.00', 'NICGF 9.32', 'GBBL 236.00', 'RMDC 862.00', 'SHL 180.00', 'SIL 781.00', 'GRDBL 116.00', 'KKHC 73.00', 'STC 3', '295.00', 'CORBL 131.00', 'RLFL 132.00', 'KRBL 105.00', 'SAEF 10.30', 'AHPC 162.00', 'GGBSL 817.00', 'LUK 9.26', 'NLG 770.00', 'MEGA 221.00', 'RHPC 129.00', 'CHL 123.00', 'NEF 8.75', 'NMFBS 1', '584.00', 'GILB 1', '217.00', 'JSLBB 1', '384.00', 'GUFL 125.00', 'BNT 6', '680.00', 'KSBBL 143.00', 'SINDU 124.00', 'NLBSL 1', '292.00', 'HBL 565.00', 'HDL 1', '815.00', 'SIC 965.00', 'NHPC 72.00', 'SIFC 162.00', 'SEF 9.66', 'PROFLP 101.00', 'TRH 240.00', 'BBC 1', '650.00', 'LEMF 8.56', 'NICBF 9.45', 'NBF2 9.00', 'SBLD2082 1', '026.00', 'NIBSF1 9.31', 'MPFL 113.00', 'NMB50 9.50', 'MDBPO 182.00', 'SRBLD83 1', '022.00', 'UFL 191.00', 'MMFDBP 403.00', 
'EBLCP 694.00', 'NICAD8283 1', '058.00', 'GRDBLP 100.00', 'SBIBD86 1', '050.00', '']
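
One way to keep prices such as 10,140.00 in one piece is to split directly on the runs of whitespace instead of converting them to commas first. The following is a minimal sketch under stated assumptions: the raw string is a made-up stand-in for the ticker text (the real text may look different), and split_quotes is a hypothetical helper name.

```python
import re

# Hypothetical raw ticker text; entries are separated by non-breaking spaces.
raw = (
    "NTC 706.00 ( 120 ) ( 6.00 )\xa0\xa0"
    "RBCLPO 10,140.00 ( 15 ) ( -20.00 )\xa0\xa0"
    "CHCL 491.00 ( 40 ) ( 3.00 )"
)


def split_quotes(text):
    """Remove parenthesized parts, then split on runs of 2+ whitespace."""
    without_parens = re.sub(r"\([^)]*\)", "", text)
    normalized = without_parens.replace("\xa0", " ")
    parts = re.split(r"\s{2,}", normalized)
    # Discard empty strings left over at the edges.
    return [part.strip() for part in parts if part.strip()]


entries = split_quotes(raw)
print(entries)
```

Because no comma is ever used as a delimiter, the thousands separator inside a price is left alone, and the empty strings at the start and end of the list disappear as well.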
