Deleting a specific number of lines around a specific word


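I have a program that grabs information from a website and prints it into a text document. Once the back end is done, I will format it into something more useful.

The information is sorted by time, but it is just one string, so it is basically raw data. I want to read it line by line and, when it hits a keyword, delete the rest of the information. Right now it only removes the lines that contain the keyword, which is useless because it leaves a ton of data behind.

The keyword is "day": once the listing age of an entry is given in days, the entry is too old to be worth including in my information.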

import os

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

PATH = r"C:\Program Files (x86)\chromedriver.exe"
driver = webdriver.Chrome(PATH)
driver.get("https://coinmarketcap.com/new")
try:
    # Wait up to 10 seconds for the listing container to appear.
    main = WebDriverWait(driver, 10).until(
        EC.presence_of_element_located((By.XPATH, '//*[@id="__next"]/div[1]/div[1]/div[2]/div/div[2]'))
    )
except:
    driver.quit()
time = "hours"
txt = main.text
if time in main.text:
    print(main.text)
    print("Newer cryptos found")
else:
    print("No newer cryptos found")
driver.quit()

# Dump the raw text to a file.
f = open("CoinMC.txt", "w")
f.write(txt)
f.close()
lines = []
with open("CoinMC.txt", 'r') as fp:
    lines = fp.readlines()
# Rewrite the file without the leading header lines.
with open("CoinMC.txt", 'w') as fp:
    for number, line in enumerate(lines):
        if number not in [0,1,2,3,4,5,6,7]:
            fp.write(line)
# Copy every line that does not contain "day" to a temp file, then swap it in.
with open("CoinMC.txt", "r") as input:
    with open("CoinMCtemp.txt", "w") as output:
        for line in input:
            if "day" not in line.strip("\n"):
                output.write(line)
os.replace('CoinMCtemp.txt', "CoinMC.txt")

I have it delete the first eight lines (indices 0 through 7) because they are not needed. Here is what it prints out:

1
OEC SHIB
SHIBK
$0.000007142 1.20% 0.00%
--
$6,259
OKExChain
1 hours ago
2
OEC UNI
UNIK
$24.15 5.97% 0.00%
--
$852,208
OKExChain
1 hours ago
3
OEC FIL
FILK
$59.03 3.55% 0.00%
--
$125,513
OKExChain
1 hours ago
4
Asia Coin
ASIA
$0.1168 0.12% 0.00% $11,679,825 $69,817
Ethereum
1 hours ago
5
BabyDogeX
BDOGEX
$0.000002676 63.03% 0.00% $267,565 $132,537
Binance Coin
1 hours ago
6
Everest Token
EVRT
$0.32 39.45% 0.00% $32,003,854 $127,147
Avalanche
1 hours ago
7
Kurobi
KURO
$0.1113 104.31% 0.00% $44,527,852 $145,578
Solana
1 hours ago
8
Octaplex Network
PLX
$2.73 2.41% 0.00% $2,727 $72,107
Binance Coin
2 hours ago
9
PizzaBucks
PIZZAB
$0.000003016 20.01% 0.00% $603,224 $121,475
Binance Coin
6 hours ago
10
Little Angry Bunny v2
LAB V2
$0 0.00% 0.00%
--
$264,358
Binance Coin
6 hours ago
11
VPEX Exchange
VPX
$0.08024 28.08% 0.00% $76,225,869 $91,560
Binance Coin
7 hours ago
12
Synapse
SYN
$2.05 8.13% 0.00% $111,205,453 $33,736,237
Ethereum
9 hours ago
13
Block Farm
BFC
$2.26 7.46% 0.00% $677,248,899 $1,094,921
Binance Coin
13 hours ago
14
BabySafeMoon
BSFM
$0.01843 13.72% 0.00% $1,842,753 $1,439,599
Binance Coin
18 hours ago
15
Happiness
HPNS
$0.02939 0.08% 0.00% $15,139,185 $64,731
18 hours ago
16
SUCCESS INU
SUCCESS
$0.000000004294 27.82% 50.33% $4,281,116 $1,149,340
Binance Coin
17
GravitX
GRX
$0.1682 47.04% 733.58% $14,761,186 $1,949,928
Binance Coin
18
Prelax
PEA
$0.003075 0.50% 33.13% $1,848,248 $1,566,051
Binance Coin
19
Moonkafe Finance
KAFE
$22.14 0.23% 7.35%
--
$151,714
Moonriver
20
Mini Floki
MINIFLOKI
$0.0000001023 12.92% 21.45% $1,023,333 $969,215
Binance Coin
21
NFTrade
NFTD
$0.5436 4.07% 18.20% $73,387,288 $1,043,752
Binance Coin
22
SafeMoon-AVAX
$SAFEMOONA
$0.000000001448 0.51% 1.50% $1,448,031 $16,582
Avalanche
23
FlyPaper
STICKY
$0.001935 19.19% 102.35% $967,562 $1,020,175
Binance Coin
24
Toll Free Swap
TOLL
$3,936.02 0.48% 8.13%
--
$39,092
Ethereum
25
Fruits Eco
FRTS
$0.7254 0.76% 0.41% $290,176,247 $894,943
Ethereum
26
Decentralized data crypto system
DCS
$4.62 0.31% 3.68% $277,286,959 $1,108,303
Binance Coin
27
Sombra
SMBR
$0.01444 0.80% 13.06% $1,444,036 $92,698
Binance Coin
28
Ether Matrix
ETHMATRIX
$0.0007299 19.49% 80.95% $729,936 $546,747
Binance Coin
29
ForeverFOMO
FOREVERFOMO
$0.0001849 4.18% 683.81%
--
$3,561,670
Binance Coin
30
Mars Panda World
MPT
$0.2678 2.09% 29.60% $23,802,499 $82,606
Binance Coin

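Almost half of the results are unnecessary and add clutter. You can see how it writes 1, then a few lines of information, then 2, then more information including when the coin was listed, and so on. Starting with the first entry that was listed a day ago, I want to delete everything below it, as well as the line with the keyword "day" itself.

The expected result for the lines above would be: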

1
OEC SHIB
SHIBK
$0.000007142 1.20% 0.00%
--
$6,259
OKExChain
1 hours ago
2
OEC UNI
UNIK
$24.15 5.97% 0.00%
--
$852,208
OKExChain
1 hours ago
3
OEC FIL
FILK
$59.03 3.55% 0.00%
--
$125,513
OKExChain
1 hours ago
4
Asia Coin
ASIA
$0.1168 0.12% 0.00% $11,679,825 $69,817
Ethereum
1 hours ago
5
BabyDogeX
BDOGEX
$0.000002676 63.03% 0.00% $267,565 $132,537
Binance Coin
1 hours ago
6
Everest Token
EVRT
$0.32 39.45% 0.00% $32,003,854 $127,147
Avalanche
1 hours ago
7
Kurobi
KURO
$0.1113 104.31% 0.00% $44,527,852 $145,578
Solana
1 hours ago
8
Octaplex Network
PLX
$2.73 2.41% 0.00% $2,727 $72,107
Binance Coin
2 hours ago
9
PizzaBucks
PIZZAB
$0.000003016 20.01% 0.00% $603,224 $121,475
Binance Coin
6 hours ago
10
Little Angry Bunny v2
LAB V2
$0 0.00% 0.00%
--
$264,358
Binance Coin
6 hours ago
11
VPEX Exchange
VPX
$0.08024 28.08% 0.00% $76,225,869 $91,560
Binance Coin
7 hours ago
12
Synapse
SYN
$2.05 8.13% 0.00% $111,205,453 $33,736,237
Ethereum
9 hours ago
13
Block Farm
BFC
$2.26 7.46% 0.00% $677,248,899 $1,094,921
Binance Coin
13 hours ago
14
BabySafeMoon
BSFM
$0.01843 13.72% 0.00% $1,842,753 $1,439,599
Binance Coin
18 hours ago
15
Happiness
HPNS
$0.02939 0.08% 0.00% $15,139,185 $64,731
18 hours ago

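About half of the output would be removed, because the program would keep writing/printing until it reaches the word "day", and then it would stop writing.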
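One way to get that behavior is sketched below. It assumes the age lines look like "N hours ago" or "N days ago", as in the output above, and that every entry ends with its age line. The function name `keep_recent_entries` and both regular expressions are illustrative and not part of the original program.

```python
import re

# Assumed line formats, based on the sample output: "1 hours ago", "3 days ago".
STOP_LINE = re.compile(r"\b\d+\s+days?\s+ago\b")
ENTRY_END = re.compile(r"\bago\b")


def keep_recent_entries(lines):
    """Keep the whole entries that come before the first "N days ago" line."""
    kept = []
    complete = 0  # number of kept lines that belong to finished entries
    for line in lines:
        if STOP_LINE.search(line):
            # Discard the partial entry that this "days ago" line closes.
            return kept[:complete]
        kept.append(line)
        if ENTRY_END.search(line):
            complete = len(kept)
    return kept  # no "days ago" line found: keep everything


sample = [
    "14", "BabySafeMoon", "BSFM", "Binance Coin", "18 hours ago",
    "15", "Happiness", "HPNS", "18 hours ago",
    "16", "SUCCESS INU", "SUCCESS", "Binance Coin", "1 days ago",
    "17", "GravitX", "GRX", "Binance Coin", "2 days ago",
]
print("\n".join(keep_recent_entries(sample)))
```

This has to run on the raw text, before any step that filters out the lines containing "day"; otherwise there is no stop line left to detect. A plain substring test such as `"day" in line` would also work, but it could trigger early on a coin whose name happens to contain "day".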

An important suggestion for your code: you can use the range() function instead of writing out a list:

with open("CoinMC.txt", 'w') as fp:
    for number, line in enumerate(lines):
        # range(0, 8) covers the same indices as [0, 1, 2, 3, 4, 5, 6, 7]
        if number not in range(0, 8):
            fp.write(line)
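As another option for skipping leading lines, list slicing avoids the per-line membership test altogether. This is a minimal sketch; `skip_first_lines` and the sample data are hypothetical names used only for illustration.

```python
def skip_first_lines(lines, count):
    """Return the lines that follow the first `count` lines."""
    return lines[count:]


# Two made-up header lines followed by the start of the real data.
lines = ["header 1\n", "header 2\n", "1\n", "OEC SHIB\n"]
print(skip_first_lines(lines, 2))
```

With the file from the question, the call would be `skip_first_lines(lines, 8)`, and the result could be passed to `fp.writelines(...)`.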

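You should also avoid using the names of Python built-ins, such as input, as variable names. input is not a reserved keyword, so the code still runs, but the name shadows the built-in input() function. You do that here: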

with open("CoinMC.txt", "r") as input:
    with open("CoinMCtemp.txt", "w") as output:
        for line in input:
            if "day" not in line.strip("\n"):
                output.write(line)

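To answer your question, try something like this:

text = "Your Data"
parts = text.split("ago")
del parts[-1]  # drop whatever follows the last "ago"
text = "ago".join(parts)

This splits the string at each "ago", deletes the last piece, and then rejoins the rest with "ago". Note that the result no longer ends with "ago", so append it again if you need the final age line to stay complete. The variable is named text rather than str so that it does not shadow the built-in str type.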
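The same idea can be sketched with `str.rpartition`, which cuts at the last occurrence of a marker and makes it easy to keep that marker in place. The helper name `cut_after_last` and the sample string are hypothetical. Like the split-and-join version, this only trims what follows the last "ago"; on its own it does not tell hours from days.

```python
def cut_after_last(text, marker="ago"):
    """Return text up to and including the last occurrence of marker."""
    head, sep, _tail = text.rpartition(marker)
    # rpartition returns ("", "", text) when the marker is absent,
    # so fall back to the unchanged text in that case.
    return head + sep if sep else text


raw = "1\nOEC SHIB\n1 hours ago\n2\nOEC UNI\n1 hours ago\n3\nleftover"
print(cut_after_last(raw))
```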
