Step Time Apple_price fluctuation
BFGS: 0 18:21:43 -6442.333161 7.4744
BFGS: 1 18:21:43 *-6442.899477 5.8484*
Step Time Apple_price fluctuation
BFGS: 0 18:21:53 -6441.911200 16.3190
BFGS: 1 18:21:53 -6442.540975 10.6048
BFGS: 2 18:21:53 -6443.107163 7.6685
BFGS: 3 18:21:53 -6443.565044 6.2186
BFGS: 4 18:21:54 *-6443.954663 5.7485*
Step Time Apple_price fluctuation
BFGS: 0 18:27:00 -6440.611426 24.6802
BFGS: 1 18:27:00 -6441.602767 21.3009
BFGS: 2 18:27:00 -6442.446886 15.6698
BFGS: 3 18:27:01 -6443.084822 11.6312
BFGS: 4 18:27:01 -6443.582671 8.6795
BFGS: 5 18:27:01 -6444.019236 7.4906
BFGS: 6 18:27:01 -6444.389951 6.7435
BFGS: 7 18:27:02 *-6444.732455 6.5221*
I want to extract the values between the asterisks (*), as shown below:
-6442.899477 5.8484
-6443.954663 5.7485
-6444.732455 6.5221
My code is as follows:
import pandas as pd
import numpy as np
all_lines = []
file_name = input("What's the file name with extension?: ")
with open(f'{file_name}', 'r') as file:
    for each_line in file:
        all_lines.append(each_line.strip())
#print(all_lines)
for j in all_lines:
    if j == 0:
        j = j + 1
    if 'fluctuation' in i:
        all_lines.index(j-1)
        print(j)
Unfortunately, the output is only the first line of the answer:
-6442.899477 5.8484
Please let me know how to extract the values at particular indices of a list.
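One way to get the last row of every block, as a minimal sketch: remember the most recent data row and emit it whenever a new header (a line containing "fluctuation") or the end of the input is reached. The function name last_rows is made up for illustration, and the sketch assumes the two wanted values are always the last two whitespace-separated columns of a row, optionally wrapped in asterisks.

```python
def last_rows(lines, header_word="fluctuation"):
    """Return the final data row of each block as 'price fluctuation' strings."""
    results = []
    previous = None  # most recent data row seen in the current block
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if header_word in line:
            # A new block starts: the row before it closes the old block.
            if previous is not None:
                results.append(previous)
            previous = None
        else:
            # Keep the last two columns and drop the '*' markers, if any.
            fields = line.replace("*", "").split()
            previous = " ".join(fields[-2:])
    if previous is not None:
        results.append(previous)  # the last block has no header after it
    return results


sample = """Step Time Apple_price fluctuation
BFGS: 0 18:21:43 -6442.333161 7.4744
BFGS: 1 18:21:43 *-6442.899477 5.8484*
Step Time Apple_price fluctuation
BFGS: 0 18:21:53 -6441.911200 16.3190
BFGS: 4 18:21:54 *-6443.954663 5.7485*
""".splitlines()

for row in last_rows(sample):
    print(row)
```

An open file object can be passed in place of the sample list, since the function only iterates over lines.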
Import the regular expression module:
import re
Prepare the data:
text = """ Step Time Apple_price fluctuation
BFGS: 0 18:21:43 -6442.333161 7.4744
BFGS: 1 18:21:43 *-6442.899477 5.8484*
Step Time Apple_price fluctuation
BFGS: 0 18:21:53 -6441.911200 16.3190
BFGS: 1 18:21:53 -6442.540975 10.6048
BFGS: 2 18:21:53 -6443.107163 7.6685
BFGS: 3 18:21:53 -6443.565044 6.2186
BFGS: 4 18:21:54 *-6443.954663 5.7485*
Step Time Apple_price fluctuation
BFGS: 0 18:27:00 -6440.611426 24.6802
BFGS: 1 18:27:00 -6441.602767 21.3009
BFGS: 2 18:27:00 -6442.446886 15.6698
BFGS: 3 18:27:01 -6443.084822 11.6312
BFGS: 4 18:27:01 -6443.582671 8.6795
BFGS: 5 18:27:01 -6444.019236 7.4906
BFGS: 6 18:27:01 -6444.389951 6.7435
BFGS: 7 18:27:02 *-6444.732455 6.5221*"""
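Define the regular expression: the characters that may appear between a pair of asterisks. The asterisks themselves must be escaped, and the parentheses capture only the text between them:
p = re.compile(r'\*([- 0-9.]*)\*')
Match the regular expression against the text:
a = p.findall(text)
a is the list of matches. Use enumerate to get each index together with its content:
for k, v in enumerate(a):
    print(k, v)
Output:
0 -6442.899477 5.8484
1 -6443.954663 5.7485
2 -6444.732455 6.5221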
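If numbers are needed rather than strings, the matches can be converted to floats. The following is only a sketch: the names pattern and starred_values are made up for illustration, and it assumes every starred span holds exactly two numbers.

```python
import re

# Two captured numbers between a pair of asterisks.
pattern = re.compile(r"\*\s*(-?\d+(?:\.\d+)?)\s+(-?\d+(?:\.\d+)?)\s*\*")


def starred_values(text):
    """Return a list of (price, fluctuation) float pairs found between asterisks."""
    return [(float(price), float(fluct)) for price, fluct in pattern.findall(text)]


sample = (
    "BFGS: 1 18:21:43 *-6442.899477 5.8484*\n"
    "BFGS: 0 18:21:53 -6441.911200 16.3190\n"
    "BFGS: 4 18:21:54 *-6443.954663 5.7485*"
)
print(starred_values(sample))
```

With two capture groups, findall returns one tuple per match, so each pair can be unpacked directly.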
I think I found a simple solution:
1- In bash:
awk '{$1=$2=$3=""; print $0}' filename.out > filename.out2
2- Type "Error" on the last line of the file
3- Run the following code:
import numpy as np
import pandas as pd
f = open ('filename.out2', 'r')
all_lines = []
for each_line in f:
    all_lines.append(each_line.strip())
#for j in all_lines:
# print(j)
df = pd.DataFrame(all_lines)
count_row = df.shape[0] # Gives number of rows
print("count_row=", count_row)
count_col = df.shape[1] # Gives number of columns
print("count_col=", count_col)
max_sw = 'Error'
lines = [i for i in range(len(all_lines)) if all_lines[i] == max_sw]
#print([i for i in range(len(all_lines)) if all_lines[i]== max_sw])
print(lines)
lines2 = []
for i in lines:
    i = i - 1
    lines2.append(i)
print(lines2)
lines3 = []
for i in lines2:
    if i != -1:
        # print(i)
        # lines3 = [i for i in all_lines[i]]
        # return
        lines3.append(all_lines[i])
print (lines3)
4- The output:
count_row= 19
count_col= 1
[0, 3, 9, 18]
[-1, 2, 8, 17]
['-6442.899477 5.8484', '-6443.954663 5.7485', '-6444.732455 6.5221']
Anyway, any further help is welcome.
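The awk step and the hand-typed sentinel line can probably both be avoided by doing the same index arithmetic in Python alone. This is a rough sketch, not a tested replacement: the name last_rows_by_index is made up, the header is assumed to reduce to the single word "fluctuation" once the first three fields are dropped (as in the sample data), and every block is assumed to hold at least one data row.

```python
def last_rows_by_index(raw_lines, header_word="fluctuation", skip_fields=3):
    """Return the row that precedes each header, plus the very last row."""
    # Same effect as the awk step: drop the first three fields of every line.
    all_lines = [
        " ".join(line.split()[skip_fields:])
        for line in raw_lines
        if line.strip()
    ]
    # Indices of the header rows, plus one past the end as a virtual sentinel.
    boundaries = [i for i, line in enumerate(all_lines) if line == header_word]
    boundaries.append(len(all_lines))
    # The row just before each boundary closes a block; index 0 has no predecessor.
    return [all_lines[i - 1].replace("*", "") for i in boundaries if i > 0]


sample = [
    "Step Time Apple_price fluctuation",
    "BFGS: 0 18:21:43 -6442.333161 7.4744",
    "BFGS: 1 18:21:43 *-6442.899477 5.8484*",
    "Step Time Apple_price fluctuation",
    "BFGS: 0 18:21:53 -6441.911200 16.3190",
    "BFGS: 4 18:21:54 *-6443.954663 5.7485*",
]
print(last_rows_by_index(sample))
```

Appending len(all_lines) to the boundary list plays the role of the extra last line, so the input file does not have to be edited by hand.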