Python parsing problem: stripping a date-time pattern
I have some data from a GSM unit in the following format:
+CMGL: 1,"REC READ","+111111111111","13/05/25,05:15:16+04",25-05-13,05:15:20, 0.668
+CMGL: 2,"REC READ","+111111111111","13/05/25,12:15:14+04",25-05-13,12:15:20, 0.875
+CMGL: 3,"REC READ","+111111111111","13/05/25,10:15:15+04",25-05-13,10:15:20, 0.679
.
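The data is retrieved as a single string buffer, so initially it is all on one line. I can sort and strip the data using data.replace(a, b), but I cannot work out how to remove the first 4 comma-separated groups, e.g.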
+CMGL: 1,"REC READ","+111111111111","YY/MM/DD,HH:MM:SS+DELTA"
My goal is to extract the data so that it looks like this (I don't mind if the date-time rows are out of order):
25-05-13, 05:15:20, 0.668
25-05-13, 12:15:20, 0.875
25-05-13, 10:15:20, 0.679
.
Suggestions welcome.
Use the csv module to handle delimited files.
gsm.txt
+CMGL: 1,"REC READ","+111111111111","13/05/25,05:15:16+04",25-05-13,05:15:20, 0.668
+CMGL: 2,"REC READ","+111111111111","13/05/25,12:15:14+04",25-05-13,12:15:20, 0.875
+CMGL: 3,"REC READ","+111111111111","13/05/25,10:15:15+04",25-05-13,10:15:20, 0.679
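Sample code:
import csv
# newline='' is what the csv module expects when reading from a file
with open('gsm.txt', newline='') as gsm:
    for row in csv.reader(gsm):
        # the quoted timestamp counts as one field, so the wanted data starts at index 4
        print(row[4:])
Output:
['25-05-13', '05:15:20', ' 0.668']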
['25-05-13', '12:15:20', ' 0.875']
['25-05-13', '10:15:20', ' 0.679']
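Like this (a plain split also cuts the quoted timestamp in two, hence index 5; strip() removes the space before the last value):
>>> strs = '+CMGL: 1,"REC READ","+111111111111","13/05/25,05:15:16+04",25-05-13,05:15:20, 0.668'
>>> ", ".join(x.strip() for x in strs.split(",")[5:])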
'25-05-13, 05:15:20, 0.668'
Or:
>>> ", ".join(x.strip() for x in strs.split(",", 5)[-1].split(","))
'25-05-13, 05:15:20, 0.668'
For multiple lines:
>>> strs = """+CMGL: 1,"REC READ","+111111111111","13/05/25,05:15:16+04",25-05-13,05:15:20, 0.668
+CMGL: 2,"REC READ","+111111111111","13/05/25,12:15:14+04",25-05-13,12:15:20, 0.875
+CMGL: 3,"REC READ","+111111111111","13/05/25,10:15:15+04",25-05-13,10:15:20, 0.679"""
>>>
>>> for line in strs.splitlines():
...     print(", ".join(x.strip() for x in line.split(",", 5)[-1].split(",")))
...
25-05-13, 05:15:20, 0.668
25-05-13, 12:15:20, 0.875
25-05-13, 10:15:20, 0.679
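The question mentions that the buffer initially arrives as a single line. If the records are not separated by newlines, a regular expression can pick out the trailing fields directly. This is only a sketch, assuming the trailing fields always have the shape date, time, decimal value; the function name `extract_readings` and the sample buffer are hypothetical.

```python
import re

# Assumed shape of the trailing fields: NN-NN-NN,HH:MM:SS, <decimal value>
READING = re.compile(r'(\d{2}-\d{2}-\d{2}),(\d{2}:\d{2}:\d{2}),\s*(\d+(?:\.\d+)?)')

def extract_readings(buffer):
    """Return 'date, time, value' strings found anywhere in the buffer."""
    return [", ".join(m.groups()) for m in READING.finditer(buffer)]

# Hypothetical buffer: two records run together with no newline between them.
buffer = ('+CMGL: 1,"REC READ","+111111111111","13/05/25,05:15:16+04",25-05-13,05:15:20, 0.668'
          '+CMGL: 2,"REC READ","+111111111111","13/05/25,12:15:14+04",25-05-13,12:15:20, 0.875')

for reading in extract_readings(buffer):
    print(reading)
```

The quoted GSM timestamp uses slashes rather than hyphens, so the pattern does not match it by accident.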
data = """+CMGL: 1,"REC READ","+111111111111","13/05/25,05:15:16+04",25-05-13,05:15:20, 0.668
+CMGL: 2,"REC READ","+111111111111","13/05/25,12:15:14+04",25-05-13,12:15:20, 0.875
+CMGL: 3,"REC READ","+111111111111","13/05/25,10:15:15+04",25-05-13,10:15:20, 0.679"""
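import csv
from io import StringIO  # Python 3; on Python 2 use: from StringIO import StringIO

# skipinitialspace=True drops the space that follows the last comma
for row in csv.reader(StringIO(data), skipinitialspace=True):
    print(', '.join(row[4:7]))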
#25-05-13, 05:15:20, 0.668
#25-05-13, 12:15:20, 0.875
#25-05-13, 10:15:20, 0.679
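If the rows are later needed as values rather than text, the same csv parsing can feed `datetime.strptime` and `float`. This is a rough sketch, not part of the original answer: it assumes the date field is DD-MM-YY (which matches the YY/MM/DD GSM timestamp in the sample), and `parse_readings` is a hypothetical helper name.

```python
import csv
from datetime import datetime
from io import StringIO

data = """+CMGL: 1,"REC READ","+111111111111","13/05/25,05:15:16+04",25-05-13,05:15:20, 0.668
+CMGL: 2,"REC READ","+111111111111","13/05/25,12:15:14+04",25-05-13,12:15:20, 0.875
+CMGL: 3,"REC READ","+111111111111","13/05/25,10:15:15+04",25-05-13,10:15:20, 0.679"""

def parse_readings(text):
    """Yield (timestamp, value) pairs from +CMGL lines."""
    for row in csv.reader(StringIO(text), skipinitialspace=True):
        if len(row) < 7:
            continue  # skip blank or malformed lines
        # Assumption: the date field is DD-MM-YY.
        stamp = datetime.strptime(row[4] + " " + row[5], "%d-%m-%y %H:%M:%S")
        yield stamp, float(row[6])

# Tuples sort by their first element, so this orders the rows by time.
for stamp, value in sorted(parse_readings(data)):
    print(stamp, value)
```

Sorting is optional here, since the question does not require the rows to be in time order.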
If you can guarantee that all lines share the same format, i.e. the prefix is always the same length, I think the simplest approach is:
line = '+CMGL: 1,"REC READ","+111111111111","13/05/25,05:15:16+04",25-05-13,05:15:20, 0.668'
line = line[59:]
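Note that a fixed offset stops working once the prefix changes length, for example when the message index reaches two digits (`+CMGL: 10`). A variant that does not depend on the prefix length is to cut at the last `",`, which closes the quoted timestamp. This is a sketch under the assumption that the unquoted tail never contains that two-character sequence.

```python
line = '+CMGL: 1,"REC READ","+111111111111","13/05/25,05:15:16+04",25-05-13,05:15:20, 0.668'

# Everything after the last '",' is the unquoted tail (assumed: no '",' inside the tail).
tail = line.rsplit('",', 1)[-1]

# Split the tail and trim the stray space before the last value.
fields = [part.strip() for part in tail.split(",")]
print(", ".join(fields))
```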