I have the following two CSV files:
CSV file 1:
Range1,2018-05-17 01:50:17+0000,2018-05-17 02:00:17+0000
Range2,2018-05-17 01:50:17+0000,2018-05-17 04:00:17+0000
Range3,2018-05-17 01:50:17+0000,2018-05-17 08:00:17+0000
CSV file 2:
TimeStamp1,2018-05-17 01:59:17+0000
TimeStamp2,2018-05-17 03:59:17+0000
TimeStamp3,2018-05-17 07:59:17+0000
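I want to iterate over each range in File1 and determine which timestamps fall within the range being compared. For example, the output of my Python script would show:
Output: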
TimeStamp1 falls within Range1
TimeStamp1, TimeStamp2 falls within Range2
TimeStamp1, TimeStamp2, TimeStamp3 falls within Range3
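I started writing something like the following, but I am having trouble getting the output. The if statement should first be checked against every row of File2 for one row of File1, and then the whole pass over File2 should repeat for the next row of File1. Thank you in advance.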
import csv
with open('File1', 'rb') as range, open('File2', 'rb') as timeStamp:
    range_reader = csv.reader(range, quotechar='"')
    timeStamp_reader = csv.reader(timeStamp, quotechar='"')
    for range_row in range_reader:
        print range_row[2]
        print range_row[3]
        for timeStamp_row in timeStamp_reader:
            print timeStamp_row[2]
            if range_row[2] <= timeStamp_row[2] and range_row[3] >= timeStamp_row[2]
                print " %s falls within %s "% (timeStamp_row[1], range_row[1])
# Python 2 syntax, matching the question
import csv
with open('File1.csv', 'rb') as ranger, open('File2.csv', 'rb') as timeStamp:
    # Read both files into lists once, so no seeking is needed later
    range_reader = [x for x in csv.reader(ranger, quotechar='"')]
    timeStamp_reader = [x for x in csv.reader(timeStamp, quotechar='"')]
    for range_row in range_reader:
        temp = []  # names of the timestamps that fall within this range
        for timeStamp_row in timeStamp_reader:
            if range_row[1] <= timeStamp_row[1] and range_row[2] >= timeStamp_row[1]:
                temp.append(timeStamp_row[0])
        if temp:
            print " %s falls within %s "% (','.join(temp), range_row[0])
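Lukasas's answer is good, but if your data set is large, seeking back to the start on every pass of the for loop may not be a good idea. Just copy the rows into lists at the start. Also, to get the output in the form you want, you need to collect the matching timestamps in a list that is reset at the start of each outer-loop iteration.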
TimeStamp1 falls within Range1
TimeStamp1,TimeStamp2 falls within Range2
TimeStamp1,TimeStamp2,TimeStamp3 falls within Range3
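For readers on Python 3, the same idea (read both files into lists once, then collect the matches for each range) might be sketched as below. The function name `timestamps_in_ranges` is hypothetical, and `io.StringIO` objects stand in for the real files so the sketch runs on its own.

```python
import csv
import io


def timestamps_in_ranges(range_file, timestamp_file):
    """Return a list of (range_name, [timestamp names]) pairs."""
    # Read each file once so the inner loop never has to re-read or seek.
    range_rows = list(csv.reader(range_file))
    timestamp_rows = list(csv.reader(timestamp_file))

    results = []
    for name, start, end in range_rows:
        # Plain string comparison is enough here only because every value
        # uses the same fixed-width format and the same +0000 offset.
        matches = [ts_name for ts_name, ts in timestamp_rows if start <= ts <= end]
        if matches:
            results.append((name, matches))
    return results


# In-memory stand-ins for File1.csv and File2.csv (sample data from the question).
file1 = io.StringIO(
    "Range1,2018-05-17 01:50:17+0000,2018-05-17 02:00:17+0000\n"
    "Range2,2018-05-17 01:50:17+0000,2018-05-17 04:00:17+0000\n"
    "Range3,2018-05-17 01:50:17+0000,2018-05-17 08:00:17+0000\n"
)
file2 = io.StringIO(
    "TimeStamp1,2018-05-17 01:59:17+0000\n"
    "TimeStamp2,2018-05-17 03:59:17+0000\n"
    "TimeStamp3,2018-05-17 07:59:17+0000\n"
)

for name, matches in timestamps_in_ranges(file1, file2):
    print("%s falls within %s" % (",".join(matches), name))
```

With real files, the two arguments would be the objects returned by `open('File1.csv', newline='')` and `open('File2.csv', newline='')`.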
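There are a few mistakes in your code. First, you got the indexes wrong. Indexing starts at 0 here, so just subtract 1 from every index.
You also cannot read from the file repeatedly: once the reader hits the end of the file, it will not return anything more, because it is already at the end. So for the second loop you need to reset the file back to the start. This is easily done with seek.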
import csv
with open('File1', 'r') as ranges, open('File2', 'r') as timeStamp:
    range_reader = csv.reader(ranges, quotechar='"')
    timeStamp_reader = csv.reader(timeStamp, quotechar='"')
    rangeArray = {}
    for range_row in range_reader:
        print("%s / %s" % ( range_row[1], range_row[2])) # This looks better, and gives more info than just printing both timestamps on each line
        timeStamp.seek(0) # This will set position of cursor in timeStamp back to start, so it can iterate repeatedly
        rangeArray[range_row[0]] = []
        for timeStamp_row in timeStamp_reader:
            if range_row[1] <= timeStamp_row[1] and range_row[2] >= timeStamp_row[1]:
                rangeArray[range_row[0]].append(timeStamp_row[0])
                print (" %s falls within %s " % (timeStamp_row[0], range_row[0]))
    print("\n\n")
    # Desired Output:
    for key in rangeArray:
        print("%s falls within %s" % (', '.join([str(x) for x in rangeArray[key]]), key))
This produces the following output:
2018-05-17 01:50:17+0000 / 2018-05-17 02:00:17+0000
TimeStamp1 falls within Range1
2018-05-17 01:50:17+0000 / 2018-05-17 04:00:17+0000
TimeStamp1 falls within Range2
TimeStamp2 falls within Range2
2018-05-17 01:50:17+0000 / 2018-05-17 08:00:17+0000
TimeStamp1 falls within Range3
TimeStamp2 falls within Range3
TimeStamp3 falls within Range3
TimeStamp1 falls within Range1
TimeStamp1, TimeStamp2 falls within Range2
TimeStamp1, TimeStamp2, TimeStamp3 falls within Range3
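The exhaustion behaviour this answer describes can be seen in isolation with a small sketch. An `io.StringIO` object stands in for File2 here (an assumption made so the example is self-contained); a file opened with `open()` is rewound with `seek(0)` in the same way.

```python
import csv
import io

# In-memory stand-in for File2 (sample data from the question).
timestamps = io.StringIO(
    "TimeStamp1,2018-05-17 01:59:17+0000\n"
    "TimeStamp2,2018-05-17 03:59:17+0000\n"
)
reader = csv.reader(timestamps)

first_pass = [row[0] for row in reader]
second_pass = [row[0] for row in reader]  # the file is already at its end

timestamps.seek(0)  # rewind the underlying file object
third_pass = [row[0] for row in reader]

print(first_pass)   # ['TimeStamp1', 'TimeStamp2']
print(second_pass)  # []
print(third_pass)   # ['TimeStamp1', 'TimeStamp2']
```

This is why the inner loop in the question only produces rows for the first range: without the rewind, every later pass over File2 is empty.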
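As you will see, I made quite a few changes, the first being that I wrote the code in Python 3. Are you using Python 2?
Anyway, I am happy to answer questions. I think this mostly works the way you want: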
import csv
import datetime

FMT = "%Y-%m-%d %H:%M:%S"

# 'ranges' rather than 'range', to avoid shadowing the built-in
with open('File1', 'r') as ranges, open('File2', 'r') as timeStamp:
    range_rows = list(csv.reader(ranges, quotechar='"'))
    timeStamp_rows = list(csv.reader(timeStamp, quotechar='"'))

# Parse every timestamp; [:-5] drops the trailing "+0000" offset
range_list = []
for row in range_rows:
    time = [row[0], datetime.datetime.strptime(row[1][:-5], FMT), datetime.datetime.strptime(row[2][:-5], FMT)]
    range_list.append(time)
timeStamp_list = []
for row in timeStamp_rows:
    time = [row[0], datetime.datetime.strptime(row[1][:-5], FMT)]
    timeStamp_list.append(time)

for i in range_list:
    for e in timeStamp_list:
        if i[1] <= e[1] and i[2] >= e[1]:
            print(" %s falls within %s "% (e[0], i[0]))
Output:
TimeStamp1 falls within Range1
TimeStamp1 falls within Range2
TimeStamp2 falls within Range2
TimeStamp1 falls within Range3
TimeStamp2 falls within Range3
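The `[:-5]` slice in the code above throws the UTC offset away, which is harmless only while every value ends in `+0000`. A minimal sketch that keeps the offset by parsing it with `%z` is shown below; the helper names `parse` and `in_range` are hypothetical, and the `+0200` timestamp is an invented value used purely for illustration.

```python
from datetime import datetime

FORMAT = "%Y-%m-%d %H:%M:%S%z"  # %z parses offsets such as +0000


def parse(text):
    """Turn a string from the CSV into a timezone-aware datetime."""
    return datetime.strptime(text, FORMAT)


def in_range(start, end, stamp):
    """True if stamp lies between start and end, inclusive."""
    return parse(start) <= parse(stamp) <= parse(end)


range1 = ("2018-05-17 01:50:17+0000", "2018-05-17 02:00:17+0000")

# TimeStamp1 from the question.
print(in_range(range1[0], range1[1], "2018-05-17 01:59:17+0000"))  # True

# Invented example: the same instant written with a +0200 offset.
# A plain string comparison would wrongly place it outside Range1.
print(in_range(range1[0], range1[1], "2018-05-17 03:59:17+0200"))  # True
```

Comparing aware datetimes compares the underlying instants, so mixed offsets in the input files would still be handled correctly.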