我有两个具有非常相似数据的表(相似但不相同的数据(,它们每天都会生成。
样本表RT20190101
Time Sensor Name StaTus Reading Temp Grade Acidity
20190101-830, A345CRT, XXXXXXX, Active, 5.6, 54, 8, 3.2
20190101-930, A445ADE, XXXXXXX, Active, 5.5, 54, 8, 1.2
20190101-1030, D546RTE, XXXXXXX, Active, 5.5, 55, 8, 2
20190101-1130, D678FRT, XXXXXXX, De-Active, 5.7, 57, 7, 3.5
样本表RT20190102
Time Sensor Name StaTus Reading Temp Grade Acidity
20190102-830, A345CRT, XXXXXXX, Active, 5.6, 54, 8, 3.2
20190102-930, A445ADE, XXXXXXX, De-Active, 5.6, 56, 7, 1.2
20190102-1030, D546RTE, XXXXXXX, De-Active, 5.5, 56, 8, 2
20190102-1130, D678FRT, XXXXXXX, Active, 5.5, 56, 9, 2
我想比较文件上的传感器并标记它们,其中临时,等级和酸度已更改。
我可以使用SQL解决此问题,但想在Python中复制以下代码
SELECT *
FROM (SELECT A.Time, B.Time, B.Sensor, B.Name, A.Status, B.Status,
IIf(A.Status<>B.Status,1,0) AS StatusChange, A.[Reading], B.[Reading],
IIf([A.Reading]<>[B.Reading],1,0) AS [Reading Change],
iif(A.Temp<>B.Temp,1,iif([A.Grade]<>[B.Grade],1,iif([A.Acidity]<>[B.Acidity],1,0))) AS Change
FROM RT20190101 AS A INNER
JOIN RT20190102 AS B ON A.Time = B.Time)
WHERE Change=1;
样本输出
A.Time B.Time Sensor Name A.Status B.Status StatusChange A.Reading B.Reading Reading Change
20190101-1130, 20190102-1130, D678FRT, XXXXXXX, De-Active, Active, 1, 5.7, 5.5, 1
我想获取所有传感器的列表,其名称,如果其状态更改以及阅读更改。仅当TEMP
,GRADE
和ACIDITY
级别也更改时。
只是一个一般的想法:
import pandas as pd
import numpy as np
df1 = pd.read_sql_table('first_table', 'postgres:///db_name')
df2 = pd.read_sql_table('second_table', 'postgres:///db_name')
df1['compare'] = np.nan
for i in range(0, len(df1)):
id = df1['id'].loc[i]
sensor1 = df1['sensor'].loc[i]
sensor2 = df2['sensor'].loc[df2['id'] == id]
if sensor1 != sensor2:
df1['compare'].loc[i] = False
else:
df1['compare'].loc[i] = True