How to replace True and False with numeric values under a specific condition in a DataFrame



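I want to convert True and False into specific values in a DataFrame. As long as the "time" variable (in seconds) stays below 300, the rows should get the value "1". The first row above 300 seconds that follows those rows gets the same value "1". The rows after that one, which are again below 300 seconds, get the next value "2", and so on.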

Here is my code:

import numpy as np
import pandas as pd

# df is assumed to already exist and to contain a 'timestamp' column
df['timestamp'] = pd.to_datetime(df['timestamp'])
# time elapsed since the previous row
df['delta'] = df['timestamp'] - df['timestamp'].shift()
df['time'] = df['delta'].dt.total_seconds()
# flag gaps longer than 300 seconds
df['outlier'] = df['time'] > 300
# my attempt: this only marks the outlier rows with '1'; every other row becomes 'na'
df['output'] = np.where(df['outlier'], '1', 'na')

This is the input, a sample of the DataFrame I have:

timestamp              delta            time     outlier   output 

0  2020-11-08 17:54:53       NaT              NaN      False      na 
1  2020-11-08 17:54:56   0 days 00:00:03      3.0      False      na 
2  2020-11-08 17:54:57   0 days 00:00:01      1.0      False      na 
3  2020-11-08 21:04:41   0 days 03:09:44    11384.0    True       1   
4  2020-11-08 21:04:52   0 days 00:00:11      11.0     False      na 
5  2020-11-08 21:04:53   0 days 00:00:01      1.0      False      na   
6  2020-11-10 20:36:32   1 days 23:31:39   171099.0    True       1   
7  2020-11-10 20:37:01   0 days 00:00:29      29.0     False      na 
8  2020-11-10 20:37:04   0 days 00:00:03      3.0      False      na

This is the output I am actually looking for:

timestamp              delta            time     outlier   output 

0  2020-11-08 17:54:53       NaT              NaN      False     NaN 
1  2020-11-08 17:54:56   0 days 00:00:03      3.0      False      1  
2  2020-11-08 17:54:57   0 days 00:00:01      1.0      False      1  
3  2020-11-08 21:04:41   0 days 03:09:44    11384.0    True       1  
4  2020-11-08 21:04:52   0 days 00:00:11      11.0     False      2  
5  2020-11-08 21:04:53   0 days 00:00:01      1.0      False      2    
6  2020-11-10 20:36:32   1 days 23:31:39   171099.0    True       2    
7  2020-11-10 20:37:01   0 days 00:00:29      29.0     False      3    
8  2020-11-10 20:37:04   0 days 00:00:03      3.0      False      3 

Note that this is only a sample of the DataFrame, so please help me fix the code above so that it also works for a DataFrame with a large number of rows.

Something like this?

df['output'] = (df.outlier.cumsum() + 1).map(str).shift()
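
To see why this works, here is a minimal end-to-end sketch that rebuilds the sample from the question and applies the same line. The timestamps are copied from the question; the intermediate variable `gaps_so_far` is only an illustrative name, not part of the original code.

```python
import pandas as pd

# Sample timestamps copied from the question.
df = pd.DataFrame({
    "timestamp": pd.to_datetime([
        "2020-11-08 17:54:53",
        "2020-11-08 17:54:56",
        "2020-11-08 17:54:57",
        "2020-11-08 21:04:41",
        "2020-11-08 21:04:52",
        "2020-11-08 21:04:53",
        "2020-11-10 20:36:32",
        "2020-11-10 20:37:01",
        "2020-11-10 20:37:04",
    ])
})

# Seconds elapsed since the previous row (NaN for the first row).
df["time"] = df["timestamp"].diff().dt.total_seconds()
df["outlier"] = df["time"] > 300

# Running count of large gaps seen so far, including the current row.
# (gaps_so_far is a hypothetical helper name used for illustration.)
gaps_so_far = df["outlier"].cumsum()

# Start the labels at 1, then shift down by one row so that the row with
# the large gap keeps the label of the group it closes. The first row has
# no predecessor, so it ends up as a missing value.
df["output"] = (gaps_so_far + 1).map(str).shift()

print(df[["time", "outlier", "output"]])
```

Because `cumsum` and `shift` are vectorized, the same approach should scale to a DataFrame with many rows without an explicit Python loop.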

If you prefer integers:

df['output'] = (df.outlier.cumsum() + 1).map(int).astype(object).shift()
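
As a possible alternative to an object column, the sketch below uses pandas' nullable integer dtype `"Int64"`, which can hold integers alongside a missing value. It assumes a reasonably recent pandas version and hard-codes the `outlier` flags from the question's sample; it is a variation on the line above, not the answer's original code.

```python
import pandas as pd

# Boolean flags from the question's sample: True marks a gap above 300 s.
outlier = pd.Series([False, False, False, True, False, False, True, False, False])

# cumsum() on booleans yields plain integers. shift() leaves the first row
# empty, which would normally force a float column; converting to the
# nullable "Int64" dtype first keeps the labels as integers with <NA>.
labels = (outlier.cumsum() + 1).astype("Int64").shift()

print(labels)
```

The first row is then `<NA>` rather than `NaN`, and the remaining rows stay integers.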
