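I have a DataFrame produced by a Python script, which gives the output shown below.
Because pandas provides vectorized string operations on DataFrame columns, it is easy to find (or exclude) the rows that contain a given string:
DataFrame: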
>>> df
Datetime High Low Time
0 6/15/2021 14:30 15891.04981 15868.04981 14:30:00
1 6/15/2021 14:45 15883.00000 15869.90039 14:45:00
2 6/15/2021 15:00 15881.50000 15866.50000 15:00:00
3 6/15/2021 15:15 15877.75000 15854.54981 15:15:00
4 6/15/2021 15:30 15869.25000 15869.25000 15:30:00
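To try the snippets in this thread without the original script, the frame above can be rebuilt by hand. This is only a sketch: it assumes `Datetime` and `Time` are stored as plain strings, which is what the `str.contains` calls further down rely on.

```python
import pandas as pd

# Rebuild the sample frame shown above.
# Assumption: Datetime and Time are plain strings, High and Low are floats.
df = pd.DataFrame({
    "Datetime": ["6/15/2021 14:30", "6/15/2021 14:45", "6/15/2021 15:00",
                 "6/15/2021 15:15", "6/15/2021 15:30"],
    "High": [15891.04981, 15883.0, 15881.5, 15877.75, 15869.25],
    "Low": [15868.04981, 15869.90039, 15866.5, 15854.54981, 15869.25],
    "Time": ["14:30:00", "14:45:00", "15:00:00", "15:15:00", "15:30:00"],
})
```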
Result:
Method 1:
Using str.contains
>>> df[~df['Time'].str.contains('15:30:00')]
Datetime High Low Time
0 6/15/2021 14:30 15891.04981 15868.04981 14:30:00
1 6/15/2021 14:45 15883.00000 15869.90039 14:45:00
2 6/15/2021 15:00 15881.50000 15866.50000 15:00:00
3 6/15/2021 15:15 15877.75000 15854.54981 15:15:00
Or, if you are filtering on the Datetime column:
>>> df[~df['Datetime'].str.contains('15:30')]
Datetime High Low Time
0 6/15/2021 14:30 15891.04981 15868.04981 14:30:00
1 6/15/2021 14:45 15883.00000 15869.90039 14:45:00
2 6/15/2021 15:00 15881.50000 15866.50000 15:00:00
3 6/15/2021 15:15 15877.75000 15854.54981 15:15:00
Or
>>> df[~df.Time.str.contains("15:30") == True]
Datetime High Low Time
0 6/15/2021 14:30 15891.04981 15868.04981 14:30:00
1 6/15/2021 14:45 15883.00000 15869.90039 14:45:00
2 6/15/2021 15:00 15881.50000 15866.50000 15:00:00
3 6/15/2021 15:15 15877.75000 15854.54981 15:15:00
Or
>>> df[df['Time'].str.contains('15:30') == False]
Datetime High Low Time
0 6/15/2021 14:30 15891.04981 15868.04981 14:30:00
1 6/15/2021 14:45 15883.00000 15869.90039 14:45:00
2 6/15/2021 15:00 15881.50000 15866.50000 15:00:00
3 6/15/2021 15:15 15877.75000 15854.54981 15:15:00
Or
>>> df[df['Time'].str.contains('15:30') == 0]
Datetime High Low Time
0 6/15/2021 14:30 15891.04981 15868.04981 14:30:00
1 6/15/2021 14:45 15883.00000 15869.90039 14:45:00
2 6/15/2021 15:00 15881.50000 15866.50000 15:00:00
3 6/15/2021 15:15 15877.75000 15854.54981 15:15:00
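One thing worth keeping in mind with Method 1: `str.contains` does a substring match and treats the pattern as a regular expression by default, and it returns a missing value for missing entries. A small sketch of how that plays out, using a made-up Series (the values are hypothetical, not from the question):

```python
import pandas as pd

# Hypothetical values: one missing entry and one time that merely starts with "15:30".
times = pd.Series(["14:30:00", "15:30:00", None, "15:30:45"])

# regex=False treats the pattern as a literal substring;
# na=False counts missing values as "no match", so the mask is purely boolean.
mask = times.str.contains("15:30", regex=False, na=False)

# Invert the mask to keep everything that does NOT contain "15:30".
kept = times[~mask]
```

Note that "15:30:45" is dropped as well, because it contains "15:30". If only an exact time should be removed, the equality-based methods below may be a better fit.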
Method 2:
Using isin
>>> df[~df['Time'].isin(['15:30:00'])]
Datetime High Low Time
0 6/15/2021 14:30 15891.04981 15868.04981 14:30:00
1 6/15/2021 14:45 15883.00000 15869.90039 14:45:00
2 6/15/2021 15:00 15881.50000 15866.50000 15:00:00
3 6/15/2021 15:15 15877.75000 15854.54981 15:15:00
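`isin` takes a list, so the same pattern should extend to dropping several times at once. A minimal sketch, where `drop_times` is a hypothetical list chosen just for illustration:

```python
import pandas as pd

df = pd.DataFrame({
    "Time": ["14:30:00", "14:45:00", "15:00:00", "15:15:00", "15:30:00"],
})

# Hypothetical list of times to exclude; isin matches whole values, not substrings.
drop_times = ["15:15:00", "15:30:00"]

# ~ inverts the membership mask, keeping rows whose Time is not in the list.
result = df[~df["Time"].isin(drop_times)]
```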
Method 3:
Using the element-wise not-equal comparison (binary operator ne):
>>> df[df.Time != '15:30:00']
Datetime High Low Time
0 6/15/2021 14:30 15891.04981 15868.04981 14:30:00
1 6/15/2021 14:45 15883.00000 15869.90039 14:45:00
2 6/15/2021 15:00 15881.50000 15866.50000 15:00:00
3 6/15/2021 15:15 15877.75000 15854.54981 15:15:00
Or
>>> df[df['Time'] != '15:30:00']
Datetime High Low Time
0 6/15/2021 14:30 15891.04981 15868.04981 14:30:00
1 6/15/2021 14:45 15883.00000 15869.90039 14:45:00
2 6/15/2021 15:00 15881.50000 15866.50000 15:00:00
3 6/15/2021 15:15 15877.75000 15854.54981 15:15:00
Or
>>> df[df['Time'].ne('15:30:00')]
Datetime High Low Time
0 6/15/2021 14:30 15891.04981 15868.04981 14:30:00
1 6/15/2021 14:45 15883.00000 15869.90039 14:45:00
2 6/15/2021 15:00 15881.50000 15866.50000 15:00:00
3 6/15/2021 15:15 15877.75000 15854.54981 15:15:00
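My approach is as follows.
First, we take the time to remove from the dataset, in this case 15:30:00.
Since the Datetime column is in datetime format, we cannot compare the time as a string (if the column is still stored as strings, parse it first with pd.to_datetime). So we convert the given time to a datetime.time object.
import datetime as dt
rm_time = dt.time(15, 30)
With this, we can use DataFrame.drop(). It returns a new DataFrame, so assign the result back:
df = df.drop(df[df.Datetime.dt.time == rm_time].index)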
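Putting those steps together, here is a rough end-to-end sketch. It assumes the `Datetime` column starts out as strings in the `month/day/year hour:minute` layout seen in the question, so it is parsed first; the `format` string is my guess from the sample rows, not something stated in the question.

```python
import datetime as dt
import pandas as pd

df = pd.DataFrame({
    "Datetime": ["6/15/2021 14:30", "6/15/2021 15:15", "6/15/2021 15:30"],
    "High": [15891.04981, 15877.75, 15869.25],
})

# Parse the strings into datetime64 values (format assumed from the sample rows).
df["Datetime"] = pd.to_datetime(df["Datetime"], format="%m/%d/%Y %H:%M")

# The time of day whose rows should be removed.
rm_time = dt.time(15, 30)

# .dt.time yields datetime.time objects, which compare element-wise against rm_time;
# drop() then removes the matching index labels and returns a new frame.
result = df.drop(df[df["Datetime"].dt.time == rm_time].index)
```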
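You can try this:
import pandas as pd
# Load the data, then keep only the rows whose Time is not 15:30:00
test_data = pd.read_csv("test.csv")
test_data = test_data[test_data["Time"] != "15:30:00"]
print(test_data)
Simply select the rows based on the condition.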
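If you do not have a `test.csv` at hand, the same idea can be checked with an in-memory CSV. This is just a sketch: `csv_text` is a hypothetical stand-in for the file, and `reset_index(drop=True)` is an optional extra to renumber the remaining rows.

```python
import io
import pandas as pd

# Hypothetical stand-in for test.csv, using two rows from the question.
csv_text = """Datetime,High,Low,Time
6/15/2021 15:15,15877.75,15854.54981,15:15:00
6/15/2021 15:30,15869.25,15869.25,15:30:00
"""

# read_csv leaves the Time column as strings, so a plain string comparison works.
test_data = pd.read_csv(io.StringIO(csv_text))

# Keep rows whose Time differs from 15:30:00, then renumber the index from 0.
test_data = test_data[test_data["Time"] != "15:30:00"].reset_index(drop=True)
```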