Problem with an if statement? Inequality check not working?

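I have 1000 DataFrames with the same structure, but some of them may contain strings as values. I need to do the same calculations for all of them, except that rows containing a substring must be excluded. An example of a dataset with strings looks like this: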

time    x           y               z
0.00  run_failed  run_failed   run_failed
0.02  run_failed  run_failed   run_failed
0.03  test_failed test_failed  test_failed
0.04  44            321         644
0.04  44            321         644
0.04  44            321         644
0.03  test_failed test_failed  test_failed
0.04  44            321         644
0.04  44            321         644

Inspecting the DataFrames also shows that those containing the substrings always have object dtype, while the "normal" ones are float64.

So, to handle this, I wrote the following script:

for df in dfs:
    add = 0
    z = pd.read_csv(df)
    if type(z["x"]) != np.float64:
        bmb = z[z['x'].str.contains('failed')]
        z = z.drop(bmb.index)
        add = len(bmb)
    print(add)
    ....

followed by the code that does the calculations, assuming that any rows containing strings were already dropped inside the if statement.

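But when I run the code, it raises the error "Can only use .str accessor with string values!", pointing inside the if block. The dataset in question is entirely float64, so it is not at all clear to me why it tries to execute the line bmb = z[z['x'].str.contains('failed')].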

if type(z["x"]) != np.float64:

Here you are getting the type of a DataFrame column, which is Series. Consider the following simple example

import pandas as pd
df = pd.DataFrame({"x":[1,2,3]},dtype="int32")
print(type(df["x"]))

which outputs

<class 'pandas.core.series.Series'>

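So your condition always holds (in other words, it is the same as writing if True:). If you are interested in the type of the values the column holds, use the .dtype attribute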

print(df["x"].dtype)

which outputs

int32
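As a minimal sketch of how the check from the question could look once it tests the dtype of the values rather than the type of the column object: the in-memory CSVs and the helper name drop_failed_rows below are hypothetical, made up for illustration. I use pandas.api.types.is_numeric_dtype instead of comparing against one specific dtype, because how string columns are reported can depend on the pandas version and settings; that choice is mine, not something stated above.

```python
import io

import pandas as pd
from pandas.api.types import is_numeric_dtype

# Hypothetical in-memory CSVs standing in for the files from the question.
csv_with_failures = io.StringIO(
    "time,x,y,z\n"
    "0.00,run_failed,run_failed,run_failed\n"
    "0.03,test_failed,test_failed,test_failed\n"
    "0.04,44,321,644\n"
    "0.04,44,321,644\n"
)
csv_clean = io.StringIO(
    "time,x,y,z\n"
    "0.02,41,300,600\n"
    "0.04,44,321,644\n"
)


def drop_failed_rows(z):
    """Return (frame without failed rows, number of rows dropped)."""
    dropped = 0
    # Look at the dtype of the values, not at type(z["x"]), which is always Series.
    if not is_numeric_dtype(z["x"]):
        failed = z["x"].str.contains("failed")
        dropped = int(failed.sum())
        z = z[~failed]
    return z, dropped


cleaned_a, dropped_a = drop_failed_rows(pd.read_csv(csv_with_failures))
cleaned_b, dropped_b = drop_failed_rows(pd.read_csv(csv_clean))
print(dropped_a, dropped_b)
```

With this check, a purely numeric file skips the .str branch entirely, so the accessor error from the question should no longer be triggered for it.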

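I think I found the problem. First, I changed

if type(z["x"]) != np.float64

to

z['x'].dtype == 'object'

and in the subsequent calculations, wherever I take a pandas column as a Series, I added .astype(float). It seems that when a DataFrame goes through the if block, the values in its columns are strings, because they come back like this: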

array(['0.00260045', '0.00257398', '0.00247482', ..., '0.02017634', '0.01997158','0.02019846'])
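The values stay strings because a column that contained "failed" markers was read as text, and dropping rows does not change its dtype. As an alternative to .astype(float), here is a rough sketch using pd.to_numeric with errors="coerce", which converts and flags the failed rows in one step. This is a different technique from the one described above, and the sample frame and the name value_cols are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical frame mimicking a file that contained failure markers.
z = pd.DataFrame(
    {
        "time": [0.00, 0.03, 0.04, 0.04],
        "x": ["run_failed", "test_failed", "44", "44"],
        "y": ["run_failed", "test_failed", "321", "321"],
        "z": ["run_failed", "test_failed", "644", "644"],
    }
)

value_cols = ["x", "y", "z"]

# Strings that cannot be parsed become NaN; numeric strings become floats.
numeric = z[value_cols].apply(pd.to_numeric, errors="coerce")

# Rows where x could not be parsed are treated as failed runs.
failed = numeric["x"].isna()
add = int(failed.sum())

# Keep the time column and attach the converted values for the good rows.
clean = z.loc[~failed, ["time"]].join(numeric[~failed])
print(add)
print(clean.dtypes)
```

One caveat with this approach: a genuinely missing value in x would also be counted as a failed row, so it only fits if the data has no other gaps.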
