I am working with a CSV file in the following format:
"LatD", "LatM", "LatS", "NS", "LonD", "LonM", "LonS", "EW", "City", "State"
41, 5, 59, "N", 80, 39, 0, "W", "Youngstown", OH
42, 52, 48, "N", 97, 23, 23, "W", "Yankton", SD
46, 35, 59, "N", 120, 30, 36, "W", "Yakima", WA
42, 16, 12, "N", 71, 48, 0, "W", "Worcester", MA
43, 37, 48, "N", 89, 46, 11, "W", "Wisconsin Dells", WI
I load it with pandas:
cities = pd.read_csv("cities.csv")
and then try to access a column:
print(cities[cities.City.str.contains("Y")])
I get this error:
AttributeError: 'DataFrame' object has no attribute 'City'
I tried to fix it with the following, but the problem persists:
cities.columns = cities.columns.str.strip()
Is this related to the quotes in the first row? If so, is there a way to convert them programmatically?
Thanks in advance.
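You can try replacing the " characters with an empty string (this works as long as the columns don't contain other " characters as data):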
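from io import StringIO

import pandas as pd

with open("cities.csv", "r") as f_in:
    # Strip every double quote, then split on commas with optional surrounding whitespace.
    df = pd.read_csv(
        StringIO(f_in.read().replace('"', "")), sep=r"\s*,\s*", engine="python"
    )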
print(df[df.City.str.contains("Y")])
Prints:
   LatD  LatM  LatS NS  LonD  LonM  LonS EW        City State
0    41     5    59  N    80    39     0  W  Youngstown    OH
1    42    52    48  N    97    23    23  W     Yankton    SD
2    46    35    59  N   120    30    36  W      Yakima    WA
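An alternative that may be worth trying, sketched here rather than taken from the answer above: the file has a space after every comma, so the opening quote is not the first character of the field and the parser keeps the space and the quotes as part of the column name. Passing `skipinitialspace=True` to `pd.read_csv` should let the default quote handling work. The `sample` string is only a stand-in for `cities.csv` so that the snippet runs on its own.

```python
from io import StringIO

import pandas as pd

# Stand-in for cities.csv (assumption: the real file has the same layout).
sample = '''"LatD", "LatM", "LatS", "NS", "LonD", "LonM", "LonS", "EW", "City", "State"
41, 5, 59, "N", 80, 39, 0, "W", "Youngstown", OH
42, 52, 48, "N", 97, 23, 23, "W", "Yankton", SD
46, 35, 59, "N", 120, 30, 36, "W", "Yakima", WA
42, 16, 12, "N", 71, 48, 0, "W", "Worcester", MA
43, 37, 48, "N", 89, 46, 11, "W", "Wisconsin Dells", WI
'''

# Default parsing: inspect the raw column names to see what the parser kept.
raw = pd.read_csv(StringIO(sample))
print(raw.columns.tolist())

# Skip the space after each delimiter so a leading quote is seen as a quote.
cities = pd.read_csv(StringIO(sample), skipinitialspace=True)
print(cities.columns.tolist())

# Attribute access now finds the column.
y_cities = cities[cities.City.str.contains("Y")]
print(y_cities)
```

For the real file the call would be `pd.read_csv("cities.csv", skipinitialspace=True)`. Unlike stripping every quote, this keeps any quoted field that itself contains a comma intact.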
Try this:
cities["City"].str.contains('Y').any()
For more details, see: https://www.statology.org/pandas-check-if-column-contains-string/
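A small sketch of what that call returns, assuming the column is already named `City` without stray quotes or spaces (the DataFrame below is built by hand for illustration, not read from the file): `.any()` collapses the boolean mask to a single True/False, while indexing with the mask itself returns the matching rows.

```python
import pandas as pd

# Hand-built frame with a clean "City" column (assumption for illustration).
cities = pd.DataFrame(
    {"City": ["Youngstown", "Yankton", "Yakima", "Worcester", "Wisconsin Dells"]}
)

# One boolean per row: does the city name contain "Y"?
mask = cities["City"].str.contains("Y")

# A single answer: is there at least one match?
has_y = mask.any()

# The matching rows themselves.
matches = cities[mask]

print(has_y)
print(matches)
```

If the header still carries the quotes, `cities["City"]` raises a `KeyError`, so the column names need to be cleaned first.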