When I try to read Indonesia's life expectancy at birth (https://data.worldbank.org/indicator/SP.DYN.LE00.IN?locations=ID is the link, if you want to check it), I can't. This is my code:
import pandas as pd
import matplotlib.pyplot as plt
lifeexpectacion = pd.read_csv("API_SP.DYN.LE00.IN_DS2_en_csv_v2_4770434.csv")
print(lifeexpectacion)
and the error is:
File "D:programaizardata economymain.py", line 4, in <module>
lifeexpectacion = pd.read_csv("API_SP.DYN.LE00.IN_DS2_en_csv_v2_4770434.csv")
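The first 4 lines of the CSV contain the title, the last-updated date, and similar information rather than data. You need to skip the first 4 lines of the data file. Use pd.read_csv("API_SP.DYN.LE00.IN_DS2_en_csv_v2_4770434.csv", skiprows=4)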
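A minimal sketch of this fix on an in-memory stand-in for the file, so it can be tried without downloading anything. The `sample_csv` string is a hypothetical miniature of the World Bank layout (it assumes two metadata lines, each followed by a blank line, then the header), and the numeric values are placeholders, not real statistics.

```python
import io

import pandas as pd

# Hypothetical miniature of the downloaded file (assumed layout):
# metadata line, blank line, metadata line, blank line, header, data rows.
# The numbers are placeholders, not real life-expectancy figures.
sample_csv = (
    '"Data Source","World Development Indicators",\n'
    '\n'
    '"Last Updated Date","2022-12-22",\n'
    '\n'
    '"Country Name","Country Code","2019","2020",\n'
    '"Indonesia","IDN","1.0","2.0",\n'
    '"Aruba","ABW","3.0","4.0",\n'
)

# skiprows=4 discards the first four physical lines, so the fifth line
# becomes the header row.
df = pd.read_csv(io.StringIO(sample_csv), skiprows=4)

print(df.columns.tolist())
print(df.shape)
```

With the real file, the only change should be passing the file name instead of the `io.StringIO` object.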
I downloaded the linked file to see whether I could reproduce the error. This post has a similar problem.
The first four lines of the CSV are:
"Data Source","World Development Indicators",
"Last Updated Date","2022-12-22",
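If you delete these lines, it works as expected. The metadata makes pandas think there should be only two columns, when there are actually 67.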
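To see the mismatch for yourself, one option is to print the first raw lines and then let pandas try to parse the text without skipping anything. This is only a sketch: `sample_csv` is a hypothetical miniature of the file (assumed layout, placeholder values), and the exact wording of the error message may differ between pandas versions.

```python
import io

import pandas as pd

# Hypothetical miniature of the downloaded file (assumed layout).
# The numbers are placeholders, not real life-expectancy figures.
sample_csv = (
    '"Data Source","World Development Indicators",\n'
    '\n'
    '"Last Updated Date","2022-12-22",\n'
    '\n'
    '"Country Name","Country Code","2019","2020",\n'
    '"Indonesia","IDN","1.0","2.0",\n'
)

# Peek at the raw lines first: the metadata rows are much shorter
# than the real header row.
for number, line in enumerate(sample_csv.splitlines()[:5], start=1):
    print(number, repr(line))

# Without skiprows, pandas takes its column count from the first
# metadata line and fails when it reaches the wider header row.
try:
    pd.read_csv(io.StringIO(sample_csv))
    parse_failed = False
except pd.errors.ParserError as exc:
    parse_failed = True
    print("ParserError:", exc)
```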
It works for me:
df = pd.read_csv(r'D:tempAPI_SP.DYN.LE00.IN_DS2_en_csv_v2_4770434.csv',skiprows=4)
df
Out[149]:
Country Name Country Code ... 2021 Unnamed: 66
0 Aruba ABW ... NaN NaN
1 Africa Eastern and Southern AFE ... NaN NaN
2 Afghanistan AFG ... NaN NaN
3 Africa Western and Central AFW ... NaN NaN
4 Angola AGO ... NaN NaN
.. ... ... ... ... ...
261 Kosovo XKX ... NaN NaN
262 Yemen, Rep. YEM ... NaN NaN
263 South Africa ZAF ... NaN NaN
264 Zambia ZMB ... NaN NaN
265 Zimbabwe ZWE ... NaN NaN
[266 rows x 67 columns]
df.columns
Out[150]:
Index(['Country Name', 'Country Code', 'Indicator Name', 'Indicator Code',
'1960', '1961', '1962', '1963', '1964', '1965', '1966', '1967', '1968',
'1969', '1970', '1971', '1972', '1973', '1974', '1975', '1976', '1977',
'1978', '1979', '1980', '1981', '1982', '1983', '1984', '1985', '1986',
'1987', '1988', '1989', '1990', '1991', '1992', '1993', '1994', '1995',
'1996', '1997', '1998', '1999', '2000', '2001', '2002', '2003', '2004',
'2005', '2006', '2007', '2008', '2009', '2010', '2011', '2012', '2013',
'2014', '2015', '2016', '2017', '2018', '2019', '2020', '2021',
'Unnamed: 66'],
dtype='object')
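From here, getting back to the original goal (Indonesia only) could look roughly like the sketch below. It runs on a hypothetical in-memory miniature of the file with placeholder values, and it assumes the empty `Unnamed: N` column comes from the trailing comma at the end of every line; filtering on the `IDN` country code is one possible choice, not the only one.

```python
import io

import pandas as pd

# Hypothetical miniature of the downloaded file (assumed layout).
# The numbers are placeholders, not real life-expectancy figures.
sample_csv = (
    '"Data Source","World Development Indicators",\n'
    '\n'
    '"Last Updated Date","2022-12-22",\n'
    '\n'
    '"Country Name","Country Code","Indicator Name","Indicator Code","2019","2020",\n'
    '"Indonesia","IDN","Life expectancy at birth, total (years)","SP.DYN.LE00.IN","1.0","2.0",\n'
    '"Aruba","ABW","Life expectancy at birth, total (years)","SP.DYN.LE00.IN","3.0","4.0",\n'
)

df = pd.read_csv(io.StringIO(sample_csv), skiprows=4)

# Drop columns that are entirely empty, such as the trailing "Unnamed: N" one.
df = df.dropna(axis="columns", how="all")

# Keep only the columns whose names are years.
year_columns = [name for name in df.columns if name.isdigit()]

# Select the Indonesia row by country code and turn it into a Series
# indexed by year.
indonesia = df.loc[df["Country Code"] == "IDN", year_columns].iloc[0]
indonesia.index = indonesia.index.astype(int)

print(indonesia)
```

The resulting Series is indexed by year, so it can be passed straight to a plotting call such as `indonesia.plot()` if matplotlib is installed.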