Unknown syntax error when reading multiple CSV files



I am trying to read a bunch of CSV files from multiple subfolders with the following code:

for csv in glob.glob('./data/*/*.csv', recursive=True):  # all csv files in ./data
    vname = 'data_' + csv.split('/')[3].split('.')[0].lower()  # variable names created from lowercased filenames
    print(csv, '-->', vname)  # test print csv-path and variable (for debugging)
    exec("{0} = {1}".format(vname, pd.read_csv(csv, encoding='latin1')))  # initialize data from csvs to variable names

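I tried reading the CSV on a separate line and passing a tmp variable as the format argument, but that did not work either. Reading the CSV file on its own works, and so does assigning an integer with exec("{0} = {1}".format(vname, 2)). I can't understand why I keep getting the following syntax error: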

Traceback (most recent call last):
  File "/home/seb/.anaconda3/lib/python3.7/site-packages/IPython/core/interactiveshell.py", line 3326, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-173-4c89ac367e67>", line 4, in <module>
    exec("{0} = {1}".format(vname, pd.read_csv(csv, encoding='latin1'))) # initialize data from csvs to variable names
  File "<string>", line 1
    data_sharks_rays_chimaeras =           id_no                 binomial presence origin seasonal
      ^
SyntaxError: invalid syntax

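The problem is that you are passing the result of pd.read_csv as a string-format argument. format() converts the DataFrame to its printed text, so the string handed to exec is the variable name followed by a table of column headers and values, which is not valid Python. With an integer the formatted text is a valid literal, which is why that case works.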
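To make the failure concrete, here is a minimal sketch that builds the same kind of string without touching any files. The two-row DataFrame and the name data_demo are made up for illustration; they are not the asker's data.

```python
import pandas as pd

# Hypothetical stand-in for one of the CSV files.
df = pd.DataFrame({"id_no": [1, 2], "binomial": ["a b", "c d"]})

# format() inserts str(df), i.e. the printed table, not code that builds it.
source = "{0} = {1}".format("data_demo", df)
print(source)

def is_valid_python(text):
    """Return True if `text` compiles as Python statements."""
    try:
        compile(text, "<string>", "exec")
        return True
    except SyntaxError:
        return False

print(is_valid_python(source))                                # the table text does not parse
print(is_valid_python("{0} = {1}".format("data_demo", 2)))    # "data_demo = 2" does
```

The first line of the generated text starts with the variable name, an equals sign, and then the column headers, which matches the line the traceback points at.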

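You could try this instead, which puts the call itself (with the path as a quoted string) into the code that exec runs: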

exec("{0} = pd.read_csv({1!r}, encoding='latin1')".format(vname, csv))

However, using exec for this kind of task is not recommended (see here for some pointers). You can use a dictionary instead:

import glob
import pandas as pd

data = {}  # maps the generated names to DataFrames
for csv in glob.glob('./data/*/*.csv', recursive=True):
    vname = 'data_' + csv.split('/')[3].split('.')[0].lower()
    data[vname] = pd.read_csv(csv, encoding='latin1')
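As a variation on the dictionary approach, the sketch below wraps the loop in a function and derives the key with pathlib rather than by splitting the path on '/'. The function name load_csvs and its parameters are hypothetical, chosen only for this example. One difference to be aware of: path.stem strips only the final extension, so a file named a.b.csv would get the key data_a.b rather than data_a.

```python
from pathlib import Path

import pandas as pd

def load_csvs(root="./data", encoding="latin1"):
    """Read every CSV one folder level below `root` into a dict.

    Keys follow the 'data_<lowercased file name without extension>' scheme.
    """
    data = {}
    for path in sorted(Path(root).glob("*/*.csv")):
        key = "data_" + path.stem.lower()
        data[key] = pd.read_csv(path, encoding=encoding)
    return data
```

A table is then looked up by its key, for example data = load_csvs() followed by data['data_sharks_rays_chimaeras'].head(). If the files sit at varying depths, Path(root).rglob("*.csv") would pick them up at any level.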
