I am trying to read a bunch of csv files from multiple subfolders using the following code:
for csv in glob.glob('./data/*/*.csv', recursive=True): # all csv files in ./data
    vname = 'data_' + csv.split('/')[3].split('.')[0].lower() # variable names created from lowercased filenames
    print(csv, '-->', vname) # test print csv-path and variable (for debugging)
    exec("{0} = {1}".format(vname, pd.read_csv(csv, encoding='latin1'))) # initialize data from csvs to varaible names
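I tried reading the csv on a separate line and passing a tmp variable as the argument to format, but that did not work either. Reading the csv file on its own works, and so does assigning an integer with exec("{0} = {1}".format(vname, 2)). I cannot understand why I keep getting the following syntax error: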
Traceback (most recent call last):
File "/home/seb/.anaconda3/lib/python3.7/site-packages/IPython/core/interactiveshell.py", line 3326, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "<ipython-input-173-4c89ac367e67>", line 4, in <module>
exec("{0} = {1}".format(vname, pd.read_csv(csv, encoding='latin1'))) # initialize data from csvs to varaible names
File "<string>", line 1
data_sharks_rays_chimaeras = id_no binomial presence origin seasonal
^
SyntaxError: invalid syntax
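The problem is that you are passing the result of pd.read_csv as a string format argument. format inserts the printed text of the DataFrame into the code string (the column headers visible in the traceback), and that text is not valid Python, so exec raises a SyntaxError.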
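To see why the generated code is invalid without needing pandas or any csv files, here is a minimal sketch. The `Table` class and the name `data_demo` are hypothetical stand-ins for a DataFrame and a generated variable name; the only assumption is that, like a DataFrame, the object's `str()` is a printed table.

```python
class Table:
    """Hypothetical stand-in for a DataFrame: str() gives a printed table."""

    def __str__(self):
        return "   id_no binomial\n0      1   Manta"


# format() calls str() on the object, so the table text lands in the code.
source = "{0} = {1}".format("data_demo", Table())
print(source)

try:
    exec(source)
except SyntaxError as err:
    # The first line reads `data_demo = id_no binomial`, which is not Python.
    print("SyntaxError:", err.msg)
```

Printing `source` before running it is a quick way to debug any exec call: whatever appears there is exactly what Python tries to parse.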
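You could try putting the call itself, rather than its result, into the string:
exec("{0} = pd.read_csv({1!r}, encoding='latin1')".format(vname, csv)) # {1!r} inserts the path as a quoted string literal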
However, using exec for this kind of task is not recommended (see here for some pointers). You can use a dictionary instead:
import glob
import pandas as pd

data = {} # maps each generated name to its DataFrame
for csv in glob.glob('./data/*/*.csv', recursive=True):
    vname = 'data_' + csv.split('/')[3].split('.')[0].lower()
    data[vname] = pd.read_csv(csv, encoding='latin1')
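The key built with csv.split('/')[3] only works when every file sits exactly one folder below ./data. As a hedged variation, the sketch below derives the key with pathlib instead. The helper names `frame_key` and `load_all` and the `reader` parameter are made up for illustration; `reader` is whatever function loads one file, for example a pd.read_csv call. Note that `Path.stem` keeps inner dots, so `a.b.csv` gives `data_a.b` rather than `data_a`.

```python
import glob
from pathlib import Path


def frame_key(csv_path):
    """Build a dictionary key such as 'data_sharks' from a CSV path."""
    # Path.stem drops the directories and the final extension, so the key
    # does not depend on how deep the file sits under ./data.
    return "data_" + Path(csv_path).stem.lower()


def load_all(pattern, reader):
    """Read every file matching pattern with reader, keyed by frame_key."""
    return {frame_key(p): reader(p) for p in sorted(glob.glob(pattern))}


# Hypothetical usage with pandas, mirroring the loop above:
#   import pandas as pd
#   data = load_all('./data/*/*.csv',
#                   lambda p: pd.read_csv(p, encoding='latin1'))
#   data['data_sharks_rays_chimaeras'].head()
```

Each frame is then looked up by name, as in data['data_sharks_rays_chimaeras'], and the whole set can be looped over with data.items(), which is awkward to do with variables created by exec.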