NameError when accessing an imported Python package from an imported Python script



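Problem statement:

I am writing a program in a Jupyter Notebook that dynamically writes another script (script.py). After script.py has been written, the function that wrote the file runs it through an import statement and then calls a function from script.py.

I need to use pandas in script.py, so I import it at the top of script.py. I get NameError: name 'pd' is not defined right after the import pandas as pd at the top of script.py is executed. I originally tried omitting the import statement, since it has already been executed in the calling program, but I got the same error. I also tried putting the import statement inside the function in script.py, but I got the same error.

Update 2, solved: The code works now. The only thing I am sure I did was walk away, come back, type %debug, then restart the kernel and run all cells. %debug could not find a traceback to debug. I guess you could call it magic, but maybe it was restarting the kernel. Magic makes more sense to me, lol.

Update 1: The original example code did not actually reproduce the error. Had I test-run it, I would have isolated the problem in my actual code better. My bad. I still cannot solve the problem, but something seems to be wrong with the loop that builds the write statements, because running similar code once without the loop works.

Here is my real code: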

import os
import pandas as pd

def read_files_in_folder(fp_list, path=None, arg_list=None):
    '''Reads a folder of csv tables into a dictionary of dataframes.
    Does this dynamically by writing a script to a file, importing the script,
    and running a function from the script.
    Parameters:
    fp_list is [str]: list of filenames or filepaths of csv files.
    path is str: (optional) filepath str filenames. os.curdir if None.
    arg_list is [str]: (optional) list of pd.read_csv() arguments to pass.
    Returns:
    df_dict is {pd.DataFrame}: dict of dataframes created from csv files.'''

    df_dict = {}

    if path is None:
        path = os.curdir

    if arg_list is None:
        for fp in fp_list:
            fp_var_name = fp.split('/')[-1].split('.')[0]
            df_dict[fp_var_name] = pd.read_csv(path + fp)
    else:
        args = ''
        for arg in arg_list:
            args += ', ' + arg
        with open('script.py', 'w') as file:
            file.write("""
import pandas as pd
def csvs_to_df_dict():
\tdf_dict = {}
""")
            for fp in fp_list:
                fp_var_name = fp.split('/')[-1].split('.')[0]
                statement = "\tdf_dict['" + fp_var_name + "'] = pd.read_csv('" + path + fp + "'" + args + ")\n"
                file.write(statement)
            file.write('\treturn df_dict')
        import script
        df_dict = script.csvs_to_df_dict()

    return df_dict

Then I execute:

csv_path = os.curdir + '/csv_tables/'
filename_list = os.listdir(path=csv_path)
df_dict = read_files_in_folder(fp_list=filename_list, path=csv_path,
                               arg_list=['index_col=0','skip_blank_lines=False'])
df_dict['abscorrup_idea.csv']

This writes script.py:


import pandas as pd
def csvs_to_df_dict():
    df_dict = {}
    df_dict['abscorrup_idea'] = pd.read_csv('./csv_tables/abscorrup_idea.csv', index_col=0, skip_blank_lines=False)
    # ... ... ...
    df_dict['sorigeq_idea'] = pd.read_csv('./csv_tables/sorigeq_idea.csv', index_col=0, skip_blank_lines=False)
    return df_dict

However, as soon as execution enters script.py from df_dict = script.csvs_to_df_dict(), it raises NameError: name 'pd' is not defined, apparently right after the import pandas as pd in script.py. See the full error output below.

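If you do not pass arg_list, so that script.py is never created in the first place, it works. So it works for my immediate use, but I would like to understand why it does not work the other way.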
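As a side note on the arg_list path: the strings in arg_list could in principle be turned into a keyword-argument dict and unpacked into the call, which would avoid generating a script at all. The sketch below is only an illustration under that assumption; parse_arg_list is a hypothetical helper name that does not appear in the original code, and it only handles simple literal values such as numbers, strings, and booleans.

```python
import ast

def parse_arg_list(arg_list):
    """Turn strings like 'index_col=0' into a dict of keyword arguments.

    Hypothetical helper, not part of the original post. Values are parsed
    with ast.literal_eval, so only Python literals are accepted.
    """
    kwargs = {}
    for arg in arg_list:
        # Split on the first '=' only, so values may themselves contain '='.
        name, _, value = arg.partition('=')
        kwargs[name.strip()] = ast.literal_eval(value.strip())
    return kwargs

kwargs = parse_arg_list(['index_col=0', 'skip_blank_lines=False'])
print(kwargs)
```

With such a dict, the loop in the arg_list is None branch could be reused with a call of the form pd.read_csv(path + fp, **kwargs), since Python unpacks a dict into keyword arguments with the ** operator.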

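I originally tried writing script.py as a series of statements rather than a function. I thought it would run as if the block of code had been inserted into the code that calls it, but I could not reach df_dict from one script in the other. Different namespaces? So I am trying a function instead.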
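On the namespace question: each imported module does have its own global namespace, so names defined in the importing code are not visible inside the module, and names defined at module level are reached as attributes of the module object. The following is a minimal sketch of that behavior; the module name ns_demo and the variable caller_value are made up for the illustration, and the file is written to a temporary folder.

```python
import importlib
import os
import sys
import tempfile

# Write a small throwaway module into a temporary folder on sys.path.
workdir = tempfile.mkdtemp()
sys.path.insert(0, workdir)
with open(os.path.join(workdir, 'ns_demo.py'), 'w') as file:  # hypothetical module
    file.write(
        "result = {'made_in': 'ns_demo'}\n"
        "def peek():\n"
        "    return caller_value\n"  # this name exists only in the importing code
    )
importlib.invalidate_caches()

caller_value = 42  # defined in the importing namespace only
import ns_demo

# Names created at module level are attributes of the module object.
from_module = ns_demo.result

# The module cannot see names that live in the importer's namespace.
try:
    ns_demo.peek()
    lookup_failed = False
except NameError:
    lookup_failed = True

print(from_module, lookup_failed)
```

This is also why an import pandas as pd in the calling notebook does not make pd available inside script.py: the generated file needs its own import.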

Here is the full error output:

---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-26-13999e7ca3af> in <module>
----> 1 df_dict = read_files_in_folder(fp_list=filename_list, path=csv_path,
      2                                arg_list=['index_col=0','skip_blank_lines=False'])

<ipython-input-25-4f1e04e89145> in read_files_in_folder(fp_list, path, arg_list)
     35             file.write('\treturn df_dict')
     36         import script
---> 37         df_dict = script.csvs_to_df_dict()
     38 
     39     return df_dict

~\OneDrive\Education\WGU\C749_intro_to_data_science\Module_3_Investigate_A_Dataset\Project\script.py in csvs_to_df_dict()
      1 
      2 import pandas as pd
----> 3 
      4 def csvs_to_df_dict():
      5     df_dict = {}

NameError: name 'pd' is not defined

The original example from before the update, cleaned up and running correctly:

For example:

# script1.py #
import pandas as pd
# The following is actually part of a function
# that is called later in the same script1,
# but I'm keeping it simple for the example.
df_dict = {}
with open('script2.py', 'w') as file:
    file.write("""
# script2.py #
import pandas as pd
def run_it():
\tdf_dict = {}
""")
    path = './csv_tables/'
    fn = 'abscorrup_idea.csv'
    file.write("\tdf_dict['abscorrup_idea'] = pd.read_csv('" + path + fn + "', index_col=0, skip_blank_lines=False)\n")
    file.write('\treturn df_dict')
import script2
df_dict = script2.run_it()
df_dict

This writes the following file, runs it, and calls the function:


# script2.py #
import pandas as pd
def run_it():
    df_dict = {}
    df_dict['abscorrup_idea'] = pd.read_csv('./csv_tables/abscorrup_idea.csv', index_col=0, skip_blank_lines=False)
    return df_dict

I tried to reproduce your error but could not. When I simply copy and paste your code, I get a SyntaxError because your escaping is off. But this

with open('script2.py', 'w') as file:
    file.write("""
# script2.py #
import pandas as pd
def run_it():
    df_dict = {}
    df_dict["test"] = pd.DataFrame(data={"test":[1,2,3]})
    return df_dict
""")
import script2
df_dict = script2.run_it()
df_dict["test"]

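runs perfectly well on my machine. Note that I had to use a different example dataframe, because I do not have your csv files.

As shown in the update to the post, the following code works. Restarting the kernel seems to have done the trick. That, or magic.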

import os
import pandas as pd

def read_files_in_folder(fp_list, path=None, arg_list=None):
    '''Reads a folder of csv tables into a dictionary of dataframes.
    Does this dynamically by writing a script to a file, importing the script,
    and running a function from the script.
    Parameters:
    fp_list is [str]: list of filenames or filepaths of csv files.
    path is str: (optional) filepath str filenames. os.curdir if None.
    arg_list is [str]: (optional) list of pd.read_csv() arguments to pass.
    Returns:
    df_dict is {pd.DataFrame}: dict of dataframes created from csv files.'''

    df_dict = {}

    if path is None:
        path = os.curdir

    if arg_list is None:
        for fp in fp_list:
            fp_var_name = fp.split('/')[-1].split('.')[0]
            df_dict[fp_var_name] = pd.read_csv(path + fp)
    else:
        args = ''
        for arg in arg_list:
            args += ', ' + arg
        with open('script.py', 'w') as file:
            file.write("""
import pandas as pd
def csvs_to_df_dict():
\tdf_dict = {}
""")
            for fp in fp_list:
                fp_var_name = fp.split('/')[-1].split('.')[0]
                statement = "\tdf_dict['" + fp_var_name + "'] = pd.read_csv('" + path + fp + "'" + args + ")\n"
                file.write(statement)
            file.write('\treturn df_dict')
        import script
        df_dict = script.csvs_to_df_dict()

    return df_dict
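
A plausible explanation for why restarting the kernel helped, although the post does not confirm it, is Python's module cache. Once a module has been imported in a session, a later import statement for the same name returns the cached module from sys.modules and does not re-read the file. If an earlier version of script.py (for instance one written without the pandas import) had already been imported, rewriting the file would not change the code that runs until the kernel is restarted or the module is reloaded. The sketch below demonstrates the effect with a made-up module name, demo_script, in a temporary folder, and uses importlib.reload to pick up the rewritten file.

```python
import importlib
import os
import sys
import tempfile

# Work in a temporary folder so the demo does not touch real files.
workdir = tempfile.mkdtemp()
sys.path.insert(0, workdir)
module_path = os.path.join(workdir, 'demo_script.py')  # hypothetical module

def write_module(source):
    """(Re)write the module file and let the import system notice it."""
    with open(module_path, 'w') as file:
        file.write(source)
    importlib.invalidate_caches()

write_module("def answer():\n    return 'first version'\n")
import demo_script
first = demo_script.answer()

# Rewrite the file. The new source has a different length, so any cached
# bytecode for the old version is treated as out of date.
write_module("def answer():\n    return 'second version, longer'\n")

import demo_script  # does nothing new: the module is already in sys.modules
stale = demo_script.answer()

demo_script = importlib.reload(demo_script)  # re-executes the file on disk
fresh = demo_script.answer()

print(first)  # first version
print(stale)  # first version
print(fresh)  # second version, longer
```

Under this assumption, calling importlib.reload(script) after import script inside read_files_in_folder would make each call use the freshly written file without a kernel restart. It could also account for the traceback arrow pointing at a blank line: the traceback displays the text of the file currently on disk, while the code being executed would come from the older, cached version.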
