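I am working on a project where, because of company compliance rules, the data must be kept in a shared directory that is synced between programmers. The project code, on the other hand, cannot live in the shared directory: everything there is synced, so we would be unable to version it and work on it together. The path to the shared folder is almost the same for everyone: C:\Users\<employee name>\<path to data>. Is there a way to set C:\Users\<employee name> as the base path of my data directory in Kedro?
I tried creating a catalog.py file with the following code: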
from pathlib import Path

from kedro.io import DataCatalog
from kedro.extras.datasets.pandas import (
    CSVDataSet,
    ExcelDataSet,
)

# Shared data folder, resolved under the current user's home directory
DEFAULT_DATA_PATH = Path.expanduser(
    Path(
        "~",
        "Path to Data"
    )
)

DATA_CATALOG = DataCatalog(
    {
        "data": ExcelDataSet(
            # Uses DEFAULT_DATA_PATH (the name defined above)
            filepath=Path(DEFAULT_DATA_PATH, "data.xlsx").as_uri()
        )
    }
)
Then in settings.py I added this:
from .catalog import DATA_CATALOG
DATA_CATALOG_CLASS = DATA_CATALOG
and then I got the following error:
Traceback (most recent call last):
  File "...\Miniconda3\Scripts\kedro-script.py", line 9, in <module>
    sys.exit(main())
  File "...\Miniconda3\lib\site-packages\kedro\framework\cli\cli.py", line 205, in main
    cli_collection = KedroCLI(project_path=Path.cwd())
  File "...\Miniconda3\lib\site-packages\kedro\framework\cli\cli.py", line 114, in __init__
    self._metadata = bootstrap_project(project_path)
  File "...\Miniconda3\lib\site-packages\kedro\framework\startup.py", line 155, in bootstrap_project
    configure_project(metadata.package_name)
  File "...\Miniconda3\lib\site-packages\kedro\framework\project\__init__.py", line 166, in configure_project
    settings.configure(settings_module)
  File "...\Miniconda3\lib\site-packages\dynaconf\base.py", line 223, in configure
    self._wrapped = Settings(settings_module=settings_module, **kwargs)
  File "...\Miniconda3\lib\site-packages\dynaconf\base.py", line 271, in __init__
    self.validators.validate()
  File "...\Miniconda3\lib\site-packages\dynaconf\validator.py", line 318, in validate
    validator.validate(self.settings)
  File "...\Miniconda3\lib\site-packages\kedro\framework\project\__init__.py", line 34, in validate
    if not issubclass(setting_value, default_class):
TypeError: issubclass() arg 1 must be a class
DATA_CATALOG_CLASS expects a class, but you are giving it an instance of the data catalog, hence the error.
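The class-versus-instance distinction can be reproduced without Kedro installed. This is only a sketch: the DataCatalog below is a stand-in class, and check_setting is a hypothetical helper that imitates the issubclass check visible in the traceback, not Kedro's actual validator.

```python
class DataCatalog:
    """Stand-in for kedro.io.DataCatalog so the example runs without Kedro."""


class MyCatalog(DataCatalog):
    """A custom subclass: the kind of value the setting is meant to hold."""


def check_setting(value, default_class=DataCatalog):
    # Hypothetical helper imitating the check shown in the traceback.
    if not issubclass(value, default_class):
        raise ValueError(f"{value!r} is not a subclass of {default_class.__name__}")
    return value


# Passing a class (or subclass) is accepted.
accepted = check_setting(MyCatalog)

# Passing an instance fails inside issubclass() itself.
try:
    check_setting(DataCatalog())
except TypeError as exc:
    message = str(exc)
    print(message)
```

So the setting is suited to swapping in a custom catalog class, not to registering a ready-made catalog object.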
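I think the approach here is to use TemplatedConfigLoader and pass the shared directory in as a variable. You can supply this SHARE_DIR either through globals.yml or simply as a variable.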
In your catalog.yml:
some_data:
  type: pandas.CSVDataSet
  filepath: ${SHARE_DIR}/file_name
See the documentation for more details: https://kedro.readthedocs.io/en/stable/kedro.config.TemplatedConfigLoader.html
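To show what the templating amounts to, here is a minimal stdlib-only sketch: the per-user base path is computed once and substituted into the ${SHARE_DIR} placeholder. string.Template is only a stand-in for Kedro's own substitution, and the folder name path/to/data and the helpers share_dir and resolve are hypothetical. In a Kedro 0.18-style project I believe the equivalent is to set CONFIG_LOADER_CLASS = TemplatedConfigLoader in settings.py and pass the value through globals_dict in CONFIG_LOADER_ARGS, but check that against the version you have installed.

```python
from pathlib import Path
from string import Template


def share_dir(relative: str = "path/to/data") -> str:
    """Return the per-user shared directory as a forward-slash path string.

    The default `relative` value is a hypothetical placeholder.
    """
    return (Path.home() / relative).as_posix()


# Same shape as the catalog.yml entry, held as plain Python data.
catalog_entry = {
    "type": "pandas.CSVDataSet",
    "filepath": "${SHARE_DIR}/file_name",
}


def resolve(entry: dict, variables: dict) -> dict:
    """Substitute ${NAME} placeholders in every string value of the entry."""
    return {key: Template(value).substitute(variables) for key, value in entry.items()}


resolved = resolve(catalog_entry, {"SHARE_DIR": share_dir()})
print(resolved["filepath"])
```

Because the base path is derived from the home directory at load time, each programmer gets their own C:\Users\<employee name> prefix while the catalog entry itself stays identical in version control.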