I am getting started with Vertex AI and am trying to create a custom job. The requirements.txt file contains:
--extra-index-url https://europe-west4-python.pkg.dev/.../europe-west4-python/simple
my_package1==1.2.3
my_package2==4.5.6
In the build log, I get the following output:
Step #1 - "create job": Looking in indexes: https://pypi.org/simple, https://pypi.ngc.nvidia.com, https://europe-west4-python.pkg.dev/.../europe-west4-python/simple
Step #1 - "create job": WARNING: Compute Engine Metadata server unavailable on attempt 1 of 3. Reason: timed out
Step #1 - "create job": WARNING: Compute Engine Metadata server unavailable on attempt 2 of 3. Reason: timed out
Step #1 - "create job": WARNING: Compute Engine Metadata server unavailable on attempt 3 of 3. Reason: timed out
Step #1 - "create job": WARNING: Authentication failed using Compute Engine authentication due to unavailable metadata server.
Step #1 - "create job": WARNING: Failed to retrieve Application Default Credentials: Could not automatically determine credentials. Please set GOOGLE_APPLICATION_CREDENTIALS or explicitly create credentials and re-run the application. For more information, please see https://cloud.google.com/docs/authentication/getting-started
Step #1 - "create job": WARNING: Trying to retrieve credentials from gcloud...
Step #1 - "create job": WARNING: Could not open the configuration file: [/home/.config/gcloud/configurations/config_default].
Step #1 - "create job": ERROR: (gcloud.config.config-helper) You do not currently have an active account selected.
Step #1 - "create job": Please run:
Step #1 - "create job":
Step #1 - "create job": $ gcloud auth login
Step #1 - "create job":
Step #1 - "create job": to obtain new credentials.
Step #1 - "create job":
Step #1 - "create job": If you have already logged in with a different account:
Step #1 - "create job":
Step #1 - "create job": $ gcloud config set account ACCOUNT
Step #1 - "create job":
Step #1 - "create job": to select an already authenticated account to use.
Step #1 - "create job": WARNING: Failed to retrieve credentials from gcloud: gcloud command exited with status: Command '['gcloud', 'config', 'config-helper', '--format=json(credential)']' returned non-zero exit status 1.
Step #1 - "create job": WARNING: Artifact Registry PyPI Keyring: No credentials could be found.
Step #1 - "create job": WARNING: Keyring is skipped due to an exception: Failed to find credentials, Please run: `gcloud auth application-default login or export GOOGLE_APPLICATION_CREDENTIALS=<path/to/service/account/key>`
Step #1 - "create job": User for europe-west4-python.pkg.dev: ERROR: Exception:
Step #1 - "create job": Traceback (most recent call last):
Step #1 - "create job": File "/usr/local/lib/python3.9/dist-packages/pip/_internal/cli/base_command.py", line 160, in exc_logging_wrapper
Step #1 - "create job": status = run_func(*args)
Step #1 - "create job": File "/usr/local/lib/python3.9/dist-packages/pip/_internal/cli/req_command.py", line 247, in wrapper
Step #1 - "create job": return func(self, options, args)
Step #1 - "create job": File "/usr/local/lib/python3.9/dist-packages/pip/_internal/commands/install.py", line 400, in run
Step #1 - "create job": requirement_set = resolver.resolve(
Step #1 - "create job": File "/usr/local/lib/python3.9/dist-packages/pip/_internal/resolution/resolvelib/resolver.py", line 92, in resolve
Step #1 - "create job": result = self._result = resolver.resolve(
Step #1 - "create job": File "/usr/local/lib/python3.9/dist-packages/pip/_vendor/resolvelib/resolvers.py", line 481, in resolve
Step #1 - "create job": state = resolution.resolve(requirements, max_rounds=max_rounds)
Step #1 - "create job": File "/usr/local/lib/python3.9/dist-packages/pip/_vendor/resolvelib/resolvers.py", line 348, in resolve
Step #1 - "create job": self._add_to_criteria(self.state.criteria, r, parent=None)
Step #1 - "create job": File "/usr/local/lib/python3.9/dist-packages/pip/_vendor/resolvelib/resolvers.py", line 172, in _add_to_criteria
Step #1 - "create job": if not criterion.candidates:
Step #1 - "create job": File "/usr/local/lib/python3.9/dist-packages/pip/_vendor/resolvelib/structs.py", line 151, in __bool__
Step #1 - "create job": return bool(self._sequence)
Step #1 - "create job": File "/usr/local/lib/python3.9/dist-packages/pip/_internal/resolution/resolvelib/found_candidates.py", line 155, in __bool__
Step #1 - "create job": return any(self)
Step #1 - "create job": File "/usr/local/lib/python3.9/dist-packages/pip/_internal/resolution/resolvelib/found_candidates.py", line 143, in <genexpr>
Step #1 - "create job": return (c for c in iterator if id(c) not in self._incompatible_ids)
Step #1 - "create job": File "/usr/local/lib/python3.9/dist-packages/pip/_internal/resolution/resolvelib/found_candidates.py", line 44, in _iter_built
Step #1 - "create job": for version, func in infos:
Step #1 - "create job": File "/usr/local/lib/python3.9/dist-packages/pip/_internal/resolution/resolvelib/factory.py", line 279, in iter_index_candidate_infos
Step #1 - "create job": result = self._finder.find_best_candidate(
Step #1 - "create job": File "/usr/local/lib/python3.9/dist-packages/pip/_internal/index/package_finder.py", line 889, in find_best_candidate
Step #1 - "create job": candidates = self.find_all_candidates(project_name)
Step #1 - "create job": File "/usr/local/lib/python3.9/dist-packages/pip/_internal/index/package_finder.py", line 830, in find_all_candidates
Step #1 - "create job": page_candidates = list(page_candidates_it)
Step #1 - "create job": File "/usr/local/lib/python3.9/dist-packages/pip/_internal/index/sources.py", line 134, in page_candidates
Step #1 - "create job": yield from self._candidates_from_page(self._link)
Step #1 - "create job": File "/usr/local/lib/python3.9/dist-packages/pip/_internal/index/package_finder.py", line 790, in process_project_url
Step #1 - "create job": index_response = self._link_collector.fetch_response(project_url)
Step #1 - "create job": File "/usr/local/lib/python3.9/dist-packages/pip/_internal/index/collector.py", line 461, in fetch_response
Step #1 - "create job": return _get_index_content(location, session=self.session)
Step #1 - "create job": File "/usr/local/lib/python3.9/dist-packages/pip/_internal/index/collector.py", line 364, in _get_index_content
Step #1 - "create job": resp = _get_simple_response(url, session=session)
Step #1 - "create job": File "/usr/local/lib/python3.9/dist-packages/pip/_internal/index/collector.py", line 135, in _get_simple_response
Step #1 - "create job": resp = session.get(
Step #1 - "create job": File "/usr/local/lib/python3.9/dist-packages/pip/_vendor/requests/sessions.py", line 600, in get
Step #1 - "create job": return self.request("GET", url, **kwargs)
Step #1 - "create job": File "/usr/local/lib/python3.9/dist-packages/pip/_internal/network/session.py", line 518, in request
Step #1 - "create job": return super().request(method, url, *args, **kwargs)
Step #1 - "create job": File "/usr/local/lib/python3.9/dist-packages/pip/_vendor/requests/sessions.py", line 587, in request
Step #1 - "create job": resp = self.send(prep, **send_kwargs)
Step #1 - "create job": File "/usr/local/lib/python3.9/dist-packages/pip/_vendor/requests/sessions.py", line 708, in send
Step #1 - "create job": r = dispatch_hook("response", hooks, r, **kwargs)
Step #1 - "create job": File "/usr/local/lib/python3.9/dist-packages/pip/_vendor/requests/hooks.py", line 30, in dispatch_hook
Step #1 - "create job": _hook_data = hook(hook_data, **kwargs)
Step #1 - "create job": File "/usr/local/lib/python3.9/dist-packages/pip/_internal/network/auth.py", line 270, in handle_401
Step #1 - "create job": username, password, save = self._prompt_for_password(parsed.netloc)
Step #1 - "create job": File "/usr/local/lib/python3.9/dist-packages/pip/_internal/network/auth.py", line 233, in _prompt_for_password
Step #1 - "create job": username = ask_input(f"User for {netloc}: ")
Step #1 - "create job": File "/usr/local/lib/python3.9/dist-packages/pip/_internal/utils/misc.py", line 204, in ask_input
Step #1 - "create job": return input(message)
Step #1 - "create job": EOFError: EOF when reading a line
Step #1 - "create job": The command '/bin/sh -c pip install --no-cache-dir -r ./requirements.txt' returned a non-zero code: 2
Step #1 - "create job": ERROR: (gcloud.ai.custom-jobs.create)
Step #1 - "create job": Docker failed with error code 2.
Step #1 - "create job": Command: docker build --no-cache -t gcr.io/.../cloudai-autogenerated/...:20221212.14.42.28.274055 --rm -f- .
Step #1 - "create job":
The keyrings.google-artifactregistry-auth package is installed.
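Both service-...@gcp-sa-aiplatform-cc.iam.gserviceaccount.com and the service account I specified in the build trigger have read access to Artifact Registry. I also tried this locally and hit the same problem on my PC.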
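My first guess was that the Vertex AI container has no network connectivity, but I can at least reach the Google homepage. However, requests to metadata.google.internal time out.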
I tried adding network = "default" and network = "cloudbuild" (I had read about both) to the config YAML file I use to create the custom job, but I still get the error.
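Furthermore, I added some debug output via RUN and ONBUILD RUN to the Dockerfile of my base image. I can see that the former has the build trigger's project and service account set, but the docker build performed by gcloud ai custom-jobs create no longer has them.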
Is there another way, other than hard-coding a service account's access key into the base image?
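I don't use Vertex AI, but in general on GCP, if you want to use Python packages from Artifact Registry, there are two ways (the documentation is complete and walks through the steps for each):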
- Keyring (the more recommended option)
- Authentication with a service account JSON key
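The keyring option can be sketched roughly as follows. This is only an illustration and assumes that pip is on the PATH; the function names setup_keyring and describe_credentials are hypothetical helpers, not part of any Google tooling. The keyring backend still needs credentials from somewhere (for example GOOGLE_APPLICATION_CREDENTIALS, as the warnings in the log above suggest), so the second helper simply reports whether a key file is configured.

```shell
#!/bin/sh
# Sketch of the keyring approach. Nothing is installed until
# setup_keyring is called explicitly.
set -eu

setup_keyring() {
    # Install the keyring library and the Artifact Registry backend.
    pip install keyring keyrings.google-artifactregistry-auth
    # List the available backends so you can check that the
    # Artifact Registry backend was picked up.
    keyring --list-backends
}

# Report whether a key file is configured for Application Default Credentials.
describe_credentials() {
    if [ -n "${GOOGLE_APPLICATION_CREDENTIALS:-}" ]; then
        echo "key file: $GOOGLE_APPLICATION_CREDENTIALS"
    else
        echo "no key file set; relying on other credential sources"
    fi
}

describe_credentials
```

Calling setup_keyring inside the image build installs the backend; whether it can authenticate then depends on which credentials are visible to that build step.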
Either way, you end up with a pip.conf file that contains an extra index URL pointing at the Artifact Registry repository.
If you use the method with the JSON key encoded as base64, the following command prints the pip.conf settings for you:
gcloud artifacts print-settings python --project=PROJECT \
    --repository=REPOSITORY \
    --location=LOCATION --json-key=KEY-FILE
In this case, you must follow the best practices for handling JSON keys.
In all cases, you finally have to copy the pip.conf file to the location where pip expects it, so that Vertex AI is able to download packages from Artifact Registry.
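As a minimal sketch of that last step, the script below writes a pip.conf and points pip at it. The helper write_pip_conf and the values my-project and my-repo are hypothetical placeholders. The URL is shown without embedded credentials, so it only fits the keyring variant; with the JSON key variant you would paste the settings printed by gcloud instead.

```shell
#!/bin/sh
# Sketch: write a pip.conf and make pip read it.
set -eu

write_pip_conf() {
    # $1 = target file, $2 = location, $3 = project, $4 = repository
    mkdir -p "$(dirname "$1")"
    cat > "$1" <<EOF
[global]
extra-index-url = https://${2}-python.pkg.dev/${3}/${4}/simple/
EOF
}

# A scratch directory is used here. Inside an image the usual targets
# are /etc/pip.conf (global) or ~/.config/pip/pip.conf (per user).
CONF_DIR="$(mktemp -d)"
write_pip_conf "$CONF_DIR/pip.conf" "europe-west4" "my-project" "my-repo"

# PIP_CONFIG_FILE makes pip load this file in addition to the defaults.
export PIP_CONFIG_FILE="$CONF_DIR/pip.conf"
cat "$PIP_CONFIG_FILE"
```

In a Dockerfile, the equivalent would be to copy the generated file to /etc/pip.conf before the pip install step runs.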