How to install libraries permanently in Colab



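In Google Colaboratory, I can install a new library using !pip install package-name. But when I open the notebook again tomorrow, I have to reinstall it every time.

Is there a way to install a library permanently, so I don't have to spend time installing it on every use?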

Yes. You can install the library in Google Drive, then add its path to sys.path.

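import os, sys
from google.colab import drive
drive.mount('/content/drive')
# Link the Drive folder to a path without spaces
nb_path = '/content/notebooks'
# os.symlink raises FileExistsError if the link already exists, so create it only once
if not os.path.lexists(nb_path):
    os.symlink('/content/drive/My Drive/Colab Notebooks', nb_path)
sys.path.insert(0, nb_path)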

Then you can install a library, for example jdc, and specify the target directory.

!pip install --target=$nb_path jdc

Later, when you run the notebook again, you can skip the !pip install line. You can just import jdc and use it. Here is an example notebook.

https://colab.research.google.com/drive/1kpmdi9cjimudrzxsytdaurjtbahzivjq
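Instead of commenting the install line in and out by hand, the check can be automated. The following is only a minimal sketch, not part of the original answer: the helper names is_importable and ensure_installed are hypothetical, and it assumes the target directory (for example nb_path) lives on the mounted Drive.

```python
import importlib
import importlib.util
import subprocess
import sys


def is_importable(module_name):
    """Return True if a top-level module can be found on the current sys.path."""
    return importlib.util.find_spec(module_name) is not None


def ensure_installed(package, target, module_name=None):
    """Install `package` into `target` only if its module is not importable yet.

    Returns True if pip was run, False if the module was already available.
    """
    module_name = module_name or package
    # Make sure the persistent directory is searched first
    if target not in sys.path:
        sys.path.insert(0, target)
    importlib.invalidate_caches()
    if is_importable(module_name):
        return False
    # Equivalent to: !pip install --target=<target> <package>
    subprocess.check_call(
        [sys.executable, "-m", "pip", "install", "--target", target, package]
    )
    importlib.invalidate_caches()
    return True


# Example (hypothetical usage in a notebook cell):
# ensure_installed("jdc", "/content/notebooks")
```

On the first run this would invoke pip; on later runs, once the package is already in the Drive folder, it should return without installing anything.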

By the way, I really like jdc's %%add_to. It makes working with a big class much easier.

If you want a solution with no authorization prompt, you can mount a bucket using gcsfuse with a service-account key embedded in the notebook. Like this:

%%capture
# first install gcsfuse (the %%capture cell magic must be the first line of the cell)
!echo "deb http://packages.cloud.google.com/apt gcsfuse-bionic main" > /etc/apt/sources.list.d/gcsfuse.list
!curl https://packages.cloud.google.com/apt/doc/apt-key.gpg | apt-key add -
!apt update
!apt install gcsfuse

Then get your service account key from the Google Cloud Console and embed it in the notebook.

%%writefile /key.json
{
  "type": "service_account",
  "project_id": "kora-id",
  "private_key_id": "xxxxxxx",
  "private_key": "-----BEGIN PRIVATE KEY-----\nxxxxxxx==\n-----END PRIVATE KEY-----\n",
  "client_email": "colab-7@kora-id.iam.gserviceaccount.com",
  "client_id": "100380920993833371482",
  "auth_uri": "https://accounts.google.com/o/oauth2/auth",
  "token_uri": "https://oauth2.googleapis.com/token",
  "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
  "client_x509_cert_url": "https://www.googleapis.com/robot/v1/metadata/x509/colab-7%40kora-id.iam.gserviceaccount.com"
}

Then set the environment variable so that the credentials file can be found.

%env GOOGLE_APPLICATION_CREDENTIALS=/key.json

You must create (or already have) a GCS bucket, and mount it on a made-up directory.

!mkdir /content/my-bucket
!gcsfuse my-bucket /content/my-bucket
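If the mount fails silently, a later pip install --target would write to the ordinary, temporary Colab disk instead of the bucket. A small guard can catch that. This is a sketch under assumptions: require_mount is a hypothetical helper name, and it relies on the mounted directory showing up as a regular mount point, which os.path.ismount can detect.

```python
import os


def require_mount(path):
    """Return `path` if it is a mount point, otherwise raise RuntimeError."""
    if not os.path.ismount(path):
        raise RuntimeError(
            f"{path} is not a mount point; mount the bucket before installing into it"
        )
    return path


# Example (hypothetical usage after the gcsfuse command):
# nb_path = require_mount('/content/my-bucket')
```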

Finally, install the library there, just like in my answer above.

import sys
nb_path = '/content/my-bucket'
sys.path.insert(0, nb_path)
# Do this just once
!pip install --target=$nb_path jdc

You can now import jdc without running !pip install.

If you need to install multiple libraries, here is a snippet:

import os, sys

def install_library_to_drive(libraries_list):
  """ Install each library into its own directory on gdrive. Run this only once. """
  drive_path_root = 'path/to/mounted/drive/directory/where/you/will/install/libraries'
  for lib in libraries_list:
    # os.path.join inserts the path separator that plain string concatenation would miss
    drive_path_lib = os.path.join(drive_path_root, lib)
    !pip install -q $lib --target=$drive_path_lib
    sys.path.insert(0, drive_path_lib)

def load_library_from_drive(libraries_list):
  """ Technically, it just prepends each install dir to sys.path """
  drive_path_root = 'path/to/mounted/drive/directory/where/you/will/install/libraries'
  for lib in libraries_list:
    drive_path_lib = os.path.join(drive_path_root, lib)
    sys.path.insert(0, drive_path_lib)

libraries_list = ["torch", "jsonlines", "transformers"] # list your libraries
install_library_to_drive(libraries_list) # Run this just once
load_library_from_drive(libraries_list)
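To find out which libraries still need the one-time install, the loading step can report directories that do not exist yet. The following is a rough sketch, assuming the same one-directory-per-library layout as the snippet above; add_library_dirs is a hypothetical name, not something from the original answer.

```python
import os
import sys


def add_library_dirs(root, libraries):
    """Prepend root/<lib> to sys.path for each library that has a directory.

    Returns the list of libraries whose directory is missing under `root`.
    """
    missing = []
    for lib in libraries:
        lib_dir = os.path.join(root, lib)
        if not os.path.isdir(lib_dir):
            missing.append(lib)
        elif lib_dir not in sys.path:
            sys.path.insert(0, lib_dir)
    return missing


# Example (hypothetical usage):
# missing = add_library_dirs(drive_path_root, ["torch", "jsonlines", "transformers"])
# Only the libraries listed in `missing` would then need the install step.
```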

I installed libraries permanently in Google Colab using a virtual environment. I used this blog post as a reference: https://netraneupane.medium.com/how-to-install-libraries-permanase-in-google-colab-fb15a5858a5a5
