Can you convert an existing git repository into a "blobless" repository?



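Today, git offers a "partial clone" option that lets you download a repository's commits and trees while fetching blobs on demand, which saves network bandwidth and disk space.

This can be enabled by passing --filter=blob:none to the initial git clone. But is there a way to convert an already existing local repository to the "blobless" format? Doing so could save some disk space by deleting any local blobs that are known to be available from the "promisor" remote.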

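While I'm not aware of a dedicated in-place command, you can still do a local clone and then replace the original folder with the blobless copy.
(Inspired by knittl's comment)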

# If it's your first time, you'd need to enable filtering locally.
git config --global uploadpack.allowFilter true
# Filter your existing repo into a new one.
# The `file://` protocol followed by a full path is required.
git clone --filter=blob:none file:///full_path_to_existing_repo/.git path_to_new_blobless_copy
# Reconfigure the origin of your new repo.
# You can retrieve it with `git remote -v` in your existing repo.
cd path_to_new_blobless_copy
git remote set-url origin remote_path_to_origin.git
cd -
# Replace your existing repo with the new one.
# Destructive operation that will free up the space of the blobs.
# But will also destroy your local stashes, branches and tags that you didn't clone!
rm -rf /full_path_to_existing_repo
mv path_to_new_blobless_copy /full_path_to_existing_repo
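
Before running the destructive `rm -rf` step above, it can be worth confirming that the new copy really is blobless. The following is a minimal sketch, not part of the original answer: `count_promised_objects` is a hypothetical helper name, and it assumes a git version that supports `git rev-list --missing=print`.

```shell
# count_promised_objects [rev]  (hypothetical helper, for illustration only)
# Counts objects reachable from <rev> (default: HEAD) that are absent locally.
count_promised_objects() {
  rev=${1:-HEAD}
  # --missing=print lists absent objects with a leading "?" instead of
  # fetching them on demand; grep -c counts those lines.
  git rev-list --objects --missing=print "$rev" | grep -c '^?'
}
```

Run inside path_to_new_blobless_copy, a count above zero suggests that historical blobs were left on the remote. Run inside a full clone, it should print 0.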

I believe the only difference in .git/config between a clone made with --filter=blob:none and one made without it is the following configuration:

...
[remote "origin"] # or whatever your remote is named
promisor = true
partialclonefilter = blob:none
...

It can be set with the following commands:

# change "origin" to the name of your remote
git config remote.origin.promisor true
git config remote.origin.partialclonefilter blob:none
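
To double-check that both settings took effect, something like the following could be used. It is only a sketch: `is_blobless_remote` is a hypothetical helper name, and the remote name defaults to `origin` as an assumption.

```shell
# is_blobless_remote [remote]  (hypothetical helper, for illustration only)
# Succeeds when the remote is marked as a promisor with a blob:none filter.
is_blobless_remote() {
  remote=${1:-origin}
  # --bool normalizes values such as "yes" or "1" to "true".
  [ "$(git config --bool --get "remote.$remote.promisor")" = "true" ] &&
    [ "$(git config --get "remote.$remote.partialclonefilter")" = "blob:none" ]
}
```

For example, `is_blobless_remote origin && echo "origin is a blobless promisor"` prints the message only when both keys are set as shown above.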

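According to the documentation, changing this only affects how new commits are fetched, but I believe git gc --prune=now should clean up the unnecessary objects.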

First, make sure every blob in the repo also exists on the remote by pushing any commits that have not been pushed yet.
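
One way to check for unpushed work is to count commits that are on local branches but on no remote-tracking branch. This is a rough sketch under that assumption: `count_unpushed_commits` is a hypothetical helper name, and it does not look at stashes or tags.

```shell
# count_unpushed_commits  (hypothetical helper, for illustration only)
# Counts commits reachable from local branches but from no remote-tracking ref.
count_unpushed_commits() {
  git rev-list --count --branches --not --remotes
}
```

A result of 0 suggests every branch commit is already known to a remote-tracking ref. Run `git fetch` first if those refs may be stale.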

upstream=origin
gitdir=$(realpath "$(git rev-parse --git-dir)")  # Quoted so paths with spaces survive
mv "$gitdir"/objects "$gitdir"/objects.bak  # Create backup
mkdir "$gitdir"/objects  # Required else git thinks its not a repository
# Tell git this repo is blobless
git config remote."$upstream".promisor true
git config remote."$upstream".partialclonefilter blob:none
# Fetch branches, trees, and tags
git fetch --refetch --tags --no-auto-gc
# Now fetch the objects reachable from HEAD
# (replace the URL with the one of your own remote):
git fetch-pack --refetch --keep --quiet git@github.com:user/repo.git  $(git rev-list --missing=allow-promisor --objects HEAD | cut -c 1-40) > /dev/null
git fsck --full && echo "Remember to: rm -rf \"$gitdir/objects.bak\""
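
If the final `git fsck --full` reports problems, the backup made at the start of the script can be put back. A minimal sketch, assuming the `objects.bak` layout used above: `restore_objects_backup` is a hypothetical helper name.

```shell
# restore_objects_backup <gitdir>  (hypothetical helper, for illustration only)
# Replaces <gitdir>/objects with the backup created by the script above.
restore_objects_backup() {
  gitdir=$1
  if [ ! -d "$gitdir/objects.bak" ]; then
    echo "no backup found in $gitdir" >&2
    return 1
  fi
  rm -rf "$gitdir/objects"
  mv "$gitdir/objects.bak" "$gitdir/objects"
}
```

To undo the conversion completely, the two remote settings would also need to be removed, for example with `git config --unset remote."$upstream".promisor` and `git config --unset remote."$upstream".partialclonefilter`.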

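Regarding local stashes and tags (the final git fsck --full will complain about them, if you have any):

  • Use git-pack-objects to create a pack that contains only the local-only objects, then restore this single manually created pack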

  • Get the hash of the stash:

    git for-each-ref refs/stash --format='%(objectname)'
    
  • Get the hashes of annotated tags:

    git for-each-ref --format="%(if:equals=tag)%(objecttype)%(then)%(objectname)%(else)%(end)" --sort=taggerdate refs/tags
    
  • List the objects that are reachable from one of the hashes above (X) but not from its ancestors:

    git rev-list --objects 68268d5 '^68268d5^^'
    

    (the leading ^ means "not"; the trailing ^^ selects the parent of the parent)
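
The bullets above can be combined into one helper. This is only a sketch under stated assumptions, not a tested recipe from the answer: `pack_local_only` and the `local-only` pack base name are hypothetical, and the helper must run while the original objects are still in place (that is, before `objects` is moved to `objects.bak`).

```shell
# pack_local_only <commit> <out-dir>  (hypothetical helper, for illustration only)
# Writes a pack holding the objects reachable from <commit> but not from its
# grandparent. Fails if <commit> has no grandparent.
pack_local_only() {
  commit=$1
  out_dir=$2
  mkdir -p "$out_dir"
  # rev-list emits "<hash> <path>" lines, which pack-objects reads from stdin.
  # pack-objects prints the hash used in the resulting file names.
  git rev-list --objects "$commit" "^$commit^^" |
    git pack-objects -q "$out_dir/local-only"
}
```

For example, `pack_local_only "$(git for-each-ref refs/stash --format='%(objectname)')" /tmp/local-pack` would write `local-only-<hash>.pack` and `.idx` files. After the refetch, copying both files into "$gitdir"/objects/pack/ should make those objects available again.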

Thanks to Oded Niv's answer for the config hint.
