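This is a worrying and stressful problem. I cannot find an answer that works for me, so I have been unable to back up my code for more than a year (which is awful!).
I used BFG Repo-Cleaner to clean my repo. The actual files total only a few KB, but the remote has grown to about 9.8 GB, which makes it impossible for me to git push.
This is what I did: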
repo-clean$ git clone --mirror https://gitlab.com/our-projects/my-specific-project.git
Cloning into bare repository 'my-specific-project.git'...
Username for 'https://gitlab.com': my-username
Password for 'https://my-username@gitlab.com':
remote: Enumerating objects: 1306, done.
remote: Total 1306 (delta 0), reused 0 (delta 0), pack-reused 1306
Receiving objects: 100% (1306/1306), 9.73 GiB | 37.61 MiB/s, done.
Resolving deltas: 100% (232/232), done.
Check the repository size:
repo-clean$ cd my-specific-project.git
repo-clean/my-specific-project.git$ du -sh *
4.0K branches
4.0K config
4.0K description
4.0K HEAD
64K hooks
8.0K info
9.8G objects
4.0K packed-refs
12K refs
repo-clean/my-specific-project.git$ cd ..
repo-clean$
Then run BFG to clean the repo:
repo-clean$ java -jar bfg.jar --strip-blobs-bigger-than 50M my-specific-project.git
Using repo : ~/repo-clean/my-specific-project.git
Scanning packfile for large blobs: 1306
Scanning packfile for large blobs completed in 172 ms.
Found 48 blob ids for large blobs - biggest=4510353716 smallest=74220532
Total size (unpacked)=25890690884
Found 132 objects to protect
Found 3 commit-pointing refs : HEAD, refs/heads/master, refs/merge-requests/1/head
Protected commits
-----------------
These are your protected commits, and so their contents will NOT be altered:
* commit 628fb69b (protected by 'HEAD') - contains 3 dirty files :
- models/RF_modelGeolife.h5 (146.7 MB)
- models/RF_modelSMF.h5 (249.3 MB)
- models/RF_modelgeolife.h5 (146.7 MB)
WARNING: The dirty content above may be removed from other commits, but as
the *protected* commits still use it, it will STILL exist in your repository.
Details of protected dirty content have been recorded here :
~/repo-clean/my-specific-project.git.bfg-report/2022-05-09/15-50-17/protected-dirt/
If you *really* want this content gone, make a manual commit that removes it,
and then run the BFG on a fresh copy of your repo.
Cleaning
--------
Found 53 commits
Cleaning commits: 100% (53/53)
Cleaning commits completed in 150 ms.
Updating 2 Refs
---------------
Ref Before After
------------------------------------------------
refs/heads/master | 628fb69b | 12113214
refs/merge-requests/1/head | f1182758 | 6c3ad899
Updating references: 100% (2/2)
...Ref update completed in 30 ms.
Commit Tree-Dirt History
------------------------
Earliest Latest
| |
..............DDDmmmDDDDmmmmDDDDDDDDDDDDDDDmmmmmmmmmm
D = dirty commits (file tree fixed)
m = modified commits (commit message or parents changed)
. = clean commits (no changes to file tree)
Before After
-------------------------------------------
First modified commit | fc7cf2f9 | a772ae4a
Last dirty commit | d4a1a3d4 | c4a6ad7f
Deleted files
-------------
Filename Git id
-------------------------------------------------------------------------------------------------------------------------
3Class_Instances.pkl | ceebb395 (558.1 MB)
Beijing_KerasData.pkl | 8681a270 (133.4 MB)
Filtered_Trajectory.pkl | bfe06d09 (137.8 MB)
Foot_Car_Instances.pkl | c4bea045 (537.3 MB)
Foot_Car_Instances2.pkl | 8d9b96ad (537.3 MB)
Instance_Geolife.pickle | ee16e13b (412.5 MB)
Instance_Geolife_Beijing.pkl | c2cd394a (409.6 MB)
RF_modelGeolife.h5 | 5629ee4d (146.7 MB)
RF_modelSMF.h5 | 14372982 (249.3 MB)
RF_modelgeolife.h5 | 36293e2c (146.7 MB)
Revised_InstanceCreation+NoJerkOutlier+NOSmoothing.pickle | 29ff8dd4 (269.6 MB)
Revised_KerasData_NoSmoothing.pickle | 2421f835 (91.7 MB), 775b6041 (1.5 GB)
Revised_Trajectory_Label_Array.pickle | 059a4596 (84.5 MB)
Revised_Trajectory_Label_Array2017.pickle | 7e24d6f7 (216.7 MB)
Revised_Trajectory_Label_Array2018.pickle | cee1e176 (791.3 MB)
...
In total, 71 object ids were changed. Full details are logged here:
~/repo-clean/my-specific-project.git.bfg-report/2022-05-09/15-50-17
BFG run is complete! When ready, run: git reflog expire --expire=now --all && git gc --prune=now --aggressive
Remove the unwanted dirty data:
repo-clean$ cd my-specific-project.git
~/repo-clean/my-specific-project.git$ git reflog expire --expire=now --all && git gc --prune=now --aggressive
Enumerating objects: 1310, done.
Counting objects: 100% (1310/1310), done.
Delta compression using up to 8 threads
Compressing objects: 100% (1242/1242), done.
Writing objects: 100% (1310/1310), done.
Building bitmaps: 100% (53/53), done.
Total 1310 (delta 245), reused 962 (delta 0), pack-reused 0
Then I tried to push to the remote, which should have been the last step, but it failed:
~/repo-clean/my-specific-project.git$ git push
Username for 'https://gitlab.com': my-username
Password for 'https://my-username@gitlab.com':
Enumerating objects: 1310, done.
Writing objects: 100% (1310/1310), 2.08 GiB | 21.02 MiB/s, done.
Total 1310 (delta 0), reused 0 (delta 0), pack-reused 1310
remote: Resolving deltas: 100% (245/245), done.
remote: GitLab: Your push to this repository has been rejected because it would exceed storage limits. Please contact your GitLab administrator for more information.
To https://gitlab.com/our-projects/my-specific-project.git
! [remote rejected] master -> master (pre-receive hook declined)
! [remote rejected] refs/merge-requests/1/head -> refs/merge-requests/1/head (deny updating a hidden ref)
error: failed to push some refs to 'https://our-projects/my-specific-project.git'
BFG seems to have reduced the repo size to about 2.1 GB (this includes untracked directories such as venv and the data directory).
~/repo-clean/my-specific-project.git$ du -sh *
4.0K branches
4.0K config
4.0K description
4.0K HEAD
64K hooks
12K info
2.1G objects
4.0K packed-refs
16K refs
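To see which objects still account for the remaining 2.1G, one option is to list the largest blobs reachable from any ref. The sketch below is self-contained: it builds a throwaway repository (the file names notes.txt and big.bin are made up for illustration) and then runs the listing pipeline. In practice only the pipeline would be run, from inside my-specific-project.git. It assumes paths without spaces, since awk splits on whitespace.

```shell
#!/bin/sh
# Sketch: list the largest blobs reachable from any ref, biggest first.
# The throwaway repo and its file names are hypothetical stand-ins.
set -eu

demo=$(mktemp -d)
cd "$demo"
git init -q .
git config user.email "demo@example.com"
git config user.name "Demo"

# Two commits: one small file, then one larger file.
printf 'small\n' > notes.txt
git add notes.txt
git commit -q -m "add notes"
dd if=/dev/zero of=big.bin bs=1000 count=200 2>/dev/null
git add big.bin
git commit -q -m "add big file"

# rev-list prints every reachable object with its path; cat-file adds the
# type and size; awk keeps only blobs; sort orders them by size, descending.
largest=$(git rev-list --objects --all |
  git cat-file --batch-check='%(objecttype) %(objectname) %(objectsize) %(rest)' |
  awk '$1 == "blob" { print $3, $4 }' |
  sort -rn |
  head -n 10)

printf '%s\n' "$largest"
```

Each output line is a size in bytes followed by the path, so anything that survived the BFG run shows up at the top.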
Note
I have also tried similar tools. git-filter-repo produced an error, which I reported to the GitLab community without getting any help, and gitlab-rake did not work either.
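Note the warning. It tells you that your most recent commit, also known as HEAD, contains three very large files. BFG treats HEAD as protected, so it will not remove them.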
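A quick way to confirm what HEAD still contains is to list its files by size. This is a minimal sketch against a throwaway repository (models/model.h5 and train.py are hypothetical names). The git ls-tree line is the part that would be run in the real repo, and it assumes paths without spaces.

```shell
#!/bin/sh
# Sketch: show the files in the HEAD commit, largest first.
# The throwaway repo and its file names are hypothetical stand-ins.
set -eu

demo=$(mktemp -d)
cd "$demo"
git init -q .
git config user.email "demo@example.com"
git config user.name "Demo"

mkdir models
dd if=/dev/zero of=models/model.h5 bs=1000 count=300 2>/dev/null
printf 'print("hi")\n' > train.py
git add .
git commit -q -m "snapshot"

# -r recurses into subdirectories; -l adds the blob size as the 4th column,
# with the path in the 5th.
in_head=$(git ls-tree -r -l HEAD | awk '{ print $4, $5 }' | sort -rn)

printf '%s\n' "$in_head"
```

Any file near the top of this list that exceeds the --strip-blobs-bigger-than threshold is one BFG will report as protected dirty content.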
As the warning says, you should git rm --cached these three files (and any other large files you don't want, while you are at it), then git commit, and then run BFG again to fix the problem.
(Of course, make sure you also add all of these files to .gitignore, so that they are not accidentally added to a commit again.)
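The steps above can be sketched as follows. Because git rm needs a working tree, this would be done in a regular clone rather than in the --mirror clone. The script uses a throwaway repository with a small stand-in for models/RF_modelSMF.h5, and the *.h5 ignore pattern is an assumption about which files should be excluded.

```shell
#!/bin/sh
# Sketch: stop tracking a large file, ignore it, and commit the change.
# The throwaway repo is hypothetical; the *.h5 pattern is an assumption.
set -eu

demo=$(mktemp -d)
cd "$demo"
git init -q .
git config user.email "demo@example.com"
git config user.name "Demo"

mkdir models
dd if=/dev/zero of=models/RF_modelSMF.h5 bs=1000 count=100 2>/dev/null
printf 'code\n' > train.py
git add .
git commit -q -m "commit that includes a large model file"

# Remove the file from the index only; the copy on disk is left alone.
git rm -q --cached models/RF_modelSMF.h5

# Ignore it from now on so it cannot be re-added by accident.
printf '*.h5\n' >> .gitignore
git add .gitignore
git commit -q -m "Stop tracking large model files"

tracked=$(git ls-files)
printf '%s\n' "$tracked"
```

After this commit HEAD no longer references the large file, so a BFG run on a fresh copy of the repo has nothing left to protect. The file is still present in older commits until BFG rewrites them.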