Archive:
Subtopics:
Comments disabled |
Mon, 08 Oct 2018
Notes on using git-replace to get rid of giant objects
A couple of years ago someone accidentally committed a 350 megabyte
file to our Git repository. Now it's baked in. I wanted to get rid
of it. I thought that I might be able to work out a partial but
lightweight solution using Summary: It didn't work. DetailsIn 2016 a programmer commited a 350 megabyte file to my employer's repo, then in the following commit they removed it again. Of course it's still in there, because someone might check out the one commit where it existed. Everyone who clones the repo gets a copy of the big file. Every copy of the repo takes up an extra 350 megabytes on disk. The usual way to fix this is onerous:
I thought I'd tinker around with The
I can turn this small file into an object with
This creates
So far this doesn't help much. The checkout is smaller, but nobody was likely to have that commit checked out anyway. The large file is still in the repository, and clones and transfers still clone and transfer it. The first thing I tried was a wan hope: will Now comes the hacking part: I am going to destroy the actual object. Say for example, what if:
Now the repository is smaller! And maybe Git won't notice, as long as
I do not use Indeed, much normal Git usage doesn't notice. For example, I can make
new commits with no trouble, and of course any other operation that
doesn't go back as far as 2016 doesn't notice the change. And
But some things become wonky. You get an error message when you clone
the repo because an object is missing. The replacement refs are local
to the repo, and don't get cloned, so clone doesn't know to use the
replacement object anyway. In the clone, you can use No. Unfortunately, there is a show-stopper:
and it doesn't create the pack files. It dies, and leaves behind a
I think I've reached the end of this road. Oh well, it was worth a look. [ Addendum 20181009: A lot of people have unfortunately missed the point of this article, and have suggested that I use BFG or reposurgeon. I have a small problem and a large problem. The small problem is how to remove some files from the repository. This is straightforward, and the tools mentioned will help with it. But because of the way Git works, the result is effectively a new repository. The tools will not help with the much larger problem I would have then: How to get 350 developers to migrate to the new repository at the same time. The approach I investigated in this article was an attempt to work around this second, much larger problem. ] [Other articles in category /prog] permanent link |