Things I wish everyone knew about Git (Part II)
This is a writeup of
a talk I gave in December for
my previous employer. It's long so I'm publishing it in several parts:
- Part I:
- Part II (you are here):
- More coming later:
- Branches are fictitious
- Committing partial changes
- Push and fetch; tracking branches
- Aliases and custom commands
The most important material is in
Part I.
It is really hard to lose stuff
A Git repository is an append-only filesystem. You can add snapshots
of files and directories, but you can't modify or delete anything.
Git commands sometimes purport to modify data. For
example git commit --amend suggests that it amends a commit.
It doesn't. There is no such thing as amending a commit; commits are
immutable.
Rather, it writes a completely new commit, and then
kinda turns its back on the old one. But the old commit is still in
there, pristine, forever.
In a Git repository you can lose things, in the sense of forgetting
where they are. But they can almost always be found again, one way or
another, and when you find them they will be exactly the same as they
were before. If you git commit --amend and change your mind later,
it's not hard to get the old ⸢unamended⸣ commit back if you want it
for some reason.
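Here is a minimal sketch of that claim, run in a throwaway repository. The file name, the commit messages, and the demo identity are all made up for illustration; only the git commands themselves are real.

```shell
# Throwaway repository; nothing here touches your real work.
repo=$(mktemp -d) && cd "$repo" && git init -q
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

echo hello > greeting.txt
git add greeting.txt
git commit -q -m "Add greting"            # oops, typo in the message
old=$(git rev-parse HEAD)                 # remember the original commit

git commit -q --amend -m "Add greeting"   # "amend" it
new=$(git rev-parse HEAD)

echo "old: $old"
echo "new: $new"                          # a different SHA: a brand new commit

# The old commit is still in the repository, unchanged.
old_type=$(git cat-file -t "$old")
old_msg=$(git log -1 --format=%s "$old")
echo "old object is a $old_type with message: $old_msg"
```

The branch now points at the new commit, but anything that still knows the old SHA can get the old commit back exactly as it was.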
If you have the SHA for a file, it will always be the exact same
version of the file with the exact same contents.
If you have the SHA for a directory (a “tree” in Git jargon) it will
always contain the exact same versions of the exact same files with
the exact same names.
If you have the SHA for a commit, it will always contain the exact
same metainformation (description, when made, by whom, etc.) and the
exact same snapshot of the entire file tree.
Objects can have other names and descriptions that come and go, but
the SHA is forever.
(There's a small qualification to this: if the SHA is the only way
to refer to a certain object, if it has no other names, and if you
haven't used it for a few months, Git might
discard it from the repository entirely.)
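A small sketch of why the SHA is forever: it is computed from the contents, so the same contents always get the same SHA. This uses a throwaway repository, and the strings being stored are just examples.

```shell
repo=$(mktemp -d) && cd "$repo" && git init -q

# Store the same contents twice, and different contents once.
sha1=$(printf 'hello\n'  | git hash-object -w --stdin)
sha2=$(printf 'hello\n'  | git hash-object -w --stdin)
other=$(printf 'hello!\n' | git hash-object -w --stdin)

echo "$sha1"
echo "$sha2"      # identical to the line above
echo "$other"     # different contents, different SHA

# Asking for the SHA always gives back exactly what was stored.
contents=$(git cat-file -p "$sha1")
echo "$contents"
```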
But what if you do lose something?
There are many good answers to this question but I think the one to
know first is git-reflog, because it covers the great majority of
cases.
The git-reflog command means:
“List the SHAs of commits I have visited
recently”
When I run git reflog the top of the output says which commits I
had checked out recently, with the top line being the commit I have checked
out right now:
523e9fa1 HEAD@{0}: checkout: moving from dev to pasha
5c31648d HEAD@{1}: pull: Fast-forward
07053923 HEAD@{2}: checkout: moving from pr2323 to dev
...
The last thing I did was check out the branch named pasha; its tip
commit is at 523e9fa1.
Before that, I did git pull and Git updated my local dev branch from the
remote one, updating it to 5c31648d.
Before that, I had switched to dev from a different branch,
pr2323. At that time, before the pull, dev referred to commit
07053923.
Farther down in the output are some commits I visited last August:
...
58ec94f6 HEAD@{928}: pull --rebase origin dev: checkout 58ec94f6d6cb375e09e29a7a6f904e3b3c552772
e0cfbaee HEAD@{929}: commit: WIP: model classes for condensedPlate and condensedRNAPlate
f8d17671 HEAD@{930}: commit: Unskip tests that depend on standard seed data
31137c90 HEAD@{931}: commit (amend): migrate pedigree tests into test/pedigree
a4a2431a HEAD@{932}: commit: migrate pedigree tests into test/pedigree
1fe585cb HEAD@{933}: checkout: moving from LAB-808-dao-transaction-test-mode to LAB-815-pedigree-extensions
...
Suppose I'm caught in some horrible Git nightmare. Maybe I deleted
the entire test suite or accidentally put my Small Wonder fanfic
into a commit message or overwrote the report templates with 150
gigabytes of goat porn. I can go back to how things were before. I
look in the reflog for the SHA of the commit just before I made my big
blunder, and then:
git reset --hard 881f53fa
Phew, it was just a bad dream.
(Of course, if my colleagues actually saw the goat porn, Git can't
fix that.)
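The whole rescue can be tried out safely in a scratch repository. This is only a sketch: the "test suite" is a single made-up file, and I save the good SHA in a variable instead of reading it off the reflog by eye.

```shell
repo=$(mktemp -d) && cd "$repo" && git init -q
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

echo "real tests" > tests.txt
git add tests.txt
git commit -q -m "Add test suite"
good=$(git rev-parse HEAD)               # the state we will want back

git rm -q tests.txt                      # the big blunder
git commit -q -m "Oops: delete the test suite"

git --no-pager reflog                    # the good commit is listed here
git reset -q --hard "$good"              # go back to how things were

restored=$(cat tests.txt)
now=$(git rev-parse HEAD)
echo "$restored"
```

Note that the blunder commit is not destroyed either. It is still in the reflog, in case going back turns out to be the real mistake.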
I would like to nominate Wile E. Coyote to be
the mascot of Git. Because Wile E. is always getting himself into
situations like this one:
But then, in the next scene, he is magically unharmed. That's Git.
Finding old stuff with git-reflog
git reflog by itself lists the places that HEAD has been
git reflog some-branch lists the places that some-branch has been
- That HEAD@{1} thing in the reflog output is another way to name
that commit if you don't want to use the SHA.
- You can leave off the name and write just @{1}, but then Git consults
the reflog of the branch you currently have checked out, not the reflog of HEAD.
The following locutions can be used with any git command that wants you to identify a commit:
@{17} (the current branch as it was 17 changes ago)
@{18:43} (the current branch as it was at 18:43 today)
@{yesterday} (the current branch as it was 24 hours ago)
dev@{'3 days ago'} (dev as it was 3 days ago)
some-branch@{'Aug 22'} (some-branch as it was last August 22)
(Use with git-checkout, git-reset, git-show, git-diff, etc.)
Also useful:
git show dev@{'Aug 22'}:path/to/some/file.txt
“Print out that file, as it was on dev, as dev was on August 22”
It's all still in there.
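A quick sketch of these names in action, in a scratch repository with a made-up file. It uses the numbered form, since a two-minute-old repository has no interesting dates to ask about.

```shell
repo=$(mktemp -d) && cd "$repo" && git init -q
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

echo "version 1" > notes.txt
git add notes.txt
git commit -q -m "first"
echo "version 2" > notes.txt
git commit -q -am "second"

# The file as it is at HEAD now, and as it was one reflog entry ago.
now=$(git show 'HEAD@{0}:notes.txt')
before=$(git show 'HEAD@{1}:notes.txt')
echo "now:    $now"
echo "before: $before"

# The same names work anywhere a commit is expected.
git --no-pager diff 'HEAD@{1}' 'HEAD@{0}'
```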
What if you can't find it?
Don't panic! Someone with more experience can probably find it for
you. If you have a local Git expert, ask them for help.
And if they are busy and can't help you immediately, the thing you're looking for
won't disappear while you wait for them. The repository is append-only.
Every version of everything is saved. If they could have found it
today, they will still be able to find it tomorrow.
(Git will eventually throw away lost and unused snapshots, but
typically not anything you have used in the last 90 days.)
What if you regret something you did?
Don't panic! You can probably put it back the way it was.
Git leaves a trail
When you make a commit, Git prints something like this:
[your-topic-branch 4e86fa23] Rework foozle subsystem
If you need to find that commit again, the SHA 4e86fa23 is in your
terminal scrollback.
When you fetch a remote branch, Git prints:
6e8fab43..bea7535b dev -> origin/dev
What commit was origin/dev before the fetch? At 6e8fab43.
What commit is it now? bea7535b.
What if you want to look at how it was before? No problem, 6e8fab43
is still there. It's not called origin/dev any more, but the SHA is
forever. You can still check it out and look at it:
git checkout -b how-it-was-before 6e8fab43
What if you want to compare how it was with how it is now?
git log 6e8fab43..bea7535b
git show 6e8fab43..bea7535b
git diff 6e8fab43..bea7535b
Git tries to leave a trail of breadcrumbs in your terminal. It's
constantly printing out SHAs that you might want again.
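To try the before-and-after comparison without a real remote, here is a sketch in which two local SHAs stand in for the pair that git fetch would have printed. The variables before and after, the file, and the commit messages are all invented for the demonstration.

```shell
repo=$(mktemp -d) && cd "$repo" && git init -q
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

echo one > log.txt
git add log.txt
git commit -q -m "one"
before=$(git rev-parse HEAD)     # plays the role of 6e8fab43

echo two >> log.txt
git commit -q -am "two"
echo three >> log.txt
git commit -q -am "three"
after=$(git rev-parse HEAD)      # plays the role of bea7535b

# What happened between the two breadcrumbs?
git --no-pager log --oneline "$before..$after"
git --no-pager diff --stat "$before..$after"
count=$(git rev-list --count "$before..$after")

# Look at how it was before, on a branch of its own.
git checkout -q -b how-it-was-before "$before"
old_contents=$(cat log.txt)
```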
A few things can be lost forever!
After all that talk about how Git will not lose things, I should point
out the exceptions. The big exception is that if you have created
files or made changes in the working tree, Git is unaware of them
until you have added them with git-add. Until then, those changes
are in the working tree but not in the repository, and if you discard
them Git cannot help you get them back.
Good advice is "Commit early and often." If you don't commit, at
least add changes with git-add. Files added but not committed are
saved in the repository, although they can be hard to find
because they haven't been packaged into a commit with a single SHA id.
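Here is a sketch of one way to dig such a file out again, using git fsck to list objects that nothing refers to any more. The file and its drafts are invented, and in a real repository fsck may list many dangling blobs, so you would have to inspect several of them to find the right one.

```shell
repo=$(mktemp -d) && cd "$repo" && git init -q
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

echo "readme" > README
git add README
git commit -q -m "initial"

echo "draft one" > essay.txt
git add essay.txt                # draft one is now stored as a blob
echo "draft two" > essay.txt
git add essay.txt                # the index now points at draft two instead

# Draft one was never committed and is no longer in the index,
# but the blob is still in the repository. fsck reports it as dangling.
lost=$(git fsck 2>/dev/null | awk '/dangling blob/ {print $3}')
recovered=$(git cat-file -p "$lost")
echo "$recovered"
```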
Some people automate this: they have a process that runs every few
minutes and commits the current working tree to a special branch that
they look at only in case of disaster.
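I don't know exactly how other people set this up, so treat the following as one possible sketch rather than the standard recipe. It assumes a branch named autosave (my invention) and uses git stash create, which builds a commit from the current tracked changes without touching the working tree or the stash list. Untracked files are not included.

```shell
repo=$(mktemp -d) && cd "$repo" && git init -q
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

echo "committed text" > report.txt
git add report.txt
git commit -q -m "initial"

echo "unsaved edits" > report.txt       # work in progress, not added

# This part is what a cron job or timer would run every few minutes.
snap=$(git stash create "autosave")     # prints nothing if there are no changes
if [ -n "$snap" ]; then
    git update-ref refs/heads/autosave "$snap"
fi

# In case of disaster, the snapshot is on the autosave branch.
saved=$(git show autosave:report.txt)
working=$(cat report.txt)               # the working tree was left alone
echo "$saved"
```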
The dangerous commands are git-reset and git-checkout, which
modify the working tree, and so might wipe out changes that aren't in
the repository. Git will usually try to warn you before doing something
destructive to your working tree changes, but not always: git reset --hard
discards them without asking.
git-rev-parse
We saw a little while ago that Git's language for talking about
commits and files is quite sophisticated:
my-topic-branch@{'Aug 22'}:path/to/some/file.txt
Where is this language documented? Maybe not where you would expect: it's in the
manual for git-rev-parse.
The git rev-parse command is less well-known than it should be. It takes a
description of some object and turns it into a SHA.
Why is that useful? Maybe it isn't, but the git-rev-parse man page explains the
syntax of the descriptions Git understands.
A good habit is to skim
over the manual every few months. You'll pick up something new and useful
every time.
My favorite is that if you use the syntax :/foozle you get the most
recent commit, reachable from any branch or other ref, whose message matches
foozle. For example:
git show :/foozle
or
git log :/introduce..:/remove
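To see git rev-parse turn descriptions into SHAs, here is a sketch in a scratch repository. The three commit messages are invented so that the :/ searches have something to find.

```shell
repo=$(mktemp -d) && cd "$repo" && git init -q
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

echo a > foozle.txt
git add foozle.txt
git commit -q -m "introduce foozle"
echo b >> foozle.txt
git commit -q -am "tune foozle"
echo c >> foozle.txt
git commit -q -am "remove foozle hack"

git rev-parse HEAD                       # full SHA of the current commit
git rev-parse --short HEAD               # abbreviated SHA

first=$(git rev-parse 'HEAD~2')          # two commits before HEAD
found=$(git rev-parse ':/introduce')     # found by searching commit messages
echo "$first"
echo "$found"                            # the same commit, named two ways

# Everything after "introduce" up to and including "remove".
git --no-pager log --oneline ':/introduce..:/remove'
```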
Coming next week (probably), a few miscellaneous matters about using Git more
effectively.