Quick refresher on git
$ git init
$ git add foo.txt
$ git commit
$ git push
What is git? A version control system
What is git? A content-addressable filesystem/object-store
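To make "content-addressable" concrete, here is a minimal in-memory sketch. The `ContentStore` class and its `put`/`get` methods are hypothetical names for illustration, not git's API; the only point is that the key is computed from the data rather than chosen by the caller.

```python
import hashlib


class ContentStore:
    """Toy in-memory content-addressable store (illustrative, not git's code)."""

    def __init__(self):
        self._objects = {}  # hex digest -> bytes

    def put(self, data: bytes) -> str:
        # The key is derived from the data itself, never chosen by the caller.
        key = hashlib.sha1(data).hexdigest()
        self._objects[key] = data
        return key

    def get(self, key: str) -> bytes:
        return self._objects[key]

    def __len__(self):
        return len(self._objects)


store = ContentStore()
key_a = store.put(b"hello")
key_b = store.put(b"hello")  # same bytes, same key, stored once
key_c = store.put(b"world")
```

Storing the same bytes twice yields the same key, so the store holds two objects here, not three.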
Side note: hash functions
$h : \{0,1\}^{*} \to \{0,1\}^{N}$
$\mathrm{SHA1} : \{0,1\}^{*} \to \{0,1\}^{160}$
Take CO487 to learn more :)
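A quick way to see the fixed output size is Python's standard `hashlib` module. This is only a sketch; the variable names are arbitrary.

```python
import hashlib

short = hashlib.sha1(b"a").digest()
long = hashlib.sha1(b"a" * 1_000_000).digest()

# Both digests are 20 bytes = 160 bits, regardless of input length.
digest_bits = (len(short) * 8, len(long) * 8)

# A one-byte change in the input gives a different digest.
changed = hashlib.sha1(b"b").digest()
```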
Blobs
- Stores data for individual files
blob 4\0test
- Filename is the SHA-1 hash of the object (header plus content)
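The blob layout above can be reproduced in a few lines. This is a sketch that assumes the file holds the 4 bytes `test` with no trailing newline; `blob_object` and `object_id` are hypothetical helper names.

```python
import hashlib
import zlib


def blob_object(content: bytes) -> bytes:
    # Header is the type, a space, the content length in bytes, then a NUL.
    return b"blob " + str(len(content)).encode() + b"\0" + content


def object_id(obj: bytes) -> str:
    # The object ID is the SHA-1 of header plus content.
    return hashlib.sha1(obj).hexdigest()


obj = blob_object(b"test")
oid = object_id(obj)

# Loose objects live at .git/objects/<first 2 hex>/<remaining 38 hex>
# and hold the zlib-compressed bytes of `obj`.
path = f".git/objects/{oid[:2]}/{oid[2:]}"
compressed = zlib.compress(obj)
```

If git is installed, running `git hash-object` on a file with the same bytes should print the same ID.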
Trees
- Stores data for directories
tree N\0
100644 blob 30d258... test.txt
- Filename is the SHA-1 hash of the object (header plus content)
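The entry shown above is the pretty-printed form. In the raw object each entry is the mode, a space, the name, a NUL, then the 20 raw bytes of the child's ID; the type word is not stored. A sketch, with `tree_object` as a hypothetical helper that ignores git's special sort rule for directory names:

```python
import hashlib


def tree_object(entries):
    """entries: list of (mode, name, hex_object_id) tuples."""
    body = b""
    # Entries are sorted by name bytes (directory edge cases ignored here).
    for mode, name, oid in sorted(entries, key=lambda e: e[1].encode()):
        # Each entry: "<mode> <name>\0" followed by the 20 raw ID bytes.
        body += mode.encode() + b" " + name.encode() + b"\0" + bytes.fromhex(oid)
    return b"tree " + str(len(body)).encode() + b"\0" + body


# Assumes test.txt holds the 4 bytes "test" with no trailing newline.
blob_id = hashlib.sha1(b"blob 4\0test").hexdigest()
tree = tree_object([("100644", "test.txt", blob_id)])
tree_id = hashlib.sha1(tree).hexdigest()
```

Because the tree embeds the blob's ID, changing the file changes the blob ID, which in turn changes the tree ID.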
Commits
commit N\0tree 095a05...
parent <parent sha>
[parent <parent sha> if several parents from merges]
author Neil Parikh <parikh.neil@me.com> 1658703268 -0400
committer Neil Parikh <parikh.neil@me.com> 1658703268 -0400
hello
- Filename is the SHA-1 hash of the object (header plus content)
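A commit object can be assembled the same way. In the stored bytes a blank line separates the header lines from the message. The sketch below uses made-up inputs: the identity string is hypothetical and the tree ID is that of an empty tree; `commit_object` is an illustrative helper, not a git function.

```python
import hashlib


def commit_object(tree_id, parent_ids, author, committer, message):
    lines = [f"tree {tree_id}"]
    # Zero parents for a root commit, one normally, several for a merge.
    lines += [f"parent {p}" for p in parent_ids]
    lines.append(f"author {author}")
    lines.append(f"committer {committer}")
    # A blank line separates the headers from the message.
    body = ("\n".join(lines) + "\n\n" + message + "\n").encode()
    return b"commit " + str(len(body)).encode() + b"\0" + body


# Hypothetical inputs: the empty-tree ID and a made-up identity.
EMPTY_TREE = "4b825dc642cb6eb9a060e54bf8d69288fbee4904"
who = "A U Thor <author@example.com> 1658703268 -0400"

root = commit_object(EMPTY_TREE, [], who, who, "hello")
root_id = hashlib.sha1(root).hexdigest()

child = commit_object(EMPTY_TREE, [root_id], who, who, "second")
child_id = hashlib.sha1(child).hexdigest()
```

Since each commit embeds its parent's ID, the ID of `child` depends on the entire history before it.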
Refs
- How does git know what the latest commit is (to use for the parent sha)?
- Refs
- Refs are pointers to commits
- Branches are just refs (see .git/refs/heads/BRANCH_NAME)
- How does git know which ref is "current/active"?
- Special ref called HEAD
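Resolving HEAD to a commit can be sketched as following small text files. The example builds a fake `.git` layout in a temporary directory so it runs without git; the commit ID is made up, `resolve` is a hypothetical helper, and packed refs are ignored.

```python
import os
import tempfile


def resolve(git_dir, name="HEAD"):
    """Follow symbolic refs until a commit ID is reached."""
    with open(os.path.join(git_dir, name)) as f:
        value = f.read().strip()
    if value.startswith("ref: "):
        # Symbolic ref: HEAD usually names a branch, not a commit.
        return resolve(git_dir, value[len("ref: "):])
    return value  # a 40-character commit ID


# Build a fake .git layout so the sketch runs without git installed.
git_dir = tempfile.mkdtemp()
os.makedirs(os.path.join(git_dir, "refs", "heads"))
commit_id = "0123456789abcdef0123456789abcdef01234567"  # made-up ID

with open(os.path.join(git_dir, "refs", "heads", "main"), "w") as f:
    f.write(commit_id + "\n")
with open(os.path.join(git_dir, "HEAD"), "w") as f:
    f.write("ref: refs/heads/main\n")

resolved = resolve(git_dir)
```

Under this model, committing on a branch amounts to writing the new commit's ID into the branch file; HEAD itself stays unchanged.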
Implications of this approach
- Git stores full snapshots, not diffs (packfiles may delta-compress objects, but that is a storage detail)
- Checkout arbitrary commits in O(1) time with respect to history length (no chain of diffs to replay)
- Deduplication of identical files
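The deduplication point can be checked with a toy model in which a snapshot is just a mapping from filename to blob ID. The `store_blob` and `snapshot` helpers are hypothetical stand-ins for git's blob and tree writing, not real git functions.

```python
import hashlib

objects = {}  # object ID -> blob bytes


def store_blob(content: bytes) -> str:
    obj = b"blob " + str(len(content)).encode() + b"\0" + content
    oid = hashlib.sha1(obj).hexdigest()
    objects[oid] = obj  # re-storing identical bytes changes nothing
    return oid


def snapshot(files: dict) -> dict:
    """Map each filename to the ID of its blob (a stand-in for a tree)."""
    return {name: store_blob(data) for name, data in files.items()}


v1 = snapshot({"a.txt": b"same\n", "b.txt": b"same\n", "c.txt": b"one\n"})
v2 = snapshot({"a.txt": b"same\n", "b.txt": b"same\n", "c.txt": b"two\n"})

# Six file entries across two snapshots, but only three distinct blobs.
blob_count = len(objects)
```

Restoring either snapshot is a lookup per file in `objects`; no earlier version has to be read first.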
Git Internals
Neil Parikh
git.prkh.me