[Review] Git Community Book
|Book||Git Community Book|
|Author||people in the Git community|
- Git Object Model
- Advanced Git
- Internals and Plumbing
Git Object Model
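The SHA (object name):
- Represents the object name.
- Is 40 hexadecimal digits long.
- Is generated by taking the SHA1 hash of the object's content.
- Serves as the object's identity: identical content always yields the same name.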
Every object consists of three things: type, size, content.
There are four different types of objects: blob, tree, commit, tag.
blob is a chunk of binary data, used to store file data.
The blob is entirely defined by its data, totally independent of its location.
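The naming rule can be checked by hand. A minimal Python sketch, assuming the object header has the form "<type> <size>" followed by a NUL byte (the function name git_object_name is mine, not Git's):

```python
import hashlib

def git_object_name(obj_type: str, content: bytes) -> str:
    """Return the 40-hex-digit name for an object of this type and content."""
    # Assumed layout: "<type> <size>\0" header, then the raw content.
    header = f"{obj_type} {len(content)}".encode("ascii") + b"\x00"
    return hashlib.sha1(header + content).hexdigest()

# The name depends only on the bytes, not on any file name or location.
empty_blob = git_object_name("blob", b"")
hello_blob = git_object_name("blob", b"hello world\n")
print(empty_blob)
print(hello_blob)
```

Inside a repository, the results should match what git hash-object prints for files with the same bytes.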
tree is basically like a directory - it references a bunch of other trees and/or blobs.
Since trees and blobs, like all other objects, are named by the SHA1 hash of their contents, two trees have the same SHA1 name if and only if their contents (including, recursively, the contents of all subdirectories) are identical.
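To see why identical contents give identical tree names, here is a rough Python sketch that serializes a tree by hand. It assumes each entry is stored as "<mode> <name>", a NUL byte, and the 20-byte binary SHA1 of the entry; sorting plainly by name is a simplification that only holds for the flat, files-only trees used here. The helper names are hypothetical.

```python
import hashlib

def object_name(obj_type: str, body: bytes) -> str:
    header = f"{obj_type} {len(body)}".encode("ascii") + b"\x00"
    return hashlib.sha1(header + body).hexdigest()

def tree_body(entries):
    """entries: iterable of (mode, name, hex_sha) tuples for one directory."""
    body = b""
    # Simplification: sort by name only (fine for a directory of plain files).
    for mode, name, hex_sha in sorted(entries, key=lambda e: e[1]):
        body += f"{mode} {name}".encode("utf-8") + b"\x00" + bytes.fromhex(hex_sha)
    return body

blob_a = object_name("blob", b"A\n")
blob_b = object_name("blob", b"B\n")
blob_b2 = object_name("blob", b"B changed\n")

tree_1 = object_name("tree", tree_body([("100644", "a.txt", blob_a), ("100644", "b.txt", blob_b)]))
# Same entries listed in a different order: same tree name.
tree_2 = object_name("tree", tree_body([("100644", "b.txt", blob_b), ("100644", "a.txt", blob_a)]))
# One file's content changes: its blob name changes, and so does the tree name.
tree_3 = object_name("tree", tree_body([("100644", "a.txt", blob_a), ("100644", "b.txt", blob_b2)]))
empty_tree = object_name("tree", b"")
print(tree_1 == tree_2, tree_1 == tree_3)
```

Because a tree entry embeds the name of the child object, a change deep in a subdirectory changes every tree above it.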
commit points to a single tree, marking it as what the project looked like at a certain point in time. It contains meta-information about that point in time, such as a timestamp, the author of the changes since the last commit, a pointer to the previous commit(s), etc.
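A commit can be sketched the same way. The Python below assumes the commit body is a few header lines (tree, zero or more parent lines, author, committer), a blank line, and the message; the identity string and timestamp are made up for illustration.

```python
import hashlib

def object_name(obj_type: str, body: bytes) -> str:
    header = f"{obj_type} {len(body)}".encode("ascii") + b"\x00"
    return hashlib.sha1(header + body).hexdigest()

def commit_body(tree, parents, author, committer, message):
    """Serialize a commit: one tree, any number of parents, then metadata."""
    lines = [f"tree {tree}"]
    lines += [f"parent {p}" for p in parents]
    lines.append(f"author {author}")
    lines.append(f"committer {committer}")
    return ("\n".join(lines) + "\n\n" + message).encode("utf-8")

EMPTY_TREE = "4b825dc642cb6eb9a060e54bf8d69288fbee4904"  # name of the empty tree
who = "A U Thor <author@example.com> 1700000000 +0000"   # hypothetical identity

# A root commit has no parent line; a later commit points back at it.
root_body = commit_body(EMPTY_TREE, [], who, who, "initial commit\n")
root_name = object_name("commit", root_body)
child_body = commit_body(EMPTY_TREE, [root_name], who, who, "second commit\n")
child_name = object_name("commit", child_body)
print(child_body.decode("utf-8"))
```

Since the parent's name is part of the child's content, rewriting an old commit changes the name of every commit after it.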
tag is a way to mark a specific commit as special in some way. It is normally used to tag certain commits as specific releases or something along those lines.
A tag object contains an object name (called simply ‘object’), object type, tag name, the name of the person (“tagger”) who created the tag, and a message, which may contain a signature.
Different from SVN
Git stores a snapshot of the project at each commit, while other SCM systems such as SVN store the differences between one commit and the next.
Create New Empty Branches
Use git symbolic-ref. A symbolic ref is a regular file that stores a string beginning with ref: refs/. For example, your .git/HEAD is a regular file whose content is ref: refs/heads/master.
In the past, .git/HEAD was a symbolic link pointing at refs/heads/master. When we wanted to switch to another branch, we did ln -sf refs/heads/newbranch .git/HEAD, and when we wanted to find out which branch we are on, we did readlink .git/HEAD. But symbolic links are not entirely portable, so they are now deprecated and symbolic refs (as described above) are used by default.
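The file format is simple enough to read and write by hand. A small Python sketch, using a throwaway directory as a stand-in for .git; the helper names are hypothetical, and real Git does more (locking, validation) than this:

```python
import os
import tempfile

def write_symbolic_ref(git_dir: str, target: str) -> None:
    """Point HEAD at a ref, roughly what git symbolic-ref HEAD <target> records."""
    with open(os.path.join(git_dir, "HEAD"), "w", encoding="utf-8") as f:
        f.write(f"ref: {target}\n")

def read_symbolic_ref(git_dir: str):
    """Return the ref HEAD points at, or None if HEAD is not a symbolic ref."""
    with open(os.path.join(git_dir, "HEAD"), encoding="utf-8") as f:
        content = f.read().strip()
    if content.startswith("ref: "):
        return content[len("ref: "):]
    return None

with tempfile.TemporaryDirectory() as git_dir:
    write_symbolic_ref(git_dir, "refs/heads/master")
    before = read_symbolic_ref(git_dir)
    # Pointing HEAD at a ref that does not exist yet is what makes the
    # new branch "empty": the next commit would have no parent.
    write_symbolic_ref(git_dir, "refs/heads/newbranch")
    after = read_symbolic_ref(git_dir)

print(before, "->", after)
```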
Modifying Your History
Use git filter-branch to rewrite branches.
When merging, one parent will be HEAD, and the other will be the tip of the other branch, which is stored temporarily in MERGE_HEAD.
During the merge, the index holds three versions of each file. Each of these three “file stages” represents a different version of the file:
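- Stage 1: the version in the common ancestor of both branches (git show :1:<path>).
- Stage 2: the version from HEAD (git show :2:<path>).
- Stage 3: the version from MERGE_HEAD (git show :3:<path>).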
Some special diff options allow diffing the working directory against any of these stages:
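- git diff -1 <path> or git diff --base <path>: diff against stage 1.
- git diff -2 <path> or git diff --ours <path>: diff against stage 2.
- git diff -3 <path> or git diff --theirs <path>: diff against stage 3.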
Git and Email
git format-patch origin will produce a numbered series of files in the current directory, one for each patch in the current branch but not in origin/HEAD.
git am patches.mbox applies a series of patches from a mailbox file.
Client-Side Hooks
Submodules
Create the submodules:
Create the superproject and add all the submodules:
See what files git-submodule created:
The git-submodule add command does several things:
- It clones the submodule under the current directory and by default checks out the master branch.
- It adds the submodule’s clone path to the .gitmodules file and adds this file to the index, ready to be committed.
- It adds the submodule’s current commit ID to the index, ready to be committed.
Commit the superproject:
Clone the superproject:
Check submodule status:
Register the submodule into
Clone the submodules and check out the commits specified in the superproject:
One major difference between git-submodule update and git-submodule add is that git-submodule update checks out a specific commit, rather than the tip of a branch. It’s like checking out a tag: the head is detached, so you’re not working on a branch.
Check out or create a new branch:
Do work and commit:
Cautions on Submodules:
Always publish the submodule change before publishing the change to the superproject that references it. If you forget to publish the submodule change, others won’t be able to clone the repository:
It’s not safe to run git submodule update if you’ve made and committed changes within a submodule without checking out a branch first. They will be silently overwritten:
Internals and Plumbing
How Git Stores Objects
Loose objects are the simpler format. A loose object is simply the compressed data stored in a single file on disk.
If the SHA of your object is ab04d884140f7b0cf8bbf86d6883869f16a46f65, then the file will be stored in the following path:
The Ruby implementation of object storage:
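The same storage steps can be sketched in Python. This is a rough equivalent under stated assumptions, not the book's Ruby code: the object is a "<type> <size>" header, a NUL byte, and the content, compressed with zlib and written under a directory named after the first two hex digits of the SHA. The function names and the temporary objects directory are hypothetical.

```python
import hashlib
import os
import tempfile
import zlib

def loose_object_path(objects_dir: str, sha1: str) -> str:
    # First two hex digits become the directory, the other 38 the file name.
    return os.path.join(objects_dir, sha1[:2], sha1[2:])

def store_loose_object(objects_dir: str, obj_type: str, content: bytes) -> str:
    store = f"{obj_type} {len(content)}".encode("ascii") + b"\x00" + content
    sha1 = hashlib.sha1(store).hexdigest()
    path = loose_object_path(objects_dir, sha1)
    if not os.path.exists(path):  # same content is only ever written once
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, "wb") as f:
            f.write(zlib.compress(store))
    return sha1

def read_loose_object(objects_dir: str, sha1: str):
    with open(loose_object_path(objects_dir, sha1), "rb") as f:
        store = zlib.decompress(f.read())
    header, _, content = store.partition(b"\x00")
    obj_type, size = header.decode("ascii").split(" ")
    if int(size) != len(content):
        raise ValueError("size in header does not match content length")
    return obj_type, content

with tempfile.TemporaryDirectory() as objects_dir:
    name = store_loose_object(objects_dir, "blob", b"hello world\n")
    round_trip = read_loose_object(objects_dir, name)

example_path = loose_object_path("objects", "ab04d884140f7b0cf8bbf86d6883869f16a46f65")
print(name, round_trip, example_path)
```

Splitting the name into a two-character directory and a 38-character file keeps any single directory from holding every object.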
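Packed objects are the other format. Storing every version of a file as a separate loose object wastes space when two versions differ only slightly. To save that space, Git utilizes the packfile: a format in which Git saves only the part that has changed in the second file, with a pointer to the file it is similar to.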
The Git Index
The index is a binary file (generally kept in .git/index) containing a sorted list of path names.
The index contains all the information necessary to generate a single (uniquely determined) tree object.
The index enables fast comparisons between the tree object it defines and the working tree.
It can efficiently represent information about merge conflicts between different tree objects.
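As a small illustration of the binary format, the Python sketch below parses only the fixed 12-byte header, assuming it holds the signature DIRC, a version number, and the entry count as big-endian 32-bit integers. The sample header is built in memory rather than read from a real repository.

```python
import struct

def parse_index_header(data: bytes):
    """Return (version, entry_count) from the first 12 bytes of an index file."""
    if len(data) < 12:
        raise ValueError("index header is 12 bytes long")
    # ">4sII": big-endian, 4-byte signature, two unsigned 32-bit integers.
    signature, version, entry_count = struct.unpack(">4sII", data[:12])
    if signature != b"DIRC":
        raise ValueError("not a Git index file")
    return version, entry_count

# Hypothetical header: version 2, three entries.
sample_header = struct.pack(">4sII", b"DIRC", 2, 3)
version, entry_count = parse_index_header(sample_header)
print(version, entry_count)
```

To try it on a real repository, pass the first 12 bytes of .git/index instead of sample_header.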
The Packfile Index
Importantly, packfile indexes are not necessary to extract objects from a packfile; they are simply used to quickly retrieve individual objects from a pack.
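How an index speeds up retrieval can be shown with a simplified in-memory model. It is loosely based on the idea of a 256-entry fan-out table over sorted object names and is not a parser for the real .idx format; the pack offsets and helper names are invented for the example.

```python
import bisect
import hashlib

def build_fanout(sorted_names):
    """fanout[b] = number of names whose first byte is <= b."""
    fanout = [0] * 256
    for name in sorted_names:
        fanout[name[0]] += 1
    for b in range(1, 256):
        fanout[b] += fanout[b - 1]
    return fanout

def find_offset(sorted_names, offsets, fanout, name):
    """Return the pack offset recorded for name, or None if it is not indexed."""
    first = name[0]
    lo = fanout[first - 1] if first > 0 else 0
    hi = fanout[first]
    # Binary search only within the slice of names sharing the first byte.
    i = bisect.bisect_left(sorted_names, name, lo, hi)
    if i < hi and sorted_names[i] == name:
        return offsets[i]
    return None

# Hypothetical pack contents: 20-byte object names mapped to byte offsets.
entries = {hashlib.sha1(text).digest(): off
           for text, off in [(b"one", 12), (b"two", 150), (b"three", 420)]}
sorted_names = sorted(entries)
offsets = [entries[n] for n in sorted_names]
fanout = build_fanout(sorted_names)

found = find_offset(sorted_names, offsets, fanout, hashlib.sha1(b"two").digest())
missing = find_offset(sorted_names, offsets, fanout, hashlib.sha1(b"four").digest())
print(found, missing)
```

Without such an index the only option is to scan the pack from the start, which still works but is slow for a single object.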
Updating a Branch Ref
A safer way to update a branch ref is to use the git update-ref command.
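Part of what makes this safer is that the update can be made conditional on the ref's current value. The Python below is only a sketch of that compare-then-write idea on a throwaway directory, not Git's implementation; the function name, the lock-file handling, and the placeholder object names are assumptions for illustration.

```python
import os
import tempfile

def update_ref(git_dir, ref, new_value, old_value=None):
    """Write new_value to ref, but only if the ref currently holds old_value."""
    if len(new_value) != 40 or any(c not in "0123456789abcdef" for c in new_value):
        raise ValueError("new value must be a 40-digit hex object name")
    path = os.path.join(git_dir, *ref.split("/"))
    current = None
    if os.path.exists(path):
        with open(path, encoding="ascii") as f:
            current = f.read().strip()
    if old_value is not None and current != old_value:
        raise ValueError(f"{ref} is at {current}, expected {old_value}")
    os.makedirs(os.path.dirname(path), exist_ok=True)
    # Write to a side file first, then rename it over the ref in one step.
    lock_path = path + ".lock"
    with open(lock_path, "w", encoding="ascii") as f:
        f.write(new_value + "\n")
    os.replace(lock_path, path)

first, second, third = "1" * 40, "2" * 40, "3" * 40  # placeholder object names

with tempfile.TemporaryDirectory() as git_dir:
    update_ref(git_dir, "refs/heads/master", first)
    update_ref(git_dir, "refs/heads/master", second, old_value=first)
    try:
        # Stale expectation: the ref has already moved on to `second`.
        update_ref(git_dir, "refs/heads/master", third, old_value=first)
        rejected = False
    except ValueError:
        rejected = True
    with open(os.path.join(git_dir, "refs", "heads", "master"), encoding="ascii") as f:
        final_value = f.read().strip()

print(rejected, final_value)
```

Writing the SHA into the ref file by hand offers neither the validation nor the check against a stale value.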