Showing posts with label Git. Show all posts

Sunday, 27 March 2016

Git - The Onion Model

Disclaimer: This post doesn't teach you git basics. This will help you understand how git works.

In this article we will look at Git's Onion Model - the layers that together make it a powerful Distributed Version/Revision Control System.

The four layers are -
4. Distributed Version Control System - Supports push, fetch, pull etc. operations
3. Version Control System - History, branches, merges, rebase etc.
2. Simple Content Tracker - Commits, versions (labels/tags)
1. Map -  key mapped to value persisted on disk




Layer 1 - Map
At its core, git is just a map persisted on disk: a table of keys and values, also called the object database.

The key is a SHA-1 hash. The value can be one of four types:
  • blob
  • tree
  • commit
  • tag
You can give any value to git and it will calculate its SHA1 hash.

E.g. using the plumbing command hash-object

$ echo "sibtain" | git hash-object --stdin 
64219700ea1c10634d4141fcd1c3f01163cb03d1

To persist this value, use the -w flag of hash-object.
Note: you need to execute the following command inside a git repository (use $ git init to initialize a new one).

$ echo "sibtain" | git hash-object --stdin -w
64219700ea1c10634d4141fcd1c3f01163cb03d1

To find where/how it is persisted in object store,

$ ls -l .git/objects/
total 12
drwxr-xr-x 2 sibtain sibtain 4096 Mar 27 11:34 64
drwxr-xr-x 2 sibtain sibtain 4096 Mar 27 11:34 info
drwxr-xr-x 2 sibtain sibtain 4096 Mar 27 11:34 pack

You see a directory named 64. These are the first two characters of the SHA-1. Inside that directory you will find a file whose name is the remaining 38 characters of the SHA-1. It is a binary file, because git stores the object in zlib-compressed form.

$ ls -l .git/objects/64
total 4
-r--r--r-- 1 sibtain sibtain 23 Mar 27 11:34 219700ea1c10634d4141fcd1c3f01163cb03d1


To get content I'll use another plumbing command cat-file -

$ git cat-file -p 64219700ea1c10634d4141fcd1c3f01163cb03d1
sibtain

To get type of the object
$ git cat-file -t 64219700ea1c10634d4141fcd1c3f01163cb03d1
blob

This explains the innermost layer of git - the map of key/value pairs and how it is persisted.
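The key computation can also be reproduced outside git. The following is a minimal Python sketch, assuming git's loose-object framing of "<type> <size>\0<content>" before hashing; the helper names git_blob_hash and object_path are made up for this illustration and are not git commands.

```python
import hashlib


def git_blob_hash(content: bytes) -> str:
    """Return the SHA-1 key git would assign to a blob with this content."""
    # git hashes a small header ("blob <size in bytes>\0") followed by the content
    header = b"blob " + str(len(content)).encode() + b"\0"
    return hashlib.sha1(header + content).hexdigest()


def object_path(sha1: str) -> str:
    """Map a 40-character key to its loose-object path inside .git/objects."""
    # the first two hex characters name the directory, the remaining 38 name the file
    return ".git/objects/" + sha1[:2] + "/" + sha1[2:]


# echo appends a newline, so the content that gets hashed is b"sibtain\n"
key = git_blob_hash(b"sibtain\n")
print(key)
print(object_path(key))
```

If the framing assumption holds, the printed key can be compared with the one reported by hash-object above. Note that the file name and permissions play no part in the key; only the content does.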

The directory structure of a nearly empty git repo is as follows.

$ tree -a
.
`-- .git
    |-- branches
    |-- config
    |-- description
    |-- HEAD
    |-- hooks
    |   |-- applypatch-msg.sample
    |   |-- commit-msg.sample
    |   |-- post-update.sample
    |   |-- pre-applypatch.sample
    |   |-- pre-commit.sample
    |   |-- prepare-commit-msg.sample
    |   |-- pre-rebase.sample
    |   `-- update.sample
    |-- info
    |   `-- exclude
    |-- objects
    |   |-- 64
    |   |   `-- 219700ea1c10634d4141fcd1c3f01163cb03d1
    |   |-- info
    |   `-- pack
    `-- refs
        |-- heads
        `-- tags
11 directories, 13 files

Layer 2 - Simple Content Tracker

A content tracking system needs a way to maintain versions and to commit checkpoints.

Here we will explore where/how a commit is persisted.

The following directory structure is committed.

$ tree
.
|-- city.lst
`-- city_profile
    |-- mumbai.txt
    `-- pune.txt
 
$ git log
commit 2130ce8e0f697af309e47ab1f0dc916fece0eb9a
Author: sibtain <sibtain@sibtain-linuxmint.(none)>
Date:   Sun Mar 27 17:09:25 2016 +0530

    Adds city details

commit 8408f82db302b32f02510a7afd1749210a3ab9bc
Author: sibtain <sibtain@sibtain-linuxmint.(none)>
Date:   Sun Mar 27 17:09:07 2016 +0530

    Adds City list

Let's focus on commit 2130ce8. Check what's in object database.

$ ls .git/objects
10  21  5e  64  84  8f  a6  ab  info  pack
 
$ ls -l .git/objects/21/
total 4
-r--r--r-- 1 sibtain sibtain 166 Mar 27 17:09 30ce8e0f697af309e47ab1f0dc916fece0eb9a

What is the type of this object?

$ git cat-file -t 2130ce8e0f697af309e47ab1f0dc916fece0eb9a
commit

OK. So it is a commit object. What does it contain?

$ git cat-file -p 2130ce8e0f697af309e47ab1f0dc916fece0eb9a
tree 8f7b3eb4e75d78e50dd9d37a8464c3855c1c190e
parent 8408f82db302b32f02510a7afd1749210a3ab9bc
author sibtain <sibtain@sibtain-linuxmint.(none)> 1459078765 +0530
committer sibtain <sibtain@sibtain-linuxmint.(none)> 1459078765 +0530

Adds city details

Therefore, a commit is a simple piece of text generated by git and stored as an object in the object database. It has a message, author/committer details with timestamps, and tree and parent references holding SHA-1 values.

Parent points to the previous commit. A 3-way merge commit has 2 parent entries, and the very first commit of a repository has none.

It also has a pointer to a tree. Let's explore that.

$ git cat-file -t 8f7b3eb4e75d78e50dd9d37a8464c3855c1c190e
tree
$ git cat-file -p 8f7b3eb4e75d78e50dd9d37a8464c3855c1c190e
100644 blob 8f4272c240a23d814ee963abcccf9f871aae9be8    city.lst
040000 tree a6ec82fc89c19390894fd7685d32b5124bb24516    city_profile

The tree object has 2 references: one to a blob (city.lst) and another to a tree (city_profile). The leading numbers are the mode (file type and permissions) of each entry, written in octal. File names and permissions are not stored in blobs; they are stored in the tree. A blob is just the file's content.

$ git cat-file -p 8f4272c240a23d814ee963abcccf9f871aae9be8
Mumbai
Pune

$ git cat-file -p a6ec82fc89c19390894fd7685d32b5124bb24516
100644 blob 1013a5511947260b727bd9f79946517121c682ef    mumbai.txt
100644 blob ab4f45300c9270dbb2ba92bc06c0a670271b8f33    pune.txt

Note: You can also use just the first few characters of a SHA-1 in any of these commands, as long as the prefix is unambiguous.

I'll add a new name to city.lst and then commit changes.

$ git log --oneline
1c57ddc Adds a city to list
2130ce8 Adds city details
8408f82 Adds City list

$ git cat-file -p 1c57ddc
tree 949bd0423891cee02f38c75ac8ec8623ea3f59ff
parent 2130ce8e0f697af309e47ab1f0dc916fece0eb9a
author sibtain <sibtain.masih@gmail.com> 1459079797 +0530
committer sibtain <sibtain.masih@gmail.com> 1459079797 +0530

Adds a city to list

$ git cat-file -p 949bd0
100644 blob 7dc571a82b903bbe28a391600ad9b2a68f752f62    city.lst
040000 tree a6ec82fc89c19390894fd7685d32b5124bb24516    city_profile

Observe that the SHA-1 of city_profile has not changed. So this commit points to the same object in the database for city_profile as the previous commit. Only one new blob is created and referenced, for city.lst.
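This reuse of unchanged objects falls out of content addressing. Here is a toy Python model of it, not git's real storage: the tree listing is a simplified text form rather than git's binary tree format, and the file contents of the profiles and the added city are placeholders chosen for the example.

```python
import hashlib

store = {}  # key (SHA-1 hex) -> stored bytes; stands in for .git/objects


def put(kind: str, body: bytes) -> str:
    """Store an object under the hash of its type, size and content."""
    data = kind.encode() + b" " + str(len(body)).encode() + b"\0" + body
    key = hashlib.sha1(data).hexdigest()
    store[key] = data  # storing identical content twice keeps a single entry
    return key


def put_tree(entries: dict) -> str:
    """Store a directory listing (simplified text form, not git's binary format)."""
    listing = "\n".join(name + " " + key for name, key in sorted(entries.items()))
    return put("tree", listing.encode())


# Placeholder contents, not the real files from the post
profile = put_tree({
    "mumbai.txt": put("blob", b"placeholder profile of Mumbai\n"),
    "pune.txt": put("blob", b"placeholder profile of Pune\n"),
})
root_before = put_tree({"city.lst": put("blob", b"Mumbai\nPune\n"), "city_profile": profile})
objects_before = len(store)

# Only city.lst changes; the city_profile entry reuses the same key
root_after = put_tree({"city.lst": put("blob", b"Mumbai\nPune\nDelhi\n"), "city_profile": profile})

print(root_before != root_after)    # the root tree gets a new key
print(len(store) - objects_before)  # number of new objects added by the change
```

In this model the second snapshot adds only a new blob for city.lst and a new root tree; everything under city_profile is shared between the two snapshots.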



To find how many objects are persisted in object database - 

$ git count-objects
12 objects, 48 kilobytes

The count of 12 breaks down as follows.
1 - blob object for demo of hash-objects text - "sibtain"
3 - commit objects
3 - tree objects as commit trees
2 - blob objects for city.lst
1 - tree object for directory city_profile
2 - blob objects for 2 files inside city_profile directory

So far we have discussed commits. Another feature of a Simple Content Tracker is versioning via tags or labels. A tag is a label for the current state of the project. Git supports two types of tags, viz.
  1. Lightweight 
  2. Annotated

Lightweight Tags

A lightweight tag just contains a SHA1 value as reference to a commit.

$ git tag lw-1.1

$ ls .git/refs/tags/
lw-1.1


$ cat .git/refs/tags/lw-1.1
1c57ddc8852ecfd621a35df5a93caf7c8f6987d6


$ git cat-file -t 1c57ddc
commit


Annotated Tags

Annotated tag comes with a message and creates an object in git's object db.

$ git tag -a 1.0 -m "Stable 1.0 version"

You  will find an entry for this tag in .git/refs/tags

$ ls -l .git/refs/tags/
total 4
-rw-r--r-- 1 sibtain sibtain 41 Mar 27 20:21 1.0

It contains a reference to a tag object in git's object database.

$ cat .git/refs/tags/1.0
14498a628e939bda2ec6d53032f944a6889c0ecd

The object starts with 14,

$ ls .git/objects/14
498a628e939bda2ec6d53032f944a6889c0ecd

What is the type of this object, and what does it contain?

$ git cat-file -t 14498a628e939bda2ec6d53032f944a6889c0ecd
tag

$ git cat-file -p 14498a628e939bda2ec6d53032f944a6889c0ecd
object 1c57ddc8852ecfd621a35df5a93caf7c8f6987d6
type commit
tag 1.0
tagger sibtain <sibtain.masih@gmail.com> Sun Mar 27 20:21:00 2016 +0530

Stable 1.0 version

It contains pointer to a commit object, tag name, tagger details with timestamp and message.

Another way to retrieve same information is by using the tag directly.

$ git cat-file -t 1.0
tag

$ git cat-file -p 1.0
object 1c57ddc8852ecfd621a35df5a93caf7c8f6987d6
type commit
tag 1.0
tagger sibtain <sibtain.masih@gmail.com> Sun Mar 27 20:21:00 2016 +0530

Stable 1.0 version

While branches move, tags don't. They stay with same object forever.

Just to revise, the four types of objects that git's object database can store are -
  • Blobs
  • Trees
  • Commits
  • Annotated Tags
You can think of git as - a high level file system built on top of a native file system.

Layer 3 - Version Control System

A version control system is just a single repository. It has history, branches, merges and tags.

History

References between commits are used to track history. All other references viz. commit to a tree, tree to another tree and tree to blob are used to track content of each commit.

Branches

Creating a branch creates a file in the .git/refs/heads directory. The file has the same name as the branch and contains the SHA-1 of the commit the branch points to.

$ git checkout -b villages

$ ls .git/refs/heads/
master 
villages

$ cat .git/refs/heads/villages
1c57ddc8852ecfd621a35df5a93caf7c8f6987d6

$ git cat-file -t 1c57dd
commit

How does git find the current branch?

The HEAD pointer contains reference to a file in .git/refs/heads which becomes the current branch.

$ cat .git/HEAD
ref: refs/heads/villages

When you make a new commit, the content of HEAD does not change. The villages branch pointer moves, and since HEAD points to villages, it looks as if HEAD has moved too.
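The lookup from HEAD to a commit can be sketched in a few lines of Python. This is an illustration under simplifying assumptions, not git's actual implementation: it only reads loose ref files and ignores .git/packed-refs, and it builds a throwaway directory that mimics the layout shown above instead of touching a real repository. The function name resolve_head is made up for the example.

```python
import os
import tempfile


def resolve_head(git_dir: str) -> str:
    """Follow .git/HEAD to the commit hash it ultimately refers to."""
    with open(os.path.join(git_dir, "HEAD")) as f:
        value = f.read().strip()
    if value.startswith("ref: "):
        # symbolic ref: HEAD names a branch file, and that file holds the commit hash
        ref_path = os.path.join(git_dir, *value[len("ref: "):].split("/"))
        with open(ref_path) as f:
            return f.read().strip()
    return value  # detached HEAD: the file holds a commit hash directly


# Build a throwaway directory that mimics the layout shown above
git_dir = os.path.join(tempfile.mkdtemp(), ".git")
os.makedirs(os.path.join(git_dir, "refs", "heads"))
with open(os.path.join(git_dir, "HEAD"), "w") as f:
    f.write("ref: refs/heads/villages\n")
with open(os.path.join(git_dir, "refs", "heads", "villages"), "w") as f:
    f.write("1c57ddc8852ecfd621a35df5a93caf7c8f6987d6\n")

print(resolve_head(git_dir))
```

A new commit would only rewrite the villages file; the HEAD file itself stays "ref: refs/heads/villages".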

Garbage collection in git looks for objects that cannot be reached, directly or indirectly, from a branch or a tag. Such objects are garbage collected. Since an object is a file in .git/objects/, garbage collection means removing the files of those objects.
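Reachability is a plain graph walk. The sketch below is a toy Python version, assuming a hand-written object graph with illustrative names; real git starts from more roots than branches and tags (the reflog, for example), so treat this as the idea only.

```python
def reachable(roots, references):
    """Collect every object id reachable from the given root ids."""
    seen = set()
    stack = list(roots)
    while stack:
        oid = stack.pop()
        if oid in seen:
            continue
        seen.add(oid)
        stack.extend(references.get(oid, []))
    return seen


# Toy object graph: each id maps to the ids it references (names are illustrative)
references = {
    "commit2": ["tree2", "commit1"],
    "commit1": ["tree1"],
    "tree2": ["blob_a", "blob_b"],
    "tree1": ["blob_a"],
    "orphan_commit": ["orphan_tree", "commit1"],
    "orphan_tree": ["blob_a", "blob_c"],
}
all_objects = set(references) | {o for refs in references.values() for o in refs}
roots = ["commit2"]  # e.g. the commit a branch or tag points to

garbage = all_objects - reachable(roots, references)
print(sorted(garbage))
```

blob_a survives even though the orphaned tree references it, because it is still reachable through commit2; only objects that nothing reachable points to become garbage.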

Rebase

See the post Git - Rebase to learn how to do rebasing in git.

There is a twist here. Remember that commits are database objects, and database objects are immutable.

When I do a rebase, the parent of one of the commits has to change to a new commit. Since the parent is part of the commit's content, changing it produces a different SHA-1, so the existing commit object cannot simply be edited.

Therefore, when we do a rebase, new commits are created as copies of the old ones: same changes and message, but a different parent and hence a different SHA-1. The same happens to every commit that follows, because each one refers to its new parent. The branch pointer is moved to the tip of the new commit chain. As the old commits become unreachable, they are eventually garbage collected, along with any trees and blobs that only they referenced.

Rebasing is an operation that creates new commits. Q.E.D.

History, Branches, Merges, Rebases - that's pretty much a Version Control System.

Layer 4 - Distributed Version Control System

To learn how to work with Git as a D-VCS, refer to my posts Git - Working with Remotes (1) and Git - Working with Remotes (2).

Few points to note here -

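.git/refs/remotes/ - right after a clone, the directory of a remote (e.g. .git/refs/remotes/origin/) contains only a reference to HEAD. As an optimization, the references to all other branches are kept in the .git/packed-refs file.

$ git show-ref master
shows all references (local + remote) whose name ends in master, e.g. refs/heads/master and refs/remotes/origin/master.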

That's it.. All four layers of the Git explained - This completes our Onion Model !!

Saturday, 23 January 2016

Git - Working with Remotes (2)

In this post we will see how to follow a git workflow and collaborate on project development.

Aim
Build a website for Imperial College of Engineering (ICE). (from 3 Idiots ;)

Team
unckle-bob [role = project owner, maintainer]
sibtainmasih [role = contributor]
+ many others (including you :)

Steps

1. unckle-bob creates a new repository on github.

Repository name = ice-website
Description = Dummy website project for Imperial College of Engineering from 3 Idiots
Select initialize this repository with README


2. sibtainmasih forks this repository.

Search for ice-website, go to the repository's page and click on Fork. He now has a fork under his name.

3. sibtainmasih clones his repository

$ git clone https://github.com/sibtainmasih/ice-website.git
Cloning into 'ice-website'...
remote: Counting objects: 4, done.
remote: Compressing objects: 100% (4/4), done.
remote: Total 4 (delta 0), reused 4 (delta 0), pack-reused 0
Unpacking objects: 100% (4/4), done.
Checking connectivity... done.


Then he sets config for user.name and user.email

4. sibtainmasih prepares skeleton file structure, commits changes and pushes them to his repo.

$ git log --oneline --decorate --all
77bfa0c (HEAD -> master) Add home file
59c9189 Add index file
a9780bf (origin/master, origin/HEAD) Initial commit

$ git push
Counting objects: 6, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (5/5), done.
Writing objects: 100% (6/6), 647 bytes | 0 bytes/s, done.
Total 6 (delta 1), reused 0 (delta 0)
To https://github.com/sibtainmasih/ice-website.git
   a9780bf..77bfa0c  master -> master



5. unckle-bob takes responsibility of adding courses.html page.

He clones his repository, commits a courses.html file and pushes to unckle-bob/ice-website.

sibtainmasih will not see these changes in his forked repository.

6. Meanwhile sibtainmasih adds few more commits and then creates a Pull Request (PR).

He clicks on Create pull request. It takes him to the Comparing changes page, which warns that the branches can't be merged automatically due to conflicts.


He continues, clicks on Create Pull Request, provides a title and details, and completes the request. It ends with a warning that the branch has conflicts which must be resolved by someone who has write access, i.e. unckle-bob.

Hold On! As a contributor it is your responsibility to resolve the conflict(s) before making a pull request. Therefore sibtainmasih does the following.

A) Configure a remote upstream to project's central repository -

$ git remote add upstream https://github.com/unckle-bob/ice-website.git

B) Fetch upstream -

$ git fetch upstream
remote: Counting objects: 12, done.
remote: Compressing objects: 100% (8/8), done.
remote: Total 12 (delta 4), reused 12 (delta 4), pack-reused 0
Unpacking objects: 100% (12/12), done.
From https://github.com/unckle-bob/ice-website
 * [new branch]      master     -> upstream/master


C) Check commit history -








D) Merge local master with upstream/master -

$ git merge upstream/master
Auto-merging README.md
CONFLICT (content): Merge conflict in README.md
Automatic merge failed; fix conflicts and then commit the result.


Check which file(s) have conflicts,

$ git status -s
UU README.md
A  courses.html
A  m_c_a.html


Resolve the conflict, stage the resolved file with $ git add README.md and then run $ git commit

Finally see merged commit history






E) Push changes

$ git push


F) And then create a PR. This time it shows a green "Able to merge" message.


7. unckle-bob reviews the PR and merges in main project repository.

This will show that there is no conflict and the pull request can be merged.

unckle-bob clicks on Merge pull request & the history is properly interlaced.


Remember - there is no auto-sync. You need to set up upstream and merge/resolve conflicts to make your PRs readily mergeable.

That's it ! This is how git helps to do development in collaboration with other team members.

Git - Working with Remotes (1)

Create a repository for your project named "demo-project" on github. Google will help if you don't know.

What is a Remote Repository?
Remote repository of a project is a git repository hosted on a network or Internet. E.g. github repository named "demo-project" which we just created. These repositories are created to enable collaboration among team members. Everyone will PUSH their code (after resolving conflicts if any) to the remote repository and others will get latest copy of project code with a PULL from remote repository. 

There can be two kinds of remote repositories for a project.
1. Central Remote Repo [Only 1]
2. Forked Remote Repo [1 per team member]

The central remote repo (CRR) of a project will be a read-only repository. All the team members will fork their own remote repos from CRR and will push their changes to forked remote repos (FRR). Once they are ready to merge code in CRR, they need to create a Pull Request (PR) from branch of their FRR to a branch of CRR. Then the code will be reviewed before merging into CRR and making it available to all the FRR of other team members. 

Note: There is an option to keep the FRR in sync with the CRR, so that if any pull requests are accepted in the CRR, the FRR also gets updated.

With backdrop set, let's dig deeper and understand how to work with remote repositories.

Understanding Git Remote

Scenario 1 
I already have a local git repository. How can I connect it to a remote repository and push my commits?

Solution
Ok. So I already have a git repository which I created using git init command and have my commits in it.

You can use following commands for preparation.

1. Create a project directory -
$ mkdir demo_project_repo
$ cd demo_project_repo/

2. Initialize it as git repository
$ git init

3. Set configs
$ git config --global user.name unckle-bob
$ git config --global user.email bob@gmail.com

4. Check branches (you will not find any branch till first commit).
$ git branch

5. Create index.html
$ vi index.html

6. Status will show index.html as untracked file
$ git status -s
?? index.html

7. Add index.html to staging and check status
$ git add index.html
$ git status -s
A  index.html

8. Commit snapshot
$ git commit -m "Add index.html of project"

(I have made one more commit - total 2)

9. Now check branch & you can see master branch created.
$ git branch
* master

10. -r flag is used to list remote branches
$ git branch -r 

11. -a flag is to list all branches (local + remote)
$ git branch -a
* master

12. To list all the remote repositories configured in the local repo. -v for verbose. (No output as we haven't configured any)
$ git remote -v

 Now I want to connect it to a remote git repository. On github you will find URL for the repo, copy that.



13. Add remote
Syntax - $ git remote add <alias> <url>
$ git remote add origin https://github.com/unckle-bob/demo-project.git

It is just a convention to give alias to central repo as origin. You can use any other alias.

14. Check remotes with -v (verbose) flag 
$ git remote -v
origin  https://github.com/unckle-bob/demo-project.git (fetch)
origin  https://github.com/unckle-bob/demo-project.git (push)

15. You can also see origin remote added in config file.
$ cat .git/config
[core]
        repositoryformatversion = 0
        filemode = false
        bare = false
        logallrefupdates = true
        symlinks = false
        ignorecase = true
        hideDotFiles = dotGitOnly
[user]
        name = unckle-bob
        email = vjtimca11@gmail.com
[remote "origin"]
        url = https://github.com/unckle-bob/demo-project.git
        fetch = +refs/heads/*:refs/remotes/origin/*

16. Let's push our master branch to origin. Note that master in following command specifies local branch.
$ git push -u origin master
Counting objects: 3, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (2/2), done.
Writing objects: 100% (3/3), 280 bytes | 0 bytes/s, done.
Total 3 (delta 0), reused 0 (delta 0)
To https://github.com/unckle-bob/demo-project.git
 * [new branch]      master -> master

17. The -u flag in the command of step 16 tells git to record the upstream mapping for the branch in config. Next time, a plain git push (or git pull) will automatically use the mapped remote branch.
$ cat .git/config
[core]
        repositoryformatversion = 0
        filemode = false
        bare = false
        logallrefupdates = true
        symlinks = false
        ignorecase = true
        hideDotFiles = dotGitOnly
[user]
        name = unckle-bob
        email = vjtimca11@gmail.com
[remote "origin"]
        url = https://github.com/unckle-bob/demo-project.git
        fetch = +refs/heads/*:refs/remotes/origin/*
[branch "master"]
        remote = origin
        merge = refs/heads/master

18. You can see commits pushed to remote repository.


19. To view remote branches from your local repo 
$ git branch -r
  origin/master

20. To view all branches i.e. local + remote
$ git branch -a
* master
  remotes/origin/master

21. Just as local branches are pointers to the SHA-1 values of commits, so are remote branches.
$ ls -l .git/refs/remotes/origin/
total 1
-rw-r--r-- 1 Usern 197121 41 Jan 21 09:02 master

$ cat .git/refs/remotes/origin/master
02de0b7e6dac98d27b61e31cb0d2722e768f0135
$ git log --oneline --decorate
02de0b7 (HEAD -> master, origin/master) Update index file
00ab6f6 Add index.html of project

$ git remote show origin
* remote origin
  Fetch URL: https://github.com/unckle-bob/demo-project.git
  Push  URL: https://github.com/unckle-bob/demo-project.git
  HEAD branch: master
  Remote branch:
    master tracked
  Local branch configured for 'git pull':
    master merges with remote master
  Local ref configured for 'git push':
    master pushes to master (up to date)



Scenario 2
I am starting with a new project & have created a repository on github. How do I clone it on my local system and continue with development?

Solution
In this case we use the clone command. It takes the remote repo URL and an optional directory name, which will be created for the local repository.

$ git clone <repo-url> [directory_name]

Scenario 3
I want to contribute to an open source project "django-rest-framework" hosted on github. How can I do that?

Solution

1. Go to central repo of the project on github and fork it.

2. Get URL of your fork of repository and execute following command on your system.

$ git clone https://github.com/unckle-bob/django-rest-framework.git drf

Now do your changes and push to your forked github repo.

3. To merge with central project repo raise a Pull Request (PR)

4. The project owner will review the PR and merge/close it.


Finally few more commands before closing this post.

To rename a remote alias -
$ git remote rename origin gitrepo

To delete a remote alias
$ git remote rm origin

To delete a remote branch
$ git push origin :remote_branch_name

What is the colon (:) magic in the last command? - When you do $ git push origin <local_branch_name>, git by default treats the remote branch name as the same as the local one, as if you had typed -
$ git push origin local_branch_name:remote_branch_name

By not providing any local branch name before the colon (:), we ask git to push "nothing" to the remote branch, which deletes it. An equivalent command is -
$ git push origin --delete remote_branch_name
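The local:remote form can be modelled with a tiny parser. This is a simplified Python sketch of the idea only: the function name parse_push_refspec is invented for the example, and it ignores other refspec features such as a leading + for forced updates.

```python
def parse_push_refspec(refspec: str):
    """Split a push refspec into (source, destination, action)."""
    src, sep, dst = refspec.partition(":")
    if not sep:
        # no colon: the same branch name is used on both sides
        dst = src
    # an empty source means there is nothing to push, so the destination is deleted
    action = "delete" if src == "" else "update"
    return src, dst, action


print(parse_push_refspec("master"))
print(parse_push_refspec("local_branch_name:remote_branch_name"))
print(parse_push_refspec(":remote_branch_name"))
```

Seen this way, the delete form is not a special command; it is the ordinary refspec with its left-hand side left empty.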

You can also force a push using the -f flag.

Git - Using diff command

In this post I am directly going to hit command line to demonstrate how $ git diff works.

Use Case 1
I changed a tracked file. Now I want to see the difference between the working directory and the repo version.

Solution
$ git diff HEAD index.html
diff --git a/index.html b/index.html
index 2a6a819..02b838e 100644
--- a/index.html
+++ b/index.html
@@ -3,6 +3,6 @@

        </head>
        <body>
-               This is index.html page. Adding some more content.
+               Incredible India
        </body>
 </html>

Use Case 2
I staged my changes. Now I want to see the difference between the staged and repo versions of a file.

Solution
In this case if you execute $ git diff there will be no results. Use --cached flag to compare staged and commit versions.

$ git diff --cached index.html

Use Case 3
I edited a staged file and now want to see the difference between the staged and working copies of the file.

Solution
$ git diff index.html
diff --git a/index.html b/index.html
index 02b838e..6b97ecf 100644
--- a/index.html
+++ b/index.html
@@ -4,5 +4,6 @@
        </head>
        <body>
                Incredible India
+               New Ambassadors - Big B & PC
        </body>
 </html>
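
The diff commands to use to compare the versions between any of the 3 git repository states are summarized below.
  • Working directory vs staging area: $ git diff
  • Staging area vs last commit (HEAD): $ git diff --cached
  • Working directory vs last commit (HEAD): $ git diff HEAD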



Use Case 4
Find difference made to a file between two commits.

Solution
I have an index.html file in my repo, with two commits.
$ git log --oneline
02de0b7 Update index file
00ab6f6 Add index.html of project

I want to find what is changed in index.html between old and current commit (HEAD).

$ git diff 00ab6f6..HEAD index.html
diff --git a/index.html b/index.html
index 0ce595f..2a6a819 100644
--- a/index.html
+++ b/index.html
@@ -3,6 +3,6 @@

        </head>
        <body>
-               This is index.html page.
+               This is index.html page. Adding some more content.
        </body>
 </html>

It shows a line removed (red) and a new line added (green) replacing old line.
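To get a feel for the unified diff format itself, the same change can be fed to Python's standard difflib module. This is only an illustration of the output format: difflib is not the algorithm git uses, and the three-line file below is a trimmed stand-in for index.html.

```python
import difflib

# Trimmed stand-ins for the old and new versions of index.html
old = ["<body>", "This is index.html page.", "</body>"]
new = ["<body>", "This is index.html page. Adding some more content.", "</body>"]

diff_lines = list(difflib.unified_diff(
    old, new, fromfile="a/index.html", tofile="b/index.html", lineterm=""
))
for line in diff_lines:
    print(line)
```

Lines starting with - exist only in the old version, lines starting with + exist only in the new one, and lines starting with a space are unchanged context.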

Note: The order of commits in command is important. First old commit then latest commit.

I want to see the words changed.
$ git diff --color-words 00ab6f6..HEAD index.html
diff --git a/index.html b/index.html
index 0ce595f..2a6a819 100644
--- a/index.html
+++ b/index.html
@@ -3,6 +3,6 @@

        </head>
        <body>
                This is index.html page. Adding some more content.
        </body>
</html>

It just shows me words added in green. If any content is removed from file then it will be shown in red.

You can use $ git diff with tree-ish i.e. SHA-1 values or branch names.

E.g.
1. To get difference between two branches -
$ git diff master..feature-home

2. To get differences between two commits with summary and stat-
$ git diff --summary --stat 02de0b7..18ddc0d
 home.html  | 1 +
 index.html | 4 +++-
 2 files changed, 4 insertions(+), 1 deletion(-)
 create mode 100644 home.html

You can use file name(s) in all the above commands to get difference made (if any) in a particular file or set of files.

Setup p4Merge as diff and merge tool

I found it difficult to understand the difference between versions of a file when using a simple console. A better approach is to use P4Merge Tool.

Click here to access the blog post I referred for configuring P4Merge as merge tool in git on Windows.

After download and install, I executed following commands in my git repo.

$ git config --global merge.tool p4merge
$ git config --global mergetool.p4merge.path "C:/Program Files/Perforce/p4merge.exe"

And then to launch P4Merge tool

$ git difftool 


Git - Rebase

Merging branches in git happens in one of the following two fashions.
  1. Fast Forward Merge: It occurs when the branch being merged into has no commits of its own since the two branches split, i.e. the commit history is linear. Git just moves the branch pointer forward and doesn't introduce a new commit.
  2. 3-Way Merge: When the commit histories of the branches have diverged, the last commits of each branch are merged by introducing a new commit.

What if you want to perform a Fast Forward merge with a diverged commit history? The solution is Rebase.
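Whether a merge can be fast-forwarded comes down to an ancestry check. Here is a small Python sketch of that check; the history used (master as commits 1-2-3-4, feature adding A and B on top of commit 2) and the names A2 and B2 for the rebased copies are assumptions for illustration.

```python
def ancestors(commit, parents):
    """Return the commit itself plus everything reachable through parent links."""
    seen = set()
    stack = [commit]
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            stack.extend(parents.get(c, []))
    return seen


def merge_kind(target, source, parents):
    """Classify what merging `source` into `target` would require."""
    if target in ancestors(source, parents):
        return "fast-forward"        # target only has to move forward to source
    if source in ancestors(target, parents):
        return "already up to date"  # target already contains source
    return "3-way"                   # histories diverged; a merge commit is needed


# Assumed history: master is 1-2-3-4, feature adds A and B on top of commit 2
parents = {"2": ["1"], "3": ["2"], "4": ["3"], "A": ["2"], "B": ["A"]}
print(merge_kind("4", "B", parents))

# After a rebase, copies of A and B (called A2 and B2 here) sit on top of commit 4
parents.update({"A2": ["4"], "B2": ["A2"]})
print(merge_kind("4", "B2", parents))
```

The first call reports a 3-way merge because neither tip is an ancestor of the other; once the feature commits sit on top of commit 4, the same merge becomes a fast forward.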

Illustrating a Rebase

Consider following diagram representing current state of repository's commit history.
Repo Commit History
Merging feature with master will invoke 3-way merge and introduce a new commit as follows.

3-Way Merge
 
To avoid a 3-way, I'll first perform rebase for feature -
$ git checkout feature
$ git rebase master
Repo Commit History - After Rebase
 It has changed the base of feature branch. Repo's commit history depicts that feature branch has been created from master's commit 4. Now a merge in master will be a fast forward merge.
$ git checkout master
$ git merge feature
Fast Forward Merge - After Rebase
We successfully performed rebasing of feature branch onto master. 

How does git perform a Rebase?

1. Find common ancestor of master and feature (i.e. 2) by rewinding HEAD of feature
2. Store all changes post commit 2 viz. A, B in a temp location
3. Make HEAD of feature point to master
4. Replay commits A, B on top of master
5. Make HEAD of feature point to tip of commits
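The steps above can be mimicked with a toy commit model in Python. It is a sketch under a strong simplification: a commit id here is a hash of just the parent id and a label for the change, whereas a real commit also hashes the tree, author, committer and message. The helper names commit_id and build are invented for the example.

```python
import hashlib


def commit_id(parent, change):
    """Toy commit id: a hash over the parent id and the change it introduces."""
    return hashlib.sha1((str(parent) + "\n" + change).encode()).hexdigest()[:7]


def build(base, changes):
    """Stack changes on top of base, returning the list of new commit ids."""
    ids = []
    tip = base
    for change in changes:
        tip = commit_id(tip, change)
        ids.append(tip)
    return ids


# master: 1-2-3-4, feature: A and B created on top of commit 2
master = build(None, ["1", "2", "3", "4"])
feature_changes = ["A", "B"]                         # step 2: the stored changes
feature_before = build(master[1], feature_changes)

# steps 3-5: replay the same changes on top of the tip of master
feature_after = build(master[-1], feature_changes)

print(feature_before)
print(feature_after)
```

The changes are the same before and after, but every replayed commit gets a new id because its parent differs, which is why a rebase always produces new commits.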

If there are any conflicts, you need to resolve them and continue with the rebase.
$ git rebase --continue

And if you find rebasing making a mess in the repo, you can abort it too.
$ git rebase --abort









Advanced Rebasing

Consider following as present state of repo history graph.

Now I want to merge ONLY the commits introduced by the client branch into master. What do you think of the following?
$ git checkout master
$ git merge client

NO! It will be a 3-way merge and will also have S1, S2, S3 along with C1, C2.


What we want is -

& following commands will help to achieve this.
$ git rebase --onto master server client
$ git checkout master
$ git merge client

Following are the steps performed for  
$ git rebase --onto master server client
  1. Checkout client branch
  2. Find common ancestor of server and client i.e. S3
  3. Store all commits on client post common ancestor in a temp location viz. C1, C2
  4. Make HEAD of client point to master
  5. Replay commits stored in temp location on top of master
  6. Make HEAD of client point to tip of the commits

A merge between master and server will again be 3-way. I can avoid that with a rebase, without checking out the server branch first.
Syntax - $ git rebase <base_branch> <topic_branch>
Command - $ git rebase master server


And now you can do a fast forward merge with master.
$ git checkout master
$ git merge server

Notice that the final commit graph does not represent the actual history of which branch was created from which commit. But rebasing gives a clean view of the commit history, and you don't lose your work at all.


Caution Notes:
1. Don't rebase commits which are not yours.
2. Don't rebase your own commits if you have pushed on server and someone has already pulled it in their repo.

Why Rebase

Rebase is not a mandatory action to perform in a usual git workflow. But you can use it to keep your commit history clean. Otherwise 3-way merges will keep on cluttering your commit history with extra commits. Only perform rebase for commits which you own and haven't shared with others.

Sunday, 3 January 2016

Git - log command

In this post I'll list the different flags for cruising through git commit logs of your repository.

To view complete details of each commit of current branch,
$ git log

To view only the last n commits,
$ git log -n <count>

To view one line log message with sha-1 values of last 5 commits,
$ git log --oneline -5
4b0deff Merge branch 'seo_title' into website
7aa136d Merge branch 'visiting_places' into website - 3 way
fec6334 Merge branch 'hotfix_correct_title' into website
3c32660 Adds last line to visiting places
63105a2 Corrects title and adds punch line

There is another option for oneline commits but it will print complete 40 characters of sha.
$ git log --format=oneline -2
4b0deff022231b7f2edf9941cbcf26b0df5a260d Merge branch 'seo_title' into website
7aa136da9c9aa3ad025dad0fee7d6e3bec48d3d2 Merge branch 'visiting_places' into website - 3 way

The other options which can be used with formatter flag are,
$ git log --format=<oneline|short|medium|full|fuller|email|raw>

You can also define your own pretty formats.

$ git log --pretty=format:"%h - %an, %ar : %s" -2
4b0deff - Alex, 7 days ago : Merge branch 'seo_title' into website
7aa136d - Alex, 7 days ago : Merge branch 'visiting_places' into website - 3 way

Following table taken from Pro Git summarizes different pretty formats.




To view log between two commits 
$ git log --oneline <sha-1_old_commit>..<sha-1_new_commit>
This logs all commits after sha-1_old_commit (excluded) up to sha-1_new_commit (included).

E.g.
$ git log --oneline 9bffbda..3c32660
3c32660 Adds last line to visiting places
cb10517 Improves list of visiting places in India
d821d77 Adds header to website
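The old..new range can be read as a set difference: everything reachable from the new commit minus everything reachable from the old one. The Python sketch below illustrates this for a single-parent history; the parent links between these commits are assumed to be linear for the example, and 9f76373 is assumed to be the parent of 9bffbda.

```python
def log_range(old, new, parents):
    """List commits reachable from `new` but not from `old` (single-parent history)."""
    def chain(commit):
        out = []
        while commit is not None:
            out.append(commit)
            commit = parents.get(commit)
        return out

    excluded = set(chain(old))
    return [c for c in chain(new) if c not in excluded]


# Assumed linear parent links between the commits from the example
parents = {
    "3c32660": "cb10517",
    "cb10517": "d821d77",
    "d821d77": "9bffbda",
    "9bffbda": "9f76373",
}
print(log_range("9bffbda", "3c32660", parents))
print(log_range("3c32660", "9bffbda", parents))
```

Swapping the two ends gives an empty list, since nothing reachable from the older commit is missing from the newer one; this is why the order of the commits in the range matters.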

You can also view the commits in which a particular file is changed.
$ git log --oneline header.html
4b0deff Merge branch 'seo_title' into website
63105a2 Corrects title and adds punch line
185c63d Updates header to perform SEO
d821d77 Adds header to website

To view what is changed in a file from a particular commit & before that, use -p (patch) option.
$ git log --oneline -p 63105a2 header.html
63105a2 Corrects title and adds punch line
diff --git a/header.html b/header.html
index bc4e3cf..0258179 100644
--- a/header.html
+++ b/header.html
@@ -1,5 +1,6 @@
 <html>
        <body>
-               <h1>India - Heaven on Mars</h1>
+               <h1>Incredible India - Heaven on Mars</h1>
+               India offers a different aspect of her personality – exotic, extravagant, elegant, eclectic -- to each traveller to the country.
        </body>
 </html>
d821d77 Adds header to website
diff --git a/header.html b/header.html
new file mode 100644
index 0000000..bc4e3cf
--- /dev/null
+++ b/header.html
@@ -0,0 +1,5 @@
+<html>
+       <body>
+               <h1>India - Heaven on Mars</h1>
+       </body>
+</html>

To view commits between a duration use following pair or individual flags with apt values.
  • before, after or
  • since, until 
$ git log --since='2014-12-06'
$ git log --since=2.weeks --until=3.days
$ git log --after='2015-12-06'
$ git log --oneline --after='2015-12-06' --before=1.weeks

To filter commits by author -
$ git log --oneline --author='author name'

To grep commit messages,
$ git log --oneline --grep='Merge'
4b0deff Merge branch 'seo_title' into website
7aa136d Merge branch 'visiting_places' into website - 3 way
fec6334 Merge branch 'hotfix_correct_title' into website

Note - The value of grep is case sensitive. Merge and merge are different.

This is a place where a good commit message will help you. You can use BugFix to find all the commits which are related to some bug fixing process.

This takes a regex. E.g. list all the commit messages which begin with Adds -
$ git log --oneline --grep='^Adds'
3c32660 Adds last line to visiting places
d821d77 Adds header to website
9bffbda Adds home page
9f76373 Adds Index file

If you are interested in stats surrounding changes in each commit,
$ git log --stat --summary --oneline
4b0deff Merge branch 'seo_title' into website
7aa136d Merge branch 'visiting_places' into website - 3 way
fec6334 Merge branch 'hotfix_correct_title' into website
3c32660 Adds last line to visiting places
 home.html | 1 +
 1 file changed, 1 insertion(+)
63105a2 Corrects title and adds punch line
 header.html | 3 ++-
......

The best option is to use -
$ git log --graph --oneline --all --decorate


To view changes in a commit,

$ git show HEAD --oneline
4b0deff Merge branch 'seo_title' into website

diff --cc header.html
index 0258179,b1635f6..bcadd56
--- a/header.html
+++ b/header.html
@@@ -1,6 -1,5 +1,6 @@@
  <html>
        <body>
-               <h1>Incredible India - Heaven on Mars</h1>
 -              <h1>India - Heaven on planet Earth</h1>
++              <h1>Incredible India - Heaven on Earth</h1>
 +              India offers a different aspect of her personality – exotic, extravagant, elegant, eclectic -- to each traveller to the country.
        </body>
  </html>

To view what has been changed in a particular file for a specific commit -
$ git show --oneline 3c32660 home.html
3c32660 Adds last line to visiting places
diff --git a/home.html b/home.html
index 26fdd05..9e0455a 100644
--- a/home.html
+++ b/home.html
@@ -9,5 +9,6 @@
                        <li>Hawa Mahal</li>
                        <li>Jantar Mantar</li>
                </ul>
+               Do visit Bengalure, Corbett National Park and Darjeeling.
        </body>
 </html>

I think this is sufficient to build awareness about git log, git show, and their switches and knobs.
