Git is a distributed version control system. What does "distributed" mean
here? Unlike centralized version control systems such as SVN, Git lets
developers work on a project without all of them being connected to a common
network or server.
As in other systems, Git maintains a repository, but it keeps it locally, and
the developer commits all changes to that local repository. Once the developer
decides the changes are ready to be shared, he pushes the commits from the
local repository to the remote (main) repository.
Most older version control tools follow a client-server approach. Git follows
a peer-to-peer approach: rather than a single, central repository with which
clients synchronize, each peer's working copy of the codebase is a complete
repository.
So every Git working directory on a machine is a full-fledged repository with
complete history and full version tracking capabilities, independent of
network access or a central server.
A Git repository contains two main data structures. One is the index (also
called the staging area or cache), which caches information about the working
directory and the next version to be committed. The other is the object
database.
Files that are added and committed to a Git repository are stored in the
object database. It uses the following kinds of objects:
1) A blob (binary large object), which stores the contents of a file.
2) A tree object, which holds the structure of a directory. It describes a
snapshot of the source tree and contains a list of file names, each with a
reference to the blob that has the file contents.
3) A commit object, which acts like a container: it points to the tree for a
particular version of the data being tracked by Git, together with
information about that commit such as its author, message and parent commits.
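To make the relationship between the three object kinds concrete, here is a
minimal sketch in Python. It is only an in-memory model, not Git's real storage
format, and the class names, the list_files helper and the file names are made
up for illustration.

```python
from __future__ import annotations
from dataclasses import dataclass


@dataclass
class Blob:
    content: bytes  # file contents only; a blob does not know its own name


@dataclass
class Tree:
    entries: dict  # maps a name to a Blob (file) or another Tree (directory)


@dataclass
class Commit:
    tree: Tree                    # snapshot of the whole source tree
    message: str
    parent: Commit | None = None  # previous commit, if any


def list_files(tree: Tree, prefix: str = "") -> list:
    """Walk a tree and return the paths of all files in the snapshot."""
    paths = []
    for name, obj in sorted(tree.entries.items()):
        if isinstance(obj, Tree):
            paths += list_files(obj, prefix + name + "/")
        else:
            paths.append(prefix + name)
    return paths


# Hypothetical project with one file at the top level and one in a folder.
root = Tree({
    "README": Blob(b"hello\n"),
    "src": Tree({"main.py": Blob(b"print('hi')\n")}),
})
first = Commit(tree=root, message="first commit")
print(list_files(first.tree))  # ['README', 'src/main.py']
```

Note how the file names live in the tree, not in the blob. That is why two
files with the same contents can share one blob.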
The index serves as the connection point between the object database and the
working tree.
Each of the above objects is identified by the SHA-1 hash of its contents. Git
computes the hash and uses the value as the object's name. The object is
stored in a directory named after the first two characters of its hash, and
the rest of the hash is used as the file name for that object.
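The naming scheme described above can be tried out with a few lines of Python.
This is a sketch, assuming the usual blob format where Git hashes a small
header ("blob", the size, and a zero byte) followed by the file contents. The
function names blob_name and loose_object_path are my own.

```python
import hashlib
from pathlib import PurePosixPath


def blob_name(content: bytes) -> str:
    """Return the SHA-1 name Git would give a blob with this content."""
    header = b"blob " + str(len(content)).encode() + b"\x00"
    return hashlib.sha1(header + content).hexdigest()


def loose_object_path(name: str) -> PurePosixPath:
    """First two hex characters form the directory, the rest the file name."""
    return PurePosixPath(".git/objects") / name[:2] / name[2:]


name = blob_name(b"")  # the empty file
print(name)                     # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
print(loose_object_path(name))  # .git/objects/e6/9de29bb2d1d6434b8b29ae775ad8c2e48c5391
```

Because the name depends only on the content, the same file always gets the
same name in every repository.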
Objects are compressed using zlib. Git can also combine many objects into
packfiles, which use delta compression to save further space. Git servers
running the native git:// protocol typically listen on TCP port 9418.
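The zlib step can be illustrated with Python's standard zlib module. This is
only a sketch of how a single blob could be compressed and read back, with
made-up sample content. It does not touch a real repository and does not cover
packfiles.

```python
import zlib

# Made-up file contents; repetitive text compresses well.
content = b"hello world\n" * 50

# What gets stored is a small header plus the content, compressed with zlib.
store = b"blob " + str(len(content)).encode() + b"\x00" + content
compressed = zlib.compress(store)

# Reading it back: decompress, then split the header from the body.
restored = zlib.decompress(compressed)
header, _, body = restored.partition(b"\x00")

print(header)                                      # b'blob 600'
print(len(store), "->", len(compressed), "bytes")
```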
Git also provides ways to clean up objects. Any object in the Git database
that is no longer referenced can be removed with the garbage collection
command (git gc), which Git also runs automatically from time to time. This is
possible because of the way commits, trees and blobs reference each other.
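The idea behind this cleanup can be sketched as a simple reachability walk.
The object names and the graph below are invented for illustration, and the
real git gc takes more into account (for example the reflog), so treat this as
a simplified model only.

```python
def reachable(objects: dict, roots: list) -> set:
    """Return every object that can be reached from the given starting points."""
    seen = set()
    stack = list(roots)
    while stack:
        name = stack.pop()
        if name in seen or name not in objects:
            continue
        seen.add(name)
        stack.extend(objects[name])  # follow references to other objects
    return seen


# Hypothetical object database: each object lists the objects it refers to.
objects = {
    "commit2": ["tree2", "commit1"],
    "commit1": ["tree1"],
    "tree2": ["blobA", "blobB"],
    "tree1": ["blobA"],
    "blobA": [],
    "blobB": [],
    # An abandoned commit that no branch points to any more.
    "commitX": ["treeX"],
    "treeX": ["blobC"],
    "blobC": [],
}
branches = {"main": "commit2"}

garbage = set(objects) - reachable(objects, list(branches.values()))
print(sorted(garbage))  # ['blobC', 'commitX', 'treeX']
```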
Why do we need Git? SVN vs Git
Many version control tools are already available in the market, so why do we
need to move to Git?
As said above, Git is distributed. This is the main difference.
Consider a case where you want to go back three years to look at some code. In
other tools this can be complex. The repository may be in a different location
that we cannot reach, or one we cannot commit to. And if you want to keep a
copy of your code in the meantime, you have to literally copy and paste it.
With Git, you do not have this problem. Your local copy is a repository, and
you can commit to it and get all the benefits of source control. When you
regain connectivity to the main repository, you can push your commits to it.
Some other differences include:
1) Git has a clean command. Every source control working copy collects extra
untracked files, and Git lets us remove them with git clean, a facility that
SVN lacked for a long time.
2) SVN (before version 1.7) creates .svn directories in every single folder,
while Git only creates one .git directory. With those SVN versions, every
script you write and every grep you run needs to ignore these .svn
directories.
3) You have to tell
SVN whenever you move or delete something. Git will just figure it out.
4) Ignore semantics: if you ignore a pattern (such as *.pyc) in Git, it is
ignored in all subdirectories. In SVN, the svn:ignore property applies only to
the directory it is set on.
5) Git tracks the content of files rather than just the files themselves.
6) Branches in Git are lightweight and easy to maintain.
7) It's distributed; basically every repository is a branch. It's much easier
to develop concurrently and collaboratively than with Subversion, in my
opinion. It also makes offline development possible.
8) The staging area is awesome: it allows you to see the changes you will
commit, commit only part of your changes, and more.
9) Git repositories are often much smaller on disk than Subversion ones.
There's only one ".git" directory, as opposed to dozens of ".svn" directories.
10) When we work with Subversion, we create working copies on the machine by
checking out a version. This represents a snapshot in time of what the
repository looks like. You update your working copy via updates, and you
update the repository via commits. With Git, we don't have just a snapshot but
the full codebase with its history.
11) If we want to check out code from three months ago, we don't need to
connect to the remote repository as in SVN, since in Git the history is
available locally.
12) SVN has a single point of failure. If the repository on the remote machine
fails, everything fails, including the history of the code base. In the case
of Git, every developer has his own repository, so there is no single point of
failure.
13) SSH with Git: other developers can access a Git repository on a
developer's machine over SSH, because every clone is a full repository. This
does not work in the case of SVN, where a working copy is not a repository.
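Going back to point 4 in the list above, the "ignored in all subdirectories"
behaviour can be sketched in Python. This is a simplified model that only
handles patterns without a slash by matching them against the file name at any
depth. Real .gitignore rules have more features, and the is_ignored helper and
the paths are made up.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath


def is_ignored(path: str, patterns: list) -> bool:
    """Match each pattern against the file name, whatever folder it is in."""
    name = PurePosixPath(path).name
    return any(fnmatchcase(name, pattern) for pattern in patterns)


# Hypothetical project files at several directory depths.
paths = ["app.py", "app.pyc", "pkg/util.pyc", "pkg/sub/deep.pyc", "pkg/util.py"]
kept = [p for p in paths if not is_ignored(p, ["*.pyc"])]
print(kept)  # ['app.py', 'pkg/util.py']
```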
More to come :)