File compression is very useful in day-to-day work. We use compression when storing or moving
large files: compressing a file shrinks it, which saves disk space and transfer time.
Many compression algorithms are available.
Once a file is compressed, it has to be uncompressed before its contents can be read normally again.
Linux provides several tools for compressing and uncompressing files.
Compressing a File (gzip)
To compress a file we can use the gzip ("GNU zip") tool. It takes as arguments
the names of one or more files to compress. Each output file has a '.gz'
extension. gzip uses Lempel-Ziv coding (LZ77).
gzip one
Here we compress the file "one". Once the command has run, one.gz appears
in the same location, and the original file is removed.
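To see this behaviour end to end, here is a minimal sketch. It assumes a scratch directory created with mktemp and a made-up sample file; only the file name "one" comes from the text above.

```shell
set -eu
# Work in a throwaway directory so nothing real is touched (assumption).
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

# Build a repetitive sample file named "one" (hypothetical contents).
i=0
while [ "$i" -lt 1000 ]; do
    echo "This is sample line $i"
    i=$((i + 1))
done > one

original_size=$(wc -c < one | tr -d ' ')

# Compress: gzip writes one.gz and removes the original file.
gzip one

compressed_size=$(wc -c < one.gz | tr -d ' ')
if [ -e one ]; then original_present=yes; else original_present=no; fi

echo "original:   $original_size bytes"
echo "compressed: $compressed_size bytes"
echo "original file still present: $original_present"
```

The exact sizes depend on the data, so the script only prints them rather than promising a particular ratio.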
Decompressing a File
To access the contents of a
compressed file, use gunzip to decompress it.
Like gzip, gunzip takes as an
argument the name of the file or files to work on. It expands the specified
files, writing the output to new files without the `.gz' extensions, and then
deletes the compressed files.
gunzip one.gz (or) gzip -d one.gz
This command expands the file
`one.gz' and puts it in a new file called `one'; gunzip then deletes the
compressed file, `one.gz'.
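A quick way to convince ourselves that nothing is lost is a round trip. The sketch below assumes a scratch directory and a small made-up file; the backup copy "one.orig" is only there for the comparison.

```shell
set -eu
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

# Sample file plus a reference copy to compare against later.
printf 'This is Jagadesh\nThis Is Kiran\n' > one
cp one one.orig

gzip one          # produces one.gz, removes one
gunzip one.gz     # same effect as: gzip -d one.gz

# After decompression the .gz file is gone and the content is unchanged.
if cmp -s one one.orig; then roundtrip=identical; else roundtrip=different; fi
if [ -e one.gz ]; then gz_present=yes; else gz_present=no; fi

echo "round trip: $roundtrip"
echo "one.gz still present: $gz_present"
```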
Multiple compressed files can be concatenated into one; gunzip then extracts
all members at once. For example:
gzip -c one1 > sam.gz
gzip -c one2 >> sam.gz
After this, gunzip -c sam.gz produces the same output as cat one1 one2
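The claim is easy to check. This is a small sketch under the assumption of a scratch directory and two tiny made-up input files; the names one1, one2 and sam.gz follow the example above.

```shell
set -eu
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

printf 'first file\n'  > one1
printf 'second file\n' > one2

# Each gzip -c call writes a complete gzip member; >> appends the second one.
gzip -c one1 >  sam.gz
gzip -c one2 >> sam.gz

# Decompress every member to stdout and compare with plain concatenation.
gunzip -c sam.gz > from_gunzip.txt
cat one1 one2    > from_cat.txt

if cmp -s from_gunzip.txt from_cat.txt; then joined=same; else joined=different; fi
echo "gunzip -c vs cat: $joined"
```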
Note: gunzip -c myfile.gz > myfile.txt
This uncompresses myfile.gz into myfile.txt without deleting the .gz file.
It is useful when you want to keep the compressed file alongside the
uncompressed copy.
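Here is a short sketch of that note in action. The scratch directory and the helper file "source.txt" are assumptions made for the demo; myfile.gz and myfile.txt are the names used above.

```shell
set -eu
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

# Prepare myfile.gz from a made-up source file.
printf 'keep me compressed too\n' > source.txt
gzip -c source.txt > myfile.gz

# Decompress to stdout and redirect; gunzip never touches myfile.gz itself.
gunzip -c myfile.gz > myfile.txt

if [ -f myfile.gz ] && [ -f myfile.txt ]; then both_present=yes; else both_present=no; fi
echo "both files present: $both_present"
```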
Bzip2 Compression
bzip2 compresses files using the Burrows-Wheeler block-sorting text
compression algorithm and Huffman coding. Compression is generally
considerably better than that achieved by the gzip command.
The syntax is the same as for gzip, but the extension is '.bz2'.
bzip2 one
Decompressing a File:
Decompress the file using either of
bzip2 -d one.bz2
bunzip2 one.bz2
NOTE: gzip vs. bzip2: bzip2 generally takes more time to compress and
decompress than gzip, but the resulting file is usually smaller.
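The trade-off is easy to measure on your own data. The following is a rough sketch, assuming bzip2 is installed (it exits quietly otherwise) and using a generated sample file; the numbers it prints depend entirely on the input, so treat them as an illustration rather than a benchmark.

```shell
set -eu
# Skip quietly when bzip2 is not installed on this machine.
command -v bzip2 >/dev/null 2>&1 || { echo "bzip2 not found; skipping"; exit 0; }

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

# Generated sample text (hypothetical data).
i=0
while [ "$i" -lt 5000 ]; do
    echo "This is sample line number $i"
    i=$((i + 1))
done > sample

# -c writes to stdout, so the original file is kept for both runs.
gzip  -c sample > sample.gz
bzip2 -c sample > sample.bz2

original_size=$(wc -c < sample     | tr -d ' ')
gzip_size=$(wc -c < sample.gz      | tr -d ' ')
bzip2_size=$(wc -c < sample.bz2    | tr -d ' ')

# Confirm that bunzip2 restores the exact original bytes.
if bunzip2 -c sample.bz2 | cmp -s - sample; then bz_roundtrip=identical; else bz_roundtrip=different; fi

echo "original: $original_size bytes"
echo "gzip:     $gzip_size bytes"
echo "bzip2:    $bzip2_size bytes"
echo "bzip2 round trip: $bz_roundtrip"
```

Prefixing each compression command with time gives a feel for the speed difference as well.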
Zip compression
Zip is one of the most widely supported compression formats across
operating systems. Files can be compressed with the zip command like this:
zip sam.zip one two
The syntax is a little different: zip {.zip-filename} {filenames-to-compress}.
With zip, the original files are not deleted once the compression is done.
Decompressing a File:
Decompress the file using
unzip sam.zip
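As a small sketch of the zip workflow: the script below assumes the zip and unzip commands are installed (it exits quietly if not), works in a scratch directory, and extracts into a hypothetical "extracted" directory so the originals are not overwritten.

```shell
set -eu
# Both tools are optional packages on many systems; skip when missing.
command -v zip   >/dev/null 2>&1 || { echo "zip not found; skipping"; exit 0; }
command -v unzip >/dev/null 2>&1 || { echo "unzip not found; skipping"; exit 0; }

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

printf 'contents of one\n' > one
printf 'contents of two\n' > two

# Archive name first, then the files to add; -q keeps the output quiet.
zip -q sam.zip one two

# Unlike gzip, zip leaves the original files in place.
if [ -f one ] && [ -f two ]; then originals_kept=yes; else originals_kept=no; fi

# Extract into a separate directory (-d) and compare with the originals.
mkdir extracted
unzip -q sam.zip -d extracted
if cmp -s one extracted/one && cmp -s two extracted/two; then
    unzip_result=identical
else
    unzip_result=different
fi

echo "originals kept: $originals_kept"
echo "extracted copies: $unzip_result"
```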
Viewing The Compressed contents
zcat: We can use the zcat command to view the contents of compressed files
without uncompressing them on disk. This is useful when we only want to read
a file, not change it.
zcat can also read gzip-compressed files that do not have a '.gz' extension.
Here is a scenario:
[root@vx111a test]# cat one
This is Jagadesh
This Is Kiran
This Is Tarun
This Is Jagan
This Is Madan
This Is Naren
This is Pavan
This Is Bhuvan
[root@vx111a test]# gzip one
[root@vx111a test]# mv one.gz one
[root@vx111a test]# gunzip one
gunzip: one: unknown suffix -- ignored
[root@vx111a test]# zcat one
This is Jagadesh
This Is Kiran
This Is Tarun
This Is Jagan
This Is Madan
This Is Naren
This is Pavan
This Is Bhuvan
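If the goal is to actually recover such a renamed file rather than just view it, one option is to feed it to gzip on standard input, where no file name (and so no suffix) is involved. This is a variation on the session above, not the zcat command itself; the scratch directory and the names expected.txt and restored.txt are assumptions for the demo.

```shell
set -eu
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

printf 'This is Jagadesh\nThis Is Kiran\nThis Is Tarun\n' > one
cp one expected.txt

gzip one
mv one.gz one        # compressed data, but the .gz suffix is gone

# Reading from stdin avoids the suffix check that makes "gunzip one" refuse.
gzip -dc < one > restored.txt

if cmp -s restored.txt expected.txt; then restored=same; else restored=different; fi
echo "restored content: $restored"
```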
zless & zmore: We can use the zless and zmore commands to page through the
contents of compressed files without uncompressing them. This is useful when
we only want to read a file, not change it.
[root@vx111a test]# zcat filename.gz | more
[root@vx111a test]# zcat filename.gz | less
(or)
[root@vx111a test]# zless filename.gz
[root@vx111a test]# zmore filename.gz
Searching inside the compressed file with zgrep / zegrep
Linux also provides utilities for searching inside compressed files: zgrep
and zegrep. They accept the same options as grep and egrep (for example -i
for a case-insensitive search), but they operate on compressed files.
[root@vx111a test]# zgrep -i pavan one.gz
This is Pavan
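Because zgrep passes its options through to grep, the usual grep switches work on compressed files too. A small sketch, assuming zgrep is installed (it ships with gzip on most Linux systems) and using a made-up three-line file:

```shell
set -eu
command -v zgrep >/dev/null 2>&1 || { echo "zgrep not found; skipping"; exit 0; }

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

printf 'This is Jagadesh\nThis Is Kiran\nThis is Pavan\n' > one
gzip one

# Case-insensitive search, as in the session above.
match=$(zgrep -i pavan one.gz)

# -c counts matching lines instead of printing them.
count=$(zgrep -c -i 'this is' one.gz)

# -E enables extended regular expressions (what zegrep does).
either=$(zgrep -E -c 'Kiran|Pavan' one.gz)

echo "match:  $match"
echo "count:  $count"
echo "either: $either"
```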
Comparison & Difference
We can compare and find
differences in the compressed files using zdiff / zcmp.
[root@vx111a test]# cat > file1
this is jagadesh
this is sam
[root@vx111a test]# cat file1
this is jagadesh
this is sam
[root@vx111a test]# cat > file2
this is jagadesh
this is ram
[root@vx111a test]# diff file1 file2
2c2
< this is sam
---
> this is ram
[root@vx111a test]# gzip file1
[root@vx111a test]# gzip file2
[root@vx111a test]# zdiff file1.gz file2.gz
2c2
< this is sam
---
> this is ram
[root@vx111a test]# zcmp file1.gz file2.gz
- /tmp/file2.xXFFTg7034 differ: byte 26, line 2
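In a script we usually care about the result rather than the printed diff. The sketch below recreates the two files from the session in a scratch directory (an assumption) and captures the outcome in variables; it assumes zdiff and zcmp are installed and exits quietly otherwise.

```shell
set -eu
command -v zdiff >/dev/null 2>&1 || { echo "zdiff not found; skipping"; exit 0; }
command -v zcmp  >/dev/null 2>&1 || { echo "zcmp not found; skipping"; exit 0; }

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

printf 'this is jagadesh\nthis is sam\n' > file1
printf 'this is jagadesh\nthis is ram\n' > file2
gzip file1 file2

# zdiff exits non-zero when the files differ, so guard it under "set -e".
diff_output=$(zdiff file1.gz file2.gz || true)

# zcmp succeeds only when the uncompressed contents are byte-identical.
if zcmp file1.gz file2.gz >/dev/null 2>&1; then same=yes; else same=no; fi

printf '%s\n' "$diff_output"
echo "contents identical: $same"
```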
File Archives
An archive is a single file that contains a collection of other files, and
often directories. Archives are usually used to transfer or back up a
collection of files and directories; this way, you can work with only one
file instead of many. This single file can easily be compressed as explained
in the previous section, and the files in the archive retain the structure
and permissions of the original files.
Tar Ball
Linux provides a utility called 'tar' which can be used to create, list and
extract files from archives. The extension is '.tar'.
* Creating Archives: Creating an archive of files.
* Listing Archives: Listing the contents of an archive.
* Extracting Archives: Extracting the files from an archive.
Creating a File Archive: A file archive is created using,
[root@vx111a test]# tar -vcf sam.tar file1.gz file2.gz
file1.gz
file2.gz
This command creates an archive file called `sam.tar' containing `file1.gz'
and `file2.gz'. The original files remain unchanged.
To compress the archive as well, the syntax looks like,
tar -zcvf {.tgz-file} {files} : To compress files using gzip
tar -jcvf {.tbz2-file} {files} : To compress files using bzip2
The `-z' option compresses the archive as it is being written. This yields
the same contents as creating an uncompressed archive and then using gzip to
compress it, but it eliminates the extra step.
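The one-step and two-step routes can be compared directly. This is a sketch under a few assumptions: a scratch directory, two small made-up input files, and the archive names sam.tgz and two_step.tar.gz, which are only illustrative. It compares the member listings rather than the raw bytes, since the gzip header can differ between the two routes.

```shell
set -eu
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

printf 'this is jagadesh\nthis is sam\n' > file1
printf 'this is jagadesh\nthis is ram\n' > file2

# One step: create and gzip-compress while writing.
tar -zcf sam.tgz file1 file2

# Two steps: create a plain archive, then compress a copy of it.
tar -cf sam.tar file1 file2
gzip -c sam.tar > two_step.tar.gz

# Both compressed archives should list the same members.
one_step_listing=$(tar -ztf sam.tgz)
two_step_listing=$(tar -ztf two_step.tar.gz)

# tar never removes the files it archives.
if [ -f file1 ] && [ -f file2 ]; then originals_kept=yes; else originals_kept=no; fi

echo "one step: $one_step_listing"
echo "two step: $two_step_listing"
echo "originals kept: $originals_kept"
```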
To list the contents of a tar archive without extracting them, use tar with
the `-t' option.
[root@vx111a test]# tar -tvf sam.tar
-rw-r--r-- root/root 10240 2012-07-04 15:09:33 file1.gz
-rw-r--r-- root/root 48 2012-07-04 15:03:07 file2.gz
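Listing is also handy in scripts, for instance to check that a member exists before trying to extract it. A minimal sketch, assuming a scratch directory and placeholder files standing in for file1.gz and file2.gz:

```shell
set -eu
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

# Placeholder members (their content does not matter for listing).
printf 'a\n' > file1.gz
printf 'b\n' > file2.gz
tar -cf sam.tar file1.gz file2.gz

# -t without -v prints one member name per line, which is easy to parse.
member_count=$(tar -tf sam.tar | wc -l | tr -d ' ')

# grep -x matches whole lines only, so "file2.gz" must be an exact member name.
if tar -tf sam.tar | grep -qx 'file2.gz'; then has_file2=yes; else has_file2=no; fi
if tar -tf sam.tar | grep -qx 'file3.gz'; then has_file3=yes; else has_file3=no; fi

echo "members: $member_count"
echo "file2.gz in archive: $has_file2"
echo "file3.gz in archive: $has_file3"
```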
Extracting Files from an Archive
To extract (or unpack) the contents of a tar archive, use tar with the `-x'
("extract") option.
tar -zxvf {.tgz-file} {files} : To uncompress files using gzip
tar -jxvf {.tbz2-file} {files} : To uncompress files using bzip2
[root@vx111a test]# tar -xvf sam.tar
file1.gz
file2.gz
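Extraction writes into the current directory and overwrites files of the same name, so it is often safer to unpack somewhere else first. The sketch below assumes a scratch directory, two made-up input files and a hypothetical "restore" directory, and checks that the extracted copies match the originals.

```shell
set -eu
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

printf 'this is jagadesh\nthis is sam\n' > file1
printf 'this is jagadesh\nthis is ram\n' > file2
tar -zcf sam.tgz file1 file2

# Unpack inside a separate directory; the subshell keeps our own cwd unchanged.
mkdir restore
(cd restore && tar -zxf ../sam.tgz)

if cmp -s file1 restore/file1 && cmp -s file2 restore/file2; then
    extracted=identical
else
    extracted=different
fi

# Extraction does not consume the archive.
if [ -f sam.tgz ]; then archive_kept=yes; else archive_kept=no; fi

echo "extracted copies: $extracted"
echo "archive kept: $archive_kept"
```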
Some additional examples
tar -tf archive.tar : Show the contents of the archive
tar -xvf archive.tar -C /tmp : Extract a tarball into /tmp
tar cvfj archive_name.tar.bz2 dirname : Create a bzip2-compressed tar archive
tar tvf archive_name.tar : List an archive
tar xvf archive_file.tar /path/file : Extract a single file from a tar archive (the path must match the member name exactly as tar -tf lists it)
tar xvfz archive_file.tar.gz /path/file : Extract a single file from a gzip-compressed archive (be careful with the order of the arguments)
tar xvfj archive_file.tar.bz2 /path/file : Extract a single file from a bzip2-compressed archive (be careful with the order of the arguments)
tar xvf archive_file.tar /path/to/dir/ : Extract a single directory
tar xvf archive_file.tar /path/dir1/ /path/dir2/ : Extract multiple directories
tar xvf archive_file.tar --wildcards '*.pl' : Extract all the files with a .pl extension
tar rvf archive_name.tar newfile : Add a file to an existing tar file
tar -cf - /directory/to/archive/ | wc -c : Estimate the size of a tar archive (in bytes) without writing it to disk
tar --delete -f sam.tar ./sample : Delete a file from the tar archive
tar --wildcards --delete -f sam.tar './sam*' : Delete similar files from the tar archive using wildcards
tar --list --verbose --file=music.tar practice : List the files under the directory `practice' in the archive file `music.tar'
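A few of these can be tried together on a throwaway tree. The following sketch assumes a scratch directory and a made-up "project" directory; it sticks to options that both GNU tar and bsdtar accept (single-member extraction with -C, appending with -r, and sizing via a pipe), and leaves out --delete and --wildcards because those are GNU tar specific.

```shell
set -eu
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

# Hypothetical directory tree to archive.
mkdir -p project/docs project/src
printf 'readme\n'      > project/docs/readme.txt
printf 'print "hi";\n' > project/src/hello.pl
tar -cf archive.tar project

# Extract a single member into another directory. The member name must
# match what "tar -tf archive.tar" prints.
mkdir out
tar -xf archive.tar -C out project/src/hello.pl

# Append a new file to the existing (uncompressed) archive.
printf 'added later\n' > newfile
tar -rf archive.tar newfile
if tar -tf archive.tar | grep -qx 'newfile'; then appended=yes; else appended=no; fi

# Estimate the archive size by streaming it to wc instead of to disk.
stream_size=$(tar -cf - project | wc -c | tr -d ' ')

echo "appended newfile: $appended"
echo "streamed size: $stream_size bytes"
```

Appending with -r only works on uncompressed archives, which is why the example uses a plain .tar file.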
Happy Learning :)