Saturday, March 13, 2021

Storage Basics - Object, File and Block

In simple terms, object storage is a data storage mechanism that manages data as units called objects. This contrasts with file storage, where data is organized as a hierarchy of files and folders, and block storage, which manages data as blocks within sectors and tracks. In this article, we will look at each of these storage mechanisms in more detail and see how they work.


Object Storage

In object storage, data is managed as units called objects. Each object bundles the data itself with a unique identifier and some amount of metadata.


Data : The data can be anything that makes up the object, from a single audio file to an entire photo album.


Identifier : Data that is added to object storage gets a unique identifier, typically a UUID ( Universally unique identifier ), also known as a GUID ( Globally unique identifier ). Such an identifier is 128 bits long. These identifiers are very useful when looking up data in a large object store.
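
The 128-bit size can be checked with Python's standard uuid module. This is only an illustrative sketch: the variable names are made up for the example, and a real object store may generate or accept its identifiers differently.

```python
import uuid

# Generate a random (version 4) UUID, the kind of identifier described above.
object_id = uuid.uuid4()

# A UUID is stored as 16 bytes, which is 128 bits.
id_size_bits = len(object_id.bytes) * 8

print(object_id)      # a different value on every run
print(id_size_bits)   # 128
```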


Metadata : This is data about the data, attached to the original data. It can be any information used to classify or identify the data, so metadata can be thought of as labels for the data.


Advantages & disadvantages : The primary advantage of object storage is the huge amount of data it can hold. Even though the data is unstructured, it remains easy to access. This scale is possible because of the flat structure: objects are addressed by their identifiers instead of through a hierarchical structure like the one file storage uses.


The data is also easy to find through the metadata attached to it. This metadata is customizable and can be extended, which makes the data stored in object storage easier to access.
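
To make the three parts of an object concrete, here is a minimal in-memory sketch. ToyObjectStore and its put, get and find methods are hypothetical names invented for this example; they are not the API of Amazon S3, Azure Blob Storage or any other product. The sketch only shows the idea of a flat namespace keyed by identifier, with lookups driven by metadata.

```python
import uuid


class ToyObjectStore:
    """Hypothetical in-memory object store: a flat map of ID -> (data, metadata)."""

    def __init__(self):
        self._objects = {}  # flat namespace: no folders, only identifiers

    def put(self, data: bytes, **metadata) -> str:
        """Store the data with its metadata and return the new object's ID."""
        object_id = str(uuid.uuid4())
        self._objects[object_id] = (data, dict(metadata))
        return object_id

    def get(self, object_id: str) -> bytes:
        """Fetch an object's data directly by its identifier."""
        return self._objects[object_id][0]

    def find(self, **criteria) -> list:
        """Return the IDs of all objects whose metadata matches every criterion."""
        return [
            oid
            for oid, (_, meta) in self._objects.items()
            if all(meta.get(key) == value for key, value in criteria.items())
        ]


store = ToyObjectStore()
song_id = store.put(b"...audio bytes...", kind="audio", artist="Example Artist")
photo_id = store.put(b"...image bytes...", kind="photo", album="Holiday")

# Objects can be located by their labels, without knowing any path.
audio_ids = store.find(kind="audio")
```

Both objects live side by side in one dictionary, which mirrors the flat structure described above: adding more objects never requires reorganizing a directory tree.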


Backup and Archiving : Since the data is unstructured, backing it up and archiving it is easy and fast.


Advantages for object storage include:

Greater data analytics : Since the data is described by metadata and a deep level of classification, it is well suited to analysis

Infinite scalability : Capacity can keep growing as data is added, with no practical limit

Faster data retrieval : With rich labels and metadata attached to the data and no hierarchy to traverse, data can be located quickly

Reduction in cost : The cost of storing data is low

Optimization of resources : Since large amounts of data can be stored without fixed limits, resources can be used efficiently


Data that benefits the most from object storage includes:

Unstructured data such as music, images, and videos

Backup and log files

Large sets of historical data

Archived files

 

Tools available : Amazon S3, Microsoft Azure Blob Storage


File Storage

File storage is the simplest storage available to us. In file storage, the data is stored in files. These files in turn are organized in folders or directories in a hierarchical fashion. To access a file, users or machines only need the path from directory to subdirectory to folder to file.
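
The path-based access described above can be sketched with Python's standard pathlib and tempfile modules. The folder and file names (projects, reports, summary.txt) are made up for the example.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as root:
    # Build a small hierarchy: <root>/projects/reports/summary.txt
    report_dir = Path(root) / "projects" / "reports"
    report_dir.mkdir(parents=True)
    report_file = report_dir / "summary.txt"
    report_file.write_text("quarterly numbers")

    # The path alone is enough to locate and read the file again.
    content = report_file.read_text()
    relative_parts = report_file.relative_to(root).parts

print(content)          # quarterly numbers
print(relative_parts)   # ('projects', 'reports', 'summary.txt')
```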


Advantages & disadvantages : Since data is stored in files and folders, it is quite easy to organise and access. The major advantages of file storage are sharing and security. File storage can be shared with multiple people, and permissions can be set on individual files and folders. This helps with sharing, securing, collaborating on and accessing files and folders.


The primary disadvantage of file storage is that, as the amount of data in files and folders grows, sharing, permissions and security become very complex to handle. The more file storage there is, the more complex things get.


In contrast to block storage, a system with file storage does not take the data of a file apart. The file is stored as a whole and retrieved in the same form. Besides built-in hard drives, there are two other types of file-level storage:

Network Attached Storage ( NAS ) : A storage system connected to a network and available to all participants of the network.


Direct Attached Storage ( DAS ) : A storage system directly connected to the computer, for example in the form of an external drive.

The other major advantage is that the storage drives are inexpensive. If we want more storage, we can attach external drives or network drives as file systems to the current machine. The pricing of these drives is comparatively low.


The major disadvantage is the rising complexity of growing file systems: as a file system grows, managing it becomes harder. File storage is a good fit when we need:

Local file sharing

Centralized file collaboration

Archiving/storing

Backup/disaster recovery


Tools Available : Amazon Elastic File System, Azure Files


Block Storage

The final storage type is block storage, currently a favorite for many cloud-based applications. In this type, data is broken into pieces called blocks and then stored across a system that can be physically distributed to maximize efficiency. Data is stored in fixed-size blocks, and each block has a unique address that serves as the metadata identifying it, which allows the storage system to put the blocks back together when the data is needed.
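
The split-and-reassemble idea can be sketched in a few lines. BLOCK_SIZE, split_into_blocks and reassemble are hypothetical names for this example, and the 4-byte block size is deliberately tiny; real systems use much larger blocks and usually pad the final one.

```python
BLOCK_SIZE = 4  # bytes; tiny for illustration only


def split_into_blocks(data: bytes, block_size: int = BLOCK_SIZE) -> dict:
    """Map each block address (0, 1, 2, ...) to a fixed-size chunk of the data."""
    return {
        address: data[offset:offset + block_size]
        for address, offset in enumerate(range(0, len(data), block_size))
    }


def reassemble(blocks: dict) -> bytes:
    """Put the blocks back together in address order."""
    return b"".join(blocks[address] for address in sorted(blocks))


blocks = split_into_blocks(b"hello block storage")
restored = reassemble(blocks)

print(blocks[0])   # b'hell'
print(restored)    # b'hello block storage'
```

The address is the only thing stored alongside each block, which is why the blocks can live in different places and still be reassembled correctly.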


Advantages & disadvantages

The major advantage of this type of storage is the ability to quickly retrieve and modify data that is spread across locations. Block storage divides the data into blocks and spreads them across different environments, creating multiple paths to the data. This helps in retrieving the data faster when required. When a user requests data, the underlying operating system gathers the blocks, reassembles them into one piece of data and hands that to the application. The server operating system is responsible for dividing storage into fixed-size blocks, spreading them across different environments and reassembling them when needed. The reassembly uses the unique address of each block.

Protocols such as Fibre Channel over Ethernet ( FCoE ) and Internet Small Computer System Interface ( iSCSI ) are used to access block storage. They are commonly used in a storage area network ( SAN ), where high performance with high I/O and low latency is required.
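
Because every block has a fixed size, the position of a block follows directly from its address, so a single block can be read or overwritten without touching the rest of the data. The sketch below uses an in-memory io.BytesIO buffer as a stand-in for a real block device. The names volume, read_block and write_block are hypothetical, and this is not how FCoE or iSCSI are actually driven.

```python
import io

BLOCK_SIZE = 8  # bytes; tiny for illustration only

# An in-memory stand-in for a block device holding three blocks.
volume = io.BytesIO(b"A" * BLOCK_SIZE + b"B" * BLOCK_SIZE + b"C" * BLOCK_SIZE)


def read_block(device, address: int) -> bytes:
    """Jump straight to the block's offset and read exactly one block."""
    device.seek(address * BLOCK_SIZE)
    return device.read(BLOCK_SIZE)


def write_block(device, address: int, payload: bytes) -> None:
    """Overwrite one block in place; the other blocks are left untouched."""
    if len(payload) != BLOCK_SIZE:
        raise ValueError("payload must be exactly one block long")
    device.seek(address * BLOCK_SIZE)
    device.write(payload)


write_block(volume, 1, b"X" * BLOCK_SIZE)

print(read_block(volume, 0))   # b'AAAAAAAA'
print(read_block(volume, 1))   # b'XXXXXXXX'
```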


The primary disadvantage is the limited ability to add metadata to the blocks. The other challenge is that, unlike file storage, block storage typically cannot be accessed by multiple participants at the same time.


Tools Available : Azure Managed Disks, Amazon Elastic Block Store ( EBS )


Hope this helps you understand the basic storage mechanisms.