Why a distributed database is used & types of distributed data
Many companies have left behind centralized databases in favor of distributed databases (in which the database, as its name implies, is distributed throughout an array of servers in various locations), for a variety of reasons. Let’s look at some of the basic advantages of distributed databases, a typical scenario in which they are used, and the different formats in which data is distributed throughout the system.
Why distributed databases are becoming increasingly popular
Here are the basic reasons why the centralized model is being left behind by many organizations in favor of database distribution:
- Reliability – Building an infrastructure is similar to investing: diversify to reduce your chances of loss. Specifically, if a failure occurs in one area of the distribution, the entire database does not experience a setback.
- Security – You can give permissions to single sections of the overall database, for better internal and external protection.
- Cost-effectiveness – Bandwidth costs go down because users access remote data less frequently.
- Local access – As with reliability above, if there is a failure in the wider network, you can still get access to your portion of the database.
- Growth – If you add a new location to your business, it’s simple to create an additional node within the database, making distribution highly scalable.
- Speed & resource efficiency – Most requests and other interactivity with the database are performed at a local level, also decreasing remote traffic.
- Responsibility & containment – Because any glitches or failures occur locally, the issue is contained and can potentially be handled by the IT staff designated to handle that piece of the company.
Who uses distributed databases?
Distributed databases are often used by organizations that have numerous offices or storefronts in different geographical locations. Typically an individual branch interacts primarily with the data that pertain to its own operations, with much less frequent need for general company data.
In that case, the branches need central information only occasionally. However, the home office of the company still must have a steady influx of information from every location.
To solve that issue, a distributed database usually operates by allowing each location of the company to interact directly with its own database during work hours. Then, during non-peak times each day, the central database at the home office receives a batch of data from each branch.
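The daytime-local, off-peak-batch pattern described above can be sketched roughly as follows. This is only an illustration under assumptions: `BranchStore`, `record_sale`, `drain`, and `nightly_sync` are hypothetical names, and plain in-memory lists stand in for real databases.

```python
from dataclasses import dataclass, field


@dataclass
class BranchStore:
    """Hypothetical local database for one branch (an in-memory stand-in)."""
    name: str
    pending: list = field(default_factory=list)  # records not yet sent to the home office

    def record_sale(self, sku: str, amount: float) -> None:
        # Daytime work touches only the branch's own store.
        self.pending.append({"branch": self.name, "sku": sku, "amount": amount})

    def drain(self) -> list:
        # Hand over the day's records and clear the local queue.
        batch, self.pending = self.pending, []
        return batch


def nightly_sync(branches, central: list) -> int:
    """Append each branch's batch to the central store; return the number of records moved."""
    moved = 0
    for branch in branches:
        batch = branch.drain()
        central.extend(batch)
        moved += len(batch)
    return moved
```

A scheduler would call `nightly_sync` once per day during non-peak hours; calling it again before any new sales are recorded moves nothing.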
Types of distributed data
Distributed data can be divided into five basic types, as outlined below:
Replicated data – Replication of data is used to create additional instances of data in different parts of the database. Using this tactic, a distributed database can avoid excessive traffic because the identical data can be accessed locally.
This form of data is subdivided into two different types: read-only and writable data. With read-only versions, revisions are made only to the first instance, and the replications are then adjusted accordingly. Writable versions can be adjusted directly, which then immediately changes the first instance, with various configurations for how and when all other replications throughout the system receive the update.
In this type of system, update timing can be configured based on how current the data must be at each location, whether moment by moment or over a longer period. Note that replication is especially valuable when you do not need revisions to appear throughout the system in real time.
This type of data makes it easy to restore any section of the larger database from another section if its data is compromised by any type of error. Be aware, though, that with replication, collisions can occur when the same data is changed in more than one place. Safeguards must be in place to prevent or resolve them.
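To make the idea of a collision between writable replicas concrete, here is a minimal sketch. The per-key version counter is an assumption chosen for brevity, not a method the article prescribes, and `Replica` and `merge` are hypothetical names.

```python
class Replica:
    """Hypothetical writable replica mapping key -> (version, value)."""

    def __init__(self):
        self.data = {}

    def write(self, key, value):
        # Each local change bumps the key's version counter.
        version = self.data.get(key, (0, None))[0] + 1
        self.data[key] = (version, value)

    def read(self, key):
        return self.data[key][1]


def merge(a, b):
    """Reconcile two replicas in place and return the keys that collided.

    The higher version wins. Equal versions with different values are
    reported as collisions and left untouched for a safeguard to resolve.
    A plain counter like this misses some concurrent edits, so treat it
    as an illustration only.
    """
    collisions = []
    for key in set(a.data) | set(b.data):
        va, vb = a.data.get(key), b.data.get(key)
        if va is None:
            a.data[key] = vb
        elif vb is None:
            b.data[key] = va
        elif va[0] > vb[0]:
            b.data[key] = va
        elif vb[0] > va[0]:
            a.data[key] = vb
        elif va[1] != vb[1]:
            collisions.append(key)
    return sorted(collisions)
```

If two branches both change the same key between synchronizations, `merge` returns that key instead of silently picking a winner, which is the point at which a prevention or resolution policy would step in.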
Horizontally fragmented data – This category of data distribution splits the database by record: each record, identified by its primary key, is stored in one fragment. Horizontal fragmentation is commonly used for situations in which specific locations of a business usually need access only to the records pertaining to their own branch.
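As a rough illustration, horizontal fragmentation can be pictured as sorting whole records into per-branch groups. The function name `fragment_horizontally` and the `branch` field are assumptions made for this sketch.

```python
from collections import defaultdict


def fragment_horizontally(rows, key="branch"):
    """Group whole records by the value of `key`; each record lands in exactly one fragment."""
    fragments = defaultdict(list)
    for row in rows:
        fragments[row[key]].append(row)
    return dict(fragments)
```

Every fragment keeps all fields of its records, so a branch can work with its own fragment without contacting any other location.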
Vertically fragmented data – With vertical fragmentation, primary keys are again utilized. However, in this case, the database is split by field rather than by record, and a copy of the primary key is kept within each section of the database (accessible to each branch) so that the pieces of a record can be matched up. This type of format works well for situations in which a branch of a business and the central location each interact with the same accounts but perhaps in different manners (such as changes to client contact information vs. changes to financial figures).
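A small sketch of the same idea in code: each fragment holds a subset of fields plus a copy of the primary key, and the key is what lets the pieces be put back together. The names `fragment_vertically` and `rejoin`, and the example field groups, are hypothetical.

```python
def fragment_vertically(rows, pk, column_groups):
    """Return one fragment per column group; every fragment row carries the primary key."""
    return {
        name: [{pk: row[pk], **{col: row[col] for col in cols}} for row in rows]
        for name, cols in column_groups.items()
    }


def rejoin(fragments, pk):
    """Rebuild full records by matching fragment rows on the primary key."""
    merged = {}
    for rows in fragments.values():
        for row in rows:
            merged.setdefault(row[pk], {}).update(row)
    return list(merged.values())
```

For instance, a branch might hold a `contact` fragment while the central location holds a `finance` fragment for the same accounts; `rejoin` recovers the original records from the two.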
Reorganized data – Reorganization means that data has been adjusted in one way or another, as is typical for decision-support databases. In some cases, two distinct systems handle transactions and decision support. While decision-support systems can be trickier to maintain technically, online transaction processing (OLTP) often requires reconfiguration to handle large numbers of requests.
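One common kind of adjustment, assumed here purely for illustration, is summarizing individual transactions into totals that a decision-support system can query quickly. The function `summarize_sales` and the field names are hypothetical.

```python
from collections import defaultdict


def summarize_sales(transactions):
    """Collapse individual transactions into per-branch, per-month totals."""
    totals = defaultdict(float)
    for t in transactions:
        totals[(t["branch"], t["month"])] += t["amount"]
    return dict(totals)
```

The transaction system keeps the detailed rows, while the reorganized copy holds only the aggregates that reporting needs.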
Separate-schema data – This category of data partitions the database and software used to access it to fit different departments and situations – user data vs. product data, for example. Usually there is overlap between the various databases within this type of distribution.
***
As you can see, distributed databases represent a huge technological advancement. It’s not surprising that companies are shifting away from centralized databases and embracing the distributed model.
By Moazzam Adnan of VPS & cloud hosting provider Atlantic.Net.
The post Why a distributed database is used and types of distributed data appeared first on Atlantic.Net.