What is MariaDB Sharding

In this MariaDB tutorial, we will discuss the concept of MariaDB Sharding and look at several examples related to it. The topics covered are:

  • MariaDB Sharding
  • MariaDB Sharding Types
  • MariaDB Sharding Table
  • MariaDB Sharding Spider
  • MariaDB Maxscale Sharding
  • MariaDB Galera Cluster Sharding

MariaDB Sharding

In this section, we will discuss the concept of MariaDB Sharding in detail.

Sharding is a technique for splitting the data of one database across several database servers, each of which holds one part (a shard). This section shows one of the simplest ways to shard with MariaDB: each schema lives on its own database server, and the schemarouter module in MariaDB MaxScale presents them all to clients as a single database server.

The client sees MariaDB MaxScale as one database server that holds all of the schemas from all of the configured servers. The diagram of MariaDB Sharding is given below:

Diagram of MariaDB Sharding
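
The schema-per-server idea can be sketched in a few lines of Python. This is only a toy model of the routing decision, not MaxScale code; the server names and schema names are made up for illustration.

```python
# Toy model of schema-based routing; this is NOT how MaxScale is implemented.
# The schema and server names below are hypothetical.
SCHEMA_TO_SERVER = {
    "customers": "server1",
    "orders": "server2",
    "inventory": "server3",
}


def route_by_schema(qualified_table: str) -> str:
    """Return the backend server that owns the schema in 'schema.table'."""
    schema, sep, table = qualified_table.partition(".")
    if not sep or not table:
        raise ValueError(f"expected 'schema.table', got {qualified_table!r}")
    try:
        return SCHEMA_TO_SERVER[schema]
    except KeyError:
        raise LookupError(f"no server configured for schema {schema!r}") from None


def visible_schemas() -> list[str]:
    """The client sees the union of the schemas on all backend servers."""
    return sorted(SCHEMA_TO_SERVER)
```

For example, route_by_schema("orders.invoice") picks "server2", while visible_schemas() lists all three schemas as if they lived on one server.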

MariaDB Enterprise Spider, a storage engine for MariaDB Enterprise Server (ES), supports table sharding, table federation, and ODBC (Open Database Connectivity) data sources. The diagram below depicts MariaDB Spider federation:

MariaDB Spider Storage Engine Example

With the Spider engine used for federation, you can:

  • Read from and write to tables on remote ES nodes.
  • Migrate tables from a remote ES node.
  • JOIN tables on a local ES node with tables on a remote ES node.

MariaDB Sharding Types

In this section, we will learn about the types of sharding in detail.

1. Horizontal or Range-Based Sharding

In this case, the data is split based on ranges of a value that each object contains. If we store contact information for all online customers, we might keep the customers whose last names begin with A-H on one shard and the rest on another.

The downside of this technique is that the customers' last names may not be evenly distributed. We may have more customers with names that fall between A and H than customers with names that fall between I and Z. In that case, the first shard carries a greater load than the second, and it may become a bottleneck in the system.

The advantage of this method is that it is simple to implement. A query for a given customer touches exactly one shard, so the application does not need to aggregate results from several shards, which keeps the application layer simple.
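
As a minimal sketch of the A-H / I-Z split described above (the shard names and the sample customer list are hypothetical), the routing rule and the resulting load skew could look like this in Python:

```python
from collections import Counter


def shard_for_last_name(last_name: str) -> str:
    """Route a customer to a shard by the first letter of the last name."""
    first = last_name.strip()[:1].upper()
    if not first.isalpha():
        raise ValueError(f"cannot shard on last name {last_name!r}")
    # A-H goes to the first shard, everything else to the second.
    return "shard_1" if "A" <= first <= "H" else "shard_2"


# Hypothetical customer list, skewed toward the start of the alphabet.
customers = ["Adams", "Baker", "Clark", "Davis", "Evans", "Hill", "Jones", "Smith"]
load = Counter(shard_for_last_name(name) for name in customers)
```

With this sample list, six customers land on shard_1 and only two on shard_2, which is the kind of imbalance the range-based approach can produce.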

2. Vertical Sharding

In this case, different features of an entity are placed in different shards on different machines. In a LinkedIn-like application, for example, a user might have a profile, a list of connections, and a list of published articles. With vertical sharding, we might place the user profiles on one shard, the connections on a second shard, and the articles on a third shard.

The key advantage of this approach is that the more critical part of the data (for example, user profiles) can be handled differently from the less critical part (for example, blog posts).

Vertical sharding has two main drawbacks:

  • Depending on the system, the application layer may need to combine data from several shards to answer a query.
  • If the website or system grows, it may become necessary to split the database for a single feature further across multiple servers.
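
The first drawback can be illustrated with a small Python sketch. The three dictionaries below merely stand in for three separate database servers, and all names and sample values are invented for the example:

```python
# Plain dictionaries stand in for three separate database servers.
# All data here is hypothetical.
PROFILE_SHARD = {1: {"name": "Asha"}}
CONNECTION_SHARD = {1: [2, 3]}
ARTICLE_SHARD = {1: ["Intro to Sharding"]}


def load_user_page(user_id: int) -> dict:
    """Combine data from three shards in the application layer."""
    profile = PROFILE_SHARD[user_id]  # query 1: profile shard
    return {
        "name": profile["name"],
        "connections": CONNECTION_SHARD.get(user_id, []),  # query 2
        "articles": ARTICLE_SHARD.get(user_id, []),        # query 3
    }
```

Rendering one user page takes three lookups against three shards, and the "join" happens in application code rather than in the database.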

3. Key or Hash-Based Sharding

In this case, an entity has a value (for example, a client program's IP address) that is passed through a hash function to generate a hash value. This hash value determines which database server (shard) to use.

Consider the following scenario: each request includes an application ID that is incremented by one whenever a new application is registered, and we have four database servers.

In this case, we can simply take the application ID modulo 4 and use the remainder to select the server on which the application data should be kept.

The biggest disadvantage of this kind of sharding is that dynamically adding or removing a database server can be time-consuming and costly: changing the number of servers changes the result of the modulo operation for most keys, so much of the existing data has to be moved to a different server.
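
A short Python sketch of the modulo rule, together with a rough measure of how much data would move if a fifth server were added. The function names are invented for this example, and the IDs 1 to 1000 are an arbitrary sample:

```python
def shard_for_app(app_id: int, num_servers: int = 4) -> int:
    """Pick a server index from the application ID using the remainder."""
    return app_id % num_servers


def fraction_moved(app_ids, old_count: int, new_count: int) -> float:
    """Fraction of IDs whose server changes when the server count changes."""
    ids = list(app_ids)
    moved = sum(1 for a in ids if a % old_count != a % new_count)
    return moved / len(ids)


# Application 10 is stored on server 10 % 4 == 2.
server = shard_for_app(10)

# Going from 4 to 5 servers relocates 80% of the IDs 1..1000.
moved = fraction_moved(range(1, 1001), 4, 5)
```

Only the IDs whose remainder is the same for both 4 and 5 stay where they are; all others have to be copied to a new server, which is why resizing is expensive with plain modulo hashing.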

4. Directory-Based Sharding

A lookup service sits in front of the sharded databases. The lookup service knows the current partitioning scheme and keeps track of which shard each entity is located on. It is usually implemented as a web service.

The client application first asks the lookup service which shard (database partition) the entity is on or should be on. It then queries or updates the shard that the lookup service returned.
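
A minimal in-memory sketch of such a lookup service is shown below. In practice it would usually be a web service backed by its own storage; the class name, method names, and shard names here are hypothetical.

```python
class ShardDirectory:
    """In-memory stand-in for a lookup service that maps entities to shards."""

    def __init__(self) -> None:
        self._locations: dict[str, str] = {}

    def assign(self, entity_id: str, shard: str) -> None:
        """Record which shard a new entity should live on."""
        self._locations[entity_id] = shard

    def locate(self, entity_id: str) -> str:
        """Return the shard that currently holds the entity."""
        try:
            return self._locations[entity_id]
        except KeyError:
            raise KeyError(f"unknown entity {entity_id!r}") from None

    def move(self, entity_id: str, new_shard: str) -> None:
        """Update the directory after an entity was migrated to another shard."""
        if entity_id not in self._locations:
            raise KeyError(f"unknown entity {entity_id!r}")
        self._locations[entity_id] = new_shard


directory = ShardDirectory()
directory.assign("customer:42", "shard_a")
before = directory.locate("customer:42")
directory.move("customer:42", "shard_b")
after = directory.locate("customer:42")
```

Because clients always ask the directory first, an entity can be moved between shards by updating a single entry, without changing any routing rule in the clients. The trade-off is that the directory itself becomes a component that every request depends on.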

MariaDB Sharding Table

In this section, we will learn about the MariaDB Sharding Table in detail.

Table partitioning has more than one use. A table can be split across multiple servers and physical machines when its sub-tables (partitions) refer to tables on other servers. This may be needed to access data held on several remote machines, such as the servers of company branches, as a single table.

Alternatively, partitioning can simply be used to split a large table for performance reasons. The example below uses the RANGE partitioning type in the CREATE TABLE statement:

CREATE TABLE Employee_Detail
(
	id INT NOT NULL AUTO_INCREMENT,
	timestamp DATETIME NOT NULL,
	employee_no INT UNSIGNED,
	ip_address BINARY(16) NOT NULL,
	action VARCHAR(20) NOT NULL,
	-- Every unique key must include the partitioning column
	PRIMARY KEY (id, timestamp)
)
	ENGINE = InnoDB
PARTITION BY RANGE (YEAR(timestamp))
(
	PARTITION partition_0 VALUES LESS THAN (2017),
	PARTITION partition_1 VALUES LESS THAN (2018),
	PARTITION partition_2 VALUES LESS THAN (2019),
	PARTITION partition_3 VALUES LESS THAN (2020),
	-- Catch-all for 2020 and later, so the 2020 row below has a partition
	PARTITION partition_4 VALUES LESS THAN MAXVALUE
);

INSERT INTO Employee_Detail (timestamp, employee_no, ip_address, action)
VALUES ('2020-12-14 14:24:36', 121, '124.0.2.1', 'YES');

SELECT * FROM Employee_Detail;

In the preceding query, we created a table called Employee_Detail whose primary key consists of the id and timestamp columns. We then partitioned it with the RANGE type on the year of the timestamp column, with partition boundaries at 2017, 2018, 2019, and 2020.

We then inserted one record into Employee_Detail with the INSERT INTO statement and retrieved all records from the table with the SELECT statement.

MariaDB Sharding Table Example
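
To make the VALUES LESS THAN semantics concrete, here is a small Python model of how a row's year maps to one of the four bounded partitions. MariaDB does this internally; the function below is only an illustration, and its name is invented. Each bound is exclusive, so a row from 2020 fits none of the partitions bounded by 2017 to 2020 and needs a catch-all partition (VALUES LESS THAN MAXVALUE) or an additional higher bound.

```python
from bisect import bisect_right
from datetime import datetime
from typing import Optional

# Exclusive upper bounds of partition_0 .. partition_3.
UPPER_BOUNDS = [2017, 2018, 2019, 2020]


def partition_for(ts: datetime) -> Optional[str]:
    """Return the bounded partition for a timestamp, or None if none matches."""
    # bisect_right counts the bounds that are <= the year, which is the
    # index of the first partition whose bound is strictly greater.
    idx = bisect_right(UPPER_BOUNDS, ts.year)
    if idx == len(UPPER_BOUNDS):
        return None  # year >= 2020: only a catch-all partition could hold it
    return f"partition_{idx}"
```

A timestamp in 2016 or earlier maps to partition_0, one in 2017 maps to partition_1, and one in 2020 maps to no bounded partition at all.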

MariaDB Sharding Spider

We will learn about the SPIDER engine in sharding in this topic, which will be explained via an example.

The Spider storage engine is a storage engine with built-in sharding features. It supports partitioning and lets tables from several MariaDB instances be handled as if they were all on the same instance.

When we create a table with the Spider storage engine, the table links to a table on a remote server. The remote table can use any storage engine. The link is made by establishing a connection from the local MariaDB server to the remote MariaDB server, and that connection is shared by all tables that take part in the same transaction.

When creating a table with the SPIDER storage engine, the COMMENT and/or CONNECTION clauses of the CREATE TABLE statement are used to pass the connection information for the remote server. The following is an example of a CREATE TABLE statement using the SPIDER storage engine:

EXAMPLE:

CREATE TABLE spider_engine(
  spider_id INT NOT NULL AUTO_INCREMENT,
  user_code VARCHAR(10),
  PRIMARY KEY(spider_id)
)
ENGINE=SPIDER 
COMMENT 'host "127.0.0.1", user "root", password "[email protected]", port "3306"';

In the preceding query, we created a table called spider_engine with ENGINE=SPIDER and passed the connection details in the COMMENT clause: the host as 127.0.0.1, the user as root, the password as [email protected], and the port as 3306.

MariaDB Maxscale Sharding

In this sub-topic, we will learn about MariaDB MaxScale Sharding in detail.

MariaDB MaxScale is a database proxy that extends the availability, scalability, and security of MariaDB Server while also decoupling application development from the underlying database infrastructure.

MaxScale sits between client applications and database servers, routing client queries and server responses. It also monitors the servers, so it quickly notices changes in server status or in the replication topology.

MariaDB Galera Cluster Sharding

In this topic, we will learn about MariaDB Galera Cluster Sharding in detail.

MariaDB Galera Cluster is a virtually synchronous multi-primary cluster for MariaDB. It is available on Linux only and supports only the InnoDB storage engine (support for the MyISAM storage engine is experimental).

The main features of MariaDB Galera Cluster are:

  • Clients can read from and write to any cluster node.
  • Membership control is automatic: failed nodes are dropped from the cluster.
  • Replication is truly parallel, at the row level.

In this tutorial, we have discussed the concept of MariaDB Sharding and looked at some examples. The topics covered were:

  • MariaDB Sharding
  • MariaDB Sharding Types
  • MariaDB Sharding Table
  • MariaDB Sharding Spider
  • MariaDB Maxscale Sharding
  • MariaDB Galera Cluster Sharding