Fernando Doglio, July 6, 2023

Understanding Sharding: Enhancing Enterprise Performance with Memurai

In the world of enterprise applications, scalability and performance are crucial for handling large amounts of data. One powerful technique for achieving these goals is called “sharding”. In this guide, we'll explore the concept of sharding, its relevance in an enterprise scenario, and how to implement it using Memurai, a high-performance Redis-compatible database for Windows.

We'll also discuss the importance of hardware constraints and provide guidance on when to consider sharding based on these factors.

Understanding Sharding:

Sharding is a method of distributing data across multiple servers or nodes, each responsible for a portion of the dataset. By partitioning the data, sharding enables parallel processing, reduces the load on individual servers, and allows for horizontal scalability. This technique is especially beneficial for enterprise applications dealing with massive amounts of data and high traffic.

The diagram shows a very simple representation of this setup.

When sharding is enabled, data is distributed across multiple servers (instead of duplicated), and the client library connects to the right shard based on the internal sharding algorithm (i.e., it works out which server holds a record from the key you’re requesting).
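To make the idea of an internal sharding algorithm more concrete, here is a minimal Python sketch of the scheme Redis Cluster uses: each key is hashed with CRC16 and the result is taken modulo 16384 to pick a hash slot. It assumes Memurai, being Redis-compatible, follows the same scheme; the function name key_slot is ours and purely illustrative.

```python
import binascii

TOTAL_SLOTS = 16384  # number of hash slots in a Redis Cluster


def key_slot(key: str) -> int:
    """Return the hash slot for a key, following the Redis Cluster scheme."""
    data = key.encode("utf-8")
    # Hash tags: if the key has a non-empty "{...}" section, only the text
    # between the first "{" and the first "}" after it is hashed.
    start = data.find(b"{")
    if start != -1:
        end = data.find(b"}", start + 1)
        if end != -1 and end != start + 1:
            data = data[start + 1:end]
    # binascii.crc_hqx with an initial value of 0 is the CRC16 (XMODEM) variant.
    return binascii.crc_hqx(data, 0) % TOTAL_SLOTS


if __name__ == "__main__":
    for key in ("user:1000", "{user:1000}:orders", "foo"):
        print(key, "->", key_slot(key))
```

Because of the hash tag rule, "user:1000" and "{user:1000}:orders" land in the same slot, which is how related keys can be kept on the same shard.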

Is sharding important in an enterprise scenario?

In an enterprise environment, where large volumes of data need to be processed and accessed concurrently, sharding offers significant advantages. It allows for:

  • Improved performance since each server has less data to process.
  • Faster query response times for the same reason. Each server only holds a portion of the data, making single-shard requests especially fast.
  • Increased throughput.

By distributing the workload across multiple nodes, sharding ensures that the application can handle growing data demands, providing a seamless user experience even during peak loads. In the end, as a developer of a client application, you only need to worry about a single instance (as opposed to keeping track of the list of IPs for each node). The rest will be handled automatically by Memurai’s internal discovery algorithm.

How to implement Sharding with Memurai?

Memurai is a Redis-compatible in-memory database for Windows that supports sharding out of the box. To implement sharding in Memurai, you have to configure a cluster. Once configured, any Memurai client will connect to at least one of the nodes in the cluster, and then it’ll internally discover the rest of them (which is also how it knows which server to target based on the sharding algorithm).

To get the cluster up and running, you have to follow these steps:

Set up the cluster’s nodes

Configure each instance of the cluster by editing the memurai.conf file to add a configuration like this:

port <the instance port>
cluster-enabled yes
cluster-config-file nodes.conf
cluster-node-timeout <timeout in milliseconds>
appendonly yes

This basic config will create a new instance on the specified port with the following characteristics:

  • It’ll be in “cluster mode” (thanks to the cluster-enabled yes line).
  • It’ll save the cluster's state inside the nodes.conf file.
  • If it can't reach the majority of the master nodes for the number of milliseconds set in cluster-node-timeout, it’ll stop accepting requests.
  • It’ll persist writes to an append-only file (thanks to the appendonly yes line).

A cluster needs at least 3 master nodes, but we advise creating a 6-node cluster, where there are 3 masters and 3 replicas (one for each master).

Keep in mind that if you’re testing this setup on your local machine, each instance needs its own port and its own cluster-config-file value; otherwise the next command will fail.
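For a local test, writing six near-identical files by hand is error-prone, so here is a small Python sketch that generates them. The directory layout (one folder per port), the nodes-<port>.conf naming, and the 5000 ms timeout are assumptions chosen for illustration, not values required by Memurai.

```python
from pathlib import Path

# Same settings as the config shown above; the port is reused in the
# cluster-config-file name so every instance gets a unique file.
CONFIG_TEMPLATE = """port {port}
cluster-enabled yes
cluster-config-file nodes-{port}.conf
cluster-node-timeout {timeout_ms}
appendonly yes
"""


def render_config(port: int, timeout_ms: int = 5000) -> str:
    """Return the memurai.conf text for one instance (timeout is an assumed value)."""
    return CONFIG_TEMPLATE.format(port=port, timeout_ms=timeout_ms)


def write_configs(base_dir: Path, ports: list[int], timeout_ms: int = 5000) -> list[Path]:
    """Write one memurai.conf per port under base_dir/<port>/ and return the paths."""
    paths = []
    for port in ports:
        node_dir = base_dir / str(port)
        node_dir.mkdir(parents=True, exist_ok=True)
        conf_path = node_dir / "memurai.conf"
        conf_path.write_text(render_config(port, timeout_ms), encoding="utf-8")
        paths.append(conf_path)
    return paths


if __name__ == "__main__":
    # Ports 7000-7005 match the ones used when creating the cluster.
    for path in write_configs(Path("cluster-test"), list(range(7000, 7006))):
        print("wrote", path)
```

Each instance can then be started with its own generated file.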

Create the cluster

Once you’ve configured your instances in “cluster mode”, the next step is to create the cluster itself. You can do that by running the following command:

memurai-cli.exe --cluster create 127.0.0.1:7000 127.0.0.1:7001 127.0.0.1:7002 127.0.0.1:7003 127.0.0.1:7004 127.0.0.1:7005 --cluster-replicas 1

This command is doing the following:

  • It joins all the nodes listed (on those particular IPs and ports) into a single cluster.
  • It assigns one replica to each master (--cluster-replicas 1), which with six nodes gives the 3 masters and 3 replicas discussed above.
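If you script your test environment, the same command can be assembled programmatically. The sketch below is a hypothetical helper (build_create_command is our name, not part of any Memurai tooling) that uses only the flags shown above and checks the three-master minimum before anything runs.

```python
def build_create_command(nodes: list[str], replicas_per_master: int = 1) -> list[str]:
    """Build the argument list for the cluster creation command shown above."""
    if replicas_per_master < 0:
        raise ValueError("replicas_per_master cannot be negative")
    # Every master comes with replicas_per_master replicas, so the number of
    # masters is the node count divided by (1 + replicas per master).
    masters = len(nodes) // (replicas_per_master + 1)
    if masters < 3:
        raise ValueError(
            f"{len(nodes)} nodes with {replicas_per_master} replica(s) per master "
            f"gives {masters} master(s); at least 3 are needed"
        )
    return [
        "memurai-cli.exe", "--cluster", "create",
        *nodes,
        "--cluster-replicas", str(replicas_per_master),
    ]


if __name__ == "__main__":
    nodes = [f"127.0.0.1:{port}" for port in range(7000, 7006)]
    command = build_create_command(nodes, replicas_per_master=1)
    print(" ".join(command))
    # To actually run it (requires the six instances to be up):
    # import subprocess; subprocess.run(command, check=True)
```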

You should see an output similar to this:

>>> Performing hash slots allocation on 3 nodes...
Master[0] -> Slots 0 - 5460
Master[1] -> Slots 5461 - 10922
Master[2] -> Slots 10923 - 16383

This shows that the 16384 hash slots are split across the three master nodes.
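The ranges in that output come from dividing the 16384 slots as evenly as possible. The following Python sketch reproduces the split for three masters; it is one way to arrive at the same numbers and is not necessarily the exact logic memurai-cli runs internally.

```python
TOTAL_SLOTS = 16384  # slots are numbered 0 to 16383


def slot_ranges(masters: int) -> list[tuple[int, int]]:
    """Split the hash slots into contiguous, near-equal ranges, one per master."""
    if masters < 1:
        raise ValueError("at least one master is required")
    slots_per_master = TOTAL_SLOTS / masters  # fractional on purpose
    ranges = []
    first = 0
    cursor = 0.0
    for index in range(masters):
        last = round(cursor + slots_per_master - 1)
        # The final master always absorbs whatever is left.
        if index == masters - 1 or last > TOTAL_SLOTS - 1:
            last = TOTAL_SLOTS - 1
        ranges.append((first, last))
        first = last + 1
        cursor += slots_per_master
    return ranges


if __name__ == "__main__":
    for index, (first, last) in enumerate(slot_ranges(3)):
        print(f"Master[{index}] -> Slots {first} - {last}")
```

With three masters this prints the same three ranges as the output above.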

Hardware Constraints and Sharding Considerations:

When deciding whether to implement sharding, hardware constraints play a significant role. Here are some factors to consider:

  1. Storage capacity: If your dataset exceeds the storage capacity of a single server, sharding can help distribute the data across multiple servers.
  2. Memory and CPU utilization: Sharding can help distribute memory and CPU usage across multiple servers, allowing for better utilization and avoiding resource bottlenecks.
  3. Network bandwidth: Consider the network bandwidth available between servers and the impact of sharding on data transfer. If network capacity is a concern, evaluate how sharding will affect communication overhead.
  4. Cost-effectiveness: Assess the costs associated with hardware requirements for sharding, including additional servers and maintenance.
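As a rough illustration of the storage and memory factors, the sketch below estimates how many masters a dataset would need. Every number in it is hypothetical, and the usable_fraction headroom (memory left free for overhead and growth) is an assumption you should replace with measurements from your own workload.

```python
import math

MIN_MASTERS = 3  # the cluster minimum mentioned earlier in this guide


def masters_needed(dataset_gb: float, ram_per_node_gb: float,
                   usable_fraction: float = 0.6) -> int:
    """Back-of-the-envelope estimate of the masters needed to hold a dataset in memory."""
    if dataset_gb <= 0 or ram_per_node_gb <= 0:
        raise ValueError("sizes must be positive")
    if not 0 < usable_fraction <= 1:
        raise ValueError("usable_fraction must be in (0, 1]")
    usable_per_node_gb = ram_per_node_gb * usable_fraction
    return max(MIN_MASTERS, math.ceil(dataset_gb / usable_per_node_gb))


if __name__ == "__main__":
    # Hypothetical example: 100 GB of data on servers with 32 GB of RAM each.
    print(masters_needed(dataset_gb=100, ram_per_node_gb=32))
```

If the estimate stays at the minimum of three, a single server with replication may already be enough, and the cost of extra nodes deserves a closer look.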

Conclusion:

Sharding is a powerful technique for enhancing performance and scalability in enterprise scenarios. By leveraging Memurai's sharding capabilities, you can distribute data across multiple nodes, achieving improved performance and accommodating growing data demands. Consider the hardware constraints and evaluate the suitability of sharding based on factors such as storage capacity, memory and CPU utilization, network bandwidth, and cost-effectiveness. With careful planning and implementation, sharding with Memurai can unlock the full potential of your enterprise applications.

Remember, your feedback is invaluable to us. Let's build a vibrant community of Memurai users who can learn from each other and explore the full potential of this fantastic in-memory database for Windows.

Thank you for choosing Memurai, and we look forward to seeing what you build with it!