Announcing the Voldemort 1.3 Open Source Release

Vinoth Chandar

Apache, Avro, Java, Oracle

Overview

The article announces the release of Voldemort 1.3.0, detailing significant performance improvements, new features, and enhanced operability. Key enhancements include a new storage layer, non-blocking client socket checkouts, and support for Avro schema evolution.

What You'll Learn

1. How to implement non-blocking client socket checkouts in Voldemort
2. Why using the new BDB-JE storage layer improves performance
3. How to leverage Avro schema evolution for data management
4. When to utilize the Build-And-Push feature for data loading

Prerequisites & Requirements

Understanding of distributed systems concepts
Familiarity with Hadoop and Avro (optional)

Key Questions Answered

What performance improvements does Voldemort 1.3.0 offer?

Voldemort 1.3.0 introduces several performance enhancements, including a new BDB-JE storage layer that eliminates sorted duplicates, resulting in up to 24x speed improvements for restore and rebalance operations. Additionally, non-blocking client socket checkouts improve latency for put operations.
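
The summary does not describe how the non-blocking checkout works, so the following is only a minimal sketch of the general idea, not Voldemort's client code. The class `NonBlockingPoolSketch` and its methods are hypothetical names. Instead of parking the calling thread until a socket is free, a request that finds the pool empty is queued and served when a socket is returned.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.function.Consumer;

// Hypothetical illustration; these are not Voldemort's actual pool classes.
public class NonBlockingPoolSketch<T> {
    private final Queue<T> idle = new ArrayDeque<>();
    private final Queue<Consumer<T>> waiters = new ArrayDeque<>();

    public NonBlockingPoolSketch(Iterable<T> resources) {
        for (T resource : resources) {
            idle.add(resource);
        }
    }

    // Never blocks the caller: the callback runs now if a resource is idle,
    // otherwise it is queued until checkin() hands one over.
    public void checkout(Consumer<T> callback) {
        T resource;
        synchronized (this) {
            resource = idle.poll();
            if (resource == null) {
                waiters.add(callback);
                return;
            }
        }
        callback.accept(resource); // run outside the lock
    }

    // Returns a resource; the oldest queued request, if any, gets it first.
    public void checkin(T resource) {
        Consumer<T> next;
        synchronized (this) {
            next = waiters.poll();
            if (next == null) {
                idle.add(resource);
                return;
            }
        }
        next.accept(resource); // run outside the lock
    }

    public synchronized int pendingRequests() {
        return waiters.size();
    }
}
```

With a single pooled socket, a second `checkout` call returns immediately and its callback fires only after the first holder calls `checkin`. A blocking pool would stall the caller at that point instead.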

How does the new Build-And-Push feature work?

The Build-And-Push feature allows users to move data from Hadoop clusters into Voldemort clusters for online serving, leveraging Hadoop's fault tolerance and parallelism to build data files for individual nodes. This process is facilitated through an Azkaban job designed for efficient data loading.
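
To make "building data files for individual nodes" concrete, here is a small in-memory sketch of grouping keys by destination node. It is an assumption-laden illustration, not the actual job: the real build runs on Hadoop, and the hash function, the `partitionToNode` layout, and the class name `BuildAndPushSketch` are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration; the real build step runs as a Hadoop job.
public class BuildAndPushSketch {

    // partitionToNode[i] is the node that owns partition i (assumed layout).
    static int nodeFor(String key, int[] partitionToNode) {
        // String.hashCode() stands in for whatever hash the cluster uses.
        int partition = Math.floorMod(key.hashCode(), partitionToNode.length);
        return partitionToNode[partition];
    }

    // Groups keys into one bucket per node; each bucket would become
    // that node's data file in a real build.
    static Map<Integer, List<String>> bucketByNode(List<String> keys, int[] partitionToNode) {
        Map<Integer, List<String>> buckets = new TreeMap<>();
        for (String key : keys) {
            int node = nodeFor(key, partitionToNode);
            buckets.computeIfAbsent(node, n -> new ArrayList<>()).add(key);
        }
        return buckets;
    }
}
```

Because every key maps deterministically to one node, the buckets can be built independently and in parallel. That independence is the property the Hadoop-based build relies on.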

What is the significance of Avro support in Voldemort?

Avro support in Voldemort enables schema evolution, allowing developers to add new fields to existing data stores without breaking compatibility. This is achieved through a new serializer type, facilitating smoother transitions as application logic changes.
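
The rule that makes adding a field safe can be sketched without the Avro library. When a reader's schema contains a field that the stored record lacks, the reader fills it from the field's declared default. The code below mimics that rule with plain maps; it is not the Avro API or Voldemort's serializer, and `SchemaEvolutionSketch` and `resolve` are hypothetical names. As a simplification, a `null` default here means "no default declared".

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration of the "new field needs a default" rule.
public class SchemaEvolutionSketch {

    // written:        field values as stored under the older schema
    // readerDefaults: every field of the newer schema mapped to its default
    //                 (null = no default declared, a simplification)
    static Map<String, Object> resolve(Map<String, Object> written,
                                       Map<String, Object> readerDefaults) {
        Map<String, Object> result = new LinkedHashMap<>();
        for (Map.Entry<String, Object> field : readerDefaults.entrySet()) {
            String name = field.getKey();
            if (written.containsKey(name)) {
                result.put(name, written.get(name));   // value stored in the old record
            } else if (field.getValue() != null) {
                result.put(name, field.getValue());    // fall back to the default
            } else {
                throw new IllegalStateException("No value or default for field: " + name);
            }
        }
        return result;
    }
}
```

A record written with only `name` can be read under a newer schema that adds `email` with a default of `""`. The same read fails if the new field declares no default, which is why added fields should always carry one.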

What enhancements have been made to server monitoring in Voldemort 1.3.0?

The new release includes improved server monitoring capabilities, such as statistics on streaming operations, insights into server scalability, and new monitoring points for BDB exceptions. These enhancements help in quickly identifying and addressing performance issues.

Key Statistics & Figures

Speed improvement for restore and rebalance operations

up to 24x

This improvement is due to the elimination of sorted duplicates in the new BDB-JE storage layer.

Performance improvement for read-only operations

up to 20%

This is achieved through the mlock(

Technologies & Tools

Database

Berkeley DB Java Edition (BDB-JE)

Used as the new storage engine for Voldemort, improving performance and efficiency.

Data Serialization

Avro

Supports schema evolution for data management in Voldemort.

Security

Kerberos

Provides authentication for fetching data from Kerberized Hadoop grids.

Key Actionable Insights

1. Upgrade to the new BDB-JE storage layer to speed up restore and rebalance operations.
The new storage layer eliminates sorted duplicates, which yields up to 24x speed improvements for restore and rebalance operations.

2. Use the Build-And-Push feature to streamline data loading from Hadoop to Voldemort.
This feature allows for efficient data management and reduces downtime during data migrations, making it essential for teams working with large datasets.

3. Adopt Avro schema evolution to maintain data integrity during application updates.
Schema evolution preserves backward compatibility: applications can adapt to changing data structures without losing access to older records, which is crucial for long-term data management.

Common Pitfalls

1. Failing to properly migrate data when upgrading to the new storage layer can lead to data inconsistencies.
Because the migration path is not straightforward, follow the pre-upgrade instructions carefully to avoid conflicting versions of keys.

Related Concepts

Distributed Systems

Data Management

Performance Optimization

Schema Evolution