Introduction to Zookeeper System Design
What is Apache Zookeeper System Design?
Apache Zookeeper underpins many distributed applications because it provides coordination between their processes. It exposes a simple set of primitives that applications use to implement higher-level services for synchronization, configuration maintenance, groups, and naming. Zookeeper is designed to be easy to use and program. It runs on Java and has bindings for Java, Python, and C.
Apache Zookeeper is an open-source, centralized coordination service for distributed applications. It provides:
- Maintaining configuration information: Sharing configuration information across all nodes.
- Naming: Identifying servers by name in clusters of thousands of machines.
- Providing distributed synchronization: Locks, barriers, queues.
- Providing group services: Leader election.
Why Apache Zookeeper Is Needed
- Coordination services: The integration and communication of services in a distributed environment.
- Coordination services are complex to get right. They are especially prone to errors such as race conditions and deadlock.
- Race condition: Two or more operations try to perform the same task at the same time, and the result depends on their timing.
- Deadlock: Two or more operations wait for each other indefinitely, so none of them can proceed.
- Zookeeper relieves distributed applications of the responsibility of implementing coordination services from scratch.
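To make the race condition above concrete, here is a minimal sketch in plain Python. It does not use Zookeeper; the function names `lost_update` and `locked_updates` are illustrative assumptions. The first function replays one bad interleaving by hand, and the second shows how serializing the read-modify-write step with a lock avoids it.

```python
import threading


def lost_update():
    """Replay one bad interleaving by hand: both clients read before either writes."""
    shared = {"count": 0}
    read_a = shared["count"]      # client A reads 0
    read_b = shared["count"]      # client B reads 0 before A has written
    shared["count"] = read_a + 1  # A writes 1
    shared["count"] = read_b + 1  # B also writes 1, overwriting A's update
    return shared["count"]


def locked_updates(workers=4, rounds=1000):
    """Serialize each read-modify-write with a lock so no update is lost."""
    shared = {"count": 0}
    lock = threading.Lock()

    def work():
        for _ in range(rounds):
            with lock:  # only one thread at a time may update the counter
                shared["count"] += 1

    threads = [threading.Thread(target=work) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return shared["count"]


print(lost_update())     # 1, although two increments were attempted
print(locked_updates())  # 4000
```

A local lock like this only works inside one process. Across machines, the same role has to be played by a shared coordination service, which is the gap Zookeeper is meant to fill.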
“Primitive” Operations in a Distributed System
- Master Election
- One node registers itself as a master and holds a “lock” on that data
- Other nodes cannot become masters until that lock is released
- Only one node is allowed to hold the lock for processing at a time
- Crash Detection
- “Ephemeral” data on a node’s availability automatically goes away if the node disconnects or fails to refresh itself after some time-out period.
- Group Management
- List of outstanding tasks, task assignments
Zookeeper also provides the following guarantees:
- Sequential consistency: Updates from a client are applied in the order in which they were sent.
- Atomicity: Updates either succeed or fail, with no partial results.
- Single system image: A client sees the same view of the system regardless of which server it connects to. A new server will not accept connections until it has caught up.
- Reliability: Once an update has succeeded, it persists and will not be undone.
- Timeliness: Rather than allow a client to see very stale data, a server will shut down.
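The master election and crash detection primitives listed earlier can be sketched with a small in-memory model. This is a toy stand-in, not the Zookeeper API: the class `ToyCoordinator`, its methods, and the explicit `now` timestamps are assumptions made so the example runs deterministically without a server.

```python
class ToyCoordinator:
    """In-memory stand-in for a coordination service (not the Zookeeper API)."""

    def __init__(self, session_timeout=3.0):
        self.session_timeout = session_timeout
        self.last_heartbeat = {}  # client id -> time of last heartbeat
        self.master = None        # client id currently holding the master lock

    def heartbeat(self, client, now):
        """Record that a client is still alive at time `now`."""
        self.last_heartbeat[client] = now

    def expire_sessions(self, now):
        """Drop clients that missed the time-out; free the lock if the master expired."""
        for client, seen in list(self.last_heartbeat.items()):
            if now - seen > self.session_timeout:
                del self.last_heartbeat[client]
                if self.master == client:
                    self.master = None  # the "ephemeral" lock goes away

    def try_become_master(self, client, now):
        """Return True if `client` holds the master lock after this call."""
        self.expire_sessions(now)
        if client not in self.last_heartbeat:
            return False  # no live session, so the client cannot hold the lock
        if self.master is None:
            self.master = client
        return self.master == client


zk = ToyCoordinator(session_timeout=3.0)
zk.heartbeat("node-a", now=0.0)
zk.heartbeat("node-b", now=0.0)
print(zk.try_become_master("node-a", now=1.0))  # True
print(zk.try_become_master("node-b", now=1.0))  # False, node-a holds the lock
zk.heartbeat("node-b", now=3.0)                 # node-a stops sending heartbeats
print(zk.try_become_master("node-b", now=5.0))  # True, node-a's session expired
```

In a real deployment the session tracking and lock release happen inside the service itself; the sketch only shows why an ephemeral lock lets a standby node take over after the master fails.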
Apache Zookeeper also has the following characteristics:
- It is simple
- Zookeeper is replicated
- It is ordered
- Zookeeper is fast
Apache Zookeeper Design Goals
- A shared hierarchical namespace that looks like a standard file system. The namespace consists of data registers, called znodes, which are similar to files and directories.
- Zookeeper allows distributed processes to coordinate with each other through a shared hierarchical namespace organized similarly to a standard file system.
- Data is stored in memory
- Achieve high throughput and low latency numbers
- High performance
- Used in a large, distributed system
- Highly available
- No single point of failure
- Strictly ordered access
- Unlike a typical file system designed for storage, Zookeeper stores data in memory, which means Zookeeper can achieve high throughput and low latency numbers.
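As a rough illustration of the hierarchical namespace described above, the sketch below keeps a tree of znode-like entries in an in-memory dictionary. The class `ZNodeTree` and its method names are hypothetical and chosen for this example; they are not the Zookeeper client API.

```python
class ZNodeTree:
    """Toy in-memory hierarchical namespace (not the Zookeeper API)."""

    def __init__(self):
        self.data = {"/": b""}  # path -> bytes stored at that node

    @staticmethod
    def parent(path):
        """Return the parent path, e.g. '/app/config' -> '/app', '/app' -> '/'."""
        head = path.rsplit("/", 1)[0]
        return head or "/"

    def create(self, path, value=b""):
        """Create a node; like a file system, the parent must already exist."""
        if path in self.data:
            raise KeyError(f"node exists: {path}")
        if self.parent(path) not in self.data:
            raise KeyError(f"no parent for: {path}")
        self.data[path] = value

    def get(self, path):
        """Return the bytes stored at a node."""
        return self.data[path]

    def children(self, path):
        """List the names of the direct children of a node."""
        return sorted(
            p.rsplit("/", 1)[1]
            for p in self.data
            if p != "/" and self.parent(p) == path
        )


tree = ZNodeTree()
tree.create("/app")
tree.create("/app/config", b"db=primary")
tree.create("/app/workers")
print(tree.children("/app"))    # ['config', 'workers']
print(tree.get("/app/config"))  # b'db=primary'
```

Unlike a file system directory, each node here can hold data and have children at the same time, which matches the description of znodes as similar to both files and directories.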
Apache Zookeeper is Replicated
- Zookeeper itself is intended to be replicated over a set of hosts called an ensemble.
- The servers that make up the Zookeeper service must all know about each other.
- They maintain an in-memory image of the state, along with transaction logs and snapshots in a persistent store.
- As long as a majority of the servers are available, the Zookeeper service will be available.
Zookeeper is Ordered
- Zookeeper stamps each update with a number that reflects the order of all Zookeeper transactions.
- Subsequent operations can use this order to implement higher-level abstractions, such as synchronization primitives.
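The majority rule and the ordering stamp described above can be sketched together in a toy model. The class `ToyEnsemble`, its fields, and the counter name `next_zxid` are assumptions for illustration. The model accepts a write only while a majority of servers are up and stamps each accepted write with an increasing number; it does not implement Zookeeper's actual replication protocol.

```python
class ToyEnsemble:
    """Toy model of majority availability and ordered updates (not a real protocol)."""

    def __init__(self, size):
        self.up = [True] * size  # availability flag for each server
        self.next_zxid = 1       # next transaction number to hand out
        self.log = []            # accepted updates as (zxid, update) pairs

    def has_quorum(self):
        """True when strictly more than half of the servers are available."""
        return sum(self.up) > len(self.up) // 2

    def write(self, update):
        """Accept an update only with a quorum, and stamp it with its order."""
        if not self.has_quorum():
            raise RuntimeError("no quorum: a majority of servers is unavailable")
        zxid = self.next_zxid
        self.next_zxid += 1
        self.log.append((zxid, update))
        return zxid


ens = ToyEnsemble(size=5)
print(ens.write("set /config v1"))  # 1
ens.up[0] = ens.up[1] = False       # two of five servers fail
print(ens.has_quorum())             # True, three of five are still up
print(ens.write("set /config v2"))  # 2
ens.up[2] = False                   # a third server fails
print(ens.has_quorum())             # False, only two of five remain
```

Because the stamps only ever increase, any client that compares two stamps can tell which update happened first, which is the property that lock and queue recipes build on.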
Zookeeper is Fast
- It is especially fast in “read-dominant” workloads.
- Zookeeper applications run on thousands of machines, and Zookeeper performs best where reads are more common than writes, at ratios of around 10:1.
- Zookeeper can batch multiple operations together.
- The operations in a batch either all succeed or all fail in their entirety.
- This makes it possible to implement transactions.
- Other clients never observe an inconsistent intermediate state.
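The all-or-nothing batching described above can be sketched as follows. The class `ToyStore`, its `multi` method, and the `(kind, key, value)` operation tuples are assumptions for this example, not the Zookeeper client API. The idea is to apply every operation to a staged copy and publish the copy only if all of them succeed.

```python
class ToyStore:
    """Toy key-value store with all-or-nothing batches (not the Zookeeper API)."""

    def __init__(self):
        self.data = {}

    def multi(self, ops):
        """Apply every op in `ops`, or none of them if any op fails.

        Each op is a (kind, key, value) tuple with kind in
        {"create", "set", "delete"}; value is ignored for "delete".
        """
        staged = dict(self.data)  # work on a copy so failures leave no trace
        for kind, key, value in ops:
            if kind == "create":
                if key in staged:
                    raise KeyError(f"already exists: {key}")
                staged[key] = value
            elif kind == "set":
                if key not in staged:
                    raise KeyError(f"missing: {key}")
                staged[key] = value
            elif kind == "delete":
                if key not in staged:
                    raise KeyError(f"missing: {key}")
                del staged[key]
            else:
                raise ValueError(f"unknown operation: {kind}")
        self.data = staged  # publish the whole batch in one assignment


store = ToyStore()
store.multi([("create", "/a", "1"), ("create", "/b", "2")])
try:
    # The second op fails, so the first op must not take effect either.
    store.multi([("set", "/a", "10"), ("delete", "/missing", None)])
except KeyError:
    pass
print(store.data)  # {'/a': '1', '/b': '2'}
```

Readers of `store.data` see either the state before the batch or the state after it, never a half-applied mix, which mirrors the last bullet above.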
Author: SVCIT Editorial
Copyright Silicon Valley Cloud IT, LLC.