Galera Cluster for MySQL is a multi-master MySQL cluster built on the Galera Replication Plugin. Replication is synchronous: a change made on any master node is immediately replicated to the other master nodes as a broadcast transaction commit. This synchronization provides high availability, high uptime, and scalability. All master nodes in the cluster are available for both reads and writes. The Galera Replication Plugin also provides automatic node control: failed nodes are dropped from the cluster and recovered nodes are rejoined. This prevents data loss, and clients can connect to any node, as chosen by the load balancer. Because changes are synchronized across all nodes, unlike in conventional replication, there is no slave lag, no transactions are lost, and client latency is kept to a minimum.
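As an illustration of how a load balancer might decide which nodes are safe to route to, the sketch below filters nodes by their wsrep status values. It assumes the values have already been fetched from each node (for example with SHOW STATUS LIKE 'wsrep_%'); the node names and the helper functions node_is_usable and pick_nodes are hypothetical, and a real balancer may apply different rules.

```python
from __future__ import annotations

# Sketch only: the status dictionaries stand in for the output of
# SHOW STATUS LIKE 'wsrep_%' on each node. Node names are made up.


def node_is_usable(status: dict[str, str]) -> bool:
    """True if the node reports itself as a synced member of the primary component."""
    return (
        status.get("wsrep_ready", "").upper() == "ON"
        and status.get("wsrep_connected", "").upper() == "ON"
        and status.get("wsrep_cluster_status", "").lower() == "primary"
        and status.get("wsrep_local_state_comment", "").lower() == "synced"
    )


def pick_nodes(cluster: dict[str, dict[str, str]]) -> list[str]:
    """Names of the nodes a balancer could route to, sorted for stable output."""
    return sorted(name for name, status in cluster.items() if node_is_usable(status))


cluster = {
    "node1": {
        "wsrep_ready": "ON",
        "wsrep_connected": "ON",
        "wsrep_cluster_status": "Primary",
        "wsrep_local_state_comment": "Synced",
    },
    "node2": {
        # A node that is still catching up after rejoining the cluster.
        "wsrep_ready": "OFF",
        "wsrep_connected": "ON",
        "wsrep_cluster_status": "Primary",
        "wsrep_local_state_comment": "Joiner",
    },
    "node3": {
        "wsrep_ready": "ON",
        "wsrep_connected": "ON",
        "wsrep_cluster_status": "Primary",
        "wsrep_local_state_comment": "Synced",
    },
}

usable = pick_nodes(cluster)
print(usable)  # ['node1', 'node3']
```

Here node2 is skipped until it reports itself as synced again, which mirrors the drop-and-rejoin behavior described above.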
Galera Cluster is implemented by activating the Galera Replication Plugin on a MySQL or MariaDB database server. This transactional replication plugin works through an extension of the MySQL replication plugin API called the Write-Set Replication API, or wsrep API. The API is the interface between Galera replication and the MySQL/MariaDB DBMS engine, and it is responsible for the multi-master synchronous replication, which is also certification-based. In certification-based replication, the Galera plugin packages each write transaction on a node into a write-set that contains the changed database rows and the locks held at the time of the transaction. The write-set is added to the transaction queue, and each node certifies it against the other write-sets in the applier queue. If certification succeeds, the transaction commits and the writes are applied to the node’s tablespace. If it fails, the write-set is discarded and the originating transaction is rolled back. This is a “logical synchronization”: because each node certifies the write-set independently, the actual writing and committing to each node’s tablespace is also independent, and hence asynchronous.
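The certification step can be sketched with a toy model. This is a simplified illustration, not Galera's actual implementation: the WriteSet and Certifier classes, the last_seen field, and the row keys are all assumptions made for the example. The idea it shows is that every node runs the same deterministic test on the same ordered stream of write-sets, so every node reaches the same commit or reject decision without asking the others.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WriteSet:
    origin: str        # node that executed the transaction (hypothetical name)
    last_seen: int     # sequence number of the last write-set the transaction had seen
    keys: frozenset    # rows the transaction modified, e.g. ("accounts", 1)


class Certifier:
    """Toy certification test, run independently on every node."""

    def __init__(self):
        self.seqno = 0
        self.committed = []  # (seqno, keys) of write-sets that passed

    def certify(self, ws):
        # Every delivered write-set gets the next sequence number.
        self.seqno += 1
        for seqno, keys in self.committed:
            # Conflict: another write-set touched the same row and
            # committed after this transaction took its snapshot.
            if seqno > ws.last_seen and keys & ws.keys:
                return False
        self.committed.append((self.seqno, ws.keys))
        return True


def run_node(stream):
    """Certify an ordered stream of write-sets on one node."""
    certifier = Certifier()
    return [certifier.certify(ws) for ws in stream]


a = WriteSet("node1", 0, frozenset({("accounts", 1)}))
b = WriteSet("node2", 0, frozenset({("accounts", 1)}))  # same row as a
c = WriteSet("node3", 0, frozenset({("accounts", 2)}))  # different row

# Three nodes see the same stream in the same order.
outcomes = [run_node([a, b, c]) for _ in range(3)]
print(outcomes[0])  # [True, False, True]
```

Write-set b is rejected on every node because a modified the same row first, while c passes because it touches a different row.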
Synchronous Vs. Asynchronous Replication
1. In synchronous replication, a change made on one node is guaranteed to be applied to the other nodes in the cluster as a certified transaction. In asynchronous replication, changes on the master are replicated to slave nodes only when a slave requests them, so there is no guarantee that a transaction is replicated within any time limit. If the master crashes, or if there is network or propagation delay, replication suffers.
2. Synchronous replication provides high availability: a node crash does not cause data loss or affect data integrity, because the other nodes maintain a consistent replica of the data and state.
3. Because replication is transaction-based, transactions are applied in parallel across all nodes.
However, synchronous replication carries extra overhead in the form of complexity and distributed locking, which introduces delays in the cluster compared to asynchronous replication.
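The difference in guarantees can be shown with a deliberately small model. The classes AsyncPair and SyncGroup below are hypothetical and ignore networking, certification, and failure detection; they only contrast when a commit is acknowledged relative to when the other nodes have the change.

```python
class AsyncPair:
    """Master acknowledges a commit before the slave has the change."""

    def __init__(self):
        self.master = []
        self.slave = []

    def commit(self, change):
        self.master.append(change)  # acknowledged immediately

    def slave_pull(self):
        # The slave copies whatever it is missing, only when it asks.
        self.slave.extend(self.master[len(self.slave):])


class SyncGroup:
    """A commit is acknowledged only after every node has the change."""

    def __init__(self, size):
        self.nodes = [[] for _ in range(size)]

    def commit(self, change):
        for node in self.nodes:
            node.append(change)


async_pair = AsyncPair()
async_pair.commit("t1")
async_pair.slave_pull()
async_pair.commit("t2")            # master crashes here, before the next pull
survivor_async = async_pair.slave

sync_group = SyncGroup(3)
sync_group.commit("t1")
sync_group.commit("t2")            # node 0 crashes here
survivor_sync = sync_group.nodes[1]

print(survivor_async)  # ['t1']
print(survivor_sync)   # ['t1', 't2']
```

In the asynchronous case the surviving slave has lost t2. In the synchronous case any surviving node still holds both transactions, at the cost of every commit waiting on the whole group.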
Replication in the Galera Cluster
The wsrep API that powers the Galera Replication Plugin uses a database-state model to implement replication. The state of a database is its data together with the locks (read, write, commit, and so on) held at a given point in time. When clients perform reads and writes, the database state changes. These state changes are translated into write-sets and recorded by the wsrep API as transactions. The nodes synchronize their states by applying these write-set transactions, which are broadcast through a transaction queue, in the same serial order. The Galera Replication Plugin manages write-set certification and replication across the cluster.
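The following sketch illustrates why applying write-sets in the same serial order leads every node to the same state. It is a toy model under stated assumptions: the Node class, the sequence numbers, and the dictionary of rows are hypothetical, and the reordering buffer stands in for the ordered delivery that the cluster itself provides.

```python
class Node:
    """Toy node that applies write-sets strictly in sequence-number order."""

    def __init__(self, name):
        self.name = name
        self.rows = {}      # the node's data
        self.applied = 0    # sequence number of the last applied write-set
        self.pending = {}   # write-sets received but not yet applicable

    def receive(self, seqno, changes):
        self.pending[seqno] = changes
        # Apply as many consecutive write-sets as are available.
        while self.applied + 1 in self.pending:
            self.applied += 1
            self.rows.update(self.pending.pop(self.applied))


write_sets = [
    (1, {"x": 1}),
    (2, {"x": 2, "y": 5}),
    (3, {"y": 7}),
]

node1, node2, node3 = Node("node1"), Node("node2"), Node("node3")

# Each node receives the same write-sets, but in a different arrival order.
for seqno, changes in write_sets:
    node1.receive(seqno, changes)
for seqno, changes in reversed(write_sets):
    node2.receive(seqno, changes)
for index in (1, 0, 2):
    node3.receive(*write_sets[index])

print(node1.rows)  # {'x': 2, 'y': 7}
print(node1.rows == node2.rows == node3.rows)  # True
```

Because each node applies the write-sets in sequence-number order regardless of arrival order, all three end in the same state.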
The different components of the Galera Replication Plugin are:
- Certification Layer: Creates write-sets from database state changes, performs certification checks on them, and ensures that they can be applied to the nodes in the cluster.
- Replication Layer: Manages the entire replication protocol implementation and controls the cluster.
- Group Communication Framework: Provides a plugin-based communication system for the various component groups that connect to the Galera Cluster.
Advantages of Galera Cluster
- High availability due to multiple masters synchronized together.
- True multi-master – read/write to any node.
- Tightly coupled by synchronization – all nodes maintain the same state.
- No data loss – changes are committed as transactions as they occur, without delay, so node failures do not affect data integrity.
- Automatic node provisioning.
- Hot standby – no downtime, since the remaining nodes take over for failed nodes.