introduce the lease and self-fence mechanism to support high availability of table metadata operation #18094
Open
alpass163 wants to merge 1 commit into
…lity of table metadata operation procedure
Summary
This PR introduces a lease-based self-fencing framework to enable high
availability for table metadata. It prevents DataNodes from serving stale
schema during network partitions.
Problem
Currently, in a cluster deployment (e.g., 1 ConfigNode, 3 DataNodes), metadata operations lack true high availability. If a single DataNode (DN) crashes or experiences a network partition, execution of DDL procedures (such as Create/Alter/Drop Table, View management, TTL adjustments, and Drop Database) will directly fail and trigger a rollback.
Solution
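Lease Mechanism
MetadataLeaseManager (DataNode): Tracks the lease, which is renewed by heartbeats. Uses a monotonic clock. Self-fences (clears cache, blocks reads/writes) if no heartbeat is received within metadata_lease_fence_ms (T_fence).
MetadataLeaseFencedException: Thrown on a fenced DataNode.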
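The lease rule above can be sketched in a few lines of Java. This is an illustration under stated assumptions, not the PR's `MetadataLeaseManager`: the class name `LeaseSketch`, the `onFence` hook, the lazy expiry check on access, and the choice to lift the fence on the next heartbeat are all hypothetical. The only detail taken from the description is that expiry is measured against a monotonic clock.

```java
import java.util.function.LongSupplier;

/** Hypothetical sketch of lease-based self-fencing; not the PR's MetadataLeaseManager. */
final class LeaseSketch {
  private final long fenceNanos;
  private final LongSupplier nanoClock; // monotonic source, e.g. System::nanoTime
  private final Runnable onFence;       // assumed hook, e.g. clear schema caches
  private long lastRenewalNanos;
  private boolean fenced;

  LeaseSketch(long fenceMs, LongSupplier nanoClock, Runnable onFence) {
    this.fenceNanos = fenceMs * 1_000_000L;
    this.nanoClock = nanoClock;
    this.onFence = onFence;
    this.lastRenewalNanos = nanoClock.getAsLong();
  }

  /** Call when a heartbeat arrives: restarts the lease window (assumed to lift the fence). */
  synchronized void renew() {
    lastRenewalNanos = nanoClock.getAsLong();
    fenced = false;
  }

  /** Call before serving a request that depends on cached metadata. */
  synchronized void ensureNotFenced() {
    // Differences of nanoTime-style readings stay correct across numeric overflow.
    long elapsed = nanoClock.getAsLong() - lastRenewalNanos;
    if (elapsed >= fenceNanos) {
      if (!fenced) {
        fenced = true;
        onFence.run(); // runs once per expiry
      }
      throw new IllegalStateException("metadata lease expired; node is self-fenced");
    }
  }

  synchronized boolean isFenced() {
    return fenced;
  }
}
```

In production the clock argument would be `System::nanoTime`; injecting it keeps the expiry logic testable without sleeping. A wall clock (`System.currentTimeMillis`) would be a poor substitute here because it can jump backwards or forwards when the system time is adjusted.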
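Broadcast Coordination
DataNodeContactTracker (ConfigNode): Tracks the last successful heartbeat response per DataNode. Maintained separately from load-balancing samples to ensure correctness.
MetadataBroadcastVerdict: PROCEED if all unacked DataNodes have been silent for T_proceed = T_fence + margin, WAIT otherwise, FAIL when the retry budget is exhausted.
ClusterCachePropagator: Broadcasts cache invalidations in a retry loop, waiting up to T_proceed for unresponsive DataNodes to be provably self-fenced.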
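The verdict rule reads naturally as a small pure function. The sketch below is an assumption-laden illustration, not the PR's `MetadataBroadcastVerdict`: the class name `VerdictSketch`, the parameter names, and the order in which PROCEED and FAIL are checked are hypothetical. The inputs are the silence durations, in milliseconds, of the DataNodes that have not yet acknowledged the broadcast.

```java
import java.util.Collection;

/** Hypothetical sketch of the PROCEED / WAIT / FAIL decision. */
final class VerdictSketch {
  enum Verdict { PROCEED, WAIT, FAIL }

  static Verdict decide(
      Collection<Long> silentMsOfUnacked, long fenceMs, long marginMs, int retriesLeft) {
    // T_proceed = T_fence + margin: after this much silence a DataNode
    // must already have fenced itself, so skipping it is safe.
    long proceedMs = fenceMs + marginMs;
    boolean allSilentLongEnough =
        silentMsOfUnacked.stream().allMatch(silentMs -> silentMs >= proceedMs);
    if (allSilentLongEnough) {
      return Verdict.PROCEED; // also covers the case where every DataNode acked
    }
    // Assumed ordering: the retry budget is consulted only when PROCEED is not yet safe.
    return retriesLeft > 0 ? Verdict.WAIT : Verdict.FAIL;
  }
}
```

With illustrative values `fenceMs = 20000` and `marginMs = 5000`, T_proceed is 25000 ms: an unacked DataNode that has been silent for 26000 ms can be skipped, while one silent for only 10000 ms forces WAIT until the retry budget runs out. The margin exists so that the ConfigNode's view of "silent long enough" stays conservative relative to the DataNode's own lease timer.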
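Schema Change Integration
Schema change procedures broadcast cache invalidations through ClusterCachePropagator before proceeding.
Marker table state (PreDeleteTsTable) added for safe table state transitions.
Rollback plan (RollbackPreDeleteTablePlan) for failed schema changes.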
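One way to picture the marker state and its rollback is as a three-state lifecycle. The sketch below is only a guess at the shape of the idea: the state names `ACTIVE`, `PRE_DELETE`, and `DELETED`, the class `TableLifecycleSketch`, and its methods are hypothetical and do not reflect how `PreDeleteTsTable` or `RollbackPreDeleteTablePlan` are actually implemented.

```java
/** Hypothetical sketch of a pre-delete marker state with rollback. */
final class TableLifecycleSketch {
  enum State { ACTIVE, PRE_DELETE, DELETED }

  private State state = State.ACTIVE;

  /** Step 1: mark the table before caches are invalidated cluster-wide. */
  void preDelete() {
    require(State.ACTIVE);
    state = State.PRE_DELETE;
  }

  /** Step 2 (success): the broadcast was acknowledged or safely skipped, so finish the drop. */
  void commitDelete() {
    require(State.PRE_DELETE);
    state = State.DELETED;
  }

  /** Step 2 (failure): undo the marker so the table is usable again. */
  void rollbackPreDelete() {
    require(State.PRE_DELETE);
    state = State.ACTIVE;
  }

  State state() {
    return state;
  }

  private void require(State expected) {
    if (state != expected) {
      throw new IllegalStateException("expected " + expected + " but was " + state);
    }
  }
}
```

The intermediate state gives a failed procedure a well-defined point to return from: a table that was only marked can be restored, whereas a table dropped outright could not.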
Configuration
New config:
metadata_lease_fence_ms (default in iotdb-system.properties.template).
Testing
MetadataLeaseManager, DataNodeContactTracker, MetadataBroadcastVerdict, ClusterCachePropagator
DataNodeTableCache, PartitionCache, ClusterAuthorityFetcher
IoTDBTableDDLHAIT
Key Components Added:
MetadataLeaseManager (Location: datanode): Tracks the lease and performs self-fencing upon expiration.
DataNodeContactTracker (Location: confignode): Tracks the last successful heartbeat timestamp for each DataNode.
ClusterCachePropagator (Location: confignode): Broadcasts cache invalidations with a fencing-aware retry mechanism.
MetadataBroadcastVerdict (Location: confignode): Decides when unacknowledged DataNodes are logically safe to skip.
MetadataLeaseFencedException (Location: node-commons): The explicit exception thrown on a fenced DataNode.
PreDeleteTsTable (Location: node-commons): Represents the marker table state to ensure safe schema deletion.