Introduce the lease and self-fence mechanism to support high availability of table metadata operation #18094

Open
alpass163 wants to merge 1 commit into apache:master from alpass163:cyb/metadata-table

Conversation

@alpass163

Contributor

Summary

This PR introduces a lease-based self-fencing framework to enable high
availability for table metadata. It prevents DataNodes from serving stale
schema during network partitions.

Problem

Currently, in a cluster deployment (e.g., 1 ConfigNode and 3 DataNodes), metadata operations are not truly highly available. If a single DataNode (DN) crashes or is isolated by a network partition, DDL procedures (Create/Alter/Drop Table, view management, TTL adjustments, and Drop Database) fail immediately and trigger a rollback.

Solution

Lease Mechanism

  • MetadataLeaseManager (DataNode): Tracks the lease via ConfigNode
    heartbeats, using a monotonic clock. Self-fences (clears caches, blocks
    reads/writes) if no heartbeat is received within
    metadata_lease_fence_ms (T_fence).
  • MetadataLeaseFencedException: Thrown when operations are blocked on a
    fenced DataNode.
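
The lease behavior described in the bullets above can be sketched roughly as follows. This is a minimal illustration, not the PR's MetadataLeaseManager: the class name LeaseSketch, the nested FencedException, the injectable clock, and the assumption that a later heartbeat lifts the fence are all hypothetical.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.LongSupplier;

/** Hypothetical sketch of a lease with self-fencing; names are illustrative only. */
final class LeaseSketch {
  /** Stand-in for MetadataLeaseFencedException. */
  static final class FencedException extends RuntimeException {
    FencedException(String message) {
      super(message);
    }
  }

  private final long fenceNanos;        // T_fence, converted to nanoseconds
  private final LongSupplier nanoClock; // monotonic source, e.g. System::nanoTime
  private final Runnable clearCaches;   // runs once each time the node fences itself
  private long lastHeartbeatNanos;
  private boolean fenced;

  LeaseSketch(long fenceMs, LongSupplier nanoClock, Runnable clearCaches) {
    this.fenceNanos = TimeUnit.MILLISECONDS.toNanos(fenceMs);
    this.nanoClock = nanoClock;
    this.clearCaches = clearCaches;
    this.lastHeartbeatNanos = nanoClock.getAsLong();
  }

  /** Renews the lease. Assumption: a fresh heartbeat also lifts an existing fence. */
  synchronized void onHeartbeat() {
    lastHeartbeatNanos = nanoClock.getAsLong();
    fenced = false;
  }

  /** Guard placed in front of metadata reads and writes. */
  synchronized void checkLease() {
    // Comparing the difference keeps the check correct even if nanoTime wraps.
    if (!fenced && nanoClock.getAsLong() - lastHeartbeatNanos >= fenceNanos) {
      fenced = true;
      clearCaches.run();
    }
    if (fenced) {
      throw new FencedException("metadata lease expired; node is self-fenced");
    }
  }
}
```

In production the clock would be System::nanoTime; injecting it keeps the expiry logic testable without sleeping. A monotonic source matters here because a wall-clock adjustment could otherwise extend or shorten the lease silently.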

Broadcast Coordination

  • DataNodeContactTracker (ConfigNode): Records time of last successful
    heartbeat response per DataNode. Separately maintained from load-balancing
    samples to ensure correctness.
  • MetadataBroadcastVerdict: Pure decision logic — PROCEED if all
    unacked DataNodes have been silent for T_proceed = T_fence + margin,
    WAIT otherwise, FAIL when retry budget exhausted.
  • ClusterCachePropagator: Broadcasts cache invalidations in a retry
    loop, waiting up to T_proceed so that any unresponsive DataNode is
    guaranteed to have self-fenced.
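
The PROCEED/WAIT/FAIL rule above might look like the following pure function. It is a sketch under stated assumptions, not the PR's MetadataBroadcastVerdict: the class name, the method signature, the use of integer DataNode ids, and the choice to treat a DataNode with no recorded contact as "not yet proven fenced" are all hypothetical.

```java
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of the broadcast verdict; names and signature are illustrative. */
final class BroadcastVerdictSketch {
  enum Verdict { PROCEED, WAIT, FAIL }

  /**
   * @param unackedDataNodes ids of DataNodes that have not acknowledged the invalidation
   * @param lastContactNanos last successful heartbeat response per DataNode (monotonic nanos)
   * @param nowNanos current monotonic time
   * @param proceedNanos T_proceed = T_fence + margin, in nanoseconds
   * @param retriesLeft remaining retry budget
   */
  static Verdict decide(
      Set<Integer> unackedDataNodes,
      Map<Integer, Long> lastContactNanos,
      long nowNanos,
      long proceedNanos,
      int retriesLeft) {
    boolean allSilentLongEnough = true;
    for (int dataNodeId : unackedDataNodes) {
      Long lastContact = lastContactNanos.get(dataNodeId);
      // Assumption: with no contact record we cannot prove the node has fenced itself.
      if (lastContact == null || nowNanos - lastContact < proceedNanos) {
        allSilentLongEnough = false;
        break;
      }
    }
    if (allSilentLongEnough) {
      // Every unacked node has been silent for T_proceed, so it must have self-fenced.
      return Verdict.PROCEED;
    }
    return retriesLeft > 0 ? Verdict.WAIT : Verdict.FAIL;
  }
}
```

Because T_proceed exceeds T_fence by a margin, a DataNode that the ConfigNode has not heard from for T_proceed has itself gone longer than T_fence without a heartbeat, which is what makes skipping it safe. Keeping the decision free of I/O lets it be unit-tested with plain maps and sets.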

Schema Change Integration

  • Procedures now propagate metadata invalidations via
    ClusterCachePropagator before proceeding.
  • Pre-deletion marker (PreDeleteTsTable) added for safe table state
    transitions.
  • Rollback mechanism (RollbackPreDeleteTablePlan) for failed schema
    changes.
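
One way to picture how the pre-deletion marker and rollback fit together is the small state machine below. It is only an illustration of the idea: the state names, the DropTableFlowSketch class, and the BooleanSupplier standing in for ClusterCachePropagator are hypothetical and do not reflect the actual procedure code in this PR.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical sketch of a drop-table flow with a pre-delete marker and rollback. */
final class DropTableFlowSketch {
  enum TableState { USING, PRE_DELETE, DELETED }

  private TableState state = TableState.USING;

  TableState state() {
    return state;
  }

  /**
   * @param propagateInvalidation stand-in for the cache propagation step; returns true
   *     when every DataNode has either acknowledged or is provably self-fenced
   * @return true if the table was deleted, false if the change was rolled back or rejected
   */
  boolean dropTable(BooleanSupplier propagateInvalidation) {
    if (state != TableState.USING) {
      return false; // already being deleted or gone
    }
    state = TableState.PRE_DELETE; // marker: table is on its way out

    boolean safeToProceed;
    try {
      safeToProceed = propagateInvalidation.getAsBoolean();
    } catch (RuntimeException e) {
      safeToProceed = false; // treat a propagation error like a failed broadcast
    }

    if (!safeToProceed) {
      state = TableState.USING; // rollback of the pre-delete marker
      return false;
    }
    state = TableState.DELETED;
    return true;
  }
}
```

The point of the intermediate state is that the deletion only becomes final after no DataNode can still serve the old schema; until then the marker can be undone without data loss.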

Configuration

New config: metadata_lease_fence_ms (default in
iotdb-system.properties.template).

Testing

  • Unit tests for MetadataLeaseManager, DataNodeContactTracker,
    MetadataBroadcastVerdict, ClusterCachePropagator
  • Lease integration tests for DataNodeTableCache, PartitionCache,
    ClusterAuthorityFetcher
  • New HA IT: IoTDBTableDDLHAIT

Key Components Added:

MetadataLeaseManager (Location: datanode): Tracks the lease and performs self-fencing upon expiration.

DataNodeContactTracker (Location: confignode): Tracks the last successful heartbeat timestamp for each DataNode.

ClusterCachePropagator (Location: confignode): Broadcasts cache invalidations with a fencing-aware retry mechanism.

MetadataBroadcastVerdict (Location: confignode): Decides when unacknowledged DataNodes are logically safe to skip.

MetadataLeaseFencedException (Location: node-commons): The explicit exception thrown on a fenced DataNode.

PreDeleteTsTable (Location: node-commons): Represents the marker table state to ensure safe schema deletion.
