fix(sector): guard zero-Fitch-words empty-vector UB (EW analogue of PR#264)#265
Merged
Conversation
…R#264) Any dataset that collapses to 0 Fitch blocks (total_words == 0) -- not just the all-hierarchy HSJ/xform case PR #264 fixed in wagner_tree/load_tip_states, but also plain equal-weights data where every character is constant or an autapomorphy -- leaves tip_states/prelim/edge_set genuinely empty std::vectors. Code that unconditionally took the address of element 0 of these empty vectors is undefined behaviour: it aborts under hardened libstdc++ assertions/ASan but is silent (out-of-bounds address, never dereferenced if the following block loop is also 0-trip) in a plain release build. Audited every zero-word-reachable site sharing this pattern and guarded each with the same have_words ternary wagner_tree already uses (never bare nullptr+0 arithmetic, which UBSan's pointer-overflow check would still flag): - ts_sector.cpp: compute_from_above_for_sector (called unconditionally on every build_reduced_dataset(), i.e. every sector reduction including the default rasStarts=1 -- the most severe of these, reachable with no non-default control), build_ras_sector (the RAS-restart start-tree builder, reachable via SearchControl(rasStarts>=2)), and build_reduced_dataset_collapsed (reachable via sectorCollapseTarget>0). - ts_prune_reinsert.cpp: build_reduced_dataset and expand_and_reinsert, reachable via pruneReinsertCycles>0 (auto-selected by the "large" preset at nTip>=120). - ts_temper.cpp: stochastic_tbr_phase (annealing), same "large"-preset reachability, needed its own top-level total_words==0 early return matching nni_search/tbr_search/ratchet_search/drift_search's existing style. - ts_tree.cpp: TreeState::save_node_state's prealloc fast path -- a shared primitive under TBR/ratchet/temper, guarded once at the source. - ts_wagner.cpp: wagner_goloboff_scores/wagner_entropy_scores (biased-Wagner tip scoring), also reachable via the "large" preset. Regression tests added to test-ts-sector.R (default control, rasStarts=3, sectorCollapseTarget=6, pruneReinsertCycles=1, annealCycles=1, all through MaximizeParsimony() on a 40-tip all-uninformative EW dataset) and test-ts-wagner.R (ts_wagner_bias_bench directly). Reachability of all 6 sites was confirmed empirically (temporary REprintf probes on each total_words==0 branch, fired under every new test, then removed). Verified the tests are non-tautological, not just reachability-proxy: a negative control reverting build_ras_sector's guard back to the unconditional &rd.data.tip_states[...] reproduces the exact predicted abort (Assertion '__n < this->size()' failed) under a local -D_GLIBCXX_ASSERTIONS -O0 -g build, with a gdb backtrace confirming build_ras_sector (ts_sector.cpp) -> search_sector -> xss_search -> run_single_replicate -> driven_search -> ts_driven_search, i.e. reachable from the public MaximizeParsimony() API. Restoring the guard reruns clean. Full targeted suite (sector, wagner, hsj, xform, anneal, driven, drift, ratchet, tbr, Concordance) passes under the same hardened build with no behavioural regression. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
df442df to
ed59a51
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
PR #264 fixed
wagner_tree/load_tip_statesfor the all-hierarchy HSJ/xformcase where a dataset collapses to 0 Fitch blocks (
total_words == 0), leavingtip_states/prelim/edge_setgenuinely emptystd::vectors. This PR fixesthe same undefined-behaviour class at 6 further sites, all reachable via plain
equal-weights data where every character is constant or an autapomorphy
(no HSJ/xform needed):
ts_sector.cpp:compute_from_above_for_sector(unconditional on everysector reduction — the most severe, reachable at default
rasStarts=1,no non-default control needed),
build_ras_sector(SearchControl(rasStarts>=2)),build_reduced_dataset_collapsed(sectorCollapseTarget>0)ts_prune_reinsert.cpp:build_reduced_dataset,expand_and_reinsert(
pruneReinsertCycles>0, auto-selected by the"large"preset atnTip>=120)ts_temper.cpp:stochastic_tbr_phase(annealing, same"large"-preset reachability)ts_tree.cpp:TreeState::save_node_stateprealloc fast path (shared TBR/ratchet/temper primitive)ts_wagner.cpp:wagner_goloboff_scores/wagner_entropy_scores(biased-Wagner tip scoring)Each guarded with the same
have_wordsternarywagner_treealready uses(never bare
nullptr + 0arithmetic, which UBSan's pointer-overflow checkwould still flag).
Verification
REprintfprobes on each
total_words==0branch fired under every new test before removal.(temporarily reverting
build_ras_sector's guard) reproduces the exactpredicted abort (
Assertion '__n < this->size()' failed) under a local-D_GLIBCXX_ASSERTIONS -O0 -gbuild, with a gdb backtrace confirmingbuild_ras_sector→search_sector→xss_search→run_single_replicate→
driven_search→ts_driven_search— reachable from the publicMaximizeParsimony()API. Restoring the guard reruns clean.ratchet, tbr, Concordance, prune-reinsert) passes under the same hardened
build with no behavioural regression.
cpp-searchtip (incl. #2228c6ae, #3d50dfd4); zerofile overlap, clean rebase, rebuilt and retested post-rebase.
Test plan
-D_GLIBCXX_ASSERTIONS -O0 -gbuild — all touched test files green, no abortCo-Authored-By: Claude Opus 4.8 noreply@anthropic.com