Skip to content

Fix small performance regression versus v2.9#214

Closed
tameware wants to merge 157 commits into
dds-bridge:developfrom
tameware:opus-two-percent
Closed

Fix small performance regression versus v2.9#214
tameware wants to merge 157 commits into
dds-bridge:developfrom
tameware:opus-two-percent

Conversation

@tameware

Copy link
Copy Markdown
Collaborator

Regains roughly 2% of performance.

Compare is the speedup-opus branch. Branch is this branch.

solver file           compare_avg   branch_avg cmp/branch note
------ ------------- ------------ ------------ ---------- ---------------
solve  list1000.txt          1.96         1.93      1.02x branch faster
solve  list100.txt           1.97         1.95      1.01x branch faster
solve  list10.txt            4.18         4.16      1.00x equal
solve  list1.txt            12.20        12.00      1.02x branch faster
calc   list1000.txt         10.68        10.58      1.01x branch faster
calc   list100.txt           8.88         8.80      1.01x branch faster
calc   list10.txt           15.50        14.56      1.06x branch faster
calc   list1.txt            39.60        39.00      1.02x branch faster

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR aims to regain a small (~2%) performance regression vs v2.9 by removing redundant/narrowing casts and simplifying a hot-path heuristic branch, reducing unnecessary conversions and work in frequently executed code paths.

Changes:

  • Removes redundant static_cast<unsigned char>(...) / double-cast patterns when indexing rank/relative-rank tables.
  • Simplifies part of weight_alloc_trump_void1() for the “void in trump when trump is led” discard case.
  • Removes unnecessary casts when copying winner/second-best ranks and hands from RelRanksType into Pos.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.

File Description
library/src/quick_tricks.cpp Simplifies rank-to-index conversion when setting win_ranks from AbsRankType::rank.
library/src/heuristic_sorting/heuristic_sorting.cpp Removes redundant casts in rel_rank indexing and simplifies a discard-weighting branch in a hot heuristic function.
library/src/ab_search.cpp Removes unnecessary casts when updating cached winner/second-best info from thrp->rel[...].

@tameware tameware requested a review from zzcgumn June 28, 2026 14:09
@tameware tameware marked this pull request as ready for review June 28, 2026 14:09
@tameware tameware self-assigned this Jun 28, 2026
int suitAdd;

if (suit == trump)
if (lead_suit == trump) // We pitch

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This confuses me, is this not the same fix as the previous pull request?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you like me to investigate and address, or can this be dealt with in merge?

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you try to rebase or merge this branch with latest develop?

@tameware

tameware commented Jun 29, 2026 via email

Copy link
Copy Markdown
Collaborator Author

tameware and others added 22 commits July 2, 2026 22:06
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Pass the resolved dds_mvp.js path via DDS_MVP_JS for Bazel runfiles, and add focus() to mock DOM elements so clearTestData and pageLoad work when merged with web changes.

Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Add upload-artifact@v6 steps to Linux, macOS, Windows, and WASM workflows using bazel-testlogs/ with if-no-files-found: ignore.

Co-authored-by: Cursor <cursoragent@cursor.com>
zzcgumn and others added 26 commits July 2, 2026 22:07
Co-authored-by: Cursor <cursoragent@cursor.com>
docs/BUILD_SYSTEM.md — new section “AddressSanitizer and ThreadSanitizer (macOS)” covering:

The three-way clang major coupling (MODULE.bazel, Xcode, .bazelrc TSAN rpath)
Why ASAN uses the full Xcode toolchain vs TSAN using LLVM compile + Xcode runtime
Steps when upgrading LLVM or Xcode
Smoke-test commands
Also corrected the outdated LLVM version note (darwin was listed as 20.1.8; it’s 21.1.8).

.bazelrc — short pointers to BUILD_SYSTEM.md; TSAN rpath comment notes the hardcoded clang/21 must stay in sync.

MODULE.bazel — comment on llvm_versions to update the TSAN rpath and verify Xcode when bumping LLVM.
…arking global arrays as static. Commented out unused SORT_SOLVE_STRENGTH and SORT_SOLVE_STRENGTH_CUTOFF
The BUILD change ensures a stats test links only system_util_stats, not system + system_util_stats. One system variant → one ThreadMgr → one set of mutexes. That matches the intent: a process-wide thread manager.
Wire a Bazel config_setting into DDS_LOCAL_DEFINES so AB stats can be
turned on without --cxxopt=-DDDS_AB_STATS; refresh MODULE.bazel.lock.

Co-authored-by: Cursor <cursoragent@cursor.com>
Pass dtest_effective_threads through to calc batches so calc threading
matches solve and responds to the -n flag.

Co-authored-by: Cursor <cursoragent@cursor.com>
SetResources ignores maxThreads; threading is handled via dtest_effective_threads
and the *N batch APIs. Passing -n only triggered a misleading library warning.

itest.cpp will be removed in a future PR.
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
The file was excluded from the dtest build, had no Bazel target, and
referenced a nonexistent realMain entry point.

Co-authored-by: Cursor <cursoragent@cursor.com>
Dereference the transposition-table NodeCards pointer to match the
DumpRetrieved(const NodeCards&) signature when DDS_AB_HITS is enabled.

Co-authored-by: Cursor <cursoragent@cursor.com>
Re-add the DDS_AB_HITS dump helpers in dump.cpp so debug_all links
succeed when ab_search logs transposition-table hits.

Co-authored-by: Cursor <cursoragent@cursor.com>
The heuristic extraction refactor changed weight_alloc_trump_void1's
first branch from `lead_suit == trump` to `suit == trump`. Since that
is exhaustive with the following `else if (suit != trump)`, the three
ruffing branches (using the `24 - rank + ...` formula) became dead
code, and trump ruffs were scored with side-suit discard weights
instead. This mis-ordered ruffs, costing alpha-beta cutoffs.

The effect is small for solve but compounds heavily in calc's warm-TT
iterative deepening: calc explored ~34% more nodes than v2.9. Restoring
the original `lead_suit == trump` pitch branch makes the ruffing
branches reachable again and cuts calc time ~25% (gap to v2.9: 1.37x ->
1.02x). Ordering-only change; double-dummy results are unchanged.

Co-authored-by: Cursor <cursoragent@cursor.com>
Per Copilot.

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
The heuristic/quick-tricks refactor introduced static_cast<unsigned char>
wrappers on values that v2.9 used as signed, changing search behavior:

- make_3 / make_3_ctx: winner[]/second_best[] .hand and .rank were cast
  to unsigned char, turning the -1 "no card" sentinel into 255. This broke
  winner[trump].hand == -1 style checks in QuickTricks, losing cutoffs.
- weight_alloc_trump_void2 / _void3: rel_rank[aggr[suit]][...] indexed
  through static_cast<unsigned char>(aggr[suit]), truncating the 13-bit
  aggregate holding to 8 bits and reading the wrong rel_rank row.
- QuickTricksPartnerHand{Trump,NT}: bit_map_rank index cast the signed
  rank through unsigned char.

With these reverted to v2.9's signed handling, the per-move-generation
ordering trace now matches v2.9 exactly (0 divergences on list1), closing
the residual calc gap to parity. Ordering/pruning-only change; double-dummy
results are unchanged and all library tests pass.

Co-authored-by: Cursor <cursoragent@cursor.com>
Whitespace-only cleanup of misindented if/else-if chains and wrapped
conditions left over from the v2.9 port; no logic change.

Co-authored-by: Cursor <cursoragent@cursor.com>
@tameware tameware force-pushed the opus-two-percent branch from 6234595 to b8143cd Compare July 3, 2026 03:56
@tameware

tameware commented Jul 3, 2026

Copy link
Copy Markdown
Collaborator Author

The rebase was a disaster. I made a new PR #223 instead.

@tameware tameware closed this Jul 3, 2026
@tameware tameware deleted the opus-two-percent branch July 3, 2026 04:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants