Restrict CybORG player protocol by Muhtasham · Pull Request #110 · CodeClash-ai/CodeClash

Muhtasham · 2026-06-25T00:44:48Z

Summary

- replace the CybORG native BaseAgent submission surface with a restricted decide(observation, action_space) policy function
- keep the trusted runtime in charge of the CybORG/PettingZoo environment, action validation, scoring, and result-file handling
- run submitted policies in isolated per-agent worker processes with startup handshakes, per-decision timeouts, restart-on-timeout behavior, invalid-action clamping, and error details
- add validation timeouts, crash-score handling for missing result files, updated starter/docs/config, and tests for the restricted protocol
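
For readers who want a concrete picture of the worker model described above, here is a minimal sketch, not the code in this PR. It assumes a line-delimited JSON protocol over pipes, an integer `action_space` giving the number of discrete actions, and a default action of 0. `PolicyWorker`, `WORKER_SOURCE`, and the timeout values are hypothetical names and numbers chosen for illustration.

```python
import json
import queue
import subprocess
import sys
import threading

# Hypothetical worker script: load the submitted policy, announce readiness,
# then answer one JSON request per line on stdin.
WORKER_SOURCE = """
import json, sys
namespace = {}
exec(sys.argv[1], namespace)
print(json.dumps({"ready": True}), flush=True)
for line in sys.stdin:
    request = json.loads(line)
    action = namespace["decide"](request["observation"], request["action_space"])
    print(json.dumps({"action": action}), flush=True)
"""


def _pump(stream, replies):
    """Forward worker stdout lines to a queue; None marks end of stream."""
    for line in stream:
        replies.put(line)
    replies.put(None)


class PolicyWorker:
    def __init__(self, policy_source, startup_timeout=30.0, decision_timeout=2.0):
        self.policy_source = policy_source
        self.startup_timeout = startup_timeout
        self.decision_timeout = decision_timeout
        self.restarts = 0
        self._start()

    def _start(self):
        self.proc = subprocess.Popen(
            [sys.executable, "-c", WORKER_SOURCE, self.policy_source],
            stdin=subprocess.PIPE, stdout=subprocess.PIPE,
            stderr=subprocess.DEVNULL, text=True,
        )
        self.replies = queue.Queue()
        threading.Thread(
            target=_pump, args=(self.proc.stdout, self.replies), daemon=True
        ).start()
        # Startup handshake: no decision is requested until the worker is ready.
        reply = self._read(self.startup_timeout)
        if not reply or not reply.get("ready"):
            self.close()
            raise RuntimeError("policy worker failed its startup handshake")

    def _read(self, timeout):
        """Return the next reply as a dict, or None on timeout, exit, or garbage."""
        try:
            line = self.replies.get(timeout=timeout)
            reply = json.loads(line) if line is not None else None
        except (queue.Empty, ValueError):
            return None
        return reply if isinstance(reply, dict) else None

    def close(self):
        self.proc.kill()
        self.proc.wait()

    def decide(self, observation, action_space, fallback=0):
        try:
            request = {"observation": observation, "action_space": action_space}
            self.proc.stdin.write(json.dumps(request) + "\n")
            self.proc.stdin.flush()
            reply = self._read(self.decision_timeout)
        except OSError:  # the worker already exited
            reply = None
        if reply is None:
            # Timeout or crash: restart the worker and use the default action.
            self.close()
            self.restarts += 1
            self._start()
            return fallback
        action = reply.get("action")
        valid = (
            isinstance(action, int)
            and not isinstance(action, bool)
            and 0 <= action < action_space
        )
        return action if valid else fallback
```

In this sketch an invalid action is replaced by the default rather than clipped into range; the PR's actual clamping rule may differ. Because the policy runs in its own process, a looping or crashing `decide` costs one decision and a restart instead of the whole run.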

Design Choice For Review

This intentionally makes CybORG more CodeClash-controlled than a native simulator-agent submission.

Instead of letting submitted code instantiate or mutate CybORG BaseAgent objects directly, the arena exposes only a plain policy callback:

def decide(observation, action_space):
    return 0

The tradeoff is deliberate:

- pro: simulator ownership, scoring, validation, and timeouts stay in trusted arena code
- pro: submitted code is easier to isolate and failures degrade into logged fallback/default actions instead of corrupting the run
- con: policies cannot use the full native CybORG agent API directly

@john-b-yang could you sanity-check whether this restricted policy interface is the right CodeClash-compatible shape for CybORG, or whether you would rather expose native CybORG agent classes for more expressivity?

Verification

- uv run ruff check codeclash/arenas/cyborg/cyborg.py codeclash/arenas/cyborg/runtime/run_cyborg.py tests/arenas/test_cyborg.py
- uv run pytest -q tests/arenas/test_cyborg.py -> 11 passed
- uv run pytest -q tests/arenas -> 187 passed
- uv run pre-commit run --files codeclash/arenas/cyborg/cyborg.py codeclash/arenas/cyborg/runtime/README.md codeclash/arenas/cyborg/runtime/cyborg_agent.py codeclash/arenas/cyborg/runtime/run_cyborg.py configs/examples/CybORG__dummy__r1__s2.yaml docs/reference/arenas/cyborg.md tests/arenas/test_cyborg.py
- docker build -t codeclash/cyborg -f codeclash/arenas/cyborg/CybORG.Dockerfile .
- direct Docker adversarial smoke with invalid-action, infinite-loop, and passive policies: invalid actions were clamped/logged; looping policy timed out per decision; runtime completed and wrote scores
- uv run python main.py configs/examples/CybORG__dummy__r1__s2.yaml -o /private/tmp/codeclash-cyborg-final.e3pfFk -> two launcher rounds completed, both players validated, all details had status: "ok", steps_completed: 5, policy_errors: 0
- after adding worker startup handshakes: rebuilt codeclash/cyborg and reran configs/examples/CybORG__dummy__r1__s2.yaml; both launcher rounds completed with policy_errors_total: 0 and invalid_actions_total: 0
- uv run pytest -q -> 189 passed
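
The crash-score handling and the per-player detail checks mentioned above could look roughly like the following sketch. The field names `status`, `steps_completed`, and `policy_errors` come from the verification output; the file names, the `score` field, the `"crashed"` status, and the `CRASH_SCORE` value are assumptions made for illustration, not the arena's actual format.

```python
import json
import tempfile
from pathlib import Path

CRASH_SCORE = -1000.0  # hypothetical penalty, not the arena's actual value


def load_result(path, crash_score=CRASH_SCORE):
    """Read one player's result file, or synthesize a crash record."""
    try:
        details = json.loads(Path(path).read_text())
    except (OSError, ValueError):
        details = None
    if not isinstance(details, dict):
        # Missing or unreadable file: the runtime never finished for this player.
        return {"status": "crashed", "score": crash_score, "steps_completed": 0}
    return details


def is_clean(details, expected_steps):
    """Mirror the manual check: status ok, all steps run, no policy errors."""
    return (
        details.get("status") == "ok"
        and details.get("steps_completed") == expected_steps
        and details.get("policy_errors", 0) == 0
    )


# Usage: one player never wrote a result file, the other finished normally.
with tempfile.TemporaryDirectory() as tmp:
    missing = load_result(Path(tmp) / "player_a.json")
    ok_path = Path(tmp) / "player_b.json"
    ok_path.write_text(
        json.dumps({"status": "ok", "score": 1.5, "steps_completed": 5, "policy_errors": 0})
    )
    finished = load_result(ok_path)
```

Treating a missing file as a scored crash, rather than raising, lets a round complete and be compared even when one player's runtime dies.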

john-b-yang · 2026-06-29T20:07:09Z

Similar response to what I put in #110. I think for future arenas, I'm slightly in favor of just giving an agent all of the "bot" code that a human participant would normally receive, to make things fair. But the design choice here is very much sound, and I'll respect it. I think this makes a lot of sense.

Just one note - CybORG seems like a setting where the models' code is not actually going head to head, but playing against an identical adversary?

I think this is OK. Technically, all the other arenas have models going against each other head to head, in that their code is directly competing, so this is somewhat different. Thinking about it, I think it's great that we have this style of competition included in the arena; I just wanted to point it out and make sure I had the correct understanding.

Muhtasham · 2026-06-29T20:51:21Z

Yes, that’s correct. CybORG is not direct model-vs-model head-to-head in the same environment. Each submitted policy is evaluated independently on the same seeded DroneSwarm episode batch, and CodeClash compares average rewards.

So it is an independent score-maximization arena rather than direct head-to-head competition. I’ll make sure we describe it that way in docs/public wording and avoid implying the models directly interact with each other there.
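
To make the scoring model concrete, here is a minimal sketch of independent evaluation on a shared seeded batch. It uses a toy stand-in environment, not CybORG or DroneSwarm: `run_episode`, `rank_policies`, the reward rule, and the seed list are all hypothetical. Only the shape is taken from the description above: same seeds for every policy, no interaction between policies, comparison by average reward.

```python
import random
from statistics import mean


def run_episode(policy, seed, steps=5, num_actions=4):
    """Toy episode: reward 1.0 each step the policy matches a seeded target."""
    rng = random.Random(seed)  # same seed gives the same episode for every policy
    total = 0.0
    for _ in range(steps):
        target = rng.randrange(num_actions)
        action = policy(target, num_actions)  # the observation is the target here
        total += 1.0 if action == target else 0.0
    return total


def rank_policies(policies, seeds):
    """Score each policy alone on the shared seed batch, best average first."""
    scores = {
        name: mean([run_episode(policy, seed) for seed in seeds])
        for name, policy in policies.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


policies = {
    "passive": lambda observation, action_space: 0,
    "echo": lambda observation, action_space: observation,
}
ranking = rank_policies(policies, seeds=list(range(8)))
```

Because each policy sees an identical episode batch, a difference in average reward reflects the policies themselves rather than luck in the draw, which is what makes the comparison fair without any direct interaction.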

For future simulator arenas, I’ll default to the human-participant bot surface when it is clean and safe, and use restricted policy interfaces when native simulator objects make scoring/isolation too fragile.

Muhtasham added 2 commits June 24, 2026 20:42

Restrict CybORG player protocol

7ea5b6e

Wait for CybORG policy workers before decisions

0b1a7fe

Muhtasham requested a review from john-b-yang June 25, 2026 14:39

john-b-yang merged commit d25fb8a into CodeClash-ai:main Jun 29, 2026
4 checks passed

john-b-yang mentioned this pull request Jun 29, 2026

Restrict SCML player protocol #111

Merged

john-b-yang merged 2 commits into CodeClash-ai:main from Muhtasham:feat/cyborg-restricted-protocol
