Restrict CybORG player protocol #110
Conversation
Similar response to what I put in #110. For future arenas, I'm slightly in favor of giving an agent all of the "bot" code that a human participant would normally receive, to make things fair, but the design choice here is very much sound and I'll respect it. I think it makes a lot of sense. One note: CybORG seems like a setting where the models' code is not actually going head to head, but each is playing against an identical adversary? In all the other arenas the models' code competes directly, so this is somewhat different. Thinking about it, I think this is fine, and it's great to have this style of competition included in the arena. I just wanted to point it out and make sure I had the correct understanding.

Yes, that’s correct. CybORG is not direct model-vs-model head-to-head in the same environment. Each submitted policy is evaluated independently on the same seeded DroneSwarm episode batch, and CodeClash compares average rewards. So it is an independent score-maximization arena rather than direct head-to-head competition. I’ll make sure we describe it that way in docs/public wording and avoid implying the models directly interact with each other there. For future simulator arenas, I’ll default to the human-participant bot surface when it is clean and safe, and use restricted policy interfaces when native simulator objects make scoring/isolation too fragile.
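To make the "independent score-maximization" framing concrete, here is a minimal sketch of the comparison described above. It is not the CodeClash implementation: the toy episode, `run_episode`, `average_reward`, and `rank` are all hypothetical stand-ins, and the reward rule is invented for illustration. The only properties it tries to mirror are that every policy sees the same seeded episodes and that the ranking comes from average reward.

```python
import random
from statistics import mean
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical policy shape for this toy: (observation, action_space) -> action.
Policy = Callable[[int, Sequence[int]], int]


def run_episode(policy: Policy, seed: int, steps: int = 5) -> float:
    """Play one toy episode; the seed fully determines what the policy observes."""
    rng = random.Random(seed)  # same seed -> identical episode for every policy
    total = 0.0
    for _ in range(steps):
        observation = rng.randint(0, 9)
        action_space = [0, 1, 2]
        target = observation % 3  # invented reward rule, illustration only
        action = policy(observation, action_space)
        total += 1.0 if action == target else 0.0
    return total


def average_reward(policy: Policy, seeds: Sequence[int]) -> float:
    """Evaluate one policy on its own, over the shared seed batch."""
    return mean(run_episode(policy, seed) for seed in seeds)


def rank(policies: Dict[str, Policy], seeds: Sequence[int]) -> List[Tuple[str, float]]:
    """Score each policy independently, then sort by average reward (best first)."""
    scores = {name: average_reward(policy, seeds) for name, policy in policies.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

Because no policy ever appears in another policy's episode, one submission cannot influence another's score; the only coupling is the shared seed batch.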
Summary
- Replace the `BaseAgent` submission surface with a restricted `decide(observation, action_space)` policy function
Design Choice For Review
This intentionally makes CybORG more CodeClash-controlled than a native simulator-agent submission.
Instead of letting submitted code instantiate or mutate CybORG `BaseAgent` objects directly, the arena exposes only a plain policy callback, `decide(observation, action_space)`.
The tradeoff is deliberate.
@john-b-yang could you sanity-check whether this restricted policy interface is the right CodeClash-compatible shape for CybORG, or whether you would rather expose native CybORG agent classes for more expressivity?
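For reviewers weighing that question, here is a minimal sketch of what a submission under the restricted interface might look like, and how a harness could guard the call. Only the `decide(observation, action_space)` signature comes from this PR. The types of `observation` and `action_space`, the `safe_step` helper, the fallback action, and the counter names are assumptions made for illustration, loosely modeled on the `policy_errors` and `invalid_actions_total` fields reported under Verification.

```python
from typing import Any, Callable, Dict, Sequence


def decide(observation: Any, action_space: Sequence[Any]) -> Any:
    """Toy policy: pick the first offered action.

    Assumes action_space is a non-empty sequence of candidate actions;
    the real arena may pass a different structure.
    """
    return action_space[0]


def safe_step(
    policy: Callable[[Any, Sequence[Any]], Any],
    observation: Any,
    action_space: Sequence[Any],
    fallback: Any,
    stats: Dict[str, int],
) -> Any:
    """Hypothetical harness-side wrapper: call the policy without trusting it."""
    try:
        action = policy(observation, action_space)
    except Exception:
        # A crashing policy is counted and replaced by the fallback action.
        stats["policy_errors"] += 1
        return fallback
    if action not in action_space:
        # An action outside the offered space is counted and replaced too.
        stats["invalid_actions"] += 1
        return fallback
    return action
```

The point of the sketch is that the submission never touches simulator objects: it receives plain data and returns a choice, so scoring and isolation stay on the harness side.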
Verification
- `uv run ruff check codeclash/arenas/cyborg/cyborg.py codeclash/arenas/cyborg/runtime/run_cyborg.py tests/arenas/test_cyborg.py`
- `uv run pytest -q tests/arenas/test_cyborg.py` -> 11 passed
- `uv run pytest -q tests/arenas` -> 187 passed
- `uv run pre-commit run --files codeclash/arenas/cyborg/cyborg.py codeclash/arenas/cyborg/runtime/README.md codeclash/arenas/cyborg/runtime/cyborg_agent.py codeclash/arenas/cyborg/runtime/run_cyborg.py configs/examples/CybORG__dummy__r1__s2.yaml docs/reference/arenas/cyborg.md tests/arenas/test_cyborg.py`
- `docker build -t codeclash/cyborg -f codeclash/arenas/cyborg/CybORG.Dockerfile .`
- `uv run python main.py configs/examples/CybORG__dummy__r1__s2.yaml -o /private/tmp/codeclash-cyborg-final.e3pfFk` -> two launcher rounds completed, both players validated, all details had `status: "ok"`, `steps_completed: 5`, `policy_errors: 0`
- Rebuilt `codeclash/cyborg` and reran `configs/examples/CybORG__dummy__r1__s2.yaml`; both launcher rounds completed with `policy_errors_total: 0` and `invalid_actions_total: 0`
- `uv run pytest -q` -> 189 passed