Remove DangerousDevelopmentOnlyAuthHandler in Favour of a Test-Only TestAuthHandler
Context
DangerousDevelopmentOnlyAuthHandler was introduced when the platform had no identity provider of its own. Authenticating required bringing an external directory — typically Azure AD — and wiring it into the deployment; there was no "start the stack and log in" path for local development. The no-auth handler filled that gap: it fabricated a constant UserIdentity so developers could run the API, bots, and agent playgrounds against their laptops without provisioning AD first. It lived next to the production handlers under swiss_ai_hub.core.auth.dependencies.* and was re-exported through the public swiss_ai_hub.core.auth interface. Its companion DangerousDevelopmentOnlyAuthSettings exposed the test identity (name, email, oid, roles) as environment-variable-backed Pydantic settings.
Two platform changes obsoleted that use case and made the handler actively harmful to keep:
- Keycloak is now part of the default dev stack. The recent introduction of Keycloak as the platform's bundled identity provider means docker compose up delivers a working auth flow out of the box, with no external AD required. The "local dev without any IdP" scenario the handler was built for no longer exists; the honest path (log in to the local Keycloak, get a real JWT) is now shorter than the bypass path was.
- Keycloak became the ground truth for user and tenant existence. Large parts of the codebase now consult Keycloak directly: membership checks (KeycloakAdminService.is_user_member_of_tenant, per ADRs 2026_04_15_sysadmin_implicit_admin_access and 2026_04_15_superuser_added_to_every_new_tenant), user lookups, tenant-group existence (ADR 2026_04_15_keycloak_as_tenant_existence_authority), active-tenant attributes, and realm-role resolution. The no-auth handler was architecturally stuck on the wrong side of that shift: it fabricated a self-contained UserIdentity with a hardcoded __dangerous_development_only_tenant__ that did not exist in Keycloak, a user that was not a member of any real Keycloak group, and roles that were not backed by any realm role. Making that shape work with the current code would have required mocking Keycloak at every call site the handler touches, which is exactly what the test fixtures already do (_build_fake_admin() in testing/auth_utils/user_mocks.py patches the KeycloakAdmin factory with a stateful in-memory fake that every production auth handler is already tested against). So the no-auth handler's original value proposition ("avoid needing to mock Keycloak") no longer holds; the mock is required either way.
A third concern made the handler a liability rather than just a redundancy: it was the default fallback in Controller.__init__. self.auth: AuthHandler = auth or DangerousDevelopmentOnlyAuthHandler() silently installed an auth-bypass on any controller whose caller forgot to pass auth. packages/bot/app/main.py in fact used the handler deliberately as a production placeholder because the Controller base required one. Co-locating test helpers with production handlers — under the same import path, re-exported from the same public interface — normalised the exact thing that should stay anomalous, and turned a config mistake into a silent auth bypass.
Decision Drivers
- Local dev now has an honest path: Before bundling Keycloak, skipping authentication locally meant running without any identity at all — there was nothing to log in to. With Keycloak in the default dev stack, "start the stack and log in" is the short path; a no-auth shortcut no longer saves anyone meaningful time.
- A self-contained fake identity contradicts the Keycloak-as-ground-truth model: The platform now answers "does this user exist?", "is this user a member of tenant X?", "does this tenant exist?", and "does this user have the sysadmin realm role?" by asking Keycloak. A handler that fabricates a UserIdentity plus a fabricated tenant (__dangerous_development_only_tenant__), without corresponding Keycloak records, fails every one of those checks. Keeping the handler functional would require mocking Keycloak at every call site it touches, which is what the test infrastructure already does. There is no reduction in mocking surface to defend anymore.
- Separate safety from convenience: A bypass-auth handler is a legitimate test tool, not a production artefact. Co-locating it with KeycloakAuthHandler under swiss_ai_hub.core.auth and re-exporting it through the public interface normalises the anomaly. Moving the replacement under swiss_ai_hub.core.testing, and keeping it out of the production auth __all__, makes the boundary a property of the import path, not a naming convention.
- Controller must require an explicit auth handler: The old self.auth: AuthHandler = auth or DangerousDevelopmentOnlyAuthHandler() fallback silently installed an auth-bypass handler whenever a caller forgot to pass auth. TypeError on auth=None is safer than a silent fallback, and no current caller relied on the fallback anyway.
- No parallel identity configuration: A BaseSettings class with environment overrides invited per-developer drift ("my local DANGEROUS_DEV_ONLY_AUTH_FAKE_OID is different from yours") for no real benefit. Plain module-level constants are shorter, deterministic, and impossible to misconfigure.
- Tests exercise the real membership pipeline anyway: TestAuthHandler delegates to AuthHandler.build_identity(), and the fake admin (already session-autouse in testing/auth_utils/user_mocks.py) supplies the membership data. Tests catch Keycloak-first regressions instead of hiding them behind a self-contained fake.
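The drift concern behind "No parallel identity configuration" can be illustrated with a minimal sketch. It uses os.environ to emulate an environment-overridable setting rather than the real Pydantic settings class, and the OID value is a placeholder, not the value used in the repository.

```python
import os
from typing import Final

# Placeholder value for illustration only; the real constant lives in test_identity.py.
TEST_USER_OID: Final[str] = "00000000-0000-0000-0000-000000000001"


def env_backed_oid() -> str:
    """Emulate the removed pattern: an identity field overridable per machine."""
    return os.environ.get("DANGEROUS_DEV_ONLY_AUTH_FAKE_OID", TEST_USER_OID)


def constant_oid() -> str:
    """The replacement pattern: one value, with no environment surface."""
    return TEST_USER_OID
```

With the override variable set on one laptop, env_backed_oid() returns a different identity there than anywhere else, while constant_oid() stays the same on every machine.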
Decision
Delete DangerousDevelopmentOnlyAuthHandler and DangerousDevelopmentOnlyAuthSettings entirely. Replace them with swiss_ai_hub.core.testing.auth_utils.TestAuthHandler (the bypass handler) and plain module-level constants in test_identity.py (TEST_USER_OID, TEST_USER_NAME, TEST_USER_EMAIL, TEST_USER_ROLES, TEST_TENANT_ID). Make Controller.__init__(auth=...) fail loudly with TypeError when auth is None rather than silently falling back to a bypass handler. The production bot entry point (packages/bot/app/main.py), which used the old handler as a placeholder, switches to KeycloakAuthHandler — if the endpoint is ever reached by an external request the handler fails closed on missing JWT.
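A minimal sketch of the "fail loudly" constructor follows. The AuthHandler protocol and FailClosedAuthHandler are hypothetical stand-ins whose method shapes are assumed for illustration; the real Controller and KeycloakAuthHandler have richer interfaces.

```python
from typing import Optional, Protocol


class AuthHandler(Protocol):
    """Stand-in for the platform's AuthHandler interface (shape assumed)."""

    def authenticate(self, token: Optional[str]) -> str: ...


class FailClosedAuthHandler:
    """Hypothetical handler that rejects any request arriving without a token."""

    def authenticate(self, token: Optional[str]) -> str:
        if not token:
            raise PermissionError("missing JWT")
        return "validated-subject"


class Controller:
    """Simplified controller: no default handler, so no silent bypass."""

    def __init__(self, auth: Optional[AuthHandler] = None) -> None:
        if auth is None:
            # Previously: auth or DangerousDevelopmentOnlyAuthHandler()
            raise TypeError("Controller requires an explicit auth handler")
        self.auth: AuthHandler = auth
```

Under these assumptions, Controller() raises TypeError at construction time, and a controller built with a fail-closed handler rejects a request with no token instead of fabricating an identity.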
The namespace move is the load-bearing part of the change: TestAuthHandler lives under swiss_ai_hub.core.testing.auth_utils and is not re-exported through swiss_ai_hub.core.auth. Production code cannot import it through the auth public interface; test code imports it explicitly through the testing package. The boundary is enforced by what is and isn't in __all__, not by naming conventions or comments.
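One way to keep that boundary from eroding is a small guard test over the public export list. The sketch below is an assumption about how such a check could look, not a description of an existing test; it uses in-memory stand-in modules so it runs without the real swiss_ai_hub package.

```python
import types
from typing import Iterable, List

# Names that must never appear in the production auth package's __all__.
FORBIDDEN_EXPORTS = ("TestAuthHandler",)


def leaked_test_helpers(
    module: types.ModuleType, forbidden: Iterable[str] = FORBIDDEN_EXPORTS
) -> List[str]:
    """Return the forbidden names that the module lists in its __all__."""
    exported = set(getattr(module, "__all__", ()))
    return sorted(exported & set(forbidden))


# Stand-in for swiss_ai_hub.core.auth (export list assumed for illustration).
clean_auth_pkg = types.ModuleType("fake_core_auth")
clean_auth_pkg.__all__ = ["AuthHandler", "KeycloakAuthHandler"]

# Stand-in for a regression where someone re-exports the test handler.
leaky_auth_pkg = types.ModuleType("fake_leaky_auth")
leaky_auth_pkg.__all__ = ["AuthHandler", "KeycloakAuthHandler", "TestAuthHandler"]
```

In a real suite the same helper would be pointed at the imported swiss_ai_hub.core.auth module and asserted to return an empty list.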
Consequences
Positive
- No code path exists in a production image that returns a valid UserIdentity without validating credentials. The Dangerous name was a warning; its absence is a guarantee.
- Controller construction now fails loudly if the caller forgets auth, eliminating a class of silent bypass bugs.
- Test identity is a handful of constants with no environment-variable surface: one file to grep, one pattern to follow, no per-developer drift.
- Tests exercise the real Keycloak-first membership pipeline via the fake admin, catching regressions that the old self-contained UserIdentity would have hidden.
Trade-offs
- Local dev requires the bundled Keycloak. Running the API, bots, or agents interactively means booting the Docker dev stack (which already ships Keycloak) and logging in with a real account — the no-login shortcut is gone. This is aligned with the broader "Keycloak is the authority" direction, but developers who had muscle memory for the bypass path need to switch to real logins.
- Migration touched ~60 files (playground tests, agent triggers, process playground, bot playground, API test fixture, controller base, public auth exports). The change is mechanical — imports and constructor call sites — but large in scope; it should be its own PR for reviewability.
- The production bot app now requires KEYCLOAK_URL to be set at import time (because KeycloakAuthHandler instantiates KeycloakSettings at class definition). This is consistent with every other production service; the bot was the outlier that leaned on the no-auth handler as a placeholder.
- Historical documentation still mentions DangerousDevelopmentOnlyAuthHandler in the 2025 superuser ADR and in the changelog. Those are history, not guidance, and are not rewritten.
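The import-time requirement on KEYCLOAK_URL follows from how Python evaluates class bodies: anything instantiated as a class attribute runs when the class statement executes, which for a module-level class is when the module is imported. The sketch below uses hypothetical stand-in classes and a plain os.environ lookup instead of the real Pydantic-based KeycloakSettings, and wraps the definitions in a function to simulate the import.

```python
import os


def define_handler() -> type:
    """Simulate importing a module that defines the handler class."""

    class KeycloakSettingsStandIn:
        def __init__(self) -> None:
            # Raises KeyError when the variable is unset (the real settings
            # class would report a validation error instead).
            self.url = os.environ["KEYCLOAK_URL"]

    class HandlerStandIn:
        # Evaluated while the class body runs, not when a request arrives.
        settings = KeycloakSettingsStandIn()

    return HandlerStandIn
```

Calling define_handler() without KEYCLOAK_URL fails immediately, which mirrors the bot failing at startup rather than on first request.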
Related Decisions
- 2026_04_15_sysadmin_implicit_admin_access.md: Establishes authorization-only sysadmin bypass; removing the membership-bypass in the old DangerousDev flow is a direct consequence of treating Keycloak as the sole membership authority.
- 2026_04_15_superuser_added_to_every_new_tenant.md: Makes cross-tenant sysadmin reach work via explicit group membership, which TestAuthHandler can rely on in tests via the fake admin.
- 2026_04_15_keycloak_as_tenant_existence_authority.md: The broader "Keycloak is the authority" line this change continues.
- 2025_08_11_global_superuser_authentication.md: Historical; mentions the old handler.
