Security
Sandboxing is running code in an isolated environment. Sandbox security is the set of controls that makes that isolation hold when users, notebooks, scripts, or agents run inside the workspace. For a platform operator, it is also a governance question: who may run code, what authority does that code get, and which state can survive the session.
Provider Datalab does not invent a separate sandbox runtime. It builds on Kubernetes and sandbox environments curated by tooling such as Educates. Other Kubernetes-based sandbox runtimes may be integrated later, but the current package is scoped to the Educates-based composition.
- Educates creates the workshop runtime, session namespaces, optional vcluster, IDE, terminal, Docker/registry applications, and workspace ingress.
- Provider Datalab passes security intent into Educates through spec.security.policy, spec.security.kubernetesAccess, spec.security.kubernetesRole, and spec.security.externalEgress.
- Provider Datalab also injects storage and service credentials, persistent workspace volumes, and generated namespace-level NetworkPolicies.
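As a minimal sketch, the four security fields named in the list above might sit in a Datalab spec as follows. The field paths come from this page and the values are the defaults stated under Current Baseline; the boolean form of kubernetesAccess is an assumption, and apiVersion, kind, and metadata are omitted because this page does not state them.

```yaml
# Fragment of a Datalab resource; only the security block is shown.
spec:
  security:
    policy: baseline          # restricted | baseline | privileged
    kubernetesAccess: true    # assumed boolean; controls whether a token is mounted
    kubernetesRole: edit      # default per this page
    externalEgress: true      # false leaves only namespace-local Pod egress in the generated policies
```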
This is a useful sandbox baseline, not a complete isolation model. A session can still do whatever its mounted credentials, Kubernetes role, network access, volumes, and enabled applications allow. The platform operator owns the outer guardrails: ingress authentication, RBAC, Pod Security, NetworkPolicy, quotas, storage policy, backup, retention, and audit.
Where Enforcement Lives
Provider Datalab expresses workspace intent and renders the Educates resources,
credentials, volumes, and generated NetworkPolicies for a Datalab. Educates
runs the workspace runtime and applies its native namespace, token, role,
vcluster, registry, Docker, and session controls.
The platform still enforces the boundary around that runtime. The operator owns the enforcing CNI, admission policy such as Kyverno, ingress authentication and authorization, DNS, storage classes, cloud IAM, bucket policy, backup, retention, audit, and any node isolation for privileged workloads.
Access to a Datalab
Workspace access can be Keycloak-managed by Provider Datalab or delegated to
the platform ingress layer with auth.type: delegated. Delegated mode does not
mean unauthenticated access; it means authentication and authorization are
attached before traffic reaches the workspace runtime.
See the Authentication
guide for the concrete Keycloak-managed and delegated-ingress patterns,
including NGINX oauth2-proxy and APISIX openid-connect examples.
Current Baseline
| Area | Current behavior | Operator check |
|---|---|---|
| Runtime isolation | One Educates runtime namespace per Datalab environment; optional vcluster API for Kubernetes-shaped workflows. | Treat vcluster as API isolation, not stronger Pod isolation. Host namespace policy still matters. |
| Pod security | spec.security.policy maps to Educates namespace security policy: restricted, baseline, or privileged. Default is baseline. | Decide who may request privileged; it enables Docker support. |
| Kubernetes API access | kubernetesAccess controls whether a token is mounted. Default is enabled with kubernetesRole: edit. | Use stricter environment defaults where workspace code should not call the API. A vcluster changes the API surface; it is not stronger Pod isolation. |
| Network policy | Provider Datalab renders namespace-level egress policies for all runtime Pods. externalEgress defaults to true; when false, only namespace-local Pod egress is allowed by the generated policies. | Verify the CNI enforces NetworkPolicy. Put broad allowed egress CIDRs in EnvironmentConfig.data.network.externalEgressCIDRs; put pod/service CIDRs in podCIDRs and serviceCIDR; put cloud metadata and control-plane CIDRs in blacklistIPs; add explicit policies for cross-namespace needs. |
| Ingress auth | auth.type: delegated hands authentication to the platform ingress layer. | Delegated mode is protected only when the operator attaches external auth/authz policy. |
| Data access | Object storage, service credentials, PVCs, databases, caches, vector stores, and registry state may be exposed to the session. | Scope credentials and define backup, retention, and deletion behavior outside the session. |
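The network keys named in the table could be laid out roughly as below. This is a sketch only: the key names come from the table, while the list-versus-string shapes and every CIDR value are illustrative assumptions to be replaced with the cluster's real ranges. 169.254.169.254 is the link-local metadata address used by several major clouds.

```yaml
# Fragment of an EnvironmentConfig; only data.network is shown.
data:
  network:
    externalEgressCIDRs:        # broad allowed egress ranges (example value)
      - 0.0.0.0/0
    podCIDRs:                   # cluster Pod ranges (example value)
      - 10.244.0.0/16
    serviceCIDR: 10.96.0.0/12   # cluster Service range (example value)
    blacklistIPs:               # cloud metadata and control-plane ranges
      - 169.254.169.254/32
```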
What Stays Outside
Some controls cannot be solved by Provider Datalab or Kubernetes NetworkPolicies alone:
- the CNI dataplane must actually enforce NetworkPolicy
- cloud IAM, metadata service settings, bucket policy, backup, retention, and audit remain platform responsibilities
- standard NetworkPolicy does not provide ordered deny rules or FQDN-only egress allowlists
- privileged workloads need node, runtime, and cloud isolation outside namespace policy
- leaked credentials must be rotated and audited; policy does not revoke them
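To illustrate the ordered-deny limitation, the sketch below uses the standard networking.k8s.io/v1 API, where the only way to carve out an address is ipBlock.except inside an allow rule. It is not claimed to be the policy Provider Datalab generates; the policy name and namespace are hypothetical.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-external-except-metadata   # hypothetical name
  namespace: datalab-example             # hypothetical namespace
spec:
  podSelector: {}            # selects every Pod in the namespace
  policyTypes:
    - Egress
  egress:
    - to:
        - ipBlock:
            cidr: 0.0.0.0/0
            except:
              - 169.254.169.254/32   # cloud metadata endpoint
```

The except entry narrows this one allow rule only. It is not a deny, so it cannot override an allow granted by another policy that selects the same Pods.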
Operator Contract
For production shared clusters, the operator should be able to answer these questions for engineers, users, and governance reviewers:
- Which data and services can this workspace read or write?
- Is Kubernetes API access disabled, view, edit, or admin?
- Is network egress broad, allowlisted, internal-only, or offline?
- Are Educates and Provider Datalab NetworkPolicies aligned, given that Kubernetes NetworkPolicies are additive?
- Are Pod Security, quotas, storage classes, and privileged/Docker use enforced by policy instead of convention?
- What persists after the session stops, what is backed up, and who can approve deletion?
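For the additivity question in particular, a sketch with two standard NetworkPolicies selecting the same Pods may help. The names and namespace are hypothetical, and neither policy is taken from Educates or Provider Datalab.

```yaml
# Policy A: egress only to Pods in the same namespace.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: namespace-local-egress     # hypothetical name
  namespace: datalab-example       # hypothetical namespace
spec:
  podSelector: {}
  policyTypes:
    - Egress
  egress:
    - to:
        - podSelector: {}
---
# Policy B: egress to one external range.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-example-range        # hypothetical name
  namespace: datalab-example
spec:
  podSelector: {}
  policyTypes:
    - Egress
  egress:
    - to:
        - ipBlock:
            cidr: 203.0.113.0/24   # documentation range (TEST-NET-3)
```

With both applied, Pods can reach namespace-local Pods and 203.0.113.0/24, because the allowed set is the union of all matching policies. Policy A on its own does not keep a workspace internal-only once Policy B exists.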
Related Pages
- Workspace Sessions as Sandboxes explains the session authority model.
- Sandbox Security Measures describes the current NetworkPolicy baseline and the next hardening steps.