Document how to implement Pod Security Standard (#624)

* docs(guides): add pod security guide and other minor enhancements
Adriano Pezzuto
2022-07-30 21:30:14 +02:00
committed by GitHub
parent a36c7545db
commit f9554d4cae
8 changed files with 323 additions and 103 deletions

View File

@@ -1237,88 +1237,6 @@ A Pod running `internal.registry.foo.tld/capsule:latest` as registry will be all
Any attempt of Alice to use a not allowed `containerRegistries` value is denied by the Validation Webhook enforcing it.
## Assign Pod Security Policies
Bill, the cluster admin, can assign a dedicated Pod Security Policy (PSP) to Alice's tenant. This is likely to be a requirement in a multi-tenancy environment.
The cluster admin creates a PSP:
```yaml
kubectl -n oil-production apply -f - << EOF
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
name: psp:restricted
spec:
privileged: false
# Required to prevent escalations to root.
allowPrivilegeEscalation: false
...
EOF
```
Then create a _ClusterRole_ using or granting the said item
```yaml
kubectl -n oil-production apply -f - << EOF
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
name: psp:restricted
rules:
- apiGroups: ['policy']
resources: ['podsecuritypolicies']
resourceNames: ['psp:restricted']
verbs: ['use']
EOF
```
Bill can assign this role to all namespaces in the Alice's tenant by setting it in the tenant manifest:
```yaml
kubectl -n oil-production apply -f - << EOF
apiVersion: capsule.clastix.io/v1beta1
kind: Tenant
metadata:
name: oil
spec:
owners:
- name: alice
kind: User
additionalRoleBindings:
- clusterRoleName: psp:privileged
subjects:
- kind: "Group"
apiGroup: "rbac.authorization.k8s.io"
name: "system:authenticated"
EOF
```
With the given specification, Capsule will ensure that all Alice's namespaces will contain a _RoleBinding_ for the specified _Cluster Role_.
For example, in the `oil-production` namespace, Alice will see:
```yaml
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
name: 'capsule-oil-psp:privileged'
namespace: oil-production
labels:
capsule.clastix.io/role-binding: a10c4c8c48474963
capsule.clastix.io/tenant: oil
subjects:
- kind: Group
apiGroup: rbac.authorization.k8s.io
name: 'system:authenticated'
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: 'psp:privileged'
```
With the above example, Capsule is forbidding any authenticated user in `oil-production` namespace to run privileged pods and to perform privilege escalation as declared by the Cluster Role `psp:privileged`.
## Create Custom Resources
Capsule grants admin permissions to the tenant owners but is only limited to their namespaces. To achieve that, it assigns the ClusterRole [admin](https://kubernetes.io/docs/reference/access-authn-authz/rbac/#user-facing-roles) to the tenant owner. This ClusterRole does not permit the installation of custom resources in the namespaces.
@@ -1448,10 +1366,10 @@ spec:
> This feature is still in an alpha stage and requires a high amount of computing resources due to the dynamic client requests.
## Taint namespaces
With Capsule, Bill can _"taint"_ the namespaces created by Alice with additional labels and/or annotations. There is no specific semantic assigned to these labels and annotations: they just will be assigned to the namespaces in the tenant as they are created by Alice. This can help the cluster admin to implement specific use cases. As it can be used to implement backup as a service for namespaces in the tenant.
## Assign Additional Metadata
The cluster admin can _"taint"_ the namespaces created by tenant owners with additional metadata in the form of labels and annotations. No specific semantics are assigned to these labels and annotations: they are simply applied to the namespaces in the tenant as they are created. This can help the cluster admin implement specific use cases, for example, having only a given tenant backed up by a backup service.
Bill assigns additional labels and annotations to all namespaces created in the `oil` tenant:
The cluster admin assigns additional labels and annotations to all namespaces created in the `oil` tenant:
```yaml
kubectl apply -f - << EOF
@@ -1466,18 +1384,42 @@ spec:
namespaceOptions:
additionalMetadata:
annotations:
capsule.clastix.io/backup: "true"
storagelocationtype: s3
labels:
capsule.clastix.io/tenant: oil
capsule.clastix.io/backup: "true"
EOF
```
When Alice creates a namespace, this will inherit the given label and/or annotation.
When the tenant owner creates a namespace, it inherits the given label and/or annotation:
## Taint services
With Capsule, Bill can _"taint"_ the services created by Alice with additional labels and/or annotations. There is no specific semantic assigned to these labels and annotations: they just will be assigned to the services in the tenant as they are created by Alice. This can help the cluster admin to implement specific use cases.
```yaml
apiVersion: v1
kind: Namespace
metadata:
annotations:
storagelocationtype: s3
labels:
capsule.clastix.io/tenant: oil
kubernetes.io/metadata.name: oil-production
name: oil-production
capsule.clastix.io/backup: "true"
name: oil-production
ownerReferences:
- apiVersion: capsule.clastix.io/v1beta1
blockOwnerDeletion: true
controller: true
kind: Tenant
name: oil
spec:
finalizers:
- kubernetes
status:
phase: Active
```
Bill assigns additional labels and annotations to all services created in the `oil` tenant:
Additionally, the cluster admin can _"taint"_ the services created by the tenant owners with additional metadata as labels and annotations.
The cluster admin assigns additional labels and annotations to all services created in the `oil` tenant:
```yaml
kubectl apply -f - << EOF
@@ -1491,14 +1433,30 @@ spec:
kind: User
serviceOptions:
additionalMetadata:
annotations:
capsule.clastix.io/backup: "true"
labels:
capsule.clastix.io/tenant: oil
capsule.clastix.io/backup: "true"
EOF
```
When Alice creates a service in a namespace, this will inherit the given label and/or annotation.
When the tenant owner creates a service in a tenant namespace, it inherits the given label and/or annotation:
```yaml
apiVersion: v1
kind: Service
metadata:
name: nginx
namespace: oil-production
labels:
capsule.clastix.io/backup: "true"
spec:
ports:
- protocol: TCP
port: 80
targetPort: 8080
selector:
run: nginx
type: ClusterIP
```
## Cordon a Tenant

View File

@@ -37,18 +37,18 @@ spec:
For example, the cluster admin is supposed to apply this Kustomization during the cluster bootstrap, which, for instance, will also reconcile Flux itself.
All the remaining Reconciliation resources can be children of this Kustomization.
![bootstrap](./kustomization-hierarchy-root-tenants.png)
![bootstrap](./assets/kustomization-hierarchy-root-tenants.png)
### Namespace-as-a-Service
Tenants could have their own set of Namespaces to operate on, but these should be prepared by higher-level roles, like platform admins: the declarations would be part of the platform space.
They would be responsible for tenant administration, and each change (e.g. a new tenant Namespace) should be a request that passes through approval.
![no-naas](./flux-tenants-reconciliation.png)
![no-naas](./assets/flux-tenants-reconciliation.png)
What if we would like to provide tenants the ability to manage also their own space the GitOps-way? Enter Capsule.
![naas](./flux-tenants-capsule-reconciliation.png)
![naas](./assets/flux-tenants-capsule-reconciliation.png)
## The ingredients of the recipe

View File

@@ -0,0 +1,258 @@
# Pod Security
In Kubernetes, by default, workloads run with administrative access, which might be acceptable if there is only a single application running in the cluster or a single user accessing it. Such access is seldom required, and granting it exposes you to a noisy neighbour effect along with a large security blast radius.
Many of these concerns were addressed initially by [PodSecurityPolicies](https://kubernetes.io/docs/concepts/security/pod-security-policy) which have been present in the Kubernetes APIs since the very early days.
Pod Security Policies were deprecated in Kubernetes 1.21 and removed entirely in 1.25. As a replacement, the [Pod Security Standards](https://kubernetes.io/docs/concepts/security/pod-security-standards/) and [Pod Security Admission](https://kubernetes.io/docs/concepts/security/pod-security-admission/) have been introduced. Capsule supports the new standard for tenants under its control as well as the older approach.
## Pod Security Policies
As stated in the documentation, *"PodSecurityPolicies enable fine-grained authorization of pod creation and updates. A Pod Security Policy is a cluster-level resource that controls security sensitive aspects of the pod specification. The `PodSecurityPolicy` objects define a set of conditions that a pod must run with in order to be accepted into the system, as well as defaults for the related fields."*
Using the [Pod Security Policies](https://kubernetes.io/docs/concepts/security/pod-security-policy), the cluster admin can impose limits on pod creation, for example the types of volume that can be consumed, the Linux user that the process runs as (in order to avoid running things as root), and more. From a multi-tenancy point of view, the cluster admin has to control how users run pods in their tenants, with a different level of permission on a per-tenant basis.
Assume the Kubernetes cluster has been configured with the [Pod Security Policy Admission Controller](https://kubernetes.io/docs/reference/access-authn-authz/admission-controllers/#podsecuritypolicy) enabled in the API server: `--enable-admission-plugins=PodSecurityPolicy`.
The cluster admin creates a `PodSecurityPolicy`:
```yaml
kubectl apply -f - << EOF
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: psp:restricted
spec:
  privileged: false
  # Required to prevent escalations to root.
  allowPrivilegeEscalation: false
EOF
```
Then create a _ClusterRole_ granting the use of that policy:
```yaml
kubectl apply -f - << EOF
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: psp:restricted
rules:
- apiGroups: ['policy']
  resources: ['podsecuritypolicies']
  resourceNames: ['psp:restricted']
  verbs: ['use']
EOF
```
The cluster admin can assign this role to all namespaces in a tenant by setting it in the tenant manifest:
```yaml
kubectl apply -f - << EOF
apiVersion: capsule.clastix.io/v1beta1
kind: Tenant
metadata:
  name: oil
spec:
  owners:
  - name: alice
    kind: User
  additionalRoleBindings:
  - clusterRoleName: psp:restricted
    subjects:
    - kind: "Group"
      apiGroup: "rbac.authorization.k8s.io"
      name: "system:authenticated"
EOF
```
With the given specification, Capsule ensures that all tenant namespaces contain a _RoleBinding_ for the specified _Cluster Role_:
```yaml
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: 'capsule-oil-psp:restricted'
  namespace: oil-production
  labels:
    capsule.clastix.io/tenant: oil
subjects:
- kind: Group
  apiGroup: rbac.authorization.k8s.io
  name: 'system:authenticated'
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: 'psp:restricted'
```
With this binding in place, the tenant owner is forbidden from running privileged pods in the `oil-production` namespace and from performing privilege escalation, as declared by the policy behind the Cluster Role `psp:restricted`.
As the tenant owner, create a namespace:
```
kubectl --kubeconfig alice-oil.kubeconfig create ns oil-production
```
and create a pod with privileged permissions:
```yaml
kubectl --kubeconfig alice-oil.kubeconfig apply -f - << EOF
apiVersion: v1
kind: Pod
metadata:
  name: nginx
  namespace: oil-production
spec:
  containers:
  - image: nginx
    name: nginx
    ports:
    - containerPort: 80
    securityContext:
      privileged: true
EOF
```
Since the assigned `PodSecurityPolicy` explicitly disallows privileged containers, the tenant owner's request is rejected by the Pod Security Policy Admission Controller.
## Pod Security Standards
One of the issues with Pod Security Policies is that it is difficult to apply restrictive permissions on a granular level, increasing security risk. Also, Pod Security Policies are applied when the request is submitted, and there is no way of applying them to pods that are already running. For these and other reasons, the Kubernetes community decided to deprecate Pod Security Policies.
With Pod Security Policies deprecated and removed, the [Pod Security Standards](https://kubernetes.io/docs/concepts/security/pod-security-standards/) are used in their place. They define three policies that broadly cover the security spectrum. These policies are cumulative and range from highly permissive to highly restrictive:
- **Privileged**: unrestricted policy, providing the widest possible level of permissions.
- **Baseline**: minimally restrictive policy which prevents known privilege escalations.
- **Restricted**: heavily restricted policy, following current Pod hardening best practices.
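To make the ordering of the three levels concrete, here is a minimal Python sketch. It is only an illustration of the "permissive to restrictive" ranking described above; the `PSS_LEVELS` list and the `is_at_least` helper are hypothetical names, not part of Kubernetes or Capsule.

```python
# Hypothetical helper: ranks the three Pod Security Standards levels
# from most permissive (index 0) to most restrictive (index 2).
PSS_LEVELS = ["privileged", "baseline", "restricted"]


def is_at_least(level: str, minimum: str) -> bool:
    """Return True when `level` is at least as restrictive as `minimum`."""
    return PSS_LEVELS.index(level) >= PSS_LEVELS.index(minimum)


print(is_at_least("restricted", "baseline"))  # True
print(is_at_least("privileged", "baseline"))  # False
```

A check like this could be used, for instance, in a script that audits whether every namespace enforces at least the Baseline level.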
Kubernetes provides a built-in [Admission Controller](https://kubernetes.io/docs/reference/access-authn-authz/admission-controllers/#podsecurity) to enforce the Pod Security Standards at either:
1. cluster level which applies a standard configuration to all namespaces in a cluster
2. namespace level, one namespace at a time
For the first case, the cluster admin has to configure the Admission Controller and pass the configuration to the `kube-apiserver` by means of the `--admission-control-config-file` extra argument, for example:
```yaml
apiVersion: apiserver.config.k8s.io/v1
kind: AdmissionConfiguration
plugins:
- name: PodSecurity
  configuration:
    apiVersion: pod-security.admission.config.k8s.io/v1beta1
    kind: PodSecurityConfiguration
    defaults:
      enforce: "baseline"
      enforce-version: "latest"
      warn: "restricted"
      warn-version: "latest"
      audit: "restricted"
      audit-version: "latest"
    exemptions:
      usernames: []
      runtimeClasses: []
      namespaces: [kube-system]
```
For the second case, the cluster admin can simply assign labels to the namespace where the policy should be enforced, since the Pod Security Admission Controller is enabled by default starting from Kubernetes 1.23:
```yaml
apiVersion: v1
kind: Namespace
metadata:
  labels:
    pod-security.kubernetes.io/enforce: baseline
    pod-security.kubernetes.io/warn: restricted
    pod-security.kubernetes.io/audit: restricted
  name: development
```
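As a rough illustration of how the three namespace labels are composed, the following Python sketch builds them from a level per mode. The `pod_security_labels` function and its constants are hypothetical helpers written for this guide; only the label prefix and the level names come from the labels shown above.

```python
# Hypothetical helper: the prefix and level names mirror the
# Pod Security Admission labels used in the namespace manifest above.
LABEL_PREFIX = "pod-security.kubernetes.io/"
VALID_LEVELS = {"privileged", "baseline", "restricted"}


def pod_security_labels(enforce: str, warn: str, audit: str) -> dict:
    """Build the namespace labels for the enforce, warn and audit modes."""
    modes = {"enforce": enforce, "warn": warn, "audit": audit}
    for mode, level in modes.items():
        if level not in VALID_LEVELS:
            raise ValueError(f"unknown level {level!r} for mode {mode!r}")
    return {LABEL_PREFIX + mode: level for mode, level in modes.items()}


labels = pod_security_labels(enforce="baseline", warn="restricted", audit="restricted")
for key, value in labels.items():
    print(f"{key}: {value}")
```

Validating the level names up front is a design choice of this sketch: a typo in a level is easier to catch in a script than after the label has been applied.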
## Pod Security Standards with Capsule
According to the regular Kubernetes segregation model, the cluster admin has to operate either at cluster level or at namespace level. Since Capsule introduces a further segregation level (the _Tenant_ abstraction), the cluster admin can implement Pod Security Standards at tenant level by simply forcing specific labels on all the namespaces created in the tenant.
As cluster admin, create a tenant with additional labels:
```yaml
kubectl apply -f - << EOF
apiVersion: capsule.clastix.io/v1beta1
kind: Tenant
metadata:
  name: oil
spec:
  namespaceOptions:
    additionalMetadata:
      labels:
        pod-security.kubernetes.io/enforce: baseline
        pod-security.kubernetes.io/audit: restricted
        pod-security.kubernetes.io/warn: restricted
  owners:
  - kind: User
    name: alice
EOF
```
All namespaces created by the tenant owner will inherit the Pod Security labels:
```yaml
apiVersion: v1
kind: Namespace
metadata:
  labels:
    capsule.clastix.io/tenant: oil
    kubernetes.io/metadata.name: oil-development
    name: oil-development
    pod-security.kubernetes.io/enforce: baseline
    pod-security.kubernetes.io/warn: restricted
    pod-security.kubernetes.io/audit: restricted
  name: oil-development
  ownerReferences:
  - apiVersion: capsule.clastix.io/v1beta1
    blockOwnerDeletion: true
    controller: true
    kind: Tenant
    name: oil
```
and the regular Pod Security Admission Controller does the magic:
```yaml
kubectl --kubeconfig alice-oil.kubeconfig apply -f - << EOF
apiVersion: v1
kind: Pod
metadata:
  name: nginx
  namespace: oil-production
spec:
  containers:
  - image: nginx
    name: nginx
    ports:
    - containerPort: 80
    securityContext:
      privileged: true
EOF
```
The request gets denied:
```
Error from server (Forbidden): error when creating "STDIN":
pods "nginx" is forbidden: violates PodSecurity "baseline:latest": privileged
(container "nginx" must not set securityContext.privileged=true)
```
If the tenant owner tries to change or delete the above labels, Capsule will reconcile them to the original tenant manifest set by the cluster admin.
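The effect of that reconciliation can be pictured with a short Python sketch. This is not Capsule's actual implementation, only a model of the observable behaviour described above; the `reconcile_labels` function is a hypothetical name, and the assumption is that labels defined in the tenant manifest always win over what is found on the namespace.

```python
def reconcile_labels(current: dict, tenant_labels: dict) -> dict:
    """Return namespace labels with tenant-defined labels restored.

    Labels not managed by the tenant manifest are left untouched.
    """
    reconciled = dict(current)
    reconciled.update(tenant_labels)  # tenant-defined values take precedence
    return reconciled


tenant_labels = {
    "pod-security.kubernetes.io/enforce": "baseline",
    "pod-security.kubernetes.io/warn": "restricted",
    "pod-security.kubernetes.io/audit": "restricted",
}

# The tenant owner weakened one label and deleted the other two.
tampered = {
    "capsule.clastix.io/tenant": "oil",
    "pod-security.kubernetes.io/enforce": "privileged",
}

result = reconcile_labels(tampered, tenant_labels)
print(result["pod-security.kubernetes.io/enforce"])  # baseline
```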
As an additional security measure, the cluster admin can also prevent the tenant owner from misusing the above labels:
```
kubectl annotate tenant oil \
capsule.clastix.io/forbidden-namespace-labels-regexp="pod-security.kubernetes.io\/(enforce|warn|audit)"
```
In that case, the tenant owner gets denied if she tries to use the labels:
```
kubectl --kubeconfig alice-oil.kubeconfig label ns oil-production \
pod-security.kubernetes.io/enforce=restricted \
--overwrite
Error from server (Label pod-security.kubernetes.io/audit is forbidden for namespaces in the current Tenant ...
```
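To see which label keys the regular expression from the annotation above would cover, it can be tried out in a few lines of Python. This sketch assumes unanchored matching (a match anywhere in the key); how Capsule anchors the expression is not stated here, so treat the `is_forbidden` helper as a hypothetical approximation.

```python
import re

# Same expression as in the tenant annotation above.
# Note that the unescaped dots match any character, not only a literal ".".
FORBIDDEN = re.compile(r"pod-security.kubernetes.io\/(enforce|warn|audit)")


def is_forbidden(label_key: str) -> bool:
    """Return True when the label key matches the forbidden expression."""
    # Assumption: unanchored search, so longer keys sharing the prefix also match.
    return FORBIDDEN.search(label_key) is not None


for key in (
    "pod-security.kubernetes.io/enforce",
    "pod-security.kubernetes.io/enforce-version",
    "app.kubernetes.io/name",
):
    print(key, is_forbidden(key))
```

Under that assumption the `*-version` variants of the labels are covered as well, because they start with the same prefix.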

View File

@@ -70,10 +70,18 @@ module.exports = function (api) {
label: 'Upgrading Tenant version',
path: '/docs/guides/upgrading'
},
{
label: 'Multi-tenant GitOps with Flux',
path: '/docs/guides/flux2-capsule'
},
{
label: 'Install on Charmed Kubernetes',
path: '/docs/guides/charmed'
},
{
label: 'Control Pod Security',
path: '/docs/guides/pod-security'
},
{
title: 'Managed Kubernetes',
subItems: [
@@ -90,11 +98,7 @@ module.exports = function (api) {
path: '/docs/guides/managed-kubernetes/coaks'
},
]
},
{
label: 'Flux and Capsule for multi-tenant GitOps',
path: '/docs/guides/flux2-capsule'
}
}
]
},
{