Linux capabilities are part of the kernel’s security model and are used to enhance security by reducing the attack surface of applications.
Traditional Privileges vs. Capabilities
Traditionally, privileges are all or nothing: a process running as root has full control over the system, while an unprivileged process has none of root's special powers. Linux capabilities allow more fine-grained control by dividing root's privileges into separate units that can be granted or removed independently.
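Bounding Set
Each process has a set of capabilities known as the bounding set. The bounding set limits the capabilities a process can gain: a capability that is missing from the bounding set cannot be acquired when the process executes a program, even if the program file grants it.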
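As a rough illustration of the bounding set, the sketch below models how the permitted set is computed when a program is executed, following the rule described in the capabilities(7) manual page. It is a simplified model, not kernel code: it ignores the ambient set and the special handling of root, and the function name `permitted_after_exec` and its parameters are hypothetical names chosen for this example.

```python
# Simplified model of the execve() rule from capabilities(7):
#   new permitted = (process inheritable & file inheritable)
#                 | (file permitted & process bounding)
# Sets of capability names stand in for the kernel's bitmasks.
# Assumption: the ambient set and root (UID 0) handling are ignored.

def permitted_after_exec(proc_inheritable, proc_bounding,
                         file_inheritable, file_permitted):
    """Return the permitted set a process would hold after execve()."""
    inherited = set(proc_inheritable) & set(file_inheritable)
    # The bounding set filters what the file is allowed to grant.
    granted_by_file = set(file_permitted) & set(proc_bounding)
    return frozenset(inherited | granted_by_file)


ping_file_permitted = {"cap_net_raw"}

# cap_net_raw is in the bounding set, so the file capability takes effect.
print(sorted(permitted_after_exec(set(), {"cap_net_raw", "cap_chown"},
                                  set(), ping_file_permitted)))

# cap_net_raw is absent from the bounding set, so the file cannot grant it.
print(sorted(permitted_after_exec(set(), {"cap_chown"},
                                  set(), ping_file_permitted)))
```

The first call prints a list containing cap_net_raw and the second prints an empty list, which is the restriction the bounding set is meant to enforce.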
To check which capabilities are set on a command's executable, run getcap on its path. For example, for ping:
$ which ping
/usr/bin/ping
$ getcap /usr/bin/ping
/usr/bin/ping = cap_net_raw+ep
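The suffix in the getcap output uses the libcap text format: a list of capability names, an operator (=, + or -), and flag letters, where e stands for effective, i for inheritable, and p for permitted. The sketch below parses a single clause of that form. It is a minimal illustration only: the helper name `parse_cap_clause` is hypothetical, and multi-clause strings or the special name "all" are not handled.

```python
import re

# Flag letters used by the libcap text format.
FLAG_NAMES = {"e": "effective", "i": "inheritable", "p": "permitted"}

# One clause: comma-separated capability names, one operator, flag letters.
CLAUSE = re.compile(r"(?P<caps>[a-z0-9_,]+)(?P<op>[=+-])(?P<flags>[eip]+)")


def parse_cap_clause(clause):
    """Split a clause such as 'cap_net_raw+ep' into (names, operator, flag sets)."""
    match = CLAUSE.fullmatch(clause.strip())
    if match is None:
        raise ValueError(f"unsupported capability clause: {clause!r}")
    names = match.group("caps").split(",")
    flags = sorted({FLAG_NAMES[letter] for letter in match.group("flags")})
    return names, match.group("op"), flags


print(parse_cap_clause("cap_net_raw+ep"))
# (['cap_net_raw'], '+', ['effective', 'permitted'])
```

Read this way, cap_net_raw+ep means that cap_net_raw is raised in the file's effective and permitted sets.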
To see the capabilities held by a running process, pass its PID to getpcaps. For example, for sshd:
$ which sshd
/usr/sbin/sshd
$ ps -ef | grep /usr/bin/sshd
joseeden 740 1 0 18:29 ? 00:00:00 /usr/bin/sshd -D
$ getpcaps 740
Capabilities for `740': =cap_net_bind_service,cap_net_raw+ep
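The kernel also exposes a process's capability sets as hexadecimal bitmasks in /proc/<pid>/status (the CapInh, CapPrm, CapEff, CapBnd and CapAmb fields). The sketch below decodes such a mask into capability names, using the bit numbers from <linux/capability.h>. The helper names `decode_caps` and `read_cap_sets` are hypothetical, and `read_cap_sets` assumes a Linux system with /proc mounted.

```python
# Capability names in bit order, as numbered in <linux/capability.h>.
CAP_NAMES = [
    "cap_chown", "cap_dac_override", "cap_dac_read_search", "cap_fowner",
    "cap_fsetid", "cap_kill", "cap_setgid", "cap_setuid", "cap_setpcap",
    "cap_linux_immutable", "cap_net_bind_service", "cap_net_broadcast",
    "cap_net_admin", "cap_net_raw", "cap_ipc_lock", "cap_ipc_owner",
    "cap_sys_module", "cap_sys_rawio", "cap_sys_chroot", "cap_sys_ptrace",
    "cap_sys_pacct", "cap_sys_admin", "cap_sys_boot", "cap_sys_nice",
    "cap_sys_resource", "cap_sys_time", "cap_sys_tty_config", "cap_mknod",
    "cap_lease", "cap_audit_write", "cap_audit_control", "cap_setfcap",
    "cap_mac_override", "cap_mac_admin", "cap_syslog", "cap_wake_alarm",
    "cap_block_suspend", "cap_audit_read", "cap_perfmon", "cap_bpf",
    "cap_checkpoint_restore",
]


def decode_caps(mask):
    """Return the names of all capabilities whose bit is set in mask."""
    return [name for bit, name in enumerate(CAP_NAMES) if mask & (1 << bit)]


def read_cap_sets(pid="self"):
    """Read and decode the Cap* fields of /proc/<pid>/status (Linux only)."""
    sets = {}
    with open(f"/proc/{pid}/status") as status:
        for line in status:
            key, _, value = line.partition(":")
            if key in ("CapInh", "CapPrm", "CapEff", "CapBnd", "CapAmb"):
                sets[key] = decode_caps(int(value.strip(), 16))
    return sets


# Bits 10 and 13 set: the two capabilities reported for sshd above.
print(decode_caps(0x2400))
# ['cap_net_bind_service', 'cap_net_raw']
```

On a Linux host, calling read_cap_sets(740) would return the decoded sets for the sshd process from the example, provided the caller is allowed to read that process's status file.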
For each process, there are three sets of capabilities: effective, inheritable, and permitted. These sets determine the actual privileges a process has. The kernel recalculates them when the process runs a new program through the execve() system call.
Processes can drop specific capabilities to reduce their privileges after they have started. The prctl() system call is often used to manipulate capabilities programmatically.
To leverage Linux capabilities in Kubernetes, we can define them under the security context in a pod definition file.
apiVersion: v1
kind: Pod
metadata:
  name: time-cap-pod
spec:
  containers:
    - name: ubuntu-container
      image: ubuntu:latest
      command: ["sleep", "3600"]
      securityContext:
        capabilities:
          add: ["SYS_TIME"]
Similarly, we can also drop Linux capabilities:
apiVersion: v1
kind: Pod
metadata:
  name: time-cap-pod
spec:
  containers:
    - name: ubuntu-container
      image: ubuntu:latest
      command: ["sleep", "3600"]
      securityContext:
        capabilities:
          add: ["SYS_TIME"]
          drop: ["CHOWN"]
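To reason about what such a container ends up with, the sketch below models the result of add and drop as set arithmetic on top of the container runtime's default capabilities. This is a simplified model under stated assumptions: the default set shown is the one Docker documents and other runtimes may differ, the special value "ALL" is not handled, and the function name `resulting_caps` is hypothetical.

```python
# Assumed runtime defaults (Docker's documented default set).
# Other container runtimes may ship a different list.
ASSUMED_DEFAULT_CAPS = frozenset({
    "AUDIT_WRITE", "CHOWN", "DAC_OVERRIDE", "FOWNER", "FSETID", "KILL",
    "MKNOD", "NET_BIND_SERVICE", "NET_RAW", "SETFCAP", "SETGID", "SETPCAP",
    "SETUID", "SYS_CHROOT",
})


def resulting_caps(add=(), drop=(), defaults=ASSUMED_DEFAULT_CAPS):
    """Model a container's capabilities as (defaults + add) - drop."""
    add, drop = set(add), set(drop)
    overlap = add & drop
    if overlap:
        # Runtimes resolve this case differently, so the sketch refuses it.
        raise ValueError(f"listed in both add and drop: {sorted(overlap)}")
    return sorted((set(defaults) | add) - drop)


caps = resulting_caps(add=["SYS_TIME"], drop=["CHOWN"])
print("SYS_TIME" in caps, "CHOWN" in caps)
# True False
```

Under these assumptions, the pod above would be able to set the system clock but would no longer be able to change file ownership arbitrarily.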
To learn more, check out Set capabilities for a Container.