Container Runtime

The way a container is started has many security implications. It is possible to pass dangerous runtime parameters that compromise the host and the other containers running on it, so verifying the container runtime configuration is very important. The recommendations below help assess it:

Do not use privileged containers

Docker supports adding and removing Linux capabilities, so a container can run with a non-default capability set. Removing capabilities makes the container more secure; adding them makes it less secure. It is thus recommended to remove all capabilities except those explicitly required by your container process. Privileged mode does the opposite: it grants the container all capabilities.
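The capability recommendation above can be sketched as follows. This is a minimal illustration, not a definitive procedure: keeps_default_caps is a hypothetical helper name, NET_BIND_SERVICE is only an example of a capability a process might need, and the bracketed output format of CapAdd and CapDrop is an assumption that may differ between Docker releases.

```shell
# Usage sketch (assumes a working Docker CLI; <image> is a placeholder and
# the capabilities to add back depend on what the process really needs):
#   docker run -d --cap-drop=ALL --cap-add=NET_BIND_SERVICE <image>
#   docker ps -q | xargs docker inspect --format \
#     '{{ .Id }}: CapAdd={{ .HostConfig.CapAdd }} CapDrop={{ .HostConfig.CapDrop }}' \
#     | keeps_default_caps

# keeps_default_caps (hypothetical helper): reads the audit lines on stdin
# and prints the IDs of containers that did not drop ALL capabilities.
keeps_default_caps() {
    # Drop lines that show CapDrop=[ALL] (case-insensitive), keep the ID part.
    grep -vi 'CapDrop=\[ALL]' | cut -d: -f1
}
```

Any ID printed by the helper belongs to a container that still runs with the default capability set or more, and is worth reviewing.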

As seen below, when we run the container without privileged mode we are unable to change the kernel parameters, but when we run it in privileged mode using the --privileged flag it can change them easily, which is a security vulnerability.

$ docker run -it centos /bin/bash
[root@7e1b1fa4fb89 /]#  sysctl -w net.ipv4.ip_forward=0
sysctl: setting key "net.ipv4.ip_forward": Read-only file system


$ docker run --privileged -it centos /bin/bash
[root@930aaa93b4e4 /]#  sysctl -a | wc -l
sysctl: reading key "net.ipv6.conf.all.stable_secret"
sysctl: reading key "net.ipv6.conf.default.stable_secret"
sysctl: reading key "net.ipv6.conf.eth0.stable_secret"
sysctl: reading key "net.ipv6.conf.lo.stable_secret"
638
[root@930aaa93b4e4 /]# sysctl -w net.ipv4.ip_forward=0
net.ipv4.ip_forward = 0

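So, while auditing, make sure that no container has privileged mode set to true. The command below prints the setting for every running container; Privileged=true, as in this output, marks a container that violates the recommendation.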

$ docker ps -q | xargs docker inspect --format '{{ .Id }}: Privileged={{ .HostConfig.Privileged }}'
930aaa93b4e44c0f647b53b3e934ce162fbd9ef1fd4ec82b826f55357f6fdf3a: Privileged=true
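To use this check in a script, the output can be filtered so that only offending containers are reported and the exit status reflects the result. The sketch below assumes the "ID: Privileged=..." line format produced by the command above; privileged_ids is a hypothetical helper name.

```shell
# privileged_ids (hypothetical helper): reads "ID: Privileged=true|false"
# lines on stdin and prints the IDs of privileged containers.
# Returns 1 if at least one privileged container was found, 0 otherwise.
privileged_ids() {
    awk -F': ' '
        $2 == "Privileged=true" { print $1; found = 1 }
        END { exit found }
    '
}

# Usage sketch (assumes a working Docker CLI):
#   docker ps -q | xargs docker inspect --format \
#     '{{ .Id }}: Privileged={{ .HostConfig.Privileged }}' | privileged_ids
```

The non-zero exit status lets an audit script stop or raise an alert as soon as a privileged container is present.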

Do not use host network mode on containers

Host network mode is potentially dangerous. It allows the container process to open low-numbered ports like any other root process. It also allows the container to access network services such as D-Bus on the Docker host. Thus, a container process can potentially do unexpected things such as shutting down the Docker host. You should not use this option.

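When we run the container with network mode set to host, it shares the host's network stack: it sees every network interface of the host, as the ifconfig output below shows, and can reach the services listening on them. If the container also holds extra privileges such as the NET_ADMIN capability, it can change the host's network configuration, which endangers the other applications running on the host.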

$ docker run -it --net=host ubuntu /bin/bash
$ ifconfig
docker0   Link encap:Ethernet  HWaddr 02:42:1d:36:0d:0d
          inet addr:172.17.0.1  Bcast:0.0.0.0  Mask:255.255.0.0
          inet6 addr: fe80::42:1dff:fe36:d0d/64 Scope:Link
          UP BROADCAST MULTICAST  MTU:1500  Metric:1
          RX packets:24 errors:0 dropped:0 overruns:0 frame:0
          TX packets:38 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:1608 (1.6 KB)  TX bytes:5800 (5.8 KB)

eno16777736 Link encap:Ethernet  HWaddr 00:0c:29:02:b9:13
          inet addr:192.168.218.129  Bcast:192.168.218.255  Mask:255.255.255.0
          inet6 addr: fe80::20c:29ff:fe02:b913/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:4934 errors:0 dropped:0 overruns:0 frame:0
          TX packets:4544 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:2909561 (2.9 MB)  TX bytes:577079 (577.0 KB)

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:65536  Metric:1
          RX packets:1416 errors:0 dropped:0 overruns:0 frame:0
          TX packets:1416 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:128893 (128.8 KB)  TX bytes:128893 (128.8 KB)

$ docker ps -q | xargs docker inspect --format '{{ .Id }}: NetworkMode={{ .HostConfig.NetworkMode }}'
52afb14d08b9271bd96045bebd508325a2adff98dbef8c10c63294989441954d: NetworkMode=host

While auditing, check that every container has its network mode set to default and not host.

$ docker ps -q | xargs docker inspect --format '{{ .Id }}: NetworkMode={{ .HostConfig.NetworkMode }}'
1aca7fe47882da0952702c383815fc650f24da2c94029b5ad8af165239b78968: NetworkMode=default

Bind incoming container traffic to a specific host interface

If you have multiple network interfaces on your host machine, the container can accept connections on its exposed ports on any of them. This might be neither desired nor secure. Often one particular interface is exposed externally, and services such as intrusion detection, intrusion prevention, firewalls and load balancers run on it to screen incoming public traffic. Hence, you should not accept incoming connections on every interface. You should only allow incoming connections on a particular interface.

As shown below, the machine has two network interfaces. By default, if we run an nginx container, its published port is bound to the wildcard address 0.0.0.0, which means the container is reachable through both IP addresses. This can result in an intrusion if either of them is not monitored.

$ ifconfig
docker0   Link encap:Ethernet  HWaddr 02:42:3f:3c:d7:3c
          inet addr:172.17.0.1  Bcast:0.0.0.0  Mask:255.255.0.0
          UP BROADCAST MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)

eno16777736 Link encap:Ethernet  HWaddr 00:0c:29:02:b9:13
          inet addr:192.168.218.129  Bcast:192.168.218.255  Mask:255.255.255.0
          inet6 addr: fe80::20c:29ff:fe02:b913/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:259 errors:0 dropped:0 overruns:0 frame:0
          TX packets:242 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:148095 (148.0 KB)  TX bytes:27195 (27.1 KB)

eno33554992 Link encap:Ethernet  HWaddr 00:0c:29:02:b9:1d
          inet addr:192.168.218.130  Bcast:192.168.218.255  Mask:255.255.255.0
          inet6 addr: fe80::20c:29ff:fe02:b91d/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:45 errors:0 dropped:0 overruns:0 frame:0
          TX packets:53 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:5931 (5.9 KB)  TX bytes:8452 (8.4 KB)

$ docker run -d -p 4915:80 nginx
26acfc7851a75c71c1315ee272d35ea56ea724842617074f4bd3a0026b5e4261

$ docker port 26ac
80/tcp -> 0.0.0.0:4915

To restrict this, we should bind the container port to one of the IP addresses using the -p flag:

$ docker run -d -p 192.168.218.129:4915:80 nginx
f191f3aaf9052803a46dce1d65e2bf6f44e2c5cc929a713c40931b4a0d871d7e

$ docker ps -q
f191f3aaf905

$ docker port f191
80/tcp -> 192.168.218.129:4915
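The same check can be automated by scanning the docker port output for mappings published on all interfaces. The sketch below is only an illustration: wildcard_bindings is a hypothetical helper name, and the IPv6 wildcard spellings it matches ([::] and ::) are an assumption about how some Docker releases print such mappings.

```shell
# wildcard_bindings (hypothetical helper): reads `docker port` output on
# stdin and prints the mappings published on all interfaces, that is on
# 0.0.0.0 or on the IPv6 wildcard address.
# Returns 0 if such a mapping exists, 1 if none was found (grep semantics).
wildcard_bindings() {
    grep -E -- '-> (0\.0\.0\.0|\[::]|::):[0-9]+$'
}

# Usage sketch (assumes a working Docker CLI):
#   for id in $(docker ps -q); do
#       docker port "$id" | wildcard_bindings | sed "s/^/$id: /"
#   done
```

A mapping bound to a specific address, such as 192.168.218.129:4915, is not reported.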

Do not share the host’s process namespace

The PID namespace provides separation of processes: it hides the host's system processes from the container and allows process IDs, including PID 1, to be reused. If the host’s PID namespace is shared with the container, processes within the container can see all of the processes on the host system. This defeats process-level isolation between the host and the containers. Anyone with access to the container can learn which processes are running on the host system and can even kill them from within the container. This can be catastrophic. Hence, do not share the host’s process namespace with the containers.

As shown below, a container started with --pid=host can see all the system-level processes of the host and can kill them as well, which is a potential threat. So, while auditing, check that PID mode is not set to host for any container.

$ docker run -it --pid=host ubuntu /bin/bash
$ ps -ef
UID         PID   PPID  C STIME TTY          TIME CMD
root          1      0  0 12:26 ?        00:00:03 /sbin/init auto noprompt
root          2      0  0 12:26 ?        00:00:00 [kthreadd]
root          3      2  0 12:26 ?        00:00:00 [ksoftirqd/0]
root          5      2  0 12:26 ?        00:00:00 [kworker/0:0H]
root          7      2  0 12:26 ?        00:00:01 [rcu_sched]
root          8      2  0 12:26 ?        00:00:00 [rcu_bh]

$ docker ps -q | xargs docker inspect --format '{{ .Id }}: PidMode={{ .HostConfig.PidMode }}'
e42faa09133dd717d50da59af516dd3410db3889ffb9ef2767438b24a7b96a74: PidMode=host
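The privileged, network mode and PID mode checks can also be combined into a single pass over all running containers. The sketch below is one possible way to do it, not a prescribed audit procedure: runtime_violations is a hypothetical helper name, and the space-separated line format is an assumption chosen to keep the parsing simple.

```shell
# runtime_violations (hypothetical helper): reads one line per container in
# the form "ID Privileged=... NetworkMode=... PidMode=..." and prints each
# container ID followed by the settings that break the recommendations.
# Containers with no violation produce no output.
runtime_violations() {
    awk '{
        bad = ""
        for (i = 2; i <= NF; i++)
            if ($i == "Privileged=true" || $i == "NetworkMode=host" || $i == "PidMode=host")
                bad = bad " " $i
        if (bad != "") print $1 bad
    }'
}

# Usage sketch (assumes a working Docker CLI):
#   docker ps -q | xargs docker inspect --format \
#     '{{ .Id }} Privileged={{ .HostConfig.Privileged }} NetworkMode={{ .HostConfig.NetworkMode }} PidMode={{ .HostConfig.PidMode }}' \
#     | runtime_violations
```

Each output line names one container and only the settings that need attention, which keeps the report short on hosts with many containers.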

Do not mount sensitive host system directories on containers

If sensitive directories are mounted in read-write mode, it is possible to change files within those directories from inside the container. Such changes could have security implications or leave the Docker host in a compromised state.

For example, if the sensitive directory /run/systemd is mounted in the container, then we can actually shut down the host from the container itself.

$ docker run -ti -v /run/systemd:/run/systemd centos /bin/bash
[root@1aca7fe47882 /]# systemctl status docker
docker.service - Docker Application Container Engine
   Loaded: loaded (/lib/systemd/system/docker.service; enabled)
   Active: active (running) since Sun 2015-11-29 12:22:50 UTC; 21min ago
     Docs: https://docs.docker.com
 Main PID: 758
   CGroup: /system.slice/docker.service

[root@1aca7fe47882 /]# shutdown

This can be audited with the command below, which returns the list of currently mapped directories and whether they are mounted in read-write mode for each container instance:

$ docker ps -q | xargs docker inspect --format '{{ .Id }}: Volumes={{ .Volumes }} VolumesRW={{ .VolumesRW }}'
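More recent Docker releases report mount information under the .Mounts field of docker inspect rather than under .Volumes and .VolumesRW. Assuming such a release, the audit can be sketched as below. The helper name sensitive_rw_mounts and the list of sensitive paths are illustrative assumptions, not part of the original check; only /run/systemd comes from the example above. The match is on the exact source path, so subdirectories and paths containing spaces are not handled.

```shell
# sensitive_rw_mounts (hypothetical helper): reads "ID SOURCE RW" lines on
# stdin and prints the ID and source of every mount whose source is one of
# the listed host paths and which is mounted read-write (RW is "true").
# The path list is an illustrative assumption; adapt it to your hosts.
sensitive_rw_mounts() {
    awk '
        BEGIN {
            n = split("/ /boot /dev /etc /lib /proc /sys /usr /run/systemd", p, " ")
            for (i = 1; i <= n; i++) sensitive[p[i]] = 1
        }
        NF == 3 && $3 == "true" && ($2 in sensitive) { print $1, $2 }
    '
}

# Usage sketch (assumes a Docker release that exposes .Mounts):
#   docker ps -q | xargs docker inspect --format \
#     '{{ $id := .Id }}{{ range .Mounts }}{{ println $id .Source .RW }}{{ end }}' \
#     | sensitive_rw_mounts
```

Blank lines in the inspect output are ignored because only lines with exactly three fields are considered.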