Capceiling is a new system call added by IronPenguin in order to support jail and other security measures.
For i386 and x86_64 systems, it is defined as:
#define __NR_capceiling 318
_syscall1(int, capceiling, unsigned int *, capceiling);
or
int capceiling(unsigned int *caps);
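Where the _syscall1 macro is not available, the same entry point could be reached through the generic syscall() function from the C library. The following is only a sketch under that assumption: the number 318 is taken from the definition above, and on a kernel without this patch that number may be unassigned or belong to a different call, so the wrapper should only be called on a patched kernel.

```c
#define _GNU_SOURCE
#include <unistd.h>
#include <sys/syscall.h>

/* Assumption: 318 is the number given above for i386 and x86_64. */
#ifndef __NR_capceiling
#define __NR_capceiling 318
#endif

/* Thin wrapper matching the prototype int capceiling(unsigned int *caps). */
int capceiling(unsigned int *caps)
{
    return (int)syscall(__NR_capceiling, caps);
}
```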
The capceiling is the absolute limit of the capabilities a process or its descendants may acquire by any means, even if it executes a suid-root program. It follows from that definition that a process can only reduce its capceiling through this system call, never raise it. Any capabilities specified in caps that are currently prohibited will be silently masked out.
caps is currently a pointer to an unsigned 32-bit integer bitmask of capabilities. include/capceiling.h will specify the bitmask of each capability.
On successful return (retval == 0), caps will be set to the new actual capceiling of the calling process.
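The masking and write-back behaviour described above can be modelled in user space. This is a minimal sketch, not the kernel implementation: model_capceiling and its ceiling argument are hypothetical names introduced here, and the model assumes the new ceiling is simply the bitwise AND of the current ceiling and the requested mask.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the ceiling update; names are illustrative only. */
int model_capceiling(uint32_t *ceiling, uint32_t *caps)
{
    uint32_t updated = *ceiling & *caps;   /* prohibited bits are silently dropped */

    assert((updated & ~*ceiling) == 0);    /* the ceiling can only shrink */
    *ceiling = updated;                    /* becomes the process's new ceiling */
    *caps = updated;                       /* reported back to the caller */
    return 0;                              /* success */
}
```

In this model, a ceiling of 0x0000000f and a request of 0x000000f3 leave both the ceiling and caps at 0x00000003. A later request for 0xffffffff still comes back as 0x00000003.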
Example:
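#include <stdio.h>
#include <capceiling.h>

int main(void)
{
    unsigned int caps = CAP_ALL;    /* unsigned, to match the prototype */
    int ret = capceiling(&caps);    /* bits not currently permitted are masked out */
    if (ret == 0)
        printf("Current capceiling = 0x%08x\n", caps);
    return 0;
}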
This works since any of CAP_ALL not currently permitted will simply be masked out.
Capceiling does NOT grant a capability. Having a capability set in a process's capceiling only means it can potentially be granted that capability. For that reason, init is started with capceiling == CAP_ALL. If capceiling is ignored, all processes will inherit that and will behave no differently than a system that doesn't have capceiling at all.
If, instead, caps is assigned:
int caps = CAP_NET_BIND;
Then, as a result, that process and its children could potentially bind to a privileged port if granted that capability through fscaps, setuid-root, su, or sudo, but can never gain any other capability. Notably, after this, even ping won't work: ping requires CAP_NET_RAW, which is why it is usually suid-root.
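The difference between being allowed a capability and being granted one can be illustrated with a small model. This is a sketch under stated assumptions: the DEMO_ bit values and the demo_ helper functions are hypothetical (the real bitmasks come from include/capceiling.h), and the model treats whatever a mechanism such as suid-root offers as being ANDed with the ceiling.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit values; the real ones come from include/capceiling.h. */
#define DEMO_CAP_NET_BIND (1u << 0)
#define DEMO_CAP_NET_RAW  (1u << 1)
#define DEMO_CAP_ALL      0xFFFFFFFFu

/* Capabilities a process ends up with: whatever some mechanism
 * (fscaps, suid-root, su, sudo) offers, limited by the ceiling. */
uint32_t demo_effective(uint32_t ceiling, uint32_t offered)
{
    return ceiling & offered;
}

/* True if every needed capability survives the ceiling. */
bool demo_can_use(uint32_t ceiling, uint32_t offered, uint32_t needed)
{
    return (demo_effective(ceiling, offered) & needed) == needed;
}
```

With a ceiling of DEMO_CAP_NET_BIND, a suid-root program offering DEMO_CAP_ALL can still bind a privileged port, but a ping-like program needing DEMO_CAP_NET_RAW cannot get it. With a ceiling of DEMO_CAP_ALL and nothing offered, the process still holds no capability, since the ceiling alone grants nothing.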