Runtime security is the process of providing real-time monitoring or observability capabilities for your host, containers, and applications while they're running. This allows you to detect a variety of threats, such as:
- Privilege escalation attacks through exploiting security bugs.
- The deployment of unauthorized workloads by an attacker.
- Unauthorized access to secrets or other sensitive information.
- The activation of malware that is hidden inside an application.
Falco is designed to detect these and other threats while your services and applications are running. When it detects unwanted behavior, Falco alerts you instantly so you’re informed (and can react!) right away, not after minutes or hours have passed.
You can think of Falco like a set of smart security cameras for your infrastructure: you place the sensors in key locations, they observe what’s going on, and they ping you if they detect harmful behavior.
With Falco, a set of rules defines what bad behavior is. You can customize or extend these rules to fit your needs. The alerts generated by the Falco sensors can stay on the local machine, but it is good practice to export them to a centralized collector.
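For illustration, a Falco rule is a short YAML stanza with a condition and an output. The following is a simplified sketch, loosely modeled on the default "Terminal shell in container" rule; the `spawned_process` and `container` macros ship with the default ruleset:

```yaml
# Simplified illustrative rule -- not the exact rule shipped with Falco.
- rule: Shell Spawned in Container (example)
  desc: Detect an interactive shell being started inside a container
  condition: >
    spawned_process and container
    and proc.name in (bash, sh, zsh)
  output: >
    Shell spawned in a container
    (user=%user.name container=%container.name cmdline=%proc.cmdline)
  priority: WARNING
  tags: [container, shell, example]
```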
Yes, Falco can run on almost every Linux kernel, whether on a bare-metal server, a VM, or a microVM.
Please check the documentation to learn about kernel versions and more specific deployment restrictions. A list of available drivers can be found here.
No, Falco is deployed once per Linux OS (host OS, or guest OS when a hypervisor is involved) - typically as a privileged DaemonSet when deployed in Kubernetes. Falco instruments the Linux kernel (either via a kernel module or an eBPF probe) and can therefore monitor everything (e.g. system calls) within each container, because every container scheduled on the same node shares the same kernel.
In addition, Falco can hook into the container runtime and in that way associate each kernel event with the exact container (e.g. container ID, name, image repository, tags) as well as Kubernetes attributes such as namespace or pod name.
Lastly, Falco supports gVisor in GCP. However, note that when microVMs are used in Kubernetes, for example, Falco needs to be installed in each microVM / VM guest OS.
Please check the documentation to learn about limitations in kernel instrumentation options for some platforms such as GCP or AWS Fargate.
The most common cause of excessive notifications is noisy rules. Falco ships with a set of default rules, which can be disabled either individually or by using tags, and default macros, some of which are designed to be overridden, depending on your needs and use case.
There is also the possibility of configuring a minimum rule priority, used as a threshold to filter out alerts from rules with a lower priority, and a rate limiter. Keep in mind, however, that these options might reduce visibility of potential threats.
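As a sketch, disabling a rule goes in a custom rules file, while the priority threshold and rate limiter live in falco.yaml. The option names below follow recent Falco releases (the `outputs` rate limiter has been deprecated in newer versions), so verify them against your installed version:

```yaml
# Custom rules file: override a default rule by name to disable it.
- rule: Read sensitive file untrusted
  enabled: false
```

```yaml
# falco.yaml (excerpt)
priority: warning   # ignore alerts from rules below this priority
outputs:
  rate: 1           # sustained alerts per second (token bucket)
  max_burst: 1000   # maximum burst of alerts
```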
First, make sure Falco is running, either as a service or as a container. Second, the event must be generated on the same host where Falco is running; otherwise, Falco won't see it, since a different kernel will be serving that process.
Finally, make sure the rule you want to trigger is not so strict that the event is being filtered out. Start with fewer parameters in the conditions and keep adding them until the rule is just strict enough not to be noisy. Also be aware that Falco uses buffers as an optimization, so the alert might take a few seconds to be displayed.
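When debugging a rule that never fires, it can help to start from a deliberately broad condition and tighten it step by step. A hypothetical sketch (the rule name and thresholds here are illustrative, not part of the default ruleset):

```yaml
# Step 1: broad on purpose -- fires on any open of a file under /etc (noisy).
- rule: Debug opens below etc (example)
  desc: Debugging aid; tighten before using in production
  condition: evt.type in (open, openat) and fd.name startswith /etc
  output: File below /etc opened (proc=%proc.name file=%fd.name)
  priority: NOTICE

# Step 2: once the broad rule fires, add parameters one at a time until
# only the behavior you care about remains, for example:
#   ... and evt.is_open_write=true and not proc.name in (vi, systemd)
```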
Here are the system call event types and args supported by the Falco drivers.
By default, and for performance reasons, Falco only considers a subset of them, indicated in the first column of the same table. However, you can make Falco consider all events by using the -A command-line switch.
This doesn't make Falco cover all possible threats automatically: without the proper rules in place, many of those events will be treated as regular interactions between processes and the kernel.
No. The set of k8s fields k8s.ns.name and k8s.pod.* (i.e., k8s.pod.name, k8s.pod.id, k8s.pod.labels, and k8s.pod.label.*) is populated with data fetched from the container runtime.
Therefore, these fields can also be accessed without enabling the Kubernetes Metadata Enrichment functionality (the -k Falco option).
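These runtime-derived fields can be used directly in a rule's output string; a minimal sketch (the rule itself is illustrative):

```yaml
# Example rule whose output uses k8s.* fields populated from the
# container runtime -- no Kubernetes Metadata Enrichment required.
- rule: Shell in Pod (example)
  desc: Illustrates runtime-derived Kubernetes fields in an output string
  condition: spawned_process and container and proc.name = bash
  output: >
    Shell spawned in pod (pod=%k8s.pod.name ns=%k8s.ns.name
    image=%container.image.repository)
  priority: NOTICE
```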
The performance overhead of Falco varies widely and typically scales with the load of the server or VM and the workload footprint (e.g. network-heavy servers will likely cause Falco to consume significantly more CPU).
This is because Falco hooks into kernel syscall tracepoints, and the more syscall invocations occur, the more work has to be done: parsing the event in the kernel, sending it to userspace over a ring buffer, parsing it in userspace, and applying Falco's rule filters. This also makes it hard to derive stable performance metrics, as CPU and memory usage will fluctuate with the workloads being monitored.
Options available to tune performance:
- Some syscalls are higher-volume than others; perform a cost-benefit analysis according to your organization's threat model and security posture. The list of activated syscalls is one of the most significant factors driving CPU utilization. In addition, there are tricks to craft Falco rules more effectively.
- Contact your organization's SREs and conduct performance tests in your environment early on in order to derive budgets and appropriate limits (CPU and memory). We recommend always running Falco in cgroups with limits, while on the flip side making sure not to starve the tool.
- Memory: Falco allocates a ring buffer for each CPU, so the more CPUs you have, the more memory is allocated. For high-load servers you may even need to increase the size of each buffer to avoid kernel-side syscall drops. In addition, Falco builds up process and thread state over time, so memory usage increases as a consequence, but it should plateau at some point.
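A falco.yaml excerpt sketching both knobs. The `base_syscalls` and `syscall_buf_size_preset` option names follow recent Falco releases; verify them, and the example syscall set, against your version's falco.yaml before relying on them:

```yaml
# falco.yaml (excerpt)
base_syscalls:
  # Restrict instrumentation to an explicit, lower-volume syscall set
  # chosen from your own cost-benefit analysis (illustrative list).
  custom_set: [clone, clone3, fork, execve, execveat, openat, connect]
  # Set to true to let Falco auto-add syscalls its internal state engine needs.
  repair: false

# Per-CPU ring buffer size preset (1-10); raise on busy hosts to
# reduce kernel-side syscall drops, at the cost of more memory.
syscall_buf_size_preset: 4
```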
Lastly, while the Falco community is constantly improving and optimizing the tool and exposing more settings and options in falco.yaml to customize the deployment, some factors are out of its reach. Concrete examples include kernel settings and hardware type, which alone can have a tremendous impact on performance even when all else is constant.