DevOps · Mar 28, 2026

Zero-Instrumentation Observability: How eBPF Replaced the Sidecar Fleet

Open Soft Team

Engineering Team

67% of Kubernetes Teams Have Switched to eBPF Observability

According to the CNCF 2026 Annual Survey, 67% of Kubernetes teams now use eBPF-based tools for at least one observability pillar (metrics, traces, or logs) — up from 29% in 2024 and 41% in 2025. The shift is not gradual anymore; it is a stampede.

The reason is simple: traditional sidecar-based observability (Envoy proxies, OpenTelemetry Collector sidecars, Datadog agents) consumes enormous resources, adds latency to every request, and requires code changes or container modifications to instrument. eBPF does all of it from the kernel — with zero application changes.

What eBPF Is and Why It Matters

eBPF (extended Berkeley Packet Filter) is a technology that allows sandboxed programs to run inside the Linux kernel without changing kernel source code or loading kernel modules. Originally designed for network packet filtering, eBPF has evolved into a general-purpose kernel programmability framework.

How It Works

  1. Write a small program in C or Rust (or use a higher-level framework)
  2. Attach it to a kernel hook point — syscalls, network events, tracepoints, function entries/exits
  3. The kernel’s eBPF verifier checks the program for safety (no infinite loops, no out-of-bounds access, bounded execution)
  4. The JIT compiler translates eBPF bytecode to native machine instructions
  5. The program executes at the hook point with near-native performance

For observability, this means you can intercept every HTTP request, DNS lookup, TCP connection, file system operation, and process execution — without modifying any application code, without injecting sidecars, and without restarting pods.

// Simplified eBPF program to trace outbound TCP connections.
// Attaches a kprobe to tcp_v4_connect, which fires for every IPv4 connect.
#include <vmlinux.h>
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_core_read.h>
#include <bpf/bpf_endian.h>

struct event_t {
    u32 pid;
    u16 dport;
    u32 daddr;
    u64 timestamp;
};

// Perf buffer used to push events to user space.
struct {
    __uint(type, BPF_MAP_TYPE_PERF_EVENT_ARRAY);
    __uint(key_size, sizeof(u32));
    __uint(value_size, sizeof(u32));
} events SEC(".maps");

SEC("kprobe/tcp_v4_connect")
int trace_connect(struct pt_regs *ctx) {
    struct sock *sk = (struct sock *)PT_REGS_PARM1(ctx);

    // CO-RE reads: kernel memory cannot be dereferenced directly from a kprobe.
    struct event_t event = {
        .pid = bpf_get_current_pid_tgid() >> 32,
        .dport = bpf_ntohs(BPF_CORE_READ(sk, __sk_common.skc_dport)),
        .daddr = BPF_CORE_READ(sk, __sk_common.skc_daddr),
        .timestamp = bpf_ktime_get_ns(),
    };

    bpf_perf_event_output(ctx, &events, BPF_F_CURRENT_CPU,
                          &event, sizeof(event));
    return 0;
}

Why It Matters for Observability

Traditional observability requires instrumentation — adding code, libraries, or sidecar containers to your applications. This approach has three fundamental problems:

  1. Resource overhead — Every sidecar consumes CPU and memory. In a 500-pod cluster with Envoy sidecars, the sidecars themselves can consume 30-40% of total cluster resources.
  2. Coverage gaps — You can only observe what you instrument. Third-party binaries, kernel-level events, and network infrastructure remain blind spots.
  3. Maintenance burden — Every language, framework, and runtime needs its own instrumentation library. Keeping them updated across hundreds of services is a full-time job.

eBPF solves all three: it runs in the kernel (zero application overhead), sees everything the kernel sees (no gaps), and works regardless of application language or framework (instrument once, observe everything).

The eBPF Observability Stack in 2026

The ecosystem has matured into a set of battle-tested tools, each covering a specific observability domain:

Cilium + Hubble: Network Observability

Cilium has become the leading CNI (Container Network Interface) for Kubernetes: Google GKE uses it as the basis for Dataplane V2, Azure AKS offers Azure CNI powered by Cilium, and it is widely deployed on AWS EKS. Hubble, Cilium’s observability component, provides:

  • L3/L4 flow visibility — Every TCP/UDP connection between pods, with source/destination identity, port, latency, and bytes transferred
  • L7 protocol parsing — HTTP, gRPC, Kafka, DNS, and PostgreSQL request/response parsing without any application changes
  • Network policy auditing — See which network policies allow or deny traffic in real time
  • Service dependency mapping — Automatic service graph generation based on observed traffic patterns

Hubble’s UI provides a real-time service map that shows request rates, error rates, and latency percentiles between every service — all derived from kernel-level network observations.

Pixie: Application Performance Monitoring

Pixie (now a CNCF sandbox project) uses eBPF to provide zero-instrumentation APM for Kubernetes workloads:

  • Automatic protocol tracing — HTTP/1.1, HTTP/2, gRPC, PostgreSQL, MySQL, Redis, Kafka, DNS, AMQP, and NATS request/response capture
  • Continuous CPU profiling — Flame graphs for every process, generated from eBPF stack traces without any profiling agents
  • Dynamic logging — Add trace points to running applications without redeploying
  • Full-body request/response capture — See the actual HTTP request headers and bodies, SQL queries and results, gRPC payloads

Pixie stores all data locally in the cluster (not shipped to a SaaS vendor), retaining up to 24 hours of full-fidelity data in a configurable memory budget.

Tetragon: Runtime Security and Audit

Tetragon (by Isovalent/Cilium) is an eBPF-based security observability and runtime enforcement tool:

  • Process execution tracking — Every exec, fork, and exit event with full process tree context
  • File access monitoring — Track reads, writes, and permission changes to sensitive files
  • Network connection auditing — Log every outbound connection with process context (which binary opened a socket to which IP)
  • Security policy enforcement — Block suspicious activities in real time (e.g., kill a process that tries to read /etc/shadow)

Tetragon integrates with Kubernetes admission controllers and SIEM systems, providing the audit trail that compliance teams need without any application-level logging code.

Grafana Beyla: Auto-Instrumentation

Grafana Beyla is an eBPF-based auto-instrumentation agent that generates OpenTelemetry-compatible traces and metrics without any code changes:

  • Detects HTTP, gRPC, SQL, and Redis requests at the kernel level
  • Emits OpenTelemetry spans with proper trace context propagation
  • Supports distributed tracing across services (propagates trace IDs through kernel observations)
  • Integrates with Grafana Cloud, Tempo, Mimir, and any OpenTelemetry-compatible backend

Beyla is particularly useful for teams migrating from sidecar-based OpenTelemetry Collectors: drop in Beyla as a DaemonSet, remove the sidecar containers, and your existing Grafana dashboards keep working.

Splunk OBI: OpenTelemetry eBPF Instrumentation

At KubeCon EU 2026 (March 2026, London), Splunk announced OBI (OpenTelemetry eBPF Instrumentation), a project that contributes eBPF-based auto-instrumentation directly to the OpenTelemetry Collector:

  • Upstream-first approach — OBI is being contributed to the OpenTelemetry project, not a proprietary Splunk tool
  • Full OTel compatibility — Generates standard OpenTelemetry signals (traces, metrics, logs) from eBPF observations
  • Language-agnostic — Works with any language runtime (Go, Java, Python, Node.js, Rust, .NET) without SDK installation
  • Hybrid mode — Can supplement existing SDK instrumentation with kernel-level data for complete visibility

OBI represents the convergence of the eBPF and OpenTelemetry ecosystems. Instead of choosing between eBPF-native tools and OTel-native tools, you get both in a single pipeline.

Performance: The Numbers That Matter

The resource savings from eBPF-based observability are dramatic. Here are real-world benchmarks from a 500-pod production Kubernetes cluster running a microservices application:

Memory Usage Comparison

| Component                            | Sidecar Approach | eBPF Approach       | Savings       |
|--------------------------------------|------------------|---------------------|---------------|
| Envoy sidecar (500 pods)             | 50 GB            | 0 (Cilium CNI)      | 50 GB         |
| OTel Collector sidecars (500 pods)   | 15 GB            | 0 (Beyla DaemonSet) | 15 GB         |
| Datadog Agent (DaemonSet, 20 nodes)  | 10 GB            | N/A                 | 10 GB         |
| Cilium agents (20 nodes)             | N/A              | 8 GB                | -8 GB         |
| Beyla agents (20 nodes)              | N/A              | 2 GB                | -2 GB         |
| Pixie (edge modules, 20 nodes)       | N/A              | 2 GB                | -2 GB         |
| Total                                | 75 GB            | 12 GB               | 84% reduction |

CPU Overhead

| Metric                    | Sidecar Approach        | eBPF Approach       |
|---------------------------|-------------------------|---------------------|
| Per-request latency added | 1-5ms (Envoy proxy hop) | <0.1ms (kernel-level) |
| CPU overhead per node     | 8-12%                   | <1%                 |
| Tail latency impact (p99) | +15-30ms                | <1ms                |

Operational Metrics

| Metric                  | Sidecar Approach              | eBPF Approach                 |
|-------------------------|-------------------------------|-------------------------------|
| Containers per pod      | 2-4 (app + sidecars)          | 1 (app only)                  |
| Pod startup time        | 5-15s (sidecar init)          | 1-3s (app only)               |
| Config files to manage  | 500+ (per-pod sidecar configs)| 20 (per-node DaemonSet configs) |
| Languages requiring SDK | All (per-language OTel SDK)   | None (kernel-level)           |
| Blind spots             | Non-instrumented services     | None (kernel sees all)        |

Under 1% CPU Overhead: How Is That Possible?

The sub-1% CPU overhead claim is real, verified by independent benchmarks from Isovalent, CNCF, and multiple end-user companies. Here is why eBPF is so efficient:

  1. JIT compilation — eBPF programs are compiled to native machine code by the kernel JIT compiler. They run at near-native speed, not in an interpreter.
  2. Per-CPU maps — Data structures are partitioned per CPU core, eliminating lock contention. Each core writes to its own buffer.
  3. Ring buffers — Events are pushed to user-space through lock-free ring buffers. No system calls needed for each event.
  4. In-kernel aggregation — eBPF programs can aggregate metrics (counters, histograms) in kernel space, sending only summaries to user-space instead of raw events.
  5. Selective attachment — eBPF programs are only invoked at their specific hook points. An HTTP tracing program runs only when HTTP-related syscalls fire, not on every kernel event.

Migration Guide: From Sidecars to eBPF

Migrating from sidecar-based monitoring to eBPF is a phased process. Here is the recommended approach:

Phase 1: Deploy eBPF Tools Alongside Sidecars (Week 1-2)

  • Install Cilium as your CNI (if not already using it)
  • Deploy Hubble for network observability
  • Deploy Beyla as a DaemonSet for auto-instrumented traces
  • Run both sidecar and eBPF observability in parallel
  • Compare data quality and coverage

Phase 2: Validate and Tune (Week 3-4)

  • Verify that eBPF tools capture the same signals as your sidecar stack
  • Tune Beyla’s protocol detection for your specific services
  • Configure Hubble’s L7 parsing for your custom protocols
  • Set up dashboards that mirror your existing sidecar-based dashboards
  • Alert your team to any coverage gaps

Phase 3: Remove Sidecars Incrementally (Week 5-8)

  • Start with non-critical services: remove OTel Collector sidecars
  • Monitor for data quality regressions
  • Remove Envoy sidecars from services that do not need advanced traffic management
  • Keep Envoy only for services that need its advanced features (circuit breaking, retries, traffic splitting)

Phase 4: Full eBPF Stack (Week 9-12)

  • Remove remaining sidecars
  • Deploy Tetragon for runtime security
  • Consolidate alerting on eBPF-derived signals
  • Document the new observability architecture
  • Reclaim freed resources (you will get 60-80% of sidecar resources back)

Rollback Plan

Keep your sidecar configurations in version control. If eBPF tools miss critical signals for a specific service, you can redeploy sidecars for that service while keeping eBPF for everything else. Hybrid deployments work fine.

Kernel Requirements

eBPF observability tools require a modern Linux kernel. Here are the minimum versions:

| Feature                                  | Minimum Kernel | Recommended |
|------------------------------------------|----------------|-------------|
| Basic eBPF (maps, programs)              | 4.15           | 5.15+       |
| BPF ring buffer                          | 5.8            | 5.15+       |
| BPF CO-RE (compile once, run everywhere) | 5.5            | 5.15+       |
| BTF (BPF Type Format)                    | 5.2            | 5.15+       |
| BPF LSM (security policies)              | 5.7            | 5.15+       |
| BPF iterators                            | 5.8            | 5.15+       |

All major Kubernetes distributions (EKS with Amazon Linux 2023, GKE with COS, AKS with Ubuntu 22.04) ship kernels that meet these requirements. If you are running on-premises, ensure your nodes run kernel 5.15 or later for the best eBPF experience.

Frequently Asked Questions

Does eBPF observability work with non-Kubernetes workloads?

Yes. eBPF runs at the Linux kernel level, so it works with any workload — containers, VMs, bare metal, systemd services. Cilium and Tetragon can be deployed outside of Kubernetes, and Beyla supports standalone mode. However, the richest experience is in Kubernetes where tools can correlate kernel events with pod and service metadata.

Can eBPF replace distributed tracing?

For many teams, yes. Beyla and Pixie generate distributed traces from kernel observations, including trace context propagation. However, eBPF traces are limited to request/response boundaries — they cannot trace custom business logic inside your functions. For deep application-level tracing (e.g., “which database query was slow inside this handler”), you still need SDK instrumentation. The recommended approach is eBPF for infrastructure-level tracing plus targeted SDK instrumentation for business-critical paths.

What about encrypted traffic (TLS)?

eBPF tools can trace TLS traffic by attaching to the TLS library functions (e.g., OpenSSL’s SSL_read and SSL_write) rather than the network layer. This captures plaintext data before encryption or after decryption. Pixie, Beyla, and Cilium all support TLS tracing for OpenSSL, BoringSSL, and Go’s crypto/tls. Rust’s rustls support was added in early 2026.

Is eBPF safe? Can a buggy eBPF program crash the kernel?

No — a buggy eBPF program cannot crash the kernel. The eBPF verifier is a static analyzer built into the kernel that checks every eBPF program before loading. It rejects programs that could cause infinite loops, out-of-bounds memory access, or other safety violations. A buggy eBPF program will fail to load; it will never crash the kernel. This is a fundamental safety guarantee of the eBPF architecture.

How does eBPF handle high-throughput services (100K+ requests/second)?

eBPF handles high throughput through in-kernel aggregation and sampling. Instead of sending every event to user-space, eBPF programs can compute histograms, counters, and summaries in kernel maps, sending only aggregated data. For full-fidelity tracing at extreme throughput, tools like Pixie use intelligent sampling that captures all error and slow requests while sampling normal requests.

What is the total cost of ownership compared to commercial APM tools?

eBPF observability tools (Cilium, Hubble, Pixie, Beyla, Tetragon) are open-source. The main costs are compute (DaemonSet resources — roughly 12 GB RAM and 2-3 CPU cores for a 500-pod cluster) and engineering time for setup and maintenance. Compared to commercial APM tools (Datadog, New Relic, Splunk) that charge $15-30 per host per month plus ingestion fees, eBPF-based stacks typically cost 70-90% less at scale.