Why Flow Offload Breaks QoS (And When It’s Still Worth It)

You enable QoS.

You tune your queues.

You finally get latency under control.

Then you enable flow offload.

And suddenly… everything feels wrong again.

This is not a bug. It’s a design trade-off.

What Flow Offload Actually Does

Flow offload exists for one reason:

to bypass the slow parts of the Linux networking stack.

Instead of processing every packet through:

  • netfilter
  • qdisc (QoS)
  • full routing logic

it caches the flow and fast-paths future packets.
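The caching idea can be sketched as a toy model. This is not the kernel's implementation: `ToyForwarder`, `full_stack_lookup`, and the interface name `wan0` are hypothetical names invented for illustration. Real implementations also typically wait until a connection is established before offloading it, whereas this sketch caches on the first packet.

```python
from typing import Callable, Dict, NamedTuple


class FlowKey(NamedTuple):
    """Five-tuple identifying one direction of a connection."""
    src: str
    dst: str
    sport: int
    dport: int
    proto: str


class ToyForwarder:
    """Toy model: the first packet takes the slow path, the rest hit a cache."""

    def __init__(self, slow_path: Callable[[FlowKey], str]) -> None:
        self.slow_path = slow_path            # stands in for netfilter + routing
        self.cache: Dict[FlowKey, str] = {}   # flow -> cached egress decision
        self.slow_hits = 0
        self.fast_hits = 0

    def forward(self, key: FlowKey) -> str:
        egress = self.cache.get(key)
        if egress is not None:
            # Fast path: reuse the cached decision, no per-packet processing.
            self.fast_hits += 1
            return egress
        # Slow path: run the full (expensive) lookup once, then remember it.
        self.slow_hits += 1
        egress = self.slow_path(key)
        self.cache[key] = egress
        return egress


def full_stack_lookup(key: FlowKey) -> str:
    # Hypothetical stand-in for firewall rules and the route lookup.
    return "wan0"


fwd = ToyForwarder(full_stack_lookup)
flow = FlowKey("192.168.1.10", "203.0.113.5", 50000, 443, "tcp")
for _ in range(1000):
    fwd.forward(flow)

print(fwd.slow_hits, fwd.fast_hits)  # 1 999
```

Out of 1000 packets, only one pays for the full lookup. That is where the CPU savings come from, and it is also why anything decided per packet in the slow path stops being decided.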

This dramatically reduces CPU usage.

And on small routers, that matters.

The Problem: QoS Lives in the Slow Path

QoS in Linux is applied through:

  • qdiscs (CAKE, fq_codel, etc.)
  • skb->priority handling

All of that happens in the normal packet path.

Flow offload skips most of it. With hardware offload, packets never reach the CPU at all, so no qdisc ever sees them.

So when a flow is offloaded:

QoS decisions are no longer applied per packet.

What That Looks Like in Practice

You might see:

  • bufferbloat returning
  • latency spikes under load
  • priority traffic losing its advantage

Because once a flow is offloaded:

it is treated as “just packets” again.
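To make "just packets" concrete, here is a minimal sketch comparing the send order of a backlog with and without priority handling. The packet labels and the two-level priority scheme are assumptions made up for this example, not a model of any particular qdisc.

```python
from typing import List, Tuple

Packet = Tuple[str, int]  # (label, priority); a lower number is more urgent


def fifo_order(arrivals: List[Packet]) -> List[str]:
    """Send packets strictly in arrival order, ignoring priority."""
    return [label for label, _ in arrivals]


def priority_order(arrivals: List[Packet]) -> List[str]:
    """Send the most urgent class first; keep arrival order within a class."""
    # sorted() is stable, so packets of equal priority keep their order.
    return [label for label, _ in sorted(arrivals, key=lambda p: p[1])]


# 50 bulk packets are already queued when one VoIP packet arrives.
backlog: List[Packet] = [(f"bulk{i}", 1) for i in range(50)] + [("voip", 0)]

print(fifo_order(backlog).index("voip"))      # 50
print(priority_order(backlog).index("voip"))  # 0
```

Without priority handling, the VoIP packet waits behind the entire bulk backlog. That wait is the latency spike you feel under load.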

Why This Is Not Easily Fixable

At first glance, it seems like a bug.

Why not just apply QoS in the fast path?

Because:

  • flow offload is intentionally simple
  • hardware offload is even more limited
  • per-packet decisions are expensive
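A rough back-of-the-envelope calculation shows why per-packet work is expensive. The figures are illustrative assumptions, not measurements: a 1 Gbit/s link, 1500-byte packets, a single 880 MHz core, and no allowance for Ethernet framing overhead.

```python
def packets_per_second(link_bps: float, packet_bytes: int) -> float:
    """Packets per second needed to fill the link (framing overhead ignored)."""
    return link_bps / (packet_bytes * 8)


def cycles_per_packet(cpu_hz: float, link_bps: float, packet_bytes: int) -> float:
    """CPU cycles available per packet if one core handles the whole link."""
    return cpu_hz / packets_per_second(link_bps, packet_bytes)


# Assumed example figures: 1 Gbit/s link, 1500-byte packets, 880 MHz core.
pps = packets_per_second(1e9, 1500)
budget = cycles_per_packet(880e6, 1e9, 1500)

print(round(pps), round(budget))  # 83333 10560
```

Roughly ten thousand cycles per packet has to cover firewall rules, routing, queueing, and driver work. With small packets the budget shrinks to a few hundred cycles, which leaves little room for fine-grained decisions.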

You can have:

  • fast path
  • or fine-grained control

But not both at the same time.

The Important Distinction: Classification vs Enforcement

This is where things get interesting.

Even with flow offload:

  • DSCP still exists
  • hardware queues still exist

So while software QoS is bypassed:

hardware QoS can still work.

If your system is aligned, meaning DSCP marks map consistently onto switch CoS and Wi-Fi WMM classes (DSCP → CoS → WMM):

  • switch queues still prioritize traffic
  • Wi-Fi still applies WMM

This is why alignment matters more than complex qdiscs.
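As an illustration of what alignment means, the sketch below uses one common default: the 802.1p priority is taken from the top three bits of the DSCP value, and that priority selects the WMM access category. This is an assumption for the example. Switches and Wi-Fi drivers can be configured with different tables, and the mapping RouterWRT itself uses is not specified here.

```python
# Standard DSCP code points for a few well-known classes.
DSCP_NAMES = {"BE": 0, "CS1": 8, "AF41": 34, "EF": 46, "CS6": 48}

# 802.11 user priority (0-7) to WMM access category.
UP_TO_WMM = {
    1: "AC_BK", 2: "AC_BK",   # background
    0: "AC_BE", 3: "AC_BE",   # best effort
    4: "AC_VI", 5: "AC_VI",   # video
    6: "AC_VO", 7: "AC_VO",   # voice
}


def dscp_to_priority(dscp: int) -> int:
    """Assumed default: priority = top three bits of the 6-bit DSCP value."""
    if not 0 <= dscp <= 63:
        raise ValueError(f"DSCP out of range: {dscp}")
    return dscp >> 3


def dscp_to_wmm(dscp: int) -> str:
    """Chain DSCP -> 802.1p priority -> WMM access category."""
    return UP_TO_WMM[dscp_to_priority(dscp)]


for name, dscp in DSCP_NAMES.items():
    print(name, dscp, dscp_to_priority(dscp), dscp_to_wmm(dscp))
# BE 0 0 AC_BE
# CS1 8 1 AC_BK
# AF41 34 4 AC_VI
# EF 46 5 AC_VI
# CS6 48 6 AC_VO
```

Note that under this default, EF (commonly used for voice) lands in the video category rather than the voice one. Mismatches like this are exactly what aligning DSCP across the system is meant to catch.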

When Flow Offload Makes Sense

Flow offload is a good choice when:

  • CPU is the bottleneck
  • throughput matters more than perfect fairness
  • you rely on hardware QoS instead of software shaping

In other words:

most real-world SOHO routers.

When It Doesn’t

You should avoid flow offload when:

  • you depend on SQM (CAKE, fq_codel)
  • latency under load is critical
  • throughput is already sufficient without offload

Because in those cases:

software control matters more than raw speed.

The RouterWRT Perspective

Instead of trying to make SQM and flow offload work together, RouterWRT takes a different approach:

  • use flow offload for efficiency
  • use hardware QoS for prioritization
  • align DSCP across the system

The goal is not perfect fairness.

It is predictable performance at low CPU cost.

Conclusion

Flow offload doesn’t break QoS by accident.

It replaces it.

It trades per-packet control for efficiency.

And on small devices, that trade-off often makes sense.

The key is understanding the difference:

  • software QoS → precise, expensive
  • hardware QoS → simple, scalable

Once you accept that, the system becomes much easier to reason about.
