When the Network Is Clean: Troubleshooting the Host Network Stack
- robertbmacdonald
"...but the network is still slow."
You've dropped everything to jump on an urgent troubleshooting conference call. A vital application has started running slowly and the business is hurting. Everyone looks at you and says "the network is slow!"
I recently shared how I troubleshoot "the network is slow" complaints in under 30 minutes. Using that data-driven method, I systematically check and verify every router and switch in the path.
But what happens when the routers and switches are cleared? You present your data with confidence—the network infrastructure is healthy. But the business's critical app is still degraded.
"...but the network is still slow."
This is where many troubleshooting efforts stall. The network team cleared their equipment. The application team insists their code is fine. Everyone is stuck.
But there's still a major component of the system that has not yet been checked - the server. More specifically, the server's network stack.
Here's the systematic approach I use to troubleshoot the host network stack and find where the problem actually lives.
Keeping It Simple (For Now)
For simplicity's sake, I'll use a UDP-based application as the example and only focus on packets being received at the host. UDP is simpler than TCP—no connection state, no retransmissions, no congestion control—which lets us focus on the systematic methodology without getting lost in TCP complexity.
Likewise, this post will not cover advanced network stack technologies like kernel bypass.
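To make the example concrete, here is a minimal sketch of the kind of UDP receiver I have in mind (Python, with an arbitrary port of 5000; a real application will obviously differ):
# udp_receiver.py - minimal sketch of the example workload: a UDP listener
# reading datagrams out of its socket receive buffer as fast as it can.
import socket

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.bind(("0.0.0.0", 5000))            # arbitrary example port

while True:
    data, addr = sock.recvfrom(65535)
    # Real processing would happen here. If this loop is too slow, the socket
    # receive buffer fills up and the kernel drops packets (see Step 7 below).
    print(f"{len(data)} bytes from {addr[0]}:{addr[1]}")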
Understanding the Packet Journey: Every Queue is a Bottleneck
When troubleshooting host network performance, it's critical to understand that a packet passes through multiple queues and buffers on its journey from the network cable to your application. Each of these is an opportunity for packet loss or latency.
Here's the complete path for an inbound packet:
NIC hardware buffer → Small buffer on the NIC itself
RX ring buffer → Circular buffer in system RAM (DMA target)
SoftIRQ processing → Kernel processes packets from ring buffer
CPU backlog queue → Per-CPU queue between NIC and protocol stack
Protocol stack → IP, UDP/TCP layers, netfilter
Socket receive buffer → Per-socket kernel buffer
Application → Userspace buffer
Each of these has a maximum size (tunable in most cases), specific metrics that show drops or saturation, and different failure modes.
Just like how I analyze every single hop along the network path when validating the network, I must analyze every single hop within the network stack when validating the host.
When I apply the USE Method to the host network stack, I'm systematically checking Utilization, Saturation, and Errors at each of these stages.
Quick refresher on the USE Method:
Utilization: How busy is the resource?
Saturation: How much queued work is waiting?
Errors: Are there failures occurring?
My 8-Step Systematic Approach:
System-wide health - Confirm the host isn't generally unhealthy
NIC hardware - Physical layer and link status
Ring buffers - Where NIC DMAs packets into RAM
SoftIRQ processing - Interrupt handling and distribution
CPU backlog queue - Per-CPU queues before protocol stack
Protocol stack - IP and UDP/TCP processing
Socket layer - Where most UDP problems hide
Correlate and confirm - Tie findings to symptoms
I work through the packet path systematically, checking each buffer and queue along the way. This approach mirrors how I troubleshoot the network itself—hop by hop, checking each component in the path.
Step 1: Quick System-Wide Health Check
Before diving into network stack specifics, I confirm the host isn't just generally unhealthy. A saturated CPU core or memory pressure will cause network performance issues regardless of buffer tuning.
Utilization:
# CPU per core (one saturated core can bottleneck)
mpstat -P ALL 1
# Memory usage
free -h
# Context switching
vmstat 1
Saturation:
# Swap usage, run queue depth, page faults
vmstat 1
Look at:
si/so columns for swap in/out (memory pressure indicators)
r column for run queue depth (how many processes are waiting for CPU)
Errors:
# Hardware errors
dmesg -T | grep -i error
# System log anomalies
journalctl -p err -n 50
I've seen "network" problems that were actually one CPU core pegged at 100% handling interrupts, or memory pressure causing the kernel to drop packets. If system resources are saturated, fix those first.
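As a first-pass sanity check, I sometimes script this comparison rather than eyeballing vmstat. A rough sketch (note that Linux load averages also count tasks in uninterruptible sleep, so treat this as a coarse flag, not a verdict):
# load_check.py - coarse saturation flag: a 1-minute load average sustained
# well above the core count suggests work is queuing for CPU.
import os

with open("/proc/loadavg") as f:
    load1 = float(f.read().split()[0])

cores = os.cpu_count() or 1
print(f"1-minute load average: {load1}, cores: {cores}")
if load1 > cores:
    print("Possible CPU saturation - dig into mpstat/vmstat before blaming the network stack")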
Step 2: NIC Hardware and Link Level
I start at the beginning of the packet's journey—the NIC hardware itself and the physical link. I check all NIC error counters:
ethtool -S eth0 | grep -iE "err|miss"I look for:
rx_crc_errors: Bad packets (cable, SFP, physical issue)
rx_errors / tx_errors: Physical layer issues
rx_length_errors: Malformed frames
Link level checks:
ethtool eth0
I verify:
Speed and duplex match expectations
No auto-negotiation failures
Link is up and stable
Physical layer issues are rare but catastrophic when they occur. If I see errors incrementing or links flapping constantly, I have a hardware problem (cable, SFP, NIC, or switch port). Don't forget to check light levels on optical links.
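The key question at this layer is whether error counters are incrementing right now, not just non-zero. A rough sketch that samples ethtool -S twice and prints the deltas (it assumes an interface named eth0 and the usual "counter: value" output format):
# ethtool_delta.py - sample NIC counters twice and report which error/drop
# counters moved during the interval.
import re
import subprocess
import time

def read_counters(iface="eth0"):
    out = subprocess.run(["ethtool", "-S", iface],
                         capture_output=True, text=True, check=True).stdout
    # Lines look like "     rx_crc_errors: 0"
    return {m[1]: int(m[2]) for m in re.finditer(r"^\s*(.+?): (\d+)\s*$", out, re.M)}

before = read_counters()
time.sleep(10)
after = read_counters()

for name, value in after.items():
    delta = value - before.get(name, 0)
    if delta and re.search(r"err|drop|miss|discard", name, re.I):
        print(f"{name}: +{delta} in 10 seconds")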
Step 3: NIC Ring Buffer
The ring buffer is where the NIC places packets into system RAM via DMA (direct memory access).
I check ring buffer sizes:
ethtool -g eth0
I check for ring buffer drops:
ethtool -S eth0 | grep -iE "drop|discard"I look for:
rx_dropped: Ring buffer full, NIC dropped packets
rx_missed_errors: NIC couldn't handle packet rate
rx_fifo_errors: Hardware buffer overflow at NIC level
UDP workloads often have high packet rates with small packets. If ring buffers are too small, you'll drop packets at the NIC level before the kernel even sees them.
Larger ring buffers absorb micro-bursts better. If current size is less than maximum, you can increase:
ethtool -G eth0 rx 4096 tx 4096
Step 4: SoftIRQ Processing and Interrupt Distribution
I check CPU time spent on interrupts:
mpstat -P ALL 1I look at the %soft column. High softirq time on one core while others are idle indicates poor interrupt distribution.
I check interrupt distribution:
cat /proc/interrupts | grep eth0Are NIC interrupts balanced across CPU cores, or is one CPU handling everything?
For UDP at high packet rates, interrupt distribution matters. Poor distribution means one CPU becomes the bottleneck. Good distribution spreads the load across cores.
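Eyeballing /proc/interrupts gets tedious on machines with many cores and queues, so I sometimes total it up per CPU with a small script. A rough sketch (it assumes the NIC's IRQ lines contain "eth0" in their names, which is common for multi-queue drivers, e.g. eth0-TxRx-0):
# irq_spread.py - sum eth0-related interrupt counts per CPU to spot imbalance.
IFACE = "eth0"                           # interface under investigation

with open("/proc/interrupts") as f:
    cpus = f.readline().split()          # header row: CPU0 CPU1 ...
    totals = [0] * len(cpus)
    for line in f:
        if IFACE not in line:
            continue
        counts = line.split()[1:1 + len(cpus)]
        for i, c in enumerate(counts):
            totals[i] += int(c)

for cpu, total in zip(cpus, totals):
    print(f"{cpu}: {total}")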
Step 5: CPU Backlog Queue
Between the NIC processing (SoftIRQ) and the protocol stack sits the CPU backlog queue—a per-CPU queue that holds packets before they're processed by IP/UDP/TCP layers.
This queue is controlled by net.core.netdev_max_backlog and can be a bottleneck for high packet-rate workloads.
I check backlog queue statistics:
cat /proc/net/softnet_stat
What you're looking at: Each line represents one CPU core (starting with core 0). Counters are in hex, not decimal.
Column 1: Total packets received by this CPU
Column 2: Packets dropped because backlog queue was full ← This is critical
Column 3: Times softirq processing ran out of budget before draining all packets (time squeeze)
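Because the counters are hex and there's one row per CPU, I find a small script easier than reading the raw file. A rough sketch (the first three columns follow the classic layout described above; newer kernels append extra fields, which this ignores):
# softnet_check.py - flag CPUs with backlog drops or time squeezes.
with open("/proc/net/softnet_stat") as f:
    for cpu, line in enumerate(f):
        fields = [int(v, 16) for v in line.split()]      # counters are hexadecimal
        processed, dropped, squeezed = fields[0], fields[1], fields[2]
        if dropped or squeezed:
            print(f"CPU{cpu}: processed={processed} dropped={dropped} time_squeeze={squeezed}")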
If column 2 is incrementing over time, I check the current backlog setting and increase it:
sysctl net.core.netdev_max_backlog
net.core.netdev_max_backlog = 1000
sysctl -w net.core.netdev_max_backlog=2000
Step 6: Protocol Stack Processing
I next check the protocol stack where IP and UDP processing happens.
Check all relevant counters at once:
nstat -az | grep -iE "drop|discard|err|retrans"Check interface-level statistics:
ip -s -s link show eth0
I check conntrack table usage and compare against the max:
sysctl net.netfilter.nf_conntrack_count
sysctl net.netfilter.nf_conntrack_max
Key counters I watch:
UDP-specific:
UdpInErrors: Errors processing received UDP packets
UdpInCsumErrors: Checksum errors (corrupted packets)
UdpNoPorts: Packets received for ports with no listener
IP layer:
IpInDiscards: IP packets discarded (routing issues, policy drops)
IpOutDiscards: Outbound IP packets discarded
IpInAddrErrors: Packets with invalid destination addresses
IpInHdrErrors: Malformed IP headers
Common issues at this layer:
Netfilter/iptables overhead at high packet rates
Connection tracking table full
Routing failures
IP fragment reassembly issues
For most workloads, protocol stack issues are less common. However, corner cases like heavy IP fragmentation can create contention and packet loss within the protocol processing layer. Systematic checking ensures I don't miss anything.
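The conntrack comparison above is one check I often script. A rough sketch (it assumes the nf_conntrack module is loaded so these /proc files exist; the 80% threshold is an arbitrary warning level):
# conntrack_check.py - warn when the connection tracking table nears its limit.
def read_int(path):
    with open(path) as f:
        return int(f.read())

count = read_int("/proc/sys/net/netfilter/nf_conntrack_count")
limit = read_int("/proc/sys/net/netfilter/nf_conntrack_max")
pct = 100 * count / limit
print(f"conntrack: {count}/{limit} ({pct:.1f}% full)")
if pct > 80:
    print("Warning: conntrack table nearly full - new flows may be dropped")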
Step 7: Socket Layer
This is where I usually find problems for UDP workloads!
If the application isn't pulling packets out of the socket buffer as fast as the operating system is placing them there, the buffer fills up and packets are dropped. From the application's perspective, these packets are simply missing, which is why many people immediately jump to "there is something wrong with the network."
View all UDP sockets and their receive queues:
ss -u -a
Check socket statistics summary:
ss -s
Check for socket buffer drops (the smoking gun!):
nstat -az | grep RcvbufErrors
What I look for:
Recv-Q building up in ss -u -a output means the application isn't draining the socket buffer fast enough
RcvbufErrors incrementing means packets are being dropped because the socket buffer is full
I check current socket buffer settings:
sysctl net.core.rmem_default
sysctl net.core.rmem_max
Example of the problem:
$ nstat -az | grep UdpRcvbufErrors
UdpRcvbufErrors 23910 0.0
$ ss -u -a
State Recv-Q Send-Q Local Address:Port
UNCONN 212992 0 0.0.0.0:5000
If Recv-Q sits at or near the socket's receive buffer limit (rmem_default by default, capped at rmem_max), the buffer is full and packets are being dropped.
The key counter: UdpRcvbufErrors - Socket receive buffer full, packets dropped.
When I see UdpRcvbufErrors incrementing, I've found my problem.
Ideally, the application's code should be reviewed to improve its inbound packet processing and increase the rate at which it drains the socket buffer. Where that isn't possible or practical (e.g. time constraints), socket buffer sizes can be tuned.
The system-wide UDP socket buffer size can be increased, but setting this value can waste memory and is generally considered a crude solution. Instead, the application can explicitly set its socket buffer size with setsockopt().
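A rough sketch of what that setsockopt() call looks like from the application side (Python here for brevity; the 4 MB request is just an illustrative value, and the kernel caps the result at net.core.rmem_max):
# udp_rcvbuf.py - receiver that requests a larger socket buffer itself
# instead of relying on a raised system-wide default.
import socket

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, 4 * 1024 * 1024)
sock.bind(("0.0.0.0", 5000))             # same example port as earlier

# Linux reports roughly double the requested size to account for kernel overhead.
print("effective receive buffer:", sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF))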
Step 8: Correlate and Confirm
Once I've worked through all the layers, I correlate what I found with the symptoms.
Does the timing match when problems started? Do the counters increment during problem periods?
I use watch -d -n 1 to see counters change in real-time:
watch -d -n 1 'nstat -az | grep UdpRcvbufErrors'
When I tune one thing, does the problem improve?
I document everything: timestamps, counter values before and after tuning, specific settings changed. This is critical for proving the fix worked and building organizational knowledge.
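For that documentation step, I'll sometimes leave a small logger running during the problem window so I have timestamped evidence rather than anecdotes. A rough sketch that tails the Udp RcvbufErrors counter out of /proc/net/snmp (the same counter nstat reports as UdpRcvbufErrors) once per second:
# udp_drop_log.py - timestamped log of UdpRcvbufErrors increments, useful for
# correlating socket buffer drops with reported application symptoms.
import time

def rcvbuf_errors():
    with open("/proc/net/snmp") as f:
        header, values = [line.split() for line in f if line.startswith("Udp:")]
    return int(values[header.index("RcvbufErrors")])

last = rcvbuf_errors()
while True:
    time.sleep(1)
    now = rcvbuf_errors()
    if now != last:
        print(f"{time.strftime('%Y-%m-%d %H:%M:%S')} UdpRcvbufErrors +{now - last}")
    last = now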
A Real Example
A client was running a UDP-based application for real-time data streaming. They were seeing intermittent data loss and degraded quality. The network path was clean—my Post #1 methodology confirmed no issues in switches or routers.
Step 1 - System-wide check: CPU had headroom, memory fine, no swap, no page faults.
Step 2 - NIC hardware: No CRC errors, no physical layer issues. Link stable at expected speed.
Step 3 - Ring buffer: No rx_dropped counters. Ring buffers had plenty of capacity.
Step 4 - SoftIRQ: Interrupts well-distributed across cores. No signs of CPU saturation.
Step 5 - CPU backlog: No drops in /proc/net/softnet_stat. Backlog queue was fine.
Step 6 - Protocol stack: Clean. No drops in IP or UDP processing.
Step 7 - Socket layer:
$ nstat -az | grep UdpRcvbufErrors
UdpRcvbufErrors 45892 0.0
Found it.
The counter was incrementing rapidly during heavy load.
Root cause: The application was receiving large numbers of packets in short microbursts. The default UDP socket receive buffer was too small, so packets were being dropped at the socket layer before the application could read them.
The Fix:
Increase maximum socket buffer size:
sysctl -w net.core.rmem_max=8388608
The application was updated to request the larger socket buffer with setsockopt(SO_RCVBUF). Frame drops were eliminated.
Total troubleshooting time: 20 minutes.
The systematic approach meant I checked each stage in order. While I found the issue at Step 7, I had already validated that earlier stages (NIC, ring buffer, SoftIRQ, CPU backlog, protocol stack) were all healthy. This gave me confidence that the socket layer was the problem—not a combination of issues.
Why This Systematic Approach Works
When the network is cleared and you're troubleshooting the host, it's easy to get lost in the thousands of tunables and metrics Linux exposes.
The USE Method keeps me focused:
Utilization: Are resources busy?
Saturation: Is work queuing?
Errors: Are there failures?
I apply this at each layer—NIC hardware, ring buffer, SoftIRQ, CPU backlog, protocol stack, socket—and I find the problem.
Following the packet's actual path from the NIC through to the application provides a logical flow. Just like troubleshooting the network hop-by-hop, I'm checking each stage of the host network stack in the order packets actually traverse it.
The power of this approach:
It's methodical, so it's repeatable
It's data-driven, so it's defensible
It works across different workloads and hardware
I check each buffer and queue systematically
I validate earlier stages before moving to later ones
I'm not guessing. I'm systematically checking each stage of the packet's journey, using specific metrics at each point.
What About TCP?
Everything I covered applies to TCP as well, but TCP adds layers of complexity:
Connection state machines
Congestion control algorithms (Cubic, BBR, etc.)
Retransmission logic
Window scaling and buffering
Dozens of TCP-specific tunables
TCPExtTCPRcvCollapsed, TCPMemoryPressures, etc.
TCP stack troubleshooting deserves its own deep-dive, which I will cover in a future post.
Closing
Sometimes the network infrastructure really is fine, and the problem is in the host network stack. Socket buffer exhaustion, NIC ring buffer limits, CPU backlog drops, interrupt distribution—these are the real culprits when "everything looks clean" but performance is terrible.
I help organizations troubleshoot these complex performance issues that span network infrastructure and host configuration. If you're dealing with persistent problems that don't have obvious answers, reach out.
References
Primary Documentation:
Red Hat Enterprise Linux 9: Tuning the network performance, Red Hat, Inc. https://docs.redhat.com/en/documentation/red_hat_enterprise_linux/9/html/monitoring_and_managing_system_status_and_performance/tuning-the-network-performance_monitoring-and-managing-system-status-and-performance
The USE Method, Brendan Gregg. https://www.brendangregg.com/usemethod.html
Linux Kernel Networking Documentation, The Linux Kernel Organization. https://www.kernel.org/doc/html/latest/networking/
ss(8) Linux manual page, man7.org (The Linux man-pages project). https://man7.org/linux/man-pages/man8/ss.8.html
ethtool(8) Linux manual page, man7.org (The Linux man-pages project). https://man7.org/linux/man-pages/man8/ethtool.8.html
sysctl(8) Linux manual page, man7.org (The Linux man-pages project). https://man7.org/linux/man-pages/man8/sysctl.8.html
nstat(8) Linux manual page, man7.org (The Linux man-pages project). https://man7.org/linux/man-pages/man8/nstat.8.html