PCIe Passthrough: NIC Name Instability and MAC Pinning

My network configuration broke after a simple reboot, and the culprit was a NIC that decided to change its name from enp4s0 to enp5s0 for no apparent reason.

I had a VM configured for PCIe passthrough to handle high-throughput traffic, and everything was working perfectly. Then I performed a scheduled maintenance reboot on the Proxmox host. When the system came back up, the VM failed to start because the host’s network bridge was looking for an interface that no longer existed. The hardware hadn’t changed, the cables hadn’t moved, but the kernel had shifted the bus numbering.

What I expected

I expected that once I assigned a physical PCIe NIC to a VM or pinned it to a specific bridge on the host, that identity would be immutable. In a sane world, a piece of hardware plugged into a specific physical slot should maintain a consistent identifier. I assumed the Predictable Network Interface Names scheme that systemd and udev apply on modern Linux distributions was doing exactly what it claimed to do: making names predictable.

What actually happened

Predictable naming is only predictable if the underlying PCI topology is static. It isn’t.

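The root cause is that udev derives names like enp4s0 from the device’s PCI location: p4 is bus number 4 and s0 is slot 0. Bus numbers are handed out when the PCI hierarchy is enumerated at boot, so if the BIOS or a PCIe switch enumerates the bus differently during POST, the bus number shifts and the name shifts with it. This is common in systems with multi-root PCIe topologies or when you’re using PCIe switches to expand the number of devices.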
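To make the mapping concrete, here is a minimal sketch of how a PCI address turns into a path-based name. The function name pci_to_ifname is my own, and the logic is deliberately simplified: the real scheme also appends f<function> for multi-function devices, prefixes P<domain> for non-zero PCI domains, and may prefer onboard or hotplug-slot names (eno, ens) when the firmware provides them.

```sh
# Simplified illustration only, not the full udev naming logic.
# Input:  PCI address as printed by lspci -D, e.g. 0000:05:00.0
# Output: the path-based Ethernet name for that bus and slot
pci_to_ifname() {
    addr=$1               # domain:bus:slot.function
    rest=${addr#*:}       # drop the domain      -> 05:00.0
    bus=${rest%%:*}       # bus number (hex)     -> 05
    devfn=${rest#*:}      # slot.function        -> 00.0
    slot=${devfn%%.*}     # slot number (hex)    -> 00
    # lspci prints hex; interface names use decimal
    printf 'enp%ds%d\n' "$((0x$bus))" "$((0x$slot))"
}

pci_to_ifname 0000:05:00.0   # prints enp5s0
```

The same card at 0000:04:00.0 yields enp4s0, which is exactly the rename I hit: nothing about the NIC changed, only the bus number it was enumerated under.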

I’ve seen this happen after a BIOS update, after adding a new NVMe drive, or even after a random power cycle. The hardware is the same, but the kernel sees it at a different address. If you’re using a standard /etc/network/interfaces file or a Netplan config that references enp4s0, your networking dies the moment that name shifts to enp5s0.

This is the same flavor of instability I ran into with GPUs, which I covered in my post on GPU PCI Address Instability. While a shifting GPU address is a nuisance for VM config, a shifting NIC name is a catastrophic failure because it kills your management access to the node.

The Fix: MAC Pinning

The only way to ensure a NIC keeps the same name regardless of where it sits on the PCI bus is to stop relying on the bus path and start relying on the MAC address. The MAC address is burned into the hardware and doesn’t care about PCIe switches or BIOS enumeration.

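I solved this using systemd .link files. A .link file tells udev: “Whenever you see a device with this specific MAC address, call it eth0 (or whatever name I want), regardless of where it is on the PCI bus.” One caveat: the systemd documentation warns that names the kernel itself hands out, such as eth0, can race with the kernel’s own assignment, so a custom name like lan0 is the safer choice if you are starting fresh.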

1. Identify your MAC address

First, find the MAC of the unstable interface. I used ip link show while the system was still up.

ip link show
# Look for the line: link/ether xx:xx:xx:xx:xx:xx
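If the host has several NICs, it helps to see name, MAC, and PCI address side by side. The sketch below reads them from sysfs; the function name list_physical_nics and the optional root argument (there so it can be pointed at a test directory) are my own additions. It assumes readlink -f is available, which is the case on Proxmox/Debian.

```sh
# Print "name mac pci-address" for every interface backed by a device.
list_physical_nics() {
    root=${1:-/sys/class/net}
    for dev in "$root"/*; do
        # Purely virtual interfaces (lo, bridges, bonds) have no "device" link
        [ -e "$dev/device" ] || continue
        printf '%s %s %s\n' \
            "${dev##*/}" \
            "$(cat "$dev/address")" \
            "$(basename "$(readlink -f "$dev/device")")"
    done
}

list_physical_nics
```

For a PCI NIC the last column is the PCI address; for other bus types (USB adapters, for example) it will be whatever identifier that bus uses.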

2. Create the .link file

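I created a configuration file in /etc/systemd/network/. Link files are evaluated in lexical order and the first match wins, so I chose the name 10-lan.link to make sure it sorts ahead of the default 99-default.link that ships with systemd.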

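# /etc/systemd/network/10-lan.link
[Match]
MACAddress=xx:xx:xx:xx:xx:xx
# Match only the Ethernet device itself; a bridge built on this NIC can inherit the same MAC
Type=ether

[Link]
Name=eth0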
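With more than one NIC to pin, writing these files by hand gets tedious. Here is a minimal sketch of a generator, assuming a helper I am calling write_link_file that takes a MAC, a target name, and an optional output directory. The MAC check is only a loose shape test, and the Type=ether line is an assumption on my part that the NIC is a plain Ethernet device.

```sh
# Write a MAC-pinning .link file. Usage: write_link_file MAC NAME [DIR]
write_link_file() {
    mac=$1
    name=$2
    dir=${3:-/etc/systemd/network}
    # Loose sanity check: six colon-separated pairs of characters
    case $mac in
        ??:??:??:??:??:??) ;;
        *) echo "invalid MAC address: $mac" >&2; return 1 ;;
    esac
    cat > "$dir/10-$name.link" <<EOF
[Match]
MACAddress=$mac
Type=ether

[Link]
Name=$name
EOF
}
```

Running write_link_file aa:bb:cc:dd:ee:ff lan0 as root would produce /etc/systemd/network/10-lan0.link. Before rebooting, udevadm test-builtin net_setup_link /sys/class/net/<current-name> should show which link file udev would apply to the interface.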

3. Update the network configuration

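Once the link file is in place, udev renames the interface to eth0 early in boot, before the networking service starts. On Proxmox it is also worth refreshing the initramfs (update-initramfs -u -k all) after adding or changing a link file, since link files are copied into it. I then updated my Proxmox network configuration to use this stable name.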

# Edit /etc/network/interfaces
auto eth0
iface eth0 inet manual

auto vmbr0
iface vmbr0 inet static
    address 10.0.0.x/24
    gateway 10.0.0.1
    bridge-ports eth0
    bridge-stp off
    bridge-fd 0

After a reboot, the NIC was consistently named eth0, and the bridge came up without a hitch.

Dealing with the “D3cold” Trap

While fixing the naming was the immediate win, I noticed that some of my passthrough devices were becoming unresponsive after these shifts. If you’re passing through high-end NICs or GPUs, you might hit the D3cold power state issue.

Some devices enter a deep sleep state (D3cold) where they essentially turn off. If the device is in D3cold when the host tries to reset it for passthrough, the reset can fail, and the device becomes “bricked” until a full cold boot (power cycle) occurs. I’ve written about this in more detail in GPU D3cold Power States, but it applies to PCIe NICs as well.

To prevent this on the host, I disable D3cold for the specific device address:

# Replace 0000:08:00.0 with your actual PCI address
echo 0 > /sys/bus/pci/devices/0000:08:00.0/d3cold_allowed

Since this command needs to run at boot, I added it to a simple systemd one-shot service. It’s a small detail, but it prevents the “why is my NIC missing from lspci?” panic after a warm reboot.
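The script behind that one-shot service can be as small as the sketch below. The function name disable_d3cold and the optional sysfs root argument are my own; the second argument exists only so the function can be exercised against a scratch directory instead of the real /sys.

```sh
# Disable D3cold for one PCI device. Usage: disable_d3cold ADDR [SYSFS_ROOT]
disable_d3cold() {
    addr=$1
    root=${2:-/sys/bus/pci/devices}
    knob="$root/$addr/d3cold_allowed"
    # Fail loudly if the address is wrong or we are not running as root
    if [ ! -w "$knob" ]; then
        echo "d3cold_allowed missing or not writable for $addr" >&2
        return 1
    fi
    echo 0 > "$knob"
}
```

A unit with Type=oneshot and an ExecStart= pointing at a script that calls disable_d3cold 0000:08:00.0 is enough to apply it on every boot. Keep in mind that the PCI address itself can move for the same reasons the NIC name did, so recheck it after any hardware or BIOS change.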

Proxmox 8.4+ and the Q35 Requirement

If you are passing this NIC (or any PCIe device) into a VM, the machine type matters. i440fx has long been the default machine type in Proxmox, but the pcie=1 option is only available on q35, so as of Proxmox 8.4 I treat q35 as practically mandatory for stable PCIe passthrough.

The i440fx chipset is ancient and only emulates a conventional PCI bus, so a passed-through device cannot be presented to the guest as a real PCIe device. This affects how the guest sees the device, not how the host enumerates its own bus, but if you’re still using i440fx you’re essentially asking for guest-side initialization problems on top of everything else.

I updated my VM config using the CLI to ensure it’s using q35 and that the device is marked as a PCIe device:

# Set machine type to q35 and pass through the device
qm set <vmid> --machine q35 --hostpci0 <pci_address>,pcie=1

The pcie=1 flag is the part people usually forget. Without it, the VM treats the device as a legacy PCI device, which can cause performance degradation or initialization failures.
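Since pcie=1 is so easy to forget, a quick lint over the VM config catches it. This is a rough sketch, assuming the usual key: value layout of the files under /etc/pve/qemu-server/ (machine: q35, hostpci0: <address>,pcie=1). The function name check_passthrough_conf is my own, and it scans the whole file, so snapshot sections in the same file are checked too.

```sh
# Return 0 if every hostpci entry is paired with q35 and pcie=1.
check_passthrough_conf() {
    conf=$1
    # No passthrough devices configured: nothing to check
    grep -q '^hostpci[0-9]*:' "$conf" || return 0
    if ! grep -q '^machine:.*q35' "$conf"; then
        echo "$conf: machine type is not q35" >&2
        return 1
    fi
    # grep -v selects hostpci lines that lack pcie=1
    if grep '^hostpci[0-9]*:' "$conf" | grep -qv 'pcie=1'; then
        echo "$conf: hostpci entry without pcie=1" >&2
        return 1
    fi
    return 0
}
```

Looping it over /etc/pve/qemu-server/*.conf gives a one-line sanity check for the whole node.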

Why this matters for production homelabs

If you’re just running a few containers on a single node, you’ll probably never encounter this. But once you move to a multi-node cluster with bare-metal Kubernetes and dedicated hardware for networking or AI (like I do for my production homelab setup), stability is everything.

When you’re automating infrastructure with tools like OpenTofu or ArgoCD, you can’t have a node go offline because a NIC decided to rename itself. It breaks the GitOps loop. You end up spending your Saturday morning in the IPMI console instead of improving your agent orchestration pipelines.

Lessons Learned

The biggest takeaway here is that “Predictable Network Interface Names” are a lie if your hardware topology is dynamic. They are predictable based on the path, not the device.

If you are doing PCIe passthrough or using a complex PCIe switch setup, do not trust enpXsY names. Pin your interfaces by MAC address. It takes five minutes to set up a .link file, and it saves you from the nightmare of a node that refuses to join the cluster after a reboot.

Here is the checklist I use now for any new PCIe device added to the cluster:

MAC Pinning: Create a .link file for any NIC that will be used in a bridge.
Machine Type: Ensure VMs use q35 with pcie=1.
Power States: Check if the device is prone to D3cold issues and disable it if necessary.
BIOS Check: Ensure IOMMU (VT-d or AMD-Vi) is enabled, and if you pass through a GPU, set “Primary Display” to the onboard VGA so the host does not claim that GPU during POST.
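For the IOMMU item, the quickest host-side check I know is whether the kernel has populated any IOMMU groups. A minimal sketch, with a function name (iommu_group_count) and optional root argument of my own choosing:

```sh
# Count IOMMU groups; zero means the IOMMU is off or not enabled in the kernel.
iommu_group_count() {
    root=${1:-/sys/kernel/iommu_groups}
    count=0
    for g in "$root"/*; do
        # An unmatched glob stays literal and fails the -d test
        [ -d "$g" ] && count=$((count + 1))
    done
    echo "$count"
}

iommu_group_count
```

A result of 0 usually means the BIOS setting or the kernel command line needs another look before any passthrough will work.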

If you’re building out similar high-performance infrastructure and want to avoid these kinds of hardware traps, I offer consulting on infrastructure and predictive maintenance to help get these systems production-ready.
