I am experiencing persistent SATA link issues with my ZFS pool, which result in errors such as “SATA link down,” “hard resetting link,” and “link is slow to respond.” These problems cause ZFS pool errors and degrade the entire pool. Initially suspecting a faulty hard drive, I replaced one, but the SATA link issues continue even during the resilvering process. Despite examining system logs, I have been unable to identify the root cause.
My system configuration includes an ASRock B450 Pro4 motherboard with BIOS version 10.41, a Ryzen 5 5600G CPU, four Seagate 4TB IronWolf HDDs (different models), two SanDisk 1TB SSDs, Proxmox VE 9.1.1 as the operating system, an Intel ARC A380 GPU for transcoding, and a BeQuiet! Power 11 Platinum 1000W power supply. The ZFS pool has been operational for two months, with intermittent issues. A month ago, I rebuilt the pool from scratch, which resolved the problem temporarily, but the errors resurfaced two weeks ago. A scrub cleared them at that time, but the issues have now returned.
I have attempted several troubleshooting steps without success. These include checking for Aggressive Link Power Management in the BIOS (the option was not present), using three separate SATA power lines for the four drives, and testing multiple SATA cable brands, such as CableMatters. The drives appear to function normally for extended periods before suddenly losing SATA connection and then recovering. Detailed logs and system information are available via the provided links, and I would appreciate any insights to help resolve these recurring errors.
That “SATA link down” error during resilvering is a real nightmare, especially after you already replaced a drive. I had similar intermittent ZFS issues that turned out to be a flaky SATA controller on an older motherboard. Since you mentioned the ARC A380, have you checked if there’s any resource conflict or unusual power draw from the GPU affecting the SATA controller’s stability?
Thanks for sharing your experience with that flaky SATA controller—it’s a great point about looking beyond the drives themselves. Since the ARC A380 is a newer component, it’s worth checking your system’s power supply headroom and also looking in your kernel logs (`dmesg`) for any PCIe or ACPI errors that might hint at a resource conflict with the chipset’s SATA controller. If you try adjusting the GPU’s power profile or testing with it temporarily removed, I’d be curious to hear if that changes the SATA link behavior.
Oof, that “SATA link down” and “hard resetting link” error cycle is a special kind of server admin nightmare, especially when a fresh pool rebuild only gave you a month of peace. I had a similar maddening issue that turned out to be a flaky SATA controller on the motherboard itself, not the drives. Since you mentioned using an ARC A380, have you checked if there’s any resource contention or a weird power state interaction between the GPU and the chipset SATA controller?
Thanks for sharing your own controller nightmare—that’s a solid point about the motherboard’s SATA controller possibly being the culprit. Since you mentioned the ARC A380, it’s worth checking if the GPU’s power management or active transcoding is interfering with the chipset’s PCIe lanes; you could try temporarily removing the GPU or adjusting PCIe power settings in the BIOS to test stability. Let me know if that changes anything, or if you’ve already explored other angles with your setup.
That “SATA link down” error during resilvering is a real gut punch, especially after you already replaced a drive. I’ve had similar phantom SATA issues on a Ryzen system that ultimately traced back to a flaky power cable for the drives. Have you tried swapping the SATA data cables and moving the drive power cables to a different port on the PSU?
Thanks for sharing that experience—the cable issue on your Ryzen system is a great reminder, and I definitely should swap both SATA data and power cables to different PSU ports to rule that out. Since the problem persists across drives, checking the power supply connections and trying a different SATA controller port on the motherboard are my next practical steps. Let me know if you have any other insights from your troubleshooting, and I’ll report back on what I find.
Oof, that “SATA link down” and “hard resetting link” error cycle is a special kind of server admin nightmare, especially after you already rebuilt the pool once. I had a similar maddening issue that turned out to be a flaky SATA controller on the motherboard itself, not the drives. Since you mentioned using an ARC A380 for transcoding, have you double-checked if there’s any PCIe lane sharing or resource conflict between the GPU and the SATA controller that might be causing the link instability?
Thanks for sharing your own experience with a flaky SATA controller—that’s a great point about potential PCIe lane conflicts, especially with the ARC A380 installed. It’s definitely worth checking the motherboard manual to see if the GPU slot shares lanes with the SATA controller, and you could try temporarily removing the GPU to see if the link errors stop. Let me know if that reveals anything, or if you’ve already tried other troubleshooting steps.
That “SATA link down” error during resilvering is a real nightmare; I had similar phantom issues that turned out to be a flaky SATA controller on an older motherboard. Your detail about the problem resurfacing two weeks after a fresh pool rebuild points to something systemic, not just a bad drive. I’d be tempted to temporarily remove the ARC A380 to rule out a PCIe lane or power interaction. Have you tried running the system with minimal hardware?
Thanks for sharing your own experience with a flaky SATA controller—that’s a solid point, especially since the issue reappeared after a fresh rebuild. I haven’t tried running the system with minimal hardware yet, but your suggestion to temporarily remove the ARC A380 to check for PCIe lane or power conflicts is a great next step I’ll definitely test. If you have any other insights from your troubleshooting, I’d be keen to hear how it went.
Try moving disks to different SATA ports to see if the issue follows the disk or stays with the port. This is easier if your pool uses disk ID or WWN naming. If the problem is with the port, consider adding an M.2 or PCIe SATA controller for more ports. Using a different SATA cable, as already suggested, is also worth trying.
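To tell whether an error follows the disk or stays with the port, it helps to know which `ataN` port from the kernel log each disk currently sits on. The sketch below is one way to get that mapping, assuming a Linux system where the resolved sysfs path of each `/sys/block/sdX` entry contains an `ataN` component (typical for libata-managed SATA disks, but not guaranteed for disks behind other controllers). The function names are made up for this illustration.

```python
import os
import re
from typing import Dict, Optional

# libata devices usually resolve to a sysfs path containing ".../ataN/hostM/..."
ATA_PORT_RE = re.compile(r"/(ata\d+)/")


def ata_port_from_sysfs_path(path: str) -> Optional[str]:
    """Return the 'ataN' component of a resolved sysfs path, or None."""
    match = ATA_PORT_RE.search(path)
    return match.group(1) if match else None


def map_disks_to_ports(sys_block: str = "/sys/block") -> Dict[str, Optional[str]]:
    """Map each sdX block device to the ata port found in its sysfs path."""
    mapping: Dict[str, Optional[str]] = {}
    if not os.path.isdir(sys_block):
        return mapping  # not a Linux sysfs layout; nothing to report
    for name in sorted(os.listdir(sys_block)):
        if not name.startswith("sd"):
            continue  # skip NVMe, loop, zram, and similar devices
        real = os.path.realpath(os.path.join(sys_block, name))
        mapping[name] = ata_port_from_sysfs_path(real)
    return mapping


if __name__ == "__main__":
    for disk, port in map_disks_to_ports().items():
        print(disk, port or "(no ata port in path)")
```

Running it before and after moving a disk shows which `ataN` number the disk ended up on, so a later "link down" line in the log can be matched to either the same disk or the same port.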
Try replacing the SATA cables, as they may be low quality, faulty, or bent.
Can you recommend a brand for a premium SATA cable? I’ve already tried three different pairs.
Some hardware implementations of link_power_management_policy have bugs. To resolve this:
1. Create the file `/etc/udev/rules.d/99-custom-powersave.rules`
2. Add the following line:
```
ACTION=="add|change", SUBSYSTEM=="scsi_host", KERNEL=="host[0-7]", TEST=="link_power_management_policy", ATTR{link_power_management_policy}="max_performance"
```
3. Run `sudo udevadm control --reload-rules && sudo udevadm trigger` to apply the changes.
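After applying the rule, it may be worth confirming that the policy actually changed on every host. The following is a minimal sketch, assuming the usual sysfs layout where each SATA host exposes `/sys/class/scsi_host/hostN/link_power_management_policy`; the function names are hypothetical. Note that the `host[0-7]` pattern in the rule matches only `host0` through `host7`, so any port on a higher-numbered host would keep its previous policy.

```python
from pathlib import Path
from typing import Dict, List


def read_lpm_policies(base: str = "/sys/class/scsi_host") -> Dict[str, str]:
    """Return {host name: current link power management policy}."""
    policies: Dict[str, str] = {}
    base_path = Path(base)
    if not base_path.is_dir():
        return policies  # sysfs not available at this path
    for host in sorted(base_path.glob("host*")):
        attr = host / "link_power_management_policy"
        if attr.is_file():  # hosts without the attribute are skipped
            policies[host.name] = attr.read_text().strip()
    return policies


def hosts_not_at_max_performance(policies: Dict[str, str]) -> List[str]:
    """List hosts whose policy is anything other than max_performance."""
    return sorted(h for h, p in policies.items() if p != "max_performance")


if __name__ == "__main__":
    current = read_lpm_policies()
    for host, policy in current.items():
        print(f"{host}: {policy}")
    leftover = hosts_not_at_max_performance(current)
    if leftover:
        print("Not at max_performance:", ", ".join(leftover))
```

If any host is still listed at the end, the rule did not reach it, and the `KERNEL` match is the first thing to check.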
I tried that, but the SATA link down issue persists. Here are the relevant log entries:
[Tue Nov 25 05:32:34 2025] ata9: SATA link down (SStatus 0 SControl 300)
[Tue Nov 25 05:32:34 2025] ata12: SATA link down (SStatus 0 SControl 300)
[Tue Nov 25 05:32:34 2025] ata11: SATA link down (SStatus 0 SControl 300)
[Tue Nov 25 05:32:34 2025] ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[Tue Nov 25 05:32:35 2025] ata2: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[Tue Nov 25 05:32:35 2025] ata5: SATA link down (SStatus 0 SControl 330)
[Tue Nov 25 05:32:35 2025] ata6: SATA link down (SStatus 0 SControl 330)
[Tue Nov 25 05:32:36 2025] ata10: SATA link down (SStatus 0 SControl 300)
Since these SATA link down errors are occurring across multiple HDDs and ports, I suspect there may be an issue with the SATA controller itself.
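To check how widespread the link drops really are, the log lines can be tallied per port and the `SStatus` value decoded. The sketch below assumes the message format shown in the excerpt above and that the kernel prints `SStatus` in hexadecimal; the field layout (DET in bits 0-3, SPD in bits 4-7, IPM in bits 8-11) follows the SATA/AHCI register definition. The function and type names are made up for this example.

```python
import re
from collections import Counter
from typing import Iterable, List, NamedTuple

LINK_RE = re.compile(
    r"(ata\d+): SATA link (up|down).*?"
    r"\(SStatus ([0-9A-Fa-f]+) SControl ([0-9A-Fa-f]+)\)"
)


class LinkEvent(NamedTuple):
    port: str   # e.g. "ata9"
    state: str  # "up" or "down"
    det: int    # device detection: 0 = no device detected, 3 = device present, link established
    spd: int    # negotiated speed generation: 3 = 6.0 Gbps
    ipm: int    # interface power state: 1 = active


def parse_link_events(lines: Iterable[str]) -> List[LinkEvent]:
    """Extract SATA link up/down events and decode the SStatus fields."""
    events: List[LinkEvent] = []
    for line in lines:
        m = LINK_RE.search(line)
        if not m:
            continue
        sstatus = int(m.group(3), 16)  # value is printed in hex
        events.append(LinkEvent(
            port=m.group(1),
            state=m.group(2),
            det=sstatus & 0xF,
            spd=(sstatus >> 4) & 0xF,
            ipm=(sstatus >> 8) & 0xF,
        ))
    return events


def count_down_events(events: Iterable[LinkEvent]) -> Counter:
    """Count 'link down' events per port."""
    return Counter(e.port for e in events if e.state == "down")


if __name__ == "__main__":
    import sys
    # Usage: dmesg -T | python3 this_script.py
    for port, n in sorted(count_down_events(parse_link_events(sys.stdin)).items()):
        print(f"{port}: {n} link-down event(s)")
```

One caveat when reading the result: a DET value of 0 means the port sees no device at all, which is also what an empty port reports when it is probed. A single "link down" per port within the same second may therefore be probing of unpopulated ports, so it is the repeated drops on ports that are known to have a disk attached that point at the controller, cabling, or power.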
It’s possible there’s an issue with the SATA controller.
Also, make sure your power supply is adequate and healthy, as an overloaded or failing supply can cause SATA link problems.
The issue is likely a failing PSU or motherboard.