From 32 to 2 ports: Ideal SATA/SAS Controllers for ZFS & Linux MD RAID

Absolutely fantastic post on controllers for home storage systems can be found here

I need a lot of reliable and cheap storage space (media collection, backups). Hardware RAID tends to be expensive and clunky. I recognize quite a few advantages in ZFS on Solaris/FreeBSD, and Linux MD RAID:

  • Performance. In many cases they are as fast as hardware RAID, and sometimes faster because the OS is aware of the RAID layout and can optimize I/O patterns for it. Indeed, even the most compute intensive RAID5 or 6 parity calculations take negligible CPU time on a modern processor. For a concrete example, Linux 2.6.32 on a Phenom II X4 945 3.0GHz computes RAID6 parity at close to 8 GB/s on a single core (check dmesg: “raid6: using algorithm sse2x4 (7976 MB/s)”). So achieving a throughput of 500 MB/s on a Linux MD raid6 array requires spending less than 1.5% CPU time computing parity. Now regarding the optimized I/O patterns, here is an interesting anecdote: one of the steps that Youtube took in its early days to scale their infrastructure up was to switch from hardware RAID to software RAID on their database server. They noticed a 20-30% increase in I/O throughput. Watch Seattle Conference on Scalability: YouTube Scalability @ 34’50”.
  • Scalability. ZFS and Linux MD RAID allow building arrays across multiple disk controllers, or multiple SAN devices, alleviating throughput bottlenecks that can arise on PCIe links, or GbE links. Whereas hardware RAID is restricted to a single controller, with no room for expansion.
  • Reliability. No hardware RAID = one less hardware component that can fail.
  • Ease of recoverability. The data can be recovered by putting the disks in any server. There is no reliance on a particular model of RAID controller.
  • Flexibility. It is possible to create arrays on any disk on any type of controller in the system, or to move disks from one controller to another.
  • Ease of administration. There is only one software interface to learn: zpool(1M) or mdadm(8). No need to install proprietary vendor tools, or to reboot into BIOSes to manage arrays.
  • Cost. Obviously cheaper since there is no hardware RAID controller to buy.

Consequently, many ZFS and Linux MD RAID users, such as me, look for non-RAID controllers that are simply reliable, fast, cheap, and otherwise come with no bells and whistles. Most motherboards have up to 4 or 6 onboard ports (be sure to always enable AHCI mode in the BIOS as it is the best designed hardware interface that a chip can present to the OS to enable maximum performance), but for more than 4 or 6 disks, there are surprisingly not that many choices of controllers. Over the years, I have spent quite some time on the controllers manufacturers’ websites, the LKML, linux-ide and ZFS mailing lists, and have established a list of SATA/SAS controllers that are ideal for ZFS or Linux MD RAID. I also included links to online retailers because some of these controllers are not that easy to find online.

The reason the list contains SAS controllers is because they are just as good as an option as SATA controllers: many of them are as inexpensive as SATA controllers (even though they target the enterprise market), they are fully compatible with SATA 3Gbps and 6Gbps disks, and they support all the usual features: hotplug, queueing, etc. A SAS controller typically present SFF-8087 connectors, also known as internal mini SAS, or even iPASS connectors. Up to 4 SATA drives can be connected to such a connector with an SFF-8087 to 4xSATA forward breakout cable (as opposed to reverse breakout). This type of cable usually sells for $15-30. Here are a few links if you have trouble finding them.

There are really only 4 significant manufacturers of discrete non-RAID SATA/SAS controller chips on the market: LSI, Marvell, JMicron, and Silicon Image. Controller cards from Adaptec, Areca, HighPoint, Intel, Supermicro, Tyan, etc, most often use chips from one of these 4 manufacturers.

Here is my list of non-RAID SATA/SAS controllers, from 16-port to 2-port controllers, with the kernel driver used to support them under Linux, and Solaris. There is also limited information on FreeBSD support. I focused on native PCIe controllers only, with very few PCI-X (actually only 1 very popular: 88SX6081). The MB/s/port number in square brackets indicates the maximum practical throughput that can be expected from each SATA port, assuming concurrent I/O on all ports, given the bottleneck of the host link or bus (PCIe or PCI-X). I assumed for all PCIe controllers that only 60-70% of the maximum theoretical PCIe throughput can be achieved, and for all PCI-X controllers that only 80% of the maximum theoretical PCI-X throughput can be achieved on this bus. These assumptions concur with what I have seen in real world benchmarks assuming a Max_Payload_Size setting of either 128 or 256 bytes for PCIe (a common default value), and a more or less default PCI latency timer setting for PCI-X. As of May 2010, modern disks can easily reach 120-130MB/s of sequential throughput at the beginning of the platter, so avoid controllers with a throughput of less than 150MB/s/port if you want to reduce the possibility of bottlenecks to zero.

32 ports

  • [SAS] 4 x switched Marvell 88SE9485, 6Gbps, PCIe (gen2) x16 [150-175MB/s/port]
    [Update 2011-09-29: Availability: $850 $850. This is a HighPoint HBA combining 4 x 8-port Marvell 88SE9485 with PCIe switching technology: RocketRAID 2782.]

    • Linux/Solaris/FreeBSD support: see Marvell 88SE9485 or 88SE9480 below

24 ports

  • [SAS] 3 x switched Marvell 88SE9485, 6Gbps, PCIe (gen2) x16 [200-233MB/s/port]
    [Update 2011-09-29: Availability: $540 $620. This is a HighPoint HBA combining 3 x 8-port Marvell 88SE9485 with PCIe switching technology: RocketRAID 2760A.]

    • Linux/Solaris/FreeBSD support: see Marvell 88SE9485 or 88SE9480 below

16 ports

  • [SAS] LSI SAS2116, 6Gbps, PCIe (gen2) x8 [150-175MB/s/port]
    Availability: $400 $510. LSI HBA based on this chip: LSISAS9200-16e, LSISAS9201-16i. [Update 2010-10-27: only the model with external ports used to be available but now the one with internal ports is available and less expensive.]

  • [SAS] 2 x switched Marvell 88SE9485, 6Gbps, PCIe (gen2) x16 [300-350MB/s/port]
    [Update 2011-09-29: Availability: $450 $480. This is a HighPoint HBA combining 3 x 8-port Marvell 88SE9485 with PCIe switching technology: RocketRAID 2740 and 2744.]

    • Linux/Solaris/FreeBSD support: see Marvell 88SE9485 or 88SE9480 below

8 ports

  • [SAS] Marvell 88SE9485 or 88SE9480, 6Gbps, PCIe (gen2) x8 [300-350MB/s/port]
    Availability: $280. [Update 2011-07-01: Supermicro HBA based on this chip: AOC-SAS2LP-MV8]. Areca HBA based on the 9480: ARC-1320. HighPoint HBA based on the 9485: RocketRAID 272x. Lots of bandwidth available to each port. However it is currently not supported by Solaris. I would recommend the LSI SAS2008 instead, which is cheaper, better supported, and provides just as much bandwidth.

    • Linux support: mvsas (94xx: 2.6.31+, ARC-1320: 2.6.32+)
    • Solaris support: not supported (see 88SE6480)
    • Mac OS X support: [Update 2014-06-26: the only 88SE9485 or 88SE9480 HBAs supported by Mountain Lion (10.8) and up seem to be HighPoint HBAs]
  • [SAS] LSI SAS2008, 6Gbps, PCIe (gen2) x8 [300-350MB/s/port]
    Availability: $130 $140 $180 $220 $220 $230 $240 $290. [Update 2010-12-21: Intel HBA based on this chip: RS2WC080]. Supermicro HBAs based on this chip: AOC-USAS2-L8i AOC-USAS2-L8e (these are 2 “UIO” cards with the electronic components mounted on the other side of the PCB which may not be mechanically compatible with all chassis). LSI HBAs based on this chip: LSISAS9200-8e LSISAS9210-8i LSISAS9211-8i LSISAS9212-4i4e. Lots of bandwidth per port. Good Linux and Solaris support.

  • [SAS] LSI SAS1068E, 3Gbps, PCIe (gen1) x8 [150-175MB/s/port]
    Availability: $110 $120 $150 $150. Intel HBAs based on this chip: SASUC8I. Supermicro HBAs based on this chip: AOC-USAS-L8i AOC-USASLP-L8i (these are 2 “UIO” cards – see warning above.) LSI HBAs based on this chip: LSISAS3081E-R LSISAS3801E. Can provide 150-175MB/s/port of concurrent I/O, which is good enough for HDDs (but not SSDs). Good Linux and Solaris support. This chip is popular because it has very good Solaris support and was chosen by Sun for their second generation Sun Fire X4540 Server “Thumper”. However, beware, this chip does not support drives larger than 2TB.

    • Linux support: mptsas
    • Solaris support: mpt
    • FreeBSD support: mpt (supported at least since 7.3)
  • [SATA] Marvell 88SX6081, 3Gbps, PCI-X 64-bit 133MHz [107MB/s/port]
    Availability: $100. Supermicro HBAs based on this chip: AOC-SAT2-MV8 Based on PCI-X, which is an aging technology being replaced with PCIe. The approximate 107MB/s/port of concurrent I/O it supports is a bottleneck with modern HDDs. However this chip is especially popular because it has very good Solaris support and was chosen by Sun for their first generation Sun Fire X4500 Server “Thumper”.

    • Linux support: sata_mv (no suspend support)
    • Solaris support: marvell88sx
    • FreeBSD support: ata (supported at least since 7.0, if the hptrr driver is commented out)
  • [SAS] Marvell 88SE6485 or 88SE6480, 3Gbps, PCIe (gen1) x4 [75-88MB/s/port]
    Availability: $100. Supermicro HBAs based on this chip: AOC-SASLP-MV8. The PCIe x4 link is a bottleneck for 8 drives, restricting the concurrent I/O to 75-88MB/s/port. A better and slightly more expensive alternative is the LSI SAS1068E.

4 ports

  • [SAS] LSI SAS2004, 6Gbps, PCIe (gen2) x4 [300-350MB/s/port]
    Availability: $160. LSI HBA based on this chip: LSISAS9211-4i. Quite expensive; I would recommend buying a (cheaper!) 8-port controller.

  • [SAS] LSI SAS1064E, 3Gbps, PCIe (gen1) x8 [300-350MB/s/port]
    Availability: $120 $130. Intel HBA based on this chip: SASWT4I. [Update 2010-10-27: LSI HBA based on this chip: LSISAS3041E-R.] It is quite expensive. [Update 2014-12-04: And it does not support drives larger than 2TB.] For these reasons, I recommend instead buying a cheaper 8-port controller.

    • Linux support: mptsas
    • Solaris support: mpt
    • FreeBSD support: mpt (supported at least since 7.3)
  • [SAS] Marvell 88SE6445 or 88SE6440, 3Gbps, PCIe (gen1) x4 [150-175MB/s/port]
    Availability: $80. Areca HBA based on the 6440: ARC-1300. Adaptec HBA based on the 6440: ASC-1045/1405. Provides good bandwidth at a decent price.

    • Linux support: mvsas (6445: 2.6.25 or 2.6.31 ?, 6440: 2.6.25+, ARC-1300: 2.6.32+)
    • Solaris support: not supported (see 88SE6480)
  • [SATA] Marvell 88SX7042, 3Gbps, PCIe (gen1) x4 [150-175MB/s/port]
    Availability: $70. Adaptec HBA based on this chip: AAR-1430SA. Rosewill HBA based on this chip: RC-218. This is the only 4-port SATA controller supported by Linux providing acceptable throughput to each port. [2010-05-30 update: I bought one for $50 from Newegg in October 2009. Listed at $70 when I wrote this blog. Currently out of stock and listed at $90. Its popularity is spreading…]

  • [SAS] Marvell 88SE6340, 3Gbps, PCIe (gen1) x1 [38-44MB/s/port]
    Hard to find. Only found references to this chip on Marvell’s website. Performance is low anyway (38-44MB/s/port).

    • Linux support: mvsas
    • Solaris support: not supported (see 88SE6480)
  • [SATA] Marvell 88SE6145 or 88SE6141, 3Gbps, PCIe (gen1) x1 [38-44MB/s/port]
    Hard to find. Chip seems to be mostly found on motherboards for onboard SATA. Performance is low anyway (38-44MB/s/port).

    • Linux support: ahci
    • Solaris support: ahci
    • FreeBSD support: ahci

2 ports

  • [SATA] Marvell 88SE9128 or 88SE9125 or 88SE9120, 6Gbps, PCIe (gen2) x1 [150-175MB/s/port]
    Availability: $25 $35. HighPoint HBA based on this chip: Rocket 620. LyCOM HBA based on this chip: PE-115. Koutech HBA based on this chip: PESA230. This is the only 2-port chip on the market with no bottleneck caused by the PCIe link at Max_Payload_Size=128. Pretty surprising that it is being sold for such a low price.

    • Linux support: ahci
    • Solaris support: not supported [Update 2010-09-21: Despite being AHCI-compliant, this series of chips seems unsupported by Solaris according to reader comments, see below.]
    • FreeBSD support: ahci
  • [SATA] Marvell 88SE6121, 3Gbps, PCIe (gen1) x1 [75-88MB/s/port]
    Hard to find. Chip seems to be mostly found on motherboards for onboard SATA.

    • Linux support: ahci
    • Solaris support: ahci
    • FreeBSD support: ahci
  • [SATA] JMicron JMB362 or JMB363 or JMB366, 3Gbps, PCIe (gen1) x1 [75-88MB/s/port]
    Availability: $22.

    • Linux support: ahci
    • Solaris support: ahci
    • FreeBSD support: ahci
  • [SATA] SiI3132, 3Gbps, PCIe (gen1) x1 [75-88MB/s/port]
    Availability: $20. Warning: the overall bottleneck of the PCIe link is 150-175MB/s, or 75-88MB/s/port, but the chip has a 110-120MB/s bottleneck per port. So a single SATA device on a single port cannot fully use the 150-175MB/s by itself, it will be bottlenecked at 110-120MB/s.

Finding cards based on these controller chips can be surprisingly difficult (I have had to zoom on product images on newegg.com to read the inscription on the chip before buying), hence the reason I included some links to online retailers.

For reference, the maximum practical throughputs per port I assumed have been computed with these formulas:

  • For PCIe gen2: 300-350MB/s (60-70% of 500MB/s) * pcie-link-width / number-of-ports
  • For PCIe gen1: 150-175MB/s (60-70% of 250MB/s) * pcie-link-width / number-of-ports
  • For PCI-X 64-bit 133MHz: 853MB/s (80% of 1066MB/s) / number-of-ports

To anyone building ZFS or Linux MD RAID storage servers, I recommend to first make use of all onboard AHCI ports on the motherboard. Then put any extra disks on a discrete controller, and I recommend specifically these ones:

  • For a 2-port controller: Marvell 88SE9128 or 88SE9125 or 88SE9120. I do not primarily recommend it because it is SATA 6Gbps, but because it supports PCIe gen2, which allows the controller to handle an overall throughput of at least 300-350MB/s, or 150-175MB/s/port, with a default PCIe Max_Payload_Size setting of 128 bytes. It is also fully AHCI compliant, in other words robust, well-designed, and virtually compatible with all operating systems; a notable exception is Solaris for which I recommend instead the next best controller: JMicron JMB362 or JMB363 or JMB366. The icing on the cake is that cards using these chips are inexpensive (starting from $22, or $11/port).
  • For an 8-port controller: LSI SAS1068E if you are fine with it only supporting drives up to 2TB. Controllers based on this chip can be found inexpensively (starting from $110, or $13.75/port) and are supported out of the box by many current and older Linux and Solaris versions. In fact this chip is the one that Sun used in their second generation Sun Fire X4540 Server “Thumper”. The fact that it can support up to 150-175MB/s/port due to the PCIe bottleneck with concurrent I/O on all ports is sufficient for current disks. However if you need more throughput (eg. are using SSDs), or need to use drives larger than 2TB, then go for its more expensive successor, LSI SAS2008, which supports PCI gen2, which should allow for 300-350MB/s/port before hitting the PCIe bottleneck.