Nathan Stratton’s Homepage


Optimizing for Memory Intensive Workloads

by on Aug.20, 2013, under Hardware

Processor speed is an important factor when deciding on a new server spec. However, with virtualization and other memory intensive workloads, the memory system often has a far greater impact on performance than CPU speed. Xeon e5-2600 CPUs support 3 basic types of third generation double data rate (DDR3) memory via 4 channels, with up to 3 banks (DIMM slots) per channel on each CPU. Whether those slots are filled, and with what, is very important to understand.

Types of Memory
The basic element of each type of DIMM is the DRAM chip, which provides 4 or 8 bits of data. When ECC is used, the DRAM chips in a rank together provide 72 bits instead of 64, allowing many errors to be corrected without data loss. Since ECC functions differently with 4 bit and 8 bit chips, different DIMM types should never be mixed. The DRAMs on a DIMM are arranged in groups called ranks: groups of chips that can be accessed simultaneously by the same chip select (CS) control signal.
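The chip arithmetic behind a rank can be sketched in a few lines of Python. This is only an illustration: the 64 bit data width and the 8 extra check bits are the usual DDR3 figures (64 + 8 = 72), and the function name is made up for this example.

```python
DATA_BITS = 64  # assumption: standard DDR3 data width per rank
ECC_BITS = 8    # extra check bits on an ECC DIMM (64 + 8 = 72)


def chips_per_rank(chip_width_bits: int, ecc: bool = True) -> int:
    """Number of DRAM chips needed to build one rank from 4 or 8 bit chips."""
    if chip_width_bits not in (4, 8):
        raise ValueError("this sketch only covers 4 bit and 8 bit DRAM chips")
    total_bits = DATA_BITS + (ECC_BITS if ecc else 0)
    return total_bits // chip_width_bits


for width in (4, 8):
    print(f"{width} bit chips: {chips_per_rank(width)} per ECC rank, "
          f"{chips_per_rank(width, ecc=False)} per non-ECC rank")
```

A rank of 4 bit chips needs twice as many chips as a rank of 8 bit chips, which is one reason the two kinds of DIMM behave differently under ECC.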

UDIMM – With unregistered DIMMs, each chip on the DIMM has its data and control lines tied directly over the memory bus to the memory controller, which sits off the QPI ring integrated into each CPU. Each DRAM on this bus adds to the electrical load, and because of this load UDIMM support is limited to 2 dual-rank UDIMMs per channel. However, this direct access to the DRAMs by the memory controller allows UDIMMs to provide the fastest, lowest latency memory access of all types.

RDIMM – Registered DIMMs are the most common type in use in servers today. RDIMMs have an extra chip, a buffer that isolates the control lines between the memory controller and each DRAM chip. This buffer slightly increases latency, but allows RDIMMs to support up to quad ranks and fill all 3 DIMM banks.

LRDIMM – Load Reduced DIMMs are a relatively new type of DDR3 memory that buffers all control and data lines from the DRAM chips. This isolation decreases the electrical load on the memory controller and allows the highest memory configuration possible. Since the DRAM chips are hidden by the buffer, LRDIMMs are able to implement rank multiplication, offering the memory controller virtual ranks that may be fewer than the physical ranks on the DIMM. Hiding the physical ranks this way allows more ranks than the CPU's DDR3 memory architecture natively supports. This increased capacity comes at the price of not only speed and latency, but also increased power consumption.

Memory Speed
The clock frequency of the memory bus used to access DIMMs on the e5-2600 series is 1600, 1333, 1066, or 800 MHz (up to 1866 MHz on the new e5-2600 v2). This memory bus speed is controlled by the BIOS and is set per system. It is not possible to access memory in different banks at different speeds. The maximum memory speed on the e5-2600 series is limited by the number of banks, the ranks used, and the speed of the QPI ring. To support full 1600 MHz memory, an 8.0 GT/s QPI is required, something that is not available on the standard or basic e5-2600 processors.

The e5-2600 supports up to 8 physical ranks per channel. DIMMs with single, dual, or quad ranks can be used; however, quad rank DIMMs lower the clock frequency of the memory bus. The more ranks available to a channel, the more parallelism the memory controller can exploit, which increases memory performance, so dual ranked DIMMs should be used if possible.

While the e5-2600 can physically support 3 banks of memory, only 2 banks can be used at 1600 MHz. If all 3 banks are used, the maximum clock frequency supported on the memory bus is 1066 MHz. Although fully populating the channels is not required on the e5-2600, it is highly recommended that all 4 are populated, if possible with two DIMMs in each channel to increase the ranks available on each channel.
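To make the trade-off between banks and speed concrete, here is a rough Python sketch of the population rules described in this section. The function names are made up for illustration, and the bandwidth figure rests on two assumptions that are not spelled out in the text: each channel moves 8 bytes (64 bits) per transfer, and the quoted bus speeds are treated as millions of transfers per second. It ignores the QPI limit on basic processors and the speed drop from quad rank DIMMs.

```python
CHANNELS_PER_CPU = 4       # from the article
MAX_RANKS_PER_CHANNEL = 8  # from the article
BYTES_PER_TRANSFER = 8     # assumption: 64 bit data path per channel


def max_bus_speed(banks_used: int) -> int:
    """Highest bus speed quoted above for a given number of banks in use."""
    if banks_used not in (1, 2, 3):
        raise ValueError("the e5-2600 has 1 to 3 banks per channel")
    return 1066 if banks_used == 3 else 1600


def ranks_fit(banks_used: int, ranks_per_dimm: int) -> bool:
    """Check a channel against the 8 physical ranks per channel limit."""
    return banks_used * ranks_per_dimm <= MAX_RANKS_PER_CHANNEL


def peak_bandwidth_gb_s(bus_speed: int, channels: int = CHANNELS_PER_CPU) -> float:
    """Theoretical peak bandwidth per CPU in GB/s (not a measured figure)."""
    return channels * BYTES_PER_TRANSFER * bus_speed / 1000


for banks in (1, 2, 3):
    speed = max_bus_speed(banks)
    print(f"{banks} bank(s): {speed} MHz bus, "
          f"{peak_bandwidth_gb_s(speed):.1f} GB/s peak per CPU")
```

Under these assumptions, filling the third bank trades roughly a third of the theoretical peak bandwidth for the extra capacity.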

Column Address Strobe (CAS) latency with DDR3 is the number of clock cycles between the moment the memory controller requests access to a DRAM and the moment that data is available from the DRAM chip on the DIMM. When searching for memory, particularly lower cost UDIMMs, pay attention to the CAS latency and stay away from anything with a CAS over 9.
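Because CAS latency is counted in clock cycles, the same CAS number means a different wait at different bus speeds. The sketch below converts cycles to nanoseconds. It assumes the usual DDR convention that the quoted speed (for example 1600) is the data rate and the actual clock runs at half of that; the function name is made up for this example.

```python
def cas_latency_ns(cas_cycles: int, bus_speed: int) -> float:
    """Convert a CAS latency in cycles to nanoseconds.

    Assumption: bus_speed is the DDR data rate, so the clock that CAS
    cycles are counted against runs at bus_speed / 2 MHz.
    """
    clock_mhz = bus_speed / 2
    return cas_cycles / clock_mhz * 1000


for cas, speed in ((9, 1600), (11, 1600), (9, 1333), (7, 1066)):
    print(f"CAS {cas} at {speed}: {cas_latency_ns(cas, speed):.2f} ns")
```

A slower DIMM with a lower CAS number is not automatically quicker to respond, so it is worth comparing the result in nanoseconds rather than the raw cycle count.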

Maximum Performance
Bottom line… If you want the maximum memory performance, use 16 sticks (the first two banks of all 4 channels on each of two CPUs) of 1600 MHz 1.5 volt dual rank UDIMMs with the lowest CAS latency possible. ECC DIMMs are also preferred but not required by the e5-2600.
