The Machine (computer architecture)

From Wikipedia, the free encyclopedia

The Machine was an experimental computer built by Hewlett Packard Enterprise as part of a research project to develop a new computer architecture for servers. The design centered on a “memory-centric computing” architecture in which NVRAM replaced traditional DRAM and disks in the memory hierarchy. The NVRAM was byte-addressable and could be accessed from any CPU via a photonic interconnect.[1][2] The aim of the project was to build and evaluate this new design.

The Machine was a computer cluster with many individual nodes connected over a memory fabric. The fabric interconnect used VCSEL-based silicon photonics with a custom chip called the X1.[3] Access to memory was non-uniform and could involve multiple hops. The Machine was envisioned as a rack-scale computer, initially with 80 processors and 320 TB of fabric-attached memory, with the potential to scale across more enclosures up to 32 ZB.[4][5] The fabric-attached memory was not cache-coherent, and software had to be aware of this property.[4] Since traditional locks rely on cache coherency, hardware was added to the bridges to perform atomic operations at that level.[4] Each node also had a limited amount of local, private, cache-coherent memory (256 GB).[6][4] Storage and compute on each node had completely separate power domains.[4]

A logical diagram showing a single node in The Machine. Dozens of nodes are connected together using the backplane. The initial prototype contained DRAM, with the eventual goal of replacing it with NVRAM.

The full fabric-attached memory of The Machine was too large to be mapped into a processor's virtual address space (which was 48 bits wide[4]), so windows of the fabric-attached memory had to be mapped into processor memory. Communication between each node's SoC and the memory pool therefore went through an FPGA-based “Z-bridge” component that managed the mapping of the local SoC's memory to the fabric-attached memory.[4] The Z-bridge dealt with two kinds of addresses: 53-bit logical Z addresses and 75-bit Z addresses, which allowed addressing 8 PB and 32 ZB respectively.[4] Each Z-bridge also contained a firewall to enforce access control.[7] The interconnect protocol was developed in-house and known as Next Generation Memory Interconnect (NGMI).[4] This protocol evolved into the open Gen-Z standard.[8][9] The Z-bridge connected to the SoC using PCIe, avoiding major software changes.[9]
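The address widths quoted above can be checked with simple arithmetic, and the idea of mapping windows of a larger memory pool into a smaller local address space can be illustrated with a short sketch. The Python below is an illustration only, not a description of how the Z-bridge worked: the `WindowTable` class, its method names, the window size, and the slot count are all hypothetical, and the 320 TB figure is read as decimal terabytes, which is an assumption.

```python
# Illustrative sketch only. Bit widths come from the article text;
# everything else (class, names, window size) is a hypothetical example.
VIRTUAL_BITS = 48      # processor virtual address width
LOGICAL_Z_BITS = 53    # logical Z address width
Z_BITS = 75            # Z address width
FABRIC_MEMORY_BYTES = 320 * 10**12  # 320 TB, assumed to mean decimal terabytes


def addressable_bytes(bits: int) -> int:
    """Number of distinct byte addresses for a given address width."""
    return 1 << bits


def binary_units(n: int) -> str:
    """Format an exact multiple of 1024^k using binary unit prefixes."""
    units = ["B", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB", "ZiB", "YiB"]
    idx = 0
    while n >= 1024 and n % 1024 == 0 and idx < len(units) - 1:
        n //= 1024
        idx += 1
    return f"{n} {units[idx]}"


class WindowTable:
    """Hypothetical table mapping fixed-size local windows to fabric addresses."""

    def __init__(self, window_bits: int, slots: int) -> None:
        self.window_size = 1 << window_bits
        self.bases = [None] * slots  # fabric base address per slot, or None

    def map_window(self, slot: int, fabric_base: int) -> None:
        """Point a local window slot at a window-aligned fabric address."""
        if fabric_base % self.window_size != 0:
            raise ValueError("fabric base must be window-aligned")
        if fabric_base + self.window_size > addressable_bytes(LOGICAL_Z_BITS):
            raise ValueError("window exceeds the logical Z address space")
        self.bases[slot] = fabric_base

    def translate(self, local_addr: int) -> int:
        """Translate a local window address into a fabric address."""
        if local_addr < 0:
            raise ValueError("address must be non-negative")
        slot, offset = divmod(local_addr, self.window_size)
        if slot >= len(self.bases) or self.bases[slot] is None:
            raise LookupError("address is not inside a mapped window")
        return self.bases[slot] + offset


if __name__ == "__main__":
    for name, bits in [("virtual", VIRTUAL_BITS),
                       ("logical Z", LOGICAL_Z_BITS),
                       ("Z", Z_BITS)]:
        print(f"{name}: {bits} bits -> {binary_units(addressable_bytes(bits))}")
    # The envisioned fabric memory does not fit in a 48-bit address space.
    print(FABRIC_MEMORY_BYTES > addressable_bytes(VIRTUAL_BITS))  # True
```

In binary units the three widths come to 256 TiB, 8 PiB, and 32 ZiB, matching the 8 PB and 32 ZB figures above. Under the decimal reading, 320 TB exceeds the 48-bit space, which is consistent with the need for windowed mapping.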

A half-rack prototype of The Machine was unveiled at HPE Discover in London in 2016.[10] Each node contained ARMv8-A-based Broadcom/Cavium ThunderX2 SoCs.[11][12][3] In total there were 40 32-core SoCs.[13] Because adequate memristor-based NVRAM or phase-change memory was unavailable, the prototype used 160 TB of battery-backed DRAM.[14][12][15] Despite this setback, software architect Keith Packard said this "can be used to prove the other parts of the design before switching".[4] According to The Register, HPE's partnership with SK Hynix to develop memristor-based NVRAM ran into funding and directional problems, and HPE was working with SanDisk on Resistive RAM (ReRAM) for The Machine.[16] According to The Next Platform, HPE considered switching to Intel Optane DIMMs "when production quantities of are available on the market".[9]

The Next Platform estimated the rack prototype's power consumption at 24 kW to 36 kW.[9]
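The scale figures quoted in this article can be combined into a few derived numbers. The Python sketch below does only simple division on the values given in the text. The per-SoC averages it produces are not taken from any cited source, and the power average in particular spreads the whole rack estimate (memory, fabric, and cooling included) evenly across the SoCs, which is an assumption made for illustration.

```python
# Derived figures from numbers quoted in the article. The averages are
# illustrative arithmetic, not measurements or sourced values.
ENVISIONED_PROCESSORS = 80
ENVISIONED_MEMORY_TB = 320
PROTOTYPE_SOCS = 40
CORES_PER_SOC = 32
PROTOTYPE_MEMORY_TB = 160
POWER_ESTIMATE_KW = (24, 36)  # low and high estimates for the rack prototype

# Fabric-attached memory per processor in the envisioned design.
envisioned_tb_per_processor = ENVISIONED_MEMORY_TB / ENVISIONED_PROCESSORS

# Total cores and memory share per SoC in the 2016 prototype.
prototype_cores = PROTOTYPE_SOCS * CORES_PER_SOC
prototype_tb_per_soc = PROTOTYPE_MEMORY_TB / PROTOTYPE_SOCS

# Rack power divided evenly across SoCs (assumption: even split), in watts.
power_per_soc_w = tuple(kw * 1000 / PROTOTYPE_SOCS for kw in POWER_ESTIMATE_KW)

if __name__ == "__main__":
    print(envisioned_tb_per_processor)  # 4.0
    print(prototype_cores)              # 1280
    print(prototype_tb_per_soc)         # 4.0
    print(power_per_soc_w)              # (600.0, 900.0)
```

Both the envisioned design and the prototype work out to the same ratio of 4 TB of memory per processor.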

Software overview

History

References
