# **CXL Disaggregated Memory Solution** for HPC and AI Workloads

**Jungmin Choi** 

**Memory System Architect** 





# **Growing Memory Bandwidth and Capacity Gap**

- Increasing core counts require continued increases in memory bandwidth and capacity
- The gap between these requirements and what platforms can provision is growing
- **CXL** creates new opportunities beyond physical limitations, and enables efficient memory disaggregation



**SK** hynix

### **Challenges in Today's Datacenter**



- Challenge 1: Memory stranding and data spill
  - Memory utilization of each node in a compute cluster varies over time
  - Unused memory in one node cannot be used by other nodes, which causes memory stranding and data spill
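
To make stranding and spill concrete, the following minimal sketch tallies both for a small cluster. The node names, per-node capacity, and demand figures are hypothetical values chosen for illustration, not measurements from this talk.

```python
# Hypothetical cluster: each node has a fixed amount of private local DRAM (GiB).
LOCAL_CAPACITY_GIB = 256

# Hypothetical per-node memory demand at one point in time (GiB).
demand_gib = {"node0": 100, "node1": 400, "node2": 180, "node3": 300}


def stranded_and_spilled(demand, capacity):
    """Return (stranded, spilled) totals in GiB for nodes with private memory.

    Stranded: free local memory that no other node can borrow.
    Spilled: demand exceeding local capacity, which falls back to storage swap.
    """
    stranded = sum(max(capacity - d, 0) for d in demand.values())
    spilled = sum(max(d - capacity, 0) for d in demand.values())
    return stranded, spilled


stranded, spilled = stranded_and_spilled(demand_gib, LOCAL_CAPACITY_GIB)
print(f"stranded: {stranded} GiB, spilled: {spilled} GiB")
# prints "stranded: 232 GiB, spilled: 188 GiB"
```

With these assumed numbers, the idle memory on two nodes exceeds the amount the other two nodes spill to storage, yet none of it can be borrowed.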





[Storage swap & Performance degradation]

### **Challenges in Today's Datacenter**



- Challenge 2: Data transfer overhead and data duplication
  - In a distributed computing system, transferring data between remote nodes incurs network overhead
  - Duplicating shared data across nodes increases local memory pressure



### Solution: CXL Disaggregated Memory System

- Support memory pooling and sharing with a CXL disaggregated memory system
  - Memory pooling: Mitigate memory stranding and data spill by sharing memory resources between nodes
  - Memory sharing: Remove data transfer overhead and data duplication by sharing data between nodes



[Memory Pooling] Allocate CXL memory based on memory usage for each node

[Memory Sharing] Share data objects based on zero-copy between nodes

# **CXL Disaggregated Memory Research Platform**

- Built Niagara, an FPGA-based HW/SW research platform and MH-SLD (Multi-Headed Single Logical Device) CXL disaggregated memory prototype, to explore memory pooling and sharing use cases
  - 2U memory appliance that connects up to 8 CXL host servers without a CXL switch
  - Supports up to 4 channels of DDR4 DIMMs (1TB)
  - Supports the DCD (Dynamic Capacity Device) feature defined in CXL specification 3.1
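
The dynamic-capacity idea can be pictured with a toy model of an appliance that grants and reclaims capacity per host. This is only a sketch: the class name, method names, and the 256 MiB extent size are hypothetical and do not correspond to Niagara's software or to the DCD command set in the CXL specification. Only the 8-host limit and the 1TB capacity (treated here as 1 TiB) come from the list above.

```python
class MemoryPool:
    """Toy model of a shared memory appliance that hands out extents to hosts."""

    def __init__(self, total_extents, max_hosts):
        self.free_extents = total_extents
        self.max_hosts = max_hosts
        self.allocated = {}  # host id -> number of extents currently held

    def add_capacity(self, host, extents):
        """Grant extents to a host; raise if the pool or host limit is exceeded."""
        if host not in self.allocated and len(self.allocated) >= self.max_hosts:
            raise RuntimeError("host limit reached")
        if extents > self.free_extents:
            raise MemoryError("not enough free extents in the pool")
        self.free_extents -= extents
        self.allocated[host] = self.allocated.get(host, 0) + extents

    def release_capacity(self, host, extents):
        """Return extents from a host to the pool."""
        held = self.allocated.get(host, 0)
        if extents > held:
            raise ValueError("host does not hold that many extents")
        self.free_extents += extents
        self.allocated[host] = held - extents


EXTENT_MIB = 256  # hypothetical extent granularity
TOTAL_EXTENTS = (1024 * 1024) // EXTENT_MIB  # 1 TiB expressed in extents

pool = MemoryPool(TOTAL_EXTENTS, max_hosts=8)
pool.add_capacity("host0", 1024)     # host0 grows by 256 GiB
pool.add_capacity("host1", 2048)     # host1 grows by 512 GiB
pool.release_capacity("host0", 512)  # host0 returns 128 GiB to the pool
print(pool.allocated, pool.free_extents)
```

Capacity released by one host immediately becomes available to the others, which is the property that pooling relies on.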



[Niagara HW/SW Research Platform]




[Rack-Scale System with Niagara]

#### **DCD-Enabled Infrastructure**





\* Niagara supports DCD APIs defined in the CXL specification 3.1

# **Use Case of CXL Disaggregated Memory**

Use Case 1: Memory Pooling (in collaboration with MemVerge)

- Niagara can dynamically allocate/deallocate disaggregated memory resources for each node without a reset
- Improves memory utilization and performance of a system equipped with CXL disaggregated memory



[Execution Time of CloudSuite In-Memory Analytics Benchmark; y-axis: normalized execution time vs. baseline]

Spilling to Niagara outperforms spilling to NVMe by up to 2.5x


[Memory Pooling without Workload Interruption]

# **Use Case of CXL Disaggregated Memory**

Use Case 2: Memory Sharing (in collaboration with MemVerge)

- No more object serialization and transfer over network for remote object access
- No more duplicate object copies on different nodes → zero copy
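
The difference between copy-based transfer and zero-copy sharing can be sketched on a single machine. The example below uses Python's standard `multiprocessing.shared_memory` module purely as an analogy: it shares memory between handles on one host, not between CXL hosts, and it ignores the cross-host coherence and synchronization a real shared CXL region would need. The payload and variable names are hypothetical.

```python
import pickle
from multiprocessing import shared_memory

payload = bytes(range(256)) * 16  # 4 KiB object to be shared

# Copy-based path: serialize, "transfer", and deserialize. The receiver ends
# up holding its own duplicate of the data.
wire = pickle.dumps(payload)
received = pickle.loads(wire)

# Shared path: the producer writes once into a named shared segment ...
producer = shared_memory.SharedMemory(create=True, size=len(payload))
producer.buf[:len(payload)] = payload

# ... and the consumer attaches to the same segment by name. Nothing is
# serialized, and both handles refer to the same underlying memory.
consumer = shared_memory.SharedMemory(name=producer.name)
view = consumer.buf[:len(payload)]

# A write through the producer handle is visible through the consumer view.
producer.buf[0] = 0xFF
first_byte_seen_by_consumer = view[0]
print(first_byte_seen_by_consumer)

# Release the view and both handles before removing the segment.
view.release()
consumer.close()
producer.close()
producer.unlink()
```

In the copy-based path every consumer holds a private duplicate; in the shared path there is one copy of the data regardless of how many consumers attach.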





#### **Future Work**



- Research on Memory Sub-System Architecture based on Disaggregated Memory for AI Applications
  - System benefit for AI applications such as LLM (Large Language Model) and DLRM (Deep Learning Recommendation Model)
- Research on Value-added Function for Efficient Use of Disaggregated Memory
  - Near data processing
  - Telemetry (hotness tracking, average latency and throughput)
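
As one illustration of what hotness tracking could look like in software, the sketch below counts accesses per page over an address trace and reports the most frequently touched pages. The page size, the trace, and the function name are hypothetical; the talk does not describe how such telemetry would be implemented.

```python
from collections import Counter

PAGE_SIZE = 4096  # hypothetical page granularity in bytes


def hottest_pages(addresses, top_n):
    """Count accesses per page and return the top_n (page_number, count) pairs."""
    counts = Counter(addr // PAGE_SIZE for addr in addresses)
    return counts.most_common(top_n)


# Hypothetical access trace (byte addresses).
trace = [0x0000, 0x0040, 0x1000, 0x0080, 0x2000, 0x1008, 0x0010]
print(hottest_pages(trace, 2))  # prints [(0, 4), (1, 2)]
```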

We look forward to open collaboration with industry partners to enable the HW/SW ecosystem

# **Demo in SK hynix Booth #207**

- Demonstrate allocation/deallocation of CXL disaggregated memory resources in response to dynamic changes in the memory requirements of VM servers



