

## You Don't Know 'Jack': CXL Fabric Orchestration and Management

Grant Mackey – Jackrabbit Labs









The open-source software services company for shared memory management





### Why is Open-Source Software Needed for CXL?

- Fragmentation slows adoption
- … standardized

Software Stack

- App Plugins
- **Management API**
- Management Libraries
  - Host Agent(s)
- In-band Management Protocol
- **Management Endpoints** 
  - **SwitchOS**
- Hardware Abstraction Layer



The CXL spec contains a Fabric Management API, but FMAPI is not orchestration!

- FMAPI is just an API to complete actions on the fabric, not a tool to manage state
- The number of command sets grows quickly with major version updates

\*There is presently no mechanism to acknowledge a set-state command; the orchestrator has to verify the result explicitly.
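The footnote above implies a set-then-read-back pattern on the orchestrator side. A minimal sketch of that pattern follows; `FakeFabric`, `send_bind`, and `query_binding` are hypothetical stand-ins invented for illustration and are not part of the CXL FM API or of Jack.

```python
import time
from dataclasses import dataclass, field


@dataclass
class FakeFabric:
    """Hypothetical stand-in for a switch reached through the FM API."""
    bindings: dict = field(default_factory=dict)  # vPPB id -> physical port id
    drop_next: bool = False                       # simulate a command that silently fails

    def send_bind(self, vppb: int, port: int) -> None:
        # Fire-and-forget: nothing returned here confirms the state changed.
        if self.drop_next:
            self.drop_next = False
            return
        self.bindings[vppb] = port

    def query_binding(self, vppb: int):
        return self.bindings.get(vppb)


def bind_and_verify(fabric, vppb: int, port: int, retries: int = 3, delay_s: float = 0.0) -> bool:
    """Issue a bind, then read the state back until it matches or retries run out."""
    for _ in range(retries):
        fabric.send_bind(vppb, port)
        if fabric.query_binding(vppb) == port:
            return True
        time.sleep(delay_s)
    return False
```

The point of the sketch is that the verification step lives in the orchestrator, since the command itself carries no confirmation.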

## This is the CXL FMAPI slide: An API is not a platform

| CXL 3.1 | CXL 2.0 |
|---------|---------|
| ✓+      | ✓       |
| ✓       | ✗       |



## Jack – CXL Fabric Management CLI Tool


### Implements the CXL Fabric Management API

### **CXL Enabled Hosts**



`# jack show port`

| # | @ | Port State | Type   | LD | Ver | CXL Ver | MLW | NLW | MLS | CLS | Speeds |
|---|---|------------|--------|----|-----|---------|-----|-----|-----|-----|--------|
| 0 | + | Upstream   | T1     | -  | 2.0 | AB      | 16  | 16  | 5.0 | 5.0 | 45     |
| 1 | + | Downstream | T3-MLD | 16 | 2.0 | AB      | 16  | 16  | 5.0 | 5.0 | 45     |
| 2 | + | Downstream | T3-MLD | 16 | 2.0 | AB      | 16  | 16  | 5.0 | 5.0 | 45     |
| 3 | + | Downstream | T3-MLD | 16 | 2.0 | AB      | 16  | 16  | 5.0 | 5.0 | 45     |
| 4 | + | Downstream | T3-MLD | 16 | 2.0 | AB      | 16  | 16  | 5.0 | 5.0 | 45     |
| 5 | + | Downstream | T3-MLD | 16 | 2.0 | AB      | 16  | 16  | 5.0 | 5.0 | 45     |
| 6 | + | Downstream | T3-MLD | 16 | 2.0 | AB      | 16  | 16  | 5.0 | 5.0 | 45     |
| 7 | + | Downstream | T3-MLD | 16 | 2.0 | AB      | 16  | 16  | 5.0 | 5.0 | 45     |
| 8 | + | Downstream | T3-MLD | 16 | 2.0 | AB      | 16  | 16  | 5.0 | 5.0 | 45     |
| 9 | + | Upstream   | T1     | -  | 2.0 | AB      | 16  | 16  | 5.0 | 5.0 |        |

Link state columns from the same listing:

| LTSSM | LN | Flags |
|-------|----|-------|
| L0    | 0  | P     |
| L0    | 0  | P     |
| L0    | 0  | P     |
| L0    | 0  | P     |
| L0    | 0  | P     |
| L0    | 0  | P     |

`# jack show vcs 0`

Show VCS:

- VCS ID: 0
- State: Enabled
- USP ID: 0
- vPPBs: 8

| vPPB | PPID | LDID | Status              |
|------|------|------|---------------------|
| 0    | 0    | -    | Bound Physical Port |
| 1    | 1    | 0    | Bound LD            |
| 2    | 2    | 0    | Bound LD            |
| 3    | 3    | 0    | Bound LD            |
| 4    | 4    | 0    | Bound LD            |
| 5    | -    | -    | Unbound             |
| 6    | -    | -    | Unbound             |
| 7    | -    | -    | Unbound             |


## **CXL Fabrics Need a Platform to Shine** Specs and state management won't get you there



Once a host is assigned ownership of a CXL device, the fabric cannot take it back through any mechanism in the CXL specification



Orchestration outside of the CXL spec is needed to enable a composable memory system rather than a memory topology that is statically allocated at boot
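One way to picture that orchestration layer is an ownership ledger that has to ask the host to give a device back before it can be reassigned. This is a sketch under stated assumptions: the `Orchestrator` class and the per-host agent callback are hypothetical, and a real system would ask a host-side agent to offline the memory first.

```python
from dataclasses import dataclass, field


class StillInUse(Exception):
    """Raised when a host agent declines to give a device back."""


@dataclass
class Orchestrator:
    owners: dict = field(default_factory=dict)  # device id -> host id
    agents: dict = field(default_factory=dict)  # host id -> callable(device) -> bool

    def assign(self, device: str, host: str) -> None:
        if device in self.owners:
            raise ValueError(f"{device} is already owned by {self.owners[device]}")
        self.owners[device] = host

    def reclaim(self, device: str) -> None:
        host = self.owners[device]
        # The fabric cannot revoke ownership itself, so ask the host-side
        # agent (hypothetical callback) to release the device first.
        if not self.agents[host](device):
            raise StillInUse(f"{host} has not released {device}")
        del self.owners[device]
```

The ledger only changes after the host cooperates, which is the part the specification leaves to software outside the fabric.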



### Interacting with cluster schedulers and resource managers

Resource schedulers don't want to know how memory fabrics like CXL work

- They don't care about Ultra Ethernet, Ethernet, InfiniBand, NVLink, or UALink either.
- They want the OS or a module to handle it so they can schedule resources



Container 'x' interfaces

- Resource, CRI
- Storage, CSI
- Network, CNI
- AND device plugin support
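The same idea can be sketched for memory: the scheduler asks for a quantity and a provider hides the fabric. The interface and class names below are hypothetical, chosen only to illustrate the shape of such a plugin boundary; they are not taken from any container runtime API.

```python
from typing import Protocol


class MemoryProvider(Protocol):
    """What a scheduler would see: capacity in, opaque handle out."""

    def allocate(self, size_gib: int) -> str: ...

    def release(self, handle: str) -> None: ...


class CxlPoolProvider:
    """Hypothetical backend; ports, LDs, and bindings stay behind this class."""

    def __init__(self, capacity_gib: int) -> None:
        self.free_gib = capacity_gib
        self._grants = {}  # handle -> granted size in GiB
        self._next = 0

    def allocate(self, size_gib: int) -> str:
        if size_gib > self.free_gib:
            raise MemoryError("pool exhausted")
        self.free_gib -= size_gib
        handle = f"cxl-{self._next}"
        self._next += 1
        self._grants[handle] = size_gib
        return handle

    def release(self, handle: str) -> None:
        self.free_gib += self._grants.pop(handle)


def schedule(provider: MemoryProvider, size_gib: int) -> str:
    # The scheduler only says "give me N GiB"; it never touches the fabric.
    return provider.allocate(size_gib)
```

Any other backend that offers the same two methods could be swapped in without the scheduler noticing.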



*completely* punt on caring about hardware



- Ceph – storage
- corosync – state sync

# I don't care what you are, give me 'X'

OpenStack

### The chimera! Has 41 types of resource services with varying levels of hardware abstraction



# Where Jack and the Orchestrator are going: if you aren't already, use this other resource manager API





### Challenges

- Potential fragmentation of the shared memory software ecosystem will delay adoption
- Lack of application development in the open
- Lack of platforms, emulated or real, on which to do that development

### Call to Action

- Experiment with QEMU today! QEMU supports more CXL features (e.g. CXL 3.0+) than CPU hardware does today
- Software application development doesn't have to wait until hardware (e.g. switches) is available
- Evaluate where open-source tools / libraries / APIs can be used in your projects

| Organization    | Project                                               | Repository                                             |
|-----------------|-------------------------------------------------------|--------------------------------------------------------|
| Intel           | libcxl                                                | https://github.com/pmem/ndctl                          |
| Jackrabbit Labs | libmem                                                | https://github.com/JackrabbitLabs/libmem               |
| Jackrabbit Labs | Jack - CXL FM API CLI Tool                            | https://github.com/JackrabbitLabs/jack                 |
| Jackrabbit Labs | CXL Switch Emulator                                   | https://github.com/JackrabbitLabs/cse                  |
| Samsung         | Scalable Memory Development Kit (SMDK)                | https://github.com/OpenMPDK/SMDK                       |
| Micron          | CXL Memory Resource Kit (CMRK)                        | https://github.com/cxl-micron-reskit/cxl-reskit        |
| SK Hynix        | Heterogeneous Memory Software Development Kit (HMSDK) | https://github.com/skhynix/hmsdk                       |
| Micron          | CXL Library CLI                                       | https://github.com/cxl-micron-reskit/mxcli             |
| Micron          | FAMFS                                                 | https://github.com/cxl-micron-reskit/famfs             |
| Intel           | Unified Memory Framework                              | https://github.com/oneapi-src/unified-memory-framework |
| QEMU            | QEMU                                                  | https://github.com/qemu/qemu                           |
| Samsung         | libcxlmi                                              | https://github.com/computexpresslink/libcxlmi          |
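As a starting point for the QEMU suggestion above, the sketch below assembles a command line for one directly attached volatile CXL Type 3 device. The option strings are modeled on the example in QEMU's CXL documentation and should be checked against your QEMU version; the function name is hypothetical, and guest kernel or disk options are left out.

```python
import shlex


def qemu_cxl_args(mem_size: str = "256M", window_size: str = "4G") -> list:
    """Build an argv list for a q35 guest with one emulated CXL memory device."""
    return [
        "qemu-system-x86_64",
        "-M", "q35,cxl=on",
        "-m", "4G,maxmem=8G,slots=8",
        "-smp", "4",
        # Host RAM backing the emulated CXL memory
        "-object", f"memory-backend-ram,id=vmem0,share=on,size={mem_size}",
        # CXL host bridge, root port, and the Type 3 device behind it
        "-device", "pxb-cxl,bus_nr=12,bus=pcie.0,id=cxl.1",
        "-device", "cxl-rp,port=0,bus=cxl.1,id=root_port13,chassis=0,slot=2",
        "-device", "cxl-type3,bus=root_port13,volatile-memdev=vmem0,id=cxl-vmem0",
        # Fixed memory window that routes to the host bridge
        "-M", f"cxl-fmw.0.targets.0=cxl.1,cxl-fmw.0.size={window_size}",
    ]


def qemu_cxl_command(**kwargs) -> str:
    """Return the same arguments as a single shell-quoted string."""
    return shlex.join(qemu_cxl_args(**kwargs))
```

Printing `qemu_cxl_command()` gives a line that can be pasted into a shell once boot media options are appended.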

**Review** 





## **JACKRABBIT LABS**

Driving CXL Adoption with Open Source

### Hosts, Switches, and Devices can be connected in a Direct or Switched Topology



SH-SLD – DRAM Drives Single-Headed Single Logical Devices



MH-MLD – DRAM Drives Multi-Headed Multi-Logical Devices



SH-MLD – DRAM Drives Single-Headed Multi-Logical Devices



### Multi-Layer Switch **Device Pooling / Sharing**





### **Ethernet Switch Appliance**



- Managed Ethernet switches run a SwitchOS
  - e.g. SONiC, Cumulus, FBOSS, EOS, NX-OS
- Managed through in-band / out-of-band Ethernet links
- Hardware Abstraction Layer (HAL)
- Can be run on a low-end BMC or larger x86 processor
- SONiC = Debian + Ethernet Management Containers
- Typically has a CLI shell + Web API / GUI



- CXL Switch appliances are equivalent to Ethernet switches
- Will run a SwitchOS to manage CXL switch silicon
- The "Fabric Manager" lives in this SwitchOS (or at least a software agent of a larger orchestration system)
- Has a Hardware Abstraction Layer (HAL) for CXL switch silicon
- External interface can be REST or a GUI over Ethernet, or an in-band protocol over CXL links
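The layering described in this list can be sketched as a fabric manager agent that talks to silicon only through a HAL. All class and method names here are hypothetical and exist only to show where the boundaries sit; a REST or in-band front end would sit in front of `handle()`.

```python
from abc import ABC, abstractmethod


class SwitchHal(ABC):
    """Hypothetical HAL: one subclass per switch silicon."""

    @abstractmethod
    def bind(self, vcs: int, vppb: int, port: int, ld: int) -> None: ...

    @abstractmethod
    def unbind(self, vcs: int, vppb: int) -> None: ...


class EmulatedSilicon(SwitchHal):
    """In-memory stand-in for real switch hardware."""

    def __init__(self) -> None:
        self.table = {}  # (vcs, vppb) -> (port, ld)

    def bind(self, vcs: int, vppb: int, port: int, ld: int) -> None:
        self.table[(vcs, vppb)] = (port, ld)

    def unbind(self, vcs: int, vppb: int) -> None:
        self.table.pop((vcs, vppb), None)


class FabricManagerAgent:
    """Runs inside the SwitchOS and translates requests into HAL calls."""

    def __init__(self, hal: SwitchHal) -> None:
        self.hal = hal

    def handle(self, request: dict) -> dict:
        op = request.get("op")
        if op == "bind":
            self.hal.bind(request["vcs"], request["vppb"], request["port"], request["ld"])
            return {"ok": True}
        if op == "unbind":
            self.hal.unbind(request["vcs"], request["vppb"])
            return {"ok": True}
        return {"ok": False, "error": f"unknown op: {op}"}
```

Swapping `EmulatedSilicon` for a vendor-specific subclass would leave the agent and its external interface unchanged.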

### **SwitchOS**

### The Management Abstraction Layer to the Silicon



### **Direct Attach Multi-Port Devices**







- Directly connected Multi-Headed (Multi-Port) devices
- No switch architecture
- Memory devices housed in separate / bladed enclosure
- Lower latency, but more cables and a more complex enclosure
- Still requires a separate management entity
