

### Annual Flash Controller Update

**David McIntyre** 

@DavidSMcIntyre
#flashmem

1:1 Mtgs Text FMS to (408) 772-7044





- Data Center Drivers
- Memory Hierarchy Drivers
- Flash Controller Challenges
- Supporting Technologies



#### **Data Center Trends**

#### Hyper Converged Infrastructure

- Integrated Compute/Storage/Networking
- Massive interconnectivity (25Gb to 100Gb)
- Software managed virtualized resources

#### Hyper Scale

- Independent scaling of compute and storage resources
- Good for elastic workloads, e.g. Hadoop, NoSQL
- Acceleration As a Service





### **Hyperscaler Priority**

#### Hyperscale in 2020

| By 2020 | Hyperscale share                          | Today |
|---------|-------------------------------------------|-------|
| 47%     | of all data center <b>servers</b>         | 21%   |
| 68%     | of all data center processing power       | 39%   |
| 57%     | of all <b>data</b> stored in data centers | 49%   |
| 53%     | of all data center <b>traffic</b>         | 34%   |

Source: © 2016 Cisco and/or its affiliates. All rights reserved.
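The shift in the table can be restated as percentage-point and relative growth. The following is a minimal sketch: the percentages are copied from the table, while the dictionary keys and the `growth` helper are illustrative names, not part of the original slide.

```python
# Hyperscale share of data center resources, taken from the table above:
# metric -> (share today, projected share by 2020), in percent.
shares = {
    "servers": (21, 47),
    "processing power": (39, 68),
    "data stored": (49, 57),
    "traffic": (34, 53),
}


def growth(today, by_2020):
    """Return (percentage-point change, relative change in percent)."""
    points = by_2020 - today
    relative = 100.0 * points / today
    return points, relative


for metric, (today, by_2020) in shares.items():
    points, relative = growth(today, by_2020)
    print(f"{metric}: +{points} points ({relative:.0f}% relative growth)")
```

Servers show the largest relative change, which is consistent with the slide's point that controllers need to be designed around hyperscale requirements.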

 Flash controllers must support hyperscale requirements (deterministic latency, performance/watt, endurance, reliability)



#### **Data Center Trends**

- Storage
  - Computational Storage
  - Convergence of RAM/cache and SCM
  - NVDIMM-N and NVDIMM-P
  - NVMe-oF, NVMe/TCP
- Compute
  - GPU, TPU and FPGA accelerators
- Networking
  - Low latency, high performance RDMA networks
  - 100Gbps+
- Hybrid Cloud
  - Leased and on-premises equipment





#### **Data Center Applications**





### **Memory and Storage Tiering**

Flash Memory Summit





### **Flash System Challenges**

- Error correction costs increasing
- Endurance limits
- Slow write speeds continue
- IO bottlenecking
- Emerging NV technologies (MRAM, PCM, RRAM)
- Form Factors (M.2, EDSFF (E1.x, E3.x))

#### Price/Performance Gaps in Storage Technologies



### **Controller Trends: Intel Optane**

#### Separate Media Controllers

- Silicon Motion SM2263: 1TB QLC NAND
- Intel SLL3D: 32 GB Optane



Intel® Rapid Storage Technology Software

### **Computational Storage**

- Tightly coupled CSSD
  - Embedded CPU Cores
  - Hardware Accelerators
  - Memory

- NAND Flash
- Purpose-built data paths
  - Any to any connectivity
  - 10X-100X Internal Bandwidth
- Distributed, Scale-out model



#### Adaptive Storage Acceleration

- Encryption
- Compression
- Data Dedupe
- RAID & Erasure codes
- Key-Value Offloads
- Database ETL & Query Offloads
- Spark-SQL / Map-Reduce
- Video / Image Transcoding, Processing and Delivery
- Search (text, image, video, etc.)
- Stats / Counters
- Machine Learning
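The common thread in these offloads is reduced data movement: the drive processes data in place and returns only results. A minimal sketch of that effect for a filter-style query follows. Everything here is hypothetical (`Transfer`, `host_side_filter`, `device_side_filter`, and the sample records); it models byte counts only and is not any real computational storage API.

```python
from dataclasses import dataclass


@dataclass
class Transfer:
    """Bytes moved across the host interface for one query, plus the results."""
    bytes_moved: int
    matches: list


def host_side_filter(records, predicate):
    # Conventional path: every record crosses the bus, then the host CPU filters.
    moved = sum(len(r) for r in records)
    return Transfer(moved, [r for r in records if predicate(r)])


def device_side_filter(records, predicate):
    # Offloaded path (hypothetical): the drive filters internally and
    # returns only the matching records to the host.
    matches = [r for r in records if predicate(r)]
    return Transfer(sum(len(r) for r in matches), matches)


def is_error(record):
    return record.startswith(b"error")


records = [b"error: disk 3", b"ok", b"ok", b"error: fan 1", b"ok"]
host = host_side_filter(records, is_error)
device = device_side_filter(records, is_error)
print(f"host path moved {host.bytes_moved} bytes, "
      f"device path moved {device.bytes_moved} bytes")
```

Both paths return the same matches; the saving grows as the query becomes more selective.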

### **Computational Storage: Smart SSD**





#### **Computational Storage Controller Options**









Source: NVMexpress.org (both graphics)



- SOC Integrated solution
- Hybrid controller: SCM caching
- Deterministic Latency
- Flash Density and Performance (3D QLC)
- Byte addressable
- Opportunity for NAND to support load/store-driven data center applications (e.g. NVDIMM-P)



#### **NVDIMM-N Controller Architecture**





Backup and Restore Solution Courtesy of Agigatech



- NVDIMM-P
  - High capacity, transactional access DDR4/DDR5
  - Persistent memory
- Supported applications
  - Database caching
  - Enterprise storage
  - High-performance computing



NVDIMM-P Target



### **Controller Options**

#### Technology scaling favors programmability and parallelism

| Technology  | Characteristics                                          |
|-------------|----------------------------------------------------------|
| CPUs, DSPs  | Single cores                                             |
| Multi-Cores | Multi-core, coarse-grained CPUs and DSPs                 |
| ASICs       | Fixed design, efficient performance                      |
| GPGPUs      | Massively parallel processor elements                    |
| FPGAs       | Massively parallel programmable logic and SoC attributes |



#### Flash Controller Technology Options



- The key data center metric is performance per watt
- Performance, power efficiency, and flexibility are all required to support data center applications
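As a rough illustration of the metric, performance per watt is delivered throughput divided by power draw. The sketch below ranks two candidate controllers on that basis. The option names and all numbers are placeholders invented for illustration, not measurements of any device.

```python
def perf_per_watt(iops, watts):
    """Performance per watt: operations per second delivered per watt."""
    if watts <= 0:
        raise ValueError("power must be positive")
    return iops / watts


# Hypothetical controller options; placeholder numbers, not real measurements.
candidates = {
    "option_a": {"iops": 800_000, "watts": 25.0},
    "option_b": {"iops": 600_000, "watts": 12.0},
}

ranked = sorted(
    candidates,
    key=lambda name: perf_per_watt(**candidates[name]),
    reverse=True,
)
for name in ranked:
    print(name, round(perf_per_watt(**candidates[name])), "IOPS/W")
```

In this made-up case the option with lower raw throughput ranks first, which is the kind of trade-off the metric is meant to expose.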



### **Technology Comparison**

| Technology | Pros                                                             | Cons                                                           |
|------------|------------------------------------------------------------------|----------------------------------------------------------------|
| CPU        | Well-established products                                        | Limited cores for parallel processing; power consumption       |
| FPGA       | Heterogeneous parallel processing; performance/watt; flexibility | Rudimentary development environment; inefficient per-unit cost |
| GPU        | Same-task parallel processing; developer ecosystem               | Power consumption; leading variable types                      |
| ASIC       | Highest performance                                              | High NRE; custom design                                        |
| ASSP       | Custom performance                                               | Limited functionality                                          |



### **Error Correction Overview**

#### Driving Factors for New ECC

- Increasing bit errors in NAND flash
- Soft error occurrences
- Decrease in write cycles
- RS, BCH overhead for data and spare area
- Increased use of metadata in file systems
- Correction Overhead
- Gate count
- Requirement for no data loss
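To see why BCH overhead in the spare area becomes a concern as bit error rates rise, it helps to estimate parity size. A binary BCH code over GF(2^m) that corrects `t` bit errors needs at most `m * t` parity bits. The sketch below applies that bound; the 1 KiB payload size and the function names are assumptions chosen for illustration, and real controllers may use different codeword sizes.

```python
import math


def bch_parity_bits(data_bits, t):
    """Upper bound on parity bits for a binary BCH code correcting t bit errors.

    Uses the m*t bound over GF(2^m), picking the smallest m whose
    (shortened) codeword fits: data_bits + m*t <= 2**m - 1.
    """
    m = 2
    while data_bits + m * t > 2**m - 1:
        m += 1
    return m * t


def overhead_percent(data_bits, t):
    """Parity size as a percentage of the payload size."""
    return 100.0 * bch_parity_bits(data_bits, t) / data_bits


payload_bits = 1024 * 8  # assumed 1 KiB payload per codeword
for t in (8, 24, 40, 72):
    parity = bch_parity_bits(payload_bits, t)
    print(f"t={t}: {parity} parity bits ({math.ceil(parity / 8)} bytes), "
          f"{overhead_percent(payload_bits, t):.1f}% overhead")
```

Parity grows linearly with the number of correctable errors, so each step up in raw bit error rate consumes more of a fixed spare area.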

#### **Comparing ECC Solutions**

| Features   | BCH | LDPC   |
|------------|-----|--------|
| Gate Count | Low | Medium |
| Latency    | Low | Medium |
| Tunability | Low | High   |
| Soft Data  | No  | Yes    |





### **Flash Controller Support**

| IP       | IO         | Speed     | Logic Density | Comments                                              |
|----------|------------|-----------|---------------|-------------------------------------------------------|
| ONFI 4.1 | 40 pins/ch | 1200 MT/s | 5 KLE/ch      | NAND flash control, wear leveling, garbage collection |
| DDR4/5   |            | 6.4 Gbps  | 10 KLE        | Flash control modes available for NVDIMM              |
| PCM      |            |           | 5 KLE         | PCM: pending production \$                            |
| MRAM     |            |           | 5 KLE         | MRAM: persistent memory controller                    |
| BCH      |            |           | <10 KLE       | Baseline ECC standard                                 |
| LDPC     |            |           | 50 KLE+       | Increased performance for FPGAs                       |
| PCIe     | Gen 4 x8   | 16 GT/s   | HIP           | Flash cache                                           |
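The speeds in the table can be combined into a back-of-the-envelope sizing: how many ONFI channels it takes to fill the PCIe link. The sketch below assumes an 8-bit NAND data bus per channel and counts only 128b/130b line encoding on PCIe, ignoring packet and protocol overhead, so the figures are peak values. The function names are illustrative.

```python
import math


def onfi_channel_mb_s(mega_transfers_per_s, bus_bits=8):
    """Peak raw bandwidth of one NAND channel in MB/s (8-bit bus assumed)."""
    return mega_transfers_per_s * bus_bits / 8


def pcie_gb_s(giga_transfers_per_s, lanes, encoding=128 / 130):
    """Peak one-direction PCIe bandwidth in GB/s after line encoding."""
    return giga_transfers_per_s * lanes * encoding / 8


onfi = onfi_channel_mb_s(1200)   # ONFI 4.1 row: 1200 MT/s
pcie = pcie_gb_s(16, lanes=8)    # PCIe row: Gen 4 x8 at 16 GT/s
channels = math.ceil(pcie * 1000 / onfi)
logic_kle = channels * 5         # 5 KLE per channel, from the table

print(f"ONFI channel: {onfi:.0f} MB/s, PCIe link: {pcie:.2f} GB/s")
print(f"channels to fill the link: {channels} (about {logic_kle} KLE)")
```

Sustained NAND throughput is well below the interface peak, so a real design would likely provision more channels than this estimate suggests.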



#### Typical SSD Controller Architecture





#### **Coherent Networks Roadmap**

#### Cache coherency will continue to expand into SCM and into SSD caches

| NEAR |     |                               |                 |                 | FAR       |
|------|-----|-------------------------------|-----------------|-----------------|-----------|
| HBM  | DDR | Accelerator / Local SCM       | Chassis SCM     | Rack Pooled SCM | Messaging |
|      |     | PCIe Phy: CCIX                | Future Spec Rev |                 |           |
|      |     | 802.3 short and long haul Phy | Gen-Z           |                 |           |
|      |     | 802.3 Phy: OpenCAPI           | Future Spec Rev |                 |           |

Ref: OFA.org

Flash Memory Summit 2019 Santa Clara, CA

### **Changing of the Guard**









### **Controller Challenges Summary**

- Host Interface IO
  - Gen-Z, CCIX, OpenCAPI
  - PCIe Gen 5
  - NVMe-oF and NVMe/TCP
- Application Requirements
  - Deterministic latencies
  - Load/Store vs Block
  - Performance
  - Endurance
- Hybrid Control
  - 3D NAND, 2D NAND
  - Cache: 3D XPoint, MRAM
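Hybrid control can be pictured as a small, fast cache tier in front of a larger NAND store. The sketch below is a toy read path with least-recently-used eviction. The class name, the LRU policy, and the dictionary-backed "NAND" are all assumptions for illustration; real controllers use their own policies and also handle writes, wear, and power loss.

```python
from collections import OrderedDict


class HybridController:
    """Toy read path: a small SCM cache in front of a larger NAND store."""

    def __init__(self, nand, scm_capacity):
        self.nand = nand                  # dict: logical block address -> data
        self.scm = OrderedDict()          # LRU order, oldest entry first
        self.scm_capacity = scm_capacity
        self.hits = 0
        self.misses = 0

    def read(self, lba):
        if lba in self.scm:
            self.hits += 1
            self.scm.move_to_end(lba)     # mark as most recently used
            return self.scm[lba]
        self.misses += 1
        data = self.nand[lba]             # slow path: fetch from NAND
        self.scm[lba] = data
        if len(self.scm) > self.scm_capacity:
            self.scm.popitem(last=False)  # evict the least recently used block
        return data


nand = {lba: f"block-{lba}" for lba in range(8)}
ctrl = HybridController(nand, scm_capacity=2)
for lba in (0, 1, 0, 2, 1):
    ctrl.read(lba)
print(f"hits={ctrl.hits} misses={ctrl.misses} cached={list(ctrl.scm)}")
```

Even this toy version shows why the controller must track state for two media types at once, which is the extra burden hybrid control introduces.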





#### Flash Control has extended into tiered subsystem management

- Caching has extended into SCM, necessitating hybrid control
- IO interfaces need to support fabric
- Advancing geometries and process technologies require more advanced error correction
- Hyperscaler applications demand load/store performance with deterministic latency
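One common way to quantify how deterministic a device's latency is: compare a tail percentile to the median. The sketch below uses a nearest-rank percentile and a tail-to-median ratio. The helper names and both latency samples are hypothetical, and the choice of the 99th percentile is an assumption rather than a requirement stated in the slides.

```python
def percentile(samples, p):
    """Nearest-rank percentile of a non-empty list; p is an integer 1..100."""
    ordered = sorted(samples)
    rank = max(1, (p * len(ordered) + 99) // 100)  # integer ceiling of p*n/100
    return ordered[rank - 1]


def tail_ratio(samples, p=99):
    """Tail latency divided by median; closer to 1 means more deterministic."""
    return percentile(samples, p) / percentile(samples, 50)


# Hypothetical read latencies in microseconds, invented for illustration.
steady = [100] * 99 + [120]
bursty = [100] * 95 + [2000] * 5

print("steady p99/median:", tail_ratio(steady))
print("bursty p99/median:", tail_ratio(bursty))
```

Both samples share the same median, so an average-based metric would hide the difference that the tail ratio exposes.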


