

### **NVM** Usage in the AI Era

Dave Eggleston
Intuitive Cognition Consulting



"A groundbreaking tour of the mind, and explains the two systems that drive the way we think."

"System 1 is fast, intuitive, and emotional; System 2 is slower, more deliberative, and more logical."

Daniel Kahneman is professor emeritus of psychology and public affairs at Princeton University.

WINNER OF THE NOBEL PRIZE IN ECONOMICS











## $17 \times 24 = ?$
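Worked the slow, deliberate way, the product can be decomposed into easier pieces. The split of 24 into 20 + 4 is just one convenient choice, not something the slide prescribes:

$$17 \times 24 = 17 \times (20 + 4) = 340 + 68 = 408$$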












## Intuition

### System 1

- Lightning fast
- Automatic
- Real time
- Effortless
- Approximate

**Edge**

## Reasoning

### System 2

- Slow
- Interrupt driven
- Background
- Energy inefficient
- Precise

**DATACENTER**



## Edge

- Non-von Neumann architecture
- Inference

## DATACENTER

- Von Neumann architecture
- Inference
- Training






Flash Memory Summit 2019 Santa Clara, CA



| Edge Inference | DATACENTER Inference | DATACENTER TRAINING |
|---|---|---|
| MCU Bottleneck | Memory Bottleneck | I/O Bottleneck |
| Ultra low power & Speed | Parallelism | Fast & Coherent Checkpointing |
| NVM? | NVM? | NVM? |

© Intuitive Cognition Consulting, Dave Eggleston





## Edge Inference



MCU Bottleneck

Ultra low power & Speed

Analog NVM-based MAC acceleration

- Brain operates on <20 watts
- Von Neumann inference >20 gigawatts!
- Want non-von Neumann architecture (low power)
- Want real-time inference (speed)
- MCU lacks matrix math capability
- Want a MAC accelerator to do matrix math
- Use analog NVM for MAC acceleration!
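To make the MAC bullets concrete, here is a minimal Python sketch. It assumes an idealized array in which each stored weight is limited to a small number of states and each output is the sum of weight-times-input products along one line. The function names, the 16-level cell model, and the sample values are all hypothetical; nothing here describes any vendor's actual device.

```python
def mac_matvec(weights, inputs):
    """Digital reference: one multiply-accumulate (MAC) per weight."""
    outputs = []
    for row in weights:
        acc = 0.0
        for w, x in zip(row, inputs):
            acc += w * x  # the MAC step an MCU would run in a loop
        outputs.append(acc)
    return outputs


def quantize(value, levels=16, max_abs=1.0):
    """Snap a weight to one of `levels` evenly spaced states (hypothetical cell model)."""
    step = 2.0 * max_abs / (levels - 1)
    clipped = max(-max_abs, min(max_abs, value))
    return round((clipped + max_abs) / step) * step - max_abs


def analog_matvec(weights, inputs, levels=16, max_abs=1.0):
    """Idealized analog array: weights are stored quantized, each output sums its products."""
    stored = [[quantize(w, levels, max_abs) for w in row] for row in weights]
    return [sum(w * x for w, x in zip(row, inputs)) for row in stored]


weights = [[0.5, -0.25], [1.0, 0.0]]  # illustrative values only
inputs = [2.0, 4.0]
exact = mac_matvec(weights, inputs)
approx = analog_matvec(weights, inputs)
print("digital:", exact)
print("analog model:", approx)
```

The analog model returns an approximation of the digital result, which fits the "approximate" trait listed for System 1: the error is bounded by half a quantization step per weight, scaled by the input magnitudes.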



## Edge Inference


© 2018 Mythic. All rights reserved.

MYTHIC



## Edge Inference


#### SYNTIANT






#### **Mythic Mixed-Signal Computing**





Made possible with Mixed-Signal Computing




### DATACENTER Inference

Memory Bottleneck

Parallelism

CCIX Memory with Inference engines






- Hold trained model in memory
- Inference bottlenecked by CPU-DDR
- Want parallelism for speed
- Multiple sets of CCIX Memory and Inference engines
- CCIX enables load/store peer-peer sharing
- Avoid CPU-DDR bottleneck
- Use DRAM or NVM based CCIX Memory!
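The parallelism argument in the bullets above can be sketched as back-of-envelope arithmetic. This assumes, purely for illustration, that every inference streams the whole trained model once over a memory link and that each inference engine has its own link, so the bound scales with the number of engine and memory pairs. The function name and all numbers are hypothetical, not measurements of CCIX or DDR.

```python
def max_inferences_per_second(model_bytes, link_bytes_per_second, engines=1):
    """Upper bound if each inference streams the whole model once over its own memory link."""
    if model_bytes <= 0 or engines < 1:
        raise ValueError("model_bytes must be positive and engines must be at least 1")
    return engines * link_bytes_per_second / model_bytes


MODEL_BYTES = 500 * 10**6  # hypothetical 500 MB trained model
LINK_BW = 20 * 10**9       # hypothetical 20 GB/s per memory link

single = max_inferences_per_second(MODEL_BYTES, LINK_BW)
scaled = max_inferences_per_second(MODEL_BYTES, LINK_BW, engines=4)
print("one shared link:", single, "inferences/s")
print("four memory + engine pairs:", scaled, "inferences/s")
```

Under these assumptions, a single shared link caps throughput regardless of how many engines sit behind it, while adding memory and engine pairs raises the bound linearly.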






#### **DATACENTER TRAINING**

I/O Bottleneck

Fast & Coherent Checkpointing

CCIX/CXL coherent NVM expansion







- Want more memory
- Want fast & coherent NVM checkpointing
- Want fast rebuild time
- Avoid 5 µs+ latency on I/O
- Avoid using precious DDR slots
- Coherent memory expansion on CCIX/CXL!
- Mix NVM and DRAM on CCIX/CXL!
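The difference between checkpointing through I/O and checkpointing with loads and stores can be illustrated with a small Python sketch. It uses a file-backed `mmap` region as a stand-in for byte-addressable persistent memory; that substitution is an assumption for illustration only, since coherent NVM on CCIX/CXL would be exposed by the platform rather than through a regular file. The function names and the record layout (an 8-byte count followed by doubles) are hypothetical.

```python
import mmap
import struct

HEADER = struct.Struct("<Q")  # hypothetical layout: element count, then float64 values


def checkpoint_to_mapped_region(weights, path):
    """Store a checkpoint with plain memory writes into a mapped region."""
    payload = struct.pack(f"<{len(weights)}d", *weights)
    size = HEADER.size + len(payload)
    with open(path, "wb") as f:
        f.truncate(size)  # reserve the region, zero-filled
    with open(path, "r+b") as f:
        with mmap.mmap(f.fileno(), size) as region:
            region[HEADER.size:size] = payload               # store the data
            region[:HEADER.size] = HEADER.pack(len(weights))  # write the count last
            region.flush()                                    # push changes to the backing store


def restore_from_mapped_region(path):
    """Rebuild the weights by reading directly from the mapped region."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as region:
            (count,) = HEADER.unpack_from(region, 0)
            return list(struct.unpack_from(f"<{count}d", region, HEADER.size))
```

A typical use is to call `checkpoint_to_mapped_region(weights, "ckpt.bin")` after a training step and `restore_from_mapped_region("ckpt.bin")` on restart. The sketch does not attempt to guarantee crash consistency; it only shows the load/store access pattern.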

Cache coherent memory expansion





| Edge Inference | DATACENTER Inference | DATACENTER TRAINING |
|---|---|---|
| MCU Bottleneck | Memory Bottleneck | I/O Bottleneck |
| Ultra low power & Speed | Parallelism | Fast & Coherent Checkpointing |
| Analog NVM-based MAC acceleration | CCIX Memory with Inference engines | CCIX/CXL coherent NVM expansion |



## Now on to the AI NVM Experts!













Dave Eggleston
Intuitive Cognition Consulting
Technology & Business Strategy

Email: dave@in-cog.com

Twitter: @NVM\_DaveE

LinkedIn: linkedin.com/in/deggleston/


