



#### Non-Volatile Neural Network Accelerator in Your SoC

#### Sang-Soo Lee, CEO and Co-Founder; Seung-Hwan Song, CTO and Co-Founder; ANAFLASH Inc.



# **Company Overview**





#### WHAT WE DO?

Founded in 2017, we develop logic-compatible NV-DNN and eFlash IPs for edge computing.

#### TEAM

Executives have a combined 60+ years of engineering and management experience.

#### TECHNOLOGY

Patent-pending NV-DNN and eFlash IPs in a standard CMOS process.

#### WHERE WE ARE?

Headquartered in San Jose, CA, USA.









Berkeley SkyDeck

### Artificial Intelligence in the Edge









# Growing Cloud Energy Concern

[Figure: Energy forecast. Projected ICT electricity demand from 2010 to 2030, broken down into networks (wireless and wired), production of ICT, consumer devices (televisions, computers, mobile phones), and data centres. Annotations: 9,000 terawatt hours (TWh); 20.9% of projected electricity demand.]

Widely cited forecasts suggest that the total electricity demand of information and communications technology (ICT) will accelerate in the 2020s, and that data centres will take a larger slice.

Source: N. Jones (Nature 2018)




## Let's move to Edge! However,...

- Challenges in the Edge Environment
  - Size, Weight, and Power (SWaP) limited
  - Compute and memory resource limited
  - Cost sensitive

How to make it work under these challenges?



- Approximate computing: a technique that accepts approximate results in applications that do not require strict accuracy
- This can improve power efficiency substantially
- Where needed, the resulting errors can be managed statistically by system-level techniques (e.g., ECC and redundancy)
- Could be combined with digital (However,...)
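The idea of trading a bounded error for cheaper computation can be illustrated with a small sketch. It assumes low-precision weight quantization as the form of approximation, which is one common choice and not necessarily the scheme used in the product; the weights, inputs, number of levels, and helper names are hypothetical.

```python
def quantize(values, levels, lo, hi):
    """Round each value to the nearest of `levels` evenly spaced points in [lo, hi]."""
    step = (hi - lo) / (levels - 1)
    quantized = []
    for v in values:
        clipped = min(max(v, lo), hi)
        quantized.append(lo + round((clipped - lo) / step) * step)
    return quantized, step


def dot(a, b):
    """Plain dot product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


# Hypothetical example data (not from the slides).
weights = [0.82, -0.35, 0.10, -0.91, 0.47]
inputs = [1.0, 0.5, -0.25, 0.75, -1.0]

approx_weights, step = quantize(weights, levels=9, lo=-1.0, hi=1.0)

exact = dot(inputs, weights)
approx = dot(inputs, approx_weights)

# Each weight moves by at most step/2, so the dot-product error is bounded by
# (step / 2) * sum(|x_i|).
error_bound = 0.5 * step * sum(abs(x) for x in inputs)

print(f"exact={exact:.4f} approx={approx:.4f} bound={error_bound:.4f}")
```

The error stays within a known bound, which is what lets a system-level technique budget for it instead of requiring exact arithmetic everywhere.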



## Analog vs. Digital Computation

| ANALOG                                          | DIGITAL                                     | Which is better for efficiency? |
|-------------------------------------------------|---------------------------------------------|---------------------------------|
| Narrow signal swing                             | Full VDD-GND swing                          | ANALOG                          |
| Information from single transistor (continuous) | Information from single transistor (1 or 0) | ANALOG                          |
| Multi-bit single wire                           | Single-bit single wire                      | ANALOG                          |
| Result affected by noise and variation          | High noise margin                           | DIGITAL                         |

R. Sarpeshkar (Neural Computation 1998)

- Analog has more advantages for efficiency!

# Let's Do Analog Computing at the Edge





#### Analog is significantly more efficient at low precision!



### Memory Access Bottleneck

- Off-chip access from the CPU to memory (storage) has long (and unpredictable) latency and limited bandwidth
- Heavy





# Analog CIM Architecture

- Analog Compute-in-Memory IP integrated in CPU
- Reduce off-chip memory access
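To make the reduction in off-chip access concrete, the sketch below counts weight words fetched from off-chip memory for one fully connected layer. It rests on simplifying assumptions that are not from the slides: a conventional design with no weight caching refetches every weight on each inference, while a CIM array holds the weights in place. The layer size and function name are hypothetical.

```python
def off_chip_weight_fetches(rows, cols, inferences, weights_in_cim):
    """Count weight words read from off-chip memory.

    Assumption: without CIM, all rows*cols weights are fetched for every
    inference (no caching); with CIM, the weights already sit in the on-chip
    non-volatile array, so no weight is fetched at inference time.
    """
    if weights_in_cim:
        return 0
    return rows * cols * inferences


# Hypothetical 256-input, 64-output layer evaluated 1000 times.
conventional = off_chip_weight_fetches(256, 64, 1000, weights_in_cim=False)
with_cim = off_chip_weight_fetches(256, 64, 1000, weights_in_cim=True)

print(conventional, with_cim)
```

Real systems sit between these two extremes because of caches and on-chip buffers, so the count is an upper bound on the saving, not a measurement.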





# Lessons from the Human Brain

- The brain is far more efficient, at much smaller size, weight, and power (SWaP)
  - Brain: $3.6 \times 10^{15}$ synaptic operations per second at 12 W $\rightarrow$ $3 \times 10^{14}$ operations per second per watt
  - i9 CPU: 3 GHz at 140 W $\rightarrow$ $2 \times 10^{7}$ operations per second per watt

- Biological neural networks do not separate the compute device from the memory device
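The per-watt figures above follow from a simple division, sketched here. The CPU line assumes one operation per clock cycle at 3 GHz, which is a reading of the slide's numbers and not something the slide states; the variable names are illustrative.

```python
# Figures taken from the slide.
brain_ops_per_s = 3.6e15   # synaptic operations per second
brain_power_w = 12.0

# Assumption: one operation per clock cycle at 3 GHz.
cpu_ops_per_s = 3.0e9
cpu_power_w = 140.0

brain_ops_per_s_per_w = brain_ops_per_s / brain_power_w   # about 3e14
cpu_ops_per_s_per_w = cpu_ops_per_s / cpu_power_w         # about 2.1e7

# How many times more operations per watt the brain delivers under these figures.
efficiency_ratio = brain_ops_per_s_per_w / cpu_ops_per_s_per_w

print(f"brain: {brain_ops_per_s_per_w:.1e} ops/s/W")
print(f"cpu:   {cpu_ops_per_s_per_w:.1e} ops/s/W")
print(f"ratio: {efficiency_ratio:.1e}")
```

A synaptic operation and a CPU clock cycle are not equivalent units of work, so the ratio is an order-of-magnitude comparison only.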



# **Candidates for Analog Computing**

- Logic gates (e.g. NAND, XOR, etc.)  $\rightarrow$  No
- Transistor, capacitor, inductor, etc.  $\rightarrow$  Yes

- SRAM (Not able to store multi-bit, volatile)  $\rightarrow$  No
- MRAM (Not able to store multi-bit, non-volatile)  $\rightarrow$  No
- ReRAM (multi-bit, nonvolatile)  $\rightarrow$  Yes
- Flash (multi-bit, nonvolatile)  $\rightarrow$  Yes



- No process overhead in a standard logic process
- Leverages high-performance digital logic (scalable)



#### **Precise Analog Programming Scheme**



#### Logic Compatible Flash Based Synapse





Cell current proportional to X·W (=0µA,5µA,10µA)
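A minimal behavioral sketch of this relation is shown below. It assumes a binary input X in {0, 1}, a three-level weight W in {0, 1, 2}, and a 5 µA unit current, which together reproduce the 0 µA, 5 µA, and 10 µA levels quoted above. Summing the cell currents on a shared bitline is the usual way an analog CIM array forms a multiply-accumulate and is assumed here, not taken from the slide; all names are hypothetical.

```python
UNIT_CURRENT_UA = 5.0  # assumed: one unit of X*W corresponds to 5 µA


def cell_current_ua(x, w):
    """Current of one synapse cell, proportional to x * w."""
    if x not in (0, 1):
        raise ValueError("x must be 0 or 1 in this sketch")
    if w not in (0, 1, 2):
        raise ValueError("w must be 0, 1, or 2 in this sketch")
    return UNIT_CURRENT_UA * x * w


def bitline_current_ua(inputs, weights):
    """Total current on a shared bitline: the sum of the cell currents."""
    return sum(cell_current_ua(x, w) for x, w in zip(inputs, weights))


# Hypothetical 4-cell column.
inputs = [1, 0, 1, 1]
weights = [2, 2, 1, 0]

total_ua = bitline_current_ua(inputs, weights)   # 10 + 0 + 5 + 0 = 15 µA
mac_result = total_ua / UNIT_CURRENT_UA          # equals sum(x * w) = 3

print(total_ua, mac_result)
```

The sketch ignores device variation and noise, which a real array must handle through the precise programming scheme and readout circuitry.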



### 65nm Test Chip Summary



M. Kim et al., IEDM 2018





- High efficiency (171.1 TOPS/W) from the analog CIM architecture
- Recognition accuracy close to the SW model

Flash Memory Summit 2019, Santa Clara, CA



- There is a growing need to move AI computing toward the Edge
- Analog computing can improve power efficiency by computing neural networks approximately
- Analog compute-in-memory using logic-compatible embedded Flash memory is a strong candidate to overcome the memory bottleneck
- A test chip fabricated in a 65nm logic process shows a power efficiency of 171.1 TOPS/W
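For intuition, the TOPS/W figure can be converted into energy per operation, since one watt is one joule per second. The sketch below performs only that unit conversion; what counts as one "operation" is defined by the test-chip measurement and is not restated here.

```python
TOPS_PER_WATT = 171.1  # figure reported for the 65nm test chip

# 1 TOPS/W = 1e12 operations per second per watt = 1e12 operations per joule.
ops_per_joule = TOPS_PER_WATT * 1e12

# Energy per operation in femtojoules (1 J = 1e15 fJ).
energy_per_op_fj = 1e15 / ops_per_joule

print(f"{energy_per_op_fj:.2f} fJ per operation")  # prints "5.84 fJ per operation"
```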



#### THANK YOU FOR YOUR ATTENTION



info@anaflash.com



Always-on Local AI and NVM solution For Battery-Powered Smart Devices

3003 N. First St. #221 San Jose, CA 95134