

### The Open Source DRAM Simulator DRAMSys

#### Prof. Dr. Matthias Jung, JMU Würzburg













## DRAMSys in a Nutshell

#### Simulation and Design Space Exploration of Modern DRAM-based Memory Systems:

- Which DRAM configuration?
- When to support DDR5 or LPDDR5?
- How to configure the memory controller?
- What is the system-level application behavior?

#### **DRAMSys Offers:**

- High-speed and flexible models of all standards
- Fast and accurate design space exploration
- Early identification of bottlenecks
- Connection to cores (e.g. SystemC, gem5, ...)









# DRAMSys Open Source Model

- Open source: DDR3/4, LPDDR4, Wide I/O 1/2, GDDR5/X, GDDR6, and HBM2
- Commercial/academic licenses: DDR5, LPDDR5, HBM3, Trace Analyzer tool
- New standard models will be open-sourced when a level of maturity is reached
- Customer-specific consulting, modifications and developments



Thanks to our Key Partners:







#### The DRAM Device / Operation

- Invented by Robert H. Dennard in 1966







- Using Sub-Arrays for efficient wiring
- Bank parallelism, but banks share data and command bus



#### **Basic DRAM Operations**



#### Important DRAM Commands:


- **ACT**: Activates a specific row in a specific bank (sensing into the primary sense amplifiers, PSA) [*tRCD*]
- **RD**: Reads from the activated row (prefetch from the PSA to the secondary sense amplifiers, SSA, then burst out) [*tCL* + *tBURST*]
- **PRE**: Precharges the bank (sets LWL = 0 and LBL = VDD/2) [*tRP*]
- **REFA**: Refreshes the DRAM cells, which are leaky and lose their charge over time [*tREFI* and *tRFC*]
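To make the role of these timings concrete, the following minimal sketch adds them up for the three typical access cases (row hit, row miss, row conflict) and estimates the share of time lost to refresh. The numeric values and the names `Timings`, `read_latency`, and `refresh_overhead` are assumptions chosen for illustration; they are not taken from any datasheet or from DRAMSys, and further constraints such as tRAS are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Timings:
    """Hypothetical timing values in nanoseconds (illustrative only)."""
    tRCD: float = 14.0    # ACT -> RD
    tCL: float = 14.0     # RD -> first data
    tBURST: float = 4.0   # duration of the data burst
    tRP: float = 14.0     # PRE -> next ACT
    tREFI: float = 7800.0 # average interval between REFA commands
    tRFC: float = 350.0   # duration of one REFA


def read_latency(t, open_row, target_row):
    """Latency of one read, given the row currently open in the bank (or None)."""
    data = t.tCL + t.tBURST          # RD: column access plus burst
    if open_row == target_row:       # row hit: RD only
        return data
    if open_row is None:             # row miss (bank precharged): ACT, RD
        return t.tRCD + data
    return t.tRP + t.tRCD + data     # row conflict: PRE, ACT, RD


def refresh_overhead(t):
    """Fraction of time in which the device is blocked by REFA."""
    return t.tRFC / t.tREFI


t = Timings()
hit = read_latency(t, open_row=7, target_row=7)
miss = read_latency(t, open_row=None, target_row=7)
conflict = read_latency(t, open_row=3, target_row=7)
print(hit, miss, conflict, round(refresh_overhead(t), 4))
```

With these assumed values a row conflict costs more than twice as much as a row hit, which is why the choice of page policy and scheduling in the memory controller matters.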





#### JEDEC Standard: e.g. Timing Dependencies







#### **DRAMSys** Architecture



- Based on SystemC TLM2, compliant with TLM-AT coding style
- Flexible SW architecture to support various JEDEC DRAM standards (e.g., DDR4, LPDDR4, GDDR6, HBM, ...)
- For RTL-like accuracy a custom TLM protocol (DRAM-AT) is used





## **Custom TLM Protocol**

- Simulation speed can be increased by reducing the number of events
- Clock signal has the highest event generation rate
- Do we need to simulate each clock cycle to generate cycle-accurate results?



- Simulation of state changes is sufficient, idle clock cycles can be skipped!
  - Large event reduction at low memory access densities
  - No loss of accuracy
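The effect of skipping idle clock cycles can be sketched with a toy model. The code below is not the DRAM-AT protocol; it is a hypothetical single-resource server (the function names, the request times, and the service time are assumptions) that is simulated once per clock cycle and once per event, so that the completion times and the number of model evaluations can be compared.

```python
import heapq


def cycle_driven(requests, service_cycles, horizon):
    """Evaluate the model on every clock cycle up to `horizon`."""
    pending = sorted(requests)
    busy_until, done, evaluations, i = 0, [], 0, 0
    for cycle in range(horizon):
        evaluations += 1
        # Start the next request once it has arrived and the resource is free.
        if i < len(pending) and pending[i] <= cycle and busy_until <= cycle:
            busy_until = cycle + service_cycles
            done.append(busy_until)
            i += 1
    return done, evaluations


def event_driven(requests, service_cycles):
    """Evaluate the model only when a request arrives; time jumps between events."""
    events = list(requests)
    heapq.heapify(events)
    busy_until, done, evaluations = 0, [], 0
    while events:
        arrival = heapq.heappop(events)
        evaluations += 1
        start = max(arrival, busy_until)   # wait if the resource is still busy
        busy_until = start + service_cycles
        done.append(busy_until)
    return done, evaluations


requests = [10, 12, 5000, 90000]           # sparse trace (arrival cycles)
done_cycle, evals_cycle = cycle_driven(requests, service_cycles=20, horizon=100_000)
done_event, evals_event = event_driven(requests, service_cycles=20)
print(done_cycle == done_event, evals_cycle, evals_event)
```

Both variants report identical completion times, but the event-driven one needs one evaluation per request instead of one per clock cycle. The sparser the trace, the larger the gap.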





#### **Custom TLM Protocol**







### **DRAMSys Simulation Speed**



- Simulation of only the important events
- Speedup from 4x to 10,000x depending on trace density
- Average speedups depend on applications
- Typical values: 400x
- 100% RTL Accuracy





#### Number of DRAM Standards is Growing!







#### DDR5 JEDEC Standards



- Defines commands, states, timings and interface properties
- Very complex protocol
  - DDR3: 226 pages
  - DDR4: 266 pages
  - DDR5: 496 pages
- Descriptions are not formal
- And not even correct ...












## DRAMml: A Formal Description for JEDEC Standards

The ideal case: A *formal* language, which has the power to ...

Information of ~100 Pages in the Standard



A new standard requires a serious amount of handcraft:

- New models for fast simulation and verification
- Adapt memory models and HW IP every time





## Modeling All DRAM States and Commands



- Comprehensive Model with clear separation between states and commands
- Models only 5 state types
- Support of multiple banks i.e. bank parallelism
- Divided in several subnets:
  - Banks
  - Refresh
  - Power-Down
  - Bankgroups
  - Ranks
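The separation between states (places) and commands (transitions) can be illustrated with a tiny untimed Petri net per bank. This is a sketch under assumptions: the slides do not list the five state types, so the two places `IDLE` and `ACTIVE`, the class name `BankNet`, and the restriction to the ACT, RD, and PRE commands are chosen here for illustration only.

```python
class BankNet:
    """Untimed Petri net for one bank: places hold tokens, commands are transitions."""

    # transition -> (places it consumes a token from, places it produces a token in)
    TRANSITIONS = {
        "ACT": (["IDLE"], ["ACTIVE"]),
        "RD": (["ACTIVE"], ["ACTIVE"]),
        "PRE": (["ACTIVE"], ["IDLE"]),
    }

    def __init__(self):
        self.marking = {"IDLE": 1, "ACTIVE": 0}   # every bank starts precharged

    def enabled(self, cmd):
        consumed, _ = self.TRANSITIONS[cmd]
        return all(self.marking[place] > 0 for place in consumed)

    def fire(self, cmd):
        if not self.enabled(cmd):
            raise ValueError(f"{cmd} is not enabled in marking {self.marking}")
        consumed, produced = self.TRANSITIONS[cmd]
        for place in consumed:
            self.marking[place] -= 1
        for place in produced:
            self.marking[place] += 1


# One independent subnet per bank models bank parallelism.
banks = [BankNet() for _ in range(8)]
banks[0].fire("ACT")
banks[1].fire("ACT")      # a second bank can be activated independently
banks[0].fire("RD")
banks[0].fire("PRE")
print(banks[0].marking, banks[1].marking, banks[2].enabled("RD"))
```

A command that is illegal in the current state, such as RD on a precharged bank, is simply a transition that is not enabled.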





# Modeling DRAM Timing





- DRAMs feature a complex timing protocol
- E.g., 90 of the 260 pages of the DDR3 standard consist of timing diagrams and explanations of the timings
- DRAM command timing dependencies can be modeled by a timed inhibitor arc
  - For example, the arc from ACT to RD carries the delay $t_{RCD}$
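One way to read a timed inhibitor arc is: after the source transition fires, the target transition is blocked for the arc's delay. The sketch below implements this reading for a single bank. The class `TimedInhibitorNet`, the chosen arcs, and the timing values are assumptions for illustration and are not taken from the DRAMSys code base.

```python
class TimedInhibitorNet:
    """Transitions connected by timed inhibitor arcs: (source, target) -> delay."""

    def __init__(self, arcs):
        self.arcs = dict(arcs)
        self.last_fired = {}          # transition -> time of its last firing

    def earliest(self, target):
        """Earliest time at which no incoming arc inhibits `target` any more."""
        return max(
            (self.last_fired[src] + delay
             for (src, dst), delay in self.arcs.items()
             if dst == target and src in self.last_fired),
            default=0,
        )

    def fire(self, transition, now):
        ready = self.earliest(transition)
        if now < ready:
            raise ValueError(f"{transition} inhibited until t={ready}, got t={now}")
        self.last_fired[transition] = now


T_RCD, T_RAS, T_RP = 14, 33, 14       # hypothetical values in ns
net = TimedInhibitorNet({
    ("ACT", "RD"): T_RCD,             # ACT inhibits RD for tRCD
    ("ACT", "PRE"): T_RAS,            # ACT inhibits PRE for tRAS
    ("PRE", "ACT"): T_RP,             # PRE inhibits ACT for tRP
})

net.fire("ACT", now=0)
rd_ready = net.earliest("RD")
net.fire("RD", now=rd_ready)
pre_ready = net.earliest("PRE")
net.fire("PRE", now=pre_ready)
act_ready = net.earliest("ACT")
print(rd_ready, pre_ready, act_ready)
```

Firing a transition before its inhibitor arcs have expired raises an error, which is the behavior a protocol checker needs.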











## Modeling DRAM Timing (8 Banks)







#### Code Generation and Validation

- Timed Petri nets allow a formal representation of a DRAM protocol, but they cannot be handled graphically
- DRAMml is a DSL that describes DRAM behavior with Petri net semantics
- The DSL serves as the basis for correct-by-construction generation of DRAMSys TLM code
- For example, implementing the DDR5 model took 2 weeks from the release of the standard







#### DDR3

`ddr3.py`:

```python
from drampyml.component_levels import *
from drampyml.commands import *
from drampyml.dram import Dram
from drampyml.constants import *
from drampyml.syntax import Max
from drampyml.busses import CommandBus
from drampyml.conditions import *
from drampyml.constraints import *

# Timings derived from the base timings of the standard
tBURST = Variable("tBURST", defaultBurstLength / dataRate * tCK)
tRDWR = Variable("tRDWR", tRL + tBURST + tCK * 2 - tWL)
tRDWR_R = Variable("tRDWR_R", tRL + tBURST + tRTRS - tWL)
tWRRD = Variable("tWRRD", tWL + tBURST + tWTR - tAL)
tWRRD_R = Variable("tWRRD_R", tWL + tBURST + tRTRS - tRL)
tWRPRE = Variable("tWRPRE", tWL + tBURST + tWR)
tRDPDEN = Variable("tRDPDEN", tRL + tBURST + tCK)
tWRPDEN = Variable("tWRPDEN", tWL + tBURST + tWR)
tWRAPDEN = Variable("tWRAPDEN", tWL + tBURST + tWR + tCK)

custom_timings = [
    tBURST,
    tRDWR,
    tRDWR_R,
    tWRRD,
    tWRRD_R,
    tWRPRE,
    tRDPDEN,
    tWRPDEN,
    tWRAPDEN,
]

# fmt: off
# Each constraint: (level, previous commands, next commands, minimum delay)
command_timing_constraints = [
    # Bank
    CommandTimingConstraint(Bank, [ACT], [PREPB], tRAS),
    CommandTimingConstraint(Bank, [ACT], [RD, WR, MWR, RDA, WRA, MWRA], tRCD - tAL),
    CommandTimingConstraint(Bank, [ACT], [ACT], tRC),
    CommandTimingConstraint(Bank, [RD], [PREPB], tAL + tRTP),
    CommandTimingConstraint(Bank, [RD], [WR, MWR, WRA, MWRA], tRDWR),
    CommandTimingConstraint(Bank, [RDA], [ACT], tAL + tRTP + tRP),
    CommandTimingConstraint(Bank, [WR, MWR], [PREPB], tWRPRE),
    CommandTimingConstraint(Bank, [WR, MWR], [WR, MWR, WRA, MWRA], tCCD),
    # (one further Bank-level [WR, MWR] constraint is illegible in the source)
    CommandTimingConstraint(Bank, [WR, MWR], [RDA], Max(tWRRD, tWRPRE - tRTP - tAL)),
    CommandTimingConstraint(Bank, [WRA, MWRA], [ACT], tWRPRE + tRP),
    CommandTimingConstraint(Bank, [PREPB], [ACT], tRP),

    # Rank
    CommandTimingConstraint(Rank, [ACT], [PREAB], tRAS),
    CommandTimingConstraint(Rank, [ACT], [ACT], tRRD),
    CommandTimingConstraint(Rank, [ACT], [PDEA], tACTPDEN),
    CommandTimingConstraint(Rank, [ACT], [REFAB, SREFEN], tRC),
    CommandTimingConstraint(Rank, [RD], [PREAB], tAL + tRTP),
    CommandTimingConstraint(Rank, [RD, RDA], [PDEA, PDEP], tRDPDEN),
    CommandTimingConstraint(Rank, [RD, RDA], [RD, RDA], tCCD),
    CommandTimingConstraint(Rank, [RD, RDA], [WR, MWR, WRA, MWRA], tRDWR),
    CommandTimingConstraint(Rank, [RDA], [REFAB], tAL + tRTP + tRP),
    CommandTimingConstraint(Rank, [RDA], [PREAB], tAL + tRTP),
    CommandTimingConstraint(Rank, [RDA], [SREFEN], Max(tRDPDEN, tAL + tRTP + tRP)),
    CommandTimingConstraint(Rank, [WR, MWR], [PDEA], tWRPDEN),
    CommandTimingConstraint(Rank, [WRA, MWRA], [PDEA, PDEP], tWRAPDEN),
    CommandTimingConstraint(Rank, [WR, MWR, WRA, MWRA], [WR, MWR, WRA, MWRA], tCCD),
    CommandTimingConstraint(Rank, [WR, MWR, WRA, MWRA], [RD, RDA], tWRRD),
    CommandTimingConstraint(Rank, [WRA, MWRA], [REFAB], tWRPRE + tRP),
    CommandTimingConstraint(Rank, [WRA, MWRA], [PREAB], tWRPRE),
    CommandTimingConstraint(Rank, [WRA, MWRA], [SREFEN], Max(tWRAPDEN, tWRPRE + tRP)),
    CommandTimingConstraint(Rank, [PREPB], [REFAB], tRP),
    CommandTimingConstraint(Rank, [PREPB], [PDEA, PDEP], tPRPDEN),
    CommandTimingConstraint(Rank, [PREPB], [SREFEN], tRP),
    CommandTimingConstraint(Rank, [PREAB], [ACT, REFAB, SREFEN], tRP),
    CommandTimingConstraint(Rank, [PREAB], [PDEP], tPRPDEN),
    CommandTimingConstraint(Rank, [PDEP], [PDXP], tPD),
    CommandTimingConstraint(Rank, [PDEA], [PDXA], tPD),
    CommandTimingConstraint(Rank, [PDXA], [PDEA], tCKE),
    CommandTimingConstraint(Rank, [PDXA], [PDEP], tCKE),
    CommandTimingConstraint(Rank, [PDXP], [REFAB, SREFEN, ACT], tXP),
    CommandTimingConstraint(Rank, [PDXA], [ACT, PREPB, PREAB, RD, RDA, WR, MWR, WRA, MWRA], tXP),
    CommandTimingConstraint(Rank, [REFAB], [ACT, REFAB, SREFEN], tRFC),
    CommandTimingConstraint(Rank, [REFAB], [PDEP], tREFPDEN),
    CommandTimingConstraint(Rank, [SREFEX], [ACT, REFAB, PDEP, SREFEN], tXS),
    CommandTimingConstraint(Rank, [SREFEX], [RD, RDA, WR, MWR, WRA, MWRA], tXSDLL),
    CommandTimingConstraint(Rank, [SREFEX], [SREFEX], tCKESR),

    # Channel
    CommandTimingConstraint(Channel, [RD, RDA], [RD, RDA], tBURST + tRTRS, [], Different(Rank)),
    CommandTimingConstraint(Channel, [RD, RDA], [WR, MWR, WRA, MWRA], tRDWR_R, [], Different(Rank)),
    CommandTimingConstraint(Channel, [WR, MWR, WRA, MWRA], [WR, MWR, WRA, MWRA], tBURST + tRTRS, [], Different(Rank)),
    CommandTimingConstraint(Channel, [WR, MWR, WRA, MWRA], [RD, RDA], tWRRD_R, [], Different(Rank)),
]
# fmt: on

# Four-activate-window constraint per rank
faw_constraints = [FAWConstraint([ACT], Rank, tFAW)]

# Commands that occupy the command bus
bus_commands = {
    ACT,
    RD,
    RDA,
    WR,
    WRA,
    MWR,
    MWRA,
    PREPB,
    PREAB,
    PDEP,
    PDEA,
    PDXA,
    PDXP,
    REFAB,
    SREFEX,
    SREFEN,
}

command_bus = CommandBus("Bus", bus_commands)

dram = Dram(
    "DDR3",
    [command_bus],
    command_timing_constraints,
    faw_constraints,
    custom_timings,
)
```
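How a table of such constraints could be turned into a timing check is sketched below. This is not the generated DRAMSys code and does not use `drampyml`: `Constraint`, `Issued`, and `earliest_issue` are hypothetical stand-ins, the timing values are made up, and tAL is assumed to be 0 so that the ACT-to-RD delay is simply tRCD.

```python
from collections import namedtuple

# Hypothetical stand-ins for the DSL objects in the listing above.
Constraint = namedtuple("Constraint", "level prev_cmds next_cmds delay")
Issued = namedtuple("Issued", "time cmd bank")

T = {"tRCD": 14, "tRAS": 33, "tRP": 14, "tRRD": 6}   # made-up values in ns

constraints = [
    Constraint("Bank", ["ACT"], ["RD", "WR"], T["tRCD"]),
    Constraint("Bank", ["ACT"], ["PREPB"], T["tRAS"]),
    Constraint("Bank", ["PREPB"], ["ACT"], T["tRP"]),
    Constraint("Rank", ["ACT"], ["ACT"], T["tRRD"]),
]


def earliest_issue(history, cmd, bank):
    """Earliest time at which `cmd` may be issued to `bank` (single rank assumed)."""
    earliest = 0
    for c in constraints:
        if cmd not in c.next_cmds:
            continue
        for past in history:
            if past.cmd not in c.prev_cmds:
                continue
            # Bank-level constraints only apply to commands on the same bank.
            if c.level == "Bank" and past.bank != bank:
                continue
            earliest = max(earliest, past.time + c.delay)
    return earliest


history = [Issued(time=0, cmd="ACT", bank=0)]
rd_same_bank = earliest_issue(history, "RD", bank=0)      # limited by tRCD
act_other_bank = earliest_issue(history, "ACT", bank=1)   # limited by tRRD
pre_same_bank = earliest_issue(history, "PREPB", bank=0)  # limited by tRAS
print(rd_same_bank, act_other_bank, pre_same_bank)
```

The sketch only covers timing. Whether a command is legal in the current bank state is a separate check.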













#### Where to get the tools:

- **DRAMSys:** https://github.com/tukl-msd/DRAMSys
- **gem5:** https://gem5.googlesource.com/
- **DRAMPower:** https://github.com/tukl-msd/DRAMPower
- **3D-ICE:** https://github.com/esl-epfl/3d-ice




















