| --- |
| title: "ISCA 2018" |
| date: 2018-05-13T18:51:37-04:00 |
| draft: false |
| weight: 10 |
| permalink: events/isca-2018 |
| --- |
| |
| <div style="font-size:150%;border:none;margin: 0;padding:.1em;text-align:center;color:#000"> |
| |
| AMD gem5 APU Simulator: Modeling GPUs Using the Machine |
| ISA |
| |
| </div> |
| |
| <div style="font-size:120%;border:none;margin:0;padding:.1em;text-align:center;color#000"> |
| |
| Held in conjunction with [ISCA 2018](http://iscaconf.org/isca2018/). |
| June 2nd, 2018. |
| |
| </div> |
| |
| # Important Dates |
| |
| The tutorial will be held on day one of the conference - June 2nd, 2018 |
| |
| ISCA 2018 early registration and hotel reservation deadline - April |
| 16th, 2018 |
| |
| # Abstract |
| |
| AMD Research has developed an APU (Accelerated Processing Unit) model |
| that extends gem5 \[1\] with a GPU timing model that executes the GCN |
| (Graphics Core Next) generation 3 machine ISA \[2, 3\]. In addition to |
| supporting a modern machine ISA, the model supports running the |
| open-source Radeon Open Compute platform (ROCm) stack without |
| modification. This allows users to run a wide variety of applications |
| written in several high-level languages, including C++, HIP, OpenMP, and |
| OpenCL. This provides researchers the ability to evaluate many different |
| types of workloads, from traditional compute applications to emerging |
| modern GPU workloads, such as task parallel and machine learning |
| applications. The resulting AMD gem5 APU simulator is a cycle-level, |
| flexible research model that is capable of representing many different |
| APU configurations, on-chip cache hierarchies, and system designs. Our |
| APU extensions allow researchers to model both CPU and GPU memory |
| requests and the interactions between them. In particular, the model |
| uses SLICC and Ruby to implement a wide variety of coherence and |
| synchronization solutions, which is a critical research area in |
| heterogeneous computing. The model has been used in several top-tier |
| computer architecture publications in the last several years \[MICRO |
| 2013, HPCA 2014, ASPLOS 2014, ISCA 2014, HPCA 2015, ASPLOS 2015, MICRO |
| 2016, HPCA 2017, ISCA 2017, HPCA 2018\]. |
| |
| In this tutorial, we will describe the capabilities of the AMD gem5 APU |
| simulator that will be publically released with a liberal BSD license |
| before ISCA 2018. We will detail the simulated APU architecture, review |
| the execution flow, and describe how the simulator has been used. The |
| presentation will also discuss key design decisions and tradeoffs. For |
| example, we use the system-call emulation mode to avoid running a full |
| OS and kernel driver, therefore we will describe the simulator’s |
| system-call emulation interface, and how the ROCm runtime and user space |
| drivers interact with it. Also, our GPU model now directly executes |
| native machine ISA instructions rather than the HSAIL intermediate |
| language representation. Previously relying on executing the |
| intermediate language simplified workload compilation, but was less |
| accurate when modeling hardware behavior. In this tutorial, we will |
| highlight many of the improvements enabled by executing the GCN3 ISA. |
| |
| \[1\]. Nathan Binkert et al. [The gem5 |
| Simulator](https://doi.org/10.1145/2024716.2024718). In SIGARCH Computer |
| Architecture News, vol. 39, no. 2, pp. 1-7, Aug. 2011. |
| |
| \[2\]. AMD. [AMD GCN3 ISA Architecture |
| Manual](https://gpuopen.com/compute-product/amd-gcn3-isa-architecture-manual/) |
| |
| \[3\]. Anthony Gutierrez et al. [Lost in Abstraction: Pitfalls of |
| Analyzing GPUs at the Intermediate Language |
| Level](https://doi.org/10.1109/HPCA.2018.00058). In HPCA 2018. |
| |
| # Slides |
| |
| # Schedule |
| |
| | Topic | Presenter | Time | |
| | ------------------------------- | -------------- | -------------- | |
| | Background | Tony | 8:00-8:15 am | |
| | ROCm Stack, GCN3 ISA, and uArch | Tony | 8:15-9:15 am | |
| | HSA Queuing | Sooraj | 9:15-10:00 am | |
| | Break | 10:00-10:30 am | |
| | Ruby and GPU Protocol Tester | Tuan | 10:30-11:15 am | |
| | Demo/Workloads and Q+A | TBD | 11:15-12:00 pm | |
| |
| # Presenters |
| |
| Tony Gutierrez (AMD Research) |
| |
| Sooraj Puthoor (AMD Research) |
| |
| Brad Beckmann (AMD Research) |
| |
| Tuan Ta (Cornell) |