---
title: "ISCA 2018"
date: 2018-05-13T18:51:37-04:00
draft: false
weight: 10
permalink: events/isca-2018
---
<div style="font-size:150%;border:none;margin: 0;padding:.1em;text-align:center;color:#000">
AMD gem5 APU Simulator: Modeling GPUs Using the Machine
ISA
</div>
<div style="font-size:120%;border:none;margin:0;padding:.1em;text-align:center;color:#000">
Held in conjunction with [ISCA 2018](http://iscaconf.org/isca2018/).
June 2nd, 2018.
</div>
# Important Dates
The tutorial will be held on day one of the conference - June 2nd, 2018.

ISCA 2018 early registration and hotel reservation deadline - April
16th, 2018.
# Abstract
AMD Research has developed an APU (Accelerated Processing Unit) model
that extends gem5 \[1\] with a GPU timing model that executes the GCN
(Graphics Core Next) generation 3 machine ISA \[2, 3\]. In addition to
supporting a modern machine ISA, the model supports running the
open-source Radeon Open Compute platform (ROCm) stack without
modification. This allows users to run a wide variety of applications
written in several high-level languages, including C++, HIP, OpenMP, and
OpenCL. This provides researchers the ability to evaluate many different
types of workloads, from traditional compute applications to emerging
modern GPU workloads, such as task parallel and machine learning
applications. The resulting AMD gem5 APU simulator is a cycle-level,
flexible research model that is capable of representing many different
APU configurations, on-chip cache hierarchies, and system designs. Our
APU extensions allow researchers to model both CPU and GPU memory
requests and the interactions between them. In particular, the model
uses SLICC and Ruby to implement a wide variety of coherence and
synchronization solutions, a critical research area in
heterogeneous computing. The model has been used in several top-tier
computer architecture publications in the last several years \[MICRO
2013, HPCA 2014, ASPLOS 2014, ISCA 2014, HPCA 2015, ASPLOS 2015, MICRO
2016, HPCA 2017, ISCA 2017, HPCA 2018\].

In this tutorial, we will describe the capabilities of the AMD gem5 APU
simulator, which will be publicly released under a liberal BSD license
before ISCA 2018. We will detail the simulated APU architecture, review
the execution flow, and describe how the simulator has been used. The
presentation will also discuss key design decisions and tradeoffs. For
example, we use the system-call emulation mode to avoid running a full
OS and kernel driver, so we will describe the simulator's system-call
emulation interface and how the ROCm runtime and user-space drivers
interact with it. In addition, our GPU model now directly executes
native machine ISA instructions rather than the HSAIL intermediate
language representation. Executing the intermediate language, as the
model previously did, simplified workload compilation but was less
accurate at modeling hardware behavior. In this tutorial, we will
highlight many of the improvements enabled by executing the GCN3 ISA.

\[1\] Nathan Binkert et al. [The gem5
Simulator](https://doi.org/10.1145/2024716.2024718). In SIGARCH Computer
Architecture News, vol. 39, no. 2, pp. 1-7, Aug. 2011.

\[2\] AMD. [AMD GCN3 ISA Architecture
Manual](https://gpuopen.com/compute-product/amd-gcn3-isa-architecture-manual/).

\[3\] Anthony Gutierrez et al. [Lost in Abstraction: Pitfalls of
Analyzing GPUs at the Intermediate Language
Level](https://doi.org/10.1109/HPCA.2018.00058). In HPCA 2018.
# Slides
# Schedule
| Topic                           | Presenter | Time              |
| ------------------------------- | --------- | ----------------- |
| Background                      | Tony      | 8:00-8:15 am      |
| ROCm Stack, GCN3 ISA, and uArch | Tony      | 8:15-9:15 am      |
| HSA Queuing                     | Sooraj    | 9:15-10:00 am     |
| Break                           |           | 10:00-10:30 am    |
| Ruby and GPU Protocol Tester    | Tuan      | 10:30-11:15 am    |
| Demo/Workloads and Q+A          | TBD       | 11:15 am-12:00 pm |
# Presenters
Tony Gutierrez (AMD Research)

Sooraj Puthoor (AMD Research)

Brad Beckmann (AMD Research)

Tuan Ta (Cornell)