 ---
 title: "ARM research summit 2017"
 date: 2018-05-13T18:51:37-04:00
 draft: false
 weight: 1000
 permalink: events/arm-summit-2017
 ---

 The [ARM Research Summit](https://developer.arm.com/research/summit) is
 an academic summit to discuss future trends and disruptive technologies
 across all sectors of computing. On the first day of the Summit, ARM
 Research will host a gem5 workshop to give a brief overview of gem5 for
 computer engineers who are new to gem5 and dive deeper into some of
 gem5's more advanced capabilities. The attendees will learn what gem5
 can and cannot do, how to use and extend gem5, as well as how to
 contribute back to gem5.

 The ARM Research Summit will take place in Cambridge (UK) from 11 to 13
 September 2017. The gem5 workshop will be a full-day event on 11
 September.

 # Streaming & Offline viewing

 The workshop is being streamed live and all talks will be available on
 YouTube after the workshop. See the [main summit
 page](https://developer.arm.com/research/summit/summit-live) for
 details.

 # Target Audience

 The primary audience is researchers who are using, or planning to use,
 gem5 for architecture research.

 **Prerequisites**: Attendees are expected to have a working knowledge of
 C++, Python, and computer systems.

 # Registration

 See the main [ARM Research Summit
 website](https://developer.arm.com/research/summit) for details about
 registration.

 # Schedule

 The workshop will take place on Monday 11 September 2017 at Robinson
 College in Cambridge (UK). The workshop starts at 9.00 and runs in
 parallel with the main Summit program until 16.30, when it joins the
 main program.

 | Time        | Topic                                                                                                                                                                             |
 | ----------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
 | 09.00-09.30 | Welcome and introduction to gem5 — [slides](Media:Summit2017_Intro_to_gem5.pdf "wikilink")                                                                                        |
 | 09.30-09.45 | [Interacting with gem5 using workload-automation & devlib](#WA "wikilink") — [slides](Media:Summit2017_wa_devlib.pdf "wikilink")                                                  |
 | 09.45-10.00 | [ARM Research Starter Kit: System Modeling using gem5](#StarterKit "wikilink") — [slides](Media:Summit2017_starterkit.pdf "wikilink")                                             |
 | 10.00-10.15 | Break                                                                                                                                                                             |
 | 10.15-10.30 | [Debugging a target-agnostic JIT compiler with GEM5](#JIT_Debugging "wikilink")                                                                                                   |
 | 10.30-11.00 | [Learning gem5: Modeling Cache Coherence with gem5](#Ruby "wikilink") — [slides](Media:Summit2017_learning_gem5_ruby.pdf "wikilink")                                              |
 | 11.00-11.15 | Break (overlaps with main program break)                                                                                                                                          |
 | 11.15-11.45 | [A Detailed On-Chip Network Model inside a Full-System Simulator](#Garnet2 "wikilink") — [slides](Media:Summit2017_garnet2.0_tutorial.pdf "wikilink")                             |
 | 11.45-12.00 | [Integrating and quantifying the impact of low power modes in the DRAM controller in gem5](#DRAMPower "wikilink") — [slides](Media:Summit2017_drampower.pdf "wikilink")           |
 | 12.00-12.15 | Break                                                                                                                                                                             |
 | 12.15-12.45 | [CPU power estimation using PMCs and its application in gem5](#PowMon "wikilink") — [slides](Media:Summit2017_powmon.pdf "wikilink")                                              |
 | 12.45-13.00 | [gem5: empowering the masses](#PowerFramework "wikilink") — [slides](Media:Summit2017_powerframework.pdf "wikilink")                                                              |
 | 13.00-14.15 | Lunch                                                                                                                                                                             |
 | 14.15-14.45 | [Trace-driven simulation of multithreaded applications in gem5](#ElasticSimMATE "wikilink") — [slides](Media:Summit2017_elasticsimmate.pdf "wikilink")                            |
 | 14.45-15.00 | [Generating Synthetic Traffic for Heterogeneous Architectures](#TraceGeneration "wikilink") — [slides](Media:Summit2017_trace_generation.pdf "wikilink")                          |
 | 15.00-15.15 | Break                                                                                                                                                                             |
 | 15.15-15.45 | [System Simulation with gem5, SystemC and other Tools](#SystemC "wikilink") — [slides](Media:Summit2017_systemc.pdf "wikilink")                                                   |
 | 15.45-16.00 | [COSSIM: An Integrated Solution to Address the Simulator Gap for Parallel Heterogeneous Systems](#COSSIM "wikilink") — [slides](Media:Summit2017_COSSIM.pdf "wikilink")           |
 | 16.00-16.15 | [Simulation of Complex Systems Incorporating Hardware Accelerators](#ComplexSystems "wikilink") — [slides](Media:Summit2017_complex_fs_incorporating_accelerators.pdf "wikilink") |
 | 16.15-16.30 | Break                                                                                                                                                                             |
 | 16.30-18.15 | Introduction to ARM Research                                                                                                                                                      |
 | 18.20-20.00 | Poster Session & Pre-Dinner Drinks                                                                                                                                                |
 | 20.00-21.30 | Buffet Dinner                                                                                                                                                                     |

 # Talks

 <span id="ElasticSimMATE">

 ## Trace-driven simulation of multithreaded applications in gem5

 The gem5 modular simulator provides a rich set of CPU models that
 permit balancing simulation speed and accuracy. The growing interest in
 using gem5 for design-space exploration, however, requires higher
 simulation speeds to enable scalability analysis of systems comprising
 tens to hundreds of cores. One relevant approach for achieving
 significant speedups is trace-driven simulation, in which CPU cores are
 abstracted away so that simulation effort can be refocused on the
 memory and interconnect subsystems, which play a key role in
 performance. This talk describes some of the work carried out in the
 Mont-Blanc European projects on trace-driven simulation and discusses
 the related challenges for multicore architectures, in which trace
 injection must account for the API synchronization of the underlying
 running application. The ElasticSimMATE tool is presented as an
 initiative towards combining Elastic Traces and SimMATE to enable fast
 and accurate simulation of multithreaded applications on ARM multicore
 systems.

 > **Dr Gilles Sassatelli** is a CNRS senior scientist at LIRMM, a
 > CNRS-University of Montpellier academic research unit with a staff of
 > over 400. He is vice-head of the microelectronics department and leads
 > a group of 20 researchers working in the area of smart embedded
 > digital systems. He has authored over 200 peer-reviewed papers and has
 > occupied key roles in a number of international conferences. Most of
 > his research is conducted in the frame of international EU-funded
 > projects such as the DreamCloud and Mont-Blanc projects.

 > **Alejandro Nocua** received the Ph.D. degree in Microelectronics from
 > the University of Montpellier, France, in 2016. Currently, he is a
 > postdoctoral researcher at the French National Center for Scientific
 > Research (CNRS). His research interests include the analysis of
 > high-performance and energy-efficient design methodologies. He
 > received his Master's degree in Science from the National Institute of
 > Astrophysics, Optics and Electronics (INAOE), Mexico, in 2013.
 > Alejandro was awarded his BS degree in Electronics Engineering from
 > the Industrial University of Santander (UIS), Colombia, in 2011.

 > **Florent Bruguier** received the M.S. and Ph.D. degrees in
 > microelectronics from the University of Montpellier, France, in 2009
 > and 2012, respectively. From 2012 to 2015, he was a Scientific
 > Assistant with the Montpellier Laboratory of Informatics, Robotics,
 > and Microelectronics, University of Montpellier. Since 2015, he has
 > been a Permanent Associate Professor. He has co-authored over 30
 > publications. His research interests are focused on self-adaptive and
 > secure approaches for embedded systems.

 > **Anastasiia Butko**, Ph.D. is a Postdoctoral Fellow in the
 > Computational Research Division at Lawrence Berkeley National
 > Laboratory (LBNL), CA. Her research interests lie in the general area
 > of computer architecture, with particular emphasis on high-performance
 > computing, emerging and heterogeneous technologies, associated
 > parallel programming and architectural simulation techniques. Broadly,
 > her research addresses the question of how alternative technologies
 > can provide continuing performance scaling in the approaching
 > Post-Moore’s Law era. Her primary research projects include
 > development of EDA tools for fast superconducting logic design,
 > development of a classical ISA for quantum processor control, and
 > development of fast and flexible System-on-Chip generators using the
 > Chisel DSL. Dr. Butko co-leads the Open Source Supercomputing project
 > and is a technical committee member of the RISC-V foundation.
 >
 > Dr. Butko received her Ph.D. in Microelectronics from the University
 > of Montpellier, France (2015). Her doctoral thesis developed fast and
 > accurate simulation techniques for many-core architectures
 > exploration. Her graduate work has been conducted within the European
 > project MontBlanc, which aims to design a new supercomputer
 > architecture using low-power embedded technologies.
 >
 > Dr. Butko received her MSc degree in Microelectronics from UM2,
 > France, and MSc and BSc degrees in Digital Electronics from NTUU "KPI",
 > Ukraine. During her Master's studies she participated in the
 > international double-diploma program between the Montpellier and Kiev
 > universities.

 </span>

 <span id="Ruby">

 ## Modeling Cache Coherence with gem5

 Correctly implementing cache coherence protocols is hard, and the
 implementation details can affect the system's performance. Therefore,
 it is important to robustly model the detailed cache coherence
 implementation. The popular computer architecture simulator gem5 uses
 Ruby as its cache coherence model, providing higher-fidelity cache
 coherence modeling than many other simulators.

 In this talk, I will give a brief overview of Ruby, including SLICC: the
 domain-specific language Ruby uses to specify cache protocols. I will
 show the extreme flexibility of this model and details of a simple cache
 coherence protocol. After this talk, you will be able to dive in and
 begin writing your own coherence protocols!

 > **Jason Lowe-Power** is an Assistant Professor at the University of
 > California, Davis, in the Computer Science department. Jason's research
 > focuses on increasing the energy efficiency and performance of
 > end-to-end applications like analytic database operations used by
 > Amazon, Google, Target, etc. One important aspect of this research is
 > adding hardware mechanisms to systems that enable all programmers to
 > use emerging hardware accelerators like GPUs. Additionally, Jason is a
 > leader of the open-source architectural simulator, gem5, used by over
 > 1500 academic papers. Jason received his PhD from University of
 > Wisconsin-Madison in Summer 2017. He was awarded the Wisconsin
 > Distinguished Graduate Fellowship Cisco Computer Sciences Award in
 > 2014 and 2015.

 </span>

 <span id="Garnet2">

 ## A Detailed On-Chip Network Model inside a Full-System Simulator

 Compute systems are ubiquitous, with form factors ranging from
 smartphones at the edge to datacenters in the cloud. Chips in all these
 systems today comprise 10s to 100s of homogeneous/heterogeneous cores or
 processing elements. The growing emphasis on parallelism, distributed
 computing, heterogeneity, and energy-efficiency across all these systems
 makes the design of the Network-on-Chip (NoC) fabric connecting the
 cores critical to both high performance and low power consumption.

 It is imperative to model the details of the NoC when architecting and
 exploring the design-space of a complex many-core system. If these
 details are ignored, an inaccurate NoC model could lead to over-design
 or under-design due to incorrect trade-off choices, causing performance
 losses at runtime. To this end, we have designed and integrated a
 detailed on-chip network model called Garnet inside the gem5
 (www.gem5.org) full-system architectural simulator, which is being used
 extensively by both industry and academia. Together with Garnet, gem5
 provides plug-and-play models of cores, caches, cache coherence
 protocols, NoC, memory controller, and DRAM, with varying levels of
 detail, enabling computer architects and designers to trade off
 simulation speed and accuracy.

 In this talk, we will first introduce the basic building blocks of NoCs
 and present the state-of-the-art used in chips today. We will then
 present Garnet, and demonstrate how it faithfully models the
 state-of-the-art, while also offering immense flexibility in modifying
 various parts of the microarchitecture to serve the needs of both
 homogeneous many-cores and heterogeneous accelerator-based systems of
 the future via case studies and code-snippets. Finally, we will
 demonstrate how Garnet works within the entire gem5 ecosystem.

 > **Tushar Krishna** is an Assistant Professor in the Schools of ECE and
 > CS at Georgia Tech. He received a Ph.D. in Electrical Engineering and
 > Computer Science from the Massachusetts Institute of Technology in
 > 2014. Prior to that, he received an M.S.E. from Princeton University in
 > 2009, and a B.Tech from the Indian Institute of Technology (IIT) Delhi
 > in 2007, both in Electrical Engineering.
 >
 > Before joining Georgia Tech in 2015, Dr. Krishna was a post-doctoral
 > researcher in the VSSAD Group at Intel, Massachusetts, and then at the
 > Singapore-MIT Alliance for Research and Technology at MIT.
 >
 > Dr. Krishna's research interests are in computer architecture,
 > interconnection networks, networks-on-chip, deep learning
 > accelerators, and FPGAs.

 </span>

 <span id="SystemC">

 ## System Simulation with gem5, SystemC and other Tools

 SystemC TLM based virtual prototypes have become the main tool in
 industry and research for concurrent hardware and software development,
 as well as hardware design space exploration. However, there is a lack
 of accurate, free, changeable and realistic SystemC models of modern
 CPUs. Therefore, many researchers use the cycle-accurate open-source
 system simulator gem5, which has been developed in parallel to the
 SystemC standard. In this tutorial we present the coupling of gem5
 with SystemC that offers full interoperability between both simulation
 frameworks, and therefore enables a huge set of possibilities for system
 level design space exploration. Furthermore, we show several examples
 for coupling gem5 with SystemC and other tools.

 > **Matthias Jung** received his PhD degree in Electrical Engineering
 > from the University of Kaiserslautern, Germany, in 2017. His research
 > interests are SystemC-based virtual prototypes, with a particular
 > focus on the modeling of memory systems and memory controller design.
 > Since May 2017 he has been a researcher at Fraunhofer IESE,
 > Kaiserslautern, Germany.

 > **Christian Menard** received a Diploma degree in Information Systems
 > Technology from TU Dresden in Germany in 2016 and joined the chair for
 > compiler construction as a Ph.D. student within the excellence cluster
 > cfaed in TU Dresden. His current research includes system-level
 > modeling of widely heterogeneous hardware as well as dataflow
 > compilers for heterogeneous MPSoC platforms.

 </span>

 <span id="PowMon">

 ## CPU power estimation using PMCs and its application in gem5

 Fast and accurate estimation of CPU power consumption is necessary to
 inform run-time power management approaches and allow effective design
 space exploration. Power simulators, combined with a full-system
 architectural simulator such as gem5, enable power-performance
 trade-offs to be investigated early in the design of a system. However,
 the accuracy of existing power simulators is known to be low, and this
 can lead to incorrect conclusions being made. In this talk, I will
 present our statistically rigorous methodology for building accurate
 run-time power models using Performance Monitoring Counters (PMCs) for
 mobile and embedded devices, and demonstrate how our models make more
 efficient use of limited training data and better adapt to unseen
 scenarios by uniquely considering stability. Models built using the
 methodology for both ARM Cortex-A7 and Cortex-A15 CPUs exhibit a 3.8%
 and 2.8% average error respectively. I will also present online
 resources that we have made available from the work, including software
 tools, documentation, raw data and further results. Finally, I will
 present results from an investigation into the correlation between gem5
 activity statistics and hardware PMCs. Based on this, a gem5 power model
 for a simulated quad-core ARM Cortex-A15 has been built using the above
 methodology, and its accuracy compared against experimental results
 obtained from hardware.
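
As a rough illustration of what a PMC-based power model can look like, the sketch below evaluates a linear model (an intercept plus one weighted term per counter) and computes its average percentage error against measurements. It is not the speaker's tooling or methodology: the linear form, the event names, the coefficients and the measured values are all assumptions invented for this example.

```python
# Hypothetical linear PMC power model: P = intercept + sum(coef_i * rate_i).
# Event names, coefficients and measurements are made up for illustration.

def estimate_power(pmc_rates, coefficients, intercept):
    """Return estimated power in watts from per-second PMC event rates."""
    return intercept + sum(
        coefficients[name] * rate for name, rate in pmc_rates.items()
    )


def mean_absolute_percentage_error(measured, estimated):
    """Average of |measured - estimated| / measured, in percent."""
    errors = [abs(m - e) / m for m, e in zip(measured, estimated)]
    return 100.0 * sum(errors) / len(errors)


# Assumed model parameters (watts per event-per-second, and static watts).
COEFFICIENTS = {
    "cycles": 4.0e-10,
    "instructions_retired": 2.0e-10,
    "l2_misses": 5.0e-8,
}
INTERCEPT_W = 0.25

# Assumed samples: (PMC event rates, measured power in watts).
SAMPLES = [
    ({"cycles": 1.0e9, "instructions_retired": 1.5e9, "l2_misses": 2.0e6}, 1.000),
    ({"cycles": 5.0e8, "instructions_retired": 5.0e8, "l2_misses": 1.0e6}, 0.625),
]

estimates = [estimate_power(rates, COEFFICIENTS, INTERCEPT_W) for rates, _ in SAMPLES]
measured = [power for _, power in SAMPLES]
average_error = mean_absolute_percentage_error(measured, estimates)
print(f"average error: {average_error:.2f}%")
```

In practice the coefficients would be fitted to training data rather than written by hand; the fitting step is deliberately left out of this sketch.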

 > **Geoff Merrett** is an Associate Professor in the Department of
 > Electronics and Computer Science at the University of Southampton. He
 > received the BEng (1st, Hons) and PhD degrees in Electronic
 > Engineering from Southampton in 2004 and 2009 respectively. His
 > research interests are in energy-aware and self-powered computing
 > systems, with application across the spectrum from highly constrained
 > IoT devices to many-core mobile and embedded systems. He has published
 > over 100 peer-reviewed articles in these areas, and given invited
 > talks at a number of international events. Dr Merrett is a
 > Co-Investigator on the EPSRC-funded £5.6M PRiME Programme Grant (where
 > he leads the applications and cross-layer interaction theme),
 > "Continuous on-line adaptation in many-core systems: From graceful
 > degradation to graceful amelioration", and deputy-lead on the
 > "Wearable and Autonomous Computing for Future Smart Cities" Platform
 > Grant. He is technical manager of Southampton’s ARM-ECS Research
 > Centre, an award-winning industry-academia collaboration between the
 > University of Southampton and ARM. He coordinates IoT research at the
 > University, and leads the wireless sensing theme of its Pervasive
 > Systems Centre. He is an Associate Editor for the IET CDS journal,
 > serves as a reviewer for a number of leading journals, and on TPCs for
 > a range of conferences. He co-manages the UK’s Energy Harvesting
 > Network, was General Chair of the ACM Workshop on Energy-Harvesting
 > and Energy-Neutral Sensing Systems in 2013, 2014, and 2015, and was
 > the General Chair of the European Workshop on Microelectronics
 > Education 2016. He is a member of the IEEE, IET and Fellow of the HEA.

 </span>

 # Short Talks

 <span id="JIT_Debugging">

 ## Debugging a target-agnostic JIT compiler with GEM5

 **Author:** Boris Shingarov - LabWare

 We explain how GEM5 enabled us to develop a target-agnostic JIT
 compiler, in which no knowledge about the target ISA is coded by the
 human programmer; instead, the backend is inferred, using logic
 programming, from a formal machine description written in a Processor
 Description Language. Debugging such a JIT presents some challenges
 which cannot be addressed using traditional approaches. One such
 challenge is the impedance mismatch between the high-level abstractions
 in the PDL and the low-level inferred implementation. In this talk, we
 present a new debugger based on simulating the execution of the target
 runtime VM in GEM5; the debugger frontend connects to this simulation
 using the RSP wire protocol.

 </span>

 <span id="COSSIM">

 ## COSSIM: An Integrated Solution to Address the Simulator Gap for Parallel Heterogeneous Systems

 In an era of complex networked heterogeneous systems, independently
 simulating only parts, components or attributes of a
 system-under-design is not a viable, accurate or efficient option. The
 interactions are too many and too complicated to produce meaningful
 results and the optimization opportunities are severely limited when
 considering each part of a system in an isolated manner. COSSIM offers a
 framework that can handle the simulation of a complete system-of-systems
 including processors, peripherals and networks that can appeal to
 Parallel (Heterogeneous) Systems designers and application developers in
 an integrated way.

 The framework is based on gem5 as the main simulation engine for
 processor-based systems and extends its capabilities by integrating it
 with the OMNET++ network simulator. This integration allows independent
 gem5 instances to be networked with all network protocols and
 hierarchies that can be supported by OMNET++, thus creating a very
 flexible solution. The integration of the two main simulation tools is
 realized through the IEEE 1516 High-Level Architecture standard (HLA),
 through which all communication tasks are performed. Through HLA and
 custom libraries, a two-level (per node and global) synchronization
 scheme is also implemented to ensure a coherent notion of time between
 all nodes.

 Since HLA is IP-based, all gem5 instances and OMNET++ can be executed on
 the same physical machine or on any distributed system (or any
 combination in between). The overall framework – the set of gem5 nodes,
 the OMNET++ simulator and the CERTI HLA – are integrated in a unified
 Eclipse-based GUI that has been developed to provide easy simulation
 set-up, execution and visualization of results. McPAT is also integrated
 in a semi-automated way through the GUI in order to provide power and
 energy estimations for each node, while OMNET++ provides power
 estimations for networking-related components (NICs and network
 devices).

 > **Andreas Brokalakis** is a senior hardware engineer at Synelixis
 > Solutions Ltd. At the same time he is pursuing a PhD degree at the
 > Technical University of Crete, Greece. He holds a Bachelor’s degree in
 > Computer Engineering from the University of Patras, Greece, and a
 > Master’s degree in Hardware/Software Co-design from the same university.
 > Current work and research interests involve computer architecture and
 > arithmetic, as well as design of ASIC and FPGA systems and
 > accelerators.

 > **Nikolaos Tampouratzis** is a PhD student at the Technical University
 > of Crete, working on simulation tools for computing systems. He has
 > been with the Telecommunication Systems Institute, Technical
 > University of Crete, since October 2012 as a research associate,
 > providing research and development services to several EU-funded
 > research projects. He received his Computer Science diploma from the
 > University of Crete (UOC, Greece), with specialization in Hardware
 > Design and FPGAs. He continued his studies at the Technical University
 > of Crete (TUC, Greece), where he received his Master's diploma in
 > Electronic and Computer Engineering, specializing in Computer
 > Architecture and Hardware Design.

 </span>

 <span id="ComplexSystems">

 ## Simulation of Complex Systems Incorporating Hardware Accelerators

 The breakdown of Dennard scaling coupled with persistently growing
 transistor counts has increased the importance of application-specific
 hardware acceleration; such an approach offers significant performance
 and energy benefits compared to general-purpose solutions. In order to
 thoroughly evaluate such architectures, the designer should perform an
 extensive design space exploration so as to evaluate the trade-offs
 across the entire system. The design, until recently, has been
 predominantly done using Register Transfer Level languages such as
 Verilog and VHDL, which, however, lead to a prohibitively long and
 costly design effort. In order to reduce the design time, a wide range
 of both commercial and academic High-Level Synthesis (HLS) tools have
 emerged; most of these tools handle hardware accelerators that are
 described in synthesizable SystemC. The problem today, however, is that
 most simulators used for evaluating the complete user applications (i.e.
 full-system CPU/Mem/Peripheral simulators) lack any type of SystemC
 accelerator support.

 Within this context, we extend gem5 to support the simulation of generic
 SystemC accelerators. We introduce a novel flow that enables us to
 rapidly prototype synthesisable SystemC hardware accelerators in
 conjunction with gem5. The proposed solution automatically handles all
 communication and synchronisation issues.

 Compared to a standard gem5 system, several changes at different levels
 are required, from the OS and device drivers level down to the
 implementation of a device model in the gem5 simulator. Instead of using
 files to write data for an external accelerator, perform the simulation
 and then read back the results, our approach communicates with the
 SystemC simulator through programmed I/Os and DMA engines, supporting
 full global synchronisation. Apart from the apparent benefits concerning
 the implementation and simulation accuracy, the proposed solution is
 also orders of magnitude faster.

 > **Nikolaos Tampouratzis** is a PhD student at the Technical University
 > of Crete, working on simulation tools for computing systems. He has
 > been with the Telecommunication Systems Institute, Technical
 > University of Crete, since October 2012 as a research associate,
 > providing research and development services to several EU-funded
 > research projects. He received his Computer Science diploma from the
 > University of Crete (UOC, Greece), with specialization in Hardware
 > Design and FPGAs. He continued his studies at the Technical University
 > of Crete (TUC, Greece), where he received his Master's diploma in
 > Electronic and Computer Engineering, specializing in Computer
 > Architecture and Hardware Design.

 </span>

 <span id="TraceGeneration">

 ## Generating Synthetic Traffic for Heterogeneous Architectures

 Modern system-on-chip architectures consist of many heterogeneous
 processing elements. The communication fabric and memory hierarchy
 supporting these processing elements heavily influence the system’s
 overall performance. Exploring the design space of these heterogeneous
 architectures with detailed models of each processing element can be
 time-consuming. Statistical simulation has been shown to be an effective
 tool for quickly evaluating architectures by abstracting away
 complexity.

 This talk describes work done on modelling the spatial and temporal
 behaviour of a processing element’s address stream. We present a
 methodology that can automatically characterize a processing element by
 observing its reads and writes. Using these characteristics we can
 stimulate a communication fabric connecting many different processing
 elements by synthetically recreating their addresses. These addresses
 arrive at their destination in the memory hierarchy, spawning new
 messages and responses to read and write requests. Architects can now
 combine synthetic processing elements that represent various different
 components on current and future systems-on-chip to evaluate the impact
 of changes at the interconnection network and memory hierarchy.

 > **Mario Badr** is a PhD Candidate at the University of Toronto working
 > under the supervision of Dr. Natalie Enright Jerger. He received his
 > B.A.Sc. and M.A.Sc. from the University of Toronto in Electrical
 > Engineering and Computer Engineering, respectively. He has interned
 > with Qualcomm Research Silicon Valley and received the Roberto
 > Padovani Scholarship for his outstanding technical contributions. In
 > addition, he has been recognized at the university and departmental
 > levels for excellence as a teaching assistant. His research interests
 > include performance evaluation in computer architecture, heterogeneous
 > architectures, and multi-threaded workloads.

 </span>

 <span id="StarterKit">

 ## ARM Research Starter Kit: System Modeling using gem5

 ARM Research Enablement aims to enhance computing research by enabling
 researchers worldwide to easily access ARM-based IP and technologies,
 and helping them to increase their research impact. As a part of our
 research enablement activities, we provide a System Modeling Research
 Starter Kit using gem5. We have released a High Performance In-order
 (HPI) CPU timing model based on ARMv8-A in gem5. I will present a
 high-level overview of the released system, its documentation and
 benchmark scripts. This talk will target those who are new to gem5 as
 well as those who would like to promote gem5 in research.

 > **Ashkan Tousi** is a Senior Research Engineer at ARM Cambridge and an
 > Honorary Lecturer at the University of Glasgow. He received his PhD in
 > computing science (parallel computing) in 2015. He currently leads
 > research enablement activities at ARM, which cover a range of
 > different research areas from SoC design to IoT and data science.

 </span>

 <span id="WA">

 ## Interacting with gem5 using workload-automation & devlib

 Running workloads on gem5 is often not straightforward. This talk will
 discuss workload-automation and devlib, two new open-source tools for
 interacting with gem5. These frameworks, written to interact with various
 hardware platforms, have recently been extended to include gem5 as a
 platform. We will discuss use cases and advantages/disadvantages of each
 tool and show how they can make your gem5 work easier.

 > **Anouk Van Laer** is a Modelling Engineer in the Architecture: Systems
 > & Technology group at ARM. She obtained her PhD at University College
 > London, where she investigated the effects of optical interconnects on
 > the performance of chip multiprocessors, using gem5.

 </span>

 <span id="PowerFramework">

 ## gem5: empowering the masses

 This talk will give an overview of the state of power modelling in gem5.
 After discussing the basic power modelling infrastructure, it will cover
 the state of CPU DVFS as well as recent improvements in how CPU power
 states are controlled for the ARM architecture in gem5. The talk will
 cover these improvements in power modelling, highlighting the way in
 which the accuracy and versatility of the simulator have been improved.

 > **Sascha Bischoff** is a Senior Software Engineer in the Architecture:
 > Systems & Technology group at ARM in Cambridge. Whilst completing his
 > PhD with the University of Southampton, he spent 3.5 years based in
 > ARM Research in Cambridge. He has spent a large part of the last 6
 > years working with gem5, typically with a focus on power management,
 > ideally without impacting the delivered performance.

 </span>

 <span id="DRAMPower">

 ## Integrating and quantifying the impact of low power modes in the DRAM controller in gem5

 Across applications, DRAM is a significant contributor to the overall
 system power, with the DRAM access energy per bit up to three orders of
 magnitude higher than that of on-chip memory accesses. To improve the
 power efficiency, DRAM technology incorporates multiple low power modes,
 each with different trade-offs between achievable power savings and
 performance impact due to entry and exit delay requirements. Accurate
 modeling of these low power modes and entry and exit control is crucial
 to analyze the trade-offs across controller configurations and workloads
 with varied memory access characteristics.
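
The trade-off described above can be made concrete with a small back-of-the-envelope calculation: a low power mode only pays off if the idle period is long enough for the power saved to outweigh the energy spent entering and leaving the mode. The sketch below is illustrative arithmetic only. It is not the decision logic added to gem5's DRAM controller, and every number in it is a hypothetical placeholder rather than a datasheet value.

```python
# Illustrative arithmetic only: all values are hypothetical, not taken from
# any DRAM datasheet or from gem5's DRAM controller.

def break_even_idle_ns(active_idle_mw, power_down_mw, transition_energy_nj):
    """Idle time (ns) at which energy saved equals the entry/exit energy cost."""
    # mW * ns = pJ, so the transition energy is converted from nJ to pJ.
    return transition_energy_nj * 1000.0 / (active_idle_mw - power_down_mw)


def net_saving_nj(idle_ns, active_idle_mw, power_down_mw, transition_energy_nj):
    """Net energy saved (nJ) by powering down for one idle period."""
    saved_pj = (active_idle_mw - power_down_mw) * idle_ns
    return saved_pj / 1000.0 - transition_energy_nj


# Assumed parameters for a single rank.
ACTIVE_IDLE_MW = 100.0        # power while idle but not powered down
POWER_DOWN_MW = 20.0          # power while in the low power mode
TRANSITION_ENERGY_NJ = 4.0    # combined entry and exit energy

threshold = break_even_idle_ns(ACTIVE_IDLE_MW, POWER_DOWN_MW, TRANSITION_ENERGY_NJ)
long_idle = net_saving_nj(200.0, ACTIVE_IDLE_MW, POWER_DOWN_MW, TRANSITION_ENERGY_NJ)
short_idle = net_saving_nj(25.0, ACTIVE_IDLE_MW, POWER_DOWN_MW, TRANSITION_ENERGY_NJ)

print(f"break-even idle time: {threshold} ns")
print(f"net saving for a 200 ns idle period: {long_idle} nJ")
print(f"net saving for a 25 ns idle period: {short_idle} nJ")
```

Under these assumed values, idle periods shorter than the break-even time cost more energy than they save. The exit delay additionally postpones the next access, which is the performance side of the trade-off and is not modelled here.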

 In this talk, we will give an overview of the decision making logic we
 added to the DRAM controller in gem5 that triggers transitions to/from
 the power-down modes. Integrating this functionality makes gem5 the
 first publicly available DRAM low power full-system simulator, providing
 the research community a tool for DRAM power analysis for a breadth of
 use cases. We will conclude with simulation data that characterises the
 low power behaviour and shows energy and performance trade-offs for
 realistic workloads.

 **Note:** This talk is based on a paper accepted at MEMSYS 17. Authors
 from ARM: Radhika Jagtap, Wendy Elsasser and Andreas Hansson. Authors
 from University of Kaiserslautern: Matthias Jung and Norbert Wehn.

 > **Radhika Jagtap** is a Senior Research Engineer working in the Memory
 > & Systems research group. She has plenty of experience with gem5
 > (elastic traces, interconnect, memory controller) and is involved in
 > several collaborative research projects, especially with academics.
 > Currently she is exploring the problem of energy efficient data
 > movement for sparse data workloads.

 </span>