layout: toc title: Publications permalink: /publications/

Table of Contents

{:.no_toc}

  • TOC {:toc}

If you use gem5 in your research, we would appreciate a citation to the original paper in any publications you produce. Moreover, we would appreciate if you cite also the speacial features of gem5 which have been developed and contributed to the main line since the publication of the original paper in 2011. In other words, if you use feature X please also cite the according paper Y from the list below.

Original Paper


  • The gem5 Simulator. Nathan Binkert, Bradford Beckmann, Gabriel Black, Steven K. Reinhardt, Ali Saidi, Arkaprava Basu, Joel Hestness, Derek R. Hower, Tushar Krishna, Somayeh Sardashti, Rathijit Sen, Korey Sewell, Muhammad Shoaib, Nilay Vaish, Mark D. Hill, and David A. Wood. May 2011, ACM SIGARCH Computer Architecture News.

Special Features of gem5


GPUs

DRAM Controller, DRAM Power Estimation

KVM

Elastic Traces

SystemC Couping

Other Publications related to gem5


Publications using gem5 / m5


2017

  • [https://chess.eecs.berkeley.edu/pubs/1194/KimEtAl_CyPhy17.pdfAn Integrated Simulation Tool for Computer Architecture and Cyber-Physical Systems]. Hokeun Kim, Armin Wasicek, and Edward A. Lee. In Proceedings of the 6th Workshop on Design, Modeling and Evaluation of Cyber-Physical Systems (CyPhy’17), Seoul, Korea, October 19, 2017.

  • [http://www.lirmm.fr/~sassate/ADAC/wp-content/uploads/2017/06/opensuco17.pdfEfficient Programming for Multicore Processor Heterogeneity: OpenMP versus OmpSs]. Anastasiia Butko, Florent Bruguier, Abdoulaye Gamatié and Gilles Sassatelli. In Open Source Supercomputing (OpenSuCo’17) Workshop co-located with ISC’17, June 2017.

  • [https://hal-lirmm.ccsd.cnrs.fr/lirmm-01467328MAGPIE: System-level Evaluation of Manycore Systems with Emerging Memory Technologies]. Thibaud Delobelle, Pierre-Yves Péneau, Abdoulaye Gamatié, Florent Bruguier, Sophiane Senni, Gilles Sassatelli and Lionel Torres, 2nd International Workshop on Emerging Memory Solutions (EMS) co-located with DATE’17, March 2017.

2016

  • [http://ieeexplore.ieee.org/document/7776838An Agile Post-Silicon Validation Methodology for the Address Translation Mechanisms of Modern Microprocessors]. G. Papadimitriou, A. Chatzidimitriou, D. Gizopoulos, R. Morad, IEEE Transactions on Device and Materials Reliability (TDMR 2016), Volume: PP, Issue: 99, December 2016.

  • [http://ieeexplore.ieee.org/document/7753339Unveiling Difficult Bugs in Address Translation Caching Arrays for Effective Post-Silicon Validation]. G. Papadimitriou, D. Gizopoulos, A. Chatzidimitriou, T. Kolan, A. Koyfman, R. Morad, V. Sokhin, IEEE International Conference on Computer Design (ICCD 2016), Phoenix, AZ, USA, October 2016.

  • [http://ieeexplore.ieee.org/document/7833682/Loop optimization in presence of STT-MRAM caches: A study of performance-energy tradeoffs]. Pierre-Yves Péneau, Rabab Bouziane, Abdoulaye Gamatié, Erven Rohou, Florent Bruguier, Gilles Sassatelli, Lionel Torres and Sophiane Senni, 26th International Workshop on Power and Timing Modeling, Optimization and Simulation (PATMOS), September 21-23 2016.

  • [http://ieeexplore.ieee.org/abstract/document/7774439Full-System Simulation of big.LITTLE Multicore Architecture for Performance and Energy Exploration]. Anastasiia Butko, Florent Bruguier, Abdoulaye Gamatié, Gilles Sassatelli, David Novo, Lionel Torres and Michel Robert. Embedded Multicore/Many-core Systems-on-Chip (MCSoC), 2016 IEEE 10th International Symposium on, September 21-23, 2016.

  • [http://ieeexplore.ieee.org/document/7448986Exploring MRAM Technologies for Energy Efficient Systems-On-Chip]. Sophiane Senni, Lionel Torres, Gilles Sassatelli, Abdoulaye Gamatié and Bruno Mussard, IEEE Journal on Emerging and Selected Topics in Circuits and Systems , Volume: 6, Issue: 3, Sept. 2016.

  • [https://cpc2016.infor.uva.es/wp-content/uploads/2016/06/CPC2016_paper_11.pdfArchitectural exploration of heterogeneous memory systems]. Marcos Horro, Gabriel Rodríguez, Juan Touriño and Mahmut T. Kandemir. 19th Workshop on Compilers for Parallel Computing (CPC), July 2016.

  • [http://ieeexplore.ieee.org/document/7604675ISA-Independent Post-Silicon Validation for the Address Translation Mechanisms of Modern Microprocessors]. G. Papadimitriou, A. Chatzidimitriou, D. Gizopoulos and R. Morad, IEEE International Symposium on On-Line Testing and Robust System Design (IOLTS 2016), Sant Feliu de Guixols, Spain, July 2016.

  • Anatomy of microarchitecture-level reliability assessment: Throughput and accuracy. A.Chatzidimitriou, D.Gizopoulos, IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), Uppsala, Sweden, April 2016.

  • Agave: A benchmark suite for exploring the complexities of the Android software stack. Martin Brown, Zachary Yannes, Michael Lustig, Mazdak Sanati, Sally A. McKee, Gary S. Tyson, Steven K. Reinhardt, IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), Uppsala, Sweden, April 2016.

2015

2014

2013

2012

  • Hardware Prefetchers for Emerging Parallel Applications, Biswabandan Panda, Shankar Balachandran. In the proceedings of the IEEE/ACM Parallel Architectures and Compilation Techniques,PACT, Minneapolis, October 2012.
  • Lazy Cache Invalidation for Self-Modifying Codes. A. Gutierrez, J. Pusdesris, R.G. Dreslinski, and T. Mudge. In the proceedings of the International Conference on Compilers, Architecture and Synthesis for Embedded Systems (CASES), Tampere, Finland, October 2012.
  • Accuracy Evaluation of GEM5 Simulator System. A. Butko, R. Garibotti, L. Ost, and G. Sassatelli. In the proceeding of the IEEE International Workshop on Reconfigurable Communication-centric Systems-on-Chip (ReCoSoC), York, United Kingdom, July 2012.
  • Viper: Virtual Pipelines for Enhanced Reliability. A. Pellegrini, J. L. Greathouse, and V. Bertacco. In the proceedings of the International Symposium on Computer Architecture (ISCA), Portland, OR, June 2012.
  • Reducing memory reference energy with opportunistic virtual caching. Arkaprava Basu, Mark D. Hill, Michael M. Swift. In the proceedings of the 39th International Symposium on Computer Architecture (ISCA 2012).
  • Cache Revive: Architecting Volatile STT-RAM Caches for Enhanced Performance in CMPs. Adwait Jog, Asit Mishra, Cong Xu, Yuan Xie, V. Narayanan, Ravi Iyer, Chita Das. In the proceedings oF the IEEE/ACM Design Automation Conference (DAC), San Francisco, CA, June 2012.

2011

  • Full-System Analysis and Characterization of Interactive Smartphone Applications. A. Gutierrez, R.G. Dreslinski, T.F. Wenisch, T. Mudge, A. Saidi, C. Emmons, and N. Paver. In the proceeding of the IEEE International Symposium on Workload Characterization (IISWC), pages 81-90, Austin, TX, November 2011.
  • Universal Rules Guided Design Parameter Selection for Soft Error Resilient Processors, L. Duan, Y. Zhang, B. Li, and L. Peng. Proceedings of the International Symposium on Performance Analysis of Systems and Software(ISPASS), Austin, TX, April 2011.

2010

  • Using Hardware Vulnerability Factors to Enhance AVF Analysis, V. Sridharan, D. R. Kaeli. Proceedings of the International Symposium on Computer Architecture (ISCA-37), Saint-Malo, France, June 2010.
  • Leveraging Unused Cache Block Words to Reduce Power in CMP Interconnect, H. Kim, P. Gratz. IEEE Computer Architecture Letters, vol. 99, (RapidPosts), 2010.
  • A Fast Timing-Accurate MPSoC HW/SW Co-Simulation Platform based on a Novel Synchronization Scheme, Mingyan Yu, Junjie Song, Fangfa Fu, Siyue Sun, and Bo Liu. Proceedings of the International MultiConfernce of Engineers and Computer Scientists. 2010 pdf
  • Simulation of Standard Benchmarks in Hardware Implementations of L2 Cache Models in Verilog HDL, Rosario M. Reas, Anastacia B. Alvarez, Joy Alinda P. Reyes, Computer Modeling and Simulation, International Conference on, pp. 153-158, 2010 12th International Conference on Computer Modelling and Simulation, 2010
  • A Simulation of Cache Sub-banking and Block Buffering as Power Reduction Techniques for Multiprocessor Cache Design, Jestoni V. Zarsuela, Anastacia Alvarez, Joy Alinda Reyes, Computer Modeling and Simulation, International Conference on, pp. 515-520, 2010 12th International Conference on Computer Modelling and Simulation, 2010

2009

  • Efficient Implementation of Decoupling Capacitors in 3D Processor-DRAM Integrated Computing Systems. Q. Wu, J. Lu, K. Rose, and T. Zhang. Great Lakes Symposium on VLSI. 2009.

  • Evaluating the Impact of Job Scheduling and Power Management on Processor Lifetime for Chip Multiprocessors. A. K. Coskun, R. Strong, D. M. Tullsen, and T. S. Rosing. Proceedings of the eleventh international joint conference on Measurement and modeling of computer systems. 2009.

  • ” Devices and architectures for photonic chip-scale integration.” J. Ahn, M. Fiorentino1, R. G. Beausoleil, N. Binkert, A. Davis, D. Fattal, N. P. Jouppi, M. McLaren, C. M. Santori, R. S. Schreiber, S. M. Spillane, D. Vantrease and Q. Xu. Journal of Applied Physics A: Materials Science & Processing. February 2009.

  • System-Level Power, Thermal and Reliability Optimization. C. Zhu. Thesis at Queen’s University. 2009.

  • A light-weight fairness mechanism for chip multiprocessor memory systems. M. Jahre, L. Natvig. Proceedings of the 6th ACM conference on Computing Frontiers. 2009.

  • Decoupled DIMM: building high-bandwidth memory system using low-speed DRAM devices. H. Zheng, J. Lin, Z. Zhang, and Z. Zhu. International Symposium on Computer Architecture (ISCA). 2009.

  • On the Performance of Commit-Time-Locking Based Software Transactional Memory. Z. He and B. Hong. The 11th IEEE International Conference on. High Performance Computing and Communications (HPCC-09). 2009.

  • A Quantitative Study of Memory System Interference in Chip Multiprocessor Architectures. M. Jahre, M. Grannaes and L. Natvig. The 11th IEEE International Conference on. High Performance Computing and Communications (HPCC-09). 2009.

  • Hardware Support for Debugging Message Passing Applications for Many-Core Architectures. C. Svensson. Masters Thesis at the University of Illinois at Urbana-Champaign, 2009.

  • Initial Experiments in Visualizing Fine-Grained Execution of Parallel Software Through Cycle-Level Simulation. R. Strong, J. Mudigonda, J. C. Mogul, N. Binkert. USENIX Workshop on Hot Topics in Parallelism (HotPar). 2009.

  • MPreplay: Architecture Support for Deterministic Replay of Message Passing Programs on Message Passing Many-core Processors. C. Erik-Svensson, D. Kesler, R. Kumar, and G. Pokam. University of Illinois Technical Report number UILU-09-2209.

  • Low-power Inter-core Communication through Cache Partitioning in Embedded Multiprocessors. C. Yu, X. Zhou, and P. Petrov .Symposium on Integrated Circuits and System Design (sbcci). 2009.

  • Integrating NAND flash devices onto servers. D. Roberts, T. Kgil, T. Mudge. Communications of the ACM (CACM). 2009.

  • A High-Performance Low-Power Nanophotonic On-Chip Network. Z. Li, J. Wu, L. Shang, A. Mickelson, M. Vachharajani, D. Filipovic, W. Park∗ and Y. Sun. International Symposium on Low Power Electronic Design (ISLPED). 2009.

  • Core monitors: monitoring performance in multicore processors. P. West, Y. Peress, G. S. Tyson, and S. A. McKee. Computing Frontiers. 2009.

  • Parallel Assertion Processing using Memory Snapshots. M. F. Iqbal, J. H. Siddiqui, and D. Chiou. Workshop on Unique Chips and Systems (UCAS). April 2009.

  • Leveraging Memory Level Parallelism Using Dynamic Warp Subdivision. J. Meng, D. Tarjan, and K. Skadron. Univ. of Virginia Dept. of Comp. Sci. Tech Report (CS-2009-02).

  • Reconfigurable Multicore Server Processors for Low Power Operation. R. G. Dreslinski, D. Fick, D. Blaauw, D. Sylvester and T. Mudge. 9th International Symposium on Systems, Architectures, MOdeling and Simulation (SAMOS). July 2009.

  • Near Threshold Computing: Overcoming Performance Degradation from Aggressive Voltage Scaling R. G. Dreslinski, M. Wieckowski, D. Blaauw, D. Sylvester, and T. Mudge. Workshop on Energy Efficient Design (WEED), June 2009.

  • Workload Adaptive Shared Memory Multicore Processors with Reconfigurable Interconnects. S. Akram, R. Kumar, and D. Chen. IEEE Symposium on Application Specific Processors, July 2009.

  • Eliminating Microarchitectural Dependency from Architectural Vulnerability. V. Sridharan, D. R. Kaeli. Proceedings of the 15th International Symposium on High-Performance Computer Architecture (HPCA-15), February 2009.

  • Producing Wrong Data Without Doing Anything Obviously Wrong! T. Mytkowicz, A. Diwan, M. Hauswirth, P. F. Sweeney. Proceedings of the 14th international conference on Architectural support for programming languages and operating systems (ASPLOS). 2009.

  • End-To-End Performance Forecasting: Finding Bottlenecks Before They Happen A. Saidi, N. Binkert, S. Reinhardt, T. Mudge. Proceedings of the 36th International Symposium on Computer Architecture (ISCA-36), June 2009.

  • Fast Switching of Threads Between Cores. R. Strong, J. Mudigonda, J. C. Mogul, N. Binkert, D. Tullsen. ACM SIGOPS Operating Systems Review. 2009.

  • Express Cube Topologies for On-Chip Interconnects. B. Grot, J. Hestness, S. W. Keckler, O. Mutlu. Proceedings of the 15th International Symposium on High-Performance Computer Architecture (HPCA-15), February 2009.

  • Enhancing LTP-Driven Cache Management Using Reuse Distance Information. W. Lieu, D. Yeung. Journal of Instruction-Level Parallelism 11 (2009).

2008

  • Analyzing the Impact of Data Prefetching on Chip MultiProcessors. N. Fukumoto, T. Mihara, K. Inoue, and K. Murakami. Asia-Pacific Computer Systems Architecture Conference. 2008.

  • Historical Study of the Development of Branch Predictors. Y. Peress. Masters Thesis at Florida State University. 2008.

  • Hierarchical Domain Partitioning For Hierarchical Architectures. J. Meng, S. Che, J. W. Sheaffer, J. Li, J. Huang, and K. Skadron. Univ. of Virginia Dept. of Comp. Sci. Tech Report CS-2008-08. 2008.

  • Memory Access Scheduling Schemes for Systems with Multi-Core Processors. H. Zheng, J. Lin, Z. Zhang, and Z. Zhu. International Conference on Parallel Processing, 2008.

  • Register Multimapping: Reducing Register Bank Conflicts Through One-To-Many Logical-To-Physical Register Mapping. N. L. Duong and R. Kumar. Tehnical Report CHRC-08-07.

  • Cross-Layer Custimization Platform for Low-Power and Real-Time Embedded Applications. X. Zhou. Dissertation at the University of Maryland. 2008.

  • Probabilistic Replacement: Enabling Flexible Use of Shared Caches for CMPs. W. Liu and D. Yeung. University of Maryland Technical Report UMIACS-TR-2008-13. 2008.

  • Observer Effect and Measurement Bias in Performance Analysis. T. Mytkowicz, P. F. Sweeney, M. Hauswirth, and A. Diwan. University of Colorado at Boulder Technical Report CU-CS 1042-08. June, 2008.

  • Power-Aware Dynamic Cache Partitioning for CMPs. I. Kotera, K. Abe, R. Egawa, H. Takizawa, and H. Kobayashi. 3rd International Conference on High Performance and Embedded Architectures and Compilers (HiPEAC). 2008.

  • Modeling of Cache Access Behavior Based on Zipf’s Law. I. Kotera, H. Takizawa, R. Egawa, H. Kobayashi. MEDEA 2008.

  • Hierarchical Verification for Increasing Performance in Reliable Processors. J. Yoo, M. Franklin. Journal of Electronic Testing. 2008.

  • Transaction-Aware Network-on-Chip Resource Reservation. Z. Li, C. Zhu, L. Shang, R. Dick, Y. Sun. Computer Architecture Letters. Volume PP, Issue 99, Page(s):1 - 1.

  • Predictable Out-of-order Execution Using Virtual Traces. J. Whitham, N. Audsley. Proceedings of the 29th IEEE Real-time Systems Symposium, December 2008. pdf

  • Architectural and Compiler Mechanisms for Acelerating Single Thread Applications on Multicore Processors. H. Zhong. Dissertation at The University of Michigan. 2008.

  • Mini-Rank: Adaptive DRAM Architecture for Improving Memory Power Efficiency. H. Zheng, J. Lin, Z. Zhang, E. Gorbatov, H. David, Z. Zhu. Proceedings of the 41st Annual Symposium on Microarchitecture (MICRO-41), November 2008.

  • Reconfigurable Energy Efficient Near Threshold Cache Architectures. R. Dreslinski, G. Chen, T. Mudge, D. Blaauw, D. Sylvester, K. Flautner. Proceedings of the 41st Annual Symposium on Microarchitecture (MICRO-41), November 2008.

  • Distributed and low-power synchronization architecture for embedded multiprocessors. C. Yu, P. Petrov. Internation Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS), October 2008.

  • Thermal Monitoring Mechanisms for Chip Multiprocessors. J. Long, S.O. Memik, G. Memik, R. Mukherjee. ACM Transactions on Architecture and Code Optimization (TACO), August 2008.

  • Multi-optimization power management for chip multiprocessors. K. Meng, R. Joseph, R. Dick, L. Shang. Proceedings of the 17th international conference on Parallel Architectures and Compilation Techniques (PACT), 2008.

  • ” Three-Dimensional Chip-Multiprocessor Run-Time Thermal Management.” C. Zhu, Z. Gu, L. Shang, R.P. Dick, R. Joseph. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD), August 2008.

  • ” Latency and bandwidth efficient communication through system customization for embedded multiprocessors”. C. Yu and P. Petrov. DAC 2008, June 2008.

  • Corona: System Implications of Emerging Nanophotonic Technology. D. Vantrease, R. Schreiber, M. Monchiero, M. McLaren, N., P. Jouppi, M. Fiorentino, A. Davis, N. Binkert, R. G. Beausoleil, and J. Ahn. Proceedings of the 35th International Symposium on Computer Architecture (ISCA-35), June 2008.

  • Improving NAND Flash Based Disk Caches. T. Kgil, D. Roberts and T. N. Mudge. Proceedings of the 35th International Symposium on Computer Architecture (ISCA-35), June 2008.

  • A Taxonomy to Enable Error Recovery and Correction in Software. V. Sridharan, D. A. Liberty, and D. R. Kaeli. Workshop on Quality-Aware Design (W-QUAD), in conjunction with the 35th International Symposium on Computer Architecture (ISCA-35), June 2008.

  • Quantifying Software Vulnerability. V. Sridharan and D. R. Kaeli. First Workshop on Radiation Effects and Fault Tolerance in Nanometer Technologies, in conjunction with the ACM International Conference on Computing Frontiers, May 2008.

  • Core Monitors: Monitoring Performance in Multicore Processors. P. West. Masters Thesis at Florida State University. April 2008.

  • Full System Critical Path Analysis. A. Saidi, N. Binkert, T. N. Mudge, and S. K. Reinhardt. 2008 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), April 2008.

  • A Power and Temperature Aware DRAM Architecture. S. Liu, S. O. Memik, Y. Zhang, G. Memik. 45th annual conference on Design automation (DAC), 2008.

  • Streamware: Programming General-Purpose Multicore Processors Using Streams. J. Gummaraju, J. Coburn, Y. Turner, M. Rosenblum. Procedings of the Thirteenth International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), March 2008.

  • Application-aware snoop filtering for low-power cache coherence in embedded multiprocessors. X. Zhou, C. Yu, A. Dash, and P. Petrov. Transactions on Design Automation of Electronic Systems (TODAES). January 2008.

  • An approach for adaptive DRAM temperature and power management. Song Liu, S. O. Memik, Y. Zhang, and G. Memik. Proceedings of the 22nd annual international conference on Supercomputing. 2008.

2007

  • Modeling and Characterizing Power Variability in Multicore Architectures. K. Meng, F. Huebbers, R, Joseph, and Y. Ismail. ISPASS-2007.

  • A High Performance Adaptive Miss Handling Architecture for Chip Multiprocessors. M. Jahre, and L. Natvig. HiPEAC Journal 2007.

  • Performance Effects of a Cache Miss Handling Architecture in a Multi-core Processor. M. Jahre and L. Natvig. NIK-2007 conference. 2007.

  • Prioritizing Verification via Value-based Correctness Criticality. J. Yoo, M. Franklin. Proceedings of the 25th International Conference on Computer Design (ICCD), 2007.

  • DRAM-Level Prefetching for Fully-Buffered DIMM: Design, Performance and Power Saving. J. Lin, H. Zheng, Z. Zhu, Z. Zhang ,H. David. ISPASS 2007.

  • ” Virtual Exclusion: An architectural approach to reducing leakage energy in caches for multiprocessor systems”. M. Ghosh, H. Lee. Proceedings of the International Conference on Parallel and Distributed Systems. December 2007.

  • Dependability-Performance Trade-off on Multiple Clustered Core Processors. T. Funaki, T. Sato. Proceedings of the 4th International Workshop on Dependable Embedded Systems. October 2007.

  • Predictive Thread-to-Core Assignment on a Heterogeneous Multi-core Processor. T. Sondag, V. Krishnamurthy, H. Rajan. PLOS ‘07: ACM SIGOPS 4th Workshop on Programming Languages and Operating Systems. October 2007.

  • Power deregulation: eliminating off-chip voltage regulation circuitry from embedded systems. S. Kim, R. P. Dick, R. Joseph. 5th IEEE/ACM International Conference on Hardware/Software Co-Design and System Synthesis (CODES+ISSS). October 2007.

  • Aggressive Snoop Reduction for Synchronized Producer-Consumer Communication in Energy-Efficient Embedded Multi-Processors. C. Yu, P. Petrov. 5th IEEE/ACM International Conference on Hardware/Software Co-Design and System Synthesis (CODES+ISSS). October 2007.

  • Three-Dimensional Multiprocessor System-on-Chip Thermal Optimization. C. Sun, L. Shang, R.P. Dick. 5th IEEE/ACM International Conference on Hardware/Software Co-Design and System Synthesis (CODES+ISSS). October 2007.

  • Sampled Simulation for Multithreaded Processors. M. Van Biesbrouck. (Thesis) UC San Diego Technical Report CS2007-XXXX. September 2007.

  • Representative Multiprogram Workloads for Multithreaded Processor Simulation. M. Van Biesbroucky, L. Eeckhoutz, B. Calder. IEEE International Symposium on Workload Characterization (IISWC). September 2007.

  • The Interval Page Table: Virtual Memory Support in Real-Time and Memory-Constrained Embedded Systems. X. Zhou, P. Petrov. Proceedings of the 20th annual conference on Integrated circuits and systems design. 2007.

  • A power-aware shared cache mechanism based on locality assessment of memory reference for CMPs. I. Kotera, R. Egawa, H. Takizawa, H. Kobayashi. Proceedings of the 2007 workshop on MEmory performance: DEaling with Applications, systems and architecture (MEDEA). September 2007.

  • Architectural Support for the Stream Execution Model on General-Purpose Processors. J. Gummaraju, M. Erez, J. Coburn, M. Rosenblum, W. J. Dally. The Sixteenth International Conference on Parallel Architectures and Compilation Techniques (PACT). September 2007.

  • An Energy Efficient Parallel Architecture Using Near Threshold Operation. R. Dreslinski, B. Zhai, T. Mudge, D. Blaauw, D. Sylvester. The Sixteenth International Conference on Parallel Architectures and Compilation Techniques (PACT). September 2007.

  • When Homogeneous becomes Heterogeneous: Wearout Aware Task Scheduling for Streaming Applications. D. Roberts, R. Dreslinski, E. Karl, T. Mudge, D. Sylvester, D. Blaauw. Workshop on Operationg System Support for Heterogeneous Multicore Architectures (OSHMA). September 2007.

  • ” On-Chip Cache Device Scaling Limits and Effective Fault Repair Techniques in Future Nanoscale Technology”. D. Roberts, N. Kim,T. Mudge. Digital System Design Architectures, Methods and Tools (DSD). August 2007.

  • Energy Efficient Near-threshold Chip Multi-processing. B. Zhai, R. Dreslinski, D. Blaauw, T. Mudge, D. Sylvester. International Symposium on Low Power Electronics and Design (ISLPED). August 2007.

  • ” A Burst Scheduling Access Reordering Mechanism”. J. Shao, B.T. Davis. IEEE 13th International Symposium on High Performance Computer Architecture (HPCA). 2007.

  • Enhancing LTP-Driven Cache Management Using Reuse Distance Information. W. Liu, D. Yeung. University of Maryland Technical Report UMIACS-TR-2007-33. June 2007.

  • Thermal modeling and management of DRAM memory systems. J. Lin, H. Zheng, Z. Zhu, H. David, and Z. Zhang. Proceedings of the 34th Annual international Symposium on Computer Architecture (ISCA). June 2007.

  • Duplicating and Verifying LogTM with OS Support in the M5 Simulator. G. Blake, T. Mudge. Sixth Annual Workshop on Duplicating, Deconstructing, and Debunking (WDDD). June 2007.

  • Analysis of Hardware Prefetching Across Virtual Page Boundaries. R. Dreslinski, A. Saidi, T. Mudge, S. Reinhardt. Proc. of the 4th Conference on Computing Frontiers. May 2007.

  • Reliability in the Shadow of Long-Stall Instructions. V. Sridharan, D. Kaeli, A. Biswas. Third Workshop on Silicon Errors in Logic - System Effects (SELSE-3). April 2007.

  • Extending Multicore Architectures to Exploit Hybrid Parallelism in Single-thread Applications. H. Zhong, S. A. Lieberman, S. A. Mahlke. Proc. 13th Intl. Symposium on High Performance Computer Architecture (HPCA). February 2007.

2006

  • Evaluation of the Data Vortex Photonic All-Optical Path Interconnection Network for Next-Generation Supercomputers. W. C. Hawkins. Dissertation at Georgia Tech. December 2006.

  • Running the manual: an approach to high-assurance microkernel development. P. Derrin, K. Elphinstone, G. Klein, D. Cock, M. M. T. Chakravarty. Proceedings of the 2006 ACM SIGPLAN workshop on Haskell. 2006.

  • The Filter Checker: An Active Verification Management Approach. J. Yoo, M. Franklin. 21st IEEE International Symposium on Defect and Fault-Tolerance in VLSI Systems (DFT’06), 2006.

  • Physical Resource Matching Under Power Asymmetry. K. Meng, F. Huebbers, R. Joseph, Y. Ismail. Presented at the 2006 P=ac2 Conference. 2006. pdf

  • Process Variation Aware Cache Leakage Management. K. Meng, R. Joseph. Proceedings of the 2006 International Symposium on Low Power Electronics and Design (ISLPED). October 2006.

  • FlashCache: a NAND flash memory file cache for low power web servers. T. Kgil, T. Mudge. Proceedings of the 2006 international conference on Compilers, Architecture and Synthesis for Embedded Systems (CASES). October 2006.

  • PicoServer: Using 3D Stacking Technology To Enable A Compact Energy Efficient Chip Multiprocessor. T. Kgil, S. D’Souza, A. Saidi, N. Binkert, R. Dreslinski, S. Reinhardt, K. Flautner, T. Mudge. 12th Int’l Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS). October 2006.

  • Integrated Network Interfaces for High-Bandwidth TCP/IP. N. L. Binkert, A. G. Saidi, S. K. Reinhardt. 12th Int’l Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS). October 2006.

  • Communist, utilitarian, and capitalist cache policies on CMPs: caches as a shared resource. L. R. Hsu, S. K. Reinhardt, R. Iyer, S. Makineni. Proc. 15th Int’l Conf. on Parallel Architectures and Compilation Techniques (PACT), September 2006.

  • Impact of CMP Design on High-Performance Embedded Computing. P. Crowley, M. A. Franklin, J. Buhler, and R. D. Chamberlain. Proc. of 10th High Performance Embedded Computing Workshop. September 2006.

  • BASS: A Benchmark suite for evaluating Architectural Security Systems. J. Poe, T. Li. ACM SIGARCH Computer Architecture News. Vol. 34, No. 4, September 2006.

  • The M5 Simulator: Modeling Networked Systems. N. L. Binkert, R. G. Dreslinski, L. R. Hsu, K. T. Lim, A. G. Saidi, S. K. Reinhardt. IEEE Micro, vol. 26, no. 4, pp. 52-60, July/August, 2006.Link

  • Considering All Starting Points for Simultaneous Multithreading Simulation. M. Van Biesbrouck, L. Eeckhout, B. Calder. Proc. of the Int’l Symp. on Performance Analysis of Systems and Software (ISPASS). 2006.pdf

  • Dynamic Thread Assignment on Heterogeneous Multiprocessor Architectures. M. Becchi, P. Crowley. Proc. of the 3rd Conference on Computing Frontiers. pp29-40. May 2006. pdf

  • Integrated System Architectures for High-Performance Internet Servers. N. L. Binkert. Dissertation at the University of Michigan. February 2006.

  • Exploring Salvage Techniques for Multi-core Architectures. R. Joseph. 2nd Workshop on High Performance Computing Reliability Issues. February 2006. pdf

  • A Simple Integrated Network Interface for High-Bandwidth Servers. N. L. Binkert, A. G. Saidi, S. K. Reinhardt. University of Michigan Technical Report CSE-TR-514-06, January 2006. pdf

2005

  • Software Defined Radio - A High Performance Embedded Challenge. H. lee, Y. Lin, Y. Harel, M. Woh, S. Mahlke, T. Mudge, K. Flautner. Proc. 2005 Int’l Conf. on High Performance Embedded Architectures and Compilers (HiPEAC). November 2005. pdf

  • How to Fake 1000 Registers. D. W. Oehmke, N. L. Binkert, S. K. Reinhardt, and T. Mudge. Proc. 38th Ann. Int’l Symp. on Microarchitecture (MICRO), November 2005. pdf

  • Virtualizing Register Context. D. W. Oehmke. Dissertation at the University of Michigan, 2005. pdf

  • Performance Validation of Network-Intensive Workloads on a Full-System Simulator. A. G. Saidi, N. L. Binkert, L. R. Hsu, and S. K. Reinhardt. First Ann. Workshop on Iteraction between Operating System and Computer Architecture (IOSCA), October 2005. pdf

    • An extended version appears as University of Michigan Technical Report CSE-TR-511-05, July 2005. pdf
  • Performance Analysis of System Overheads in TCP/IP Workloads. N. L. Binkert, L. R. Hsu, A. G. Saidi, R. G. Dreslinski, A. L. Schultz, and S. K. Reinhardt. Proc. 14th Int’l Conf. on Parallel Architectures and Compilation Techniques (PACT), September 2005. pdf

  • Sampling and Stability in TCP/IP Workloads. L. R. Hsu, A. G. Saidi, N. L. Binkert, and S. K. Reinhardt. Proc. First Annual Workshop on Modeling, Benchmarking, and Simulation (MoBS), June

    1. pdf
  • A Unified Compressed Memory Hierarchy. E. G. Hallnor and S. K. Reinhardt. Proc. 11th Int’l Symp. on High-Performance Computer Architecture (HPCA), February 2005. pdf

  • Analyzing NIC Overheads in Network-Intensive Workloads. N. L. Binkert, L. R. Hsu, A. G. Saidi, R. G. Dreslinski, A. L. Schultz, and S. K. Reinhardt. Eighth Workshop on Computer Architecture Evaluation using Commercial Workloads (CAECW), February 2005. pdf

    • An extended version appears as University of Michigan Technical Report CSE-TR-505-04, December 2004. pdf

2004

  • Emulation of realisitic network traffic patterns on an eight-node data vortex interconnection network subsytem. B. Small, A. Shacham, K. Bergman, K. Athikulwongse, C. Hawkins, and D.S. Will. Journal of Optical Networking Vol. 3, No.11, pp 802-809, November 2004. pdf

  • ChipLock: Support for Secure Microarchitectures. T. Kgil, L Falk, and T. Mudge. Proc. Workshop on Architectural Support for Security and Anti-virus (WASSA), October 2004, pp. 130-139. pdf

  • Design and Applications of a Virtual Context Architecture. D. Oehmke, N. Binkert, S. Reinhardt, and T. Mudge. University of Michigan Technical Report CSE-TR-497-04, September 2004. pdf

  • The Performance Potential of an Integrated Network Interface. N. L. Binkert, R. G. Dreslinski, E. G. Hallnor, L. R. Hsu, S. E. Raasch, A. L. Schultz, and S. K. Reinhardt. Proc. Advanced Networking and Communications Hardware Workshop (ANCHOR), June 2004. pdf

  • A Co-Phase Matrix to Guide Simultaneous Multithreading Simulation. M. Van Biesbrouck, T. Sherwood, and B. Calder. IEEE International Symposium on Performance Analysis and Software (ISPASS), March 2004. pdf

  • A Compressed Memory Hierarchy using an Indirect Index Cache. E. G. Hallnor and S. K. Reinhardt. Proc. 3rd Workshop on Memory Performance Issues (WMPI), June 2004. pdf

    • An extended version appears as University of Michigan Technical Report CSE-TR-488-04, March 2004. pdf

2003

  • The Impact of Resource Partitioning on SMT Processors. S. E. Raasch and S. K. Reinhardt. Proc. 12th Int’l Conf. on Parallel Architectures and Compilation Techniques (PACT), pp. 15-25, Sept.

    1. pdf
  • Network-Oriented Full-System Simulation using M5. N. L. Binkert, E. G. Hallnor, and S. K. Reinhardt. Sixth Workshop on Computer Architecture Evaluation using Commercial Workloads (CAECW), February

    1. pdf
  • Design, Implementation and Use of the MIRV Experimental Compiler for Computer Architecture Research. D. A. Greene. Dissertation at the Universtiy of Michigan, 2003. [http://www.eecs.umich.edu/~tnm/theses/daveg.pdg“>pdf ]

2002

  • A Scalable Instruction Queue Design Using Dependence Chains. S. E. Raasch, N. L. Binkert, and S. K. Reinhardt. Proc. 29th Annual Int’l Symp. on Computer Architecture (ISCA), pp. 318-329, May 2002. pdf ps ps.gz

Derivative projects


Below is a list of projects that are based on gem5, are extensions of gem5, or use gem5.

MV5

  • MV5 is a reconfigurable simulator for heterogeneous multicore architectures. It is based on M5v2.0 beta 4.
  • Typical usage: simulating data-parallel applications on SIMT cores that operate over directory-based cache hierarchies. You can also add out-of-order cores to have a heterogeneous system, and all different types of cores can operate under the same address space through the same cache hierarchy.
  • Research projects based on MV5 have been published in ISCA’10, ICCD’09, and IPDPS’10.

Features

  • Single-Instruction, Multiple-Threads (SIMT) cores
  • Directory-based Coherence Cache: MESI/MSI. (Not based on gems/ruby)
  • Interconnect: Fully connected and 2D Mesh. (Not based on gems/ruby)
  • Threading API/library in system emulation mode (No support for full-system simulation. A benchmark suite using the thread API is provided)

Resources

  • Home Page: 1
  • Tutorial at ISPASS ‘11: 2
  • Google group: 3

gem5-gpu

  • Merges 2 popular simulators: gem5 and gpgpu-sim
  • Simulates CPUs, GPUs, and the interactions between them
  • Models a flexible memory system with support for heterogeneous processors and coherence
  • Supports full-system simulation through GPU driver emulation

Resources

  • Home Page: 4
  • Overview slides: 5
  • Mailing list: 6