 <?xml version="1.0" encoding="UTF-8"?>
 <rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
   <channel>
     <title>gem5</title>
     <description></description>
     <link>http://localhost:4000/</link>
     <atom:link href="http://localhost:4000/feed.xml" rel="self" type="application/rss+xml" />
     <pubDate>Mon, 21 Jan 2019 12:53:57 -0800</pubDate>
     <lastBuildDate>Mon, 21 Jan 2019 12:53:57 -0800</lastBuildDate>
     <generator>Jekyll v3.7.4</generator>

       <item>
         <title>Visualizing Spectre with gem5</title>
         <description>&lt;p&gt;&lt;a href=&quot;https://meltdownattack.com/&quot;&gt;Spectre and Meltdown&lt;/a&gt; took much of our
 community by surprise. I personally found these attacks fascinating
 because they didn’t rely on a &lt;em&gt;bug&lt;/em&gt; in any particular hardware
 implementation, but leveraged undefined behavior. Specifically, Spectre
 and Meltdown can exfiltrate potentially secret memory data by detecting
 the effects of speculative instructions &lt;em&gt;that are later squashed&lt;/em&gt;.&lt;/p&gt;

 &lt;p&gt;Very cool!&lt;/p&gt;

 &lt;p&gt;Out-of-order processors are very complex. Speculation attacks like
 Spectre and Meltdown would be easier to understand if we had a way to
 &lt;em&gt;visualize&lt;/em&gt; them. Luckily, gem5 already has a way to view the
 details of its out-of-order CPU’s pipeline.&lt;/p&gt;

 &lt;p&gt;&lt;img src=&quot;/assets/img/o3-example.png&quot; alt=&quot;o3 pipeline view example&quot; /&gt;&lt;/p&gt;

 &lt;p&gt;The image above was created using the O3 pipeline viewer that is
 included with gem5. In this post, I’ll explain how to use the O3
 pipeline viewer and how to generate images like the above. There is also
 a new project which makes it easier to navigate large pipeline traces
 and it is useful for comparing different pipeline designs:
 &lt;a href=&quot;https://github.com/shioyadan/Konata&quot;&gt;Konata&lt;/a&gt; created by Ryota Shioya.
 Ryota gave a presentation on Konata at a recent &lt;a href=&quot;http://learning.gem5.org/tutorial/index.html&quot;&gt;Learning gem5
 tutorial&lt;/a&gt;. You can find
 the pdf of his presentation
 &lt;a href=&quot;http://learning.gem5.org/tutorial/presentations/vis-o3-gem5.pdf&quot;&gt;here&lt;/a&gt;.
 Konata is a cool tool that’s written in JavaScript, and Ryota describes
 it as “Google maps for an out of order pipeline”.&lt;/p&gt;

 &lt;h2 id=&quot;running-spectre&quot;&gt;Running Spectre&lt;/h2&gt;

 &lt;p&gt;The first step to visualizing what is going on in the pipeline during a
 Spectre attack is getting a proof-of-concept exploit. I used the code
 that Erik August posted as a GitHub gist soon after the attack
 was announced. You can get that code here:
 &lt;a href=&quot;https://gist.github.com/ErikAugust/724d4a969fb2c6ae1bbd7b2a9e3d4bb6&quot;&gt;https://gist.github.com/ErikAugust/724d4a969fb2c6ae1bbd7b2a9e3d4bb6&lt;/a&gt;.&lt;/p&gt;

 &lt;p&gt;First, you need to compile the proof of concept code on your native
 machine (note: I’ll be using x86 for all of my examples).&lt;/p&gt;

 &lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;gcc spectre.c -o spectre -static
 &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

 &lt;p&gt;I used gcc 7.2 (the default on Ubuntu 17.10) for my tests, and you may
 want to do the same. &lt;a href=&quot;#effects-of-compilers&quot;&gt;Below&lt;/a&gt; I discuss the
 effects different compilers have on the Spectre attack. For instance, if
 you use clang instead, you may not be able to reproduce the Spectre
 attack in gem5.&lt;/p&gt;

 &lt;p&gt;My native machine is still vulnerable to Spectre so when I run the
 binary generated above, I get the following output.&lt;/p&gt;

 &lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;Reading 40 bytes:
 Reading at malicious_x = 0xffffffffffdd76c8... Success: 0x54=’T’ score=2
 Reading at malicious_x = 0xffffffffffdd76c9... Success: 0x68=’h’ score=2
 Reading at malicious_x = 0xffffffffffdd76ca... Success: 0x65=’e’ score=2
 Reading at malicious_x = 0xffffffffffdd76cb... Success: 0x20=’ ’ score=2
 Reading at malicious_x = 0xffffffffffdd76cc... Success: 0x4D=’M’ score=2
 Reading at malicious_x = 0xffffffffffdd76cd... Success: 0x61=’a’ score=2
 Reading at malicious_x = 0xffffffffffdd76ce... Success: 0x67=’g’ score=2
 Reading at malicious_x = 0xffffffffffdd76cf... Success: 0x69=’i’ score=2
 Reading at malicious_x = 0xffffffffffdd76d0... Success: 0x63=’c’ score=2
 Reading at malicious_x = 0xffffffffffdd76d1... Success: 0x20=’ ’ score=2
 Reading at malicious_x = 0xffffffffffdd76d2... Success: 0x57=’W’ score=2
 Reading at malicious_x = 0xffffffffffdd76d3... Success: 0x6F=’o’ score=2
 Reading at malicious_x = 0xffffffffffdd76d4... Success: 0x72=’r’ score=2
 Reading at malicious_x = 0xffffffffffdd76d5... Success: 0x64=’d’ score=2
 Reading at malicious_x = 0xffffffffffdd76d6... Success: 0x73=’s’ score=2
 Reading at malicious_x = 0xffffffffffdd76d7... Success: 0x20=’ ’ score=2
 Reading at malicious_x = 0xffffffffffdd76d8... Success: 0x61=’a’ score=2
 Reading at malicious_x = 0xffffffffffdd76d9... Success: 0x72=’r’ score=2
 Reading at malicious_x = 0xffffffffffdd76da... Success: 0x65=’e’ score=2
 Reading at malicious_x = 0xffffffffffdd76db... Success: 0x20=’ ’ score=2
 Reading at malicious_x = 0xffffffffffdd76dc... Success: 0x53=’S’ score=2
 Reading at malicious_x = 0xffffffffffdd76dd... Success: 0x71=’q’ score=2
 Reading at malicious_x = 0xffffffffffdd76de... Success: 0x75=’u’ score=2
 Reading at malicious_x = 0xffffffffffdd76df... Success: 0x65=’e’ score=2
 Reading at malicious_x = 0xffffffffffdd76e0... Success: 0x61=’a’ score=2
 Reading at malicious_x = 0xffffffffffdd76e1... Success: 0x6D=’m’ score=2
 Reading at malicious_x = 0xffffffffffdd76e2... Success: 0x69=’i’ score=2
 Reading at malicious_x = 0xffffffffffdd76e3... Success: 0x73=’s’ score=2
 Reading at malicious_x = 0xffffffffffdd76e4... Success: 0x68=’h’ score=2
 Reading at malicious_x = 0xffffffffffdd76e5... Success: 0x20=’ ’ score=2
 Reading at malicious_x = 0xffffffffffdd76e6... Success: 0x50=’P’ score=9 (second best: 0x06 score=2)
 Reading at malicious_x = 0xffffffffffdd76e7... Success: 0x73=’s’ score=2
 Reading at malicious_x = 0xffffffffffdd76e8... Success: 0x73=’s’ score=2
 Reading at malicious_x = 0xffffffffffdd76e9... Success: 0x69=’i’ score=2
 Reading at malicious_x = 0xffffffffffdd76ea... Success: 0x66=’f’ score=2
 Reading at malicious_x = 0xffffffffffdd76eb... Success: 0x72=’r’ score=2
 Reading at malicious_x = 0xffffffffffdd76ec... Success: 0x61=’a’ score=2
 Reading at malicious_x = 0xffffffffffdd76ed... Success: 0x67=’g’ score=2
 Reading at malicious_x = 0xffffffffffdd76ee... Success: 0x65=’e’ score=2
 Reading at malicious_x = 0xffffffffffdd76ef... Success: 0x2E=’.’ score=2
 &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

 &lt;h3 id=&quot;running-spectre-in-gem5&quot;&gt;Running Spectre in gem5&lt;/h3&gt;

 &lt;p&gt;To find out if gem5’s out of order CPU implementation is vulnerable to
 Spectre, we need to run the code in gem5. The simplest and fastest way
 to do this is by running in gem5’s syscall-emulation (SE) mode. In SE
 mode we won’t be modeling an OS or any user-mode to kernel-mode
 interaction, but this is okay for Spectre since this proof-of-concept code
 runs entirely in user mode. If we were investigating Meltdown, we would have to
 use full-system (FS) mode since Meltdown specifically allows user-mode
 processes to read data that should only be accessible in kernel mode.&lt;/p&gt;

 &lt;p&gt;So, when running something in gem5, the first step is to create a Python
 runscript since this is &lt;a href=&quot;http://learning.gem5.org/book/part1/simple_config.html&quot;&gt;the “interface” to
 gem5&lt;/a&gt;. For this
 example, what we need is a system with one CPU, an L1 cache, and memory.
 For simplicity, I’m going to modify one of the existing scripts,
 specifically the &lt;code class=&quot;highlighter-rouge&quot;&gt;two_level.py&lt;/code&gt; script from the &lt;a href=&quot;http://learning.gem5.org/&quot;&gt;Learning gem5
 book&lt;/a&gt;.&lt;/p&gt;

 &lt;p&gt;In the file &lt;code class=&quot;highlighter-rouge&quot;&gt;gem5/configs/learning_gem5/part1/two_level.py&lt;/code&gt;, I simply
 changed the CPU from &lt;code class=&quot;highlighter-rouge&quot;&gt;TimingSimpleCPU()&lt;/code&gt; to
 &lt;code class=&quot;highlighter-rouge&quot;&gt;DerivO3CPU(branchPred=LTAGE())&lt;/code&gt;. This sets the O3CPU to use the LTAGE
 branch predictor instead of the default tournament branch predictor.
 It’s important to use the LTAGE branch predictor because better branch
 predictors actually make Spectre easier to exploit, as discussed further
 &lt;a href=&quot;#effects-of-branch-predictor&quot;&gt;below&lt;/a&gt;.&lt;/p&gt;
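
 &lt;p&gt;As a rough sketch, the change amounts to a single line in the script.
 The rest of &lt;code class=&quot;highlighter-rouge&quot;&gt;two_level.py&lt;/code&gt; is omitted here, and the class names are
 an assumption about the gem5 version used for this post; other releases
 may name the CPU or the predictor differently.&lt;/p&gt;

 &lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# configs/learning_gem5/part1/two_level.py (fragment, not a complete script)
 # Original line in the script:
 # system.cpu = TimingSimpleCPU()
 # Replacement: out-of-order CPU with the LTAGE branch predictor
 system.cpu = DerivO3CPU(branchPred=LTAGE())
 &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;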

 &lt;p&gt;Now, we simply need to build gem5 and run it.&lt;/p&gt;

 &lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;scons -j8 build/X86/gem5.opt

 build/X86/gem5.opt configs/learning_gem5/part1/two_level.py spectre
 &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

 &lt;p&gt;The output I get is the following, just like when I ran
 &lt;code class=&quot;highlighter-rouge&quot;&gt;spectre&lt;/code&gt; natively above.&lt;/p&gt;

 &lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;gem5 Simulator System.  http://gem5.org
 gem5 is copyrighted software; use the --copyright option for details.

 gem5 compiled May 10 2018 09:40:08
 gem5 started May 24 2018 11:21:16
 gem5 executing on palisade, pid 27173
 command line: build/X86/gem5.opt configs/learning_gem5/part1/two_level.py spectre

 Global frequency set at 1000000000000 ticks per second
 warn: DRAM device capacity (8192 Mbytes) does not match the address range assigned (512 Mbytes)
 0: system.remote_gdb: listening for remote gdb on port 7000
 Beginning simulation!
 info: Entering event queue @ 0.  Starting simulation...
 warn: readlink() called on '/proc/self/exe' may yield unexpected results in various settings.
       Returning '/home/jlp/Code/gem5/spectre-vis/spectre'
 info: Increasing stack size by one page.
 warn: ignoring syscall access(...)
 Reading 40 bytes:                 tput cols
 Reading at malicious_x = 0xffffffffffdd76c8... Success: 0x54=’T’ score=2
 Reading at malicious_x = 0xffffffffffdd76c9... Success: 0x68=’h’ score=2
 Reading at malicious_x = 0xffffffffffdd76ca... Success: 0x65=’e’ score=2
 Reading at malicious_x = 0xffffffffffdd76cb... Success: 0x20=’ ’ score=2
 Reading at malicious_x = 0xffffffffffdd76cc... Success: 0x4D=’M’ score=2
 Reading at malicious_x = 0xffffffffffdd76cd... Success: 0x61=’a’ score=2
 Reading at malicious_x = 0xffffffffffdd76ce... Success: 0x67=’g’ score=2
 Reading at malicious_x = 0xffffffffffdd76cf... Success: 0x69=’i’ score=2
 Reading at malicious_x = 0xffffffffffdd76d0... Success: 0x63=’c’ score=2
 Reading at malicious_x = 0xffffffffffdd76d1... Success: 0x20=’ ’ score=2
 Reading at malicious_x = 0xffffffffffdd76d2... Success: 0x57=’W’ score=2
 Reading at malicious_x = 0xffffffffffdd76d3... Success: 0x6F=’o’ score=2
 Reading at malicious_x = 0xffffffffffdd76d4... Success: 0x72=’r’ score=2
 Reading at malicious_x = 0xffffffffffdd76d5... Success: 0x64=’d’ score=2
 Reading at malicious_x = 0xffffffffffdd76d6... Success: 0x73=’s’ score=2
 Reading at malicious_x = 0xffffffffffdd76d7... Success: 0x20=’ ’ score=2
 Reading at malicious_x = 0xffffffffffdd76d8... Success: 0x61=’a’ score=2
 Reading at malicious_x = 0xffffffffffdd76d9... Success: 0x72=’r’ score=2
 Reading at malicious_x = 0xffffffffffdd76da... Success: 0x65=’e’ score=2
 Reading at malicious_x = 0xffffffffffdd76db... Success: 0x20=’ ’ score=2
 Reading at malicious_x = 0xffffffffffdd76dc... Success: 0x53=’S’ score=2
 Reading at malicious_x = 0xffffffffffdd76dd... Success: 0x71=’q’ score=2
 Reading at malicious_x = 0xffffffffffdd76de... Success: 0x75=’u’ score=2
 Reading at malicious_x = 0xffffffffffdd76df... Success: 0x65=’e’ score=2
 Reading at malicious_x = 0xffffffffffdd76e0... Success: 0x61=’a’ score=2
 Reading at malicious_x = 0xffffffffffdd76e1... Success: 0x6D=’m’ score=2
 Reading at malicious_x = 0xffffffffffdd76e2... Success: 0x69=’i’ score=2
 Reading at malicious_x = 0xffffffffffdd76e3... Success: 0x73=’s’ score=2
 Reading at malicious_x = 0xffffffffffdd76e4... Success: 0x68=’h’ score=2
 Reading at malicious_x = 0xffffffffffdd76e5... Success: 0x20=’ ’ score=2
 Reading at malicious_x = 0xffffffffffdd76e6... Success: 0x4F=’O’ score=2
 Reading at malicious_x = 0xffffffffffdd76e7... Success: 0x73=’s’ score=2
 Reading at malicious_x = 0xffffffffffdd76e8... Success: 0x73=’s’ score=2
 Reading at malicious_x = 0xffffffffffdd76e9... Success: 0x69=’i’ score=2
 Reading at malicious_x = 0xffffffffffdd76ea... Success: 0x66=’f’ score=2
 Reading at malicious_x = 0xffffffffffdd76eb... Success: 0x72=’r’ score=2
 Reading at malicious_x = 0xffffffffffdd76ec... Success: 0x61=’a’ score=2
 Reading at malicious_x = 0xffffffffffdd76ed... Success: 0x67=’g’ score=2
 Reading at malicious_x = 0xffffffffffdd76ee... Success: 0x65=’e’ score=2
 Reading at malicious_x = 0xffffffffffdd76ef... Success: 0x2E=’.’ score=2
 Exiting @ tick 113568969000 because exiting with last active thread context
 &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

 &lt;h2 id=&quot;visualizing-the-out-of-order-pipeline&quot;&gt;Visualizing the out of order pipeline&lt;/h2&gt;

 &lt;p&gt;To generate pipeline visualizations, we first need to generate a trace
 file of all of the instructions executed by the out of order CPU. To
 create this trace, we can use the &lt;code class=&quot;highlighter-rouge&quot;&gt;O3PipeView&lt;/code&gt; debug flag.&lt;/p&gt;

 &lt;p&gt;Now, the trace for the O3 CPU can be &lt;em&gt;very&lt;/em&gt; large, up to many GBs. When
 creating this trace, you need to be careful to create the smallest trace
 possible. Also, it’s important to dump the trace to a file and not to
 &lt;code class=&quot;highlighter-rouge&quot;&gt;stdout&lt;/code&gt;, which is the default when using debug flags. You can redirect
 the trace to a file by using the &lt;code class=&quot;highlighter-rouge&quot;&gt;--debug-file&lt;/code&gt; option to gem5.&lt;/p&gt;

 &lt;p&gt;To create the trace file, I used the following methodology:&lt;/p&gt;

 &lt;ol&gt;
   &lt;li&gt;Start running spectre in gem5, then hit ctrl-c after the first
 couple of letters. At this point, I wrote down the tick at which gem5
 exited (13062347000 for me).&lt;/li&gt;
   &lt;li&gt;Run gem5 with the debug flag &lt;code class=&quot;highlighter-rouge&quot;&gt;O3PipeView&lt;/code&gt; enabled.&lt;/li&gt;
   &lt;li&gt;Watch the output and kill gem5 with ctrl-c after two more letters
 appeared than in step 1.&lt;/li&gt;
 &lt;/ol&gt;

 &lt;p&gt;To generate the trace, I ran the following command. Note: you may have a
 different value for when to start the debugging trace. Also note: when
 producing the trace gem5 will run &lt;em&gt;much&lt;/em&gt; slower.&lt;/p&gt;

 &lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;build/X86/gem5.opt --debug-flags=O3PipeView --debug-file=pipeview.txt --debug-start=13062347000 configs/learning_gem5/part1/two_level.py spectre
 &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

 &lt;p&gt;My tracefile (&lt;code class=&quot;highlighter-rouge&quot;&gt;pipeview.txt&lt;/code&gt;) was 600 MB for catching just two letters
 in the output.&lt;/p&gt;

 &lt;p&gt;Now, we can process this file to generate the visualization with a
 script: &lt;code class=&quot;highlighter-rouge&quot;&gt;util/o3-pipeview.py&lt;/code&gt;. This script requires the path to the file
 that contains the output generated with the &lt;code class=&quot;highlighter-rouge&quot;&gt;O3PipeView&lt;/code&gt; debug flag.
 Above, we put the output into the file &lt;code class=&quot;highlighter-rouge&quot;&gt;pipeview.txt&lt;/code&gt;, and this file was
 created in the default output directory of gem5 (&lt;code class=&quot;highlighter-rouge&quot;&gt;m5out/&lt;/code&gt;).&lt;/p&gt;

 &lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;util/o3-pipeview.py --store_completions m5out/pipeview.txt --color -w 150
 &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

 &lt;p&gt;In the above command, I wanted to see when the stores completed
 (&lt;code class=&quot;highlighter-rouge&quot;&gt;--store_completions&lt;/code&gt;) and specified to use color (&lt;code class=&quot;highlighter-rouge&quot;&gt;--color&lt;/code&gt;) in the
 output and use a width of 150 characters (&lt;code class=&quot;highlighter-rouge&quot;&gt;-w 150&lt;/code&gt;). Processing a
 600 MB file like this one may take a few minutes. The output will be
 in a file called &lt;code class=&quot;highlighter-rouge&quot;&gt;o3-pipeview.out&lt;/code&gt; in the current working directory.&lt;/p&gt;

 &lt;p&gt;You can view this file with &lt;code class=&quot;highlighter-rouge&quot;&gt;less -r o3-pipeview.out&lt;/code&gt;. You may want to
 use the &lt;code class=&quot;highlighter-rouge&quot;&gt;-S&lt;/code&gt; option with less if your terminal is less than 150
 characters wide (or whatever width value you used). Below is a
 screenshot of the top of my trace.&lt;/p&gt;

 &lt;h3 id=&quot;understanding-the-o3-pipeline-viewer&quot;&gt;Understanding the O3 pipeline viewer&lt;/h3&gt;

 &lt;p&gt;&lt;img src=&quot;/assets/img/o3-example-annotated.png&quot; alt=&quot;o3 pipeline view example&quot; /&gt;&lt;/p&gt;

 &lt;p&gt;The above image details how to interpret the output from the pipeline
 viewer. Each &lt;code class=&quot;highlighter-rouge&quot;&gt;.&lt;/code&gt; or &lt;code class=&quot;highlighter-rouge&quot;&gt;=&lt;/code&gt; represents one cycle of time, which moves from
 left to right. The “tick” column shows the tick of the leftmost &lt;code class=&quot;highlighter-rouge&quot;&gt;.&lt;/code&gt; or
 &lt;code class=&quot;highlighter-rouge&quot;&gt;=&lt;/code&gt;. &lt;code class=&quot;highlighter-rouge&quot;&gt;=&lt;/code&gt; is used to mark the instructions that were later squashed. The
 address of the instruction (and the micro-op number) as well as the
 disassembly is also shown. The sequence number can be ignored as it is
 always monotonically increasing and is the total order of every dynamic
 instruction. Finally, each stage of the O3 pipeline is shown with a
 different letter and color.&lt;/p&gt;

 &lt;h2 id=&quot;digging-deeper-into-spectre&quot;&gt;Digging deeper into Spectre&lt;/h2&gt;

 &lt;p&gt;First, let’s examine the actual instructions that are executed during
 the Spectre attack. The vulnerability is in the &lt;code class=&quot;highlighter-rouge&quot;&gt;victim_function&lt;/code&gt; in
 &lt;code class=&quot;highlighter-rouge&quot;&gt;spectre.c&lt;/code&gt;.&lt;/p&gt;

 &lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;void victim_function(size_t x) {
   if (x &amp;lt; array1_size) {
     temp &amp;amp;= array2[array1[x] * 512];
   }
 }
 &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

 &lt;p&gt;When this is compiled and then dumped with &lt;code class=&quot;highlighter-rouge&quot;&gt;objdump&lt;/code&gt;, we get the
 following instructions that will be executed. Your code may be slightly
 different, especially the exact addresses of each instruction, depending
 on the version of the compiler and other system-specific configurations.&lt;/p&gt;

 &lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# NOTE: the movzbl below is MOVZX_B_R_M in gem5.
 # it is implemented with the following microcode.
 #    ld t1, seg, sib, disp, dataSize=1
 #    zexti reg, t1, 7
 #
 000000000040105e &amp;lt;victim_function&amp;gt;:
   40105e:   55                      push   %rbp
   40105f:   48 89 e5                mov    %rsp,%rbp
   401062:   48 89 7d f8             mov    %rdi,-0x8(%rbp)
   401066:   8b 05 14 f0 2b 00       mov    0x2bf014(%rip),%eax # 6c0080 &amp;lt;array1_size&amp;gt; load array1_size (first time is always a miss)
   40106c:   89 c0                   mov    %eax,%eax
   40106e:   48 3b 45 f8             cmp    -0x8(%rbp),%rax  # if (x &amp;lt; array1_size) rax is array1_size, -8(%rbp) is x
   401072:   76 2b                   jbe    40109f &amp;lt;victim_function+0x41&amp;gt; # if (x &amp;lt; array1_size)
   401074:   48 8b 45 f8             mov    -0x8(%rbp),%rax # load x from the stack into rax
   401078:   48 05 a0 00 6c 00       add    $0x6c00a0,%rax  # calculate array1 offset (x+array1)
   40107e:   0f b6 00                movzbl (%rax),%eax # load array1[x]
   401081:   0f b6 c0                movzbl %al,%eax    # zero extend to 32 bits
   401084:   c1 e0 09                shl    $0x9,%eax   # multiply by 512
   401087:   48 98                   cltq               # sign-extend eax
   401089:   0f b6 90 80 1d 6c 00    movzbl 0x6c1d80(%rax),%edx  # load array2[array1[x]*512] **** This is the magic!
   401090:   0f b6 05 e9 0c 2e 00    movzbl 0x2e0ce9(%rip),%eax        # 6e1d80 &amp;lt;temp&amp;gt; Load temp.
   401097:   21 d0                   and    %edx,%eax
   401099:   88 05 e1 0c 2e 00       mov    %al,0x2e0ce1(%rip)        # 6e1d80 &amp;lt;temp&amp;gt;
   40109f:   5d                      pop    %rbp
   4010a0:   c3                      retq
 &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

 &lt;p&gt;Now, we can search for the instruction that we care about in the trace.
 In this case, we want to find a time where the &lt;code class=&quot;highlighter-rouge&quot;&gt;movzbl&lt;/code&gt; at address
 &lt;code class=&quot;highlighter-rouge&quot;&gt;0x401089&lt;/code&gt; is executed speculatively. When searching through the
 pipeline viewer (use &lt;code class=&quot;highlighter-rouge&quot;&gt;/&lt;/code&gt; in less), we’re looking for a time where the
 load completes for the instruction at &lt;code class=&quot;highlighter-rouge&quot;&gt;0x401089&lt;/code&gt; and it is later
 squashed (surrounded by &lt;code class=&quot;highlighter-rouge&quot;&gt;=&lt;/code&gt;). An example is shown below.&lt;/p&gt;

 &lt;p&gt;&lt;img src=&quot;/assets/img/o3-spectre-annotated.png&quot; alt=&quot;annotated O3 pipeline view of
 spectre&quot; /&gt;&lt;/p&gt;

 &lt;p&gt;The image above is from my presentation at &lt;a href=&quot;http://caslab.csl.yale.edu/workshops/hasp2018/&quot;&gt;Hardware and Architectural
 Support for Security and Privacy (HASP)
 2018&lt;/a&gt;.&lt;/p&gt;

 &lt;p&gt;What we see in this image is that the instruction at &lt;code class=&quot;highlighter-rouge&quot;&gt;0x401066&lt;/code&gt; causes a
 cache miss (there is a long time between when the load is issued and the
 data is returned from memory). Since the load of &lt;code class=&quot;highlighter-rouge&quot;&gt;array1_size&lt;/code&gt; was a
 cache miss, the jump at &lt;code class=&quot;highlighter-rouge&quot;&gt;0x401072&lt;/code&gt; is speculated to be &lt;em&gt;not&lt;/em&gt; taken
 (incorrectly). This causes the following instructions to be executed
 speculatively, and, eventually, squashed.&lt;/p&gt;

 &lt;p&gt;The key thing in this trace that &lt;em&gt;is&lt;/em&gt; the Spectre vulnerability is that
 the load for the instruction at &lt;code class=&quot;highlighter-rouge&quot;&gt;0x40107e&lt;/code&gt;, which loads secret data,
 happens during the mis-speculated instructions. Then, this data is
 loaded into the registers and operated on (instruction &lt;code class=&quot;highlighter-rouge&quot;&gt;0x401084&lt;/code&gt;).
 Finally, the load at address &lt;code class=&quot;highlighter-rouge&quot;&gt;0x401089&lt;/code&gt; is executed and loads the value
 from memory &lt;em&gt;that is dependent on the secret data loaded previously&lt;/em&gt;.
 Thus, we can later probe the cache to retrieve the secret data.&lt;/p&gt;

 &lt;h3 id=&quot;effects-of-compilers&quot;&gt;Effects of compilers&lt;/h3&gt;

 &lt;p&gt;As previously mentioned, the specific compiler version and compiler
 options have a significant effect on the attack. Below are two traces,
 one from GCC 7.2 and one from clang 4.0.&lt;/p&gt;

 &lt;h4 id=&quot;gcc-72&quot;&gt;GCC 7.2&lt;/h4&gt;

 &lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;void victim_function(size_t x) {
   400b2d:       55                      push   %rbp
   400b2e:       48 89 e5                mov    %rsp,%rbp
   400b31:       48 89 7d f8             mov    %rdi,-0x8(%rbp)
   if (x &amp;lt; array1_size) {
   400b35:       8b 05 c5 c5 2c 00       mov    0x2cc5c5(%rip),%eax        # 6cd100 &amp;lt;array1_size&amp;gt;
   400b3b:       89 c0                   mov    %eax,%eax
   400b3d:       48 39 45 f8             cmp    %rax,-0x8(%rbp)
   400b41:       73 34                   jae    400b77 &amp;lt;victim_function+0x4a&amp;gt;
     temp &amp;amp;= array2[array1[x] * 512];
   400b43:       48 8d 15 d6 c5 2c 00    lea    0x2cc5d6(%rip),%rdx        # 6cd120 &amp;lt;array1&amp;gt;
   400b4a:       48 8b 45 f8             mov    -0x8(%rbp),%rax
   400b4e:       48 01 d0                add    %rdx,%rax
   400b51:       0f b6 00                movzbl (%rax),%eax
   400b54:       0f b6 c0                movzbl %al,%eax
   400b57:       c1 e0 09                shl    $0x9,%eax
   400b5a:       48 63 d0                movslq %eax,%rdx
   400b5d:       48 8d 05 9c f6 2c 00    lea    0x2cf69c(%rip),%rax        # 6d0200 &amp;lt;array2&amp;gt;
   400b64:       0f b6 14 02             movzbl (%rdx,%rax,1),%edx
   400b68:       0f b6 05 91 e1 2c 00    movzbl 0x2ce191(%rip),%eax        # 6ced00 &amp;lt;temp&amp;gt;
   400b6f:       21 d0                   and    %edx,%eax
   400b71:       88 05 89 e1 2c 00       mov    %al,0x2ce189(%rip)        # 6ced00 &amp;lt;temp&amp;gt;
   }
 }
   400b77:       90                      nop
   400b78:       5d                      pop    %rbp
   400b79:       c3                      retq
 &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

 &lt;iframe height=&quot;500&quot; src=&quot;/assets/img/gcc72-static-tage.html&quot; frameborder=&quot;0&quot;&gt;
 &lt;/iframe&gt;
 &lt;p&gt;However, clang generates the following code.&lt;/p&gt;

 &lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;void victim_function(size_t x) {
   400ac0:       55                      push   %rbp
   400ac1:       48 89 e5                mov    %rsp,%rbp
   400ac4:       48 89 7d f8             mov    %rdi,-0x8(%rbp)
   if (x &amp;lt; array1_size) {
   400ac8:       48 8b 7d f8             mov    -0x8(%rbp),%rdi
   400acc:       8b 04 25 90 c0 6c 00    mov    0x6cc090,%eax
   400ad3:       89 c1                   mov    %eax,%ecx
   400ad5:       48 39 cf                cmp    %rcx,%rdi
   400ad8:       0f 83 2f 00 00 00       jae    400b0d &amp;lt;victim_function+0x4d&amp;gt;
     temp &amp;amp;= array2[array1[x] * 512];
   400ade:       48 8b 45 f8             mov    -0x8(%rbp),%rax
   400ae2:       0f b6 0c 05 a0 c0 6c    movzbl 0x6cc0a0(,%rax,1),%ecx
   400ae9:       00
   400aea:       c1 e1 09                shl    $0x9,%ecx
   400aed:       48 63 c1                movslq %ecx,%rax
   400af0:       0f b6 0c 05 40 f2 6c    movzbl 0x6cf240(,%rax,1),%ecx
   400af7:       00
   400af8:       0f b6 14 25 50 dc 6c    movzbl 0x6cdc50,%edx
   400aff:       00
   400b00:       21 ca                   and    %ecx,%edx
   400b02:       40 88 d6                mov    %dl,%sil
   400b05:       40 88 34 25 50 dc 6c    mov    %sil,0x6cdc50
   400b0c:       00
   }
 }
   400b0d:       5d                      pop    %rbp
   400b0e:       c3                      retq
   400b0f:       90                      nop
 &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

 &lt;iframe height=&quot;500&quot; src=&quot;/assets/img/clang-static-tage.html&quot; frameborder=&quot;0&quot;&gt;
 &lt;/iframe&gt;
 &lt;p&gt;Interestingly, the clang-compiled &lt;code class=&quot;highlighter-rouge&quot;&gt;spectre&lt;/code&gt; binary is not able to read
 the secret data! (At least not in gem5. It is able to read the secret
 data on my native machine.)&lt;/p&gt;

 &lt;p&gt;We can look into the two traces to see the difference between the clang
 version and the GCC version.&lt;/p&gt;

 &lt;p&gt;The main difference is that in the clang version, the load generated by
 the instruction at &lt;code class=&quot;highlighter-rouge&quot;&gt;0x400af0&lt;/code&gt; never completes (and thus, must not have
 been issued to the memory system).&lt;/p&gt;

 &lt;p&gt;I’m not sure of the exact cause of this difference. It could be that the
 instruction uses a different addressing mode
 (&lt;code class=&quot;highlighter-rouge&quot;&gt;movzbl 0x6cf240(,%rax,1),%ecx&lt;/code&gt; in clang vs &lt;code class=&quot;highlighter-rouge&quot;&gt;movzbl (%rdx,%rax,1),%edx&lt;/code&gt;
 in GCC). If you have ideas, please leave a comment!&lt;/p&gt;

 &lt;p&gt;Either way, minor differences in the code generated can have large
 impacts on the speculative execution!&lt;/p&gt;

 &lt;h3 id=&quot;effects-of-branch-predictor&quot;&gt;Effects of branch predictor&lt;/h3&gt;

 &lt;p&gt;When I was first playing around with Spectre and gem5, I ran into a
 problem where I could only &lt;em&gt;sometimes&lt;/em&gt; get Spectre to “work” with the
 out of order CPU. After significant digging, I found that the branch
 predictor chosen makes a big difference to how quickly the vulnerability
 happens. The trace below (with the same code as GCC 4.8 above) shows
 what happens when using the tournament branch predictor.&lt;/p&gt;

 &lt;iframe height=&quot;500&quot; src=&quot;/assets/img/gcc-static-tourn.html&quot; frameborder=&quot;0&quot;&gt;
 &lt;/iframe&gt;
 &lt;p&gt;Here, we see that the original branch misprediction comes much earlier
 than the jump instruction in &lt;code class=&quot;highlighter-rouge&quot;&gt;victim_function&lt;/code&gt; that is at address
 &lt;code class=&quot;highlighter-rouge&quot;&gt;0x401072&lt;/code&gt;. Thus, by the time the load instructions in &lt;code class=&quot;highlighter-rouge&quot;&gt;victim_function&lt;/code&gt;
 are executed, the ROB and load-store queue resources have been taken by
 other instructions and the rogue loads are not issued to memory. There
 are still a few times that the two loads are executed speculatively, but
 it is much rarer than with the TAGE predictor. When using the TAGE
 branch predictor, only the exact branch that the attacker wants to
 mispredict is mispredicted.&lt;/p&gt;
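
 &lt;p&gt;To try this comparison yourself, only the predictor passed to the CPU
 needs to change. The fragment below is a sketch that assumes the same
 modified &lt;code class=&quot;highlighter-rouge&quot;&gt;two_level.py&lt;/code&gt; as before and that &lt;code class=&quot;highlighter-rouge&quot;&gt;TournamentBP&lt;/code&gt; is the
 class name of the tournament predictor in your gem5 version.&lt;/p&gt;

 &lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# configs/learning_gem5/part1/two_level.py (fragment, not a complete script)
 # Tournament predictor: the O3CPU default, written out explicitly here
 system.cpu = DerivO3CPU(branchPred=TournamentBP())
 # LTAGE predictor, as used for the earlier traces:
 # system.cpu = DerivO3CPU(branchPred=LTAGE())
 &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;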

 &lt;p&gt;This interestingly shows that a “smarter” system is actually &lt;em&gt;more&lt;/em&gt;
 vulnerable to speculation-based attacks!&lt;/p&gt;
 </description>
         <pubDate>Fri, 01 Jun 2018 00:00:00 -0700</pubDate>
         <link>http://localhost:4000/2018/06/01/gem5-spectre.html</link>
         <guid isPermaLink="true">http://localhost:4000/2018/06/01/gem5-spectre.html</guid>


       </item>

       <item>
         <title>Setting up gem5 full system</title>
         <description>&lt;p&gt;This is partially a followup to &lt;a href=&quot;http://www.lowepower.com/jason/creating-disk-images-for-gem5.html&quot;&gt;Creating disk images for
 gem5&lt;/a&gt;
 and partially how to set up x86 full system for gem5. In this post, I’ll
 discuss how to create a disk image from scratch and start using it with
 gem5.&lt;/p&gt;

 &lt;p&gt;It is important for computer architecture research to use the most
 up-to-date software on the systems we are simulating. Too much computer
 architecture research reports results using kernels from 5+ years ago or
 ancient system software. Hopefully, this post will help others
 keep up with the ever-changing system software. This way, researchers
 can use up-to-date versions of Linux and easily update their kernels.&lt;/p&gt;

 &lt;p&gt;This post takes a different approach than &lt;a href=&quot;http://www.lowepower.com/jason/creating-disk-images-for-gem5.html&quot;&gt;Creating disk images for
 gem5&lt;/a&gt;.
 Instead of using the gem5 tools, this post uses qemu to create, edit,
 and set up the disk for gem5 usage.&lt;/p&gt;

 &lt;p&gt;This post assumes that you have installed qemu on your system. In
 Ubuntu, this can be done with&lt;/p&gt;

 &lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;sudo apt-get install qemu-kvm libvirt-bin ubuntu-vm-builder bridge-utils
 &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

 &lt;p&gt;I also assume you have downloaded and built gem5. All of the full system
 examples use the simple full system scripts that are covered in
 &lt;a href=&quot;http://learning.gem5.org/book/part3/index.html&quot;&gt;Learning gem5&lt;/a&gt;.&lt;/p&gt;

 &lt;h2 id=&quot;step-1-create-an-empty-disk&quot;&gt;Step 1: Create an empty disk&lt;/h2&gt;

 &lt;p&gt;Using the qemu disk tools, create a blank raw disk image. In this case,
 I chose to create a disk named “ubuntu-test.img” that is 8GB.&lt;/p&gt;

 &lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;qemu-img create ubuntu-test.img 8G
 &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

 &lt;h2 id=&quot;step-2-install-ubuntu-with-qemu&quot;&gt;Step 2: Install ubuntu with qemu&lt;/h2&gt;

 &lt;p&gt;Now that we have a blank disk, we are going to use qemu to install
 Ubuntu on the disk. I would encourage you to use the server version of
 Ubuntu since gem5 does not have great support for displays. Thus, the
 desktop environment isn’t very useful.&lt;/p&gt;

 &lt;p&gt;First, you need to download the installation CD image from the &lt;a href=&quot;https://www.ubuntu.com/download/server&quot;&gt;Ubuntu
 website&lt;/a&gt;.&lt;/p&gt;

 &lt;p&gt;Next, use qemu to boot off of the CD image, and set the disk in the
 system to be the blank disk you created above. Ubuntu needs at least 1GB
 of memory to install correctly, so be sure to configure qemu to use at
 least 1GB memory.&lt;/p&gt;

 &lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;qemu-system-x86_64 -hda ../gem5-fs-testing/ubuntu-test.img -cdrom ubuntu-16.04.1-server-amd64.iso -m 1024 -enable-kvm -boot d
 &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

 &lt;p&gt;With this, you can simply follow the on-screen directions to install
 Ubuntu to the disk image. The only gotcha in the installation is that
 gem5’s IDE drivers don’t seem to play nicely with logical partitions.
 Thus, during the Ubuntu install, be sure to manually partition the disk
 and remove any logical partitions. You don’t need any swap space on the
 disk anyway, unless you’re doing something specifically with swap space.&lt;/p&gt;

 &lt;h2 id=&quot;step-3-boot-up-and-install-needed-software&quot;&gt;Step 3: Boot up and install needed software&lt;/h2&gt;

 &lt;p&gt;Once you have installed Ubuntu on the disk, quit qemu and remove the
 &lt;code class=&quot;highlighter-rouge&quot;&gt;-boot d&lt;/code&gt; option so that you are not booting off of the CD anymore. Now,
 you can again boot off of the main disk image you have installed Ubuntu
 on.&lt;/p&gt;

 &lt;p&gt;Since we’re using qemu, you should have a network connection (although
 &lt;a href=&quot;http://wiki.qemu.org/Documentation/Networking#User_Networking_.28SLIRP.29&quot;&gt;ping won’t
 work&lt;/a&gt;).
 When booting in qemu, you can just use &lt;code class=&quot;highlighter-rouge&quot;&gt;sudo apt-get install&lt;/code&gt; and
 install any software you need on your disk.&lt;/p&gt;

 &lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;qemu-system-x86_64 -hda ../gem5-fs-testing/ubuntu-test.img -cdrom ubuntu-16.04.1-server-amd64.iso -m 1024 -enable-kvm
 &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

 &lt;h2 id=&quot;step-4-build-a-kernel&quot;&gt;Step 4: Build a kernel&lt;/h2&gt;

 &lt;p&gt;Next, you need to build a Linux kernel. Unfortunately, the
 out-of-the-box Ubuntu kernel doesn’t play well with gem5. See the
 KVM panic described under “Problems and (some) solutions” below.&lt;/p&gt;

 &lt;p&gt;First, you need to download the latest kernel from
 &lt;a href=&quot;https://www.kernel.org/&quot;&gt;kernel.org&lt;/a&gt;. Then, to build the kernel, you
 are going to want to start with a known-good config file.
 The config file that I used for kernel version 4.8.13 can be
 downloaded &lt;a href=&quot;{filename}files/config&quot;&gt;here&lt;/a&gt;. Then, you need to move the
 good config to &lt;code class=&quot;highlighter-rouge&quot;&gt;.config&lt;/code&gt; and then run &lt;code class=&quot;highlighter-rouge&quot;&gt;make oldconfig&lt;/code&gt;, which starts the
 kernel configuration process with an existing config file.&lt;/p&gt;

 &lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;mv &amp;lt;good config&amp;gt; .config
 make oldconfig
 &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

 &lt;p&gt;At this point you can select any extra drivers you want to build into
 the kernel. Note: You cannot use any kernel modules unless you plan
 to copy the modules onto the guest disk at the correct
 location. Otherwise, all drivers must be built into the kernel binary.&lt;/p&gt;
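
 &lt;p&gt;As a rough illustration of what “built into the kernel binary” means in
 &lt;code class=&quot;highlighter-rouge&quot;&gt;.config&lt;/code&gt; terms, the fragment below contrasts a built-in option with a
 module. The two options shown are only examples that I am assuming are
 relevant (an ext4 root file system and the PIIX IDE controller driver);
 check your own config for the drivers your disk image actually needs.&lt;/p&gt;

 &lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# =y builds the driver into the kernel binary, so no module file is needed
 CONFIG_EXT4_FS=y
 CONFIG_ATA_PIIX=y
 # =m would instead build a loadable module that has to be copied to the guest
 # CONFIG_EXT4_FS=m
 &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;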

 &lt;p&gt;It may be possible to use modules by compiling the binary on the guest
 disk via qemu, but I have not tested this.&lt;/p&gt;

 &lt;p&gt;Finally, you need to build the kernel.&lt;/p&gt;

 &lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;make -j5
 &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

 &lt;h2 id=&quot;step-5-update-init-script&quot;&gt;Step 5: Update init script&lt;/h2&gt;

 &lt;p&gt;By default, gem5 expects a modified init script which loads a script off
 of the host to execute in the guest. To use this feature, you need to
 follow the steps below.&lt;/p&gt;

 &lt;p&gt;Alternatively, you can install the precompiled binaries for x86 found on
 my website. From qemu, you can run the following, which completes the
 manual steps described below for you.&lt;/p&gt;

 &lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;wget http://cs.wisc.edu/~powerjg/files/gem5-guest-tools-x86.tgz
 tar xzvf gem5-guest-tools-x86.tgz
 cd gem5-guest-tools/
 sudo ./install
 &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

 &lt;p&gt;Now, you can use the &lt;code class=&quot;highlighter-rouge&quot;&gt;system.readfile&lt;/code&gt; parameter in your Python config
 scripts. This file will automatically be loaded (by the &lt;code class=&quot;highlighter-rouge&quot;&gt;gem5init&lt;/code&gt;
 script) and executed.&lt;/p&gt;

 &lt;h3 id=&quot;manually-installing-the-gem5-init-script&quot;&gt;Manually installing the gem5 init script&lt;/h3&gt;

 &lt;p&gt;First, build the m5 binary on the host.&lt;/p&gt;

 &lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;cd util/m5
 make -f Makefile.x86
 &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

 &lt;p&gt;Then, copy this binary to the guest and put it in &lt;code class=&quot;highlighter-rouge&quot;&gt;/sbin&lt;/code&gt;. Also, create
 a link from &lt;code class=&quot;highlighter-rouge&quot;&gt;/sbin/gem5&lt;/code&gt;.&lt;/p&gt;

 &lt;p&gt;Then, to get the init script to execute when gem5 boots, create the file
 /lib/systemd/system/gem5.service with the following:&lt;/p&gt;

 &lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;[Unit]
 Description=gem5 init script
 Documentation=http://gem5.org
 After=getty.target

 [Service]
 Type=idle
 ExecStart=/sbin/gem5init
 StandardOutput=tty
 StandardInput=tty-force
 StandardError=tty

 [Install]
 WantedBy=default.target
 &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

 &lt;p&gt;Enable the gem5 service and disable the ttyS0 service.&lt;/p&gt;

 &lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;systemctl enable gem5.service
 &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

 &lt;p&gt;Finally, create the init script that is executed by the service. In
 &lt;code class=&quot;highlighter-rouge&quot;&gt;/sbin/gem5init&lt;/code&gt;:&lt;/p&gt;

 &lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c&quot;&gt;#!/bin/bash -&lt;/span&gt;

 &lt;span class=&quot;nv&quot;&gt;CPU&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;sb&quot;&gt;`&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;cat&lt;/span&gt; /proc/cpuinfo | &lt;span class=&quot;nb&quot;&gt;grep &lt;/span&gt;vendor_id | head &lt;span class=&quot;nt&quot;&gt;-n&lt;/span&gt; 1 | cut &lt;span class=&quot;nt&quot;&gt;-d&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;' '&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-f2-&lt;/span&gt;&lt;span class=&quot;sb&quot;&gt;`&lt;/span&gt;
 &lt;span class=&quot;nb&quot;&gt;echo&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;Got CPU type: &lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$CPU&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;

 &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;[&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$CPU&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;!=&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;M5 Simulator&quot;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
 &lt;span class=&quot;k&quot;&gt;then
     &lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;echo&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;Not in gem5. Not loading script&quot;&lt;/span&gt;
     &lt;span class=&quot;nb&quot;&gt;exit &lt;/span&gt;0
 &lt;span class=&quot;k&quot;&gt;fi&lt;/span&gt;

 &lt;span class=&quot;c&quot;&gt;# Try to read in the script from the host system&lt;/span&gt;
 /sbin/m5 readfile &lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; /tmp/script
 chmod 755 /tmp/script
 &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;[&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-s&lt;/span&gt; /tmp/script &lt;span class=&quot;o&quot;&gt;]&lt;/span&gt;
 &lt;span class=&quot;k&quot;&gt;then&lt;/span&gt;
     &lt;span class=&quot;c&quot;&gt;# If there is a script, execute the script and then exit the simulation&lt;/span&gt;
     su root &lt;span class=&quot;nt&quot;&gt;-c&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;'/tmp/script'&lt;/span&gt; &lt;span class=&quot;c&quot;&gt;# gives script full privileges as root user in multi-user mode&lt;/span&gt;
     sync
     sleep 10
     /sbin/m5 &lt;span class=&quot;nb&quot;&gt;exit
 &lt;/span&gt;&lt;span class=&quot;k&quot;&gt;fi
 &lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;echo&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;No script found&quot;&lt;/span&gt;
 &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

 &lt;h2 id=&quot;problems-and-some-solutions&quot;&gt;Problems and (some) solutions&lt;/h2&gt;

 &lt;h3 id=&quot;failed-to-early-mount-api-filesystems&quot;&gt;Failed to early mount API filesystems&lt;/h3&gt;

 &lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;Write protecting the kernel read-only data: 8192k
 Freeing unused kernel memory: 1956K (ffff880001417000 - ffff880001600000)
 Freeing unused kernel memory: 456K (ffff88000178e000 - ffff880001800000)
 [!!!!!!] Failed to early mount API filesystems, freezing.
 &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

 &lt;p&gt;Solutions tried: enabling cgroups in the kernel, which did not help. I think
 this is the same as the “Can’t mount /dev” problem below.&lt;/p&gt;

 &lt;h3 id=&quot;cant-mount-dev&quot;&gt;Can’t mount /dev&lt;/h3&gt;

 &lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;Failed to mount devtmpfs at /dev: No such device
 Freezing execution.
 &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

 &lt;p&gt;The error looks something like the above (this output was taken from an
 Arch Linux boot). The problem is that the right devfs is not compiled into the kernel.
 You need to make sure that devtmpfs is enabled.&lt;/p&gt;
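
 &lt;p&gt;A sketch of the corresponding lines in the kernel &lt;code class=&quot;highlighter-rouge&quot;&gt;.config&lt;/code&gt;, assuming a
 kernel version in which these option names apply. The second option, which
 asks the kernel to mount devtmpfs on /dev automatically, is something I would
 expect to help rather than something verified for this setup.&lt;/p&gt;

 &lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# Build devtmpfs support into the kernel
 CONFIG_DEVTMPFS=y
 # Assumption: also mount it automatically at boot
 CONFIG_DEVTMPFS_MOUNT=y
 &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;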

 &lt;h3 id=&quot;panic-kvm-unexpected-exit-exit_reason-8&quot;&gt;panic: KVM: Unexpected exit (exit_reason: 8)&lt;/h3&gt;

 &lt;p&gt;Exit reason 8 is “shutdown”. See
 &lt;a href=&quot;http://lxr.free-electrons.com/source/include/uapi/linux/kvm.h#L188&quot;&gt;http://lxr.free-electrons.com/source/include/uapi/linux/kvm.h#L188&lt;/a&gt;.
 This seems to happen when there is a triple fault:
 &lt;a href=&quot;http://lxr.free-electrons.com/source/arch/x86/kvm/x86.c#L6498&quot;&gt;http://lxr.free-electrons.com/source/arch/x86/kvm/x86.c#L6498&lt;/a&gt;&lt;/p&gt;

 &lt;p&gt;I get this error every time I try to boot the unmodified Ubuntu kernel.
 I don’t know how to solve this problem. Instead of trying to solve the
 problem, I used a different config file for “oldconfig” when I compiled
 the kernel from scratch.&lt;/p&gt;

 &lt;h3 id=&quot;slow-boot&quot;&gt;Slow boot&lt;/h3&gt;

 &lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;[ TIME ] Timed out waiting for device dev-di...\x2da115\x2de3f263d7b53a.device.
 [DEPEND] Dependency failed for /dev/disk/by-...382-f41d-4c99-a115-e3f263d7b53a.
 [DEPEND] Dependency failed for Swap.
 &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

 &lt;p&gt;This may happen if you have changed the disk without updating the fstab
 on the disk. To fix it, you can boot the disk in qemu and update fstab
 with the correct UUID.&lt;/p&gt;

 &lt;p&gt;I ran into this when I was resizing the disk.&lt;/p&gt;
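
 &lt;p&gt;As a sketch, the swap entry in &lt;code class=&quot;highlighter-rouge&quot;&gt;/etc/fstab&lt;/code&gt; might look like the following
 after the update. The UUID shown is a placeholder, not a real value;
 substitute whatever &lt;code class=&quot;highlighter-rouge&quot;&gt;blkid&lt;/code&gt; reports for your partition, or remove the line
 entirely if the disk no longer has a swap partition.&lt;/p&gt;

 &lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# &amp;lt;file system&amp;gt;          &amp;lt;mount point&amp;gt;  &amp;lt;type&amp;gt;  &amp;lt;options&amp;gt;  &amp;lt;dump&amp;gt;  &amp;lt;pass&amp;gt;
 UUID=&amp;lt;uuid-from-blkid&amp;gt;  none           swap    sw         0       0
 &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;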

 &lt;h3 id=&quot;disk-is-too-small-for-what-you-want-to-do&quot;&gt;Disk is too small for what you want to do&lt;/h3&gt;

 &lt;p&gt;Resizing a disk image is pretty easy. You can use the same method you would if
 you wanted to resize a partition on a regular hard drive.&lt;/p&gt;

 &lt;p&gt;First, you need to resize the image with qemu-img:&lt;/p&gt;

 &lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;qemu-img resize ubuntu-test.img +8G
 &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

 &lt;p&gt;Now, you have a disk that has 8 GB of free space at the end of the disk.
 You need to resize the partitions to use this free space. To do this, I
 suggest using gparted just like you would for a real hard drive.&lt;/p&gt;

 &lt;p&gt;You can download a gparted ISO from &lt;a href=&quot;http://gparted.org/livecd.php&quot;&gt;http://gparted.org/livecd.php&lt;/a&gt;.
 Once you download the ISO, you can boot it with qemu the same way as we
 booted the installation CD. Then, once it’s booted, you can select the
 disk you want to modify and follow the howto
 (&lt;a href=&quot;http://gparted.org/display-doc.php%3Fname%3Dhelp-manual&quot;&gt;http://gparted.org/display-doc.php%3Fname%3Dhelp-manual&lt;/a&gt;).&lt;/p&gt;
 </description>
         <pubDate>Fri, 13 Jan 2017 00:00:00 -0800</pubDate>
         <link>http://localhost:4000/tools/2017/01/13/gem5-fs.html</link>
         <guid isPermaLink="true">http://localhost:4000/tools/2017/01/13/gem5-fs.html</guid>


         <category>tools</category>

       </item>

       <item>
         <title>Creating disk images for gem5</title>
         <description>&lt;p&gt;When using gem5 in full-system mode, you have to have a disk image with
 the operating system and all of your data on it. This is just like
 having a physical disk in a physical machine. In this post, I’m going to
 walk through how to create a new disk and install a (semi-)current
 version of Ubuntu on the disk. By the end of this post, you should be
 able to create your own disk with whatever extra data and applications
 you want.&lt;/p&gt;

 &lt;p&gt;This post assumes that you have already checked out a version of gem5
 and can build and run gem5 in full-system mode. The &lt;a href=&quot;http://www.lowepower.com/jason/learning_gem5/&quot;&gt;Learning
 gem5&lt;/a&gt; documentation is a
 good place to start. This post uses the x86 ISA for gem5, and is mostly
 applicable to other ISAs. More details on setting up ARM systems can be
 found on the gem5 wiki:
 &lt;a href=&quot;http://gem5.org/Ubuntu_Disk_Image_for_ARM_Full_System&quot;&gt;http://gem5.org/Ubuntu_Disk_Image_for_ARM_Full_System&lt;/a&gt;.&lt;/p&gt;

 &lt;p&gt;In the future, this post may be folded into &lt;a href=&quot;http://www.lowepower.com/jason/learning_gem5/&quot;&gt;Learning
 gem5&lt;/a&gt;.&lt;/p&gt;

 &lt;h2 id=&quot;creating-a-blank-disk-image&quot;&gt;Creating a blank disk image&lt;/h2&gt;

 &lt;p&gt;The first step is to create a blank disk image (usually a .img file).
 Luckily, the gem5 developers have already made this easy with a tool
 that is simple to use. To create a blank disk image, which is formatted
 with ext2 by default, simply run the following.&lt;/p&gt;

 &lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&amp;gt; util/gem5img.py init ubuntu-14.04.img 4096
 &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

 &lt;p&gt;This command creates a new image, called “ubuntu-14.04.img” that is 4096
 MB. This command may require you to enter the sudo password, if you
 don’t have permission to create loopback devices. &lt;em&gt;You should never run
 commands as the root user that you don’t understand! You should look at
 the file util/gem5img.py and ensure that it isn’t going to do anything
 malicious to your computer!&lt;/em&gt;&lt;/p&gt;

 &lt;p&gt;We will be using util/gem5img.py heavily throughout this post, so you
 may want to understand it better. If you just run &lt;code class=&quot;highlighter-rouge&quot;&gt;util/gem5img.py&lt;/code&gt;, it
 displays all of the possible commands.&lt;/p&gt;

 &lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;Usage: %s [command] &amp;lt;command arguments&amp;gt;
 where [command] is one of
     init: Create an image with an empty file system.
     mount: Mount the first partition in the disk image.
     umount: Unmount the first partition in the disk image.
     new: File creation part of &quot;init&quot;.
     partition: Partition part of &quot;init&quot;.
     format: Formatting part of &quot;init&quot;.
 Watch for orphaned loopback devices and delete them with
 losetup -d. Mounted images will belong to root, so you may need
 to use sudo to modify their contents
 &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

 &lt;h2 id=&quot;copying-root-files-to-the-disk&quot;&gt;Copying root files to the disk&lt;/h2&gt;

 &lt;p&gt;Now that we have created a blank disk, we need to populate it with all
 of the OS files. Ubuntu distributes a set of files explicitly for this
 purpose. You can find the &lt;a href=&quot;https://wiki.ubuntu.com/Core&quot;&gt;Ubuntu core&lt;/a&gt;
 distribution for 14.04 at
 &lt;a href=&quot;http://cdimage.ubuntu.com/ubuntu-core/releases/14.04/release/&quot;&gt;http://cdimage.ubuntu.com/ubuntu-core/releases/14.04/release/&lt;/a&gt;. Since I
 am simulating an x86 machine, I chose the file
 &lt;code class=&quot;highlighter-rouge&quot;&gt;ubuntu-core-14.04-core-amd64.tar.gz&lt;/code&gt;. Download whatever image is
 appropriate for the system you are simulating.&lt;/p&gt;

 &lt;p&gt;Next, we need to mount the blank disk and copy all of the files onto the
 disk.&lt;/p&gt;

 &lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;mkdir mnt
 ../../util/gem5img.py mount ubuntu-14.04.img mnt
 wget http://cdimage.ubuntu.com/ubuntu-core/releases/14.04/release/ubuntu-core-14.04-core-amd64.tar.gz
 sudo tar xzvf ubuntu-core-14.04-core-amd64.tar.gz -C mnt
 &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

 &lt;p&gt;The next step is to copy a few required files from your working system
 onto the disk so we can chroot into the new disk. We need to copy
 &lt;code class=&quot;highlighter-rouge&quot;&gt;/etc/resolv.conf&lt;/code&gt; onto the new disk.&lt;/p&gt;

 &lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;sudo cp /etc/resolv.conf mnt/etc/
 &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

 &lt;h2 id=&quot;setting-up-gem5-specific-files&quot;&gt;Setting up gem5-specific files&lt;/h2&gt;

 &lt;h3 id=&quot;create-a-serial-terminal&quot;&gt;Create a serial terminal&lt;/h3&gt;

 &lt;p&gt;By default, gem5 uses the serial port to allow communication from the
 host system to the simulated system. To use this, we need to create a
 serial tty. Since Ubuntu uses upstart to control the init process, we
 need to add a file to /etc/init which will initialize our terminal.
 Also, in this file, we will add some code to detect if there was a
 script passed to the simulated system. If there is a script, we will
 execute the script instead of creating a terminal.&lt;/p&gt;

 &lt;p&gt;Put the following code into a file called /etc/init/tty-gem5.conf&lt;/p&gt;

 &lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# ttyS0 - getty
 #
 # This service maintains a getty on ttyS0 from the point the system is
 # started until it is shut down again, unless there is a script passed to gem5.
 # If there is a script, the script is executed then simulation is stopped.

 start on stopped rc RUNLEVEL=[12345]
 stop on runlevel [!12345]

 console owner
 respawn
 script
    # Create the serial tty if it doesn't already exist
    if [ ! -c /dev/ttyS0 ]
    then
       mknod -m 660 /dev/ttyS0 c 4 64
    fi

    # Try to read in the script from the host system
    /sbin/m5 readfile &amp;gt; /tmp/script
    chmod 755 /tmp/script
    if [ -s /tmp/script ]
    then
       # If there is a script, execute the script and then exit the simulation
       su root -c '/tmp/script' # gives script full privileges as root user in multi-user mode
       /sbin/m5 exit
    else
       # If there is no script, login the root user and drop to a console
       # Use m5term to connect to this console
       exec /sbin/getty --autologin root -8 38400 ttyS0
    fi
 end script
 &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

 &lt;h3 id=&quot;setup-localhost&quot;&gt;Setup localhost&lt;/h3&gt;

 &lt;p&gt;We also need to set up the localhost loopback device if we are going to
 use any applications that use it. To do this, we need to add the
 following to the &lt;code class=&quot;highlighter-rouge&quot;&gt;/etc/hosts&lt;/code&gt; file.&lt;/p&gt;

 &lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;127.0.0.1 localhost
 ::1 localhost ip6-localhost ip6-loopback
 fe00::0 ip6-localnet
 ff00::0 ip6-mcastprefix
 ff02::1 ip6-allnodes
 ff02::2 ip6-allrouters
 ff02::3 ip6-allhosts
 &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

 &lt;h3 id=&quot;update-fstab&quot;&gt;Update fstab&lt;/h3&gt;

 &lt;p&gt;Next, we need to create an entry in &lt;code class=&quot;highlighter-rouge&quot;&gt;/etc/fstab&lt;/code&gt; for each partition we
 want to be able to access from the simulated system. Only one partition
 is absolutely required (&lt;code class=&quot;highlighter-rouge&quot;&gt;/&lt;/code&gt;); however, you may want to add additional
 partitions, like a swap partition.&lt;/p&gt;

 &lt;p&gt;The following should appear in the file &lt;code class=&quot;highlighter-rouge&quot;&gt;/etc/fstab&lt;/code&gt;.&lt;/p&gt;

 &lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# /etc/fstab: static file system information.
 #
 # Use 'blkid' to print the universally unique identifier for a
 # device; this may be used with UUID= as a more robust way to name devices
 # that works even if disks are added and removed. See fstab(5).
 #
 # &amp;lt;file system&amp;gt;    &amp;lt;mount point&amp;gt;   &amp;lt;type&amp;gt;  &amp;lt;options&amp;gt;   &amp;lt;dump&amp;gt;  &amp;lt;pass&amp;gt;
 /dev/hda1      /       ext3        noatime     0 1
 &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

 &lt;h3 id=&quot;copy-the-m5-binary-to-the-disk&quot;&gt;Copy the &lt;code class=&quot;highlighter-rouge&quot;&gt;m5&lt;/code&gt; binary to the disk&lt;/h3&gt;

 &lt;p&gt;gem5 comes with an extra binary application that executes
 pseudo-instructions to allow the simulated system to interact with the
 host system. To build this binary, run &lt;code class=&quot;highlighter-rouge&quot;&gt;make -f Makefile.&amp;lt;isa&amp;gt;&lt;/code&gt; in the
 &lt;code class=&quot;highlighter-rouge&quot;&gt;gem5/util/m5&lt;/code&gt; directory, where &lt;code class=&quot;highlighter-rouge&quot;&gt;&amp;lt;isa&amp;gt;&lt;/code&gt; is the ISA that you are simulating
 (e.g., x86). After this, you should have an &lt;code class=&quot;highlighter-rouge&quot;&gt;m5&lt;/code&gt; binary file. Copy this
 file to /sbin on your newly created disk.&lt;/p&gt;

 &lt;p&gt;After updating the disk with all of the gem5-specific files, unless you
 are going on to add more applications or copying additional files,
 unmount the disk image.&lt;/p&gt;

 &lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&amp;gt; util/gem5img.py umount mnt
 &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

 &lt;h2 id=&quot;install-new-applications&quot;&gt;Install new applications&lt;/h2&gt;

 &lt;p&gt;The easiest way to install new applications onto your disk is to use
 &lt;code class=&quot;highlighter-rouge&quot;&gt;chroot&lt;/code&gt;. This program logically changes the root directory (“/”) to a
 different directory, mnt in this case. Before you can change the root,
 you first have to set up the special directories in your new root. To do
 this, we use &lt;code class=&quot;highlighter-rouge&quot;&gt;mount -o bind&lt;/code&gt;.&lt;/p&gt;

 &lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&amp;gt; sudo /bin/mount -o bind /sys mnt/sys
 &amp;gt; sudo /bin/mount -o bind /dev mnt/dev
 &amp;gt; sudo /bin/mount -o bind /proc mnt/proc
 &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

 &lt;p&gt;After binding those directories, you can now &lt;code class=&quot;highlighter-rouge&quot;&gt;chroot&lt;/code&gt;:&lt;/p&gt;

 &lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&amp;gt; sudo /usr/sbin/chroot mnt /bin/bash
 &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

 &lt;p&gt;At this point you will see a root prompt and you will be in the &lt;code class=&quot;highlighter-rouge&quot;&gt;/&lt;/code&gt;
 directory of your new disk.&lt;/p&gt;

 &lt;p&gt;You should update your repository information.&lt;/p&gt;

 &lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&amp;gt; apt-get update
 &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

 &lt;p&gt;You may want to add the universe repositories to your list with the
 following commands. Note: The first command is required in 14.04.&lt;/p&gt;

 &lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&amp;gt; apt-get install software-properties-common
 &amp;gt; add-apt-repository universe
 &amp;gt; apt-get update
 &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

 &lt;p&gt;Now, you are able to install any applications you could install on a
 native Ubuntu machine via &lt;code class=&quot;highlighter-rouge&quot;&gt;apt-get&lt;/code&gt;.&lt;/p&gt;

 &lt;p&gt;Remember, after you exit you need to unmount all of the directories we
 used bind on.&lt;/p&gt;

 &lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&amp;gt; sudo /bin/umount mnt/sys
 &amp;gt; sudo /bin/umount mnt/proc
 &amp;gt; sudo /bin/umount mnt/dev
 &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
 </description>
         <pubDate>Tue, 24 Nov 2015 00:00:00 -0800</pubDate>
         <link>http://localhost:4000/jekyll/update/2015/11/24/gem5-disks.html</link>
         <guid isPermaLink="true">http://localhost:4000/jekyll/update/2015/11/24/gem5-disks.html</guid>


         <category>jekyll</category>

         <category>update</category>

       </item>

       <item>
         <title>gem5 Horrors and what we can do about it</title>
         <description>&lt;p&gt;&lt;img src=&quot;/assets/img/gem5-horrors.png&quot; alt=&quot;image&quot; /&gt;&lt;/p&gt;

 &lt;p&gt;This post mostly follows the talk that I am giving at
 the &lt;a href=&quot;http://gem5.org/User_workshop_2015&quot;&gt;gem5 Users Workshop&lt;/a&gt;. This post
 contains some more details on problems that I skipped in my talk and
 some references that I was not able to include in a presentation. You
 can view my presentation on Google Drive
 &lt;a href=&quot;https://docs.google.com/presentation/d/1QGA5UVaVJkkMITF2TXCY_KlwmfWef1KBzfDP6ocbj7I/pub?start=false&amp;amp;loop=false&amp;amp;delayms=3000&quot;&gt;here&lt;/a&gt;.&lt;/p&gt;

 &lt;h2 id=&quot;i-3-gem5&quot;&gt;I &amp;lt;3 gem5&lt;/h2&gt;

 &lt;p&gt;Before I get into the negative aspects of &lt;a href=&quot;http://gem5.org&quot;&gt;gem5&lt;/a&gt;, I
 first want to point out that it is a great tool. gem5 is used by a large
 number of computer architecture researchers, both in industry and in
 academia. Here at Wisconsin, and at other universities, gem5 is used in
 the classroom to teach students about computer architecture and how to
 do computer architecture research.&lt;/p&gt;

 &lt;p&gt;gem5 is, without a doubt, the most full-featured architecture simulator.
 It leverages execute-at-execute semantics for high-fidelity
 cycle-by-cycle simulation. gem5 can boot a mostly unmodified Linux
 image. It has multiple different CPU and memory models. gem5 has a
 modular design which makes it simple to embed and extend. This has
 allowed gem5 to be used in a large number of projects (see
 &lt;a href=&quot;http://gem5.org/Projects&quot;&gt;http://gem5.org/Projects&lt;/a&gt; and &lt;a href=&quot;http://gem5.org/Publications&quot;&gt;http://gem5.org/Publications&lt;/a&gt;).&lt;/p&gt;

 &lt;p&gt;However, as great as gem5 is, its growth has not been without pain. Now
 that gem5 is nearing 15 years of development (if you include the
 original m5 and GEMS project from which gem5 was born), I believe it’s
 time to look at some of its deficiencies and talk about what we can do
 to mitigate them.&lt;/p&gt;

 &lt;h2 id=&quot;gem5-horrors&quot;&gt;gem5 horrors&lt;/h2&gt;

 &lt;p&gt;Below, I discuss a few specific pain points that I and others have
 experienced with gem5. However, before I get to that, I’d like to talk
 about what I think the root of these issues is. gem5 has two main
 problems:&lt;/p&gt;

 &lt;ol&gt;
   &lt;li&gt;There is no formal governance model.&lt;/li&gt;
   &lt;li&gt;The gem5 developers do not think of the user first.&lt;/li&gt;
 &lt;/ol&gt;

 &lt;p&gt;Later, after I give some examples of specific problems, I will discuss
 what I think can be done to fix these two issues.&lt;/p&gt;

 &lt;p&gt;Next I discuss four specific “gem5 horrors” that either I have
 personally experienced or I have talked to others who have experienced
 them. These issues are deeper than just bugs, even if sometimes they can
 be solved with simple changes. After describing each issue, I will also
 quickly discuss a possible way to mitigate the problem.&lt;/p&gt;

 &lt;h3 id=&quot;horror-1-merges&quot;&gt;Horror 1: Merges&lt;/h3&gt;

 &lt;p&gt;There are a number of projects that build on top of gem5. In fact, I
 would argue that this is the main use case for gem5. Everyone that I
 know who uses gem5 for research takes the mainline gem5 and builds
 their own changes on top of it.&lt;/p&gt;

 &lt;p&gt;The problem with this model, where people build on top of gem5, is that
 when new features are added or bugs are fixed in the mainline,
 downstream users have to consume these changes. If the downstream users
 do a good job managing their patch queues, this should be a
 straightforward thing to do. However, I have found that even when
 careful development practices are followed, merging gem5 changes is
 incredibly difficult.&lt;/p&gt;

 &lt;p&gt;Below I discuss a few specific problems that I have run into when
 merging new changes in gem5. I believe the problems can be summed up
 with two high-level issues we currently have in gem5.&lt;/p&gt;

 &lt;ol&gt;
   &lt;li&gt;There is no well-defined static API. The interface to different
     modules is constantly in a state of flux.&lt;/li&gt;
   &lt;li&gt;The regression suite we have in gem5 has poor coverage. There are
     many features that users depend on that are not covered by the
     regression tester.&lt;/li&gt;
 &lt;/ol&gt;

 &lt;h4 id=&quot;merge-headache-1-pointless-code-changes&quot;&gt;Merge headache #1: Pointless code changes&lt;/h4&gt;

 &lt;p&gt;Examples from Ruby and Slicc and packet.&lt;/p&gt;

 &lt;h4 id=&quot;merge-headache-2-features-break-between-versions&quot;&gt;Merge headache #2: Features break between versions&lt;/h4&gt;

 &lt;p&gt;Ruby backing store, checkpointing&lt;/p&gt;

 &lt;h4 id=&quot;merge-headache-3-apis-are-a-moving-target&quot;&gt;Merge headache #3: APIs are a moving target&lt;/h4&gt;

 &lt;p&gt;Example with the minimal gem5 script.&lt;/p&gt;

 &lt;h4 id=&quot;how-to-mitigate&quot;&gt;How to mitigate&lt;/h4&gt;

 &lt;p&gt;I believe that there are two things we can do as the gem5
 development community to make merging upstream changes much easier.
 First, we need a stable set of APIs. Second, we need a robust testing
 and regression structure. I discuss some specifics of these two
 characteristics below.&lt;/p&gt;

 &lt;h4 id=&quot;stable-apis&quot;&gt;Stable APIs&lt;/h4&gt;

 &lt;p&gt;Today in gem5, it is just as easy to change widely used interfaces, like
 the port interface, as it is to change the implementation of a rarely
 used function. We need to change this. I think that we need to choose a
 set of interfaces and make them stable. This is similar to how the Linux
 kernel operates.&lt;/p&gt;

 &lt;p&gt;I’m not suggesting that stable interfaces should never change once we
 have chosen them, only that it should be more onerous to change stable
 APIs than other code. This has the added benefit that
 “gem5-stable” can actually mean something. We can have a stable
 version, which has non-changing APIs, and a dev version whose APIs we
 can’t necessarily count on to stay constant.&lt;/p&gt;

 &lt;p&gt;I personally do not know what the API should be. I would like to see the
 community come together and talk about what they see as important
 interfaces. Then, once we find these interfaces, we can architect these
 interfaces and hopefully make gem5 easier to use.&lt;/p&gt;

 &lt;h4 id=&quot;testing-structure&quot;&gt;Testing structure&lt;/h4&gt;

 &lt;p&gt;I do not think that this is a very controversial issue, but gem5 needs a
 better regression structure. If all of the features that we used in
 gem5-gpu had been part of the regression suite, then we would have had
 many fewer problems.&lt;/p&gt;

 &lt;p&gt;Again, I do not know exactly how to make the regression suite better,
 but I do think a good idea would be to require new features, and bug
 fixes, to include a unit-test or something like that. We really need a
 software engineer to sit down and architect a new regression system.
 This would be a great project for someone who is new to the gem5
 codebase.&lt;/p&gt;

 &lt;h3 id=&quot;horror-2-configuration-files&quot;&gt;Horror 2: Configuration files&lt;/h3&gt;

 &lt;p&gt;gem5 has an incredibly flexible configuration system. But with
 flexibility often comes complexity. In fact, I ran SLOCCount on the
 configs directory and found there were more than 4000 lines of Python
 code. According to the SLOCCount tool, this represents 16
 person-months and a quarter of a million dollars worth of code!&lt;/p&gt;

 &lt;p&gt;All of this complexity causes a number of issues. In my talk, I touched
 on the fact that the defaults are confusing, and in some cases
 inconsistent.&lt;/p&gt;

 &lt;h4 id=&quot;how-to-mitigate-1&quot;&gt;How to mitigate&lt;/h4&gt;

 &lt;p&gt;Since the m5 and GEMS integration, I have noticed a trend that the
 number of command line parameters has continued to grow significantly.
 It seems that every time a new feature has been added, we have added
 some new command line parameters as well. I think this is the wrong way
 to do it.&lt;/p&gt;

 &lt;p&gt;There is an amazing C++-Python wrapper in gem5. We should be taking
 advantage of the scripting capabilities of Python.&lt;/p&gt;

 &lt;p&gt;I have created a simple script that is under 30 lines of Python. I think
 we need to encourage our users to script in Python instead of adding
 more and more command line parameters, which, in my experience, really
 just leads to scripting in bash instead of in Python anyway.&lt;/p&gt;

 &lt;h3 id=&quot;horror-3-unexpected-results&quot;&gt;Horror 3: Unexpected results&lt;/h3&gt;

 &lt;p&gt;This was a very surprising error that I ran into while working on
 creating a homework assignment for a graduate-level computer
 architecture course. The point of the homework was to compare the
 performance impact of instruction latency versus instruction throughput. I
 wanted the students to take a particular instruction and change the
 number of execution units, the latency, and how much the units were
 pipelined. To do this, we looked at the divide instruction, since it is
 a long latency instruction. Below is the code that we used:&lt;/p&gt;

 &lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;for (int i = 0; i &amp;lt; N; i++) {
       Y[i] = X[i] / alpha + Y[i];
 }
 &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

 &lt;p&gt;In this code, every divide is totally independent from every other.
 Therefore, we would expect that, with the out-of-order CPU, pipelining
 the divide unit will speed up the code in proportion to how deeply the
 unit is pipelined.&lt;/p&gt;
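
 &lt;p&gt;To make that expectation concrete, here is a minimal back-of-the-envelope
 sketch in Python. It is not gem5 code and it does not model the O3 CPU. It
 assumes an idealized machine with a single divide unit in which the
 independent divides are the only bottleneck. The function names and the
 choice of 1000 iterations are hypothetical, picked only for illustration.&lt;/p&gt;

 &lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# Idealized cycle count for a loop of independent divides.
# Assumption: one divide unit, and nothing else limits performance.
def divide_loop_cycles(n_divides, latency, issue_latency):
    # A new divide can start every issue_latency cycles, and the
    # last one to start still needs latency cycles to finish.
    if n_divides == 0:
        return 0
    return (n_divides - 1) * issue_latency + latency


# Ratio of unpipelined to fully pipelined cycle counts.
def ideal_pipelining_speedup(n_divides, latency):
    # Unpipelined: the issue latency equals the execution latency.
    unpipelined = divide_loop_cycles(n_divides, latency, latency)
    # Fully pipelined: a new divide can issue every cycle.
    pipelined = divide_loop_cycles(n_divides, latency, 1)
    return unpipelined / pipelined


# Hypothetical example: 1000 divides with a 10 cycle latency.
no_pipeline_cycles = divide_loop_cycles(1000, 10, 10)
full_pipeline_cycles = divide_loop_cycles(1000, 10, 1)
ideal_speedup = ideal_pipelining_speedup(1000, 10)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

 &lt;p&gt;Under these assumptions, 1000 independent divides take 10000 cycles
 without pipelining and 1009 cycles when fully pipelined, so the ideal
 speedup is a little under 10x.&lt;/p&gt;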

 &lt;p&gt;To test this, I looked at two different configurations, a 10 cycle
 latency divide with &lt;em&gt;no&lt;/em&gt; pipelining, and a 10 cycle latency divide that
 is fully pipelined. Below is the data I found for ARM and x86. I only
 changed the “obvious” options. Each functional unit has an option for
 the execution latency and issue latency. If the issue latency is 1, then
 the functional unit is fully pipelined. (Now this is a boolean flag.)
 All of the data is relative to x86 with no pipelining.&lt;/p&gt;

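 &lt;table&gt;
   &lt;thead&gt;
     &lt;tr&gt;&lt;th&gt;Configuration&lt;/th&gt;&lt;th&gt;Latency&lt;/th&gt;&lt;th&gt;Issue latency&lt;/th&gt;&lt;th&gt;x86 Perf&lt;/th&gt;&lt;th&gt;ARM Perf&lt;/th&gt;&lt;/tr&gt;
   &lt;/thead&gt;
   &lt;tbody&gt;
     &lt;tr&gt;&lt;td&gt;No Pipeline&lt;/td&gt;&lt;td&gt;10 cycles&lt;/td&gt;&lt;td&gt;10 cycles&lt;/td&gt;&lt;td&gt;1.0x&lt;/td&gt;&lt;td&gt;8.0x&lt;/td&gt;&lt;/tr&gt;
     &lt;tr&gt;&lt;td&gt;Full Pipeline&lt;/td&gt;&lt;td&gt;10 cycles&lt;/td&gt;&lt;td&gt;1 cycle&lt;/td&gt;&lt;td&gt;1.0x&lt;/td&gt;&lt;td&gt;9.6x (1.2x)&lt;/td&gt;&lt;/tr&gt;
   &lt;/tbody&gt;
 &lt;/table&gt;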

 &lt;p&gt;There are two very weird results in this data. First, when we fully
 pipelined the divide unit, there was no performance change (at all!!) in
 x86. Second, when running the exact same code with ARM, there was an 8x
 speedup compared to x86! I find it very hard to believe that the ARM ISA
 is inherently better at divide than x86.&lt;/p&gt;

 &lt;h4 id=&quot;how-to-mitigate-2&quot;&gt;How to mitigate&lt;/h4&gt;

 &lt;p&gt;This is a much harder problem to mitigate than the others on this list.
 Nilay Vaish has taken a step in the right direction with these two
 patches on reviewboard &lt;a href=&quot;http://reviews.gem5.org/r/2744/&quot;&gt;http://reviews.gem5.org/r/2744/&lt;/a&gt; and
 &lt;a href=&quot;http://reviews.gem5.org/r/2744/&quot;&gt;http://reviews.gem5.org/r/2744/&lt;/a&gt;, which have been incorporated in gem5.&lt;/p&gt;

 &lt;p&gt;The underlying problem is that the implementations for ARM and x86 are
 totally distinct. It is not clear to me what the right way to unify the
 ISA implementations is. As a stop-gap, developers who are working on
 implementing x86 features need to make sure that they perform similarly
 to ARM features. Maybe a solution is to have a single set of C programs
 which exercise all ISAs and compare the performance across ISAs. There
 should be some performance differences, but not an order of magnitude.&lt;/p&gt;

 &lt;h3 id=&quot;horror-4-lack-of-new-user-support&quot;&gt;Horror 4: Lack of new-user support&lt;/h3&gt;

 &lt;h4 id=&quot;how-to-mitigate-3&quot;&gt;How to mitigate&lt;/h4&gt;

 &lt;p&gt;What I think we need to do is to create a “gem5 for Dummies” book or a
 “Learning gem5” book. This book would be similar to Learning Python or
 Learning Mercurial. The book would be open source for anyone to
 contribute to. In fact, it should be required to update the book if a
 developer makes an API-breaking change.&lt;/p&gt;

 &lt;p&gt;An initial implementation of this book, which currently only includes
 about a chapter of “getting started” and is in fact already out of date,
 can be found here:
 &lt;a href=&quot;http://pages.cs.wisc.edu/~david/courses/cs752/Spring2015/gem5-tutorial/index.html&quot;&gt;gem5-tutorial&lt;/a&gt;.
 I began working on this in conjunction with the graduate computer
 architecture class at Wisconsin, so it may currently have some
 Wisconsin-specific text. I hope to continue working on this in my
 &lt;em&gt;copious free time&lt;/em&gt;.&lt;/p&gt;

 &lt;p&gt;There are many other horrors that other people experience as well. Here
 I only discussed some of the horrors that I have heard people
 discussing. The purpose of presenting these horrors is not to say that
 gem5 is a bad simulator! The purpose is to highlight issues that
 currently need to be addressed by the gem5 development
 community.&lt;/p&gt;

 &lt;h2 id=&quot;what-can-we-do-about-it&quot;&gt;What can we do about it?&lt;/h2&gt;

 &lt;p&gt;A lot of the problems that I have discussed above come down to poor
 software engineering. And yes, we are architects, not software
 engineers, and there are a lot of things we could do better if we just
 focused on software engineering. However, I do not think that this is
 the underlying issue.&lt;/p&gt;

 &lt;p&gt;I believe these four horrors stem from two systemic problems in the gem5
 development community.&lt;/p&gt;

 &lt;ol&gt;
   &lt;li&gt;There is no formal governance model.&lt;/li&gt;
   &lt;li&gt;The gem5 developers do not think of the user first.&lt;/li&gt;
 &lt;/ol&gt;

 &lt;p&gt;I believe that if we start to solve these high-level issues, gem5 will
 be a much better tool for everyone. Next, I discuss one possible way to
 address these two points.&lt;/p&gt;

 &lt;h2 id=&quot;gem5-foundation&quot;&gt;gem5 Foundation&lt;/h2&gt;

 &lt;p&gt;First, I want to say that I do not believe this is the only way, or the
 right way, to move gem5 forward. This is one possibility that I believe
 will make gem5 a better tool. I hope that this is a place to begin the
 discussion and I am sure that others in our community can come up with
 even better suggestions than this!&lt;/p&gt;

 &lt;p&gt;&lt;em&gt;I think we should create a gem5 Foundation.&lt;/em&gt; The gem5 Foundation will
 be the center for the gem5 community. It will be a formal way for the
 community to set goals and push gem5 forward.&lt;/p&gt;

 &lt;p&gt;There are two main things I think the gem5 Foundation can help us with.
 It can set up a formal governance structure and be a place for outside
 interests to contribute money towards making gem5 better for everyone.&lt;/p&gt;

 &lt;h3 id=&quot;formalizing-a-governance-structure&quot;&gt;Formalizing a governance structure&lt;/h3&gt;

 &lt;p&gt;First, we need a governance structure. This is a document which defines
 how decisions are made in the community, what matters to the community,
 how to contribute to the community, etc.&lt;/p&gt;

 &lt;p&gt;There is a lot of documentation on how to write governance models and
 what they are. &lt;a href=&quot;http://oss-watch.ac.uk/&quot;&gt;OSS-Watch&lt;/a&gt; is a great source
 for this. Here is a link to a definition of a governance model, which
 does a much better job than I can of explaining it:
 &lt;a href=&quot;http://oss-watch.ac.uk/resources/governancemodels&quot;&gt;http://oss-watch.ac.uk/resources/governancemodels&lt;/a&gt;. Additionally, here
 is a link to an example governance model from an academic open-source
 project:
 &lt;a href=&quot;http://www.taverna.org.uk/about/legal-stuff/taverna-governance-model/&quot;&gt;http://www.taverna.org.uk/about/legal-stuff/taverna-governance-model/&lt;/a&gt;&lt;/p&gt;

 &lt;h3 id=&quot;money-money-money&quot;&gt;Money, Money, Money&lt;/h3&gt;

 &lt;p&gt;I think the main solution to all of these problems is to pay
 software developers (&lt;em&gt;not computer architects!&lt;/em&gt;) to solve some of these
 problems. Already, within ARM and AMD there are a number of people who
 get paid to work on gem5. However, these companies do not have gem5’s
 best interests as their key focus. Their focus is what ARM and AMD find
 interesting.&lt;/p&gt;

 &lt;p&gt;So, I think that if we have something like the gem5 Foundation, these
 companies, and academia, can donate money towards coding things that are
 good for the community as a whole. The gem5 Foundation can hire software
 engineers to work on the parts of gem5 that grad students and
 researchers do not want to do. If you look at other academic
 communities, they often hire non-researchers to do the “grunt work”.
 Overall, I think this is a good idea for computer architects too, and
 specifically for gem5.&lt;/p&gt;

 &lt;p&gt;I recognize that this may be a crazy idea. I would love to hear what
 others think. I am sure we will have some interesting discussion at the
 gem5 workshop, and hopefully I will write another post with what other
 people thought! Feel free to leave comments below.&lt;/p&gt;
 </description>
         <pubDate>Tue, 09 Jun 2015 00:00:00 -0700</pubDate>
         <link>http://localhost:4000/2015/06/09/gem5-horrors.html</link>
         <guid isPermaLink="true">http://localhost:4000/2015/06/09/gem5-horrors.html</guid>


       </item>

   </channel>
 </rss>