Diff - f8578e4b050198303d67f3cda6fe3d9cf5093b2e^! - public/gem5

commit	f8578e4b050198303d67f3cda6fe3d9cf5093b2e
author	Kyle Roarty <kyleroarty1716@gmail.com>	Wed Jul 14 15:50:48 2021 -0500
committer	Matt Sinclair <mattdsinclair@gmail.com>	Sat Jul 24 17:27:02 2021 +0000
tree	5d33839c4b59a855754a5b9a3a1604dd92f9c866
parent	a10106e94a0b895a25df56f0e3bf95f072c3f91b

gpu-compute: Fix TLB coalescer starvation

Currently, we are storing coalesced accesses in
an std::unordered_map indexed by a tick index, i.e.
issue tick / coalescing window. If there are
multiple coalesced requests, at different tick
indexes, to the same virtual address, then the
TLB coalescer will issue just the first one.

However, std::unordered_map is not a sorted
container, and we issue coalesced requests by iterating
through that container. This means that the coalesced
request sent in TLBCoalescer::processProbeTLBEvent is
not necessarily the oldest one. Because of this, in
cases of high contention the oldest coalesced request
can be starved and see a very high TLB access latency.

To fix this issue, we use an std::map, which is
a sorted container: iteration visits keys in ascending
order, so the coalesced request with the lowest tick
index (the oldest) is sent first.
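
As a rough illustration of the ordering argument above, the sketch below
buckets requests by tick index in an std::map and reads them back in
iteration order. It is not the gem5 code: CoalescedReq is a hypothetical
stand-in for coalescedReq, and tickIndex, oldestTickIndex and issueOrder
are helper names invented for this example.

```cpp
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical stand-in for gem5's coalescedReq (assumption, not the real type).
struct CoalescedReq
{
    uint64_t vaddr;
};

// Same shape as the typedef in the diff: tick index -> coalesced requests.
using CoalescingFIFO = std::map<int64_t, std::vector<CoalescedReq>>;

// Hypothetical helper: tick index = issue tick / coalescing window.
int64_t tickIndex(int64_t issueTick, int64_t coalescingWindow)
{
    return issueTick / coalescingWindow;
}

// Tick index of the oldest pending bucket, or -1 if nothing is pending.
// std::map keeps keys sorted, so begin() always holds the smallest key.
int64_t oldestTickIndex(const CoalescingFIFO &fifo)
{
    return fifo.empty() ? -1 : fifo.begin()->first;
}

// Tick indexes in the order a range-for loop would visit them.
std::vector<int64_t> issueOrder(const CoalescingFIFO &fifo)
{
    std::vector<int64_t> order;
    for (const auto &entry : fifo)
        order.push_back(entry.first);
    return order;
}
```

With std::map the visiting order is ascending by tick index regardless of
insertion order. With std::unordered_map the same loop compiles, but the
order is unspecified, which is the source of the starvation described above.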

Change-Id: I9c7ab32c038d5e60f6b55236266a27b0cae8bfb0
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/48340
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Reviewed-by: Matthew Poremba <matthew.poremba@amd.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>

diff --git a/src/gpu-compute/tlb_coalescer.hh b/src/gpu-compute/tlb_coalescer.hh
index b97801b..fce8740 100644
--- a/src/gpu-compute/tlb_coalescer.hh
+++ b/src/gpu-compute/tlb_coalescer.hh

@@ -100,7 +100,7 @@
      * option is to change it to curTick(), so we coalesce based
      * on the receive time.
      */
-    typedef std::unordered_map<int64_t, std::vector<coalescedReq>>
+    typedef std::map<int64_t, std::vector<coalescedReq>>
         CoalescingFIFO;
 
     CoalescingFIFO coalescerFIFO;