cpu: Move fetch stats from simple and minor to base

This summarizes a series of changes to move general Simple, Minor,
O3 CPU stats to BaseCPU. This commit focuses on moving numBranches
from SimpleCPU to the FetchCPUStats in the BaseCPU, and
numFetchSuspends from MinorCPU into FetchCPUStats.  More general
information about this relation chain is below

1. Summary:
Moved general CPU stats found across Simple, Minor, and O3 CPU models
into BaseCPU through new stat groups. The stat groups are
FetchCPUStats, ExecuteCPUStats, and CommitCPUStats. Implemented the
committedControl stat vector found in MinorCPU for Simple and O3 CPU.
Implemented the numStoreInsts stat found in SimpleCPU for O3CPU. IPC
and CPI stats are now tracked at the core and thread level in BaseCPU
and are made universal for simple, minor, o3, and kvm CPUs. Duplicate
stats across the models are merged into a single stat in BaseCPU under
the same stat name. This change does not implement every general level
stat moved to BaseCPU for every model.

2. Stat API Changes
a. SimpleCPU:
statExecutedInstType vector unified into committedInstType
numCondCtrlInsts unified into committedControl::isControl

b. O3CPU:
i. Fetch Stage
branches in fetch unified into with numBranches
rate renamed to fetchRate
insts unified into with numInsts

ii. Execute Stage
Regfile stats unified into base with use of Simple's stat naming
numRefs in IEW unified into numMemRefs
numRate from IEW renamed to instRate

iii. Commit Stage
committedInsts is renamed to numInstsNotNOP
committedOps is renamed to numOpsNotNOP
instsCommitted is unified into numInsts
opsCommitted is unified into numOps
branches is unified into committedControl::isControl
floating is unified into numFpInsts
integer is unified into numIntInsts
loads is unified into numLoadInsts
memRefs is renamed to numMemRefs
vectorInstructions is unified into numVecInsts

3. Details:
Created three stat groups in BaseCPU. FetchCPUStats track statistics
related to the fetch stage. ExecuteCPUStats track statistics related
to the execute stage. CommitCPUStats track statistics related to the
commit stage.

There are three vectors in Base that store unique pointers to per
thread instances of these stat groups. The stat group pointer for
thread i is accessible at index i of one of these vectors. For example,
stat numCCRegReads of the execute stage for thread 0 can be accessed
with executeStats[0]->numCCRegReads. The stats.txt output will print the
thread ID of the stat group. For example, numVecRegReads on thread 0
of a single core prints as
"board.processor.cores.core.executeStats0.numVecRegReads".
NOTE: Multithreading in gem5 is untested. Therefore per thread stats
output in stats.txt is not currently guaranteed to be correctly
formatted.

For FetchCPUStats, the stats moved from  SimpleCPU are numBranches
and numInsts. From MinorCPU, the stat moved is numFetchSuspends. From
O3CPU, the stats moved are from the O3 fetch stage: Stat branches is
unified into numBranches, stat rate is renamed to fetchRate in Base,
stat insts is unified into numInsts, stat icacheStallCycles keeps the
same name in Base.

For ExecuteCPUStats, the stats moved from SimpleCPU are
dcacheStallCycles, numCCRegReads, numCCRegWrites,
numFpAluAccesses, numFpRegReads, numFpRegWrites, numIntAluAccesses,
numIntRegReads, numIntRegWrites, numMemRefs, numMiscRegReads,
numMiscRegWrites, numVecAluAccesses, numVecPredRegReads,
numVecPredRegWrites, numVecRegReads, numVecRegWrites. The stat moved
from MinorCPU is numDiscardedOps. From O3, the Regfile stats in CPU are
unified into the reg stats in Base and use the names found originally
in SimpleCPU. From O3 IEW stage, numInsts keeps the same name in
Base, numBranches is unified into numBranches in base, numNop keeps
the same name in Base, numRefs is unified into numMemRefs in Base,
numLoadInsts and numStoreInsts are moved into Base, numRate is renamed
to instRate in base.

For CommitCPUStats, the stats moved from SimpleCPU are
numCondCtrlInsts, numFpInsts, numIntInsts, numLoadInsts, numStoreInsts,
numVecInsts. The stats moved from MinorCPU are numInsts,
committedInstType, and committedControl. statExecutedInstType of
SimpleCPU is unified with committedInstType of MinorCPU. Implemented
committedControl stats from MinorCPU in Simple and O3 CPU. In MinorCPU,
this stat was a 2D vector, where the first dimension is the thread ID.
In base it is now a 1D vector that is tied to a thread ID via the
commitStats vector that the object is accessible through. From the O3
commit stage, committedInsts is renamed to numInstsNotNOP, committedOps
is renamed to numOpsNotNOP, instsCommitted is unified into numInsts,
opsCommitted is renamed to numOps, committedInstType is unified into
committedInstType from Minor, branches is removed because it duplicates
committedControl::IsControl, floating is unified into numFpInsts,
interger is unified into numIntInsts, loads is unified into
numLoadInsts, numStoreInsts is implemented for tracking in O3, memRefs
is renamed to numMemRefs, vectorInstructions is unified into
numVecInsts. Note that numCondCtrlInsts of Simple is unified into
committedControl::IsCondCtrl.

Implemented IPC and CPI tracking inside BaseCPU.
In BaseCPU::BaseCPUStats, numInsts and numOps track per CPU core
committed instructions and operations.
In BaseCPU::FetchCPUStats, numInsts and numOps track per thread
fetched instructions and operations.
In BaseCPU::CommitCPUStats, numInsts tracks per thread executed
instructions.
In BaseCPU::CommitCPUStats, numInsts and numOps track per thread
committed instructions and operations.
In BaseSimpleCPU, the countInst() function has been split into
countInst(), countFetchInst(), and countCommitInst(). The stat count
incrementation step of countInst() has been removed and delegated to the
other two functions. countFetchInst() increments numInsts and numOps
of the FetchCPUStats group for a thread. countCommitInst() increments
the numInsts and numOps of the CommitCPUStats group for a thread and
of the BaseCPUStats group for a CPU core. These functions are called
in the appropriate stage within timing.cc and atomic.cc. The call to
countInst() is left unchanged. countFetchInst() is called in
preExecute(). countCommitInst() is called in postExecute().
For MinorCPU, only the commit level numInsts and numOps stats have been
implemented.
IPC and CPI stats have been added to BaseCPUStats (core level) and
CommitCPUStats (thread level). The formulas for the IPC and CPI stats
in CommitCPUStats are set in the BaseCPU constructor, after the
CommitCPUStats stat group object has been created. These replace IPC,
CPI, totalIpc, and totalCpi stats in O3.

Replaced committedInsts stats of KVM CPU with commitStats.numInsts
of BaseCPU. This results in IPC and CPI printing in stats.txt for
KVM simulations.

This change does not implement most general stats found in one or two
model for all others.

Jira Ticket: https://gem5.atlassian.net/browse/GEM5-1304

Change-Id: I3c852f8dba3268c71b7a3415480fb63d8dc30cb7
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/66031
Maintainer: Bobby Bruce <bbruce@ucdavis.edu>
Reviewed-by: Bobby Bruce <bbruce@ucdavis.edu>
Tested-by: kokoro <noreply+kokoro@google.com>
7 files changed