Student: Thomas E. Hansen (teh6, 150015673)
Supervisor: Dr. John Thomson
I personally found it helps to have a Python 2 virtual environment running when
setting up gem5. The requirements for it can be installed from the
venv-reqs-gem5.txt file.
Clone and build the gem5 simulator (the commands below assume the ARM build,
build/ARM/gem5.opt, built with scons). Then, copy the files from the
gem5-custom-config-scripts directory to the gem5/configs/example/arm/ directory.
$ cd gem5
$ export M5_PATH=path/to/linux/files

Both of the commands below can be further customised with the following flags:

--big-cpus N
--little-cpus N
--cpu-type=<cpu-type>
Full system simulation without power:
$ ./build/ARM/gem5.opt configs/example/arm/fs_bL_extended.py \
--caches \
--kernel=$M5_PATH/binaries/<kernel-name> \
--disk=$M5_PATH/disks/<disk-image-name>.img \
--bootloader=$M5_PATH/binaries/<bootloader> \
--bootscript=path/to/bootscript.rcS

Full system simulation with power:
$ ./build/ARM/gem5.opt configs/example/arm/fs_bL_extended.py \
--caches \
--kernel=$M5_PATH/binaries/<kernel-name> \
--disk=$M5_PATH/disks/<disk-image-name>.img \
--bootloader=$M5_PATH/binaries/<bootloader> \
--bootscript=path/to/bootscript.rcS \
--example-power

Since the complete data for this project totalled 120GB in size, it is not
included here. However, in the extracted-data directory, there are two files:
roi-out.csv and roi-out_cfg-totpow.csv. These files contain the data
matching several PMU events and were constructed using the data-aggregate.py
script. Both files should theoretically work as inputs to the scripts, but the
roi-out_cfg-totpow.csv file (which contains configs and the total power, in
addition to the stats found in roi-out.csv) is probably safer to use with most
of the scripts.
Optionally, create a Python 3 virtualenv and activate it.
Install the requirements found in venv-reqs-dataproc.txt.
Each of the scripts uses argparse and so should provide a usage message. Please
refer to this for detailed usage instructions.
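If you just want a quick look at the extracted data itself, outside the
provided scripts, something like the following works (a minimal sketch; it
assumes only that the files are ordinary CSVs with a header row, and that
pandas is available in your environment, not any particular column layout):

import pandas as pd

# Load the aggregated per-PMU-event data (path relative to the repo root).
df = pd.read_csv("extracted-data/roi-out_cfg-totpow.csv")

# See which stats, config, and power columns are actually present
# before passing the file to any of the processing scripts.
print(df.columns.tolist())
print(df.head())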
Create a Python 2 virtualenv and install the requirements found in
venv-reqs-gemstone-applypower.txt
cd into gemstone-applypower and activate the venv
For simulating a Cortex-A15:
$ ./gemstone_create_equation.py -p models/gs-A15.params -m maps/gem5-A15.map -o gem5-A15
For simulating a Cortex-A7:
$ ./gemstone_create_equation.py -p models/gs-A7.params -m maps/gem5-A7.map -o gem5-A7
The following patches were applied using git cherry-pick:

- power: Fix regStats for PowerModel and PowerModelState
- sim-power: Fix power model to work with stat groups
A working index of the required files can be found on the old m5sim page; the
files themselves should then be retrieved from dist.gem5.org/dist/current/arm/
- Create a new file of zeros (in this case, 1024 blocks of 1024B each, i.e.
  1MiB) using dd (you may need to be root or use sudo for the next couple of
  steps)
$ dd if=/dev/zero of=path/to/file.img bs=1024 count=1024
- Find the next available loopback device
$ losetup -f
- Set up the device returned (e.g. /dev/loop0) with the image file at offset
  32256 (63 sectors * 512 bytes; the traditional start of the first partition
  in old track-based disk layouts)
$ losetup -o 32256 /dev/loop0 path/to/file.img
- Format the device
$ mke2fs /dev/loop0
- Detach the loopback device
$ losetup -d /dev/loop0
Done. The image can now be mounted and manipulated using

$ mount -o loop,offset=32256 path/to/file.img path/to/mountpoint

IMPORTANT: remember to copy the GNU/NIX binaries necessary for the system
you'll be emulating to their appropriate locations on the new disk.
Some details about what to do next can be found in the gem5 documentation.
The gem5 developers openly admit on the website that the DVFS documentation is
outdated, leaving the user to read through source code and example config
scripts to figure out how to construct the relevant components. This is my
attempt at documenting and understanding how it works.
Voltage Domains dictate the voltage values the system can use. It seems gem5
always simulates voltage in FS mode, but simply sets it to 1.0V if the user does
not care about voltage simulation (see src/sim/VoltageDomain.py).
To create a voltage domain, either a single voltage value or a list of voltage
values must be given. But it cannot simply be passed positionally to the
VoltageDomain constructor (no, that would be too simple); it must be passed as
a keyword argument (kwarg), i.e. voltage=. To my knowledge, this is not
documented anywhere, nor is it easily discoverable from the
src/sim/{VoltageDomain.py, voltage_domain.hh, voltage_domain.cc} files.
The example voltage domains I've used are (note that the values have to be specified in descending order):
For the big cluster:
odroid_n2_voltages = [ '0.981000V'
, '0.891000V'
, '0.861000V'
, '0.821000V'
, '0.791000V'
, '0.771000V'
, '0.771000V'
, '0.751000V'
]
odroid_n2_voltage_domain = VoltageDomain(voltage=odroid_n2_voltages)

For the LITTLE cluster:
odroid_n2_voltages = [ '0.981000V'
, '0.861000V'
, '0.831000V'
, '0.791000V'
, '0.761000V'
, '0.731000V'
, '0.731000V'
, '0.731000V'
]
odroid_n2_voltage_domain = VoltageDomain(voltage=odroid_n2_voltages)

These numbers were obtained by examining the changes in the sysfs files
/sys/class/regulator/regulator.{1,2}/microvolts when using the userspace
frequency governor and varying the frequency of the big and LITTLE clusters
(respectively) using the cpupower command-line tool.
NOTE: In gem5 (and, as far as I know, on real hardware) voltage domains
apply to CPU sockets. So make sure that the big and LITTLE clusters in the
simulator are on different sockets if they need to have different voltage
domains (you can inspect the socket through the socket_id value associated
with the clusters).
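As a minimal sketch (assuming the clusters keep their CPUs in a cpus list, as
the CpuCluster class in configs/example/arm/devices.py does, and that the
system holds the clusters as bigCluster and littleCluster), putting the
clusters on separate sockets might look like this:

# Put the big and LITTLE clusters on separate sockets so each can have
# its own voltage domain. The attribute names are assumptions based on
# configs/example/arm/devices.py; adjust them to match your own script.
for cpu in system.bigCluster.cpus:
    cpu.socket_id = 0   # big cluster: socket 0
for cpu in system.littleCluster.cpus:
    cpu.socket_id = 1   # LITTLE cluster: socket 1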
Clock domains dictate what frequencies the CPU(s) can be clocked at, i.e. what
steps are available to the DVFS handler, and each clock domain is associated
with a voltage domain. I am uncertain exactly what the requirements are for the
relationship between the two, especially as the constructor does not seem to
complain if the lists of available clocks and voltages contain different
numbers of values.
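For reference, once the two clock domains defined below exist, they are made
visible to the guest by registering them with the system's DVFS handler. This
is a hedged sketch based on gem5's example configs; the exact attribute names
on your System object and clusters may differ:

# Register the per-cluster clock domains with the DVFS handler so the
# guest OS can request frequency changes at runtime. `system` and the
# two clk_domain attributes are assumed to match your config script.
system.dvfs_handler.domains = [ system.bigCluster.clk_domain,
                                system.littleCluster.clk_domain
                              ]
system.dvfs_handler.enable = True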
I obtained the following clock values from the Odroid N2 board using the
cpupower command-line tool:
For the big cluster:
odroid_n2_clocks = [ '1800MHz'
, '1700MHz'
, '1610MHz'
, '1510MHz'
, '1400MHz'
, '1200MHz'
, '1000MHz'
, '667MHz'
]
odroid_n2_clk_domain = SrcClockDomain(clock=odroid_n2_clocks,
                                      voltage_domain=odroid_n2_voltage_domain
                                     )

For the LITTLE cluster:
odroid_n2_clocks = [ '1900MHz'
, '1700MHz'
, '1610MHz'
, '1510MHz'
, '1400MHz'
, '1200MHz'
, '1000MHz'
, '667MHz'
]
odroid_n2_clk_domain = SrcClockDomain(clock=odroid_n2_clocks,
                                      voltage_domain=odroid_n2_voltage_domain
                                     )

NOTE: The statements below, whilst possibly correct, seem to go against the
way things are done in the example scripts. As such, here is a "better" way of
doing things. It turns out that the --big-cpu-clock value(s), when passed on
to a CpuCluster sub-class, are used to create a new SrcClockDomain. Therefore,
there are 2 solutions (of which I have only tested the first):
- Create sub-classes of CpuCluster. Similar to the existing BigCluster and
  LittleCluster sub-classes, these will extend CpuCluster. However, in addition
  to the config that these classes specify in their body, also define the two
  lists of values for the voltage and clock domains respectively. Then, simply
  pass these lists as the appropriate arguments to the super call at the end of
  the sub-class's __init__ declaration (3rd and 4th argument at the time of
  writing, but double-check with your
  <gem5-root>/configs/example/arm/devices.py file). If you want to add DVFS to
  the AtomicCluster as well, simply extend this class in a similar manner.
  FINALLY, make sure to add an entry to the cpu_types dictionary near the end
  of the file. The entry should have a name for the --cpu-type flag to refer
  to your classes by, and a 2-tuple (a pair) of clusters for it to instantiate
  (i.e. put your new DVFS-capable classes here). Your specified DVFS values
  will now be used when running with those clusters. (A sketch of this
  approach follows after this list.)

- As mentioned previously, the value(s) passed to the --big-cpu-clock flag are
  used to create a new SrcClockDomain internally. Hence, another (possibly
  more flexible) solution is to add a --big-cpu-voltage flag, wire up its
  values in the configuration script (e.g.
  <gem5-root>/configs/example/arm/fs_bigLITTLE.py), and pass a list of values
  for each of the four flags (both voltage and clock for both big and LITTLE
  CPUs).
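To make solution 1 concrete, here is a minimal sketch of what such a sub-class
might look like. The class name OdroidN2BigCluster is hypothetical, and for
brevity it extends BigCluster (reusing its CPU configuration) rather than
CpuCluster directly; the argument order of the super call follows
configs/example/arm/devices.py at the time of writing. Double-check both
against your gem5 checkout:

# In <gem5-root>/configs/example/arm/devices.py (sketch, not verified):

class OdroidN2BigCluster(BigCluster):          # hypothetical name
    # DVFS steps measured on the Odroid N2 (see the lists above).
    clocks   = [ '1800MHz', '1700MHz', '1610MHz', '1510MHz',
                 '1400MHz', '1200MHz', '1000MHz', '667MHz' ]
    voltages = [ '0.981000V', '0.891000V', '0.861000V', '0.821000V',
                 '0.791000V', '0.771000V', '0.771000V', '0.751000V' ]

    def __init__(self, system, num_cpus, cpu_clock=None, cpu_voltage=None):
        # Ignore any clock/voltage passed in and hand our DVFS lists to
        # the parent as the 3rd and 4th arguments (clock, then voltage).
        super(OdroidN2BigCluster, self).__init__(
            system, num_cpus, self.clocks, self.voltages)

# Define an analogous OdroidN2LittleCluster, then register the pair in
# the cpu_types dictionary so --cpu-type can refer to it:
#
#   cpu_types["odroid-n2"] = (OdroidN2BigCluster, OdroidN2LittleCluster)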