Test Environment¶
Physical Testbeds¶
FD.io CSIT performance tests are executed in physical testbeds hosted by LF for the FD.io project. Two physical testbed topology types are used:
- 3-Node Topology: Two servers acting as SUTs (Systems Under Test) and one server acting as TG (Traffic Generator), all connected in a ring topology.
- 2-Node Topology: One server acting as SUT and one server acting as TG, both connected in a ring topology.
Tested SUT servers are based on a range of processors, including Intel Xeon Haswell-SP, Intel Xeon Skylake-SP, Arm, and Intel Atom. A more detailed description is provided in Physical Testbeds. Tested logical topologies are described in Logical Topologies.
Server Specifications¶
Complete technical specifications of the compute servers used in CSIT physical testbeds are maintained in the FD.io CSIT repository: FD.io CSIT testbeds - Xeon Skylake, Arm, Atom and FD.io CSIT Testbeds - Xeon Haswell.
Pre-Test Server Calibration¶
A number of SUT server sub-system runtime parameters have been identified as impacting data plane performance tests. Calibrating those parameters is part of FD.io CSIT pre-test activities, and includes measuring and reporting the following:
- System-level core jitter – measure how long Linux interrupts a core, in clock cycles, and how often such interrupts happen, using the CPU core jitter tool.
- Memory bandwidth – measure bandwidth with Intel MLC tool.
- Memory latency – measure memory latency with Intel MLC tool.
- Cache latency at all levels (L1, L2, and Last Level Cache) – measure cache latency with Intel MLC tool.
Measured values of the listed parameters are especially important for repeatable zero packet loss throughput measurements across multiple system instances. More generally, they serve as useful background data for comparing data plane performance results across disparate servers.
The following sections include measured calibration data for Intel Xeon Haswell and Intel Xeon Skylake testbeds.
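The measurements listed above lend themselves to scripting. The following is a minimal sketch, not CSIT's actual tooling: it only assembles the command lines that appear in the sample sessions in this document. The tool paths, the pinned core, and the iteration count are copied from those sessions and are assumptions for any other host, and the helper name `calibration_commands` is hypothetical.

```python
import shlex

# Paths and flags copied from the sample sessions in this document;
# adjust them for the host being calibrated (assumption).
JITTER = "/home/testuser/pma_tools/jitter/jitter"
MLC = "/home/testuser/mlc"

MLC_FLAGS = (
    "--bandwidth_matrix",
    "--peak_injection_bandwidth",
    "--max_bandwidth",
    "--latency_matrix",
    "--idle_latency",
    "--loaded_latency",
    "--c2c_latency",
)


def calibration_commands(core=3, iterations=30):
    """Return the calibration command lines as argument lists."""
    cmds = [["sudo", "taskset", "-c", str(core), JITTER, "-i", str(iterations)]]
    cmds.extend(["sudo", MLC, flag] for flag in MLC_FLAGS)
    return cmds


# Dry run: print the commands instead of executing them.
for cmd in calibration_commands():
    print(shlex.join(cmd))
```

Printing rather than executing keeps the sketch safe to run anywhere; the lists could be passed to `subprocess.run` on a host where the tools are installed.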
Calibration Data - Haswell¶
The following sections include sample calibration data measured on the t1-sut1 server running in one of the Intel Xeon Haswell testbeds, as specified in FD.io CSIT Testbeds - Xeon Haswell.
Calibration data obtained from all other servers in Haswell testbeds shows the same or similar values.
Linux cmdline¶
$ cat /proc/cmdline
BOOT_IMAGE=/vmlinuz-4.15.0-36-generic root=UUID=5d2ecc97-245b-4e94-b0ae-c3548567de19 ro isolcpus=1-17,19-35 nohz_full=1-17,19-35 rcu_nocbs=1-17,19-35 numa_balancing=disable intel_pstate=disable intel_iommu=on iommu=pt nmi_watchdog=0 audit=0 nosoftlockup processor.max_cstate=1 intel_idle.max_cstate=1 hpet=disable tsc=reliable mce=off console=tty0 console=ttyS0,115200n8
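To see which cores the command line above isolates, the CPU lists can be expanded programmatically. This is a small sketch under stated assumptions: it handles only the comma and range forms used here (not the kernel's stride syntax), it parses an excerpt of the line above, and the total of 36 logical CPUs is inferred from the highest index in the list rather than stated in the source. The function names are hypothetical.

```python
def parse_cpu_list(spec):
    """Expand a CPU list such as '1-17,19-35' into a sorted list of ints."""
    cpus = set()
    for part in spec.split(","):
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def cmdline_params(cmdline):
    """Split a kernel command line into a dict; bare flags map to None."""
    params = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        params[key] = value if sep else None
    return params


# Excerpt of the Haswell command line shown above.
cmdline = "isolcpus=1-17,19-35 nohz_full=1-17,19-35 rcu_nocbs=1-17,19-35 nosoftlockup"
params = cmdline_params(cmdline)
isolated = parse_cpu_list(params["isolcpus"])

TOTAL_CPUS = 36  # assumption: logical CPUs 0-35
housekeeping = [cpu for cpu in range(TOTAL_CPUS) if cpu not in isolated]
print(len(isolated), "isolated CPUs; remaining CPUs:", housekeeping)
```

Under that assumption, CPUs 0 and 18 are the only ones left for general Linux scheduling.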
Linux uname¶
$ uname -a
Linux t1-tg1 4.15.0-36-generic #39-Ubuntu SMP Mon Sep 24 16:19:09 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
System-level Core Jitter¶
$ sudo taskset -c 3 /home/testuser/pma_tools/jitter/jitter -i 30
Linux Jitter testing program version 1.8
Iterations=30
The pragram will execute a dummy function 80000 times
Display is updated every 20000 displayUpdate intervals
Timings are in CPU Core cycles
Inst_Min: Minimum Excution time during the display update interval(default is ~1 second)
Inst_Max: Maximum Excution time during the display update interval(default is ~1 second)
Inst_jitter: Jitter in the Excution time during rhe display update interval. This is the value of interest
last_Exec: The Excution time of last iteration just before the display update
Abs_Min: Absolute Minimum Excution time since the program started or statistics were reset
Abs_Max: Absolute Maximum Excution time since the program started or statistics were reset
tmp: Cumulative value calcualted by the dummy function
Interval: Time interval between the display updates in Core Cycles
Sample No: Sample number
Inst_Min Inst_Max Inst_jitter last_Exec Abs_min Abs_max tmp Interval Sample No
160024 172636 12612 160028 160024 172636 1573060608 3205463144 1
160024 188236 28212 160028 160024 188236 958595072 3205500844 2
160024 185676 25652 160028 160024 188236 344129536 3205485976 3
160024 172608 12584 160024 160024 188236 4024631296 3205472740 4
160024 179260 19236 160028 160024 188236 3410165760 3205502164 5
160024 172432 12408 160024 160024 188236 2795700224 3205452036 6
160024 178820 18796 160024 160024 188236 2181234688 3205455408 7
160024 172512 12488 160028 160024 188236 1566769152 3205461528 8
160024 172636 12612 160028 160024 188236 952303616 3205478820 9
160024 173676 13652 160028 160024 188236 337838080 3205470412 10
160024 178776 18752 160028 160024 188236 4018339840 3205481472 11
160024 172788 12764 160028 160024 188236 3403874304 3205492336 12
160024 174616 14592 160028 160024 188236 2789408768 3205474904 13
160024 174440 14416 160028 160024 188236 2174943232 3205479448 14
160024 178748 18724 160024 160024 188236 1560477696 3205482668 15
160024 172588 12564 169404 160024 188236 946012160 3205510496 16
160024 172636 12612 160024 160024 188236 331546624 3205472204 17
160024 172480 12456 160024 160024 188236 4012048384 3205455864 18
160024 172740 12716 160028 160024 188236 3397582848 3205464932 19
160024 179200 19176 160028 160024 188236 2783117312 3205476012 20
160024 172480 12456 160028 160024 188236 2168651776 3205465632 21
160024 172728 12704 160024 160024 188236 1554186240 3205497204 22
160024 172620 12596 160028 160024 188236 939720704 3205466972 23
160024 172640 12616 160028 160024 188236 325255168 3205471216 24
160024 172484 12460 160028 160024 188236 4005756928 3205467388 25
160024 172636 12612 160028 160024 188236 3391291392 3205482748 26
160024 179056 19032 160024 160024 188236 2776825856 3205467152 27
160024 172672 12648 160024 160024 188236 2162360320 3205483268 28
160024 176932 16908 160024 160024 188236 1547894784 3205488536 29
160024 172452 12428 160028 160024 188236 933429248 3205440636 30
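The Inst_jitter column is the value of interest, so it helps to reduce a run to a few summary figures. The sketch below parses rows in the format printed above and reports the minimum, maximum, and mean jitter. Only the first five samples are embedded to keep it short, and the function name `parse_jitter` is hypothetical.

```python
from statistics import mean

COLUMNS = ["Inst_Min", "Inst_Max", "Inst_jitter", "last_Exec",
           "Abs_min", "Abs_max", "tmp", "Interval", "Sample_No"]

# First five samples of the Haswell run above.
SAMPLE = """\
160024 172636 12612 160028 160024 172636 1573060608 3205463144 1
160024 188236 28212 160028 160024 188236 958595072 3205500844 2
160024 185676 25652 160028 160024 188236 344129536 3205485976 3
160024 172608 12584 160024 160024 188236 4024631296 3205472740 4
160024 179260 19236 160028 160024 188236 3410165760 3205502164 5
"""


def parse_jitter(text):
    """Return one dict per data row; non-numeric lines are skipped."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == len(COLUMNS) and all(f.isdigit() for f in fields):
            rows.append(dict(zip(COLUMNS, map(int, fields))))
    return rows


rows = parse_jitter(SAMPLE)
jitter = [r["Inst_jitter"] for r in rows]

# In these samples Inst_jitter equals Inst_Max - Inst_Min.
consistent = all(r["Inst_jitter"] == r["Inst_Max"] - r["Inst_Min"] for r in rows)

print(f"samples={len(jitter)} min={min(jitter)} max={max(jitter)} "
      f"mean={mean(jitter):.1f} consistent={consistent}")
```

All values are in CPU core cycles, as stated in the tool's banner.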
Memory Bandwidth¶
$ sudo /home/testuser/mlc --bandwidth_matrix
Intel(R) Memory Latency Checker - v3.5
Command line parameters: --bandwidth_matrix
Using buffer size of 100.000MB/thread for reads and an additional 100.000MB/thread for writes
Measuring Memory Bandwidths between nodes within system
Bandwidths are in MB/sec (1 MB/sec = 1,000,000 Bytes/sec)
Using all the threads from each core if Hyper-threading is enabled
Using Read-only traffic type
                Numa node
Numa node            0         1
       0       57935.5   30265.2
       1       30284.6   58409.9
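One way to read the matrix above is to express each cross-node figure as a fraction of the same-node figure in its row. The following sketch does that arithmetic for the Haswell values; the function name `cross_node_ratios` is hypothetical, and no assumption is made about which axis is the CPU node and which is the memory node.

```python
# Read-only bandwidth matrix from the Haswell run above, in MB/sec.
# Rows and columns are NUMA nodes; the diagonal holds same-node values.
matrix = [
    [57935.5, 30265.2],
    [30284.6, 58409.9],
]


def cross_node_ratios(m):
    """For each row, divide every off-diagonal entry by the diagonal entry."""
    ratios = {}
    for i, row in enumerate(m):
        for j, value in enumerate(row):
            if i != j:
                ratios[(i, j)] = value / row[i]
    return ratios


for (i, j), ratio in cross_node_ratios(matrix).items():
    print(f"row {i}, column {j}: {ratio:.1%} of same-node bandwidth")
```

For this run, cross-node bandwidth comes out at roughly half of same-node bandwidth.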
$ sudo /home/testuser/mlc --peak_injection_bandwidth
Intel(R) Memory Latency Checker - v3.5
Command line parameters: --peak_injection_bandwidth
Using buffer size of 100.000MB/thread for reads and an additional 100.000MB/thread for writes
Measuring Peak Injection Memory Bandwidths for the system
Bandwidths are in MB/sec (1 MB/sec = 1,000,000 Bytes/sec)
Using all the threads from each core if Hyper-threading is enabled
Using traffic with the following read-write ratios
ALL Reads : 115762.2
3:1 Reads-Writes : 106242.2
2:1 Reads-Writes : 103031.8
1:1 Reads-Writes : 87943.7
Stream-triad like: 100048.4
$ sudo /home/testuser/mlc --max_bandwidth
Intel(R) Memory Latency Checker - v3.5
Command line parameters: --max_bandwidth
Using buffer size of 100.000MB/thread for reads and an additional 100.000MB/thread for writes
Measuring Maximum Memory Bandwidths for the system
Will take several minutes to complete as multiple injection rates will be tried to get the best bandwidth
Bandwidths are in MB/sec (1 MB/sec = 1,000,000 Bytes/sec)
Using all the threads from each core if Hyper-threading is enabled
Using traffic with the following read-write ratios
ALL Reads : 115782.41
3:1 Reads-Writes : 105965.78
2:1 Reads-Writes : 103162.38
1:1 Reads-Writes : 88255.82
Stream-triad like: 105608.10
Memory Latency¶
$ sudo /home/testuser/mlc --latency_matrix
Intel(R) Memory Latency Checker - v3.5
Command line parameters: --latency_matrix
Using buffer size of 200.000MB
Measuring idle latencies (in ns)...
                Numa node
Numa node            0       1
       0         101.0   132.0
       1         141.2    98.8
$ sudo /home/testuser/mlc --idle_latency
Intel(R) Memory Latency Checker - v3.5
Command line parameters: --idle_latency
Using buffer size of 200.000MB
Each iteration took 227.2 core clocks ( 99.0 ns)
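The idle latency line above reports the same interval in core clocks and in nanoseconds, which allows a quick consistency check on the clock rate. This sketch assumes both figures describe the same iteration, so their quotient is the clock frequency in GHz; the function name is hypothetical.

```python
def implied_clock_ghz(core_clocks, nanoseconds):
    """Clocks per nanosecond is numerically the frequency in GHz."""
    return core_clocks / nanoseconds


ghz = implied_clock_ghz(227.2, 99.0)
print(f"implied clock: {ghz:.3f} GHz")
```

The result is close to the 2.30GHz nominal frequency that the Spectre and Meltdown checker output reports for this CPU.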
$ sudo /home/testuser/mlc --loaded_latency
Intel(R) Memory Latency Checker - v3.5
Command line parameters: --loaded_latency
Using buffer size of 100.000MB/thread for reads and an additional 100.000MB/thread for writes
Measuring Loaded Latencies for the system
Using all the threads from each core if Hyper-threading is enabled
Using Read-only traffic type
Inject  Latency  Bandwidth
Delay   (ns)     MB/sec
==========================
00000 294.08 115841.6
00002 294.27 115851.5
00008 293.67 115821.8
00015 278.92 115587.5
00050 246.80 113991.2
00100 206.86 104508.1
00200 123.72 72873.6
00300 113.35 52641.1
00400 108.89 41078.9
00500 108.11 33699.1
00700 106.19 24878.0
01000 104.75 17948.1
01300 103.72 14089.0
01700 102.95 11013.6
02500 102.25 7756.3
03500 101.81 5749.3
05000 101.46 4230.4
09000 101.05 2641.4
20000 100.77 1542.5
L1/L2/LLC Latency¶
$ sudo /home/testuser/mlc --c2c_latency
Intel(R) Memory Latency Checker - v3.5
Command line parameters: --c2c_latency
Measuring cache-to-cache transfer latency (in ns)...
Local Socket L2->L2 HIT latency 42.1
Local Socket L2->L2 HITM latency 47.0
Remote Socket L2->L2 HITM latency (data address homed in writer socket)
                     Reader Numa Node
Writer Numa Node         0       1
               0         -   108.0
               1     106.9       -
Remote Socket L2->L2 HITM latency (data address homed in reader socket)
                     Reader Numa Node
Writer Numa Node         0       1
               0         -   107.7
               1     106.6       -
Spectre and Meltdown Checks¶
The following section displays the output of a shell script that reports whether the system is vulnerable to the several “speculative execution” CVEs that were made public in 2018. The script is available on Spectre & Meltdown Checker GitHub.
- CVE-2017-5753 [bounds check bypass] aka ‘Spectre Variant 1’
- CVE-2017-5715 [branch target injection] aka ‘Spectre Variant 2’
- CVE-2017-5754 [rogue data cache load] aka ‘Meltdown’ aka ‘Variant 3’
- CVE-2018-3640 [rogue system register read] aka ‘Variant 3a’
- CVE-2018-3639 [speculative store bypass] aka ‘Variant 4’
- CVE-2018-3615 [L1 terminal fault] aka ‘Foreshadow (SGX)’
- CVE-2018-3620 [L1 terminal fault] aka ‘Foreshadow-NG (OS)’
- CVE-2018-3646 [L1 terminal fault] aka ‘Foreshadow-NG (VMM)’
$ sudo ./spectre-meltdown-checker.sh --no-color
Spectre and Meltdown mitigation detection tool v0.40
Checking for vulnerabilities on current system
Kernel is Linux 4.15.0-36-generic #39-Ubuntu SMP Mon Sep 24 16:19:09 UTC 2018 x86_64
CPU is Intel(R) Xeon(R) CPU E5-2699 v3 @ 2.30GHz
Hardware check
* Hardware support (CPU microcode) for mitigation techniques
* Indirect Branch Restricted Speculation (IBRS)
* SPEC_CTRL MSR is available: YES
* CPU indicates IBRS capability: YES (SPEC_CTRL feature bit)
* Indirect Branch Prediction Barrier (IBPB)
* PRED_CMD MSR is available: YES
* CPU indicates IBPB capability: YES (SPEC_CTRL feature bit)
* Single Thread Indirect Branch Predictors (STIBP)
* SPEC_CTRL MSR is available: YES
* CPU indicates STIBP capability: YES (Intel STIBP feature bit)
* Speculative Store Bypass Disable (SSBD)
* CPU indicates SSBD capability: YES (Intel SSBD)
* L1 data cache invalidation
* FLUSH_CMD MSR is available: YES
* CPU indicates L1D flush capability: YES (L1D flush feature bit)
* Enhanced IBRS (IBRS_ALL)
* CPU indicates ARCH_CAPABILITIES MSR availability: NO
* ARCH_CAPABILITIES MSR advertises IBRS_ALL capability: NO
* CPU explicitly indicates not being vulnerable to Meltdown (RDCL_NO): NO
* CPU explicitly indicates not being vulnerable to Variant 4 (SSB_NO): NO
* CPU/Hypervisor indicates L1D flushing is not necessary on this system: NO
* Hypervisor indicates host CPU might be vulnerable to RSB underflow (RSBA): NO
* CPU supports Software Guard Extensions (SGX): NO
* CPU microcode is known to cause stability problems: NO (model 0x3f family 0x6 stepping 0x2 ucode 0x3d cpuid 0x306f2)
* CPU microcode is the latest known available version: YES (latest version is 0x3d dated 2018/04/20 according to builtin MCExtractor DB v84 - 2018/09/27)
* CPU vulnerability to the speculative execution attack variants
* Vulnerable to CVE-2017-5753 (Spectre Variant 1, bounds check bypass): YES
* Vulnerable to CVE-2017-5715 (Spectre Variant 2, branch target injection): YES
* Vulnerable to CVE-2017-5754 (Variant 3, Meltdown, rogue data cache load): YES
* Vulnerable to CVE-2018-3640 (Variant 3a, rogue system register read): YES
* Vulnerable to CVE-2018-3639 (Variant 4, speculative store bypass): YES
* Vulnerable to CVE-2018-3615 (Foreshadow (SGX), L1 terminal fault): NO
* Vulnerable to CVE-2018-3620 (Foreshadow-NG (OS), L1 terminal fault): YES
* Vulnerable to CVE-2018-3646 (Foreshadow-NG (VMM), L1 terminal fault): YES
CVE-2017-5753 aka 'Spectre Variant 1, bounds check bypass'
* Mitigated according to the /sys interface: YES (Mitigation: __user pointer sanitization)
* Kernel has array_index_mask_nospec: YES (1 occurrence(s) found of x86 64 bits array_index_mask_nospec())
* Kernel has the Red Hat/Ubuntu patch: NO
* Kernel has mask_nospec64 (arm64): NO
> STATUS: NOT VULNERABLE (Mitigation: __user pointer sanitization)
CVE-2017-5715 aka 'Spectre Variant 2, branch target injection'
* Mitigated according to the /sys interface: YES (Mitigation: Full generic retpoline, IBPB, IBRS_FW)
* Mitigation 1
* Kernel is compiled with IBRS support: YES
* IBRS enabled and active: YES (for kernel and firmware code)
* Kernel is compiled with IBPB support: YES
* IBPB enabled and active: YES
* Mitigation 2
* Kernel has branch predictor hardening (arm): NO
* Kernel compiled with retpoline option: YES
* Kernel compiled with a retpoline-aware compiler: YES (kernel reports full retpoline compilation)
> STATUS: NOT VULNERABLE (Full retpoline + IBPB are mitigating the vulnerability)
CVE-2017-5754 aka 'Variant 3, Meltdown, rogue data cache load'
* Mitigated according to the /sys interface: YES (Mitigation: PTI)
* Kernel supports Page Table Isolation (PTI): YES
* PTI enabled and active: YES
* Reduced performance impact of PTI: YES (CPU supports INVPCID, performance impact of PTI will be greatly reduced)
* Running as a Xen PV DomU: NO
> STATUS: NOT VULNERABLE (Mitigation: PTI)
CVE-2018-3640 aka 'Variant 3a, rogue system register read'
* CPU microcode mitigates the vulnerability: YES
> STATUS: NOT VULNERABLE (your CPU microcode mitigates the vulnerability)
CVE-2018-3639 aka 'Variant 4, speculative store bypass'
* Mitigated according to the /sys interface: YES (Mitigation: Speculative Store Bypass disabled via prctl and seccomp)
* Kernel supports speculation store bypass: YES (found in /proc/self/status)
> STATUS: NOT VULNERABLE (Mitigation: Speculative Store Bypass disabled via prctl and seccomp)
CVE-2018-3615 aka 'Foreshadow (SGX), L1 terminal fault'
* CPU microcode mitigates the vulnerability: N/A
> STATUS: NOT VULNERABLE (your CPU vendor reported your CPU model as not vulnerable)
CVE-2018-3620 aka 'Foreshadow-NG (OS), L1 terminal fault'
* Mitigated according to the /sys interface: YES (Mitigation: PTE Inversion)
* Kernel supports PTE inversion: YES (found in kernel image)
* PTE inversion enabled and active: YES
> STATUS: NOT VULNERABLE (Mitigation: PTE Inversion)
CVE-2018-3646 aka 'Foreshadow-NG (VMM), L1 terminal fault'
* Information from the /sys interface: VMX: conditional cache flushes, SMT disabled
* This system is a host running an hypervisor: NO
* Mitigation 1 (KVM)
* EPT is disabled: NO
* Mitigation 2
* L1D flush is supported by kernel: YES (found flush_l1d in /proc/cpuinfo)
* L1D flush enabled: YES (conditional flushes)
* Hardware-backed L1D flush supported: YES (performance impact of the mitigation will be greatly reduced)
* Hyper-Threading (SMT) is enabled: NO
> STATUS: NOT VULNERABLE (this system is not running an hypervisor)
> SUMMARY: CVE-2017-5753:OK CVE-2017-5715:OK CVE-2017-5754:OK CVE-2018-3640:OK CVE-2018-3639:OK CVE-2018-3615:OK CVE-2018-3620:OK CVE-2018-3646:OK
Need more detailed information about mitigation options? Use --explain
A false sense of security is worse than no security at all, see --disclaimer
Calibration Data - Skylake¶
The following sections include sample calibration data measured on the s11-t31-sut1 server running in one of the Intel Xeon Skylake testbeds, as specified in FD.io CSIT testbeds - Xeon Skylake, Arm, Atom.
Calibration data obtained from all other servers in Skylake testbeds shows the same or similar values.
Linux cmdline¶
$ cat /proc/cmdline
BOOT_IMAGE=/vmlinuz-4.15.0-23-generic root=UUID=759ad671-ad46-441b-a75b-9f54e81837bb ro isolcpus=1-27,29-55,57-83,85-111 nohz_full=1-27,29-55,57-83,85-111 rcu_nocbs=1-27,29-55,57-83,85-111 numa_balancing=disable intel_pstate=disable intel_iommu=on iommu=pt nmi_watchdog=0 audit=0 nosoftlockup processor.max_cstate=1 intel_idle.max_cstate=1 hpet=disable tsc=reliable mce=off console=tty0 console=ttyS0,115200n8
Linux uname¶
$ uname -a
Linux s5-t22-sut1 4.15.0-23-generic #25-Ubuntu SMP Wed May 23 18:02:16 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
System-level Core Jitter¶
$ sudo taskset -c 3 /home/testuser/pma_tools/jitter/jitter -i 20
Linux Jitter testing program version 1.8
Iterations=20
The pragram will execute a dummy function 80000 times
Display is updated every 20000 displayUpdate intervals
Timings are in CPU Core cycles
Inst_Min: Minimum Excution time during the display update interval(default is ~1 second)
Inst_Max: Maximum Excution time during the display update interval(default is ~1 second)
Inst_jitter: Jitter in the Excution time during rhe display update interval. This is the value of interest
last_Exec: The Excution time of last iteration just before the display update
Abs_Min: Absolute Minimum Excution time since the program started or statistics were reset
Abs_Max: Absolute Maximum Excution time since the program started or statistics were reset
tmp: Cumulative value calcualted by the dummy function
Interval: Time interval between the display updates in Core Cycles
Sample No: Sample number
Inst_Min Inst_Max Inst_jitter last_Exec Abs_min Abs_max tmp Interval Sample No
160022 171330 11308 160022 160022 171330 2538733568 3204142750 1
160022 167294 7272 160026 160022 171330 328335360 3203873548 2
160022 167560 7538 160026 160022 171330 2412904448 3203878736 3
160022 169000 8978 160024 160022 171330 202506240 3203864588 4
160022 166572 6550 160026 160022 171330 2287075328 3203866224 5
160022 167460 7438 160026 160022 171330 76677120 3203854632 6
160022 168134 8112 160024 160022 171330 2161246208 3203874674 7
160022 169094 9072 160022 160022 171330 4245815296 3203878798 8
160022 172460 12438 160024 160022 172460 2035417088 3204112010 9
160022 167862 7840 160030 160022 172460 4119986176 3203856800 10
160022 168398 8376 160024 160022 172460 1909587968 3203854192 11
160022 167548 7526 160024 160022 172460 3994157056 3203847442 12
160022 167562 7540 160026 160022 172460 1783758848 3203862936 13
160022 167604 7582 160024 160022 172460 3868327936 3203859346 14
160022 168262 8240 160024 160022 172460 1657929728 3203851120 15
160022 169700 9678 160024 160022 172460 3742498816 3203877690 16
160022 170476 10454 160026 160022 172460 1532100608 3204088480 17
160022 167798 7776 160024 160022 172460 3616669696 3203862072 18
160022 166540 6518 160024 160022 172460 1406271488 3203836904 19
160022 167516 7494 160024 160022 172460 3490840576 3203848120 20
Memory Bandwidth¶
$ sudo /home/testuser/mlc --bandwidth_matrix
Intel(R) Memory Latency Checker - v3.5
Command line parameters: --bandwidth_matrix
Using buffer size of 100.000MB/thread for reads and an additional 100.000MB/thread for writes
Measuring Memory Bandwidths between nodes within system
Bandwidths are in MB/sec (1 MB/sec = 1,000,000 Bytes/sec)
Using all the threads from each core if Hyper-threading is enabled
Using Read-only traffic type
                Numa node
Numa node            0         1
       0      107947.7   50951.5
       1       50834.6  108183.4
$ sudo /home/testuser/mlc --peak_injection_bandwidth
Intel(R) Memory Latency Checker - v3.5
Command line parameters: --peak_injection_bandwidth
Using buffer size of 100.000MB/thread for reads and an additional 100.000MB/thread for writes
Measuring Peak Injection Memory Bandwidths for the system
Bandwidths are in MB/sec (1 MB/sec = 1,000,000 Bytes/sec)
Using all the threads from each core if Hyper-threading is enabled
Using traffic with the following read-write ratios
ALL Reads : 215733.9
3:1 Reads-Writes : 182141.9
2:1 Reads-Writes : 178615.7
1:1 Reads-Writes : 149911.3
Stream-triad like: 159533.6
$ sudo /home/testuser/mlc --max_bandwidth
Intel(R) Memory Latency Checker - v3.5
Command line parameters: --max_bandwidth
Using buffer size of 100.000MB/thread for reads and an additional 100.000MB/thread for writes
Measuring Maximum Memory Bandwidths for the system
Will take several minutes to complete as multiple injection rates will be tried to get the best bandwidth
Bandwidths are in MB/sec (1 MB/sec = 1,000,000 Bytes/sec)
Using all the threads from each core if Hyper-threading is enabled
Using traffic with the following read-write ratios
ALL Reads : 216875.73
3:1 Reads-Writes : 182615.14
2:1 Reads-Writes : 178745.67
1:1 Reads-Writes : 149485.27
Stream-triad like: 180057.87
Memory Latency¶
$ sudo /home/testuser/mlc --latency_matrix
Intel(R) Memory Latency Checker - v3.5
Command line parameters: --latency_matrix
Using buffer size of 2000.000MB
Measuring idle latencies (in ns)...
                Numa node
Numa node            0       1
       0          81.4   131.1
       1         131.1    81.3
$ sudo /home/testuser/mlc --idle_latency
Intel(R) Memory Latency Checker - v3.5
Command line parameters: --idle_latency
Using buffer size of 2000.000MB
Each iteration took 202.0 core clocks ( 80.8 ns)
$ sudo /home/testuser/mlc --loaded_latency
Intel(R) Memory Latency Checker - v3.5
Command line parameters: --loaded_latency
Using buffer size of 100.000MB/thread for reads and an additional 100.000MB/thread for writes
Measuring Loaded Latencies for the system
Using all the threads from each core if Hyper-threading is enabled
Using Read-only traffic type
Inject  Latency  Bandwidth
Delay   (ns)     MB/sec
==========================
00000 282.66 215712.8
00002 282.14 215757.4
00008 280.21 215868.1
00015 279.20 216313.2
00050 275.25 216643.0
00100 227.05 215075.0
00200 121.92 160242.9
00300 101.21 111587.4
00400 95.48 85019.7
00500 94.46 68717.3
00700 92.27 49742.2
01000 91.03 35264.8
01300 90.11 27396.3
01700 89.34 21178.7
02500 90.15 14672.8
03500 89.00 10715.7
05000 82.00 7788.2
09000 81.46 4684.0
20000 81.40 2541.9
L1/L2/LLC Latency¶
$ sudo /home/testuser/mlc --c2c_latency
Intel(R) Memory Latency Checker - v3.5
Command line parameters: --c2c_latency
Measuring cache-to-cache transfer latency (in ns)...
Local Socket L2->L2 HIT latency 53.7
Local Socket L2->L2 HITM latency 53.7
Remote Socket L2->L2 HITM latency (data address homed in writer socket)
                     Reader Numa Node
Writer Numa Node         0       1
               0         -   113.9
               1     113.9       -
Remote Socket L2->L2 HITM latency (data address homed in reader socket)
                     Reader Numa Node
Writer Numa Node         0       1
               0         -   177.9
               1     177.6       -
Spectre and Meltdown Checks¶
The following section displays the output of a shell script that reports whether the system is vulnerable to the several “speculative execution” CVEs that were made public in 2018. The script is available on Spectre & Meltdown Checker GitHub.
- CVE-2017-5753 [bounds check bypass] aka ‘Spectre Variant 1’
- CVE-2017-5715 [branch target injection] aka ‘Spectre Variant 2’
- CVE-2017-5754 [rogue data cache load] aka ‘Meltdown’ aka ‘Variant 3’
- CVE-2018-3640 [rogue system register read] aka ‘Variant 3a’
- CVE-2018-3639 [speculative store bypass] aka ‘Variant 4’
- CVE-2018-3615 [L1 terminal fault] aka ‘Foreshadow (SGX)’
- CVE-2018-3620 [L1 terminal fault] aka ‘Foreshadow-NG (OS)’
- CVE-2018-3646 [L1 terminal fault] aka ‘Foreshadow-NG (VMM)’
$ sudo ./spectre-meltdown-checker.sh --no-color
Spectre and Meltdown mitigation detection tool v0.40
Checking for vulnerabilities on current system
Kernel is Linux 4.15.0-23-generic #25-Ubuntu SMP Wed May 23 18:02:16 UTC 2018 x86_64
CPU is Intel(R) Xeon(R) Platinum 8180 CPU @ 2.50GHz
Hardware check
* Hardware support (CPU microcode) for mitigation techniques
* Indirect Branch Restricted Speculation (IBRS)
* SPEC_CTRL MSR is available: YES
* CPU indicates IBRS capability: YES (SPEC_CTRL feature bit)
* Indirect Branch Prediction Barrier (IBPB)
* PRED_CMD MSR is available: YES
* CPU indicates IBPB capability: YES (SPEC_CTRL feature bit)
* Single Thread Indirect Branch Predictors (STIBP)
* SPEC_CTRL MSR is available: YES
* CPU indicates STIBP capability: YES (Intel STIBP feature bit)
* Speculative Store Bypass Disable (SSBD)
* CPU indicates SSBD capability: NO
* L1 data cache invalidation
* FLUSH_CMD MSR is available: NO
* CPU indicates L1D flush capability: NO
* Enhanced IBRS (IBRS_ALL)
* CPU indicates ARCH_CAPABILITIES MSR availability: NO
* ARCH_CAPABILITIES MSR advertises IBRS_ALL capability: NO
* CPU explicitly indicates not being vulnerable to Meltdown (RDCL_NO): NO
* CPU explicitly indicates not being vulnerable to Variant 4 (SSB_NO): NO
* CPU/Hypervisor indicates L1D flushing is not necessary on this system: NO
* Hypervisor indicates host CPU might be vulnerable to RSB underflow (RSBA): NO
* CPU supports Software Guard Extensions (SGX): NO
* CPU microcode is known to cause stability problems: NO (model 0x55 family 0x6 stepping 0x4 ucode 0x2000043 cpuid 0x50654)
* CPU microcode is the latest known available version: NO (latest version is 0x200004d dated 2018/05/15 according to builtin MCExtractor DB v84 - 2018/09/27)
* CPU vulnerability to the speculative execution attack variants
* Vulnerable to CVE-2017-5753 (Spectre Variant 1, bounds check bypass): YES
* Vulnerable to CVE-2017-5715 (Spectre Variant 2, branch target injection): YES
* Vulnerable to CVE-2017-5754 (Variant 3, Meltdown, rogue data cache load): YES
* Vulnerable to CVE-2018-3640 (Variant 3a, rogue system register read): YES
* Vulnerable to CVE-2018-3639 (Variant 4, speculative store bypass): YES
* Vulnerable to CVE-2018-3615 (Foreshadow (SGX), L1 terminal fault): NO
* Vulnerable to CVE-2018-3620 (Foreshadow-NG (OS), L1 terminal fault): YES
* Vulnerable to CVE-2018-3646 (Foreshadow-NG (VMM), L1 terminal fault): YES
CVE-2017-5753 aka 'Spectre Variant 1, bounds check bypass'
* Mitigated according to the /sys interface: YES (Mitigation: __user pointer sanitization)
* Kernel has array_index_mask_nospec: YES (1 occurrence(s) found of x86 64 bits array_index_mask_nospec())
* Kernel has the Red Hat/Ubuntu patch: NO
* Kernel has mask_nospec64 (arm64): NO
> STATUS: NOT VULNERABLE (Mitigation: __user pointer sanitization)
CVE-2017-5715 aka 'Spectre Variant 2, branch target injection'
* Mitigated according to the /sys interface: YES (Mitigation: Full generic retpoline, IBPB, IBRS_FW)
* Mitigation 1
* Kernel is compiled with IBRS support: YES
* IBRS enabled and active: YES (for kernel and firmware code)
* Kernel is compiled with IBPB support: YES
* IBPB enabled and active: YES
* Mitigation 2
* Kernel has branch predictor hardening (arm): NO
* Kernel compiled with retpoline option: YES
* Kernel compiled with a retpoline-aware compiler: YES (kernel reports full retpoline compilation)
* Kernel supports RSB filling: YES
> STATUS: NOT VULNERABLE (Full retpoline + IBPB are mitigating the vulnerability)
CVE-2017-5754 aka 'Variant 3, Meltdown, rogue data cache load'
* Mitigated according to the /sys interface: YES (Mitigation: PTI)
* Kernel supports Page Table Isolation (PTI): YES
* PTI enabled and active: YES
* Reduced performance impact of PTI: YES (CPU supports INVPCID, performance impact of PTI will be greatly reduced)
* Running as a Xen PV DomU: NO
> STATUS: NOT VULNERABLE (Mitigation: PTI)
CVE-2018-3640 aka 'Variant 3a, rogue system register read'
* CPU microcode mitigates the vulnerability: NO
> STATUS: VULNERABLE (an up-to-date CPU microcode is needed to mitigate this vulnerability)
CVE-2018-3639 aka 'Variant 4, speculative store bypass'
* Mitigated according to the /sys interface: NO (Vulnerable)
* Kernel supports speculation store bypass: YES (found in /proc/self/status)
> STATUS: VULNERABLE (Your CPU doesn't support SSBD)
CVE-2018-3615 aka 'Foreshadow (SGX), L1 terminal fault'
* CPU microcode mitigates the vulnerability: N/A
> STATUS: NOT VULNERABLE (your CPU vendor reported your CPU model as not vulnerable)
CVE-2018-3620 aka 'Foreshadow-NG (OS), L1 terminal fault'
* Kernel supports PTE inversion: NO
* PTE inversion enabled and active: UNKNOWN (sysfs interface not available)
> STATUS: VULNERABLE (Your kernel doesn't support PTE inversion, update it)
CVE-2018-3646 aka 'Foreshadow-NG (VMM), L1 terminal fault'
* This system is a host running an hypervisor: NO
* Mitigation 1 (KVM)
* EPT is disabled: NO
* Mitigation 2
* L1D flush is supported by kernel: NO
* L1D flush enabled: UNKNOWN (can't find or read /sys/devices/system/cpu/vulnerabilities/l1tf)
* Hardware-backed L1D flush supported: NO (flush will be done in software, this is slower)
* Hyper-Threading (SMT) is enabled: YES
> STATUS: NOT VULNERABLE (this system is not running an hypervisor)
> SUMMARY: CVE-2017-5753:OK CVE-2017-5715:OK CVE-2017-5754:OK CVE-2018-3640:KO CVE-2018-3639:KO CVE-2018-3615:OK CVE-2018-3620:KO CVE-2018-3646:OK
Need more detailed information about mitigation options? Use --explain
A false sense of security is worse than no security at all, see --disclaimer
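The SUMMARY line near the end of the checker output is convenient for automated comparison between servers. The sketch below splits that line into a mapping and lists the CVEs marked KO, which in the output above correspond to the checks whose STATUS is VULNERABLE. The function name `parse_summary` is hypothetical, and the parsing relies only on the line format shown above.

```python
# SUMMARY line copied from the Skylake checker output above.
SUMMARY = ("> SUMMARY: CVE-2017-5753:OK CVE-2017-5715:OK CVE-2017-5754:OK "
           "CVE-2018-3640:KO CVE-2018-3639:KO CVE-2018-3615:OK "
           "CVE-2018-3620:KO CVE-2018-3646:OK")


def parse_summary(line):
    """Map each CVE identifier in a SUMMARY line to its OK/KO status."""
    _, _, body = line.partition("SUMMARY:")
    result = {}
    for token in body.split():
        cve, _, status = token.rpartition(":")
        result[cve] = status
    return result


statuses = parse_summary(SUMMARY)
unmitigated = sorted(cve for cve, status in statuses.items() if status == "KO")
print("checks reported KO:", ", ".join(unmitigated))
```

Applied to the Haswell summary line, the same function would return an empty list, since every entry there is OK.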
Calibration Data - Denverton¶
The following sections include sample calibration data measured on a Denverton server at Intel SH labs.
VPP-18.10 2-Node Atom Denverton testing took place at Intel Corporation, carefully adhering to FD.io CSIT best practices.
Linux cmdline¶
$ cat /proc/cmdline
BOOT_IMAGE=/boot/vmlinuz-4.15.0-36-generic root=UUID=d3cfffd0-1e77-423a-a53a-a117199b6025 ro intel_iommu=on iommu=pt isolcpus=1-11 nohz_full=1-11 rcu_nocbs=1-11 default_hugepagesz=1G hugepagesz=1G hugepages=8 intel_pstate=disable nmi_watchdog=0 numa_balancing=disable tsc=reliable nosoftlockup quiet splash vt.handoff=7
Linux uname¶
$ uname -a
Linux 4.15.0-36-generic #39~16.04.1-Ubuntu SMP Tue Sep 25 08:59:23 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
System-level Core Jitter¶
$ sudo taskset -c 2 /home/testuser/pma_tools/jitter/jitter -c 2 -i 20
Linux Jitter testing program version 1.9
Iterations=20
The pragram will execute a dummy function 80000 times
Display is updated every 20000 displayUpdate intervals
Thread affinity will be set to core_id:2
Timings are in CPU Core cycles
Inst_Min: Minimum Excution time during the display update interval(default is ~1 second)
Inst_Max: Maximum Excution time during the display update interval(default is ~1 second)
Inst_jitter: Jitter in the Excution time during rhe display update interval. This is the value of interest
last_Exec: The Excution time of last iteration just before the display update
Abs_Min: Absolute Minimum Excution time since the program started or statistics were reset
Abs_Max: Absolute Maximum Excution time since the program started or statistics were reset
tmp: Cumulative value calcualted by the dummy function
Interval: Time interval between the display updates in Core Cycles
Sample No: Sample number
Inst_Min Inst_Max Inst_jitter last_Exec Abs_min Abs_max tmp Interval Sample No
177530 196100 18570 177530 177530 196100 4156751872 3556820054 1
177530 200784 23254 177530 177530 200784 321060864 3556897644 2
177530 196346 18816 177530 177530 200784 780337152 3556918674 3
177530 195962 18432 177530 177530 200784 1239613440 3556847928 4
177530 195960 18430 177530 177530 200784 1698889728 3556860214 5
177530 198824 21294 177530 177530 200784 2158166016 3556854934 6
177530 198522 20992 177530 177530 200784 2617442304 3556862410 7
177530 196362 18832 177530 177530 200784 3076718592 3556851636 8
177530 199114 21584 177530 177530 200784 3535994880 3556870846 9
177530 197194 19664 177530 177530 200784 3995271168 3556933584 10
177530 198272 20742 177536 177530 200784 159580160 3556869044 11
177530 197586 20056 177530 177530 200784 618856448 3556903482 12
177530 196072 18542 177530 177530 200784 1078132736 3556825540 13
177530 196354 18824 177530 177530 200784 1537409024 3556881664 14
177530 195906 18376 177530 177530 200784 1996685312 3556839924 15
177530 199066 21536 177530 177530 200784 2455961600 3556860220 16
177530 196968 19438 177530 177530 200784 2915237888 3556871890 17
177530 195896 18366 177530 177530 200784 3374514176 3556855338 18
177530 196020 18490 177530 177530 200784 3833790464 3556839820 19
177530 196030 18500 177530 177530 200784 4293066752 3556889196 20
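The column legend above implies that Inst_jitter is the spread between Inst_Max and Inst_Min within one display interval. As a minimal sketch, assuming the rows have been captured as whitespace-separated text, the following Python checks that relation and summarizes the jitter column. The function name `summarize_jitter` is hypothetical and not part of the jitter tool.

```python
def summarize_jitter(rows_text):
    """Parse jitter rows and summarize the Inst_jitter column (in core cycles)."""
    jitters = []
    for line in rows_text.strip().splitlines():
        fields = line.split()
        if len(fields) != 9 or not fields[0].isdigit():
            continue  # skip the header and anything that is not a data row
        inst_min, inst_max, inst_jitter = (int(f) for f in fields[:3])
        # Sanity check: jitter is the spread within one display interval.
        if inst_max - inst_min != inst_jitter:
            raise ValueError(f"inconsistent row: {line!r}")
        jitters.append(inst_jitter)
    if not jitters:
        raise ValueError("no data rows found")
    return {
        "samples": len(jitters),
        "max": max(jitters),
        "mean": sum(jitters) / len(jitters),
    }


# First three samples copied from the output above.
sample = """
Inst_Min Inst_Max Inst_jitter last_Exec Abs_min Abs_max tmp Interval Sample No
177530 196100 18570 177530 177530 196100 4156751872 3556820054 1
177530 200784 23254 177530 177530 200784 321060864 3556897644 2
177530 196346 18816 177530 177530 200784 780337152 3556918674 3
"""
print(summarize_jitter(sample))
```

All values stay in CPU core cycles. Converting them to time would require the core frequency, which the tool output does not report.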
Memory Bandwidth¶
$ sudo /home/testuser/mlc --bandwidth_matrix
Intel(R) Memory Latency Checker - v3.5
Command line parameters: --bandwidth_matrix
Using buffer size of 100.000MB/thread for reads and an additional 100.000MB/thread for writes
Measuring Memory Bandwidths between nodes within system
Bandwidths are in MB/sec (1 MB/sec = 1,000,000 Bytes/sec)
Using all the threads from each core if Hyper-threading is enabled
Using Read-only traffic type
Memory node
Socket 0
0 28157.2
$ sudo /home/testuser/mlc --peak_injection_bandwidth
Intel(R) Memory Latency Checker - v3.5
Command line parameters: --peak_injection_bandwidth
Using buffer size of 100.000MB/thread for reads and an additional 100.000MB/thread for writes
Measuring Peak Injection Memory Bandwidths for the system
Bandwidths are in MB/sec (1 MB/sec = 1,000,000 Bytes/sec)
Using all the threads from each core if Hyper-threading is enabled
Using traffic with the following read-write ratios
ALL Reads : 28150.0
3:1 Reads-Writes : 27425.0
2:1 Reads-Writes : 27565.4
1:1 Reads-Writes : 27489.3
Stream-triad like: 26878.2
$ sudo /home/testuser/mlc --max_bandwidth
Intel(R) Memory Latency Checker - v3.5
Command line parameters: --max_bandwidth
Using buffer size of 100.000MB/thread for reads and an additional 100.000MB/thread for writes
Measuring Maximum Memory Bandwidths for the system
Will take several minutes to complete as multiple injection rates will be tried to get the best bandwidth
Bandwidths are in MB/sec (1 MB/sec = 1,000,000 Bytes/sec)
Using all the threads from each core if Hyper-threading is enabled
Using traffic with the following read-write ratios
ALL Reads : 30032.40
3:1 Reads-Writes : 27450.88
2:1 Reads-Writes : 27567.46
1:1 Reads-Writes : 27501.90
Stream-triad like: 27124.82
Memory Latency¶
$ sudo /home/testuser/mlc --latency_matrix
Intel(R) Memory Latency Checker - v3.5
Command line parameters: --latency_matrix
Using buffer size of 2000.000MB
Intel(R) Memory Latency Checker - v3.5
Measuring idle latencies (in ns)...
Memory node
Socket 0
0 93.1
$ sudo /home/testuser/mlc --idle_latency
Intel(R) Memory Latency Checker - v3.5
Command line parameters: --idle_latency
Using buffer size of 200.000MB
Each iteration took 186.7 core clocks ( 93.4 ns)
$ sudo /home/testuser/mlc --loaded_latency
Intel(R) Memory Latency Checker - v3.5
Command line parameters: --loaded_latency
Using buffer size of 100.000MB/thread for reads and an additional 100.000MB/thread for writes
Measuring Loaded Latencies for the system
Using all the threads from each core if Hyper-threading is enabled
Using Read-only traffic type
Inject Latency Bandwidth
Delay (ns) MB/sec
==========================
00000 135.35 27186.0
00002 135.47 27176.9
00008 134.97 27063.3
00015 134.41 26825.6
00050 139.83 28419.1
00100 124.28 22616.4
00200 109.40 14139.8
00300 104.56 10275.1
00400 102.02 8120.0
00500 100.38 6751.4
00700 98.30 5124.9
01000 96.56 3852.7
01300 95.65 3149.0
01700 95.06 2585.4
02500 94.43 1988.8
03500 94.16 1621.1
05000 93.95 1343.1
09000 93.65 1052.6
20000 93.43 851.7
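The loaded-latency table can be compared against the idle latency by reading it programmatically. The sketch below assumes the three-column layout shown above (inject delay, latency in ns, bandwidth in MB/sec). The helper name `parse_loaded_latency` is hypothetical and not part of Intel MLC.

```python
def parse_loaded_latency(table_text):
    """Return a list of (inject_delay, latency_ns, bandwidth_mb_s) tuples."""
    rows = []
    for line in table_text.strip().splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue
        try:
            rows.append((int(fields[0]), float(fields[1]), float(fields[2])))
        except ValueError:
            continue  # header or separator line
    return rows


# A subset of the rows reported above.
sample = """
Inject Latency Bandwidth
Delay (ns) MB/sec
==========================
00000 135.35 27186.0
00050 139.83 28419.1
00500 100.38 6751.4
20000 93.43 851.7
"""
rows = parse_loaded_latency(sample)
peak = max(rows, key=lambda r: r[2])      # row with the highest bandwidth
lightest = max(rows, key=lambda r: r[0])  # row with the largest inject delay
print(f"peak bandwidth {peak[2]} MB/sec at {peak[1]} ns")
print(f"latency added under load: {peak[1] - lightest[1]:.2f} ns")
```

The row with the largest inject delay serves here as an approximation of the unloaded case. Its latency (93.43 ns) is close to the idle latency measured earlier.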
L1/L2/LLC Latency¶
$ sudo /home/testuser/mlc --c2c_latency
Intel(R) Memory Latency Checker - v3.5
Command line parameters: --c2c_latency
Measuring cache-to-cache transfer latency (in ns)...
Local Socket L2->L2 HIT latency 8.8
Local Socket L2->L2 HITM latency 8.8
Spectre and Meltdown Checks¶
The following section shows the output of a shell script that checks whether the system is vulnerable to the “speculative execution” CVEs made public in 2018. The script is available on the Spectre & Meltdown Checker GitHub page.
- CVE-2017-5753 [bounds check bypass] aka ‘Spectre Variant 1’
- CVE-2017-5715 [branch target injection] aka ‘Spectre Variant 2’
- CVE-2017-5754 [rogue data cache load] aka ‘Meltdown’ aka ‘Variant 3’
- CVE-2018-3640 [rogue system register read] aka ‘Variant 3a’
- CVE-2018-3639 [speculative store bypass] aka ‘Variant 4’
- CVE-2018-3615 [L1 terminal fault] aka ‘Foreshadow (SGX)’
- CVE-2018-3620 [L1 terminal fault] aka ‘Foreshadow-NG (OS)’
- CVE-2018-3646 [L1 terminal fault] aka ‘Foreshadow-NG (VMM)’
$ sudo ./spectre-meltdown-checker.sh --no-color
Spectre and Meltdown mitigation detection tool v0.40
Checking for vulnerabilities on current system
Kernel is Linux 4.15.0-36-generic #39~16.04.1-Ubuntu SMP Tue Sep 25 08:59:23 UTC 2018 x86_64
CPU is Intel(R) Atom(TM) CPU C3858 @ 2.00GHz
Hardware check
* Hardware support (CPU microcode) for mitigation techniques
* Indirect Branch Restricted Speculation (IBRS)
* SPEC_CTRL MSR is available: YES
* CPU indicates IBRS capability: YES (SPEC_CTRL feature bit)
* Indirect Branch Prediction Barrier (IBPB)
* PRED_CMD MSR is available: YES
* CPU indicates IBPB capability: YES (SPEC_CTRL feature bit)
* Single Thread Indirect Branch Predictors (STIBP)
* SPEC_CTRL MSR is available: YES
* CPU indicates STIBP capability: YES (Intel STIBP feature bit)
* Speculative Store Bypass Disable (SSBD)
* CPU indicates SSBD capability: YES (Intel SSBD)
* L1 data cache invalidation
* FLUSH_CMD MSR is available: NO
* CPU indicates L1D flush capability: NO
* Enhanced IBRS (IBRS_ALL)
* CPU indicates ARCH_CAPABILITIES MSR availability: YES
* ARCH_CAPABILITIES MSR advertises IBRS_ALL capability: NO
* CPU explicitly indicates not being vulnerable to Meltdown (RDCL_NO): YES
* CPU explicitly indicates not being vulnerable to Variant 4 (SSB_NO): NO
* CPU/Hypervisor indicates L1D flushing is not necessary on this system: YES
* Hypervisor indicates host CPU might be vulnerable to RSB underflow (RSBA): NO
* CPU supports Software Guard Extensions (SGX): NO
* CPU microcode is known to cause stability problems: NO (model 0x5f family 0x6 stepping 0x1 ucode 0x24 cpuid 0x506f1)
* CPU microcode is the latest known available version: YES (latest version is 0x24 dated 2018/05/11 according to builtin MCExtractor DB v84 - 2018/09/27)
* CPU vulnerability to the speculative execution attack variants
* Vulnerable to CVE-2017-5753 (Spectre Variant 1, bounds check bypass): YES
* Vulnerable to CVE-2017-5715 (Spectre Variant 2, branch target injection): YES
* Vulnerable to CVE-2017-5754 (Variant 3, Meltdown, rogue data cache load): NO
* Vulnerable to CVE-2018-3640 (Variant 3a, rogue system register read): YES
* Vulnerable to CVE-2018-3639 (Variant 4, speculative store bypass): YES
* Vulnerable to CVE-2018-3615 (Foreshadow (SGX), L1 terminal fault): NO
* Vulnerable to CVE-2018-3620 (Foreshadow-NG (OS), L1 terminal fault): YES
* Vulnerable to CVE-2018-3646 (Foreshadow-NG (VMM), L1 terminal fault): YES
CVE-2017-5753 aka 'Spectre Variant 1, bounds check bypass'
* Mitigated according to the /sys interface: YES (Mitigation: __user pointer sanitization)
* Kernel has array_index_mask_nospec: YES (1 occurrence(s) found of x86 64 bits array_index_mask_nospec())
* Kernel has the Red Hat/Ubuntu patch: NO
* Kernel has mask_nospec64 (arm64): NO
> STATUS: NOT VULNERABLE (Mitigation: __user pointer sanitization)
CVE-2017-5715 aka 'Spectre Variant 2, branch target injection'
* Mitigated according to the /sys interface: YES (Mitigation: Full generic retpoline, IBPB, IBRS_FW)
* Mitigation 1
* Kernel is compiled with IBRS support: YES
* IBRS enabled and active: YES (for kernel and firmware code)
* Kernel is compiled with IBPB support: YES
* IBPB enabled and active: YES
* Mitigation 2
* Kernel has branch predictor hardening (arm): NO
* Kernel compiled with retpoline option: YES
* Kernel compiled with a retpoline-aware compiler: YES (kernel reports full retpoline compilation)
> STATUS: NOT VULNERABLE (Full retpoline + IBPB are mitigating the vulnerability)
CVE-2017-5754 aka 'Variant 3, Meltdown, rogue data cache load'
* Mitigated according to the /sys interface: YES (Not affected)
* Kernel supports Page Table Isolation (PTI): YES
* PTI enabled and active: NO
* Reduced performance impact of PTI: NO (PCID/INVPCID not supported, performance impact of PTI will be significant)
* Running as a Xen PV DomU: NO
> STATUS: NOT VULNERABLE (your CPU vendor reported your CPU model as not vulnerable)
CVE-2018-3640 aka 'Variant 3a, rogue system register read'
* CPU microcode mitigates the vulnerability: YES
> STATUS: NOT VULNERABLE (your CPU microcode mitigates the vulnerability)
CVE-2018-3639 aka 'Variant 4, speculative store bypass'
* Mitigated according to the /sys interface: YES (Mitigation: Speculative Store Bypass disabled via prctl and seccomp)
* Kernel supports speculation store bypass: YES (found in /proc/self/status)
> STATUS: NOT VULNERABLE (Mitigation: Speculative Store Bypass disabled via prctl and seccomp)
CVE-2018-3615 aka 'Foreshadow (SGX), L1 terminal fault'
* CPU microcode mitigates the vulnerability: N/A
> STATUS: NOT VULNERABLE (your CPU vendor reported your CPU model as not vulnerable)
CVE-2018-3620 aka 'Foreshadow-NG (OS), L1 terminal fault'
* Mitigated according to the /sys interface: YES (Not affected)
* Kernel supports PTE inversion: YES (found in kernel image)
* PTE inversion enabled and active: NO
> STATUS: NOT VULNERABLE (Not affected)
CVE-2018-3646 aka 'Foreshadow-NG (VMM), L1 terminal fault'
* Information from the /sys interface:
* This system is a host running an hypervisor: NO
* Mitigation 1 (KVM)
* EPT is disabled: NO
* Mitigation 2
* L1D flush is supported by kernel: YES (found flush_l1d in kernel image)
* L1D flush enabled: UNKNOWN (unrecognized mode)
* Hardware-backed L1D flush supported: NO (flush will be done in software, this is slower)
* Hyper-Threading (SMT) is enabled: NO
> STATUS: NOT VULNERABLE (this system is not running an hypervisor)
> SUMMARY: CVE-2017-5753:OK CVE-2017-5715:OK CVE-2017-5754:OK CVE-2018-3640:OK CVE-2018-3639:OK CVE-2018-3615:OK CVE-2018-3620:OK CVE-2018-3646:OK
Need more detailed information about mitigation options? Use --explain
A false sense of security is worse than no security at all, see --disclaimer
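When this check is repeated across several servers, the final SUMMARY line is the easiest part to process automatically. A minimal sketch, assuming the line format shown above (space-separated `CVE-id:status` tokens), could look like this. The function name `parse_summary` is hypothetical and not part of the checker script.

```python
def parse_summary(line):
    """Map each CVE id in a checker '> SUMMARY:' line to its status string."""
    prefix = "> SUMMARY:"
    if not line.startswith(prefix):
        raise ValueError("not a summary line")
    result = {}
    for token in line[len(prefix):].split():
        # Split on the last colon so the CVE id itself stays intact.
        cve, _, status = token.rpartition(":")
        result[cve] = status
    return result


summary = ("> SUMMARY: CVE-2017-5753:OK CVE-2017-5715:OK CVE-2017-5754:OK "
           "CVE-2018-3640:OK CVE-2018-3639:OK CVE-2018-3615:OK "
           "CVE-2018-3620:OK CVE-2018-3646:OK")
statuses = parse_summary(summary)
not_ok = [cve for cve, status in statuses.items() if status != "OK"]
print("CVEs needing attention:", not_ok)
```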
SUT Settings - Linux¶
System provisioning is done by a combination of PXE boot unattended install and Ansible, as described in CSIT Testbed Setup.
Below is a subset of the running configuration:
- Xeon Haswell - Ubuntu 18.04.1 LTS
$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 18.04.1 LTS
Release: 18.04
Codename: bionic
- Xeon Skylake - Ubuntu 18.04 LTS
$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 18.04 LTS
Release: 18.04
Codename: bionic
Linux Boot Parameters¶
- isolcpus=<cpu number>-<cpu number> - used for all CPU cores apart from the first core of each socket; the isolated cores run VPP worker threads and Qemu/LXC processes. https://www.kernel.org/doc/Documentation/admin-guide/kernel-parameters.txt
- intel_pstate=disable - [X86] Do not enable intel_pstate as the default scaling driver for the supported processors. The Intel P-State driver decides which P-state (CPU core power state) to use based on the policy requested from the cpufreq core. [X86 - Either 32-bit or 64-bit x86] https://www.kernel.org/doc/Documentation/cpu-freq/intel-pstate.txt
- nohz_full=<cpu number>-<cpu number> - [KNL,BOOT] In kernels built with CONFIG_NO_HZ_FULL=y, set the specified list of CPUs whose tick will be stopped whenever possible. The boot CPU will be forced outside the range to maintain the timekeeping. The CPUs in this range must also be included in the rcu_nocbs= set. Specifies the adaptive-ticks CPU cores, causing kernel to avoid sending scheduling-clock interrupts to listed cores as long as they have a single runnable task. [KNL - Is a kernel start-up parameter, SMP - The kernel is an SMP kernel]. https://www.kernel.org/doc/Documentation/timers/NO_HZ.txt
- rcu_nocbs - [KNL] In kernels built with CONFIG_RCU_NOCB_CPU=y, set the specified list of CPUs to be no-callback CPUs, that never queue RCU callbacks (read-copy update). https://www.kernel.org/doc/Documentation/admin-guide/kernel-parameters.txt
- numa_balancing=disable - [KNL,X86] Disable automatic NUMA balancing.
- intel_iommu=on - [DMAR] Enable Intel IOMMU driver (DMAR) option.
- iommu=on, iommu=pt - [x86, IA-64] Disable IOMMU bypass, using IOMMU for PCI devices.
- nmi_watchdog=0 - [KNL,BUGS=X86] Debugging features for SMP kernels. Turn hardlockup detector in nmi_watchdog off.
- nosoftlockup - [KNL] Disable the soft-lockup detector.
- tsc=reliable - Disable clocksource stability checks for TSC. [x86] reliable: mark tsc clocksource as reliable, this disables clocksource verification at runtime, as well as the stability checks done at bootup. Used to enable high-resolution timer mode on older hardware, and in virtualized environment.
- hpet=disable - [X86-32,HPET] Disable HPET and use PIT instead.
Hugepages Configuration¶
Huge pages are managed via sysctl configuration located in /etc/sysctl.d/90-csit.conf on each testbed. Default huge page size is 2M. The exact number of huge pages depends on the testbed. All the values are defined in the Ansible inventory hosts files.
Applied Boot Cmdline¶
- Xeon Haswell - Ubuntu 18.04.1 LTS
$ cat /proc/cmdline
BOOT_IMAGE=/vmlinuz-4.15.0-36-generic root=UUID=5d2ecc97-245b-4e94-b0ae-c3548567de19 ro isolcpus=1-17,19-35 nohz_full=1-17,19-35 rcu_nocbs=1-17,19-35 numa_balancing=disable intel_pstate=disable intel_iommu=on iommu=pt nmi_watchdog=0 audit=0 nosoftlockup processor.max_cstate=1 intel_idle.max_cstate=1 hpet=disable tsc=reliable mce=off console=tty0 console=ttyS0,115200n8
- Xeon Skylake - Ubuntu 18.04 LTS
$ cat /proc/cmdline
BOOT_IMAGE=/vmlinuz-4.15.0-23-generic root=UUID=3fa246fd-1b80-4361-bb90-f339a6bbed51 ro isolcpus=1-27,29-55,57-83,85-111 nohz_full=1-27,29-55,57-83,85-111 rcu_nocbs=1-27,29-55,57-83,85-111 numa_balancing=disable intel_pstate=disable intel_iommu=on iommu=pt nmi_watchdog=0 audit=0 nosoftlockup processor.max_cstate=1 intel_idle.max_cstate=1 hpet=disable tsc=reliable mce=off console=tty0 console=ttyS0,115200n8
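The boot parameter descriptions require that CPUs listed in nohz_full= also appear in rcu_nocbs=. A minimal sketch of checking this on a captured command line follows. The helper names `expand_cpu_list` and `parse_cmdline` are hypothetical, and the sketch only handles plain `a-b` ranges and single CPU ids, which is all the command lines above use.

```python
def expand_cpu_list(spec):
    """Expand a kernel CPU list such as '1-17,19-35' into a set of ints."""
    cpus = set()
    for part in spec.split(","):
        if "-" in part:
            first, last = part.split("-")
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return cpus


def parse_cmdline(cmdline):
    """Split a kernel command line into {name: value}; bare flags map to None.

    A parameter given more than once (e.g. console=) keeps its last value.
    """
    params = {}
    for token in cmdline.split():
        name, sep, value = token.partition("=")
        params[name] = value if sep else None
    return params


# Excerpt of the Xeon Haswell command line shown above.
cmdline = ("isolcpus=1-17,19-35 nohz_full=1-17,19-35 rcu_nocbs=1-17,19-35 "
           "numa_balancing=disable nosoftlockup")
params = parse_cmdline(cmdline)
isolated = expand_cpu_list(params["isolcpus"])
nohz = expand_cpu_list(params["nohz_full"])
nocbs = expand_cpu_list(params["rcu_nocbs"])
print("isolated cores:", len(isolated))
print("nohz_full within rcu_nocbs:", nohz <= nocbs)
```

On a live system the same functions could be fed the contents of /proc/cmdline instead of the excerpt used here.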
Linux CFS Tunings¶
Linux CFS scheduler tunings are applied to all QEMU vCPU worker threads (the ones handling testpmd PMD threads) and VPP data plane worker threads. List of VPP data plane threads can be obtained by running:
$ for psid in $(pgrep vpp); do for tid in $(ps -Lo tid --pid $psid | grep -v TID); do echo $tid; done; done
Or:
$ cat /proc/`pidof vpp`/task/*/stat | awk '{print $1" "$2" "$39}'
Round-robin scheduling (SCHED_RR, a real-time policy that takes precedence over normal CFS tasks) with priority 1 is applied using:
$ for psid in $(pgrep vpp); do for tid in $(ps -Lo tid --pid $psid | grep -v TID); do chrt -r -p 1 $tid; done; done
More information about Linux CFS can be found in Sched manual pages.
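The same per-thread tuning can be sketched in Python. This is an illustration only, not the method used by CSIT: it lists thread ids from the `/proc/<pid>/task` directory and applies the policy with `os.sched_setscheduler`, which is available on Linux and needs root privileges for real-time policies. The function names and the `proc_root` parameter are hypothetical.

```python
import os


def list_thread_ids(pid, proc_root="/proc"):
    """Return the sorted thread ids (TIDs) found in <proc_root>/<pid>/task."""
    task_dir = os.path.join(proc_root, str(pid), "task")
    return sorted(int(name) for name in os.listdir(task_dir) if name.isdigit())


def set_round_robin(tid, priority=1):
    """Counterpart of `chrt -r -p <priority> <tid>`; Linux only, needs root."""
    os.sched_setscheduler(tid, os.SCHED_RR, os.sched_param(priority))


if __name__ == "__main__" and os.path.isdir("/proc/self/task"):
    # Demonstrate the listing on the current process; in practice the pid
    # would be that of a VPP process, followed by set_round_robin(tid).
    print(list_thread_ids(os.getpid()))
```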
Host Writeback Affinity¶
Writebacks are pinned to core 0. The same configuration is applied in host Linux and guest VM.
$ echo 1 | sudo tee /sys/bus/workqueue/devices/writeback/cpumask
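The value written to cpumask is a hexadecimal bitmask in which bit N selects core N, so `1` selects core 0 only. A small sketch of deriving such a mask from a list of core ids follows. The function name `cpumask_hex` is hypothetical, and the sketch does not produce the comma-separated grouping used for masks wider than 32 bits.

```python
def cpumask_hex(cpus):
    """Return the hexadecimal bitmask with one bit set per CPU id."""
    mask = 0
    for cpu in cpus:
        if cpu < 0:
            raise ValueError("CPU ids must be non-negative")
        mask |= 1 << cpu
    return format(mask, "x")


print(cpumask_hex([0]))     # core 0 only, matching the value written above
print(cpumask_hex([0, 1]))  # cores 0 and 1
```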
DUT Settings - VPP¶
VPP Version¶
VPP-19.04.2 release
VPP Compile Parameters¶
VPP Install Parameters¶
$ dpkg -i --force-all vpp*
VPP Startup Configuration¶
VPP startup configuration varies per test case, with different settings for the $$CORELIST_WORKERS, $$NUM_RX_QUEUES, $$UIO_DRIVER, $$NUM-MBUFS and $$NO_MULTI_SEG parameters. The default template is provided below:
ip
{
  heap-size 4G
}
statseg
{
  size 4G
}
unix
{
  cli-listen /run/vpp/cli.sock
  log /tmp/vpe.log
  nodaemon
}
ip6
{
  heap-size 4G
  hash-buckets 2000000
}
heapsize 4G
plugins
{
  plugin default
  {
    disable
  }
  plugin dpdk_plugin.so
  {
    enable
  }
}
cpu
{
  corelist-workers $$CORELIST_WORKERS
  main-core 1
}
dpdk
{
  num-mbufs $$NUM-MBUFS
  uio-driver $$UIO_DRIVER
  $$NO_MULTI_SEG
  log-level debug
  dev default
  {
    num-rx-queues $$NUM_RX_QUEUES
  }
  socket-mem 1024,1024
  no-tx-checksum-offload
  dev $$DEV_1
  dev $$DEV_2
}
Description of VPP startup settings used in CSIT is provided in Test Methodology.
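How the `$$` placeholders in the template are filled in is not shown here. As a rough sketch of the idea only, and not the CSIT implementation, plain string replacement is enough. The function name `render_startup_conf` and all values below are hypothetical. Python's `string.Template` is deliberately avoided because it treats `$$` as an escaped dollar sign.

```python
def render_startup_conf(template, values):
    """Replace each $$NAME placeholder in template with values[NAME]."""
    rendered = template
    # Longest names first so one placeholder can never clobber a longer one.
    for name in sorted(values, key=len, reverse=True):
        rendered = rendered.replace("$$" + name, str(values[name]))
    if "$$" in rendered:
        raise ValueError("unresolved placeholder left in configuration")
    return rendered


# Excerpt of the template above.
template = """cpu
{
  corelist-workers $$CORELIST_WORKERS
  main-core 1
}
dpdk
{
  num-mbufs $$NUM-MBUFS
  uio-driver $$UIO_DRIVER
  dev default
  {
    num-rx-queues $$NUM_RX_QUEUES
  }
}
"""
# Hypothetical values, chosen for illustration only.
values = {
    "CORELIST_WORKERS": "2-3",
    "NUM-MBUFS": 16384,
    "UIO_DRIVER": "vfio-pci",
    "NUM_RX_QUEUES": 1,
}
rendered = render_startup_conf(template, values)
print(rendered)
```

Raising an error on any leftover `$$` catches a test case that forgets to supply one of the parameters before VPP is started with an invalid file.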
TG Settings - TRex¶
TG Version¶
TRex v2.35
DPDK Version¶
DPDK v17.11
TG Build Script Used¶
TG Startup Configuration¶
$ cat /etc/trex_cfg.yaml
- port_limit : 2
  version : 2
  interfaces : ["0000:0d:00.0","0000:0d:00.1"]
  port_info :
    - dest_mac : [0x3c,0xfd,0xfe,0x9c,0xee,0xf5]
      src_mac : [0x3c,0xfd,0xfe,0x9c,0xee,0xf4]
    - dest_mac : [0x3c,0xfd,0xfe,0x9c,0xee,0xf4]
      src_mac : [0x3c,0xfd,0xfe,0x9c,0xee,0xf5]
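In the configuration above, each port's dest_mac equals the other port's src_mac, so the two TG ports address each other. The sketch below checks that property on the same values, written out as Python data to avoid depending on a YAML library. The helper names `mac_to_str` and `is_cross_connected` are hypothetical and not part of TRex.

```python
def mac_to_str(octets):
    """Format a list of six integers as a colon-separated MAC address."""
    if len(octets) != 6 or any(not 0 <= o <= 0xFF for o in octets):
        raise ValueError("a MAC address needs six octets in 0..255")
    return ":".join(f"{o:02x}" for o in octets)


def is_cross_connected(port_info):
    """True if each of the two ports uses the other's src_mac as dest_mac."""
    first, second = port_info
    return (first["dest_mac"] == second["src_mac"]
            and second["dest_mac"] == first["src_mac"])


# Values copied from the port_info section above.
port_info = [
    {"dest_mac": [0x3c, 0xfd, 0xfe, 0x9c, 0xee, 0xf5],
     "src_mac": [0x3c, 0xfd, 0xfe, 0x9c, 0xee, 0xf4]},
    {"dest_mac": [0x3c, 0xfd, 0xfe, 0x9c, 0xee, 0xf4],
     "src_mac": [0x3c, 0xfd, 0xfe, 0x9c, 0xee, 0xf5]},
]
for index, port in enumerate(port_info):
    print(index, mac_to_str(port["src_mac"]), "->", mac_to_str(port["dest_mac"]))
print("cross-connected:", is_cross_connected(port_info))
```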
TG Startup Command¶
$ sh -c 'cd <t-rex-install-dir>/scripts/ && sudo nohup ./t-rex-64 -i -c 7 --iom 0 > /tmp/trex.log 2>&1 &'> /dev/null