{"id":79284,"date":"2022-09-27T04:56:04","date_gmt":"2022-09-27T04:56:04","guid":{"rendered":"https:\/\/harchi90.com\/linux-6-0-merges-the-amd-performance-fix-for-the-old-dummy-wait-workaround\/"},"modified":"2022-09-27T04:56:04","modified_gmt":"2022-09-27T04:56:04","slug":"linux-6-0-merges-the-amd-performance-fix-for-the-old-dummy-wait-workaround","status":"publish","type":"post","link":"https:\/\/harchi90.com\/linux-6-0-merges-the-amd-performance-fix-for-the-old-dummy-wait-workaround\/","title":{"rendered":"Linux 6.0 Merges The AMD Performance Fix For The Old “Dummy Wait” Workaround"},"content":{"rendered":"
\n
<\/div>\n

This morning I called attention to pending work on a 20-year-old chipset workaround in the Linux kernel that had been hurting modern AMD systems because it was still being applied, erroneously, to modern hardware. Luckily, that patch has now been picked up by Linus Torvalds in time for the Linux 6.0 kernel, which is expected to make its stable debut next weekend.<\/p>\n

As outlined in that earlier article, a “dummy wait” operation has been in the kernel since ACPI support was added in 2002, because on some chipsets of that era STPCLK# wasn’t asserted in time along the kernel’s idle path. The dummy I\/O read delays further instruction processing until the CPU is fully stopped. But an AMD engineer recently noticed this behavior being applied on modern AMD Zen 3 hardware and found that it could lead to performance issues for workloads rapidly switching between busy and idle phases, especially on larger core count systems like Ryzen Threadripper and EPYC platforms.
\n<\/p>\n


\n
Sampling certain workloads with IBS on AMD Zen3 system shows that a significant amount of time is spent in the dummy op, which incorrectly gets accounted as C-State residency. A large C-State residency value can prime the cpuidle governor to recommend a deeper C-State during the subsequent idle instances, starting a vicious cycle, leading to performance degradation on workloads that rapidly switch between busy and idle phases.<\/p>\n

One such workload is tbench where a massive performance degradation can be observed during certain runs.<\/p>\n<\/blockquote>\n
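The feedback loop described in that quote can be illustrated with a toy model. This is only a sketch under loud assumptions: the three C-states, their target residencies, and the rule of using the last measured residency as the next prediction are all hypothetical and far simpler than the kernel’s real cpuidle governors. It shows only how time spent in the dummy op, once counted as idle residency, can tip the choice toward a deeper state.

```python
# Toy model only: this is NOT the kernel's cpuidle governor logic.
# Hypothetical C-states as (name, target_residency_us), shallow to deep.
C_STATES = [('C1', 2), ('C2', 20), ('C3', 400)]


def pick_state(predicted_idle_us):
    '''Return the deepest state whose target residency fits the prediction.'''
    choice = C_STATES[0][0]
    for name, target_us in C_STATES:
        if target_us <= predicted_idle_us:
            choice = name
    return choice


def simulate(true_idle_us, dummy_op_us, rounds=3):
    '''Pick a state per round, feeding measured residency back as the prediction.'''
    predicted = true_idle_us
    picks = []
    for _ in range(rounds):
        picks.append(pick_state(predicted))
        # Assumption: the dummy op's time is wrongly counted as idle residency.
        predicted = true_idle_us + dummy_op_us
    return picks


print(simulate(true_idle_us=10, dummy_op_us=0))
print(simulate(true_idle_us=10, dummy_op_us=15))
```

With no dummy-op overhead the toy governor stays in the shallowest state. With the inflated residency it moves to a deeper one after the first round, even though the real idle time never changed.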

AMD engineer K Prateek Nayak showed the significant performance impact that this erroneous workaround can have on modern AMD systems. Intel systems meanwhile don’t use this code path on modern hardware and are thus unaffected.<\/p>\n

An AMD patch was originally suggested but then cleaned up and simplified by Intel engineer Dave Hansen. That patch simply doesn’t apply this “dummy wait” workaround except on older (pre-Nehalem) Intel systems, so AMD systems will now forgo the operation that can degrade performance on modern hardware. Since it mostly impacts workloads that switch often between busy and idle states and is more noticeable on larger core count systems, AMD EPYC server performance with this patch should be quite interesting, especially for web server \/ database workloads and other similarly bursty tests. I’ll be firing up a complete set of wide-ranging benchmarks evaluating this patch tomorrow.
\n<\/p>\n
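To make the gating concrete, here is a minimal sketch of the decision as described above. It is not the kernel patch itself: the `Cpu` type, the `pre_nehalem` flag, and both function names are hypothetical, and the real check in the kernel may well be expressed differently. The step names are illustrative labels rather than kernel functions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cpu:
    vendor: str        # illustrative labels such as 'intel' or 'amd'
    pre_nehalem: bool  # hypothetical flag marking old Intel hardware


def needs_dummy_wait(cpu):
    '''Apply the legacy dummy read only on old (pre-Nehalem) Intel systems.'''
    return cpu.vendor == 'intel' and cpu.pre_nehalem


def enter_idle(cpu):
    '''Return the ordered steps a hypothetical idle entry would take.'''
    steps = ['request C-state']
    if needs_dummy_wait(cpu):
        # The extra read that stalls until the CPU is fully stopped.
        steps.append('dummy I/O read')
    steps.append('idle')
    return steps


print(enter_idle(Cpu(vendor='amd', pre_nehalem=False)))
print(enter_idle(Cpu(vendor='intel', pre_nehalem=True)))
```

In this sketch an AMD system skips the extra read entirely, while an old Intel system keeps the original three-step sequence.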

<\/p>\n

The patch was mainlined this evening with the “x86\/urgent” fixes pulled ahead of Linux 6.0’s expected stable release on 2 October. Great to see it land quickly; stay tuned for some benchmarks.<\/div>\n

<\/p>\n","protected":false},"excerpt":{"rendered":"

This morning I called attention to pending work on a 20-year-old chipset workaround in the Linux kernel that had been hurting modern AMD systems because it was still being applied, erroneously, to modern hardware. Luckily, that patch has now been picked up by Linus Torvalds in time for the Linux 6.0 kernel expected …<\/p>\n

Linux 6.0 Merges The AMD Performance Fix For The Old “Dummy Wait” Workaround<\/span> Read More »<\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"site-sidebar-layout":"default","site-content-layout":"default","ast-global-header-display":"","ast-main-header-display":"","ast-hfb-above-header-display":"","ast-hfb-below-header-display":"","ast-hfb-mobile-header-display":"","site-post-title":"","ast-breadcrumbs-content":"","ast-featured-img":"","footer-sml-layout":"","theme-transparent-header-meta":"","adv-header-id-meta":"","stick-header-meta":"","header-above-stick-meta":"","header-main-stick-meta":"","header-below-stick-meta":"","spay_email":"","jetpack_publicize_message":"","jetpack_is_tweetstorm":false,"jetpack_publicize_feature_enabled":true},"categories":[4],"tags":[1111,1110,1108,1107,1114,1112,1109,1113,1106,1117,1115,1116],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack-related-posts":[{"id":78541,"url":"https:\/\/harchi90.com\/a-20-year-old-chipset-workaround-has-been-hurting-modern-amd-linux-systems\/","url_meta":{"origin":79284,"position":0},"title":"A 20 Year Old Chipset Workaround Has Been Hurting Modern AMD Linux Systems","date":"September 26, 2022","format":false,"excerpt":"AMD engineer K Prateek Nayak recently uncovered that a ~20 year old chipset workaround in the Linux kernel still being applied to modern AMD systems is responsible in some cases for hurting performance on modern Zen hardware. 
Fortunately, a fix is \u200b\u200bon the way for limiting that workaround to old\u2026","rel":"","context":"In "Technology"","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":33086,"url":"https:\/\/harchi90.com\/linux-6-0-has-some-big-scheduler-changes-including-improved-numa-balancing-for-amd-zen\/","url_meta":{"origin":79284,"position":1},"title":"Linux 6.0 Has Some Big Scheduler Changes, Including Improved NUMA Balancing For AMD Zen","date":"August 2, 2022","format":false,"excerpt":"Ingo Molnar today submitted the main set of kernel scheduler updates for the in-development Linux 6.0 (nee 5.20). The scheduler updates contain some notable changes that will be interesting to benchmark in the days ahead. First up, there is improved NUMA balancing on AMD Zen systems for affine workloads. That\u2026","rel":"","context":"In "Technology"","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":45704,"url":"https:\/\/harchi90.com\/linux-6-0-rc1-released-with-exciting-performance-optimizations-new-hardware-support\/","url_meta":{"origin":79284,"position":2},"title":"Linux 6.0-rc1 Released With Exciting Performance Optimizations, New Hardware Support","date":"August 15, 2022","format":false,"excerpt":"After the two week long merge window, Linus Torvalds released this afternoon the first release candidate of Linux 6.0. Over the next roughly two months the Linux 6.0 kernel will stabilize but already from my early testing on various systems it is in nice shape and the features and performance\u2026","rel":"","context":"In "Technology"","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":85186,"url":"https:\/\/harchi90.com\/the-most-interesting-new-features-of-linux-6-0\/","url_meta":{"origin":79284,"position":3},"title":"The Most Interesting New Features Of Linux 6.0","date":"October 3, 2022","format":false,"excerpt":"We Need Your Support: This site is primarily supported by advertisements. 
Ads are what have allowed this site to be maintained on a daily basis for the past 18+ years. We do our best to ensure only clean, relevant ads are shown, when any nasty ads are detected, we work\u2026","rel":"","context":"In "Technology"","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":31643,"url":"https:\/\/harchi90.com\/linux-5-19-released-linus-torvalds-released-it-from-an-apple-silicon-macbook\/","url_meta":{"origin":79284,"position":4},"title":"Linux 5.19 Released – Linus Torvalds Released It From An Apple Silicon MacBook","date":"August 1, 2022","format":false,"excerpt":"Linus Torvalds just released Linux 5.19 as stable for the newest version of the Linux kernel. He also mentioned this is the first time he released the new Linux kernel from an ARM64 laptop in the form of an Apple MacBook running an AArch64 Apple M1 SoC. Linux 5.19 brings\u2026","rel":"","context":"In "Technology"","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":46620,"url":"https:\/\/harchi90.com\/linux-6-0-supporting-new-intel-amd-hardware-performance-improvements-much-more\/","url_meta":{"origin":79284,"position":5},"title":"Linux 6.0 Supporting New Intel\/AMD Hardware, Performance Improvements & Much More","date":"August 16, 2022","format":false,"excerpt":"Yesterday marked the release of Linux 6.0-rc1 and as such the merge window is now over and no more feature work is set to land in this kernel version. Here is my write-up of all the interesting new features and changes\/improvements coming for Linux 6.0. 
This kernel was originally going\u2026","rel":"","context":"In "Technology"","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]}],"fifu_image_url":"https:\/\/www.phoronix.net\/image.php?id=2022&image=dummy_op_fix","_links":{"self":[{"href":"https:\/\/harchi90.com\/wp-json\/wp\/v2\/posts\/79284"}],"collection":[{"href":"https:\/\/harchi90.com\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/harchi90.com\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/harchi90.com\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/harchi90.com\/wp-json\/wp\/v2\/comments?post=79284"}],"version-history":[{"count":0,"href":"https:\/\/harchi90.com\/wp-json\/wp\/v2\/posts\/79284\/revisions"}],"wp:attachment":[{"href":"https:\/\/harchi90.com\/wp-json\/wp\/v2\/media?parent=79284"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/harchi90.com\/wp-json\/wp\/v2\/categories?post=79284"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/harchi90.com\/wp-json\/wp\/v2\/tags?post=79284"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}