AMD working on 64-core, 128-thread processor

Johnatan56

Honorary Master
Joined
Aug 23, 2013
Messages
24,572
#43
Windows 10 version 1903 handles Ryzen processors better than 1809 because of changes made to the scheduler.

That being said, a lot of people (especially those with Threadripper) prefer using Process Lasso to pin specific tasks to certain cores and to avoid having one process split across two core complexes.
Supposedly the new third gen will try to keep threads that need to interact with each other together in one core complex rather than splitting them.
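
For anyone wanting to try the pinning idea without Process Lasso, here is a rough sketch of the same thing on Linux using Python's `os.sched_setaffinity`. The helper names `ccx_cpus` and `pin_to_ccx` are made up for illustration, and the layout (4 cores per core complex, numbered contiguously, SMT siblings ignored) is an assumption; check `lscpu -e` for the real topology before relying on it.

```python
import os


def ccx_cpus(ccx_index, cores_per_ccx=4):
    """Logical CPU ids of one core complex.

    Assumes contiguous numbering with cores_per_ccx cores per complex,
    which is a hypothetical layout, not something read from the hardware.
    """
    start = ccx_index * cores_per_ccx
    return set(range(start, start + cores_per_ccx))


def pin_to_ccx(ccx_index, cores_per_ccx=4):
    """Restrict the calling process to a single core complex (Linux only).

    Returns the new affinity set, or None if pinning is not possible.
    """
    if not hasattr(os, "sched_setaffinity"):
        return None  # not available on this platform (e.g. Windows, macOS)
    allowed = os.sched_getaffinity(0)
    # Only ask for CPUs we are actually allowed to run on.
    target = ccx_cpus(ccx_index, cores_per_ccx) & allowed
    if not target:
        return None
    os.sched_setaffinity(0, target)
    return os.sched_getaffinity(0)


if __name__ == "__main__":
    print("now running on CPUs:", pin_to_ccx(0))
```

Threads started after the call inherit the restricted affinity, so the whole process stays inside one core complex.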
 

cguy

Expert Member
Joined
Jan 2, 2013
Messages
4,631
#44
My point is that Intel is bringing out more cores per chip on the same nm process, which is why they are forced to reduce clock speed per core, while AMD is actually managing to shrink its process, which means they could keep clock speeds high and stack up the same or higher core counts.

Intel is falling behind: without shrinking their dies, they simply cannot accommodate higher clock speeds per core and increase the core count.

I am hoping AMD stick to high clocks per core on these new monster multicore server chips, since it would force Intel to wake the fk up and get their dies shrunk. If Intel are actually forced to match AMD's nm process, the Intel chips will likely outperform AMD again. But if Intel shrug this off and only keep adding more cores on their current outdated nm process, then AMD will aggressively own the new server lineups.
A process advantage is nice to have (not disputing that), and Intel is definitely falling behind in this area. Thus far though, Intel’s chips have all been vastly superior to AMD’s for most of our workloads. Even with this current generation, Intel has 2x the peak FLOPs per core per clock vs this processor.
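
A quick back-of-the-envelope for where a 2x figure like that can come from. This is a sketch under stated assumptions, not a measurement: it assumes two 512-bit FMA units per core on the Intel side (AVX-512 on the higher-end Xeon Scalable parts) and two 256-bit FMA units per core on the AMD side, and it counts double-precision FLOPs with one FMA as two operations. The function name is made up for illustration.

```python
def peak_dp_flops_per_cycle(fma_units, vector_bits):
    """Peak double-precision FLOPs per core per clock.

    Each FMA unit processes vector_bits / 64 doubles per cycle, and one
    fused multiply-add counts as two floating-point operations.
    """
    lanes = vector_bits // 64
    return fma_units * lanes * 2


# Assumed core configurations (not taken from the thread).
intel_avx512 = peak_dp_flops_per_cycle(fma_units=2, vector_bits=512)
amd_avx2 = peak_dp_flops_per_cycle(fma_units=2, vector_bits=256)

print(intel_avx512, amd_avx2, intel_avx512 / amd_avx2)  # 32 16 2.0
```

Note this is peak throughput only; it says nothing about sustained clocks under AVX-512 load or about code that cannot use the wider vectors.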

The caches on Intel are also much better from a bandwidth, latency and size perspective. Also, there are fewer NUMA effects due to the non-chiplet design. I haven’t yet measured this new 64-core AMD chip, but I did measure the last 32-core one and the results were pretty grim.

Intel has lost the process advantage for now, but they still have an architectural advantage.
 

Johnatan56

Honorary Master
Joined
Aug 23, 2013
Messages
24,572
#45
The 32-core had known issues with the scheduler. I don't think MS has released a patch; Linux got one in January. This generation of the top-of-the-line EPYC shouldn't have the same issue and, AFAIK, will have better single-core and multi-core performance.

The changes in thread handling should also help.

When I was reading the AnandTech article, I was actually thinking of asking whether you could test one if you get your hands on it. Remember, we had a discussion about it in a previous mybb thread.
 

cguy

Expert Member
Joined
Jan 2, 2013
Messages
4,631
#46
Yup, I remember. We will probably get a prerelease at some point. Looking forward to testing, since the other one was pretty meh. The thread scheduling won’t make a difference for us, since we bind the threads ourselves and don’t oversubscribe the cores.
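
To illustrate what "bind the threads ourselves and don't oversubscribe" means in practice, here is a minimal Linux-only sketch in Python, not our actual code: one worker thread per allowed CPU, each thread pinned to its own CPU. `run_bound_workers` and the `work` callback are hypothetical names. On Linux, `os.sched_setaffinity(0, ...)` applies to the calling thread, which is what makes per-thread binding work here.

```python
import os
import threading


def run_bound_workers(work, cpus=None):
    """Run work(cpu) once per CPU, each call in a thread bound to that CPU.

    One thread per CPU means the cores are never oversubscribed, so the OS
    scheduler has no placement decisions left to make.
    """
    can_pin = hasattr(os, "sched_setaffinity")
    if cpus is None:
        # Fall back to a single unpinned worker on non-Linux platforms.
        cpus = sorted(os.sched_getaffinity(0)) if can_pin else [0]
    results = {}

    def worker(cpu):
        if can_pin:
            os.sched_setaffinity(0, {cpu})  # pid 0 = the calling thread
        results[cpu] = work(cpu)

    threads = [threading.Thread(target=worker, args=(cpu,)) for cpu in cpus]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    print(run_bound_workers(lambda cpu: f"worker on cpu {cpu}"))
```

A real compute code would do this in C or C++ with NUMA-aware memory allocation as well; the sketch only shows the binding pattern.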

The NUMA effects are inherent in the chiplet design, but I am hoping for better cache performance than before. Skylake’s 1 MB L2 is a real beast to compete with. The AMD chip won’t compete on peak FLOPs, but we do have some things that don’t benefit from AVX-512, where this may not matter.
 