AMD working on 64-core, 128-thread processor

DrJohnZoidberg

Honorary Master
Joined
Jul 24, 2006
Messages
21,008
#21
At the end of the day, until core/thread utilization is handled by something more suitable than the specific application, large core numbers won't be properly utilized in most use cases.
Well that's why these processors exist, for people that have specific workloads that utilize many cores.

My 8 core / 16 thread 2700x gets really long in the tooth the 17th time I'm compiling the same thing - give me moar cores!!

My poor CPU right now:

1560508714477.png
 

gamer16

Executive Member
Joined
Nov 3, 2013
Messages
7,863
#22
:ROFL: Imagine how good it would be if Windows worked in such a way that any application utilizes all cores.
 

PhireSide

Executive Member
Joined
Dec 31, 2006
Messages
8,411
#24
:ROFL: Imagine how good it would be if Windows worked in such a way that any application utilizes all cores.
Windows 10 version 1903 handles Ryzen processors better than 1809, as the scheduler has been changed.

That being said, a lot of people (especially those with Threadripper) prefer using Process Lasso to pin specific tasks to certain cores and to avoid having one process split across two core complexes.
 

DrJohnZoidberg

Honorary Master
Joined
Jul 24, 2006
Messages
21,008
#25
You are an ideal candidate for the new Ryzen 9 3950x. Double the cores, double the threads and (probably) double the cost :p
You know it! I will be upgrading, pity it's not coming out in July with the rest of the 3000 series. Will just have to hold out until then.
 

DrJohnZoidberg

Honorary Master
Joined
Jul 24, 2006
Messages
21,008
#26
:ROFL: Imagine how good it would be if Windows worked in such a way that any application utilizes all cores.
That all depends on the app developer: if an app isn't written to utilize multiple threads, there isn't much Windows can do about it.
 

cguy

Expert Member
Joined
Jan 2, 2013
Messages
4,617
#27
You know it! I will be upgrading, pity it's not coming out in July with the rest of the 3000 series. Will just have to hold out until then.
We have a distributed build setup on our supercomputer, so we get maximum parallelism every time. Our 56-core boxes were too slow! :)
 

PhireSide

Executive Member
Joined
Dec 31, 2006
Messages
8,411
#29
You know it! I will be upgrading, pity it's not coming out in July with the rest of the 3000 series. Will just have to hold out until then.
I myself will be looking at getting maybe a 2600x or 2700x once Zen 2 drops, to replace my 1600.

A 3700x would be a lovely drop-in but I will wait for the prices to drop a little more.
 

Swa

Honorary Master
Joined
May 4, 2012
Messages
20,165
#31
New chipset name.

At the end of the day, until core/thread utilization is handled by something more suitable than the specific application, large core numbers won't be properly utilized in most use cases.
Threadripper's strength is the quad-channel memory and not the core count. Indeed, the multi-CPU parts didn't perform so well.
 

Sapphiron

Centadel
Company Rep
Joined
Jan 29, 2004
Messages
2,004
#32
Even with inefficient apps there is an advantage of so many cores.

While videos are rendering, I am likely to watch some 4K YouTube while playing a strategy game, all on the same CPU.
 

CT_Biker

Expert Member
Joined
Sep 10, 2016
Messages
1,388
#33
AMD have come a long way since the Athlon X2. (That was the last AMD CPU I used - Intel are still too pricey.)

LOL those old Athlons never had a lot of cache, so it was a nightmare to do more than one thing on them.
 

John Tempus

Expert Member
Joined
Aug 8, 2017
Messages
2,007
#35
Would love these higher core/thread server processors as long as they sustain high clocks per core.

All these Intel 36/72 etc. server chips are so downclocked per core that it becomes nearly pointless, and you lose the entire economies of scale with it.

For example, the 72-core chip linked to in this thread is running a lightspeed-shattering 1.5GHz per core, what the actual fk. Absolutely pointless.
 

cguy

Expert Member
Joined
Jan 2, 2013
Messages
4,617
#36
It’s a Xeon Phi. It has multiple AVX512 pipes, so its FLOPS per core are way higher than those of higher-clocked Xeons, especially at the time it was released, which was long before Skylake-X. Still not as fast as competing GPUs though. It also has higher bandwidth and special instructions.
 

John Tempus

Expert Member
Joined
Aug 8, 2017
Messages
2,007
#37
It’s a Xeon Phi. It has multiple AVX512 pipes, so its FLOPS per core are way higher than those of higher-clocked Xeons, especially at the time it was released, which was long before Skylake-X. Still not as fast as competing GPUs though. It also has higher bandwidth and special instructions.
Yes, I'm aware; this example is not exactly my point.

Let's look at the top-end 28-core/56-thread Xeon chip.

https://ark.intel.com/content/www/u...inum-8180-processor-38-5m-cache-2-50-ghz.html

That's effectively 2.5GHz per core; turbo speed is meaningless to look at if your server chips are fully utilized. That's 28 cores @ 2.5GHz = 70GHz.

Then look at the 10-core/20-thread Intel Xeon.

https://ark.intel.com/content/www/u...-processor-e5-2690-v2-25m-cache-3-00-ghz.html

Each core's base clock is 3GHz. That's 10 cores @ 3GHz = 30GHz.

So this is what I mean: with each iteration of more cores per chip they are dropping the base core clocks, and this removes much of the scaling economics that you are after when utilizing more cores per chip.

You get 2.33x the aggregate clock with 2.8x the available cores. Also, the available L3 cache is not at all proportional: with the 28-core you get 38.5MB of cache vs 25MB on the 10-core.

Their next iterations will start to come out at 2GHz per core or even lower, as they have already done with other Xeon chips.
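
For anyone who wants to check the arithmetic in this post, here is a minimal Python sketch. The core counts, base clocks and L3 sizes are the ones quoted above; the chip labels are taken from the ARK link slugs, and the helper name `aggregate_ghz` is made up for illustration. "Aggregate GHz" is only cores times base clock, a rough yardstick and not a performance measurement.

```python
# Figures as quoted in the post: cores, base clock (GHz), L3 cache (MB).
chips = {
    "Xeon Platinum 8180": {"cores": 28, "base_ghz": 2.5, "l3_mb": 38.5},
    "Xeon E5-2690 v2": {"cores": 10, "base_ghz": 3.0, "l3_mb": 25.0},
}


def aggregate_ghz(chip):
    """Hypothetical helper: cores multiplied by base clock, in GHz."""
    return chip["cores"] * chip["base_ghz"]


big = chips["Xeon Platinum 8180"]
small = chips["Xeon E5-2690 v2"]

core_ratio = big["cores"] / small["cores"]
clock_ratio = aggregate_ghz(big) / aggregate_ghz(small)
cache_ratio = big["l3_mb"] / small["l3_mb"]

print(f"{aggregate_ghz(big):.0f} GHz vs {aggregate_ghz(small):.0f} GHz")
print(f"cores x{core_ratio:.2f}, aggregate clock x{clock_ratio:.2f}, L3 x{cache_ratio:.2f}")
```

This reproduces the 70GHz vs 30GHz totals and the 2.8x and 2.33x ratios, and puts the L3 growth at roughly 1.54x.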
 

cguy

Expert Member
Joined
Jan 2, 2013
Messages
4,617
#38
I’m not sure what your point is - generally, they are trying to achieve better perf/W, perf/$ or perf/rack-unit by going to higher core counts. In some cases they actually do have better absolute flops (you are comparing old AVX machines to AVX512 in some cases). If you are talking integer, scalar single threaded performance, then for sure the higher clocked, lower core count machines win in this situation, but for many loads, more cores, more flops/clock, etc. win out over single threaded performance.

Also be aware that Skylake-X uses a non-inclusive caching hierarchy, so L3 size is not directly comparable to earlier architectures.
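
The flops-versus-clock argument can be sketched with the usual theoretical-peak formula: cores x clock x FLOPs per cycle per core. The Python below is only an illustration. The core counts and base clocks are the ones quoted earlier in the thread, but the FLOPs-per-cycle values (8 for 256-bit AVX without FMA, 32 for two 512-bit FMA units) are assumptions added here, not figures from the thread, and real chips tend to run below base clock under heavy AVX-512 load.

```python
def peak_gflops(cores, clock_ghz, flops_per_cycle):
    """Theoretical peak in GFLOPS: cores x clock (GHz) x FLOPs per cycle per core."""
    return cores * clock_ghz * flops_per_cycle


# ASSUMED double-precision FLOPs per cycle per core (illustrative only):
#   8  = 256-bit AVX without FMA (one 4-wide add + one 4-wide multiply)
#   32 = two 512-bit FMA units (2 units x 8 lanes x 2 ops per FMA)
older = peak_gflops(cores=10, clock_ghz=3.0, flops_per_cycle=8)
newer = peak_gflops(cores=28, clock_ghz=2.5, flops_per_cycle=32)

print(f"older: {older:.0f} GFLOPS, newer: {newer:.0f} GFLOPS, ratio x{newer / older:.2f}")
```

Under these assumptions the lower-clocked part comes out far ahead on paper, which is why a cores x GHz comparison alone can mislead for vectorised floating-point work.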
 

John Tempus

Expert Member
Joined
Aug 8, 2017
Messages
2,007
#39
My point is that Intel is bringing out more cores per chip on the same nm process, which is why they are forced to reduce clock speed per core, while AMD is actually managing to shrink its process, which means they could keep clock speeds high and stack up the same or higher core counts.

Intel is falling behind: without shrinking their dies, they simply cannot accommodate higher clock speeds per core and increase the core count.

I am hoping AMD stick to high clocks per core on these new monster multicore server chips, since it would force Intel to wake the fk up and get their dies shrunk. If Intel are actually forced to match AMD's process, the Intel chips will likely outperform AMD again. But if Intel shrug this off and only continue to add more cores on their current outdated process, then AMD will aggressively own the new server lineups.
 

Swa

Honorary Master
Joined
May 4, 2012
Messages
20,165
#40
What people don't know is that Intel has given up on 10nm, and no matter what they claim, their only roadmap is laptop processors on what they have and maybe a few 2- or 4-core processors later this year or next year. There isn't going to be any proper competitor to AMD's 7nm chips until at least 2021.
 