August 21, 2001

Review: AMD Duron 'Morgan' 1GHz CPU

Author: JT Smith

- By Jeff Field -

AMD has again released its Duron value CPU at a higher clock speed, 1GHz. This time around AMD has also updated the core with some new features, and of course, with new features comes a new codename. Read on to find out how the AMD Duron "Morgan" handles the Penguin.
(Editor's note: NewsForge normally doesn't run two hardware reviews on the same day, but the AMD embargo on reviewing the Morgan has just been lifted, and we thought readers would appreciate this timely review.)

The chip

Here is a photo comparison of the two versions of the Duron CPU:

Duron 'Spitfire' Duron 'Morgan'

Below is my original summary of the Spitfire-core Duron and its comparison to the Intel Celeron:

The Intel Celeron is essentially a Pentium III-Coppermine with half of the L2 cache disabled, running on a 66MHz front side bus. This is good for Intel because if a chip has problems with the L2 cache, Intel can disable half of the cache and repackage the chip as a Celeron. Where this becomes a problem is that the cache goes from being eight-way set associative to four-way. Set associativity makes the L2 cache more efficient so the CPU spends less time waiting for data from system RAM. By halving the associativity of the Celeron's L2 cache, Intel cut the cache hit rate, which is crucial: a lower hit rate means more time wasted fetching data from system RAM.

The Duron, however, is very much its own CPU. Designed from the ground up as a low-end processor, it is not "crippled" like a Celeron. It has half the L2 cache of a Celeron, 64KB, but features 128KB of L1 cache compared to 32KB on a Celeron. The L2 cache on the Duron is 16-way set associative, which gives it a high L2 hit rate. This proves crucial, because it allows the Duron to outperform a comparably clocked Celeron by a wide margin.
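To illustrate why associativity affects the hit rate, here is a minimal sketch of a set-associative cache simulator. It assumes LRU replacement, a toy 64-line cache and a synthetic access pattern chosen to provoke conflicts; none of these reflect the actual Celeron or Duron cache designs, and the function name hit_rate is hypothetical.

```python
from collections import OrderedDict


def hit_rate(addresses, num_lines, ways, line_size=64):
    """Return the fraction of accesses that hit in a toy LRU set-associative cache."""
    num_sets = num_lines // ways
    # One OrderedDict per set; the oldest entry is the least recently used.
    sets = [OrderedDict() for _ in range(num_sets)]
    hits = 0
    for addr in addresses:
        block = addr // line_size
        index = block % num_sets
        tag = block // num_sets
        lines = sets[index]
        if tag in lines:
            hits += 1
            lines.move_to_end(tag)  # mark as most recently used
        else:
            if len(lines) >= ways:
                lines.popitem(last=False)  # evict the least recently used line
            lines[tag] = True
    return hits / len(addresses)


# Assumed workload: eight blocks spaced 4096 bytes apart, visited round-robin.
# In a 64-line cache they all compete for the same set.
pattern = [i * 4096 for i in range(8)] * 100

for ways in (4, 8, 16):
    print(f"{ways:2d}-way: hit rate {hit_rate(pattern, 64, ways):.2f}")
```

In this contrived pattern the four-way configuration misses on every access, while the eight-way and 16-way configurations hit on all but the first pass. Real workloads show far smaller differences, but the direction of the effect is the same.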

The chip itself is very similar to the Thunderbird CPUs, because it uses the same "flipchip" configuration (with the CPU core on the top) and the same interface -- Socket A. The Duron, however, differs from the Thunderbird greatly. It has a much smaller core than the Thunderbird, using 12 million fewer transistors (25 million on the Duron versus 37 million on the Thunderbird), thanks to the reduction of the amount of cache on the chip. This greatly decreases the heat produced by the CPU, which is always good for a CPU targeted toward lower-end machines, which tend not to be as spacious and well cooled as machines on the higher end.

The original Duron, codenamed "Spitfire," is the chip I discuss in the paragraphs above. A lot of the information there holds true for the new Duron, codenamed "Morgan." However, AMD made several architectural changes to the Duron which, while not revolutionary, do help performance.

The differences between the Morgan and the Spitfire core Durons are as important as the changes between the Thunderbird and the Palomino. First, the Morgan core supports new processor technologies such as SSE, Intel's on-chip instruction set extension aimed at increasing 3D/floating point performance (AMD calls its implementation 3DNow! Professional). This means that any software looking for the SSE instructions of an Intel chip will also find and use them on the Morgan core Durons and Palomino core Athlons.

The other significant change is also present on the new Palomino Athlons, as well as all versions of the Intel Pentium 4. This feature is hardware data pre-fetch. Essentially, hardware pre-fetch works much like the cache mechanisms on a hard disk -- the CPU uses an algorithm to try to figure out what data a program will need next, and places that data in the CPU cache. In certain operations, where data is accessed linearly, this can really help increase performance. A database is one example: data might be pulled from memory in sequential order, which makes it easy to predict what will be needed next.
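The benefit for linear access can be sketched with a toy model. The code below assumes a simple "always fetch the next line" policy and an unbounded cache, and it ignores timing entirely; AMD's actual pre-fetch algorithm is not described here, so treat the policy and the name demand_misses as illustrative assumptions only.

```python
def demand_misses(addresses, line_size=64, prefetch=False):
    """Count accesses that had to wait for memory in a toy, unbounded cache."""
    cached = set()
    misses = 0
    for addr in addresses:
        block = addr // line_size
        if block not in cached:
            misses += 1
            cached.add(block)
        if prefetch:
            # Assumed policy: always pull in the next sequential line as well.
            cached.add(block + 1)
    return misses


# A linear scan: 8-byte reads across 100 consecutive 64-byte lines.
sequential = list(range(0, 64 * 100, 8))

print("misses without prefetch:", demand_misses(sequential))
print("misses with prefetch:   ", demand_misses(sequential, prefetch=True))
```

Without pre-fetch, the scan waits on memory once per line; with it, only the very first line is a miss. A real pre-fetcher has to contend with limited cache space and memory latency, so the gain in practice is much smaller than this idealized model suggests.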

These features account for a slight increase in die size and transistor count from the Spitfire core Duron to the Morgan core Duron. The original Spitfire Duron had a transistor count of 25 million; the new Morgan Duron has 25.18 million. These 180,000 extra transistors come from the added hardware pre-fetch and SSE instructions, and they increase the die size by 6 mm^2, to 106 mm^2. A larger, faster chip built on the same manufacturing process draws more power and produces more heat, of course. In this case, the Duron 1000 has a typical power dissipation of 41.2 watts, versus 37.2 watts for the Duron 950. The maximum power dissipation also increases, from 41.5 watts for the Duron 950 to 46.1 watts for the Duron 1000. Also, the voltage increases from 1.6V to 1.75V. All of these changes mean that Socket A motherboards need BIOS updates so they can properly recognize the new processors and enable all of their new features. In my case, I am using an Asus A7VI-VM with a beta BIOS.

Some may be wondering: if the Palomino is also known as the AthlonMP, AMD's multiprocessor certified CPU, and the Morgan core is the Duron equivalent of a Palomino core on an Athlon, is this the DuronMP? The answer here is yes and no -- these Durons are not the "MP certified" DuronMPs, but they have all the features and specifications of one, simply without the name and the testing AMD does on its MP CPUs. So if you want to pick up a couple of Duron 1000s and put them in an SMP Athlon board, you can do so without worry, as long as the motherboard supports it.

Performance
System Specifications

AMD Duron 950MHz or AMD Duron 1000MHz

256 Megs PC133 SDRAM from Crucial.com

Asus A7VI-VM KM133-based motherboard
Western Digital 7200 RPM 10.2 Gig Hard Drive

3Com 3C905TX-C 10/100 NIC (PCI)

300 Watt AMD-Approved ATX Power Supply
Gigabyte GF3000 GeForce 3 64MB AGP

Slackware 8.0 with Kernel 2.4.9 and XFree 4.1.0

For performance comparison purposes, similarly configured systems are used, where only the memory type (PC133), the processor type and the motherboard are different.

Kernel compiles

In order to test both the board's stability and speed, I ran three sets of Linux kernel compiles on this board. One is a normal, "uniprocessor" make, or make -j1, which is the default. This uses one process, and does not always maximize system usage. I then did make -j2, which spawns a second process. The last test I ran was make -j3, which spawns two extra processes. I do this for several reasons -- to find the "sweet spot" for the board/CPU, as well as to stress the system as much as possible when trying to rate its stability. Also, the kernel is extremely useful as a measure of integer performance. In order to compile the kernel, I untarred kernel 2.4.6, ran "make config" and used the default values. (In other words, I held down the enter key.)
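For readers who want to repeat this kind of measurement, here is a minimal timing sketch. It is not the procedure used for this review: the helper names are hypothetical, the kernel source path in the comment is a placeholder, and the build command shown there is only an assumption about how one might invoke make.

```python
import subprocess
import sys
import time


def time_command(cmd, cwd=None):
    """Run cmd to completion and return the elapsed wall-clock seconds."""
    start = time.perf_counter()
    subprocess.run(cmd, cwd=cwd, check=True,
                   stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return time.perf_counter() - start


def format_mmss(seconds):
    """Format a duration as Minutes:Seconds, the form used for compile times."""
    minutes, secs = divmod(int(round(seconds)), 60)
    return f"{minutes}:{secs:02d}"


# Harmless stand-in so the sketch runs anywhere. For a real measurement one
# might instead call, for each N in (1, 2, 3), something like
#   time_command(["make", f"-j{N}"], cwd="/path/to/linux")  # hypothetical path
# cleaning the tree between runs so each build starts from scratch.
elapsed = time_command([sys.executable, "-c", "pass"])
print("stand-in command took", format_mmss(elapsed))
```

Wall-clock time is used rather than CPU time because that is what a person waiting on a compile actually experiences, and it captures the effect of running several make processes at once.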

Kernel 2.4.6 Compile Times (Minutes:Seconds)
Lower numbers are better
CPU                   -j1    -j2    -j3
Duron Spitfire 950    5:34   5:29   5:25
Duron Morgan 1000     5:27   5:24   5:24
Athlon 1.4 - PC133    4:44   4:39   4:41

Here we see that kernel compiles do not benefit from the added features of the Morgan core. SSE would not apply here, but hardware pre-fetch is something that could affect compiles. In this case, it would seem that the increase in speed is simply related to the increase in MHz.
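As a quick check on that reading, the sketch below compares the measured speedup at each -j setting with the ratio of the two clock speeds. The times are copied from the table above; the helper name to_seconds is my own.

```python
def to_seconds(mmss):
    """Convert a 'Minutes:Seconds' string to seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


# Compile times copied from the table above.
spitfire_950 = {"-j1": "5:34", "-j2": "5:29", "-j3": "5:25"}
morgan_1000 = {"-j1": "5:27", "-j2": "5:24", "-j3": "5:24"}

clock_ratio = 1000 / 950  # about 1.053

for flag in spitfire_950:
    speedup = to_seconds(spitfire_950[flag]) / to_seconds(morgan_1000[flag])
    print(f"make {flag}: speedup {speedup:.3f} vs clock ratio {clock_ratio:.3f}")
```

At every setting the measured speedup stays below the clock ratio, so nothing in these numbers points to a gain from the new core features beyond the extra 50MHz.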

POVRay Benchmarks

POVRay is a multi-platform raytracing program. It is a very floating point intensive task and serves well to help measure the floating point performance of a CPU. For more information on this benchmark, head to the official POVBENCH homepage. To run this benchmark once you have obtained POVRay, run povray -i skyvase.pov +v1 +ft -x +mb25 +a0.300 +j1.000 +r3 -q9 -w640 -H480 -S1 -E480 -k0.000 -mv2.0 +b1000 from the command prompt. Results are in seconds.

POVRay (seconds)
Lower numbers are better
CPU                   Result
Duron 950             22 seconds
Duron 1000            21 seconds
Athlon 1.4 - PC133    15 seconds

POVRay is another task that simply gets an increase in speed from the increase in clock speed of the Duron 1000.

Quake III Timedemos
Quake 3 Timedemos are perhaps the best way to measure 3D gaming performance under Linux. The timedemos used the four.dm_66 demo included with the latest version of Quake 3 Arena. To run a timedemo, hit the '~' key, type timedemo 1, followed by demo four.dm_66 -- once this completes, hit '~' again to see your results. High quality results were obtained by setting texture and color depth to 32-bit, filtering to trilinear and texture detail to its highest setting. 640x480, 800x600, 1024x768, 1280x1024 and 1600x1200 are the screen resolutions at which the tests were run.

Quake 3 Arena Timedemos (Frames Per Second)
Higher numbers are better
CPU                 640x480  800x600  1024x768  1280x1024  1600x1200
Default Quality
Duron 950             122.0    122.1     119.5      111.1       88.1
Duron 1000            131.8    131.3     130.4      116.1       88.8
Athlon 1.4 - DDR      186.5    185.0     172.4      125.9       90.7
Highest Quality
Duron 950             121.0    119.5     117.9       94.7       70.8
Duron 1000            131.3    131.5     125.6       95.5       70.8
Athlon 1.4 - DDR      184.1    179.1     145.6       97.9       71.6

Here we see a decent increase at 640x480 -- about 10 frames per second. I'm not saying you should run out and buy a Duron 1000 to upgrade your Duron 950, but it is more of a performance increase than we saw in the kernel compiles. In SSE optimized games, you might see even further increases. Also, we see here the scores for an Athlon 1.4 with DDR memory. Seeing this, you might think the Duron isn't a very good gaming CPU. What you must realize is that past something like 60 to 70 FPS you won't really see a difference, and even at the highest resolution the Duron with a GeForce3 has no trouble pulling more than 60 frames per second, because at those resolutions the limiting factor is not the CPU but the video card.
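The shift from CPU-limited to video-card-limited is easy to see by working out the Duron 1000's gain over the Duron 950 at each resolution. This sketch uses the default-quality numbers from the table above; the helper name percent_gain is my own.

```python
resolutions = ["640x480", "800x600", "1024x768", "1280x1024", "1600x1200"]

# Default-quality frame rates copied from the table above.
duron_950 = [122.0, 122.1, 119.5, 111.1, 88.1]
duron_1000 = [131.8, 131.3, 130.4, 116.1, 88.8]


def percent_gain(old, new):
    """Percentage improvement of new over old."""
    return (new - old) / old * 100.0


gains = [percent_gain(a, b) for a, b in zip(duron_950, duron_1000)]
for res, gain in zip(resolutions, gains):
    print(f"{res:>9}: {gain:4.1f}% faster")
```

The gain sits at roughly 8 to 9 percent through 1024x768, then falls to under 1 percent at 1600x1200, where the faster CPU no longer has anything to contribute.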

Distributed.net Client Benchmark
Distributed.net is a distributed computing network that works on various distributed computing contests. The contests use primarily integer numbers while performing their tasks, and therefore serve as an excellent benchmark for overall integer performance of properly optimized software.

Distributed.net Client Benchmarks
Higher numbers are better
CPU           RC5 Core 6            OGR Core 0
Duron 950     3,397,778 keys/sec    7,225,433 nodes/sec
Duron 1000    3,583,449 keys/sec    7,590,113 nodes/sec

Here we see the Duron 1000 again gaining only what its higher clock speed provides. However, the cores were manually selected and were not optimized for the changes in the Morgan core.
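One way to confirm that the gain tracks clock speed is to normalize each result by MHz. The sketch below does that with the figures from the table above; the per_mhz helper is my own, and it assumes the chips run at exactly their rated 950MHz and 1000MHz.

```python
# Results copied from the table above: (clock in MHz, RC5 keys/sec, OGR nodes/sec).
results = {
    "Duron 950": (950, 3_397_778, 7_225_433),
    "Duron 1000": (1000, 3_583_449, 7_590_113),
}


def per_mhz(rate, mhz):
    """Throughput normalized by clock speed."""
    return rate / mhz


for cpu, (mhz, rc5, ogr) in results.items():
    print(f"{cpu}: {per_mhz(rc5, mhz):.1f} keys/sec/MHz, "
          f"{per_mhz(ogr, mhz):.1f} nodes/sec/MHz")
```

Per MHz, the two chips land within a fraction of a percent of each other on both cores, which is what you would expect if the Morgan's new features go unused by these clients.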

Conclusions
The Morgan core is not a huge step for the Duron, but incremental steps should not be overlooked. As the Morgan core drops in price, people will see the benefits, however small, from these additions. AMD is not marketing this CPU as the next generation of low-end CPUs, but rather as a minor update to a good line of CPUs.

If you are looking for good performance at low cost, the Duron 1000 will be a definite option once the price drops. The price is now around $100, and you could almost buy a 1.4GHz Thunderbird for that price at this point, with 1.4GHz Thunderbirds available for about $120 with shipping. The Duron 1000MHz is available for $89 in lots of 1000, so expect retailers to add a bit of a markup. Remember that if you are looking for value, you can pick up a Duron 750 for $28 on Pricewatch, where prices for the 1GHz Duron should be available soon as well.

For discussion of this review and any other hardware-related topics, please join #Hardware on OpenProjects.net.

Category:

  • Unix