HD 4870 comparison. The Radeon HD 4870 graphics card is the new king of the upper-middle class. Game tests: Call of Juarez

For the companies that make graphics accelerators, leadership in the Hi-End class is a matter of prestige. Demand for the fastest video cards is relatively low, and over the last two or three years their price has almost always exceeded $600 or even $700, yet they serve as a benchmark for most users with any interest in games. Cards of this class set the reference performance in modern games. Whichever company, ATi or NVIDIA, has its graphics processors on this "throne" is considered the leader for that period, which undoubtedly boosts sales of its video cards in all price segments and raises the manufacturer's standing.

In my opinion, from the moment the NVIDIA GeForce 8800 GTX / Ultra appeared on the market and up until the release of the GeForce GTX 260/280 line, leadership in the top class of video cards belonged to NVIDIA. ATi (later part of AMD) tried to regain first place by releasing the two-chip Radeon HD 3870 X2. The attempt did not fully succeed: driver problems, the AFR rendering mode to which ATi had no alternative, the presence of the GeForce 9800 GX2 on the market, and the very quick arrival of NVIDIA's new G200 GPU kept the HD 3870 X2 from winning fame as the fastest graphics card.

Nevertheless, having released the undoubtedly successful mid-range Radeon HD 4850 and HD 4870, ATi's graphics division, now owned by AMD, announced on August 12, 2008 a dual-chip Hi-End video card, the Radeon HD 4870 X2, with a suggested price of $549 and clear claims to absolute leadership. A little later its younger sister, the Radeon HD 4850 X2, should appear on the market with a recommended price of up to $400; we will study and test it later as well. Today we present a review and "speed test" of the Radeon HD 4870 X2 released under the Hightech Information System Limited (HIS) label.

1. Review of HIS Radeon HD 4870 X2 2x1 GB

The design style and dimensions of the box in which the HIS Radeon HD 4870 X2 is supplied have not changed compared with the previously reviewed Radeon HD 4850 and HD 4870 from the same manufacturer, except that the color scheme is now predominantly green:

On the front of the box you can find the video card model, its interface, and the amount and type of installed video memory. The back is filled with lines of board specifications, system requirements, and information about the graphics technologies built into the GPU. It also shows the awards that HIS products have received from print and electronic publications (already more than 600). The video card is made in China.

HIS RADEON HD 4870 512MB PCI-E

Analog monitors with D-Sub (VGA) are connected through DVI-to-D-Sub adapters. DVI-to-HDMI adapters are also supplied (recall that these accelerators support full video and audio transmission to an HDMI receiver), so there should be no problems with such monitors either.

Maximum resolutions and frequencies:

  • 240 Hz Max Refresh Rate
  • 2048 × 1536 × 32 bit @ 85 Hz Max - via analog interface
  • 2560 × 1600 @ 60Hz Max - digital interface (all DVI jacks with Dual-Link)

As for MPEG2 (DVD-Video) playback on video cards, we studied this issue back in 2002, and little has changed since. Depending on the movie, CPU load during playback on modern video cards does not rise above 25%.

As for HDTV, a separate study has also been carried out and is available.

Unfortunately, at the moment the RivaTuner utility (by A. Nikolaychuk AKA Unwinder) does not support new series, and therefore there is no monitoring.

Equipment.

The basic package should include: user manual, CD with drivers and utilities, DVI-to-VGA adapter, CrossFire bridge, DVI-to-HDMI adapter, component output adapter (TV-out), and external power splitters. Next, we will show what is offered in addition to the card.

Packaging.

Installation and drivers

Test bench configuration:

  • Intel Core2 (775 Socket) based computer
    • Intel Core2 Extreme QX9650 processor (3000 MHz);
    • Zotac 790i Ultra motherboard based on Nvidia nForce 790i Ultra chipset;
    • RAM: 2 GB DDR3 SDRAM Corsair 2000MHz (CAS (tCL) = 5; RAS to CAS delay (tRCD) = 5; Row Precharge (tRP) = 5; tRAS = 15);
    • Hard drive: WD Caviar SE WD1600JD 160GB SATA;
    • Tagan TG900-BZ 900W power supply unit.
  • Operating system Windows Vista 32bit SP1; DirectX 10.1;
  • Dell 3007WFP monitor (30");
  • ATI drivers CATALYST 8.6; Nvidia versions 175.16 (9xxx series) and 177.34 (GTX 2xx).

VSync is disabled.

Synthetic tests

The synthetic test packages we use can be downloaded here:

  • D3D RightMark Beta 4 (1050) with a description on the website 3d.rightmark.org
  • D3D RightMark Pixel Shading 2 and D3D RightMark Pixel Shading 3 - tests of pixel shaders versions 2.0 and 3.0
  • RightMark3D 2.0 with a brief description

RightMark3D 2.0 requires the MS Visual Studio 2005 runtime installed and the latest DirectX runtime update.

Synthetic tests were carried out on the following video cards:

  • RADEON HD 4870 with standard parameters (hereinafter HD4870)
  • RADEON HD 4850 with standard parameters (hereinafter HD4850)
  • RADEON HD 3870 X2 with standard parameters (hereinafter HD3870X2)
  • RADEON HD 3870 with standard parameters (hereinafter HD3870)
  • Nvidia Geforce GTX 260 with standard parameters (hereinafter GTX260)
  • Nvidia Geforce 9800 GTX with standard parameters (hereinafter GF9800GTX)

These models were chosen for comparison with the new RADEON HD 4870 for the following reasons. The RADEON HD 3870 X2, AMD's two-chip solution based on the previous GPU architecture, lets us evaluate the impact of the architecture improvements and the resulting difference in performance. The RADEON HD 4850 shows the contribution of the higher GPU frequencies and the new GDDR5 memory type. The Geforce 9800 GTX, although not a direct competitor, is interesting as a representative of the previous generation of Nvidia chips, and the price of the HD 4870 is not far from that of its faster version, the GTX+. Finally, the Geforce GTX 260 is a direct competitor to the RADEON HD 4870, so this comparison will be the main battle.

Direct3D 9: Pixel Filling Tests

The test determines the peak texel rate in FFP mode for a different number of textures applied to one pixel:

Nothing new or interesting here: everything corresponds to the difference in frequencies. As usual, the synthetic results fall short of the theoretical values; the HD 3870, based on the RV670, comes closest. None of the new video cards from Nvidia and AMD reaches its theoretical maximum in this test. RV770 fetches about 26-27 texels per clock from 32-bit textures with bilinear filtering, short of the theoretical 40. The efficiency of Nvidia cards is even lower: 35-37 texels per clock against a theoretical 64.
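The efficiency comparison above is simply the measured texel rate per clock divided by the theoretical one. A minimal sketch of that arithmetic, using only the figures quoted in the text (the function and dictionary names are illustrative, not part of any test package):

```python
def texel_efficiency(measured_per_clock, theoretical_per_clock):
    """Return measured texel throughput as a fraction of the theoretical peak."""
    return measured_per_clock / theoretical_per_clock


# Figures quoted in the text: (measured low, measured high, theoretical peak).
quoted = {
    "RV770": (26, 27, 40),
    "Nvidia cards": (35, 37, 64),
}

for name, (low, high, peak) in quoted.items():
    low_eff = texel_efficiency(low, peak)
    high_eff = texel_efficiency(high, peak)
    print(f"{name}: {low_eff:.3f} to {high_eff:.3f} of the theoretical peak")
```

Even the best Nvidia figure (37 of 64) gives a lower fraction than the worst RV770 figure (26 of 40), which is what the text means by lower efficiency.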

As for the HD 4870 and its direct competitor, the GTX 260, they are very close in this test, but both fall short of the GeForce 9800 GTX. The new AMD card is significantly ahead of the old one and outperforms the lower model of the HD 4800 line in proportion to the frequencies. Interestingly, in the test with one texture the HD 4870 lags slightly behind the HD 3870; this is due to the theoretically higher performance of the latter's ROP units with a 32-bit framebuffer without antialiasing. With a large number of textures per pixel, the ROP units no longer prevent the RV770-based card from showing better results. Let's look at the results in the fill rate test:

The second synthetic test measures the fill rate, and in it we see the same situation, but taking into account the number of pixels written to the frame buffer. In cases with 0 and 1 textures applied, the RADEON HD 4870 still achieves the same slightly lower result than the HD 3870, which is due to the operating frequency of the ROPs. But as in the previous diagram, in situations with a lot of textures per pixel, the new video card comes out ahead.

Direct3D 9: Geometry Processing Speed Tests

Let's look at a couple of extreme geometry tests. The first uses the simplest vertex shader and shows the maximum triangle throughput:

All modern chips are based on unified architectures. In this test their universal execution units are busy only with geometry work, and the solutions show good results that are clearly limited not by the peak performance of the unified units but by other units, for example triangle setup.

The results show this: RV670 and RV770 are very close at similar frequencies. The results of AMD solutions are traditionally higher than those of Nvidia cards. The RADEON HD 4870 outperforms both Nvidia cards and its counterparts in this test. Since we have dropped the intermediate geometry tests with one light source, we move straight to the most complex geometry task, with three light sources and both static and dynamic branching:

In this variant the difference between AMD and Nvidia solutions is easier to see: the gap widened slightly as the Nvidia cards “sagged”. The HD 4870 and HD 3870 are approximately equal at similar frequencies; they are again limited by something like triangle setup, since the numbers have hardly changed since the previous test.

Again, all AMD video cards outperform the Geforce 9800 GTX and GTX 260. In real applications, the universal shader processors are mainly busy with pixel calculations, and we are now going to examine their performance.

Direct3D 9: Pixel Shaders Tests

The first group of pixel shaders that we are considering is very simple for modern video chips; it includes various versions of pixel programs of relatively low complexity: 1.1, 1.4 and 2.0.

Although these tests are too simple for modern architectures and do not show their true strength, they are interesting for tracking changes between architectures. In simple tests performance is limited by texture fetch speed, and texturing performance is exactly what was improved in the RV770 chip. This brought victory on all fronts: the HD 4870 outperforms both Nvidia cards in all the tasks considered and is at times up to twice as fast as the HD 3870.

In more complex tests, the RADEON HD 4870 also shows excellent results, significantly outperforming its predecessor and competitors. But the Geforce GTX 260 is not impressive due to the lower texturing speed, slightly outperforming the 9800 GTX in only the two most difficult tests. Let's look at the test results of more complex pixel programs of intermediate versions:

A great result for the RADEON HD 4870! In the Water test, which depends heavily on texturing speed and uses dependent texture fetches with deep nesting, the cards rank by texturing speed; the new model significantly outperforms both Nvidia cards, and the difference from the HD 3870 is simply striking.

The second test is more demanding on the computing units and better suits AMD architectures with their large number of stream processors. Here the new AMD solution again shows the best result, 1.5-2 times faster than both the Geforce GTX 260 and the 9800 GTX! And again the new board is more than twice as fast as the previous generation. The difference from the HD 4850 corresponds to the difference in GPU frequencies.

Direct3D 9: New Pixel Shader Tests

These DirectX 9 pixel shader tests are even more complex and fall into two categories. Let's start with the simpler shaders version 2.0:

  • Parallax Mapping - the texture mapping method familiar from most modern games, described in detail in the article
  • Frozen Glass - complex procedural texture of frozen glass with controlled parameters

There are two variants of these shaders: one focused on mathematical calculations and one that prefers fetching values from textures. Let's consider the math-intensive variants, which are more promising in terms of future applications:

These are mathematical tests that depend on both the frequency of the shader units and texturing speed, so the balance of the chip is important here. Performance in the Frozen Glass test is limited not only by math but also by texture fetch speed, so the old RADEONs show the weakest result. But the new ones... See for yourself: they are noticeably faster than the previous generation. And the HD 4870 reviewed today is even ahead of the GeForce 9800 GTX and GTX 260.

In the second test "Parallax Mapping", the new products from AMD are even stronger. While the HD 4850 scores slightly above the GTX 260, the HD 4870 is well ahead of both Nvidia models. Improvements in the TMU significantly strengthened the results of the HD 4800 line, in these tests they became the new leaders. Let's consider the same tests in a modification with a preference for samples from textures to mathematical calculations, where the results can turn out to be even more interesting:

The results for RADEON HD 4850 and Geforce 9800 GTX are quite close, while the HD 4870 expectedly outperforms both thanks to its higher chip frequency. The relative position of the cards has changed a bit, and the emphasis on texture unit speed is noticeable. Both RV770-based cards outperform the previous top model by a factor of two or more. The GTX 260, however, showed very weak results here, even lagging behind its predecessor.

Let's consider the results of two more pixel shader tests - version 3.0, the most difficult of our pixel shader tests for Direct3D 9. The tests differ in that they heavily load ALUs and texture units, both shader programs are complex, long, and include a large number of branches:

  • Steep Parallax Mapping - much more "heavy" kind of parallax mapping technique, also described in the article
  • Fur - procedural shader rendering fur

AMD's new architecture in these tests shows itself from the best side, in contrast to the previous solutions, which were outperformed by Nvidia cards. HD 4870 outperforms all competitors by a large margin, the difference with HD 3870 is simply enormous. And Geforce 9800 GTX with Geforce GTX 260 are far behind.

Once again we see excellent results from the redesigned AMD architecture in our DirectX 9 benchmarks. But what will happen in DX10, where things were clearly worse in past studies? Let's find out by comparing the new card with the previous-generation dual-chip card, since everything has long been clear with the single-chip RV670.

Direct3D 10: PS 4.0 Pixel Shader Tests (Texturing, Loops)

The new version of RightMark3D 2.0 includes two familiar PS 3.0 tests for Direct3D 9, which were rewritten for DirectX 10, as well as two more completely new tests. The first pair adds the ability to enable self-shadowing and shader supersampling, which additionally increases the load on video chips.

These tests measure the performance of pixel shaders with loops, a large number of texture samples (in the heaviest mode, up to several hundred samples per pixel!) and a relatively low ALU load. In other words, they measure texture sampling rate and branching efficiency in a pixel shader.

The first pixel shader test is Fur. At the lowest settings it uses 15 to 30 texture samples from the heightmap and two samples from the main texture. The “High” effect detail mode increases the number of samples to 40-80, enabling “shader” supersampling raises it to 60-120, and the “High” mode together with SSAA is the heaviest, with 160 to 320 samples from the heightmap.

Let's first check the modes without supersampling enabled, they are relatively simple, and the ratio of the results in the “Low” and “High” modes should be approximately the same.

Performance in this test depends not only on the number and speed of TMUs, but also on fill rate and memory bandwidth. As we expected, little has changed in the Direct3D 10 procedural fur rendering tests with a large number of texture fetches: Nvidia solutions keep the same huge advantage over AMD. AMD cards have always done poorly in this test; let's see what happens next.

Although the HD 4870 lost to both Nvidia cards, it showed an advantage over the younger model of the line, corresponding to the frequency difference. And the two-GPU RADEON HD 3870 X2 outperformed the new HD 4870 only in the heavy mode. A very good result if you don't look at the Nvidia numbers. Let's look at the result of the same test, but with the enabled "shader" supersampling, which increases the work by four times, perhaps in such a situation something will change, and the memory bandwidth with fill rate will have less effect:

Enabling supersampling theoretically quadruples the load. The overwhelming advantage of Nvidia cards has not gone away this time either, although the new AMD video cards are now clearly closer to the Geforce 9800 GTX. Otherwise, as shader complexity and the load on the video chip increase, the difference between the HD 4870 and the dual-GPU HD 3870 X2 stays almost the same; the two are close to each other.

The second test that measures the performance of complex pixel shaders with loops and a large number of texture fetches is called Steep Parallax Mapping. At low settings it uses 10 to 50 texture samples from the heightmap and three samples from the main textures. Turning on the heavy mode with self-shadowing doubles the number of samples, and supersampling quadruples it. The most complex test mode, with both supersampling and self-shadowing, fetches from 80 to 400 texture values, that is, eight times more than the simple mode. Let's first check the simple options without supersampling:

This test is more interesting from a practical point of view, because varieties of parallax mapping have long been used in games, and heavy variants like our steep parallax mapping are used in some projects, for example Crysis and Lost Planet. Besides supersampling, our test also lets you enable self-shadowing, which roughly doubles the load on the video chip; this mode is called “High”.
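The heightmap sample counts given for Steep Parallax Mapping follow from two multipliers: self-shadowing doubles the base range and supersampling quadruples it. A small sketch of that bookkeeping, assuming the two options simply multiply (the function name and flags are illustrative):

```python
def sample_range(base_min, base_max, self_shadowing=False, supersampling=False):
    """Scale a per-pixel heightmap sample range by the enabled options."""
    factor = 1
    if self_shadowing:
        factor *= 2  # the text says self-shadowing doubles the sample count
    if supersampling:
        factor *= 4  # the text says supersampling quadruples it
    return base_min * factor, base_max * factor


# Base range quoted for the low settings: 10 to 50 heightmap samples.
for shadow in (False, True):
    for ss in (False, True):
        low, high = sample_range(10, 50, shadow, ss)
        print(f"self-shadowing={shadow}, supersampling={ss}: {low} to {high} samples")
```

With both options on, the range becomes 80 to 400 samples, the eightfold increase mentioned in the text.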

The relative positions of the cards from the previous test are repeated. Although AMD solutions were strong in the Direct3D 9 parallax mapping tests, in the updated D3D10 version without supersampling they cannot handle the task at the level of Geforce video cards, and enabling self-shadowing causes too large a performance drop on AMD products. The RADEON HD 4870 under review lags behind both Geforce video cards and is very close to the dual-GPU HD 3870 X2. Let's see what enabling supersampling changes; in the previous test it caused a larger drop in speed on Nvidia cards.

When supersampling and self-shadowing are both turned on, the task becomes much harder: together the two options increase the load on the cards almost eightfold, causing a large drop in performance. The gaps between the video cards change, and supersampling has the same effect as in the previous case: AMD cards improve their standing relative to Nvidia solutions. The new HD 4800s still lag behind the Geforce cards, but the HD 4870 is close to the HD 3870 X2 and has almost caught up with at least the Geforce 9800 GTX. It is, of course, still far behind its direct competitor, the GTX 260.

Direct3D 10: PS 4.0 Pixel Shader Benchmarks (Compute)

The next couple of pixel shader tests contain the minimum number of texture fetches to reduce the impact of TMU performance. They use a large number of arithmetic operations, and they measure exactly the mathematical performance of video chips, the speed of execution of arithmetic instructions in a pixel shader.

The first math test is Mineral. This is a complex procedural texturing test that uses only two texture data samples and 65 instructions like sin and cos.

When analyzing the results of our synthetic tests, we always note that modern AMD architectures perform better than their Nvidia competitors in computationally complex tasks. In Mineral, the HD 4870 simply tore the competitors apart. The top video card based on a single RV770 chip outperforms the previous-generation card based on two RV670s, which is in line with the difference in the number and frequency of stream processors. The new video card is also almost twice as fast as its direct competitor, the Geforce GTX 260, not to mention the Geforce 9800 GTX.

The second shader computation test is called Fire, and it is even harder for ALUs. It has only one texture fetch, and the number of instructions like sin and cos is doubled, to 130. Let's see what has changed with increasing load:

In this test, the rendering speed is limited solely by the performance of shader units, and the test suits AMD architectures very well, which is clearly noticeable after correcting a bug in AMD drivers. What can I say ... Complete defeat of Nvidia solutions. Just think, RADEON HD 4870 is more than twice as fast as Geforce GTX 260 and faster than dual-GPU HD 3870 X2. An amazing result, the RV770 is clearly the strongest GPU in general. By the way, the speed ratio between HD 4870 and HD 4850 exactly matches the difference in frequencies.
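One way to see why Fire stresses the ALUs harder than Mineral is to compare arithmetic instructions per texture fetch. The sketch below uses only the counts quoted in the text (65 sin/cos-type instructions with two fetches for Mineral, 130 with one fetch for Fire); the ratio itself is an illustrative metric, not something RightMark reports:

```python
def alu_to_tex_ratio(alu_instructions, texture_fetches):
    """Arithmetic instructions executed per texture fetch."""
    return alu_instructions / texture_fetches


# (sin/cos-type instructions, texture fetches) as quoted in the text.
math_tests = {"Mineral": (65, 2), "Fire": (130, 1)}

for name, (alu, tex) in math_tests.items():
    print(f"{name}: {alu_to_tex_ratio(alu, tex):.1f} ALU instructions per fetch")
```

By this measure Fire does four times as much arithmetic per fetch as Mineral, so texturing speed matters even less in it.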

Direct3D 10: Geometry Shader Tests

The RightMark3D 2.0 package contains two geometry shader speed tests. The first, called "Galaxy", uses a technique similar to the "point sprites" of previous Direct3D versions. It animates a particle system on the GPU: a geometry shader creates four vertices from each point, forming a particle. Similar algorithms should be widely used in future DirectX 10 games.
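The point-to-particle expansion described above can be sketched on the CPU. This is only an illustration of the idea, assuming a screen-aligned square in 2D; the real test does this in a geometry shader, and the function name and vertex order here are assumptions:

```python
def expand_point_to_quad(center, half_size):
    """Return the four corners of a screen-aligned quad around a point.

    The corners are ordered so that they can be drawn as a two-triangle strip.
    """
    cx, cy = center
    return [
        (cx - half_size, cy - half_size),  # bottom left
        (cx + half_size, cy - half_size),  # bottom right
        (cx - half_size, cy + half_size),  # top left
        (cx + half_size, cy + half_size),  # top right
    ]


points = [(0.0, 0.0), (2.0, 1.0)]
vertices = [v for p in points for v in expand_point_to_quad(p, 0.5)]
print(len(points), "points ->", len(vertices), "vertices")
```

Each input point yields four output vertices, so the amount of generated geometry scales linearly with the particle count.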

Changing the balance in the geometry shader tests does not affect the final rendering result: the final image is always exactly the same, and only the scene processing methods change. The "GS load" parameter determines in which shader the calculations are performed, the vertex shader or the geometry shader. The number of calculations is always the same.

Let's consider the first variant of the Galaxy test, with calculations in a vertex shader, for three levels of geometric complexity:

The ratio of speeds with different geometric complexity of scenes is about the same, the performance corresponds to the number of points, with each step the FPS drop is about two times. The task for modern video cards is not very difficult and the limitation of the speed by the power of stream processors in the test is not obvious, the task is also limited by the memory bandwidth and fill rate.

Well, this turned out very interesting: the dual-GPU HD 3870 X2, the new HD 4870 and the competing GTX 260 have extremely close results, and the HD 4850 and the GeForce 9800 GTX are also very close to each other. Perhaps the situation will be even more interesting when part of the calculations is moved to the geometry shader. Let's see:

But no, the difference between the two test variants is small, and there were no significant changes, except that the dual-GPU HD 3870 X2 came out on top in frame rate. Things are easier for it: the AFR multi-chip rendering algorithm is forgiving here. Nvidia video cards show identical results when the GS load parameter, which moves part of the calculations to the geometry shader, is changed, while the results of some AMD video cards increased slightly. Let's see what changes in the next test, which puts a heavy load on geometry shaders.

Hyperlight is the second geometry shader test, demonstrating the use of several techniques at once: instancing, stream output, buffer load. It uses the dynamic creation of geometry using rendering in two buffers, as well as a new Direct3D 10 feature - stream output. The first shader generates the direction of the rays, the speed and direction of their growth, this data is placed in a buffer that is used by the second shader for rendering. For each point of the ray, 14 vertices are built in a circle, up to a million output points in total.
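The step of building 14 vertices in a circle around each ray point can be illustrated with a short sketch. It assumes, purely for simplicity, that the circle lies in the XY plane around the point; the actual shader presumably orients it relative to the ray, and the function name and radius are hypothetical:

```python
import math


def ring_vertices(center, radius, count=14):
    """Place `count` vertices evenly on a circle in the XY plane around `center`."""
    cx, cy, cz = center
    step = 2.0 * math.pi / count
    return [
        (cx + radius * math.cos(i * step), cy + radius * math.sin(i * step), cz)
        for i in range(count)
    ]


ring = ring_vertices((0.0, 0.0, 0.0), 2.0)
print(len(ring), "vertices, first:", ring[0])
```

Every ray point therefore multiplies into 14 output vertices, which is why the output volume of this test grows so quickly.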

A new type of shader program is used to generate "rays", and with the "GS load" parameter set to "Heavy" - also for their rendering. That is, in the "Balanced" mode, geometry shaders are used only for creating and "growing" rays, the output is carried out using "instancing", and in the "Heavy" mode, the geometry shader is also involved in the output. Let's consider light mode first:

Relative results in different modes correspond to the load: in all cases the performance scales well and is close to the theoretical expectation that each next level of "Polygon count" should be twice as slow. This time the RADEON HD 4850 and HD 4870 are faster than the dual-GPU solution based on the previous GPU architecture, but all AMD cards lag behind all Nvidia solutions, although the HD 4870 is close to them.
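The expectation that each "Polygon count" level halves the frame rate can be checked by dividing consecutive results. A minimal sketch; the FPS values below are made up for illustration and are not measurements from this review:

```python
def slowdown_factors(fps_by_level):
    """Return the slowdown factor between each pair of consecutive levels."""
    return [faster / slower for faster, slower in zip(fps_by_level, fps_by_level[1:])]


# Hypothetical frame rates for three increasing levels of geometric complexity.
hypothetical_fps = [120.0, 61.0, 31.0]

for step, factor in enumerate(slowdown_factors(hypothetical_fps), start=1):
    print(f"level {step} -> {step + 1}: {factor:.2f}x slower (ideal: 2.00x)")
```

Factors close to 2 mean the test scales with the amount of geometry; noticeably smaller factors would point to some other limit, such as fill rate or memory bandwidth.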

It looks like the results of the new cards were affected by the improved texturing capabilities. However, the numbers should change in the next diagram, in a test with more active use of geometry shaders. It will also be interesting to compare the results obtained in "Balanced" and "Heavy" modes with each other.

This time only the GeForce 9800 GTX failed, all other architectures withstood the blow. In both the RV770 and GT200, some optimizations have been made to improve geometry shader performance. And RADEON HD 4870 has now caught up with Geforce GTX 260, except for the simplest mode. The previous generation of AMD chips performs significantly worse in this test, even a dual-GPU video card lags behind.

As for the comparison of results in different modes, everything is as always, AMD video cards improve their performance when switching from instancing to a geometric shader in output, while old Nvidia video cards lose performance. Geforce card based on the G92 chip can compete only due to the speed in the "Balanced" mode, which is almost equal to the speed in the "Heavy" of the RADEON HD 4850. At the same time, the picture obtained in different modes does not visually differ.

Direct3D 10: Speed of fetching textures from vertex shaders

The Vertex Texture Fetch tests measure the speed of a large number of texture fetches from the vertex shader. The tests are similar in essence, so the ratio between the cards' results in the "Earth" and "Waves" tests should be approximately the same. Both tests use displacement mapping based on texture sampling data; the only significant difference is that the Waves test uses conditional jumps, while the Earth test does not.

Consider the first test "Earth", first in the "Effect detail Low" mode:

Judging by previous studies, the results of this test are influenced not only by texturing speed but also by ROP performance and memory bandwidth, and the simpler the mode, the greater their effect on speed. In all modes except the simplest one, the leader is the top model of the HD 4800 series, which we are reviewing today. In the simplest mode memory bandwidth has an effect, and multi-chip rendering performs well. The GTX 260 performs only at the level of the HD 4850. Let's look at the results of the same test with an increased number of texture fetches:

The situation has not changed much, but texturing now has a stronger effect on speed, as the pair of Geforce cards shows. The HD 4870 has lost ground and is no longer the leader, although in the difficult modes it lags only slightly behind the GeForce 9800 GTX. In the simplest mode the GTX 260, with its large memory bandwidth, is in the lead. Interestingly, the difference between the HD 4870 and HD 3870 X2 changes as the geometry becomes more complex.

Let's consider the results of the second test of texture fetching from vertex shaders. The Waves test has a smaller number of samples, but it uses conditional jumps. The number of bilinear texture samples in this case is up to 14 ("Effect detail Low") or up to 24 ("Effect detail High") for each vertex. The complexity of the geometry changes in the same way as in the previous test.
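For readers unfamiliar with what a single bilinear texture sample involves, here is a CPU-side sketch: four neighbouring texels are read and blended by the fractional position. It assumes a single-channel texture stored as a list of rows and clamp-to-edge addressing; these choices and the function name are illustrative and not taken from the test itself:

```python
def bilinear_sample(texture, u, v):
    """Sample a 2D grid of floats at fractional texel coordinates (u, v).

    Coordinates are clamped to the texture edges.
    """
    height = len(texture)
    width = len(texture[0])
    u = min(max(u, 0.0), width - 1.0)
    v = min(max(v, 0.0), height - 1.0)
    x0, y0 = int(u), int(v)
    x1, y1 = min(x0 + 1, width - 1), min(y0 + 1, height - 1)
    fx, fy = u - x0, v - y0
    # Blend horizontally on the two rows, then vertically between them.
    top = texture[y0][x0] * (1 - fx) + texture[y0][x1] * fx
    bottom = texture[y1][x0] * (1 - fx) + texture[y1][x1] * fx
    return top * (1 - fy) + bottom * fy


heightmap = [[0.0, 1.0], [2.0, 3.0]]
print(bilinear_sample(heightmap, 0.5, 0.5))
```

A vertex shader in the Waves test performs up to 14 or 24 such samples per vertex, so each vertex can touch several dozen texels.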

The second test of this section, "Waves", is more favorable to AMD products; the new model of the RADEON HD 4800 family looks very good, at the level of its two-GPU predecessor. It also outperforms the Nvidia graphics cards except in the simplest mode, where the GTX 260 is slightly ahead. It seems that under such conditions the TMU efficiency of the RV770 is higher than that of the Nvidia GPUs. Let's consider the second variant of the same test:

Again we see very little that is new, although as the test gets more complex the results of AMD video cards improve relative to Nvidia cards, which lose a little more from the change in test conditions. In the lightest mode the HD 3870 X2 and HD 4870 are ahead; in the rest the dual-GPU HD 3870 X2 has no equal. Among single-chip cards the card under review is the best, ahead of its younger brother, the HD 4850, in proportion to the difference in frequencies. Nvidia cards are left behind this time.

3DMark Vantage: Feature tests

We decided to include synthetic tests from 3DMark Vantage in the RADEON HD 4870 review. The package is new, and its feature tests are quite interesting and differ from ours. When analyzing the cards' results in this package, we will probably draw some new and useful conclusions.

Feature Test 1: Texture Fill

The first test measures texture sampling rate. It fills a rectangle with values read from a small texture using multiple texture coordinates that change every frame.

The balance of results is broadly similar to what our tests show, using conditions where Nvidia cards do not benefit from a large number of TMUs. The old single-chip card from AMD lags far behind everyone, but the dual-GPU HD 3870 X2 and the new HD 4870 model significantly outperform both solutions from Nvidia. Geforce GTX 260 lags behind Geforce 9800 GTX, as it should be in theory. But why does the RV770-based card outperform both the G92 and GT200? Apparently, the point is in the very efficiency of texture units, which is higher in AMD cards.

Feature Test 2: Color Fill

Fill rate test. A very simple pixel shader is used with no performance limitation. The interpolated color value is written to an offscreen buffer (render target) using alpha blending. A 16-bit FP16 offscreen buffer is used, which is most often used in games that use HDR rendering, so this test is very timely.

The readings of this test correspond to what we get in our synthetic tests, taking into account that we use an integer buffer with 8 bits per component while the Vantage test uses 16-bit floating point. Therefore, all the numbers are half of ours.
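The halving follows directly from the pixel size when memory bandwidth is the limit. A minimal sketch, assuming four components per pixel and a purely bandwidth-bound fill; the 100 GB/s figure is a hypothetical round number, not a measurement:

```python
def bandwidth_limited_fill_rate(bandwidth_bytes_per_s, bytes_per_pixel):
    """Upper bound on pixels written per second if bandwidth is the only limit."""
    return bandwidth_bytes_per_s / bytes_per_pixel


RGBA8_BYTES = 4 * 1      # four 8-bit integer components (assumed layout)
RGBA_FP16_BYTES = 4 * 2  # four 16-bit floating-point components (assumed layout)

bandwidth = 100e9  # hypothetical 100 GB/s
int8_rate = bandwidth_limited_fill_rate(bandwidth, RGBA8_BYTES)
fp16_rate = bandwidth_limited_fill_rate(bandwidth, RGBA_FP16_BYTES)
print(f"FP16 fill rate is {fp16_rate / int8_rate:.2f}x the 8-bit fill rate")
```

Alpha blending adds a read of the destination for every write, but that doubles the traffic for both formats equally, so the ratio between them stays the same.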

These numbers rather show not only the ROP performance, but also the amount of memory bandwidth (in the case of multichips, multiplied by the number of chips for AFR). The figures correspond to the theoretical ones and depend, first of all, on the width of the memory bus and its frequency. In this test, the new HD 4870 model, taking advantage of the improved ROP capabilities and large GDDR5 memory bandwidth, performs better than the dual-GPU HD 3870 X2 and GTX 260 with 448-bit memory bus.
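The dependence on bus width and memory frequency mentioned above is a simple product. In the sketch below, the 448-bit bus of the GTX 260 comes from the text, while the 256-bit bus of the HD 4870 and both effective data rates are commonly published reference figures that this review does not state, so treat them as assumptions:

```python
def memory_bandwidth_gb_s(bus_width_bits, effective_rate_gt_s):
    """Peak memory bandwidth in GB/s: bytes per transfer times transfers per second."""
    return bus_width_bits / 8 * effective_rate_gt_s


# (bus width in bits, effective data rate in GT/s); rates are assumed reference values.
cards = {
    "HD 4870 (GDDR5)": (256, 3.6),
    "GTX 260 (GDDR3)": (448, 1.998),
}

for name, (bus, rate) in cards.items():
    print(f"{name}: about {memory_bandwidth_gb_s(bus, rate):.1f} GB/s")
```

Under these assumptions the narrower GDDR5 bus ends up with slightly more peak bandwidth than the 448-bit GDDR3 bus, which is consistent with the ordering seen in this test.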

Feature Test 3: Parallax Occlusion Mapping

One of the most interesting feature tests, as a similar technique is already used in games. It draws one quadrilateral (more precisely, two triangles) using a special technique called Parallax Occlusion Mapping, which simulates complex geometry. It uses quite resource-intensive ray tracing operations and a high-resolution depth map. The surface is also shaded using the heavy Strauss algorithm. This is a test of a very complex and GPU-heavy pixel shader containing numerous texture fetches for ray tracing, dynamic branches and complex lighting calculations using Strauss.

The test is interesting in that it does not depend only on shader power, branching efficiency and texture sampling rate, but on everything at once. That is, the balance of the chip and the card is important to achieve high speed. And the most important is the efficiency of branching in shaders, the so-called execution granularity.

The old cards from both manufacturers are far behind, even the dual-GPU HD 3870 X2 could not catch up with the HD 4870, although the dual-GPU rendering of this test is quite efficient. And here we see an interesting arrangement of the RADEON HD 4870 and Geforce GTX 260. Despite the fact that in the tests of texture sampling and mathematical calculations the AMD solution usually wins, in the POM test the Geforce is stronger than the RADEON. And the best efficiency of processing branches in shaders in GT200 is to blame for this.

Feature Test 4: GPU Cloth

The test is interesting in that it calculates physical interactions (cloth simulation) on the video chip. Vertex simulation is used, combining the work of vertex and geometry shaders over several passes. Stream out is used to transfer vertices from one simulation pass to the next. Thus, the test measures the execution performance of vertex and geometry shaders and the stream out speed.

In this test we traditionally get strange results with dual-GPU cards: the HD 3870 X2 gets no acceleration from its second GPU. Otherwise, we again see AMD solutions lagging; even the relatively fast HD 4870 does not reach the Geforce 9800 GTX, not to mention the GTX 260. It seems that speed here depends not on shader performance but on stream out speed.

Feature Test 5: GPU Particles

A physical simulation test of effects based on particle systems computed on the video chip. Vertex simulation is also used, with each vertex representing a single particle. Stream out is used for the same purpose as in the previous test. Several hundred thousand particles are calculated, each animated separately, and their collisions with the height map are also calculated. As in one of the tests in our RightMark3D 2.0, particles are rendered using a geometry shader, which creates four vertices from each point to form a particle. But most of all the test loads the shader units with vertex calculations; stream out is also tested.

Here we see almost the same thing as in the previous case, except that the Geforce 9800 GTX falls behind and the AMD cards move up. Still, the Geforce GTX 260 remains the leader, with today's hero, the HD 4870, close behind. The dual-GPU card from AMD again barely pulls ahead of the old single-GPU card, and both are at the bottom of the list. Again, we assume that speed is affected by stream out performance, memory bandwidth and texture performance all at once.

Feature Test 6: Perlin Noise

This feature test can be considered a math-intensive test of the video chip: it calculates several octaves of the Perlin noise algorithm in a pixel shader. Each color channel uses its own noise function to put more load on the video chip. Perlin noise is a standard algorithm that is often used in procedural texturing and is mathematically very demanding.

The last feature test in Vantage shows the pure mathematical performance of video chips. The performance shown in it is quite consistent with what we see in our math tests from RightMark 2.0. AMD video cards naturally outperform their competitors from Nvidia; even the dual-GPU HD 3870 X2 outperforms the GTX 260. And the RADEON HD 4870 is the leader, outperforming its main competitor by more than 25%.

Conclusions on synthetic tests

Based on the results of the synthetic tests, we confirm the conclusions made in the previous article. The new solutions from AMD turned out to be very successful. Many changes were made in the RV770 chip, and in almost all synthetic tests it is several times faster than the previous-generation video card. Thanks to the improved architecture of the RV770, which corrected the main drawbacks, the RADEON HD 4870 outperforms its main competitor, the Geforce GTX 260, in many tests. The RV770 has become more efficient and balanced, which is important for modern and future 3D applications with a large number of complex shaders.

The RV770 chip has a large number of execution units and supports the new GDDR5 memory, which made it possible to release the RADEON HD 4870 with high memory bandwidth on only a 256-bit memory bus. Minor questions arise only about the lower efficiency of branch execution in shader programs, which affects the performance of the most complex parallax mapping algorithms. Also, in stream out speed the new solutions from AMD are inferior to competitors from Nvidia. In all other respects, the new HD 4800 line is just fine, especially in computational performance, where it is far ahead.

The next part of the article contains tests of the new solution from AMD and other video cards in modern gaming applications. The game results should confirm our findings. It can be assumed that the average speed of the HD 4870 in games will be approximately on par with the Geforce GTX 260.

The power supply for the test bench was provided by the company TAGAN
Dell 3007WFP Testbed Monitor Courtesy of

Just a couple of weeks ago, the nVidia world was quiet and peaceful. The largest graphics card manufacturer had just released its new GeForce GTX 260 and 280 models, which, despite a delay of six months, pushed the unified architecture introduced with the GeForce 8 to the very limits of the 65 nm process with a gigantic number of transistors. The performance gain over the previous (now old) generation of video cards was not particularly impressive (a 59% increase on average over the 9800 GTX), but the appearance of CUDA applications was an interesting step forward, and nVidia had no real competitor. Meanwhile, AMD seemed increasingly frustrated with its graphics division, which was unable to compete in the high-end segment as it once had, and its existing high-end graphics cards were quickly becoming obsolete. Then came the high-profile release of the Radeon HD 4850, which appeared in test labs before the announcement and carried a retail price of $199.

Yes, a miracle happened in the AMD camp. The performance of the Radeon HD 4850 surprised everyone, including nVidia. Despite the hasty release of the GeForce 9800 GTX+, which won't hit retail until mid-July, nVidia still couldn't match the great price/performance ratio of the new Radeon, as we already showed in our Radeon HD 4850 benchmarks. Marketing arguments such as optimizing efficiency and yield, which usually sound unconvincing, took on new meaning in light of the Radeon HD 4850 benchmark results. The newcomer raised hopes for even better performance in the future. Having found (perhaps even to its own surprise) an opportunity to increase the number of stream processors from 320 to 800 with only a 43% increase in the number of transistors and on the same manufacturing process, AMD decided not to stay "at the bottom", and rightly so. The Radeon HD 4870 GPU was announced, based on the same architecture but providing higher performance (at a higher price, obviously). However, it reached test laboratories very slowly, and not everything was clear until the last moment. On paper, at least, this video card was a direct competitor to the new high-end models from nVidia, but at a significantly lower price. But what did we get in practice?

For a long time, nVidia was a pioneer in adopting the latest memory technologies. After using DDR memory for the GeForce in 2000, the Santa Clara company was the first to introduce GDDR2 memory, with a GeForce FX video card, and then GDDR3 memory, with a GeForce 5700 model. But then ATI took over the lead: GDDR4 first appeared with the Radeon X1950 XT, and today, two years later, ATI has introduced the first graphics card with GDDR5 memory: the Radeon HD 4870.

There are two clear ways to increase memory bandwidth. The first is to widen the memory bus, and the second is to raise the operating frequency. The first method faces several obstacles. A wider bus is difficult to route on a PCB, and the package requires more contacts. All these contacts must be brought to the chip, which requires a larger number of interconnects around the periphery of the die. A wide bus therefore requires the die to be of a certain minimum size; for this reason, entry-level GPUs were long limited to a 128-bit bus, while their high-end counterparts used a 256-bit or 384-bit bus. Another disadvantage is the increased power consumption of the chip.

For this reason, manufacturers resorted to this method very cautiously. In fact, a 128-bit bus was used on high-end GPUs for a long time, from the Riva 128 until the Matrox Parhelia and the ATI Radeon 9700 moved to a 256-bit bus. Likewise, the 256-bit bus didn't get any wider until the introduction of the nVidia GeForce 8800 in late 2006. Yet GPU memory bandwidth requirements keep growing, despite bandwidth-saving technologies that are refined with each generation.

The second solution is to speed up the memory itself. But this is easier said than done, since, as with any chip, there is a limit on the clock frequency at which memory chips can operate. To get around this limit, manufacturers have resorted to various tricks. DDR memory can transfer data on both the rising and falling edges of the clock, doubling the effective memory bandwidth at the same frequency. To do this, DDR memory uses so-called two-bit prefetch: for each memory access, instead of transferring one bit from the prefetch buffers, DDR memory transfers two. Subsequent developments of DDR technology transferred more and more data at the same physical memory frequency by increasing the prefetch width. DDR2 uses 4-bit prefetch, as does GDDR3. With GDDR4 came 8-bit prefetch.
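The effect of prefetch width on the data rate can be shown with a little arithmetic. A minimal sketch, assuming a hypothetical 200 MHz memory array clock (our illustrative number, not taken from any datasheet) and treating the per-pin data rate as the array clock multiplied by the prefetch width:

```python
# Per-pin data rate as a function of prefetch width (illustrative only).

# Prefetch widths as listed in the text above.
PREFETCH_BITS = {"DDR": 2, "DDR2": 4, "GDDR3": 4, "GDDR4": 8}

ARRAY_CLOCK_MHZ = 200  # hypothetical internal memory array clock


def per_pin_rate_mbps(array_clock_mhz, memory_type):
    """Bits delivered per pin per second (in Mbit/s) for a given array clock."""
    return array_clock_mhz * PREFETCH_BITS[memory_type]


rates = {name: per_pin_rate_mbps(ARRAY_CLOCK_MHZ, name) for name in PREFETCH_BITS}
for name, rate in rates.items():
    print(f"{name}: {rate} Mbit/s per pin")
```

The memory array runs at the same speed in every case; only the amount of data fetched per access, and therefore the interface speed, grows.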

GDDR5

GDDR5 uses 8-bit prefetch, just like GDDR4, but with several innovations. For the first time, GDDR5 uses two clocks, CK and WCK, the latter running at twice the frequency of the former. Commands are transmitted in SDR (single data rate) mode at the CK frequency; address information is transmitted in DDR mode at the CK frequency; and data is transferred in DDR mode at the WCK frequency. In the case of the Radeon HD 4870, which uses 900 MHz GDDR5 memory, commands are sent at 900 MHz SDR, addresses at 900 MHz DDR (1800 MHz effective frequency), and data at 1800 MHz DDR (3600 MHz effective frequency).
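The clock relationships described here reduce to two doublings. A small sketch (the function and key names are ours and purely illustrative):

```python
# Effective GDDR5 transfer rates derived from the CK frequency.

def gddr5_rates(ck_mhz):
    """Return effective rates in millions of transfers per second."""
    wck_mhz = 2 * ck_mhz  # WCK runs at twice the CK frequency
    return {
        "command": ck_mhz,       # SDR on CK: one transfer per CK cycle
        "address": 2 * ck_mhz,   # DDR on CK: two transfers per CK cycle
        "data": 2 * wck_mhz,     # DDR on WCK: two transfers per WCK cycle
    }


hd4870 = gddr5_rates(900)
print(hd4870)
```

For the 900 MHz memory of the Radeon HD 4870 this gives the 900, 1800 and 3600 MHz effective figures quoted in the text.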

This approach reduces signal quality problems during command and address transmission while still providing very high data rates. Unfortunately, higher frequencies also mean higher error rates. Therefore, to ensure reliable data transmission, GDDR5 uses an error detection mechanism of the kind used in networks. If the memory controller detects an error, the command that produced it is executed again.

So AMD and nVidia have taken very different paths to increasing the memory bandwidth of their GPUs, and these choices reflect different views of GPU design. nVidia, committed to the principle of a huge monolithic die, can afford a 512-bit memory bus, avoiding the chip supply issues that inevitably accompany the introduction of a new memory technology. In contrast, with the RV770 AMD is focusing its efforts on GPUs with a reduced die size for high-end graphics cards. As AMD engineers told us, the first version of the RV770 was supposed to have no more than 480 stream processors (ALUs), but the size of the GPU turned out to be dictated by the number of interconnects needed for the memory interfaces.

So AMD was able to offer the GPU that everyone is now familiar with, with 800 stream processors that are almost "free" in terms of die area. In the previous generation, nVidia had to give up the 384-bit bus when moving from the G80 (90 nm) to the G92 (65 nm). Therefore, there is every chance that the same will happen with the 512-bit bus. This time, however, nVidia can rely on GDDR5 to make up for the lost bandwidth.
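The detect-and-retry idea can be sketched in a few lines. This is only an illustration: the CRC-8 polynomial (0x07), the FlakyChannel class and the retry limit are our assumptions for the example, not a description of the actual GDDR5 controller logic.

```python
# Toy model of error detection with retry (all names are hypothetical).

def crc8(data: bytes, poly: int = 0x07) -> int:
    """Bitwise CRC-8, MSB first, initial value 0 (polynomial is an assumption)."""
    crc = 0
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 0x80:
                crc = ((crc << 1) ^ poly) & 0xFF
            else:
                crc = (crc << 1) & 0xFF
    return crc


class FlakyChannel:
    """Corrupts the first `failures` transfers by flipping a single bit."""

    def __init__(self, failures: int):
        self.failures = failures

    def __call__(self, payload: bytes) -> bytes:
        if self.failures > 0:
            self.failures -= 1
            return bytes([payload[0] ^ 0x01]) + payload[1:]
        return payload


def transfer_with_retry(payload: bytes, channel, max_attempts: int = 4):
    """Repeat the transfer until the received data matches the checksum."""
    expected = crc8(payload)
    for attempt in range(1, max_attempts + 1):
        received = channel(payload)
        if crc8(received) == expected:
            return received, attempt
    raise RuntimeError("transfer still failing after retries")


data, attempts = transfer_with_retry(b"\x12\x34\x56\x78", FlakyChannel(failures=1))
print(data.hex(), attempts)
```

In this toy run the first transfer is corrupted, the checksum mismatch is caught, and the second attempt succeeds. The price of reliability is the repeated transfer, which is why a marginal overclock can lose performance before it produces visible errors.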


Graphics card specifications
| Specification | HD 4850 | HD 4870 | GTX 260 | GTX 280 |
|---|---|---|---|---|
| GPU frequency | 625 MHz | 750 MHz | 576 MHz | 602 MHz |
| Stream processor (ALU) frequency | 625 MHz | 750 MHz | 1,242 MHz | 1,296 MHz |
| Memory frequency | 1,000 MHz | 900 MHz | 999 MHz | 1,107 MHz |
| Memory bus width | 256 bits | 256 bits | 448 bits | 512 bits |
| Memory type | GDDR3 | GDDR5 | GDDR3 | GDDR3 |
| Memory size | 512 MB | 512 MB | 896 MB | 1,024 MB |
| Stream processors (ALU) | 800 | 800 | 192 | 240 |
| Texture units | 40 | 40 | 64 | 80 |
| ROPs | 16 | 16 | 28 | 32 |
| Theoretical performance | 1 TFlops | 1.2 TFlops | 715 GFlops | 933 GFlops |
| Memory bandwidth | 64 GB/s | 115.2 GB/s | 111.9 GB/s | 141.7 GB/s |
| Number of transistors | 956 million | 956 million | 1,400 million | 1,400 million |
| Process technology | 55 nm | 55 nm | 65 nm | 65 nm |
| Die area | 260 mm² | 260 mm² | 576 mm² | 576 mm² |
| Generation | 2008 | 2008 | 2008 | 2008 |
| Shader Model support | 4.1 | 4.1 | 4.0 | 4.0 |
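The two derived rows of the table (theoretical performance and memory bandwidth) can be reproduced from the other rows. A rough sketch: the per-clock factors (2 flops per ALU for the Radeons, 3 for the GeForces; 2 transfers per clock for GDDR3, 4 for GDDR5) are our assumptions, chosen because they match the published figures.

```python
# Reproducing the derived specification rows (factors are assumptions).

CARDS = {
    "HD 4850": dict(alus=800, alu_mhz=625, flops_per_clock=2,
                    mem_mhz=1000, bus_bits=256, transfers_per_clock=2),
    "HD 4870": dict(alus=800, alu_mhz=750, flops_per_clock=2,
                    mem_mhz=900, bus_bits=256, transfers_per_clock=4),
    "GTX 260": dict(alus=192, alu_mhz=1242, flops_per_clock=3,
                    mem_mhz=999, bus_bits=448, transfers_per_clock=2),
    "GTX 280": dict(alus=240, alu_mhz=1296, flops_per_clock=3,
                    mem_mhz=1107, bus_bits=512, transfers_per_clock=2),
}


def gflops(card):
    """Peak arithmetic rate: ALUs x clock x flops per ALU per clock."""
    return card["alus"] * card["alu_mhz"] * card["flops_per_clock"] / 1000


def bandwidth_gb_s(card):
    """Peak memory bandwidth: bus width in bytes x effective transfer rate."""
    bytes_per_transfer = card["bus_bits"] / 8
    return bytes_per_transfer * card["mem_mhz"] * card["transfers_per_clock"] / 1000


for name, card in CARDS.items():
    print(f"{name}: {gflops(card):.0f} GFlops, {bandwidth_gb_s(card):.1f} GB/s")
```

These are theoretical peaks only; the game tests show how much of that potential each architecture actually delivers.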

The difference between the Radeon HD 4870 and the "younger" 4850 model boils down to two characteristics: theoretical performance, which increased by 20% due to the higher clock speed (the number of stream processors did not change, unlike in nVidia's approach), and memory bandwidth, which almost doubled (an 80% increase). The reason for the latter, as we have seen, is the use of GDDR5, whose effective frequency is almost double that of the GDDR3 used on the Radeon HD 4850. GDDR5 memory is expensive, but not as expensive as switching to a 448-bit or even 512-bit bus, which is what it takes to achieve equivalent memory bandwidth with GDDR3, as nVidia decided to do, not to mention the higher power consumption (memory chips plus controller). As a result, the bandwidth of the Radeon HD 4870 is almost the same as that of the GeForce GTX 260 (in fact, 3% higher).

The superiority of the 4870 in theoretical performance over the GTX 260 is impressive given that the 4870's die area is only 45% of the nVidia GPU!
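The percentages quoted in the last two paragraphs are easy to verify from the specification table. A quick check (the "GFlops per mm²" density figure is our own derived metric, and it is only a rough one, since it counts the full die area of the GTX 260):

```python
# Checking the quoted percentages against the specification table.

def pct_change(new, old):
    """Relative change of `new` over `old`, in percent."""
    return (new / old - 1) * 100


perf_gain = pct_change(1200, 1000)        # HD 4870 vs HD 4850, GFlops
bw_gain = pct_change(115.2, 64)           # HD 4870 vs HD 4850, GB/s
bw_vs_gtx260 = pct_change(115.2, 111.9)   # HD 4870 vs GTX 260, GB/s
die_ratio_pct = 260 / 576 * 100           # RV770 die area relative to GT200

# Theoretical GFlops per square millimetre of die (hypothetical metric).
density = {"HD 4870": 1200 / 260, "GTX 260": 715 / 576}
density_ratio = density["HD 4870"] / density["GTX 260"]

print(f"{perf_gain:.0f}% {bw_gain:.0f}% {bw_vs_gtx260:.1f}% "
      f"{die_ratio_pct:.0f}% {density_ratio:.1f}x")
```

By this rough measure the RV770 packs several times more theoretical arithmetic throughput into each square millimetre than the GT200 does in its GTX 260 configuration.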

On the other hand, we cannot fail to mention the obvious limitation that stands out in the specifications: the amount of memory, which on the 4870 is limited to 512 MB. This is just over half of what the GTX 260 has, and even if we consider that the Radeon suffers less than the GeForce from accessing system RAM when the frame buffer runs short, we should look carefully at how performance changes as the resolution increases. It should be noted that some manufacturers, such as PowerColor, have already announced a 1 GB version of the Radeon HD 4870, but these will not be available until late July.

Like the Radeon HD 3870, but unlike the Radeon HD 4850, the Radeon HD 4870 has a dual-slot cooler. This should allow it to cope more easily with the heat generated by the RV770, or at least expel that heat outside the case. But the similarities end there. The HD 4870 requires not one but two six-pin PCI Express power plugs, and the card has grown noticeably longer: 9.5" (24.1 cm), compared to 9" (22.8 cm) for the Radeon HD 3870 and 10.5" (26.7 cm) for the GeForce GTX 260. In addition, instead of the Arctic Cooling cooler with straight blades, a more traditional model is used.

We used a Sapphire video card for our tests. The bundle includes a 2 GB USB stick in the brand's colors, the PowerDVD 7 OEM player with 6-channel audio support, Cyberlink DVD Suite 5 and the full version of 3DMark06. The USB stick is a nice bonus, but it would be good to get a game in the package, since these video cards are aimed at gamers.

We used the same test configuration as in the Radeon HD 4850 article.


| Component | Configuration |
|---|---|
| Motherboard | Asus P5E3 Deluxe (Intel X38) |
| CPU | Intel Core 2 Quad QX6850 (3 GHz) |
| Memory | Crucial 2 x 1GB DDR3 1333 MHz 7-7-7-20 |
| HDD | Western Digital WD5000AAKS |
| Optical drive | Asus 12x DVD |
| Power supply | Cooler Master Real Power Pro 850W |

Software

| Component | Version |
|---|---|
| OS | Windows XP, Vista, Vista SP1 |
| NVidia drivers | ForceWare 177.39 beta (9800 GTX+); ForceWare 177.34 beta (GTX 260 and 280); ForceWare 175.16 WHQL (9800 GTX, 9800 GX2, 8800 Ultra) |
| ATI drivers | Catalyst 8.7 beta (HD 4850, HD 4870); Catalyst 8.6 WHQL (HD 3870); Catalyst 8.5 WHQL (HD 3870 X2) |

Test results




Unsurprisingly, in Flight Simulator the Radeon HD 4870 performs at the same level as the 4850 (the cards use the same driver). Both cards outperformed the HD 3870 and even the GeForce GTX 200 series, but lagged behind most GeForce 8 and 9 models. Flight Simulator X is barely playable on the 4870 with current drivers.





This is the first real test for the Radeon HD 4870, and the graphics card did not disappoint us. It took third place in the ranking, behind the GeForce GTX 280 and 9800 GX2 but ahead of the GTX 260! The average lead over the GTX 260 was 10%, even at a resolution of 2,560 x 1,600. And ATI's leader outperforms the Radeon HD 4850 by 17%.




Test Drive Unlimited confirmed the very strong impression of the Radeon HD 4870 that we got from Call of Duty 4. The video card not only outperformed the GTX 260 at all resolutions, but also finished only slightly behind the GTX 280 at 1,920 x 1,200 with anti-aliasing, and even overtook nVidia's leader at 2,560 x 1,600 with anti-aliasing (while remaining playable at this resolution). The performance is impressive, especially considering that the 4870 has only half the memory of the GTX 280. We think that with the Radeon 4800 line, anti-aliasing is back in play.




Crysis is no exception: the Radeon HD 4870 is again positioned between the GeForce GTX 260 and the GTX 280. Moreover, it lags behind the latter by less than 11% on average (with the exception of 2,560 x 1,600 with anti-aliasing), although the HD 4870 costs half as much.




When we ran World in Conflict, which is quite resource-hungry, we found that the Radeon HD 4870 got closer to the GeForce GTX 280, and its lead over the GTX 260 increased, averaging 17%, a clear advantage. The performance with anti-aliasing enabled was again amazing. The 4870 caught up with the GTX 280 at 2,560 x 1,600 with anti-aliasing despite its limited 512 MB of memory (even though the game was still unplayable at these settings).





The Radeon HD 4870 performed slightly worse in Supreme Commander. However, the difference is small: on average the card slightly outperformed the GeForce GTX 260, and although the new ATI card lagged behind the GTX 260 at a resolution of 1,680 x 1,050, the frame rate there is good enough for gaming.





Unreal Tournament III is one of the first games to give the Radeon HD 4870 trouble. As you can see, it lagged behind the GeForce 9800 GTX at 1,680 x 1,050. However, the video card rose in the ranking as the resolution increased, and it consistently outperformed the GeForce GTX 260 in this game (by 13% on average).





The Radeon HD 4870 again has difficulty in Mass Effect. Its lead over the overclocked Asus Radeon HD 4850 was 33%, but that was not enough to overtake the GeForce GTX 260, or even the GeForce 9800 GTX+ at resolutions of 1,680 x 1,050 and 1,920 x 1,200.





On the other hand, the Radeon HD 4870 won convincingly in GRID, posting the top scores at resolutions of 1,680 x 1,050 and 1,920 x 1,200. Only at 2,560 x 1,600 did the video card take second place, behind the GeForce GTX 280! We were impressed. It looks like AMD's new product handles racing games with ease.

We were disappointed with the high power consumption of the Radeon HD 4850 in idle mode, although under load everything turned out to be quite normal. Let's see how the Radeon HD 4870 performs in this regard.

Unfortunately, the idle power consumption of the Radeon HD 4870 turned out to be even higher than that of the 4850, by 22 W (measurements were taken at the power supply). When the computer displays the Windows desktop (idle mode), the GPU clock drops to 550 MHz, and after a 3D application is launched, the video card runs the GPU at its maximum frequency of 750 MHz. Compared to the Radeon HD 3870, the difference under load is 47 W, which means a 40% increase in PC power consumption! So the low power efficiency of the Radeon HD 4800 line is confirmed in practice. Another problem: unlike the HD 4850, the Radeon HD 4870 does not have particularly low power consumption in games. In fact, its power consumption is even higher than that of the GeForce GTX 260. The high performance can justify this, but what about the low transistor count and the advanced process technology?


The Radeon HD 4850 runs very quietly in idle mode, but the "older" model cannot match it: despite the switch to a dual-slot cooling system, the Radeon HD 4870 is louder in idle mode according to our measurements, although in practice the difference is small. At 1,045 rpm, only 12% of maximum, the fan can still be considered quiet.

The situation changes under load. While the HD 4870 is clearly not quite as quiet as the Radeon HD 3870, the graphics card performs well (the fan speeds up to 1,600 rpm) and is much quieter than the noisy GeForce GTX 260.


While not breaking the Radeon HD 4850's temperature record, the HD 4870 runs extremely hot when idle, with GPU temperatures reaching 70 °C. This does not lead to any problems, however, and do not forget that thanks to the dual-slot design of the cooling system, hot air is expelled from the computer (unlike with the Radeon HD 4850), which is always good.

Under load, the heatsink copes with its job: the temperature does not rise much, at least not as much as on the "younger" Radeon HD 4850.

Our conclusion about the new Radeon HD 4870 is simple: we have a great high-end graphics card in front of us! With the same architecture and most of the strengths of the Radeon HD 4850, it sits in the top category in terms of performance and price. It is only 6% faster than the GeForce GTX 260 on average (across most tests), but its MSRP is $299, which is $150 less than the nVidia card! Even nVidia's top model, the GeForce GTX 280, equipped with more transistors, twice as much memory and higher clock speeds, did not come out much ahead. It offers only 13% more performance than the Radeon HD 4870, although it costs twice as much.

However, a number of things make the situation less than ideal. First, the Radeon HD 4870 suffers somewhat from the presence of another attractive AMD card, the HD 4850, which offers a better price/performance ratio (just 23% less performance at a 60% lower price). Second, AMD has completely reversed its strengths and weaknesses compared to the previous generation, in particular the Radeon HD 3870: the performance of the Radeon HD 4870 with anti-aliasing enabled is quite good (despite the 512 MB of memory), but the card consumes much more power both in idle mode and under load (more than the GeForce GTX 260). And the new model can hardly be called quiet, although the card runs much quieter than the GeForce GTX 260 and does not heat up the inside of the case.
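The price/performance comparison in the last two paragraphs can be put into numbers. A small sketch that uses only the relative figures quoted in the text (6% faster than the GTX 260, GTX 280 13% faster at twice the price, HD 4850 23% slower at a 60% lower price); the index itself, normalized so that the HD 4870 equals 1.0, is our own illustrative metric:

```python
# Performance per dollar, normalized to the Radeon HD 4870 (illustrative).

BASE_PRICE = 299.0  # HD 4870 MSRP in dollars, as quoted in the text

# name: (performance relative to HD 4870, price in dollars)
cards = {
    "HD 4870": (1.00, BASE_PRICE),
    "GTX 260": (1 / 1.06, BASE_PRICE + 150),  # HD 4870 is 6% faster, $150 cheaper
    "GTX 280": (1.13, 2 * BASE_PRICE),        # 13% faster at twice the price
    "HD 4850": (0.77, 0.40 * BASE_PRICE),     # 23% slower at a 60% lower price
}


def value_index(perf, price):
    """Performance per dollar relative to the HD 4870."""
    return (perf / price) / (1.0 / BASE_PRICE)


value = {name: value_index(perf, price) for name, (perf, price) in cards.items()}
ranking = sorted(value, key=value.get, reverse=True)

for name in ranking:
    print(f"{name}: {value[name]:.2f}")
```

On these figures the HD 4850 offers nearly twice the performance per dollar of the HD 4870, which in turn is well ahead of both GeForce cards.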

Now nVidia will have to react and quickly drop prices for the GeForce GTX 260, which is also good news, although we haven't seen a price cut yet. As for AMD, it needs to take a few more steps for the new generation to be fully successful, namely to release a high-end card (on two RV770s, as expected) that receives an equally enthusiastic reception. And this will not be so easy to do.

With the same qualities as the Radeon HD 4850, but for a higher price, the Radeon HD 4870 graphics card was able to compete directly with the GeForce GTX 260 - it is slightly faster and much cheaper than the offer from NVidia, and without excessive noise. Despite the high power consumption, if the promised prices become reality, you can hardly find a better choice on the market.

Benefits.

  • 6% faster performance than GeForce GTX 260;
  • much lower price than GeForce GTX 260;
  • less noise compared to GeForce GTX 260.

Disadvantages.

  • high power consumption under load, but especially in idle mode;
  • the price / performance ratio is not as good as the Radeon HD 4850.

For all its achievements, the Radeon HD 4870 graphics card receives the "Recommended Purchase" award.




The graphs show the average results for each video card and each game. If a card could not run a game at some resolution or with anti-aliasing, it received a zero result, which seriously affected video cards with 512 MB of memory (or less) at a resolution of 2,560 x 1,600 with anti-aliasing (except for the Radeon HD 4870), as well as the Radeon HD 3870 X2, which does not support anti-aliasing in Mass Effect. It should be noted that our Asus sample of the 4850 had a flaw: it could not work properly in Race Driver: GRID, which improved the relative results of the other video cards.

And again “Crysis”, but this time using DirectX 10. As you can see, only at low resolution can the Radeon HD 4870 compete with its GeForce rival; at 1600x1200 it is already noticeably behind. Playing at maximum settings in DirectX 10 consumes a lot of video memory, and it looks like the GeForce GTX 260 benefits from its "extra" megabytes.
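How much a single zero result pulls down an average is easy to demonstrate. A minimal sketch with made-up frame rates (the numbers are hypothetical and do not come from our measurements):

```python
# Averaging results where a failed run counts as zero (hypothetical data).

def average_fps(results):
    """Mean frame rate; None marks a mode the card could not run (scored 0)."""
    scores = [0.0 if r is None else r for r in results]
    return sum(scores) / len(scores)


def average_fps_ignoring_failures(results):
    """Mean over successful runs only, for comparison."""
    scores = [r for r in results if r is not None]
    return sum(scores) / len(scores)


# Four test modes; the card fails in the third one.
runs = [60.0, 40.0, None, 20.0]

with_zero = average_fps(runs)
without_zero = average_fps_ignoring_failures(runs)
print(with_zero, without_zero)
```

With the failure counted as zero, the average drops from 40 to 30 frames per second, so a single unplayable mode is enough to reshuffle the overall ranking.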

Conclusion

In this article, we got acquainted with the current flagship from AMD and compared it with the younger model based on the same graphics chip and with a competitor from NVIDIA. The GeForce GTX 260 can hardly be called a direct competitor, though, since its price is still slightly higher. Initially, when the new models began to appear on the market, the price difference was even greater, but AMD's pricing policy forced NVIDIA to reduce the cost of its new products. This is no coincidence, because in many tests the Radeon HD 4870 is only slightly inferior to the GeForce GTX 260, and sometimes even overtakes it. But its larger memory capacity and good overclocking potential help the GeForce GTX 260 confidently hold the lead in many applications. Additional benefits of the Radeon HD 4870 include lower power consumption.

As for the relationship between the Radeon HD 4850 and the Radeon HD 4870, we sometimes observe a significant gap between the older model and the younger one. Since the difference in chip frequency is not that great, it is no doubt the faster memory that makes such results possible. On the other hand, with a performance difference of 20-30%, the junior card is much cheaper. The same situation once existed with the Radeon HD 3850 and Radeon HD 3870, and the price difference between those cards then gradually shrank to a very insignificant amount. So the Radeon HD 4850 is without a doubt a pretty good model in its price range.

The Radeon HD 4870 also outperforms its closest competitor on price. However, if the GeForce GTX 260 is replaced by a new, slightly cheaper and more power-efficient board, all the slim advantages of the Radeon will fade into the background. So far there are no such options, and the Radeon HD 4870 occupies its own price niche. The answer to NVIDIA's ultra-fast, hot and exorbitantly priced graphics cards will be the Radeon HD 4870 X2 based on two RV770s. This card will probably take the leader's place by right, because the RV770 has potential, as we have already seen.

On the NVIDIA video adapter market.

In this article we will consider direct price competitors - Radeon HD 4870 and GeForce GTX 260. It should be noted that initially the price of the second model was higher, but recently NVIDIA has reduced the cost of new products.

MSI R4870-T2D512 (Radeon HD 4870 512MB GDDR5)

We'll start with an AMD graphics card. The architecture of the new RV770 graphics chip has already been discussed in the article on the Radeon HD 4850, so let's go straight to the card, which differs from the younger representative of the new Radeon line in higher operating frequencies and GDDR5 memory.

MSI R4870-T2D512 graphics card comes in a proprietary box with a handle for easy portability.


The complete set is as follows:

  • DVI / D-Sub adapter;
  • HDTV adapter;
  • DVI / HDMI adapter;
  • S / Video-RCA adapter;
  • CrossFire bridge for connecting video cards;
  • Driver disc;
  • Installation instructions.
The card follows the reference design exactly, and all that sets it apart from the competition is a sticker on the cooling system in the same style as the box.



The older model, unlike the Radeon HD 4850, has a large dual-slot cooler with heat pipes and a turbine-type fan.



Due to the increased power consumption, the design of the cooler has been changed compared to the Radeon HD 3870, and it is now more reminiscent of the cooling system of the Radeon HD 2900: the aluminum base contacts only the memory and power elements, while a separate copper insert over the core transfers heat through two heat pipes to thin aluminum fins. The structure is covered from above with a translucent shroud, and the air that the fan blows through the heatsink fins is expelled from the case. Cooling the memory and power elements separately from the GPU makes some sense. More than once, on video cards with a small turbine cooler that was barely able to cool the core, we saw the hot heatsink heat the memory chips up instead of cooling them. On this card, with no single shared heatsink, the temperature of every element no longer depends on the hottest component on the board.

As you may remember, in one of our news items we wrote about how a thermal imager revealed the hottest element of this video card. It turned out to be the VITEC 59PR9853 multiphase inductor of the digital power control circuit. The point is that it does not come into contact with the aluminum base of the cooler: there is a rectangular cutout at this spot. That is, this inductor and the two adjacent ones are cooled only by the air drawn past them by the turbine fan. Apparently, the engineers decided that this would be enough. Fans of alternative cooling systems can experiment with installing additional heatsinks on the inductors. Perhaps this will improve cooling efficiency, raise the overclocking ceiling, or extend the lifespan of an overclocked video card.

The Radeon HD 4870 card has a more complex power subsystem than that of the youngest member of the new family, and has acquired a second 6-pin auxiliary power connector. The power consumption of these solutions is at the level of 160 W.


The RV770 graphics chip remains the same:


The memory consists of eight GDDR5 chips manufactured by Qimonda with a total capacity of 512 MB. The memory bus is 256 bits wide, as before.

Monitoring and overclocking

The operating frequencies of the Radeon HD 4870 video card are 750/3600 MHz (core / memory). The real physical frequency of the GDDR5 memory is 900 MHz (a quarter of the effective value), but some utilities may report the operating frequency as 1800 MHz. By the way, judging by the marking of the memory chips, which ends in 40x (1.0 ns), the installed memory is rated for 1 GHz, which leaves a small margin for overclocking. As for monitoring software, so far only the latest versions of GPU-Z and AMD GPU Clock Tool show accurate temperature data. The second utility also allows you to overclock the card beyond the limits available in ATI Overdrive.
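The link between the access time in the chip marking and the rated frequency is a simple reciprocal. A short sketch (the function name is ours; the 1.0 ns and 900 MHz figures are the ones given in the text):

```python
# Converting a memory chip's cycle time rating to a clock frequency.

def rated_clock_mhz(cycle_time_ns):
    """A cycle time of t nanoseconds corresponds to 1000 / t MHz."""
    return 1000.0 / cycle_time_ns


rated = rated_clock_mhz(1.0)                 # chips marked 1.0 ns
stock = 900.0                                # stock physical memory clock, MHz
headroom_pct = (rated / stock - 1) * 100     # margin before exceeding the rating

print(f"rated {rated:.0f} MHz, headroom {headroom_pct:.1f}%")
```

So the nominal margin is about 11%, or 100 MHz of physical clock; anything beyond that runs the chips above their rating.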

In the GPU-Z utility, the operating memory frequency is determined as 900 MHz, but the monitoring displays 1800 MHz.


However, using some utilities can lead to instability. When we tried to warm up the video card with the ATI Tool test, the system froze completely on switching to the GPU-Z monitoring window. But when that window stayed in the background, everything worked fine. The younger model had no such problems, even though the same driver version was used.

In 2D, the operating frequency of the chip is reduced to 500 MHz, but even so the temperature remains at 80 °C.


Under load, the temperature does not rise much, a maximum of 5-7 degrees, and then, according to the readings of secondary sensors. The data of the first sensor displayed by GPU-Z changes minimally. Such a minimum delta is realized by increasing the fan speed. Its operation cannot be called silent, but in idle time the turbine can hardly be heard against the background of the power supply. Compared to the Radeon HD 4850, this cooling system is quieter. However, under load, with high revs, the characteristic whistle of the turbine is already clearly audible.

There is no software for controlling the fan speed yet, but there is a simple workaround, first described on the Guru3D forum: editing the Catalyst Control Center profile. More details can be found there. In our case, with the turbine speed raised to 50%, the temperature under load did not exceed 76 °C.

As last time, we decided to compare the reference cooling system with the Zalman VF900-Cu. The stock cooler was removed completely, the Zalman was installed, and an 80 mm case fan running at 2000 rpm was attached to blow over the power circuitry. However, even at maximum fan speed, the chip temperature under the Zalman VF900-Cu exceeded 100 °C after two minutes of the ATITool test! The result was instability and freezes; the Crysis benchmark hung on the second pass. This speaks well of the stock cooler, even at the minimum speed at which it runs by default. The strangest thing is that the same RV770 chip on the Radeon HD 4850, clocked only 125 MHz lower, did not heat above 60 °C under the Zalman VF900-Cu. Here it reached a full 100 °C! Could the higher frequency and supply voltage really make such a difference? Moreover, both the Zalman and the board itself were extremely hot, even though everything had airflow.

This would not have been a problem had overclocking not been limited by cooling. ATI Overdrive lets you overclock the chip only up to 790 MHz. With the AMD GPU Clock Tool, the core could be raised to 830 MHz and even 840 MHz. The card worked in this mode, but not for long: the low turbine speed took its toll. Raising the fan speed would have solved this, but once the Overdrive overclocking limit was exceeded, the custom Catalyst Control Center profile became inactive and the fan speed was reset to nominal. Nothing more powerful than the Zalman VF900-Cu was at hand, so we had to limit the overclock to 790 MHz. At this frequency the profile with the increased fan speed stayed active, which let the card work stably. The memory was overclocked to 1100 MHz (4400 MHz).


The chip still has overclocking headroom: with better cooling and a better overall thermal regime for the board, another 40-50 MHz or so can be gained at the standard supply voltage. At least on this sample from MSI.

ASUS ENGTX260 / HTDP / 896M (GeForce GTX 260 896MB GDDR3)

The ASUS ENGTX260 / HTDP / 896M video card is based on a "cut-down" GT200 chip with two disabled TPC clusters. As a result, we have 192 stream processors, 64 texture units and 28 blending units (ROPs). The memory bus has also been narrowed to 448 bits and the memory size reduced to 896 MB. Thanks to these cuts, the power consumption of the video card has dropped to 182 W, which is slightly more than that of the GeForce 8800 GTX.
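The unit counts quoted above follow from how GT200 is partitioned. The sketch below assumes the commonly documented GT200 layout, which the review itself does not spell out: 24 stream processors and 8 texture units per TPC, and 4 ROPs, a 64-bit memory channel and 128 MB of memory per ROP partition. Under that assumption the full chip (GeForce GTX 280) has 10 TPCs and 8 ROP partitions, and the GeForce GTX 260 figures correspond to 8 TPCs and 7 partitions. The class and constant names are illustrative.

```python
from dataclasses import dataclass

# Assumed GT200 building blocks (not stated in the review).
SP_PER_TPC = 24
TMU_PER_TPC = 8
ROP_PER_PARTITION = 4
BUS_BITS_PER_PARTITION = 64
MB_PER_PARTITION = 128  # as fitted on these two cards


@dataclass
class Gt200Config:
    tpcs: int
    rop_partitions: int

    def summary(self):
        """Derive the headline specifications from the enabled blocks."""
        return {
            "stream_processors": self.tpcs * SP_PER_TPC,
            "texture_units": self.tpcs * TMU_PER_TPC,
            "rops": self.rop_partitions * ROP_PER_PARTITION,
            "bus_bits": self.rop_partitions * BUS_BITS_PER_PARTITION,
            "memory_mb": self.rop_partitions * MB_PER_PARTITION,
        }


gtx_280 = Gt200Config(tpcs=10, rop_partitions=8)
gtx_260 = Gt200Config(tpcs=8, rop_partitions=7)

print("GTX 280:", gtx_280.summary())
print("GTX 260:", gtx_260.summary())
```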

The video card is shipped in a box of a standard design for this manufacturer, designed for cards based on NVIDIA solutions.


Inside, the components are packed in small boxes with the ASUS logo. Although everything is made of plain black cardboard, it looks quite stylish.


The package bundle is quite rich:
  • DVI / D-Sub adapter;
  • HDTV adapter;
  • Molex to 6-pin power adapter;
  • Driver disc;
  • Installation instructions;
  • CD bag;
  • Mousepad.
The last two accessories look no less stylish than the packaging. But although a mouse pad made of embossed faux leather looks impressive, it is unlikely to be useful: such an uneven surface is not the best choice for an optical mouse.


ASUS managed to stand out even in the design of a reference product. The company's traditional girl with a bow is depicted on a "camouflage" background. A khaki video card certainly looks unusual.



Like the higher-end GeForce GTX 280, this video card is completely encased in the metal shell of the cooling system: the front side is covered by a dual-slot cooler, and the back side by a heatsink plate.

The video adapter has two 6-pin power connectors. The rear panel interfaces are standard: two DVI ports and a TV-out. Incidentally, for some reason the bundle does not include a DVI / HDMI adapter, which usually comes even with cheaper products. In addition, the accelerator has two MIO connectors (hidden under a rubber cap), which make it possible to build a 3-Way SLI system from three such video cards.


Unfortunately, we did not have the opportunity to disassemble this sample, but the board itself differs little from the PCB of the GeForce GTX 280, apart from two unpopulated memory chip positions.

Monitoring and overclocking

The operating frequencies of the ASUS ENGTX260 fully match the reference specifications. The core runs at 576 MHz, the stream processors at 1242 MHz, and the memory at 1998 MHz. Under load the core heats up to 77 °C, while the fan speed does not rise above 1000 rpm. Despite such a low speed, the noise of the turbine becomes clearly noticeable. Still, the cooling system of this video card is quieter than those of the GeForce GTX 280, Radeon HD 4850 and Radeon HD 4870.


To reduce power consumption in 2D mode, the card's frequencies drop to 300/100/200 MHz (core / stream processors / memory). The temperature then falls to 50-60 °C, and the cooling fan becomes completely inaudible. Under light load, the video card can raise its frequencies to 400/100/594 MHz without switching to the maximum values.


A characteristic feature of the power-saving technology in the new NVIDIA video cards is delayed mode switching when the load drops. When you exit a 3D application, the video card first switches to 400/100/594 MHz, and only after some time to the minimum values.

The new video cards can be overclocked with RivaTuner 2.09. The same utility can control the fan speed of the cooling system. However, asynchronous overclocking of the core and shader domains is still unreliable; our sample ran stably at 691/1458/2520 MHz.


The fan speed was raised for stable operation. Note the interesting way RivaTuner handles clock speeds. If you set the core clock to 692 MHz, the shader processors switch synchronously to 1512 MHz, while the core actually runs at 691 MHz. If you set, for example, 688 MHz, the shader units switch to 1458 MHz, and the core still stays at 691 MHz. Since prolonged operation at 1512 MHz caused artifacts, we settled on the lower value. With the turbine at 90% speed, the chip temperature during overclocking stayed within 70 °C.

The resulting overclock is slightly better than that of the GeForce GTX 280. It will be interesting to see whether the lower-end model can catch up with the higher-end one and make up for its fewer computing units with higher frequencies.

Comparative table of characteristics of video cards

| Characteristic | Radeon HD 4870 | Sapphire Radeon HD 4850 | XpertVision GeForce GTX 280 | ASUS ENGTX260 | ZOTAC GeForce 9800GX2 | ASUS EN8800GTS |
|---|---|---|---|---|---|---|
| Processor codename | RV770 | RV770 | GT200 | GT200 | 2 x G92 | G92 |
| Technological process, nm | 55 | 55 | 65 | 65 | 65 | 65 |
| Core frequency, MHz | 750 | 625 | 602 | 576 | 602 | 650 |
| Unified shader unit frequency, MHz | 750 | 625 | 1296 | 1242 | 1512 | 1625 |
| Number of unified shader units | 800 | 800 | 240 | 192 | 2 x 128 | 128 |
| Number of texture units (TMU) | 40 | 40 | 80 | 64 | 2 x 64 | 64 |
| Blending units (ROP) | 16 | 16 | 32 | 28 | 2 x 16 | 16 |
| Memory type | GDDR5 | GDDR3 | GDDR3 | GDDR3 | GDDR3 | GDDR3 |
| Memory frequency, MHz | 3600 | 1986 | 2214 | 1998 | 1998 | 1944 |
| Memory interface width, bit | 256 | 256 | 512 | 448 | 2 x 256 | 256 |
| Memory size, MB | 512 | 512 | 1024 | 896 | 2 x 512 | 512 |
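
The memory frequency and bus width rows of the table determine each card's peak memory bandwidth. A minimal sketch of that calculation follows. It assumes that the listed memory frequencies are effective data rates, that the first column of values (750 MHz core, GDDR5) belongs to the Radeon HD 4870, and that 1 GB/s means 10^9 bytes per second. For the GeForce 9800GX2 the figure is per GPU.

```python
def bandwidth_gb_s(effective_mhz, bus_bits):
    """Peak memory bandwidth in GB/s (1 GB = 10**9 bytes)."""
    bytes_per_transfer = bus_bits / 8
    return effective_mhz * 1e6 * bytes_per_transfer / 1e9


# (effective memory frequency in MHz, bus width in bits), taken from the table.
cards = {
    "Radeon HD 4870": (3600, 256),
    "Radeon HD 4850": (1986, 256),
    "GeForce GTX 280": (2214, 512),
    "GeForce GTX 260": (1998, 448),
    "GeForce 9800GX2 (per GPU)": (1998, 256),
    "GeForce 8800GTS": (1944, 256),
}

for name, (mhz, bits) in cards.items():
    print(f"{name:<28}{bandwidth_gb_s(mhz, bits):7.1f} GB/s")
```

By this estimate the GDDR5 of the Radeon HD 4870 gives it about 115 GB/s on a 256-bit bus, slightly more than the GeForce GTX 260 gets from its 448-bit GDDR3 interface.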

Test stand:

  • Processor: Core 2 Duo E8400 (3 GHz @ 4 GHz, FSB 445 MHz);
  • Cooler: Thermalright Ultra-120 eXtreme;
  • Motherboard: Gigabyte GA-P35-S3;
  • Memory: OCZ PC6400 (2x2GB, [email protected] MHz, 5-5-5-15);
  • Hard drive: Hitachi T7K250 (320GB);
  • Sound Card: Creative Audigy 4 (SB0610);
  • Power supply: Chieftec CFT-1000G-DF (1000 W);
  • Operating system: Windows XP SP2, Windows Vista Ultimate;
  • GeForce video card driver: ForceWare 175.16 (GF 8800GTS), ForceWare 177.41 (GF GTX 260 / GTX 280), ForceWare 175.19 (GF 9800GX2);
  • Radeon Graphics Driver: Catalyst 8.6.
32-bit operating systems were used, so despite the 4 GB of installed memory only 3.5 GB was actually available. A traditional set of gaming tests was used. Note that almost all games were tested during real gameplay. All test details are described below for each game separately. Anti-aliasing was enabled only in games that support it natively; it was not forced through the drivers. The test configuration did not change throughout the tests, which made it possible to combine all previously obtained results in common charts.

Test results in DirectX 9

We'll start with a synthetic test. When comparing video cards of different generations and architectures, we focus on gaming applications, so we use only one synthetic test from Futuremark, still the most popular one.


In this test suite, neither Radeon card performs well. But this is only a synthetic test; how things turn out in real games we will see below.

S.T.A.L.K.E.R. (DX9)


Graphics settings are at maximum. Testing was carried out on the WarPig level, which is full of objects, explosions and flashes. For a graphics card, this is one of the most demanding scenes in the application. Since even a short episode cannot be replayed identically, however hard one tries, the test was repeated five times. The charts are based on the average results.
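
Averaging repeated runs, as described above, can be sketched as follows. The frame rates below are hypothetical placeholders, not measurements from this review; the spread is reported alongside the mean to show how much a non-repeatable scene varies between passes.

```python
from statistics import mean, stdev


def summarize_runs(fps_runs):
    """Return (average, sample standard deviation, spread in % of the average)."""
    avg = mean(fps_runs)
    spread = stdev(fps_runs)
    return avg, spread, spread / avg * 100


# Hypothetical average fps values from five passes of the same scene.
runs = [61.0, 58.5, 63.2, 59.8, 60.5]

avg, spread, spread_pct = summarize_runs(runs)
print(f"average {avg:.1f} fps, spread {spread:.1f} fps ({spread_pct:.1f}%)")
```

If the spread between passes is comparable to the gap between two cards, the difference between them should be treated as within measurement noise.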





The Radeon's lag in the previous games has turned into a confident lead in this one. The Radeon HD 4870 not only outperforms the GeForce GTX 260, but even shows a minimal advantage over the GeForce GTX 280 at 1280x1024. At high resolution, NVIDIA's new flagship is slightly faster. The GeForce GTX 260 slightly outperforms the Radeon HD 4850, but it can compete with the Radeon HD 4870 only when overclocked.

Legend: Hand of God (DX9)

An impressive game, although a typical Diablo clone. The graphics are beautiful, but the system requirements are disproportionately high. That only makes it more interesting to see which card shows the best results in a game with an unoptimized graphics engine. Judging by our previous test, SLI technology does not work in this game.


All graphics settings are at maximum. Filtering and anti-aliasing were enabled from the game menu.





Without anti-aliasing, the performance of all video cards is about the same; only the GeForce GTX 280 stands out. With anti-aliasing the difference is also small, but the overclocked GeForce GTX 260 joins the leaders as well. At nominal frequencies its results are identical to those of the Radeon HD 4870.

Race Driver: GRID (DX9)

A popular car simulator based on the modernized DIRT game engine.


Graphics settings are maximum. For each mode, the San Francisco ring track was replayed three times. The anti-aliasing mode was MSAA4x multisampling. There are no results for Radeon HD 4850 and GeForce 8800GTS for these modes.





Here it is, Radeon's first triumph. The Radeon HD 4870 graphics card also outperforms the GeForce GTX 260 and even competes with the GeForce GTX 280.

Crysis (DX9)

Crysis has been featured twice in our testing. Traditionally, first we will consider the performance of video cards in this game under DirectX 9.


Graphics settings were set to High. The standard GPU benchmark was used for the tests.





Without anti-aliasing, the Radeon HD 4870 is 4% faster than the GeForce GTX 260, but with anti-aliasing enabled it is already 1-4% slower. When overclocked, the GeForce GTX 260 shows an unconditional advantage over the overclocked Radeon HD 4870, reaching 15-18% in heavy modes.

Test results in DirectX 10

Devil May Cry 4 (DX10)

A new game from Capcom. A fine example of good hardware optimization and great visuals.


Graphics at maximum value. A separate pre-release game benchmark was used, consisting of 4 scenes. The final results are the arithmetic mean of these four scenes.





The behavior of Radeon graphics cards in this game is more than mysterious. Enabling multisampling yields a 10% performance improvement. This should not be possible, of course, so the picture is probably being rendered incorrectly. Incidentally, there are reports that multisampling does not work on Radeon cards in this game. However, even the extra 10% does not help the Radeon HD 4870 outperform the GeForce GTX 260. If we take the results without MSAA as the correct ones, things look sad for the new AMD video cards.

Assassin's Creed (DX10)


Testing in this game consists of walking a fixed route that includes a walk over rooftops, side streets and a small square crowded with NPCs. The results are averages over three runs. At the high resolution of 1600x900 the game simply does not allow anti-aliasing to be enabled, so results for this mode are not shown on the chart. In the settings, a slider sets the multisampling level, which takes three discrete values; it is not known which mode the maximum anti-aliasing quality corresponds to. Note that the game was tested without patch 1.02, which reportedly removes DirectX 10.1 support and thereby lowers Radeon performance. So, in theory, the Radeon results in this game should please us.
Albeit by a small margin, the Radeon HD 4870 outperforms the GeForce GTX 260 in all modes except the last. At 1600x1200 with anti-aliasing enabled, the test hung on the Radeon video card. The cause may be insufficient video memory leading to a software crash. Or perhaps the problem lies in the drivers (this would not be the first time), because this did not happen on the Radeon HD 3870/3850 with Catalyst 7.12. On the GeForce cards, which have more memory, no problems were observed. As for the overclocking results, here again the Radeon HD 4870 cannot compete with the GeForce GTX 260.

Conclusions

The test results once again lead to a disappointing conclusion: AMD has not started a revolution with the release of its new video cards. Performance has grown, and the performance drop when anti-aliasing is enabled has disappeared. But as we can see when comparing the Radeon HD 4870 with its direct competitor, the GeForce GTX 260, sometimes one card leads and sometimes the other. There are games where the Radeon results are much lower (S.T.A.L.K.E.R., TimeShift), but there are also games where the Radeon HD 4870 competes with the more expensive GeForce GTX 280 (GRID) and even slightly outperforms it (Call of Duty 4).

The top AMD model has lower power consumption on its side. Even so, the GeForce GTX 260 is quieter thanks to its better cooling system. The GeForce GTX 260 also has good overclocking potential, which brings it close to the results of the higher-end GeForce GTX 280 at nominal frequencies. Overclocking does not fully make up for the missing computing units, but the financial benefit is tangible. As for overclocking the Radeon HD 4870, the results would have been better with adequate cooling, and this should not be forgotten.

Looking at the results, raising the chip frequency by 5% and the memory frequency by 22% yields a 3-7% gain in games. This indicates that performance is limited by the graphics chip, while memory bandwidth is already more than sufficient. If we assume that the core could be overclocked to 830-840 MHz, we would gain about the same amount again, i.e. the overall increase would be 6-14%. For the GeForce GTX 260, the gain from overclocking (core frequency up by 20%, shader units by 17% and memory by 26%) almost always reaches 15-20% (the only exception is Call of Duty 4, where we gain only 10%). So, in theory, even with the Radeon HD 4870 overclocked to 840 MHz, the GeForce GTX 260 would still come out ahead.
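
The percentages quoted above can be reproduced from the frequencies reported in the overclocking sections. The sketch below does that and repeats the article's rough projection, which simply assumes that a second, similar step in core frequency would add about the same 3-7% again. That doubling is the article's simplification rather than a measured result, and the variable names are illustrative.

```python
def pct_increase(stock, overclocked):
    """Relative frequency increase in percent."""
    return (overclocked / stock - 1) * 100


# (stock MHz, overclocked MHz) as reported in the overclocking sections.
radeon_hd_4870 = {"core": (750, 790), "memory": (3600, 4400)}
geforce_gtx_260 = {"core": (576, 691), "shader": (1242, 1458), "memory": (1998, 2520)}

for card, clocks in (("Radeon HD 4870", radeon_hd_4870),
                     ("GeForce GTX 260", geforce_gtx_260)):
    for domain, (stock, oc) in clocks.items():
        print(f"{card} {domain}: +{pct_increase(stock, oc):.1f}%")

# Gain in games measured at 790 MHz, and the article's projection for
# 830-840 MHz ("the same amount again").
measured_gain = (3, 7)
projected_gain = tuple(2 * g for g in measured_gain)
print(f"Projected Radeon HD 4870 gain: {projected_gain[0]}-{projected_gain[1]}%")
```

Even the upper end of that projection stays below the 15-20% that the overclocked GeForce GTX 260 gains in most games, which is the basis for the comparison drawn here.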

But this is only a logical assumption. The overclocking results of each individual sample may differ in either direction. In all games both video cards deliver sufficient performance, so the choice depends more on personal preference. The one argument in favor of the GeForce GTX 260 is that at nominal frequencies you get a quieter card. Besides, it has more memory, which will be very useful at high resolutions and in heavy modes.

As for the other video cards that have already appeared in our testing, the Radeon HD 4850 deserves attention. While its price is 50% lower than that of the higher-end model, it is only 20-35% behind in performance. Good overclocking makes up for part of this gap, but you are unlikely to reach the results of the higher-end model, because the Radeon HD 4870 uses faster memory. And if you do overclock, you will have to replace the stock cooling system.
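
The price/performance argument can be made explicit. This is a minimal sketch that uses only the relative figures from the text (price 50% lower, performance 20-35% lower); the function name is illustrative.

```python
def relative_value(perf_deficit_pct, price_discount_pct):
    """Performance per unit of price, relative to the more expensive card (= 1.0)."""
    relative_perf = 1 - perf_deficit_pct / 100
    relative_price = 1 - price_discount_pct / 100
    return relative_perf / relative_price


best_case = relative_value(perf_deficit_pct=20, price_discount_pct=50)
worst_case = relative_value(perf_deficit_pct=35, price_discount_pct=50)
print(f"Radeon HD 4850 value vs Radeon HD 4870: {worst_case:.2f}x to {best_case:.2f}x")
```

In other words, per unit of money the Radeon HD 4850 delivers roughly 1.3 to 1.6 times the performance of the Radeon HD 4870 under these figures.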

The GeForce 8800GTS also looks great, competing successfully with the Radeon HD 4850. Its good overclocking potential lets this accelerator not only overtake its AMD competitor in many cases, but also approach the results of the GeForce GTX 260.

The GeForce GTX 280 is the undisputed single-chip leader. But few will put up with the loud noise of its stock cooling system. We have already questioned whether buying this card makes sense. It is simply a product designed to maintain NVIDIA's leadership in the video adapter market. With its many shortcomings it is a dubious option for a gamer, although it can easily be the choice of a benchmarking enthusiast.

As we have seen, the performance of Radeon graphics cards in some games is questionable. New Catalyst 8.7 drivers were released recently, and they are claimed to increase performance in many applications. Of course, such bold claims often turn out to mean gains of only a couple of percent. Nevertheless, in one of our next articles we will try to compare the performance of the new Radeon video adapters with different drivers. Let's hope that their performance in games like S.T.A.L.K.E.R. and TimeShift will grow, and that the situation with MSAA in Devil May Cry 4 will become clear.


Thanks to the following companies for providing the test equipment:

  • DC-Link, in particular Alexander aka Punisher, for a GeForce GTX 280, GeForce GTX 260, Radeon HD 4850, Radeon HD 4870 and Chieftec CFT-1000G-DF power supply;
  • PCshop Group for the GeForce 9800GX2 video card;
  • STORM store for Core 2 Duo E8400 processor and OCZ PC6400 memory.