D3D RightMark ReadMe (English)

D3D RightMark synthetic tests (DirectX 9)

Introduction

This version of D3D RightMark is distributed without any warranty, expressed or implied. The Authors are not responsible for any damages or losses caused by use of the Software. The current version is provided free of charge; commercial use is permitted only with the written approval of the Authors of the Software.

D3D RightMark currently includes the following synthetic tests:

Pixel Filling Test;
Geometry Processing Speed Test;
Hidden Surface Removal Test;
Pixel Shader Test;
Point Sprites Test.

System Requirements

PC with an Intel-compatible processor, 500 MHz or above.
128 MB RAM or above.
DirectX 9.0 compatible graphics adapter.
Microsoft Windows 98/Me or Microsoft Windows XP or above.
DirectX 9.0 or above.

Remark 1: To obtain full functionality, you need a fully DirectX 9-compatible graphics adapter.

Remark 2: To read the XLS reports generated by the tests, you need Microsoft Office XP or above.

Philosophy of the synthetic tests

The main idea behind all our tests is to focus on the performance of one particular subsystem of the chip. Real applications measure an accelerator's effectiveness as a whole in one practical scenario; synthetic tests instead stress separate aspects of performance. The problem is that a new accelerator is usually released about a year before the applications that can use all its capabilities effectively. Users who want to stay at the leading edge of technology have to buy an accelerator almost blindly, guided only by test results obtained on outdated software. No one can guarantee that the picture will be the same in the games they are waiting for. Apart from the enthusiasts who take this risk, several other categories of people face the same difficulty:

First category - people who do not want to bother with upgrades and who buy a computer in the maximum configuration to use for a long time. It is very important for them that their machines remain suitable for upcoming applications for as long as possible.
Second category - software developers. They have to keep an eye on the capabilities and balance of new accelerators so that they can design and balance both the engine (code) and the content (levels, models) for the hardware that will be widespread by the time their applications reach the market. The synthetic tests will help them choose how to implement their ideas and will keep their imagination within realistic bounds :-).
Third category - IT analysts (for example, from large trading companies) and hardware reviewers, i.e. people who have to estimate the potential of products before they are officially announced.

So, synthetic tests make it possible to estimate the performance and capabilities of separate accelerator subsystems in order to forecast the accelerator's behavior in various applications, both existing ones (an overall estimate of suitability and prospects for a whole class of applications) and ones still in development, provided that a given accelerator demonstrates peculiar behavior in such applications.

Description of the D3D RightMark synthetic tests

Pixel Filling

This test has several functions, namely:

Measurement of frame buffer filling performance
Measurement of the performance of different texture filtering modes
Measurement of how effectively textures of different sizes are handled (caching)
Measurement of how effectively textures of different formats are handled (caching and compression)
Measurement of multitexturing effectiveness
Visual comparison of the implementation quality of various texture filtering modes

The test draws a pyramid whose base lies in the plane of the screen and whose apex is moved away from the viewer as far as possible. Each of its four sides consists of triangles. The small number of triangles avoids any dependence on geometry performance, which is not what is being studied here. From 1 to 8 textures are applied to each pixel during filling. You can also disable texturing (0 textures) and measure only the fill rate with a constant color value. During the test the apex moves around at a constant speed and the base rotates around the Z axis. As a result, the sides of the pyramid take all possible angles of inclination in both planes, while the number of shaded pixels stays constant and every distance from the minimum to the maximum is covered. The inclination of the shaded plane and the distance to the shaded pixels determine the behavior of many filtering algorithms, in particular anisotropic filtering and various modern implementations of trilinear filtering. By rotating the pyramid we put the accelerator under all the conditions that can occur in real applications. This lets us assess the filtering quality in all possible cases and obtain weighted performance data.
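
To illustrate why inclination and distance matter for filtering, here is a minimal sketch of a simplified textbook model, not the test's or any driver's actual algorithm. It assumes the texel footprint of a pixel grows linearly with distance and is stretched by 1/cos(tilt) along the direction of inclination. All function and parameter names are hypothetical.

```python
import math


def footprint(distance, tilt_degrees, texels_per_pixel_at_unit_distance=1.0):
    """Approximate texel footprint (minor, major) of one screen pixel.

    Assumption: the footprint scales linearly with distance, and a surface
    tilted away from the viewer stretches it by 1 / cos(tilt) in one direction.
    tilt_degrees must be below 90.
    """
    minor = texels_per_pixel_at_unit_distance * distance
    major = minor / math.cos(math.radians(tilt_degrees))
    return minor, major


def trilinear_lod(distance, tilt_degrees):
    """MIP level picked from the longer footprint axis (isotropic filtering)."""
    _, major = footprint(distance, tilt_degrees)
    return max(0.0, math.log2(major))


def anisotropy_ratio(tilt_degrees):
    """Ratio of the long to the short footprint axis."""
    minor, major = footprint(1.0, tilt_degrees)
    return major / minor


for tilt in (0, 45, 60, 80):
    print(tilt, round(trilinear_lod(4.0, tilt), 2), round(anisotropy_ratio(tilt), 2))
```

In this model a steeper tilt raises both the MIP level chosen by isotropic filtering and the anisotropy ratio, which is why sweeping the sides of the pyramid through all angles exercises the filtering hardware in many different states.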

The test can be run in different modes: the same operations can be carried out by shaders of different versions or by the fixed-function pipeline inherited from previous DirectX generations. This lets you find the performance gap between different shader versions.

A special texture with different colors and figures makes it easier to examine the quality aspects of filtering and its interaction with full-screen anti-aliasing. MIP levels can be given different colors so that you can assess how they are selected and blended.

The test reports its results in FPS and FillRate. The latter plays two roles. In the no-texture mode it measures the frame buffer write speed, i.e. the number of pixels filled per second (Pixel FillRate). In the texture mode it indicates the number of sampled and filtered texture values per second (Texturing Rate, Texture Fill Rate).
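
As a rough illustration of how the two FillRate readings relate to FPS, the sketch below assumes the rate is simply the number of pixels drawn per frame times the frame rate, multiplied by the texture count in texture mode. The function names and sample numbers are hypothetical and are not taken from the test itself.

```python
def pixel_fill_rate(pixels_per_frame, fps):
    """Pixels written to the frame buffer per second (no-texture mode)."""
    return pixels_per_frame * fps


def texturing_rate(pixels_per_frame, fps, textures_per_pixel):
    """Sampled and filtered texture values per second (texture mode)."""
    return pixels_per_frame * fps * textures_per_pixel


# Hypothetical run: a 1024x768 frame filled exactly once per frame at 500 FPS.
pixels = 1024 * 768
print(pixel_fill_rate(pixels, 500) / 1e6, "Mpixels/s")
print(texturing_rate(pixels, 500, 4) / 1e6, "Mtexels/s with 4 textures")
```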

Geometry Processing Speed

This test measures the geometry processing speed in different modes. We tried to minimize the influence of filling and of the accelerator's other subsystems, and to make the geometry data and its processing as close to real models as possible. The main task is to measure peak geometry performance in different transform and lighting tasks. At present, the test supports the following lighting models (calculated at the vertex level):

Ambient Lighting - simplest constant lighting
1 Diffuse Light
2 Diffuse Lights
3 Diffuse Lights
1 Diffuse + Specular Light
2 Diffuse + Specular Lights
3 Diffuse + Specular Lights
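
The per-vertex lighting models listed above can be sketched as follows. This is only an illustration, assuming a standard Lambert diffuse term and a Blinn-Phong specular term; the shaders used by the test may differ, and every name and constant here is hypothetical.

```python
import math


def normalize(v):
    """Return v scaled to unit length (zero vectors are returned unchanged)."""
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v) if length else tuple(v)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def light_vertex(normal, to_lights, to_viewer, ambient=0.1, shininess=16.0,
                 specular=False):
    """Return a scalar light intensity for one vertex.

    normal and to_viewer are unit vectors; to_lights is a list of unit
    vectors pointing from the vertex toward each light source.
    """
    intensity = ambient
    for to_light in to_lights:
        # Lambert diffuse term, clamped so light from behind adds nothing.
        n_dot_l = max(dot(normal, to_light), 0.0)
        intensity += n_dot_l
        if specular and n_dot_l > 0.0:
            # Blinn-Phong: half vector between the light and viewer directions.
            half = normalize(tuple(l + v for l, v in zip(to_light, to_viewer)))
            intensity += max(dot(normal, half), 0.0) ** shininess
    return intensity


n = (0.0, 0.0, 1.0)
lights = [normalize((0.0, 1.0, 1.0)), normalize((1.0, 0.0, 1.0))]
print(light_vertex(n, [], n))                      # ambient only
print(light_vertex(n, lights[:1], n))              # 1 diffuse light
print(light_vertex(n, lights, n, specular=True))   # 2 diffuse + specular lights
```

Each additional light adds one more pass through the loop body, which is why the per-vertex cost grows from the ambient model to the three-light specular model.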

The test draws several instances of the same high-polygon model. Each instance has its own geometric transformation parameters and its own position relative to the light sources. The model is extremely small (most polygons are comparable to or smaller than a screen pixel), so the resolution and filling do not affect the test results. The light sources move in different directions during the test to cover various combinations of the initial parameters.

There are three levels of scene detail, which determine the total number of polygons transformed in one frame. They are needed to verify that the test results do not depend on the scene or on the frame rate.

The test results are available in FPS and PPS (Polygons Per Second).
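
A minimal sketch of how PPS relates to FPS, assuming PPS is the number of polygons per frame times the frame rate. It also shows one hypothetical way to check that the result stays roughly constant across the three detail levels; the numbers below are invented for illustration and are not measurements.

```python
def polygons_per_second(polygons_per_frame, fps):
    """Polygons transformed per second."""
    return polygons_per_frame * fps


def relative_spread(values):
    """(max - min) / mean: 0 means the values are identical."""
    mean = sum(values) / len(values)
    return (max(values) - min(values)) / mean


# Invented (polygons per frame, FPS) pairs for three detail levels.
runs = [(100_000, 400.0), (200_000, 205.0), (400_000, 101.0)]
pps = [polygons_per_second(polygons, fps) for polygons, fps in runs]
print(pps)
print(relative_spread(pps))
```

A small spread suggests the measurement reflects geometry throughput rather than per-frame overhead.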

Hidden Surface Removal

This test determines which hidden surface and hidden pixel removal techniques are present and estimates their effectiveness, i.e. the effectiveness of work with a traditional depth buffer and the availability and effectiveness of early culling of hidden pixels. The test generates a pseudorandom scene with a given number of triangles, which is rendered in one of three modes:

sorted, front to back order
sorted, back to front order
unsorted

In the second mode (back to front) the accelerator renders all pixels in turn, including hidden ones, if it is based on a traditional or hybrid architecture (a tile-based accelerator can optimize this case as well, but remember that the sorting still takes place, only at the hardware or driver level).

In the first mode (front to back) the accelerator can draw only the small number of visible pixels and reject the others before filling. The third mode (unsorted) is a middle case similar to what the HSR mechanism encounters in real applications that do not optimize the order in which the scene is drawn. To get an idea of the peak effectiveness of the HSR algorithm, compare the results of the first and second modes (the most favorable mode against the least favorable one). Comparing the optimal mode with the unsorted one (the first and third) gives an approximate measure of effectiveness in real applications.
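
The effect of draw order on the work left after the depth test can be shown with a toy model. The sketch below is not the test's code: it assumes a one-dimensional "screen", replaces triangles with constant-depth spans, and counts how many pixels pass the Z test (and so would have to be shaded by an accelerator with ideal early rejection). All names and sizes are hypothetical.

```python
import random


def shaded_pixels(spans, width):
    """Count pixels that pass the depth test when spans are drawn in order.

    Each span is (start, end, depth); a smaller depth is closer to the viewer.
    """
    depth_buffer = [float("inf")] * width
    shaded = 0
    for start, end, depth in spans:
        for x in range(start, end):
            if depth < depth_buffer[x]:  # Z test passes: pixel must be shaded
                depth_buffer[x] = depth
                shaded += 1
    return shaded


rng = random.Random(1)  # fixed seed so the pseudorandom scene is repeatable
width = 256
spans = []
for _ in range(200):
    start = rng.randrange(width - 32)
    spans.append((start, start + 32, rng.random()))

front_to_back = sorted(spans, key=lambda s: s[2])
back_to_front = sorted(spans, key=lambda s: s[2], reverse=True)

print("front to back:", shaded_pixels(front_to_back, width))
print("back to front:", shaded_pixels(back_to_front, width))
print("unsorted:     ", shaded_pixels(spans, width))
```

In this model the front-to-back order shades each covered pixel exactly once, the back-to-front order shades every pixel of every span, and the unsorted order falls in between, mirroring the relationship between the three modes.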

The scene rotates around the Z axis during the test to smooth out any peculiarities of early HSR algorithms, which are mostly based on dividing the frame buffer into zones. As a result, the triangles and their edges take all possible positions.

To check the support for and effectiveness of the Early Z reject (ATI) and Early Z cull (NVIDIA) technologies, which avoid texturing and shader execution for pixels that fail the Z test, there is an option that forces texturing of all triangles in the scene. You can also change the number of rendered triangles to see how the test depends on the chip's other subsystems and on the drivers. The results can be expected to improve as the number of triangles grows, but only up to a certain point, after which the influence of other subsystems on the test may start to rise again. This parameter was introduced so that you can judge the quality of the test with respect to the number of triangles.

Pixel Shading

This test estimates the performance of various 2.0 pixel shaders. With PS 1.1, the execution speed of shaders translated into stage settings was easy to determine, and a test like Pixel Filling run with a large number of textures was sufficient. With PS 2.0 the situation is much more complicated. Per-clock instruction execution and new data formats (floating-point numbers) can create a significant difference in performance, not only between different accelerator architectures but also between combinations of individual instructions and data formats within one chip. We decided to test the pixel processors of modern accelerators with an approach similar to CPU benchmarking, i.e. to measure the performance of the following set of pixel shaders, which have real prototypes and applications:

per-pixel diffuse lighting with per-pixel attenuation - 1 point source
per-pixel diffuse lighting with per-pixel attenuation - 2 point sources
per-pixel diffuse lighting with per-pixel attenuation - 3 point sources
per-pixel diffuse lighting + specular lighting with per-pixel attenuation (1 point source)
per-pixel diffuse lighting + specular lighting with per-pixel attenuation (2 point sources)
per-pixel diffuse lighting + specular lighting with per-pixel attenuation (3 point sources)
marble animated procedural texturing
fire animated procedural texturing

The last two tests implement procedural textures (pixel color values are calculated from a formula), which are an approximate mathematical model of the material. Such textures take little memory (only comparatively small tables used to speed up the calculation of various factors are stored) and support almost unlimited detail. They are easy to animate by changing the basic parameters. It is quite possible that future applications will use such texturing methods as the capabilities of accelerators grow.
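
To make the idea of a formula-driven texture concrete, here is a minimal sketch of a marble-like pattern. It is not the shader used by the test: it assumes a common textbook recipe (a sine wave perturbed by summed lattice noise), a small lookup table of random values, and a hypothetical time parameter for animation. All names and constants are illustrative.

```python
import math
import random

_rng = random.Random(0)
TABLE = [_rng.random() for _ in range(256)]  # small lookup table of random values


def value_noise(x, y):
    """Bilinearly interpolated lattice noise in [0, 1]."""
    xi, yi = math.floor(x), math.floor(y)
    fx, fy = x - xi, y - yi

    def lattice(i, j):
        # Simple hash of the lattice coordinates into the table.
        return TABLE[(i * 57 + j * 131) % 256]

    top = lattice(xi, yi) * (1 - fx) + lattice(xi + 1, yi) * fx
    bottom = lattice(xi, yi + 1) * (1 - fx) + lattice(xi + 1, yi + 1) * fx
    return top * (1 - fy) + bottom * fy


def turbulence(x, y, octaves=4):
    """Sum of noise octaves, each at double the frequency and half the weight."""
    total, scale = 0.0, 1.0
    for _ in range(octaves):
        total += value_noise(x * scale, y * scale) / scale
        scale *= 2.0
    return total


def marble(x, y, time=0.0):
    """Marble intensity in [0, 1]; changing time shifts the veins."""
    return 0.5 + 0.5 * math.sin(4.0 * x + 3.0 * turbulence(x, y) + time)


# The same formula can be sampled at any resolution; here a coarse 4x4 grid.
for row in range(4):
    print([round(marble(col * 0.25, row * 0.25), 2) for col in range(4)])
```

Only the 256-entry table is stored; everything else is computed per pixel, and animation needs nothing more than a different time value.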

The geometry of the test scene is simplified, so dependence on the chip's geometry performance is almost eliminated. There is no hidden surface removal either: all surfaces of the scene are visible at all times. The load falls only on the pixel pipelines.

To check the effectiveness of the FP16 half-precision floating-point format, there is an option that selects one of three types of pixel shaders: base 2.0, where the precision of operations cannot be specified, and two 2.X variants, one forcing 16-bit and the other 32-bit precision of calculations.
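
The difference between 16-bit and 32-bit precision can be illustrated on the CPU. The sketch below uses the IEEE 754 half and single formats from Python's struct module as a stand-in; whether a given accelerator's FP16 arithmetic matches this rounding exactly is an assumption, and the helper names are hypothetical.

```python
import struct


def to_fp16(value):
    """Round a Python float to IEEE 754 half precision and back."""
    return struct.unpack("<e", struct.pack("<e", value))[0]


def to_fp32(value):
    """Round a Python float to IEEE 754 single precision and back."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


x = 0.1
print("fp16 error:", abs(to_fp16(x) - x))
print("fp32 error:", abs(to_fp32(x) - x))

# Half precision has an 11-bit significand, so not every integer
# above 2048 can be represented.
print(to_fp16(2049.0))
```

The larger rounding error of the 16-bit format is the price paid for whatever speed advantage the option reveals.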

Point Sprites

This test measures the performance of just one function: rendering of point sprites used to create particle systems. The test draws an animated particle system resembling a human body. You can adjust the size of the particles (which affects the fill rate) and enable or disable lighting and animation. Geometry processing is very important for particle systems, so we did not separate the two aspects, filling and geometry calculations (animation and lighting). Instead, the load on one part of the accelerator or the other can be changed by varying the sprite size and switching animation and lighting on or off.

Contact Information

You can send any comments, suggestions and bug reports to:

Alexander Medvedev unclesam@ixbt.com

Philipp Gerasimov philger@mail.wplus.net

Copyrights

The names of actual companies and products mentioned herein may be the trademarks of their respective owners.