The following tests were carried out using the Intel GPA Frame Analyzer and Monitor. Although I don’t have an Intel card, it makes it possible to instrument the DX9 API and have a look at how FSX works under the hood. I previously tried the AMD equivalent but never really got it to work. The later versions of the AMD tool don’t support DX9; an old version did, but it had to be hacked to install on Windows 7 and then never seemed to work.
The results were not what I expected.
I took off in the default Cessna and paused it somewhere near the Millennium Dome. It was in a dive so that there wasn’t much sky and I ensured that I could see all the way to Heathrow. I saved the flight so I could reload at will.
My interest was in how it painted the screen, not how it dealt with movement and texture loading, so using a paused flight seemed the best way to see how a frame was drawn.
What I then did was to alter the sliders for Scenery and Autogen, load the saved flight and observe, using the Intel HUD, how many drawcalls etc. occurred when displaying the paused screen. I also saved each frame for later inspection with the Frame Analyzer.
Throughout this process I kept the Level of Detail range as Medium – I later experimented with altering that too.
|None||Sparse||Normal||Dense||Very dense||Extremely dense|
Test 1 Autogen None Scenery Very Sparse
284 Draw Calls per frame
Bit dull though! That’s our baseline.
So that’s drawing the sky, river, terrain, a few clouds and the 3D cockpit.
It took about 12ms to draw the frame (that’s with tracing on, mind).
The cockpit was very good: only 8 draw calls and about 1ms.
The terrain took 214 drawcalls and 10.3ms (PS=2.7 VS= 2.3)
The clouds were 8 drawcalls and 0.4ms (PS =0.057, VS=0.038)
The rest was drawing the window and the on-screen text.
NOTE: This is only looking at the drawing and preparation commands that occur per frame – the texture buffer loading is costed elsewhere (I think), and probably the vertex buffers too.
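The Test 1 figures can be tallied to see what is left over for the window and the on-screen text. This is a minimal Python sketch using only the numbers quoted above; the dictionary layout and function name are my own, and the 12ms total is the approximate figure, taken with tracing on.

```python
# Test 1 figures as quoted above: (draw calls, milliseconds).
FRAME_DRAW_CALLS = 284
FRAME_MS = 12.0  # "about 12ms", measured with tracing on

measured = {
    "cockpit": (8, 1.0),
    "terrain": (214, 10.3),
    "clouds": (8, 0.4),
}

def remainder(groups, total_calls, total_ms):
    """Return (draw calls, ms) not accounted for by the measured groups."""
    calls = sum(c for c, _ in groups.values())
    ms = sum(t for _, t in groups.values())
    return total_calls - calls, round(total_ms - ms, 1)

rest_calls, rest_ms = remainder(measured, FRAME_DRAW_CALLS, FRAME_MS)

for name, (calls, ms) in measured.items():
    share = 100 * calls / FRAME_DRAW_CALLS
    print(f"{name:8s} {calls:4d} calls  {ms:5.1f} ms  {share:5.1f}% of calls")
print(f"{'rest':8s} {rest_calls:4d} calls  {rest_ms:5.1f} ms")
```

On these numbers the terrain accounts for roughly three quarters of the draw calls (214 of 284) and most of the frame time, leaving about 54 calls for everything else.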
Test 2 Autogen None Scenery Sparse
337 Draw Calls per frame
OK, we gain the Millennium Dome, a power station, Tower Bridge, the London Eye, the Houses of Parliament and Heathrow Airport.
There are other buildings following the river to the left – I think about 18 or 19 in total.
The Millennium Dome is one drawcall.
Tower Bridge seems to be one too.
I can find at least 4 for the power stations – one for each chimney, which seems odd – where is the famous batching??
Test 3 Autogen None Scenery Normal
480 Draw Calls per frame
We gain Canary Wharf, Centre Point, the NatWest Tower (whatever it’s called now) and some other buildings.
Test 4 Autogen None Scenery Dense
We gained a cargo ship – in London, really? Quite a few office blocks too, including two grim white ones straight ahead. Do they exist?
787 Draw Calls per frame
It’s worth mentioning that the frame time is (only) 17ms.
It’s 3 draw calls for the two white flats and 3 for the brown building to their right. It’s one drawcall for the sides of both white buildings, one for the two roofs and the base of one; the base of the other is a separate call?!
The ship is 1546 primitives, but at least it is only one draw call.
Test 5 Autogen None Scenery Very Dense
1090 Draw Calls per frame
More buildings…mainly in the distance. We have added 210 drawcalls mainly for the bunch of small grey squares in the distance?! I guess that’s life.
Notice the two blocks of flats in the foreground – these were 3 drawcalls, as the tops are different whilst the sides are the same.
Test 6 Autogen None Scenery Extremely Dense
1106 Draw Calls per frame
I cannot tell the difference.
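To see where the draw calls pile up as the Scenery slider moves, the six counts recorded so far can be turned into per-step increases. A small Python sketch; the list and function name are mine, the counts are those from Tests 1 to 6 with Autogen at None.

```python
# Draw calls per frame with Autogen at None, as recorded in Tests 1 to 6.
draw_calls = [
    ("Very Sparse", 284),
    ("Sparse", 337),
    ("Normal", 480),
    ("Dense", 787),
    ("Very Dense", 1090),
    ("Extremely Dense", 1106),
]

def step_deltas(series):
    """Return (setting, extra draw calls versus the previous setting)."""
    return [
        (name, calls - prev_calls)
        for (_, prev_calls), (name, calls) in zip(series, series[1:])
    ]

deltas = step_deltas(draw_calls)
for name, extra in deltas:
    print(f"{name:16s} +{extra}")
```

The two big jumps are Normal to Dense (+307) and Dense to Very Dense (+303). The final step adds only 16 calls, which fits with the two screens looking the same.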
Test 7 Autogen Sparse Scenery Dense
Compare this with test 4 – I wound back scenery for this test.
1609 Draw Calls per frame
It’s added a lot. How efficiently though? It’s taken about 800 drawcalls. Surely it only takes a few for the trees? I read how it batched up texture drawcalls, didn’t I?
Here is one draw call highlighted in pink. There are 2 trees in foreground and 1 further back?
Here is another tree write, which worked much, much better.
There are about 10 tree texture sheets used for the whole frame, all 512×512. They contain images of 4-6 trees each.
What’s going on?
I think that the batching is organised per tile (cell); I’m not sure of the exact terminology.
That works efficiently for tiles like the one shown above – a good clear view at a nice angle. However, there is no batching across distant cells, which may have only one or two trees visible.
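To make that suspicion concrete, here is a small Python sketch of the difference between batching per cell and batching per texture sheet. It is not how FSX is implemented: the tree list, cell ids and sheet names are invented for illustration, and the only idea carried over from the capture is that trees sharing a cell and a texture sheet seem to go out in one draw call.

```python
from collections import defaultdict

# Hypothetical visible trees as (cell id, texture sheet) pairs:
# one near cell with a good view, plus distant cells showing one or two trees.
visible_trees = (
    [("near", "sheet_a")] * 55
    + [("near", "sheet_b")] * 16
    + [("far_1", "sheet_a")] * 2
    + [("far_2", "sheet_a")] * 1
    + [("far_3", "sheet_b")] * 1
)

def count_batches(trees, key):
    """Group trees by key(tree); each group stands for one draw call."""
    batches = defaultdict(int)
    for tree in trees:
        batches[key(tree)] += 1
    return dict(batches)

# Batching per (cell, sheet), which is what the capture suggests.
per_cell = count_batches(visible_trees, key=lambda t: t)
# Batching per sheet across all cells, the alternative I was expecting.
per_sheet = count_batches(visible_trees, key=lambda t: t[1])

print(len(per_cell), "draw calls when batching per cell and sheet")
print(len(per_sheet), "draw calls when batching per sheet across cells")
```

With this made-up scene the same 75 trees need 5 calls per cell but only 2 per sheet. The sketch only counts calls; merging geometry across cells would have costs of its own that it does not model.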
These writes are all trees (a tree has 4 primitives, i.e. triangles).
So we see 15 trees, 7 trees, 55 trees, 71 trees, then 3 trees, 7 trees, 26, then 1, then 1. Note that the time in microseconds doesn’t vary linearly with the number of trees. The 76 trees in drawcall 1321 take 35µs, compared with the single trees in drawcalls 1334 and 1335, which took about 26µs.
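The per-tree cost behind those two figures can be worked out directly. A minimal Python sketch, assuming the times are in microseconds and reading the 26 as the time for each single-tree call; the function names are mine.

```python
TRIANGLES_PER_TREE = 4  # as stated above: a tree is 4 primitives

def trees_in_call(primitive_count):
    """Number of trees drawn by a call, given its primitive (triangle) count."""
    return primitive_count // TRIANGLES_PER_TREE

def cost_per_tree(call_time_us, tree_count):
    """Average time per tree for one draw call, in microseconds."""
    return call_time_us / tree_count

# Figures quoted above, taking the times as microseconds.
big_call = cost_per_tree(35, 76)    # draw call 1321
single_call = cost_per_tree(26, 1)  # draw calls 1334 and 1335

print(f"76-tree call: {big_call:.2f} us per tree")
print(f"1-tree call:  {single_call:.2f} us per tree")
print(f"ratio: {single_call / big_call:.0f}x")
```

On these figures a tree drawn on its own costs roughly 56 times as much as one drawn in the 76-tree call, which suggests that most of the time is a fixed cost per draw call rather than a cost per tree.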