I think we can safely assume that, as new scenery appears, Flight Simulator needs to store the vertices of objects within the world. Also, as scenery gets closer, Flight Simulator needs to increase the number of trees and possibly the detail of buildings (if saved with LOD enabled).
When using buffer pools
Flight Simulator allocates a number of large dynamic vertex buffers, each sized according to the PoolSize configuration parameter. I shall call these shared buffers.
These are created in D3DPOOL_DEFAULT, in video card memory rather than PC memory. However, since they are dynamic and easily addressable by the PC, the address space of the FSX process may well grow by the same amount, but this doesn’t mean that the data is stored in PC memory or that any PC memory is used by the allocation. This is a subtle point.
Objects that are bigger than RejectThreshold bypass the pool and get their own dedicated buffer. In this case the buffer is sometimes static and sometimes dynamic. I don’t know why.
Smaller objects go into the large shared vertex buffers and so are addressed by an offset into one of them.
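The routing rule described above can be sketched in a few lines. This is only an illustration under assumptions: the class and field names are hypothetical, the sizes are made-up values, and the "first shared buffer with room" choice is a guess, since how FSX actually picks a shared buffer is not visible from the logs.

```python
from dataclasses import dataclass, field

@dataclass
class Placement:
    kind: str       # "shared" or "dedicated"
    buffer_id: int  # index of the buffer that holds the object
    offset: int     # byte offset inside that buffer (0 for dedicated)
    size: int

@dataclass
class Router:
    # Hypothetical values; the real ones come from PoolSize and RejectThreshold.
    pool_size: int = 2 * 1024 * 1024
    reject_threshold: int = 128 * 1024
    use_pools: bool = True
    shared_fill: list = field(default_factory=list)  # bytes used per shared buffer
    dedicated_count: int = 0

    def place(self, size: int) -> Placement:
        # Objects bigger than RejectThreshold (or bigger than a whole pool
        # buffer), and every object when pools are off, get a dedicated buffer.
        limit = min(self.reject_threshold, self.pool_size)
        if not self.use_pools or size > limit:
            self.dedicated_count += 1
            return Placement("dedicated", self.dedicated_count - 1, 0, size)
        # Assumption: take the first shared buffer that still has room.
        for i, used in enumerate(self.shared_fill):
            if used + size <= self.pool_size:
                self.shared_fill[i] = used + size
                return Placement("shared", i, used, size)
        # No room anywhere: create another shared buffer.
        self.shared_fill.append(size)
        return Placement("shared", len(self.shared_fill) - 1, 0, size)
```

With pool_size=1000 and reject_threshold=300, an object of 200 bytes lands in shared buffer 0 at offset 0, an object of 400 bytes gets a dedicated buffer, and a following object of 300 bytes lands in shared buffer 0 at offset 200.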
Trying some longer flights (the logs are too large to post!) I observed that:
- Over the first 5 minutes or so the number of shared buffers increased. Starting above London they reached about 40 (each of 2MB) and then stabilised. I flew for a further half an hour and never saw another shared buffer created.
- None of the shared buffers were released and I didn’t see any DISCARD locks. All locks were NOOVERWRITE (or zero, see below).
I therefore think that space within these buffers must be tracked and reused by FSX itself, so that when an object goes out of view its area of the shared buffer is added to some form of free list.
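To make the free list idea concrete, here is a minimal first-fit sketch with merging of adjacent free ranges. It is not FSX's algorithm, which I cannot see; the class name and the first-fit policy are assumptions chosen only to show how space in a shared buffer could be reused without ever releasing the buffer.

```python
class SharedBufferSpace:
    """Tracks free byte ranges inside one shared buffer (hypothetical model)."""

    def __init__(self, size):
        self.size = size
        self.free = [(0, size)]  # sorted list of (offset, length) ranges

    def alloc(self, length):
        """Return the offset of the first free range that fits, or None."""
        for i, (off, flen) in enumerate(self.free):
            if flen >= length:
                if flen == length:
                    del self.free[i]  # range used up completely
                else:
                    self.free[i] = (off + length, flen - length)
                return off
        return None  # a caller would try another shared buffer

    def release(self, offset, length):
        """Return a range to the free list and merge touching neighbours."""
        self.free.append((offset, length))
        self.free.sort()
        merged = [self.free[0]]
        for off, flen in self.free[1:]:
            last_off, last_len = merged[-1]
            if last_off + last_len == off:
                merged[-1] = (last_off, last_len + flen)
            else:
                merged.append((off, flen))
        self.free = merged
```

In a scheme like this, one large allocation can fail even when the total free space is sufficient, because the free space is split into smaller ranges. That is the kind of inefficiency that keeping large objects out of the shared buffers would avoid.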
Using the Intel tool I noticed that the draw calls are organised by texture. Draw call batching only seems to operate within a single terrain cell. Within this they are then grouped by shared buffer. Trees seemed to use only a small number of the buffers, so I guess that there is some strategy for allocating objects to buffers, either by type of object or by size.
Over the first 5 minutes of the flight I noticed some locks without flags, e.g.:
00002095 62.69193268  STEVE!!! 3DDevice::CreateVertexBuffer BUF28 158400 8 0 0
00002096 62.69855881  STEVE!!! Vertex BufferLock BUF17 966144 65536 0
00002097 62.69995880  STEVE!!! 3DDevice::CreateVertexBuffer BUF29 265184 520 0 0
These were not frequent but can stall the CPU if the GPU is still using the buffer. I cannot think of an explanation for why this occurs!
That aside, the strategy seems sound and RejectThreshold allows for fine tuning. What I cannot comment on is the efficiency of the space tracking algorithm within a shared buffer. If it is flawed it is most likely to be affected by large allocations, and so tuning RejectThreshold should help.
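For anyone wanting to scan a long log for these locks, a small script along the following lines could do it. The field order is an assumption: I am guessing that the numbers follow the Direct3D 9 argument order (Length, Usage, FVF, Pool for CreateVertexBuffer; offset, size, flags for Lock) and are printed in decimal. The flag values are the standard d3d9 constants; the function name is hypothetical.

```python
import re

# Standard Direct3D 9 constants.
D3DUSAGE_DYNAMIC = 0x0200
D3DLOCK_NOOVERWRITE = 0x1000
D3DLOCK_DISCARD = 0x2000

# Assumed layout: name, length, usage, fvf, pool.
CREATE_RE = re.compile(r"CreateVertexBuffer\s+(\S+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)")
# Assumed layout: name, offset, size, flags.
LOCK_RE = re.compile(r"BufferLock\s+(\S+)\s+(\d+)\s+(\d+)\s+(\d+)")

def classify(line):
    """Return a tuple describing a create or lock line, or None otherwise."""
    m = CREATE_RE.search(line)
    if m:
        name, length, usage = m.group(1), int(m.group(2)), int(m.group(3))
        kind = "dynamic" if usage & D3DUSAGE_DYNAMIC else "static"
        return ("create", name, length, kind)
    m = LOCK_RE.search(line)
    if m:
        name = m.group(1)
        offset, size, flags = (int(m.group(i)) for i in (2, 3, 4))
        if flags == 0:
            mode = "no flags"
        elif flags & D3DLOCK_DISCARD:
            mode = "DISCARD"
        elif flags & D3DLOCK_NOOVERWRITE:
            mode = "NOOVERWRITE"
        else:
            mode = "other"
        return ("lock", name, offset, size, mode)
    return None

SAMPLE = [
    "00002095 62.69193268  STEVE!!! 3DDevice::CreateVertexBuffer BUF28 158400 8 0 0",
    "00002096 62.69855881  STEVE!!! Vertex BufferLock BUF17 966144 65536 0",
    "00002097 62.69995880  STEVE!!! 3DDevice::CreateVertexBuffer BUF29 265184 520 0 0",
]

for entry in map(classify, SAMPLE):
    print(entry)
```

If the field guess is right, usage 8 is write-only without the dynamic bit (a static buffer) and 520 is 0x208, write-only plus dynamic, so the three sample lines would show one static and one dynamic dedicated buffer being created around a flagless lock on BUF17.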
I don’t actually see any immediately obvious reason for larger shared buffers being less efficient in any way.
When not using buffer pools (UsePools=0)
When using BufferPools the balance between shared and dedicated buffers can be altered using RejectThreshold. UsePools=0 is effectively the same as reducing the threshold to zero. Every object then gets its own dedicated vertex buffer and there are no shared buffers at all.
Interestingly, these dedicated buffers aren’t always static. As in the UsePools=1 scenario, some of the buffers are static and some dynamic, for a reason that isn’t immediately obvious to me.
This setting shouldn’t use more video memory than using BufferPools. Each object requires the space it needs plus some per-buffer overhead. When using shared vertex buffers there is less buffer overhead but there will be empty space within each shared buffer. Depending on the memory allocation strategy and reporting tools, the video card might report a higher memory high-water mark, but the used space shouldn’t be higher.
There is no reason (other than a bug in FSX or the Video card) for this setting to be less stable than using pools. There is no obvious race going on.
Using a shared buffer versus a dedicated buffer trades the FSX space allocation strategy against that of the video card plus the overhead of creating a buffer. If the video card is faster, that is a win each time an object is created (comes into view!) or changes in size (LOD/trees increase). The gain depends on the elapsed time the two methods take. I have no proof that one is faster than the other.
When it comes to drawing an object, a dedicated buffer is sometimes static, which will be a performance gain. On the other hand there will be more buffer switching, which is a negative. However, if the GPU is much faster than the CPU then this doesn’t really matter; the GPU just needs to be fast enough. On my PC UsePools=0 is slower on the GPU.
RejectThreshold with UsePools=1 permits the tuning of this balance between shared and dedicated buffers. UsePools=0 just sets one extreme.