Buffer Pools Part 2

I think we can safely assume that as new scenery appears, Flight Simulator needs to store the vertices of objects within the world. Also, as scenery gets closer, Flight Simulator needs to increase the number of trees and possibly the detail of buildings (if they were saved with LOD enabled).

When using buffer pools

Flight Simulator allocates a number of large dynamic vertex buffers, each sized according to the PoolSize configuration parameter. I shall call these shared buffers.

These are created in D3DPOOL_DEFAULT, in video card memory rather than PC memory. However, since they are dynamic and easily addressable by the PC, the memory address space of the FSX process may well increase by the same amount. This doesn’t mean that the data is stored in PC memory or that any PC memory is used by the allocation. This is a subtle point.

Objects that are bigger than RejectThreshold bypass the shared buffers and get their own dedicated buffer. In this case the buffer is sometimes static and sometimes dynamic. I don’t know why.

Smaller objects go into one of the large shared vertex buffers and so are addressed by an offset into it.
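The routing rule described above can be sketched in a few lines. This is only a model of the behaviour I observed, not FSX code: the function name `choose_buffer` is made up for illustration, and treating an object exactly equal to the threshold as "shared" is an assumption.

```python
def choose_buffer(object_bytes: int, use_pools: bool, reject_threshold: int) -> str:
    """Return which kind of vertex buffer an object would land in.

    Hypothetical model: objects bigger than reject_threshold bypass the
    shared buffers; with pools disabled everything is dedicated.
    """
    if not use_pools:
        return "dedicated"
    if object_bytes > reject_threshold:
        return "dedicated"
    return "shared"


if __name__ == "__main__":
    # 131072 is just an example threshold, not a recommended value.
    for size in (65536, 158400, 265184):
        print(size, choose_buffer(size, True, 131072))
```

In this model a threshold of zero with pools enabled sends every object to a dedicated buffer, which is the same outcome as disabling pools.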

Trying some longer flights (the logs are too large to post!) I observed that:

  • Over the first 5 minutes or so the number of shared buffers increased. Starting above London they reached about 40 (each of 2MB) and then stabilised. I flew for a further half an hour and never saw another shared buffer created.
  • None of the shared buffers were released and I didn’t see any DISCARD locks. All locks were NOOVERWRITE (or zero, see below).

I therefore think that space within these buffers must be tracked and reused by FSX itself, so that when an object goes out of view its area of the shared buffer is added to some form of free list.
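To make the free list idea concrete, here is a minimal sketch of one way such tracking could work. I have no visibility of what FSX actually does, so everything here is an assumption: the class name `SharedBuffer`, the first-fit search and the merging of adjacent holes are all my own choices for illustration.

```python
class SharedBuffer:
    """Toy model of one shared vertex buffer with a first-fit free list."""

    def __init__(self, size: int) -> None:
        self.size = size
        # Sorted list of (offset, length) holes; initially one big hole.
        self.free = [(0, size)]

    def alloc(self, length: int):
        """Return the offset of a reserved range, or None if nothing fits."""
        for i, (off, hole) in enumerate(self.free):
            if hole >= length:
                if hole == length:
                    del self.free[i]
                else:
                    # Carve the allocation from the front of the hole.
                    self.free[i] = (off + length, hole - length)
                return off
        return None

    def release(self, offset: int, length: int) -> None:
        """Give a range back and merge it with any adjacent holes."""
        self.free.append((offset, length))
        self.free.sort()
        merged = [self.free[0]]
        for off, ln in self.free[1:]:
            last_off, last_len = merged[-1]
            if last_off + last_len == off:
                merged[-1] = (last_off, last_len + ln)
            else:
                merged.append((off, ln))
        self.free = merged
```

With a scheme like this the buffer itself is never released or discarded. Space freed by an object that goes out of view is simply handed to the next object that fits, which would be consistent with seeing only NOOVERWRITE locks.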

Using the Intel tool I noticed that the draw calls are organised by texture. Draw call batching only seems to operate within a single terrain cell. Within this they are then grouped by shared buffer. Trees seemed to use only a small number of the buffers, so I guess that there is some strategy for allocating objects to buffers, either by type of object or by size.

Over the first 5 minutes of the flight I noticed some locks without flags, e.g.

00002095    62.69193268    [4148] STEVE!!! 3DDevice::CreateVertexBuffer BUF28 158400 8 0 0

00002096    62.69855881    [4148] STEVE!!! Vertex BufferLock BUF17 966144 65536 0

00002097    62.69995880    [4148] STEVE!!! 3DDevice::CreateVertexBuffer BUF29 265184 520 0 0

These were not frequent, but a lock without flags will stall the CPU if the GPU is using the buffer. I cannot think of an explanation for why this occurs!
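The numbers in log lines like these can be decoded with the Direct3D 9 flag values. The sketch below assumes that the logged fields follow the argument order of the underlying calls (length, usage, FVF, pool for CreateVertexBuffer; offset, size, flags for Lock). That ordering is an assumption about the logger, and the helper names are made up for illustration. The constants are the standard values from the Direct3D 9 headers.

```python
# Direct3D 9 constants (values from d3d9types.h).
D3DUSAGE_WRITEONLY = 0x0008
D3DUSAGE_DYNAMIC = 0x0200
D3DLOCK_NOOVERWRITE = 0x1000
D3DLOCK_DISCARD = 0x2000


def describe_usage(usage: int) -> str:
    """Decode the usage argument of CreateVertexBuffer."""
    kind = "dynamic" if usage & D3DUSAGE_DYNAMIC else "static"
    access = "write-only" if usage & D3DUSAGE_WRITEONLY else "read/write"
    return f"{kind}, {access}"


def describe_lock(flags: int) -> str:
    """Decode the flags argument of a vertex buffer Lock."""
    names = []
    if flags & D3DLOCK_DISCARD:
        names.append("DISCARD")
    if flags & D3DLOCK_NOOVERWRITE:
        names.append("NOOVERWRITE")
    return "|".join(names) if names else "no flags"


if __name__ == "__main__":
    print(describe_usage(8))    # usage field logged for BUF28
    print(describe_usage(520))  # usage field logged for BUF29
    print(describe_lock(0))     # flags field logged for the BUF17 lock
```

Under that assumption, a usage of 8 is a static write-only buffer and 520 (512 + 8) is a dynamic write-only one, while a lock value of 0 carries neither DISCARD nor NOOVERWRITE.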

That aside, the strategy seems sound and RejectThreshold allows for fine tuning. What I cannot comment on is the efficiency of the space tracking algorithm within a shared buffer. If it is flawed it is most likely to be affected by large allocations, and so tuning RejectThreshold should help.

I don’t actually see any immediately obvious reason for larger shared buffers being less efficient in any way.

When not using buffer pools (UsePools=0)

When using BufferPools the balance between shared and dedicated buffers can be altered using RejectThreshold. UsePools=0 is effectively the same as reducing the threshold to zero. Every object then gets its own dedicated vertex buffer and there are no shared buffers at all.

Interestingly, these dedicated buffers aren’t always static. As in the UsePools=1 scenario, some of the buffers are static and some dynamic, for a reason that isn’t immediately obvious to me.

This setting shouldn’t use more video memory than using BufferPools. Each object requires the space it needs plus some per-buffer overhead. When using shared vertex buffers there is less buffer overhead, but there will be empty space within each shared buffer. Depending on the memory allocation strategy and reporting tools, the video card might report a higher memory high water mark, but the used space shouldn’t be higher.
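The trade-off between per-buffer overhead and empty space can be put into rough numbers. This is a back-of-envelope sketch only: the per-buffer overhead figure is invented (I don’t know what the driver really charges), the function names are hypothetical, and the pooled case assumes perfect packing with every object smaller than the pool, which flatters the pools.

```python
import math


def dedicated_footprint(sizes, per_buffer_overhead):
    """Total bytes if every object gets its own buffer."""
    return sum(s + per_buffer_overhead for s in sizes)


def pooled_footprint(sizes, pool_size):
    """Total bytes reserved if objects are packed perfectly into shared buffers.

    Optimistic lower bound: real packing would leave extra gaps.
    """
    used = sum(sizes)
    buffers = math.ceil(used / pool_size)
    return buffers * pool_size


if __name__ == "__main__":
    sizes = [65536] * 100   # 100 hypothetical objects of 64KB each
    overhead = 256          # invented per-buffer overhead in bytes
    pool = 2 * 1024 * 1024  # 2MB shared buffers, as in my flights
    print(dedicated_footprint(sizes, overhead))
    print(pooled_footprint(sizes, pool))
```

With these made-up inputs the dedicated buffers come to 6,579,200 bytes while the pools reserve 8,388,608 bytes for 6,553,600 bytes of vertex data, so the reserved figure is higher with pools even though the data stored is the same.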

There is no reason (other than a bug in FSX or the video card) for this setting to be less stable than using pools. There is no obvious race condition going on.

Conclusions

Using a shared buffer versus a dedicated buffer trades the FSX space allocation strategy against that of the video card plus the overhead of creating a buffer. If the video card is faster, that is a win which occurs when the object is created (comes into view!) or changes in size (LOD/trees increase). The gain depends on the elapsed time the two methods take. I have no proof that one is faster than the other.

When it comes to drawing an object, a dedicated buffer is sometimes static, which will be a performance gain. On the other hand there will be more buffer switching, which is a negative. However, if the GPU is much faster than the CPU then this doesn’t really matter, as the GPU only needs to be fast enough. On my PC UsePools=0 is slower on the GPU.

RejectThreshold with UsePools=1 permits the tuning of this balance between shared and dedicated buffers. UsePools=0 just sets one extreme.

About stevefsx

I don't use FSX that much. But I am very annoyed when it doesn't work properly!

3 Responses to Buffer Pools Part 2

  1. Tim Stewart says:

    Thank you for these observations. There is SO much bad, patently false information being spread through the community regarding bufferpools, it is almost comical. It’s touted as the holy grail of sim performance tweaks, and everyone has their own magic recipe of pool sizes and thresholds, while others insist that turning them off will gain you better than 30% increased frame rate.

    It would be really nice to find a guide somewhere, written by someone who knows what he’s talking about, that can actually accurately inform us laymen as to the most efficient pool size and threshold values to use for a given rig, considering CPU specs, GPU ram/speed, etc.

    Unfortunately I haven’t found anything that wasn’t written by a tweak fanatic who doesn’t have a clue what he’s actually doing.

    This at least tells us what the settings do (and don’t do) which is a tremendous leap in the right direction.

    So thanks again!

  2. dellycowboy says:

    CHIK I have a faster system I7-4770K / 780GTX and I use the following

    [BufferPools]
    RejectThreshold=71072

    Try that and if it works great, but if you get image glitches raise RejectThreshold until stable.

    Great article Steve.

  3. Bill Honcoop says:

    Hello Steve

    I have read over your topic above and honestly, it just makes me dizzy! How in the world do you comprehend such complexities? If there is a more simple way for me to understand what my settings should be can you suggest what numbers are best suited for my setup?
    My specs are:
    Asus P6X58D Premium LGA 1366
    Intel i7 960 Bloomfield 3.2G-OC-4.5
    Noctua NH-D14 120mm & 140mm SSO CPU Cooler
    EVGA GTX680 Super-clocked
    6G Mushkin Ridgeback DDR3 @ 1600 (8-8-8-24)
    ….Also I found these numbers but I cannot remember where. Can you explain in layman’s terms what they are and how to use them properly?
    [BufferPools]
    UsePools=1
    PoolSize=8388608 //PoolSize //8388608 //5242880 5M //10485760 10M //20971520 20M
    RejectThreshold=262144 //RejectThreshold //131072 //262144 //524288 //786432 //1048576

    Thanks very much

    CHIK
