Autoexposure example restoration #137

devshgraphicsprogramming · 2024-08-13T22:08:36Z

No description provided.

devshgraphicsprogramming · 2024-08-13T22:10:43Z

26_Autoexposure/app_resources/common.hlsl

+    float meteringWindowScaleX, meteringWindowScaleY;
+    float meteringWindowOffsetX, meteringWindowOffsetY;
+    float lumaMin, lumaMax;
+    uint32_t sampleCountX, sampleCountY;
+    uint32_t viewportSizeX, viewportSizeY;
+    uint64_t lumaMeterBDA;


you should pack these up into structs in nbl::hlsl::luma_meter and friends

P.S. I see you already have, why not use things like nbl::hlsl::luma_meter::LumaMeteringWindow directly here?

devshgraphicsprogramming · 2024-08-16T12:42:44Z

26_Autoexposure/main.cpp

+					params.shader.entryPoint = "main";
+					params.shader.entries = nullptr;
+					params.shader.requireFullSubgroups = true;
+					params.shader.requiredSubgroupSize = static_cast<IGPUShader::SSpecInfo::SUBGROUP_SIZE>(5);


this needs to be set from m_physicalDevice->getLimits().maxSubgroupSize, not just set to 32
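A minimal sketch of deriving the value instead of hardcoding it. It assumes that the enum value in the diff is the log2 of the subgroup size (5 for 32) and that the limit is reported as a plain power-of-two count; `subgroupSizeLog2` is a hypothetical helper, not part of Nabla.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: turns a power-of-two subgroup size (e.g. 32)
// into its log2 (e.g. 5), which is what the hardcoded 5 appears to encode.
inline uint32_t subgroupSizeLog2(uint32_t subgroupSize)
{
    // Precondition: non-zero power of two.
    assert(subgroupSize != 0u && (subgroupSize & (subgroupSize - 1u)) == 0u);
    uint32_t log2Value = 0u;
    while ((subgroupSize >> log2Value) != 1u)
        ++log2Value;
    return log2Value;
}
```

Under those assumptions the result would be cast to the `SUBGROUP_SIZE` enum in place of the literal 5, with `m_physicalDevice->getLimits().maxSubgroupSize` as the input.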

devshgraphicsprogramming · 2024-08-16T12:43:50Z

26_Autoexposure/main.cpp

+			m_lumaPresentDS[0] = lumaPresentPool->createDescriptorSet(core::smart_refctd_ptr(lumaPresentDSLayout));
+			m_lumaPresentDS[1] = lumaPresentPool->createDescriptorSet(core::smart_refctd_ptr(lumaPresentDSLayout));
+			if (!m_lumaPresentDS[0] || !m_lumaPresentDS[1])


why are you double buffering descriptor sets ? the only reason old example did it was because of round robining the luma meter output SSBOs.

Now we have BDA in Push Constants

devshgraphicsprogramming · 2024-08-16T12:48:43Z

26_Autoexposure/main.cpp

+					assert(allocation->memory.get() == buffer->getBoundMemory().memory);
+				};
+
+				build_buffer(m_device, &m_lumaGatherAllocation, buffer, m_physicalDevice->getLimits().maxSubgroupSize, "Luma Gather Buffer");


you seem to have forgotten that the buffer size needs a sizeof(uint32_t)* factor on top of maxSubgroupSize.
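As a sketch of the size computation, assuming the buffer holds one `uint32_t` accumulator per subgroup invocation (`lumaGatherBufferBytes` is a hypothetical name):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: byte size of a buffer with one uint32_t slot
// per subgroup invocation.
inline std::size_t lumaGatherBufferBytes(uint32_t maxSubgroupSize)
{
    assert(maxSubgroupSize > 0u);
    return sizeof(uint32_t) * static_cast<std::size_t>(maxSubgroupSize);
}
```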

devshgraphicsprogramming · 2024-08-16T12:54:17Z

26_Autoexposure/main.cpp

+		// Allocate and Leave 1/4 for image uploads, to test image copy with small memory remaining
+		{
+			uint32_t localOffset = video::StreamingTransientDataBufferMT<>::invalid_value;
+			uint32_t maxFreeBlock = m_utils->getDefaultUpStreamingBuffer()->max_size();
+			const uint32_t allocationAlignment = 64u;
+			const uint32_t allocationSize = (maxFreeBlock / 4) * 3;
+			m_utils->getDefaultUpStreamingBuffer()->multi_allocate(std::chrono::steady_clock::now() + std::chrono::microseconds(500u), 1u, &localOffset, &allocationSize, &allocationAlignment);
+		}


you don't need this, this is test code from ex 24

devshgraphicsprogramming · 2024-08-16T12:56:16Z

26_Autoexposure/main.cpp

+			IGPUDescriptorSet::SDescriptorInfo info1 = {};
+			info1.info.image.imageLayout = IImage::LAYOUT::READ_ONLY_OPTIMAL;
+			info1.desc = m_gpuImgView;
+
+			IGPUDescriptorSet::SDescriptorInfo info2 = {};
+			info2.info.image.imageLayout = IImage::LAYOUT::READ_ONLY_OPTIMAL;
+			info2.desc = m_gpuImgView; // FIXME: temporarily pass in input image
+
+			IGPUDescriptorSet::SWriteDescriptorSet writeDescriptors[] = {
+				{
+					.dstSet = m_lumaPresentDS[0].get(),
+					.binding = 0,
+					.arrayElement = 0,
+					.count = 1,
+					.info = &info1
+				},
+				{
+					.dstSet = m_lumaPresentDS[1].get(),
+					.binding = 0,
+					.arrayElement = 0,
+					.count = 1,
+					.info = &info2
+				}
+			};
+
+			m_device->updateDescriptorSets(2, writeDescriptors, 0, nullptr);


see https://github.com/Devsh-Graphics-Programming/Nabla-Examples-and-Tests/pull/137/files#r1719794211

devshgraphicsprogramming · 2024-08-16T12:58:00Z

26_Autoexposure/main.cpp

+	// We do a very simple thing, display an image and wait `DisplayImageMs` to show it
+	inline void workLoopBody() override
+	{
+		const uint32_t SubgroupSize = m_physicalDevice->getLimits().maxSubgroupSize;


make this a member, it's painful to see it being initialized multiple times

devshgraphicsprogramming · 2024-08-16T12:59:49Z

26_Autoexposure/main.cpp

+			auto cmdbuf = m_computeCmdBufs[0].get();
+			cmdbuf->reset(IGPUCommandBuffer::RESET_FLAGS::NONE);


I'd start the example the same as the others, with frames in flight (ex24 doesn't do FIF because it displays the same image for a long time): proper round robining of multiple command buffers, with blocks and waits before resets
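The bookkeeping behind frames in flight can be sketched without any GPU API. The struct below is purely illustrative: the names are hypothetical, and it assumes frame N signals a timeline semaphore with value N+1 on submit.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative bookkeeping only; no GPU calls are made.
struct FramesInFlight
{
    uint32_t maxFramesInFlight;   // number of command buffers being round-robined
    uint64_t submittedFrames = 0; // frames submitted so far; frame N signals N+1

    // Index of the command buffer to record for the upcoming frame.
    uint32_t resourceIndex() const
    {
        assert(maxFramesInFlight > 0u);
        return static_cast<uint32_t>(submittedFrames % maxFramesInFlight);
    }

    // Timeline value the host blocks on before resetting that command buffer.
    // 0 means the command buffer has not been used yet, so no wait is needed.
    uint64_t valueToWaitFor() const
    {
        return submittedFrames >= maxFramesInFlight
            ? submittedFrames - maxFramesInFlight + 1u
            : 0u;
    }

    void onSubmit() { ++submittedFrames; }
};
```

Each iteration of the work loop would then block on `valueToWaitFor()`, reset and record the command buffer at `resourceIndex()`, submit, and call `onSubmit()`.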

devshgraphicsprogramming · 2024-08-16T13:01:13Z

26_Autoexposure/main.cpp

+				.meteringWindowScaleX = MeteringWindowScale[0] * m_gpuImg->getCreationParameters().extent.width,
+				.meteringWindowScaleY = MeteringWindowScale[1] * m_gpuImg->getCreationParameters().extent.height,
+				.meteringWindowOffsetX = MeteringWindowOffset[0] * m_gpuImg->getCreationParameters().extent.width,
+				.meteringWindowOffsetY = MeteringWindowOffset[1] * m_gpuImg->getCreationParameters().extent.height,


why are you still calling it a scale if it's a literal dimension!?

Also, if you're using UV normalized coordinates via Sample() with float coords and an explicit LOD level, they should stay as they are and you shouldn't care about the input image extents!
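A sketch of what staying in normalized coordinates could look like. `Float2`, `MeteringWindow` and `sampleUV` are hypothetical names, and placing samples at cell centres is an assumption of this example, not something the review prescribes.

```cpp
#include <cassert>
#include <cstdint>

struct Float2 { float x, y; };

// Hypothetical metering window, expressed entirely in normalized UV space.
struct MeteringWindow
{
    Float2 offset; // UV of the window's top-left corner
    Float2 scale;  // window size as a fraction of the image
};

// UV of sample (ix, iy) on a countX x countY grid inside the window.
// The image extent never appears: the sampler resolves UVs to texels.
inline Float2 sampleUV(const MeteringWindow& w, uint32_t ix, uint32_t iy,
                       uint32_t countX, uint32_t countY)
{
    assert(countX > 0u && countY > 0u && ix < countX && iy < countY);
    const float u = (static_cast<float>(ix) + 0.5f) / static_cast<float>(countX);
    const float v = (static_cast<float>(iy) + 0.5f) / static_cast<float>(countY);
    return { w.offset.x + w.scale.x * u, w.offset.y + w.scale.y * v };
}
```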

devshgraphicsprogramming · 2024-08-16T13:03:22Z

26_Autoexposure/main.cpp

+				.sampleCountY = SampleCount[1],
+				.viewportSizeX = m_gpuImg->getCreationParameters().extent.width,
+				.viewportSizeY = m_gpuImg->getCreationParameters().extent.height,
+				.lumaMeterBDA = m_lumaGatherBDA


you need two addresses to ping-pong between, one that you write current luma data to, and one you clear for a "future" dispatch

First frame's luma dispatch writes to A, clears B
Second frame's luma dispatch writes to B, clears A

Note that it's not possible to use the tonemapping shader to clear the same buffer, as one workgroup may still be running while another quits, and without an extra atomic (which would itself have to be cleared) you can't know which workgroup is the "last to quit"
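The A/B alternation can be sketched as a pure function of the frame index. `LumaBufferPair` and `pingPong` are hypothetical names, and the two addresses stand in for whatever buffer device addresses the example ends up allocating.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical pair of buffer device addresses for one luma dispatch.
struct LumaBufferPair
{
    uint64_t writeAddress; // receives this frame's luma data
    uint64_t clearAddress; // zeroed now, written by the next frame
};

// Even frames write A and clear B, odd frames write B and clear A.
inline LumaBufferPair pingPong(uint64_t addressA, uint64_t addressB, uint64_t frameIndex)
{
    assert(addressA != addressB);
    const bool evenFrame = (frameIndex & 1u) == 0u;
    return evenFrame ? LumaBufferPair{ addressA, addressB }
                     : LumaBufferPair{ addressB, addressA };
}
```

Both addresses would go into the push constants of the luma dispatch, so the consumer of frame N reads the buffer that frame N wrote while frame N+1 writes the one that frame N cleared.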

devshgraphicsprogramming · 2024-08-16T13:05:27Z

26_Autoexposure/main.cpp

+			cmdbuf->bindComputePipeline(m_lumaMeterPipeline.get());
+			cmdbuf->bindDescriptorSets(nbl::asset::EPBP_COMPUTE, m_lumaMeterPipeline->getLayout(), 0, 1, &ds); // also if you created the DS with set index 3 you need to respect it here: firstSet is the index of the first set and count is how many consecutive sets to bind from that index, e.g. with 2 DSes at set indices 2 and 3 you can bind both in a single call with firstSet = 2, count = 2 and the last argument pointing to your DS pointers
+			cmdbuf->pushConstants(m_lumaMeterPipeline->getLayout(), IShader::E_SHADER_STAGE::ESS_COMPUTE, 0, sizeof(pc), &pc);
+			cmdbuf->dispatch(1 + (SampleCount[0] - 1) / SubgroupSize, 1 + (SampleCount[1] - 1) / SubgroupSize);


this is backwards, you should get your sample count from workgroup dispatch count, not the other way round
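A sketch of the direction the comment asks for: pick the dispatch size first and derive the sample count from it. The types and the one-sample-per-invocation, square-workgroup layout are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstdint>

struct Dispatch2D    { uint32_t workgroupsX, workgroupsY; };
struct SampleCount2D { uint32_t x, y; };

// Assumes each invocation takes exactly one sample and workgroups are
// workgroupDim x workgroupDim invocations, so no threads are left idle.
inline SampleCount2D sampleCountFromDispatch(Dispatch2D dispatch, uint32_t workgroupDim)
{
    assert(workgroupDim > 0u);
    return { dispatch.workgroupsX * workgroupDim, dispatch.workgroupsY * workgroupDim };
}
```

With this ordering the sample grid always fills whole workgroups, which the rounded-up division in the diff does not guarantee.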

devshgraphicsprogramming · 2024-08-16T13:41:07Z

26_Autoexposure/main.cpp

+			m_EV = 0.0f;
+			for (int index = 0; index < SubgroupSize; index++) {
+				m_EV += static_cast<float>(buffData[index]) / (log2(LumaMinMax[1]) - log2(LumaMinMax[0])) + log2(LumaMinMax[0]);
+			}
+			m_EV /= (SampleCount[0] * SampleCount[1]);


see the discord conversation, the per-workgroup-averages need to be in the [0,leftoverBits) range already

then your decode step is

for () ev += static_cast<float>(buffData[index]) * (log2(LumaMinMax[1]) - log2(LumaMinMax[0])) / ((1<<fixedPointBitsLeft)-1) + log2(LumaMinMax[0]);
ev /= float(workgroupCount[0]*workgroupCount[1]);
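The decode step above, written out as a standalone function. This is a sketch that assumes the buffer holds exactly one fixed-point value per workgroup, so the element count plays the role of `workgroupCount[0]*workgroupCount[1]`; `decodeEV` is a hypothetical name.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <vector>

// Maps each fixed-point value in [0, 2^bits - 1] back to
// [log2(lumaMin), log2(lumaMax)] and averages the results.
inline float decodeEV(const std::vector<uint32_t>& perWorkgroupValues,
                      float lumaMin, float lumaMax, uint32_t fixedPointBitsLeft)
{
    assert(!perWorkgroupValues.empty());
    assert(lumaMin > 0.f && lumaMax > lumaMin);
    assert(fixedPointBitsLeft > 0u && fixedPointBitsLeft < 32u);

    const float log2Min = std::log2(lumaMin);
    const float log2Range = std::log2(lumaMax) - log2Min;
    const float maxFixedPoint = static_cast<float>((1u << fixedPointBitsLeft) - 1u);

    float ev = 0.f;
    for (uint32_t value : perWorkgroupValues)
        ev += static_cast<float>(value) * log2Range / maxFixedPoint + log2Min;
    return ev / static_cast<float>(perWorkgroupValues.size());
}
```

Note the multiplication by the log2 range, where the code in the diff divides by it.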

devshgraphicsprogramming · 2024-08-16T13:41:56Z

26_Autoexposure/main.cpp

+			auto pc = AutoexposurePushData
+			{
+				.meteringWindowScaleX = MeteringWindowScale[0] * m_gpuImg->getCreationParameters().extent.width,
+				.meteringWindowScaleY = MeteringWindowScale[1] * m_gpuImg->getCreationParameters().extent.height,
+				.meteringWindowOffsetX = MeteringWindowOffset[0] * m_gpuImg->getCreationParameters().extent.width,
+				.meteringWindowOffsetY = MeteringWindowOffset[1] * m_gpuImg->getCreationParameters().extent.height,
+				.lumaMin = LumaMinMax[0],
+				.lumaMax = LumaMinMax[1],
+				.EV = m_EV,
+				.sampleCountX = SampleCount[0],
+				.sampleCountY = SampleCount[1],
+				.viewportSizeX = m_gpuImg->getCreationParameters().extent.width,
+				.viewportSizeY = m_gpuImg->getCreationParameters().extent.height,
+				.lumaMeterBDA = m_lumaGatherBDA


if you're overdeclaring push constants to share the layout between two pipelines, then only build them once

devshgraphicsprogramming · 2024-08-16T13:42:50Z

26_Autoexposure/main.cpp

+			auto queue = getGraphicsQueue();
+			auto cmdbuf = m_graphicsCmdBufs[0].get();
+			cmdbuf->reset(IGPUCommandBuffer::RESET_FLAGS::NONE);


same as https://github.com/Devsh-Graphics-Programming/Nabla-Examples-and-Tests/pull/137/files#r1719810297

devshgraphicsprogramming · 2024-08-16T13:50:06Z

26_Autoexposure/main.cpp

+				// we don't need to wait for the transfer semaphore, because we submit everything to the same queue
+				const IQueue::SSubmitInfo::SSemaphoreInfo acquired[1] = { {
+					.semaphore = acquire.semaphore,
+					.value = acquire.acquireCount,
+					.stageMask = PIPELINE_STAGE_FLAGS::NONE
+				} };


you need to wait for the compute queue semaphore (from the luma metering) as well as the acquire

because eventually you'll compute the EV in the tonemapping pipeline and stop doing any host wait on the compute queue.

devshgraphicsprogramming · 2024-08-16T13:50:24Z

26_Autoexposure/main.cpp

+			// Wait for completion
+			{
+				const ISemaphore::SWaitInfo cmdbufDonePending[] = {
+					{
+						.semaphore = m_presentSemaphore.get(),
+						.value = m_submitIx
+					}
+				};
+				if (m_device->blockForSemaphores(cmdbufDonePending) != ISemaphore::WAIT_RESULT::SUCCESS)
+					return;
+			}


time to remove this and do Frames in Flight

devshgraphicsprogramming · 2024-08-16T13:54:15Z

26_Autoexposure/main.cpp

+	constexpr float Key = 0.18;
+	auto params = ToneMapperClass::Params_t<TMO>(Exposure, Key, 0.85f);
+	{
+		params.setAdaptationFactorFromFrameDelta(0.f);


we basically used exponential smoothing (https://en.wikipedia.org/wiki/Exponential_smoothing) to blend the EV value over time

0 was a special value to "reset" the exposure (allow it to use the instantaneous value)
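A minimal sketch of that blend. `smoothEV` is the standard exponential smoothing update; `blendWeightFromFrameDelta` is a hypothetical mapping from frame time to blend weight (the exponential form and the `adaptationRate` parameter are assumptions of this example), with only the special case for 0 taken from the description above.

```cpp
#include <cassert>
#include <cmath>

// Exponential smoothing: alpha = 1 takes the measured value outright,
// alpha = 0 keeps the previous value unchanged.
inline float smoothEV(float previousEV, float measuredEV, float alpha)
{
    assert(alpha >= 0.f && alpha <= 1.f);
    return previousEV + alpha * (measuredEV - previousEV);
}

// Hypothetical mapping from frame delta to blend weight.
// A frame delta of exactly 0 is treated as "reset": use the instantaneous EV.
inline float blendWeightFromFrameDelta(float frameDeltaSeconds, float adaptationRate)
{
    assert(frameDeltaSeconds >= 0.f && adaptationRate > 0.f);
    if (frameDeltaSeconds == 0.f)
        return 1.f;
    return 1.f - std::exp(-frameDeltaSeconds * adaptationRate);
}
```

Per frame this would be called as `ev = smoothEV(ev, measuredEV, blendWeightFromFrameDelta(dt, rate))`, so the first frame (delta 0) snaps straight to the metered exposure.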

devshgraphicsprogramming · 2024-10-01T07:21:27Z

What's the status on this?

nipunG314 and others added 28 commits July 19, 2024 22:07

Add 26_Autoexposure

345ac7b

Change 26_Autoexposure to SimpleWindowedApplication

87d4794

Build a staging buffer and upload exr image

7a5ea7c

Merge branch 'master' of github.com:Devsh-Graphics-Programming/Nabla-Examples-and-Tests into autoexposure_ex

5026a64

Init surface and create the swapchain

5d63d04

Load shaders and create the pipeline for full screen triagnle

640e6a3

Merge branch 'master' of github.com:Devsh-Graphics-Programming/Nabla-Examples-and-Tests into autoexposure_ex

0148f60

Set window size according to loaded image

d69a111

Stop running if window is closed

54bf38f

Acquire swapchain image and present uploaded image to it

461efd3

Set window size directly and use that for swapchain rendering

734fea9

Merge branch 'master' of github.com:Devsh-Graphics-Programming/Nabla-Examples-and-Tests into autoexposure_ex

fc9b0bb

m_computeSubgroupSize

4a11724

Merge branch 'master' of github.com:Devsh-Graphics-Programming/Nabla-Examples-and-Tests into autoexposure_ex

734b887

Allocate buffer for gathered luma values

7d4895a

Create gpu resources for all passes

0e3e125

Create shaders and pipelines

cef80b3

Allocate and create texture for tonemapping

15e489f

Create separate ds for luma and present

c646c7d

Record luma meter commands

36d7097

Merge branch 'master' of github.com:Devsh-Graphics-Programming/Nabla-Examples-and-Tests into autoexposure_ex

3a70977

fix layout issues with compute pipeline in 26_Autoexposure example + update luma DSes

6addbf1

Create two sets from common lumaPresentLayout correctly

8434f20

Create compute and graphics resources separately and finish luma meter

bf08caa

Fix descriptor binding for luma_meter

b342c6c

Create separate pipeline layouts for luma and present

817c4a7

Setup luma_meter.comp.hlsl

7f89542

Pass push constants

defd45e

Record draw pass correctly

f6f8154

nipunG314 added 13 commits August 20, 2024 18:27

Separate LumaMeteringWindow into a common header

dca49d2

Simplify luma_meter naming

9e28395

Merge branch 'master' of github.com:Devsh-Graphics-Programming/Nabla-Examples-and-Tests into autoexposure_ex

8b6675b

Update luma examples to shared accessor api

18fae9f

Refactor tonemapping operators

9b31c2c

Simplify push constants and remove explicit sample counts

e987452

Infer sample count from viewportSize and simplify userspace HLSL

e135e43

Templatize float type and add toXYZ method to TexAccessor

57e49ae

Refactor the example into using a 2-compute, 1-fragment architecture

f8d50e8

Handle image layouts correctly

d3b5765

Simplify type

612f0f6

Wait for correct semaphore value

cb46d82

Remove unnecessary data members

1996cf3
