
Fix performance of BatchedMesh removals and insertions #31468


Draft · andreas-hilti wants to merge 1 commit into dev from fix/batchedmesh_performance

Conversation

andreas-hilti (Contributor)

Related issue: #31465

Description

BatchedMesh keeps track of the deleted ids in _availableInstanceIds or _availableGeometryIds, respectively.
On every (!) call of addInstance/addGeometry, these arrays are sorted.
This implies that if you delete n entities and then add n entities again, the complexity is at least n^2 (even ignoring the log n factor for now).
As a user, I would rather expect a complexity of roughly n log n for this scenario.

To avoid this, two internal flags, _availableInstanceIdsSorted and _availableGeometryIdsSorted, are introduced; they track whether the sorting has already been done so it can be skipped.

For the two test cases described in the issue, this reduced the runtime on my laptop from roughly 10 s each to 10 ms (test1) and 150 ms (test2).
This should reduce the complexity in the scenario described above to n log n, as you perform a single sort plus the actual removals and insertions.
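A minimal sketch of the mechanism (the flag names match this PR, but the surrounding BatchedMesh internals are simplified; only the flag handling is shown):

```js
// Sketch of the lazy-sort idea: mark the array dirty on delete,
// sort at most once before ids are reused.
deleteInstance( instanceId ) {

	// ... existing validation and bookkeeping ...

	this._availableInstanceIds.push( instanceId );
	this._availableInstanceIdsSorted = false; // defer the sort

	return this;

}

addInstance( geometryId ) {

	if ( this._availableInstanceIds.length > 0 ) {

		if ( ! this._availableInstanceIdsSorted ) {

			// sort once for a whole batch of insertions
			this._availableInstanceIds.sort( ( a, b ) => a - b );
			this._availableInstanceIdsSorted = true;

		}

		const instanceId = this._availableInstanceIds.shift();
		// ... reuse instanceId ...

	}

	// ... otherwise grow the instance count as before ...

}
```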


github-actions bot commented Jul 21, 2025

📦 Bundle size

Full ESM build, minified and gzipped.

| | Before (min / gzip) | After (min / gzip) | Diff |
| --- | --- | --- | --- |
| WebGL | 338.04 kB / 78.86 kB | 338.04 kB / 78.86 kB | +0 B / +0 B |
| WebGPU | 559.1 kB / 154.77 kB | 559.1 kB / 154.77 kB | +0 B / +0 B |
| WebGPU Nodes | 558.02 kB / 154.56 kB | 558.02 kB / 154.56 kB | +0 B / +0 B |

🌳 Bundle size after tree-shaking

Minimal build including a renderer, camera, empty scene, and dependencies.

| | Before (min / gzip) | After (min / gzip) | Diff |
| --- | --- | --- | --- |
| WebGL | 469.29 kB / 113.56 kB | 469.29 kB / 113.56 kB | +0 B / +0 B |
| WebGPU | 634.39 kB / 171.81 kB | 634.39 kB / 171.81 kB | +0 B / +0 B |
| WebGPU Nodes | 589.52 kB / 161.12 kB | 589.52 kB / 161.12 kB | +0 B / +0 B |

andreas-hilti force-pushed the fix/batchedmesh_performance branch from 45c957f to 9c652bc on July 21, 2025 at 20:47
@andreas-hilti (Contributor, Author)

Note that this doesn't address the following:

The variant of the second test that uses

```js
// this is slow:
for ( const boxGeometryId of boxGeometryIds ) {

	batchedMesh.deleteGeometry( boxGeometryId );

}
```

instead of

```js
// this is quicker:
for ( const boxInstanceId of boxInstanceIds ) {

	batchedMesh.deleteInstance( boxInstanceId );

}

batchedMesh.setInstanceCount( 1 );
batchedMesh.setInstanceCount( n );

for ( const boxGeometryId of boxGeometryIds ) {

	batchedMesh.deleteGeometry( boxGeometryId );

}
```

to remove the geometries (and all their instances) is still slow (roughly 5 s).

The background is that deleteGeometry always loops over all instances to find (and remove) the instances of that particular geometry.

By manually removing the instances first and then pruning the now-unused instance slots via setInstanceCount, this can be worked around.

A proper fix would probably need to keep track of which instances are associated with which geometry, to avoid always doing linear searches.
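A hypothetical sketch of that bookkeeping (the `_instancesByGeometry` map is an invented name for illustration, not an existing BatchedMesh field):

```js
// Hypothetical: maintain a map from geometry id to the set of its instance ids.
addInstance( geometryId ) {

	// ... existing id allocation yielding instanceId ...

	let set = this._instancesByGeometry.get( geometryId );
	if ( set === undefined ) {

		set = new Set();
		this._instancesByGeometry.set( geometryId, set );

	}

	set.add( instanceId );
	// ...

}

deleteGeometry( geometryId ) {

	// visit only this geometry's instances instead of scanning all of them
	const set = this._instancesByGeometry.get( geometryId );
	if ( set !== undefined ) {

		for ( const instanceId of set ) this.deleteInstance( instanceId );
		this._instancesByGeometry.delete( geometryId );

	}

	// ... existing geometry bookkeeping ...

}
```

deleteInstance would correspondingly have to remove the id from its geometry's set.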

@gkjohnson (Collaborator) commented Jul 22, 2025

> to remove the geometries (and all their instances) is still slow (roughly 5 s).

Can you please provide full context for these timings? I'm seeing this take less than 1.5 ms on my machine using the 50,000-instance count cited in #31465:

```js
const mesh = new THREE.BatchedMesh( 100000, 5000, 10000 );
const gid = mesh.addGeometry( new THREE.SphereGeometry() );
for ( let i = 0; i < 50_000; i ++ ) {

	const id = mesh.addInstance( gid );
	// ...

}

console.time( 'delete geometry' );
mesh.deleteGeometry( gid );
console.timeEnd( 'delete geometry' );
```

@andreas-hilti (Contributor, Author)

> Can you please provide full context for these timings? I'm seeing this take less than 1.5 ms on my machine using the 50,000-instance count cited in #31465.

I'll provide more details later.

However, note the following about #31465:
test1 uses 50000 instances of a single geometry.
test2 uses 50000 instances of 50000 different geometries.

Removing the single geometry in test1 is fast (and the code snippet that you showed resembles test1, given that it uses a single geometry).
Removing all the geometries in test2 in the straightforward way is slow, because each geometry removal loops over all 50000 instances.

@gkjohnson (Collaborator) commented Jul 23, 2025

> test2 uses 50000 instances of 50000 different geometries.

That's 2.5 trillion instances. If we're going to report performance issues, they need to use realistic values. I don't think the case of deleting a geometry with so many instances is worth optimizing. I've looked at the test again and see you meant 50,000 instances and 50,000 geometries. This still feels like an unrealistically extreme case, and it would be best to discuss how this is impacting a project in practice.

Can you explain your use case for deleting and then reinserting 50,000 instances, as well? Is this something you're actually doing in your code, or just a large example number? What issues is it causing in your project?

@andreas-hilti (Contributor, Author)

Sorry if my description was misleading. Yes, I meant one instance of each geometry, so 50,000 instances in total.

> Can you explain your use case for deleting and then reinserting 50,000 instances, as well? Is this something you're actually doing in your code, or just a large example number? What issues is it causing in your project?

In general, this is just a (much) simplified example, and I had to pick a reasonably large number to make the effect visible here as well.
But it is a realistic order of magnitude, and the problem would also occur at lower numbers, just less extremely.

The rough summary is this: we use BatchedMeshes to visualize large finite element models. We have to decompose a model into many small regions (we have had cases where this ended up at 200k regions), each region corresponding to a geometry, and we had to batch them to keep the number of draw calls small.
This implies that a single instance per geometry is the default case for us (at least in the current phase of the project).

We first noticed it when we did a full refresh of the scene but kept the BatchedMesh. We wanted to keep it, but arguably you'd be better off starting from scratch.

However, we also do partial updates of the scene, in which case there are really good reasons to keep the BatchedMesh. And depending on what the user requests, this can result in many new/modified geometries. (Maybe one could overwrite geometries, but that isn't very clean and doesn't align well with our code structure.)

(As a side note, BatchedMesh was really a key ingredient in making this possible at all, so thanks for developing it!)

@gkjohnson (Collaborator) commented Jul 23, 2025

Thanks for the explanation. It's always helpful to know the full context - I'm just thinking through the problem and whether there are other suitable solutions. I don't love that extra flags are added, since it makes things a bit more difficult (they now need to be supported in toJSON as well, for example), and depending on the order in which the user performs operations this would still be slow: deleting and then immediately adding an instance over and over again while a lot of unused instances remain, instead of deleting them all and then adding the new ones, would mark the list as unsorted before every add and trigger the bad behavior:

```js
// fast since sort happens once
for ( let i = 0; i < 50_000; i ++ ) mesh.deleteInstance( i );
for ( let i = 0; i < 50_000; i ++ ) mesh.addInstance( geometryId );

// slow if there are already a lot of unused instance ids
for ( let i = 0; i < 50_000; i ++ ) {

	mesh.deleteInstance( i );
	mesh.addInstance( geometryId );

}
```

I'm thinking it might be simplest to keep the "available ids" arrays in sorted order at all times by just inserting each id at the correct position in the first place. It won't be as fast, but doing a naive sorted insert is more than an order of magnitude faster on my machine. The "sort" calls can then just be deleted:

```js
deleteInstance( instanceId ) {

	// ...

	// keep the ids list always sorted by inserting at the right position
	let index = 0;
	const ids = this._availableInstanceIds;
	while ( index < ids.length && ids[ index ] <= instanceId ) {

		index ++;

	}

	ids.splice( index, 0, instanceId );
	return this;

}
```

This brings the total time for deleting and adding instances down from ~13000 ms to ~870 ms. It does, of course, shift a lot of the time into the delete function rather than the add function. Using something like a binary search brings it down further, but I'd like to avoid adding something like that if a simpler solution is "good enough". Here's a rough table of timings on my machine using 50,000 instances:

| | unchanged | lazy sort | naive sorted insert | binary search insert |
| --- | --- | --- | --- | --- |
| deletion | ~1ms | ~1ms | ~800ms | ~100ms |
| addition | ~13000ms | ~100ms | ~70ms | ~5ms |
| total | ~13000ms | ~102ms | ~870ms | ~105ms |

My preference for now would probably be to use the naive sorted insert if it's good enough for your use case. Otherwise maybe the binary search insert is suitable. But generally keeping the array sorted will allow for doing this delete / add work over multiple frames or calling the functions in any order without worrying about triggering the sort path unnecessarily.
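For reference, the binary search variant from the table could replace the linear scan in the deleteInstance snippet above; a sketch under the same assumptions (`ids` is the sorted _availableInstanceIds array), not code from this PR:

```js
// Binary search for the insertion index: O(log n) comparisons,
// though splice itself still shifts elements in O(n).
let lo = 0;
let hi = ids.length;
while ( lo < hi ) {

	const mid = ( lo + hi ) >> 1;
	if ( ids[ mid ] <= instanceId ) lo = mid + 1;
	else hi = mid;

}

ids.splice( lo, 0, instanceId );
```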

@andreas-hilti (Contributor, Author)

Thank you for the detailed investigation.

I wasn't aware of the JSON export and, in particular, that there is BatchedMesh-specific code outside of this file (Object3D.js/ObjectLoader.js). But I guess one could still make this compatible with the lazy sort.

I also agree with your finding that the scenario "deleting and then adding an instance immediately over and over again with a lot of unused instances" is not solved by my approach.

Furthermore, I do understand that you don't want to make it overly complicated for maintenance reasons.
However, I tend to think we should not replace one solution of quadratic complexity with another one of the same complexity (albeit with a probably lower constant), and I don't like that deletion would become significantly slower.

Note that in your "array always sorted" approach, a binary search brings the complexity of the find down to log n; however, the insert operation using splice is still order n. I tend to think this would only be a clean solution if we used a data structure that guarantees that arbitrary inserts (and removals) are order log n; I'm thinking here of a binary heap.
(As a side note, this would also allow replacing the shift (complexity n) with a log n operation.)

But let us maybe also take a step back: why do these arrays need to be sorted at all?
Am I missing anything if I say that for almost all BatchedMesh operations it doesn't matter which of the unused identifiers is picked?
The only exception I see is shrinking the maximum instance count, where you try to remove the instances with the highest ids. So would it be an option to drop this requirement, or to make it optional?
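To make the heap idea concrete, a minimal binary min-heap over the available ids could look like this (an invented helper for illustration, not existing three.js code):

```js
// Minimal binary min-heap over numeric ids: push and pop are both O(log n).
class MinIdHeap {

	constructor() {

		this.data = [];

	}

	push( id ) {

		const d = this.data;
		d.push( id );
		let i = d.length - 1;
		while ( i > 0 ) {

			const parent = ( i - 1 ) >> 1;
			if ( d[ parent ] <= d[ i ] ) break;
			[ d[ parent ], d[ i ] ] = [ d[ i ], d[ parent ] ]; // sift up
			i = parent;

		}

	}

	pop() {

		const d = this.data;
		const min = d[ 0 ];
		const last = d.pop();
		if ( d.length > 0 ) {

			d[ 0 ] = last;
			let i = 0;
			while ( true ) {

				const l = 2 * i + 1;
				const r = 2 * i + 2;
				let smallest = i;
				if ( l < d.length && d[ l ] < d[ smallest ] ) smallest = l;
				if ( r < d.length && d[ r ] < d[ smallest ] ) smallest = r;
				if ( smallest === i ) break;
				[ d[ i ], d[ smallest ] ] = [ d[ smallest ], d[ i ] ]; // sift down
				i = smallest;

			}

		}

		return min;

	}

}
```

deleteInstance would then push the freed id and addInstance would pop the smallest one, replacing both the sort and the shift with log n operations.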

@andreas-hilti (Contributor, Author) commented Jul 26, 2025

I'm aware that you most probably wouldn't want to add a dependency, but this branch demonstrates the performance when using a heap:
dev...andreas-hilti:three.js:fix/batchedmesh_performance_heap
Maybe you could rerun your tests so that we have comparable numbers.
(For me, the third test, which corresponds to the scenario "deleting and then adding an instance immediately over and over again with a lot of unused instances", goes from roughly 23 s (dev) down to 80 ms (using the heap).)
