Improve useList efficiency #90

caleb-harrelson · 2020-11-13T07:30:12Z

Use the .once’s val() to determine the number of children, then batch that many calls to the child_added handler into one value dispatch, resulting in one additional render instead of one progressively slower render per child.
Fix shadowing of query variable, rename to ref instead.
Fix the ref.off calls to pass the right function to unsubscribe.
Fix dependencies on useEffect.
Fix some type references.

Fixes #74

Use the `.once`’s `val()` to determine the number of children, then batch that many calls to the `child_added` handler into one `value` dispatch, resulting in one additional render instead of one per child. Fix shadowing of `query` variable, use `ref` instead. Fix the `ref.off` calls to pass the right function to unsubscribe. Fix dependencies on useEffect. Fixes CSFrequency#74

caleb-harrelson · 2020-11-13T07:32:30Z

Note: since it looks like PRs can take some time to get merged, if anybody uses this code you'll also want the changes from PR #79 or you'll still get unnecessary renders.

Satyam · 2020-11-30T17:16:29Z

I was thinking ways to fix this myself. I was thinking on collecting the data from the value event and return the data from there and on the child_added don't even dispatch the events whose keys are already there.

As for the delays on maintenance you might want to follow #93 and see what comes out.

Satyam · 2020-12-08T16:48:22Z

A couple of thoughts:

Perhaps it would be good to enable this option with a fastInitialLoad option or some such, on useList and derived methods. Some developers might count on having the successive records trigger some display updates. After all, Firebase itself offers both mechanism, listen once to value or listen multiple times on child_added and I imagine they did this on purpose so we should offer both options as well. To keep backward compatibility, having this as an option would be better.

Instead of counting, wouldn't it be safer to have, besides returning the results of the initial load at once, the following line in addChild:

const addChild = (
  currentState: KeyValueState,
  snapshot: database.DataSnapshot,
  previousKey?: string | null
): KeyValueState => {
  if (!snapshot.key) {
    return currentState;
  }

  const { keys, values } = currentState;
// add this:
  if (keys.includes(snapshot.key)) {
    return currentState
  } 
// ...

Somehow I think this might also fix an issue I have with developing an application, where the hot reload of the changed source code does not seem to dispatch the {type:empty} action under certain circumstances. This results in duplicate records, which results in duplicate key attributes on some repeated components.

Also, it seems to me better to have this check done in the reducer itself rather than useList like you do with childrenToProcess

Thanks and regards

caleb-harrelson · 2020-12-08T19:55:55Z

Firebase itself offers both mechanism, listen once to value or listen multiple times on child_added and I imagine they did this on purpose so we should offer both options as well.

This library is an abstraction so that you don't have to worry about value and the various update events, you just get the whole list whenever it changes. If one cared about those events in particular they can still subscribe to them using firebase SDK directly, so I don't think it makes sense to provide an option here that intentionally renders once per child.

To keep backward compatibility, having this as an option would be better.

I consider the excessive renders to be a bug, not something that needs backward compatibility.

Instead of counting, wouldn't it be safer to have, besides returning the results of the initial load at once...

I've addressed that in my comment on #74, but the gist is that the object returned by value is not compatible with what this library is storing in state, so the values coming back from child_added must be used instead. The key.includes(snapshot.key) code you provided as an example also introduces a nested loop (an O(n²) inefficiency) on the first fetch, since the array must be searched every time an item is added.

And the counting is still safe because we know child_added will be fired that many times. If more children come in before that finishes they will go through the normal process for adding a child and trigger a re-render, but the first N children will not.

Satyam · 2020-12-09T10:59:30Z

As for making fast loading an option, the hook should offer the same features as the functionality it wraps, and individual calls is one option. Some UI designers prefer to have the users entertained with some screen action rather than waiting a fraction of a second with a static display while the data shows up. They go for the perception of speed rather than actual speed, and my guess is that is the reasoning behind offering both options on Firebase. I wouldn't go for that bogus perception option myself, but I wouldn't consider it a bug, just a silly option I wouldn't go for.

And, indeed, the count should work, the relevant sentence in https://firebase.google.com/docs/reference/js/firebase.database.Reference#child_added-event is:

This event will trigger once with the initial data stored at this location, and then trigger again each time the data changes.

I read it many times and still don't fully trust it, but that is just me. I feel they might just be warning developers that it would duplicate the data on the first value event, not that the order of those calls would be preserved if records are inserted concurrently by other users in between the value event and the multiple child_added. Are those initial child_added events made based on a frozen set of data as it was when value was fired? Are those later concurrent inserts be queued after the first batch of child_added events or would they be fired on a live, constantly updated data set? That is where my concern comes from.

caleb-harrelson · 2020-12-09T14:26:33Z

the hook should offer the same features as the functionality it wraps

I think it abstracts and simplifies rather than wraps and exposes firebase’s way of handling events. It effectively becomes an efficient value event that is internally managed, and the actualvalue event does not fire once for each child initially rendered.

Some UI designers prefer to have the users entertained with some screen action rather than waiting a fraction of a second with a static display while the data shows up.

It wasn’t a fraction of a second, at all. I had it fetching a list of 500 items and it took dozens of seconds to finish rerendering them all. The first child is rendered 500 times, the second 499, the third 498, etc. Instead of rendering children 500 times as it would with one initial load it renders children 125,250 times. It’s incredibly faster with the PR code.

The hook could expose variables for how many items exist and how many have been loaded for UI progress bars, but that will again cause one render per child (albeit a more efficient one since the child isn’t present yet).

Are those later concurrent inserts be queued after the first batch of child_added events or would they be fired on a live, constantly updated data set?

This shouldn’t matter. If 500 children exist on load and an event for child #501 is fired out of order while processing them then children 1-499 and 501 will exist in the first render followed immediately by a render with child 500 added (or whatever child is fired last).

Satyam · 2020-12-09T16:08:44Z

It wasn’t a fraction of a second, at all.

I was poking fun at the users who can't wait for their devices just to do something whom the UI designers have to cater to. My guess is that the Firebase guys fire so many events to satisfy that need. I know your PR saves more than a fraction of a second, orders of magnitude more than that! That is why I love it.

As for the soundness of the counting strategy, imagine this situation:

Table has records with keys 1, 3, 5, 7, 9.
value event fires with keys 1, 3, 5, 7, 9.
child_added fires for items 1 and 3. They get discarded, pending count is now 3.
Item with key 4 gets inserted. Now, here is where things might get tricky. Imagine the worst scenario:
child_added fires for items 4, 5, 7. They all get discarded, pending count is now down to zero.
child_addedis fired for item 9. It gets through.

The hook would ignore item 4 and report item 9 twice. Is this scenario possible? I don't know, nothing assures me it is not possible. For all it knows, it fired child_added once for each item, even for those already reported in the initial value event. That is what the questionable sentence I quoted above states. It doesn't say in which order it will fire them.

Obviously, the other alternative is that the child_added events for 1, 3, 5, 7, 9 get queued and will be fired in that order, and then the event for 4 gets queued, which is what might be expected. But then you might insert items into queues instead of just appending to them. Concurrency often wrecks our assumptions and it produces very sporadic errors that are almost impossible to test and trace.. I got my fingers burned more than once. With all the savings in renderings your PR provides, it might be worth spending a very tiny amount on keeping on the safe side.

caleb-harrelson · 2020-12-09T16:12:48Z

Using your example one of two things would happen:

If firebase preserves the order of events:

One render with value of [1, 3, 5, 7, 9]
Second render with value of [1, 3, 4, 5, 7, 9]

If firebase does not preserve the order of events:

One render with value of [1, 3, 4, 5, 7]
Second render with value of [1, 3, 4, 5, 7, 9]

In both cases the final render is correct.

caleb-harrelson · 2020-12-09T16:15:05Z

To expand on that a little bit, because it's using a count and not keys it doesn't matter which children are fired, just that that many children do get fired. Any children that fire the event afterward will cause one render per event, as expected.

Satyam · 2020-12-09T16:48:20Z

If firebase does not preserve the order of events:

One render with value of [1, 3, 4, 5, 7]
Second render with value of [1, 3, 4, 5, 7, 9]

No, that might not be the case. It might be:

One render with values for 1, 3, 5, 7, 9 (that would never change as the whole batch of records be loaded into the reducer state before 4 is inserted).
Second render with values 1, 3, 5, 7, 9, 9.

The item with key 4 would have never been inserted into the keys or values array as the child_added´ event would have been ignored. There is nothing preventing two items with the same key being sliced and spliced into either the keysorvalues` arrays.

The highlighted 9 in the second render would actually belong to the first child_added event received after the pending count reached 0 and so it got processed. The 9 next to it would be the item initially loaded into the list through the value event.

caleb-harrelson · 2020-12-09T17:04:05Z

How would 4 be ignored? And how would 9 appear twice? The initial data from value is not used, only its count is used, so it doesn’t matter what keys come in. Unless child_added gets fired twice with 9, it will only appear once. Likewise, 4 will not be ignored whether it’s part of the five in the initial load or the 6th after the initial load.

I welcome a test to prove the behavior you’re describing, but I do not think it is possible with the PR’s code.

Satyam · 2020-12-09T17:34:33Z

Finally, I got it. Sorry for all the trouble. Indeed, you are reading all the data on all the child_added events without discarding any. You are discarding the values in the initial value event after finding out how many of them are there.

Sorry and thanks for your patience.

caleb-harrelson · 2020-12-09T19:21:10Z

Glad it makes sense (:

And for some context on throwing away the value data: the library was already doing that, because the state it stores requires the DataSnapshot to be at the child-level, not at the parent-level like value provides. If not for that, the value data could be used and some memoization could prevent re-renders when all the child_added events fire. It's the way the library stores state that requires this roundabout way of loading the initial data.

Satyam · 2020-12-09T22:48:13Z

I think I have an idea that might help, at least it will compensate for all the trouble I gave you, so I hope. Somewhere above I said:

Also, it seems to me better to have this check done in the reducer itself rather than in useList

I was referring to childrenToProcess and all that happens around it. Now that I understand what it is you are doing, let me try to simplify it.

In the one time listener for the value event, dispatch a value action along the snapshot and, as you do now, subscribe to child_added, which was a smart thing to do on your part to ensure they always go in the same order.
On the child_added event handler, dispatch the added action along the snapshot and previous key and do nothing more.

Now, lets go to the reducer.

For the value action, get the length from the snapshot in the payload and store it in an extra childrenToProcess variable in the state.
For the added action,
- process the snapshot normally, whether it is one to be delayed or not.
- If the childrenToProcess is positive, decrement it.

At the end of the useList hook, instead of the current:

return [state.value.values, state.loading, state.error];

Do:

return [
   state.childrenToProcess  ? undefined: state.value.values, 
   state.loading, 
   state.error
]

What do you think, would that work? Isn't it simpler? It would make useList much shorter, it would not require a separate interim storage place to quietly hold the first records (the children array in useList) and the setValue function in the reducer would disappear.

I am sorry I cannot test any of this as I seem to be one of the affected by this issue. Opened more than a year ago and still active as recently as 3 weeks ago with 52 replies, I tried all that has been suggested there (except downgrading the nodejs version) and still can't make it work. It works if I simply switch to a newer Firebase, which is the package that depends on this gRpc, but then I would be testing a completely different package, so I'm stuck there. I am still unsure if that change to current firebase 8.x.x would make any difference, but I was trying to go with the original as it is now. It seems to use only the type definitions from firebase and those don't seem to be any different so I should be fine.

Satyam · 2020-12-10T06:58:30Z

I think I have an idea that might help, at least it will compensate for all the trouble I gave you, so I hope. ...
...

No, it would not. It would help is I kept my mouth shut.

caleb-harrelson · 2020-12-10T16:45:31Z

lol, I enjoy debates and code reviews, it's fine :p

I think the event is the right place to handle it though, rather than the reducer. You're always welcome to implement the reducer method as a PR and the maintainer, should they ever return, can choose which they prefer.

Personally, I think I'll be switching over to reactfire due to the stagnancy on this repo.

chrisbianca · 2020-12-14T18:01:24Z

@caleb-harrelson thanks for the submission and apologies that it's taken longer than I'd like to review this properly. In short, it's been a tough year!

Previously, Firebase would only fire the value event once all the child_added events had finished firing, so the library was designed to maintain a loading state until the value event had been received. This would have prevented re-renders, providing that the code making use of the hook was monitoring the loading state.

For this PR to work correctly, the behaviour must have changed as otherwise the child_added events would have already been missed, so the hook would sit there waiting indefinitely.

I'll do some testing, but if this behaviour has changed, then this is a nice way to resolve the issue.

caleb-harrelson · 2020-12-14T18:08:04Z

@chrisbianca thanks for coming back and injecting some life into the repo! Here's to a better 2021.

chrisbianca · 2020-12-16T07:21:58Z

@caleb-harrelson thanks again for this submission. I've done some testing and identified one change that was needed to ensure that the hook correctly updated if the supplied list reference changed. I'll get this merged now and plan on releasing a version 3.0.0 at some point in the next week once I've been through and addressed some more of the issue list.

achuinard · 2021-02-03T04:16:29Z

@chrisbianca is this not live in the main package yet? Would be useful for me.

achuinard · 2021-02-07T01:55:10Z

@caleb-harrelson @chrisbianca I forked the repo because I wanted to use the latest code in my app (no release has been made yet) but I noticed some bugs with the new library. Specifically, if I have a list with 0 values and write to it quickly and also subscribe to it on the screen at the same time, that 1 item never comes through. If I restart the app it shows up. Guessing something could be off with the joint once / child_added logic. For now though I must stick with the latest npm release and wanted to bring this to your attention.

chrisbianca · 2021-02-13T08:06:23Z

@achuinard thanks for the report. You're right, there was an issue with initialisation when the list is initially empty. I've just pushed up a fix to master - would you be able to give it a try to confirm that this fixes your issue? If so, I'll look at getting this cut into a release

chrisbianca · 2021-03-02T08:42:44Z

This has now been released in https://github.com/CSFrequency/react-firebase-hooks/releases/tag/v3.0.1

caleb-harrelson added 2 commits November 13, 2020 01:14

Fix type references

dfdee4a

chrisbianca and others added 3 commits December 11, 2020 07:26

Merge branch 'master' into useList-efficiency

2094d86

Update useListReducer.ts

1bd4391

Update useList.ts

34ca031

chrisbianca and others added 2 commits December 16, 2020 07:09

Fix an issue with the hook not updating if the query ref changes

49abf45

Merge branch 'master' into useList-efficiency

225aa0d

chrisbianca merged commit a1f2b64 into CSFrequency:master Dec 16, 2020

chrisbianca added a commit that referenced this pull request Feb 13, 2021

Fix initialisation issues with an empty list #90

e31a581

Improve useList efficiency #90

Improve useList efficiency #90

Uh oh!

Conversation

caleb-harrelson commented Nov 13, 2020

Uh oh!

caleb-harrelson commented Nov 13, 2020

Uh oh!

Satyam commented Nov 30, 2020

Uh oh!

Satyam commented Dec 8, 2020

Uh oh!

caleb-harrelson commented Dec 8, 2020

Uh oh!

Satyam commented Dec 9, 2020

Uh oh!

caleb-harrelson commented Dec 9, 2020

Uh oh!

Satyam commented Dec 9, 2020

Uh oh!

caleb-harrelson commented Dec 9, 2020

Uh oh!

caleb-harrelson commented Dec 9, 2020

Uh oh!

Satyam commented Dec 9, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

caleb-harrelson commented Dec 9, 2020

Uh oh!

Satyam commented Dec 9, 2020

Uh oh!

caleb-harrelson commented Dec 9, 2020

Uh oh!

Satyam commented Dec 9, 2020

Uh oh!

Satyam commented Dec 10, 2020

Uh oh!

caleb-harrelson commented Dec 10, 2020

Uh oh!

chrisbianca commented Dec 14, 2020

Uh oh!

caleb-harrelson commented Dec 14, 2020

Uh oh!

chrisbianca commented Dec 16, 2020

Uh oh!

achuinard commented Feb 3, 2021

Uh oh!

achuinard commented Feb 7, 2021

Uh oh!

chrisbianca commented Feb 13, 2021

Uh oh!

chrisbianca commented Mar 2, 2021

Uh oh!

Uh oh!

Satyam commented Dec 9, 2020 •

edited

Loading