
Conversation

@Kyu3224 commented Jun 25, 2025

Summary

This PR improves the mini-batch generator in the reinforcement learning training pipeline by introducing the reshuffle_each_epoch parameter. This parameter controls whether data indices are reshuffled at each epoch or kept fixed.

Motivation

In reinforcement learning, especially in PPO-style policy optimization, shuffling the training data indices at each epoch can improve generalization and reduce correlation between samples. However, some use cases require fixed mini-batch ordering across epochs to enable reproducible experiments and debugging. This PR introduces an explicit toggle to support both workflows.

Details

  • The reshuffle_each_epoch flag defaults to False, keeping the mini-batch order fixed across epochs so that iteration remains deterministic.
  • When reshuffle_each_epoch=True, the indices are re-permuted at the start of every epoch, so each epoch sees a different mini-batch ordering, which reduces correlation between consecutive samples and can improve generalization (a minimal sketch of the mechanism follows this list).
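
For illustration, here is a minimal sketch of how such a flag could be wired into an index-based generator; the function name and arguments are illustrative, not the actual signature in this codebase:

```python
import torch

def mini_batch_generator(num_samples, num_mini_batches, num_epochs,
                         reshuffle_each_epoch=False, device="cpu"):
    # Size of each mini-batch; any remainder samples are dropped in this sketch.
    mini_batch_size = num_samples // num_mini_batches
    # Initial shuffle of the sample indices (done once in both modes).
    indices = torch.randperm(num_mini_batches * mini_batch_size, device=device)
    for epoch in range(num_epochs):
        if reshuffle_each_epoch and epoch > 0:
            # Draw a fresh permutation at the start of every subsequent epoch.
            indices = torch.randperm(num_mini_batches * mini_batch_size, device=device)
        for i in range(num_mini_batches):
            # Yield the index slice for one mini-batch of this epoch.
            yield indices[i * mini_batch_size : (i + 1) * mini_batch_size]
```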

References

The reshuffle_each_epoch argument implemented here serves a role analogous to the shuffle parameter in PyTorch's torch.utils.data.DataLoader. Setting shuffle=True causes the data sampler to reshuffle the dataset indices at the start of each epoch, which helps reduce overfitting and improve generalization.
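
For instance, with the PyTorch DataLoader the behavior looks like this (toy data and batch size for illustration):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

dataset = TensorDataset(torch.arange(8, dtype=torch.float32))
# shuffle=True makes the sampler draw a new permutation of the indices each epoch.
loader = DataLoader(dataset, batch_size=4, shuffle=True)

for epoch in range(2):
    for (batch,) in loader:
        print(epoch, batch.tolist())  # batch contents differ between the two epochs
```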

Similarly, this PR's reshuffle_each_epoch flag controls whether mini-batch indices are reshuffled every epoch (True), or fixed after the initial shuffle (False), providing flexibility in how training data is fed during reinforcement learning updates.

Testing

  • Verified that reshuffle_each_epoch=True produces new mini-batch orders per epoch.
  • Verified that reshuffle_each_epoch=False preserves the mini-batch order across epochs (a rough sketch of both checks follows this list).
  • All existing unit tests and pre-commit hooks pass successfully.
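
Using the generator sketch above, the two behaviors can be checked roughly as follows; this is a hypothetical helper, not the actual test code:

```python
import torch

def epoch_orders(reshuffle):
    # Collect the mini-batch index order of two consecutive epochs.
    torch.manual_seed(0)
    gen = mini_batch_generator(num_samples=32, num_mini_batches=4,
                               num_epochs=2, reshuffle_each_epoch=reshuffle)
    batches = [b.tolist() for b in gen]
    return batches[:4], batches[4:]

first, second = epoch_orders(reshuffle=False)
assert first == second   # fixed order is preserved across epochs
first, second = epoch_orders(reshuffle=True)
assert first != second   # order is re-drawn each epoch (holds with high probability)
```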

Please review and advise if further adjustments are necessary.

@ClemensSchwarke (Collaborator)

Hi @Kyu3224,
Thanks a lot for your PR! Do you have evidence that this improves performance by any chance?

@Kyu3224 (Author) commented Jul 18, 2025

Thank you for your feedback. I do not have strong supporting evidence ready at the moment, but I plan to prepare and share the relevant data within the next two weeks.

@ClemensSchwarke (Collaborator)

Awesome, looking forward to that!
