Skip to content

fix: Enforce input length to match the modulus #1905

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

vadorovsky
Copy link
Contributor

@vadorovsky vadorovsky commented Aug 7, 2025

To avoid misunderstandings related to padding in Poseidon hash inputs, enforce them to have size 32, which is the number of bytes in Fr modulus.

Summary by CodeRabbit

  • New Features

    • Added trait-based integer-to-byte-array support for isize and usize.
  • Bug Fixes

    • Enforced strict 32-byte input validation for hashing to prevent invalid inputs.
    • Standardized integer-to-byte-array conversions and introduced explicit padding to ensure consistent hash inputs.
  • Tests

    • Updated tests and utilities to use the new conversion and aligned byte representations.

Copy link
Contributor

coderabbitai bot commented Aug 7, 2025

Walkthrough

Standardizes integer/BigInt byte serialization to ToByteArray::to_byte_array().unwrap() across tests and utilities, implements ToByteArray for isize/usize, pads certain 31-byte data to 32 bytes before hashing in two tests, and adds unconditional 32-byte input validation in Poseidon’s hashv.

Changes

Cohort / File(s) Change Summary
Integer/BigInt → ToByteArray (tests & utils)
program-libs/compressed-account/src/compressed_account.rs, program-libs/indexed-merkle-tree/src/array.rs, program-libs/indexed-merkle-tree/tests/tests.rs, program-tests/utils/src/mock_batched_forester.rs, program-tests/utils/src/test_batch_forester.rs, prover/client/src/proof_types/batch_update/proof_inputs.rs, sdk-libs/macros/tests/hasher.rs, program-libs/batched-merkle-tree/tests/merkle_tree.rs, program-tests/merkle-tree/tests/indexed.rs, prover/client/tests/batch_update.rs
Replaced .to_be_bytes() / BigInt .to_bytes_be() usages with ToByteArray::to_byte_array().unwrap() and added imports; callsites now pass trait-produced byte arrays (commonly 32 bytes) into Poseidon hash calls.
Extend ToByteArray impls
program-libs/hasher/src/to_byte_array.rs
Added ToByteArray implementations for isize and usize via existing macro.
Poseidon input validation
program-libs/hasher/src/poseidon.rs
Added unconditional pre-check in hashv to return HasherError::InvalidInputLength(32, len) if any input slice length != 32; removed per-input checks inside the platform-specific branch.
Pad RegisteredUser data to 32 bytes before hashing
program-tests/create-address-test-program/src/create_pda.rs, program-tests/system-cpi-test/src/create_pda.rs
Replace hashing over a 31-byte data slice with hashing a new 32-byte buffer where original 31 bytes are copied starting at index 1 (leading zero remains at index 0); no signature changes.

Sequence Diagram(s)

sequenceDiagram
    participant Callsite as Test / Callsite
    participant ToByte as ToByteArray impl
    participant Poseidon as Poseidon::hashv

    Callsite->>ToByte: .to_byte_array().unwrap() on integer/BigInt
    ToByte-->>Callsite: returns byte-array (typically 32 bytes)
    Callsite->>Poseidon: hashv(&[... byte slices ...])
    Poseidon->>Poseidon: validate each slice.len() == 32 -> Err if any mismatch
    alt all slices valid
        Poseidon-->>Callsite: Ok(Hash)
    end
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~15–20 minutes

Suggested reviewers

  • sergeytimoshin
  • SwenSchaeferjohann

Poem

I nibble bytes in fields of green,
I pad a gap where thirty-one had been.
Integers don cloaks of thirty-two,
Poseidon checks each stitch anew.
A rabbit hums — the hashes purr and preen. 🐇✨

Tip

🔌 Remote MCP (Model Context Protocol) integration is now available!

Pro plan users can now connect to remote MCP servers from the Integrations page. Connect with popular remote MCPs such as Notion and Linear to add more context to your reviews and chats.

✨ Finishing Touches
  • 📝 Generate Docstrings
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

CodeRabbit Commands (Invoked using PR/Issue comments)

Type @coderabbitai help to get the list of available commands.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Status, Documentation and Community

  • Visit our Status Page to check the current availability of CodeRabbit.
  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Comment on lines +136 to +139
let mut data = [0u8; 32];
data[1..].copy_from_slice(&self.data);

H::hashv(&[truncated_user_pubkey.as_slice(), self.data.as_slice()])
H::hashv(&[truncated_user_pubkey.as_slice(), data.as_slice()])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are you sure this is necessary?
hash_to_bn254_field_size_be should already truncate to less than field size.

Copy link
Contributor Author

@vadorovsky vadorovsky Aug 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The truncated_user_pubkey variable and the result of hash_to_bn254_field_size_be are OK, they are not a problem.

But self.data is of type [0; 31], so I need to "resize" it to 32. Another option would be changing it to [u8; 32] in the struct.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you are right got it

@vadorovsky vadorovsky changed the title hasher: Enforce input length to match the modulus fix(hasher): Enforce input length to match the modulus Aug 7, 2025
@vadorovsky vadorovsky force-pushed the enforce-modulus-size branch from 25f0a77 to ed718cf Compare August 7, 2025 13:43
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7fb1a83 and ed718cf.

📒 Files selected for processing (11)
  • program-libs/compressed-account/src/compressed_account.rs (2 hunks)
  • program-libs/hasher/src/poseidon.rs (1 hunks)
  • program-libs/hasher/src/to_byte_array.rs (1 hunks)
  • program-libs/indexed-merkle-tree/src/array.rs (5 hunks)
  • program-libs/indexed-merkle-tree/tests/tests.rs (1 hunks)
  • program-tests/create-address-test-program/src/create_pda.rs (1 hunks)
  • program-tests/system-cpi-test/src/create_pda.rs (1 hunks)
  • program-tests/utils/src/mock_batched_forester.rs (2 hunks)
  • program-tests/utils/src/test_batch_forester.rs (2 hunks)
  • prover/client/src/proof_types/batch_update/proof_inputs.rs (2 hunks)
  • sdk-libs/macros/tests/hasher.rs (3 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (21)
  • GitHub Check: system-programs (random-e2e-test, ["cargo-test-sbf -p e2e-test"])
  • GitHub Check: system-programs (system-cpi-test-v2-other, ["cargo-test-sbf -p system-cpi-v2-test -- --skip funct...
  • GitHub Check: system-programs (system-cpi-test, ["cargo-test-sbf -p system-cpi-test"])
  • GitHub Check: system-programs (system-cpi-test-v2-event, ["cargo-test-sbf -p system-cpi-v2-test -- event::parse"])
  • GitHub Check: system-programs (light-compressed-token, ["cargo-test-sbf -p compressed-token-test"])
  • GitHub Check: system-programs (light-system-program, ["cargo-test-sbf -p system-test"])
  • GitHub Check: system-programs (light-registry, ["cargo-test-sbf -p registry-test"])
  • GitHub Check: system-programs (account-compression, ["cargo-test-sbf -p account-compression-test"])
  • GitHub Check: system-programs (system-cpi-test-v2-functional, ["cargo-test-sbf -p system-cpi-v2-test -- functio...
  • GitHub Check: stateless-js-v1
  • GitHub Check: lint
  • GitHub Check: system-programs (sdk-test-program, ["cargo-test-sbf -p sdk-test"])
  • GitHub Check: system-programs (sdk-anchor-test-program, ["cargo-test-sbf -p sdk-anchor-test", "cargo-test-sbf -...
  • GitHub Check: stateless-js-v2
  • GitHub Check: cli-v2
  • GitHub Check: Test concurrent-merkle-tree
  • GitHub Check: Test sdk-libs
  • GitHub Check: Test batched-merkle-tree-simulate
  • GitHub Check: Test program-libs-fast
  • GitHub Check: Test program-libs-slow
  • GitHub Check: cli-v1
🔇 Additional comments (15)
program-libs/hasher/src/to_byte_array.rs (1)

70-71: LGTM! Consistent extension of ToByteArray trait.

The implementations for isize and usize follow the established pattern using the existing macro, ensuring consistent behavior with other integer types.

program-libs/indexed-merkle-tree/src/array.rs (2)

466-466: LGTM! Import addition for ToByteArray trait.

The import enables the standardized byte array conversion pattern used throughout the codebase.


565-565: LGTM! Consistent adoption of ToByteArray trait.

The replacement of .to_be_bytes() with .to_byte_array().unwrap() standardizes integer-to-byte-array conversions across the codebase. The use of .unwrap() is safe here since these are deterministic conversions of primitive types.

Also applies to: 635-635, 715-715, 810-810

prover/client/src/proof_types/batch_update/proof_inputs.rs (2)

1-3: LGTM! Import addition for ToByteArray trait.

The import enables standardized byte array conversion with proper 32-byte padding for hash inputs.


96-96: LGTM! Standardized byte array conversion for hash input.

The change from .to_be_bytes() to .to_byte_array().unwrap() ensures consistent 32-byte padding for the index before Poseidon hashing, aligning with the PR objective of enforcing input length to match the modulus.

program-libs/indexed-merkle-tree/tests/tests.rs (1)

642-644: LGTM! Proper standardization of byte array conversion for hashing.

The change from to_bytes_be() to bigint_to_be_bytes_array::<32>().unwrap() correctly enforces fixed-size 32-byte arrays for Poseidon hash inputs, which aligns with the PR objective to standardize input lengths and avoid padding-related misunderstandings.

Also applies to: 651-653

sdk-libs/macros/tests/hasher.rs (2)

144-145: LGTM! Consistent adoption of ToByteArray trait.

The conversion from .to_be_bytes() to .to_byte_array().unwrap() standardizes integer-to-byte-array conversions in the test code, ensuring consistent behavior with the broader codebase changes.

Also applies to: 166-167


498-509: LGTM! Proper standardization of u64 byte conversions.

The systematic replacement of .to_be_bytes() with .to_byte_array().unwrap() for all u64 values maintains test accuracy while adopting the standardized byte conversion approach used throughout the codebase.

program-tests/create-address-test-program/src/create_pda.rs (1)

136-139: LGTM! Proper padding of 31-byte data to 32 bytes for consistent hashing.

The implementation correctly addresses the need for fixed-size 32-byte inputs to the Poseidon hasher by padding the 31-byte data field with a leading zero byte. This approach ensures consistent input size while maintaining the original data integrity.

program-tests/utils/src/test_batch_forester.rs (2)

28-30: LGTM: Import standardization for byte array conversion.

The addition of the ToByteArray trait import supports the codebase-wide standardization of integer-to-byte-array conversions for hash operations.


274-274: LGTM: Standardized byte conversion for hash input.

The change from to_be_bytes() to to_byte_array().unwrap() standardizes integer-to-byte-array conversion for Poseidon hash inputs. The unwrap() is safe here since usize conversion should not fail.

program-tests/utils/src/mock_batched_forester.rs (2)

4-6: LGTM: Import standardization for byte array conversion.

The addition of the ToByteArray trait import is consistent with the codebase-wide standardization effort for integer-to-byte-array conversions.


191-191: LGTM: Standardized byte conversion for hash input.

The change from to_be_bytes() to to_byte_array().unwrap() aligns with the standardization of byte array conversions for Poseidon hash inputs. The unwrap() is safe for usize conversion.

program-libs/compressed-account/src/compressed_account.rs (1)

381-381: LGTM: Test import standardization for byte array conversion.

The addition of the ToByteArray trait import in the test module supports the standardization of byte array conversions for hash operations.

program-libs/hasher/src/poseidon.rs (1)

86-90: Excellent: Input validation enforces 32-byte requirement.

This validation perfectly aligns with the PR objective to enforce that Poseidon hash inputs have a size of 32 bytes (matching the Fr modulus). The fail-fast approach with clear error messaging will help avoid misunderstandings related to input padding.

@vadorovsky vadorovsky changed the title fix(hasher): Enforce input length to match the modulus fix: Enforce input length to match the modulus Aug 12, 2025
@vadorovsky vadorovsky force-pushed the enforce-modulus-size branch from ed718cf to 5e98e1e Compare August 12, 2025 11:13
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🔭 Outside diff range comments (1)
program-libs/indexed-merkle-tree/tests/tests.rs (1)

637-656: Update non-32-byte Poseidon::hashv inputs

A few test callsites still pass variable-length slices into Poseidon::hashv. Please update them to use 32-byte helpers (.to_byte_array() or bigint_to_be_bytes_array::<32>()):

  • prover/client/tests/batch_update.rs:36

    // before: yields an 8-byte slice
    Poseidon::hashv(&[&leaf, &(i as usize).to_be_bytes(), &tx_hash]).unwrap();

    Replace with:

    let index_bytes = i.to_byte_array().unwrap();  // [u8;32]
    Poseidon::hashv(&[&leaf, &index_bytes, &tx_hash]).unwrap();
  • program-tests/merkle-tree/tests/indexed.rs:40–43 & 48–51

    // before: variable-length Vec<u8>
    Poseidon::hashv(&[
        &0_u32.to_biguint().unwrap().to_bytes_be(),
        &30_u32.to_biguint().unwrap().to_bytes_be(),
    ])

    Replace each .to_bytes_be() with:

    &bigint_to_be_bytes_array::<32>(&0_u32.to_biguint().unwrap()).unwrap(),
    &bigint_to_be_bytes_array::<32>(&30_u32.to_biguint().unwrap()).unwrap(),

After these changes, all Poseidon::hashv calls will receive exactly 32-byte slices.

🧹 Nitpick comments (1)
program-libs/indexed-merkle-tree/src/array.rs (1)

91-99: Optional: unify next_index encoding with ToByteArray in production code.

Currently, next_index is manually expanded into a 32-byte buffer using to_be_bytes(). This is correct. For consistency with tests and other callsites, consider using ToByteArray for next_index too, if it’s infallible for the concrete I types you support (e.g., u16/usize). If it’s fallible, keeping the current approach is fine.

Example change (only if ToByteArray for I is infallible in your contexts):

-        let mut bytes = [0u8; 32];
-        let len = std::mem::size_of::<I>();
-        bytes[32 - len..].copy_from_slice(self.next_index.to_be_bytes().as_ref());
+        // If ToByteArray<I> is available and infallible, prefer it for consistency:
+        // let bytes = self.next_index.to_byte_array().expect("infallible for integer indices");
+        // Otherwise, keep the current explicit big-endian padding:
+        let mut bytes = [0u8; 32];
+        let len = core::mem::size_of::<I>();
+        bytes[32 - len..].copy_from_slice(self.next_index.to_be_bytes().as_ref());
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ed718cf and 5e98e1e.

📒 Files selected for processing (12)
  • program-libs/batched-merkle-tree/tests/merkle_tree.rs (2 hunks)
  • program-libs/compressed-account/src/compressed_account.rs (2 hunks)
  • program-libs/hasher/src/poseidon.rs (1 hunks)
  • program-libs/hasher/src/to_byte_array.rs (1 hunks)
  • program-libs/indexed-merkle-tree/src/array.rs (5 hunks)
  • program-libs/indexed-merkle-tree/tests/tests.rs (1 hunks)
  • program-tests/create-address-test-program/src/create_pda.rs (1 hunks)
  • program-tests/system-cpi-test/src/create_pda.rs (1 hunks)
  • program-tests/utils/src/mock_batched_forester.rs (2 hunks)
  • program-tests/utils/src/test_batch_forester.rs (2 hunks)
  • prover/client/src/proof_types/batch_update/proof_inputs.rs (2 hunks)
  • sdk-libs/macros/tests/hasher.rs (3 hunks)
🚧 Files skipped from review as they are similar to previous changes (9)
  • program-libs/compressed-account/src/compressed_account.rs
  • program-tests/create-address-test-program/src/create_pda.rs
  • program-tests/utils/src/mock_batched_forester.rs
  • program-tests/utils/src/test_batch_forester.rs
  • sdk-libs/macros/tests/hasher.rs
  • program-libs/hasher/src/poseidon.rs
  • program-libs/hasher/src/to_byte_array.rs
  • program-tests/system-cpi-test/src/create_pda.rs
  • prover/client/src/proof_types/batch_update/proof_inputs.rs
🔇 Additional comments (9)
program-libs/indexed-merkle-tree/tests/tests.rs (2)

641-647: Aligned Poseidon inputs to fixed 32-byte encoding (LGTM).

Switching to bigint_to_be_bytes_array::<32>(...) ensures Poseidon::hashv receives exactly 32-byte big-endian inputs and matches the Fr modulus width. This avoids padding ambiguities and aligns with the new input-length enforcement.


649-656: Consistent 32-byte representation for all tuple elements (LGTM).

The second assertion mirrors the first with consistent 32-byte inputs. Good job keeping the representation uniform across value, index, and next value.

program-libs/indexed-merkle-tree/src/array.rs (5)

466-466: Importing ToByteArray for tests is appropriate (LGTM).

This enables 32-byte index encoding in test vectors to align with Poseidon’s stricter input validation.


561-570: Use of ToByteArray for next_index in expected hash (LGTM).

0_usize.to_byte_array().unwrap() makes the expected input width explicit (32 bytes). This matches how hash_element encodes next_index internally.


631-640: Consistent 32-byte encoding for index 1 (LGTM).

The 1_usize.to_byte_array().unwrap() usage keeps the test deterministic under the new hash input contract.


711-721: 32-byte index encoding maintained for later assertions (LGTM).

Staying with ToByteArray prevents accidental regressions to non-32-byte encodings.


806-815: Final expected-hash computation adheres to 32-byte rule (LGTM).

All three Poseidon inputs are 32 bytes; good consistency.

program-libs/batched-merkle-tree/tests/merkle_tree.rs (2)

39-39: Trait import is correct and necessary (LGTM).

Bringing ToByteArray into scope ensures the leaf_index can be serialized to a 32-byte array for Poseidon hashing.


65-66: Poseidon nullifier inputs now strictly 32 bytes (LGTM).

  • leaf_index serialized via to_byte_array() → 32 bytes
  • insert_value and tx_hash are [u8; 32]
    This aligns with Poseidon::hashv’s input validation and avoids implicit padding differences.

Comment on lines 63 to 68
let mut leaf_hash_chain_insert_values = vec![];
for (insert_value, leaf_index) in bloom_filter_insert_values.iter().zip(leaf_indices.iter()) {
let nullifier =
Poseidon::hashv(&[insert_value.as_slice(), &leaf_index.to_be_bytes(), &tx_hash])
.unwrap();
let leaf_index = leaf_index.to_byte_array().unwrap();
let nullifier = Poseidon::hashv(&[insert_value.as_slice(), &leaf_index, &tx_hash]).unwrap();
leaf_hash_chain_insert_values.push(nullifier);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Verification agent

🧩 Analysis chain

Quick repo-wide audit to catch non-32-byte Poseidon inputs.

To reduce the chance of a missed callsite (especially in tests/fixtures), run the search below and adjust any remaining .to_be_bytes() or variable-length to_bytes_be() inputs to 32-byte encodings.


🏁 Script executed:

#!/bin/bash
# Show Poseidon::hashv callsites with context
rg -n -A 3 -B 3 $'Poseidon::hashv\\('

# Identify risky conversions that commonly produce non-32-byte inputs
rg -n '\.to_be_bytes\('
rg -n 'to_bytes_be\('

# Confirm adoption of 32-byte conversion helpers
rg -n 'to_byte_array\('
rg -n 'bigint_to_be_bytes_array::<32>'

Length of output: 75413


Enforce 32-byte encodings for all Poseidon::hashv inputs

Several hash callsites still pass variable-length byte slices. Please update the following to use fixed 32-byte arrays (via to_byte_array() or bigint_to_be_bytes_array::<32>()):

• prover/client/tests/batch_update.rs:36

- let nullifier = Poseidon::hashv(&[&leaf, &(i as usize).to_be_bytes(), &tx_hash]).unwrap();
+ let index_bytes = (i as usize).to_byte_array().unwrap();
+ let nullifier = Poseidon::hashv(&[&leaf, &index_bytes, &tx_hash]).unwrap();

• program-tests/merkle-tree/tests/indexed.rs:40–43 & 48–51

- let leaf_0 = Poseidon::hashv(&[
-     &0_u32.to_biguint().unwrap().to_bytes_be(),
-     &30_u32.to_biguint().unwrap().to_bytes_be(),
- ]).unwrap();
+ let leaf_0 = Poseidon::hashv(&[
+     &bigint_to_be_bytes_array::<32>(&0_u32.to_biguint().unwrap()).unwrap(),
+     &bigint_to_be_bytes_array::<32>(&30_u32.to_biguint().unwrap()).unwrap(),
+ ]).unwrap();

- let leaf_1 = Poseidon::hashv(&[
-     &30_u32.to_biguint().unwrap().to_bytes_be(),
-     &0_u32.to_biguint().unwrap().to_bytes_be(),
- ]).unwrap();
+ let leaf_1 = Poseidon::hashv(&[
+     &bigint_to_be_bytes_array::<32>(&30_u32.to_biguint().unwrap()).unwrap(),
+     &bigint_to_be_bytes_array::<32>(&0_u32.to_biguint().unwrap()).unwrap(),
+ ]).unwrap();

After making these changes, rerun the repo-wide audit to confirm no direct .to_be_bytes() or variable-length .to_bytes_be() slices are passed to Poseidon::hashv.

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In program-libs/batched-merkle-tree/tests/merkle_tree.rs around lines 63–68, the
Poseidon::hashv call passes variable-length byte slices; convert each input to a
fixed 32-byte array and pass those. Replace the current usage of
insert_value.as_slice(), the local leaf_index Vec, and tx_hash slice with
32-byte arrays (e.g. call insert_value.to_byte_array().unwrap().try_into() or
preferably bigint_to_be_bytes_array::<32>() for numeric types, and use
leaf_index.to_byte_array().unwrap().try_into() or
bigint_to_be_bytes_array::<32>() for the index, and the same for tx_hash) and
then pass their references (as &[u8;32] or .as_slice()) into Poseidon::hashv.

@vadorovsky vadorovsky force-pushed the enforce-modulus-size branch 2 times, most recently from 2e69dee to 10710fc Compare August 12, 2025 12:19
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (3)
program-libs/batched-merkle-tree/tests/merkle_tree.rs (1)

65-66: Avoid shadowing and clarify intent when encoding the index to 32 bytes.

The conversion is correct and satisfies the new 32-byte requirement. Minor readability nit: don’t shadow leaf_index; name the encoded bytes explicitly and prefer expect(...) for a clearer panic on impossible failures.

Apply this diff:

-        let leaf_index = leaf_index.to_byte_array().unwrap();
-        let nullifier = Poseidon::hashv(&[insert_value.as_slice(), &leaf_index, &tx_hash]).unwrap();
+        let leaf_index_bytes = leaf_index
+            .to_byte_array()
+            .expect("u64 -> 32-byte array conversion should never fail");
+        let nullifier =
+            Poseidon::hashv(&[insert_value.as_slice(), &leaf_index_bytes, &tx_hash]).unwrap();
program-tests/merkle-tree/tests/indexed.rs (1)

1-3: Optional: unify remaining bigint_to_be_bytes_array usages to ToByteArray for consistency

To fully standardize on ToByteArray in this test, consider replacing:

  • bigint_to_be_bytes_array::<32>(&10_u32.to_biguint().unwrap()).unwrap()
  • bigint_to_be_bytes_array::<32>(&30_u32.to_biguint().unwrap()).unwrap()

with direct ToByteArray calls:

assert_eq!(non_inclusion_proof.value, 10_u32.to_byte_array().unwrap());
assert_eq!(non_inclusion_proof.leaf_higher_range_value, 30_u32.to_byte_array().unwrap());

This removes the BigUint roundtrip, keeps everything consistently 32-byte big-endian, and aligns with the rest of the changes. If you do this, you can also drop the bigint_to_be_bytes_array import if it becomes unused.

prover/client/tests/batch_update.rs (1)

34-36: Remove unnecessary cast and misapplied clippy allow.

  • i is already a usize (inferred from vec![tx_hash; num_insertions] and the range), so (i as usize) is redundant.
  • The #[allow(clippy::unnecessary_cast)] attribute does not apply to the cast on the previous line and is now stale.

Apply this minimal diff:

-            let index_bytes = (i as usize).to_byte_array().unwrap();
-            #[allow(clippy::unnecessary_cast)]
+            let index_bytes = i.to_byte_array().unwrap();
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5e98e1e and 10710fc.

📒 Files selected for processing (14)
  • program-libs/batched-merkle-tree/tests/merkle_tree.rs (2 hunks)
  • program-libs/compressed-account/src/compressed_account.rs (2 hunks)
  • program-libs/hasher/src/poseidon.rs (1 hunks)
  • program-libs/hasher/src/to_byte_array.rs (1 hunks)
  • program-libs/indexed-merkle-tree/src/array.rs (5 hunks)
  • program-libs/indexed-merkle-tree/tests/tests.rs (1 hunks)
  • program-tests/create-address-test-program/src/create_pda.rs (1 hunks)
  • program-tests/merkle-tree/tests/indexed.rs (2 hunks)
  • program-tests/system-cpi-test/src/create_pda.rs (1 hunks)
  • program-tests/utils/src/mock_batched_forester.rs (2 hunks)
  • program-tests/utils/src/test_batch_forester.rs (2 hunks)
  • prover/client/src/proof_types/batch_update/proof_inputs.rs (2 hunks)
  • prover/client/tests/batch_update.rs (1 hunks)
  • sdk-libs/macros/tests/hasher.rs (3 hunks)
🚧 Files skipped from review as they are similar to previous changes (11)
  • program-libs/hasher/src/to_byte_array.rs
  • program-tests/create-address-test-program/src/create_pda.rs
  • sdk-libs/macros/tests/hasher.rs
  • program-libs/compressed-account/src/compressed_account.rs
  • program-libs/indexed-merkle-tree/tests/tests.rs
  • program-tests/system-cpi-test/src/create_pda.rs
  • program-tests/utils/src/mock_batched_forester.rs
  • program-libs/indexed-merkle-tree/src/array.rs
  • program-tests/utils/src/test_batch_forester.rs
  • prover/client/src/proof_types/batch_update/proof_inputs.rs
  • program-libs/hasher/src/poseidon.rs
🔇 Additional comments (4)
program-libs/batched-merkle-tree/tests/merkle_tree.rs (1)

39-39: Adopts canonical 32-byte conversions for Poseidon inputs — good.

Importing ToByteArray here aligns the test helper with the PR’s enforcement of 32-byte Poseidon inputs.

program-tests/merkle-tree/tests/indexed.rs (2)

1-3: Imports updated to bring ToByteArray into scope — aligns with 32-byte enforcement

Good change. This makes the test inputs consistent with Poseidon’s 32-byte requirement and the repo-wide shift to ToByteArray.


43-45: Use of ToByteArray here is correct and enforces 32-byte inputs to Poseidon::hashv

Switching from biguint.to_bytes_be() to u32.to_byte_array().unwrap() ensures fixed 32-byte big-endian encoding and prevents accidental length mismatches. This directly satisfies the PR objective.

Also applies to: 51-53

prover/client/tests/batch_update.rs (1)

34-36: Switching to 32-byte serialization for the index is correct and aligns with Poseidon’s input contract.

Using to_byte_array().unwrap() ensures a canonical 32-byte representation and satisfies the new hashv length enforcement. The ordering [leaf, index, tx_hash] remains unchanged.

Comment on lines 41 to 47
leaf_0,
Poseidon::hashv(&[
&0_u32.to_biguint().unwrap().to_bytes_be(),
&30_u32.to_biguint().unwrap().to_bytes_be()
&0_u32.to_byte_array().unwrap(),
&30_u32.to_byte_array().unwrap(),
])
.unwrap()
);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Verification agent

🧩 Analysis chain

Verify leaf hashing arity vs. inline comment (“H(0, 1, 30)” vs. two-field hash calls)

The inline comments say the leaf is H(value, next_index, next_value), but the assertions hash only two inputs. If the leaf really includes next_index, this test may be under-specifying the expected hash. If the design intentionally moved to two inputs, please update the comment to avoid confusion.

Run this script to find Poseidon::hashv call sites and check how many fields are hashed for leaves:

Also applies to: 49-55


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Show Poseidon::hashv usage with a bit of context
rg -n "Poseidon::hashv\\(&\\[" -A 6 -B 2

# Inspect leaf construction sites in the indexed merkle tree library to confirm arity
rg -n "leaf\\(|leaf_hash|hash_leaf|Hash\\(value, next_index, next_value\\)|H\\(" -S -A 6 -B 2

Length of output: 304588


Update inline comments to match two-input leaf hash

The test’s comments still describe a three-input hash (H(value, next_index, next_value)), but the code now calls Poseidon::hashv(&[value, next_value]). Please update both comment lines so they reflect the two-field hash arity:

• File: program-tests/merkle-tree/tests/indexed.rs
Lines ~27–28

Suggested diff:

- // leaf index: 0 = H(0, 1, 30) // Hash(value, next_index, next_value)
+ // leaf index: 0 = H(0, 30)    // Hash(value, next_value)
- // leaf index: 1 = H(30, 0, 0)
+ // leaf index: 1 = H(30, 0)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
leaf_0,
Poseidon::hashv(&[
&0_u32.to_biguint().unwrap().to_bytes_be(),
&30_u32.to_biguint().unwrap().to_bytes_be()
&0_u32.to_byte_array().unwrap(),
&30_u32.to_byte_array().unwrap(),
])
.unwrap()
);
// leaf index: 0 = H(0, 30) // Hash(value, next_value)
// leaf index: 1 = H(30, 0) // Hash(value, next_value)
🤖 Prompt for AI Agents
In program-tests/merkle-tree/tests/indexed.rs around lines 27–28 and 41–47, the
inline comment still describes a three-input leaf hash H(value, next_index,
next_value) while the code uses Poseidon::hashv(&[value, next_value]) (two
inputs); update the comment text to reflect the two-field hash arity (e.g.,
change to H(value, next_value) and any explanatory text referencing three fields
to reference two), keeping surrounding wording consistent with the new two-input
hash.

@vadorovsky vadorovsky force-pushed the enforce-modulus-size branch from 10710fc to 2bbb4d9 Compare August 14, 2025 12:10
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🔭 Outside diff range comments (6)
sdk-libs/macros/tests/hasher.rs (6)

166-176: Use Option::to_byte_array() instead of manually composing Some-marker + to_be_bytes

Manual construction risks drifting from the canonical encoding and is harder to maintain. Leverage the ToByteArray impl for Option<u64> to encode the marker and value consistently.

Apply this diff:

-            {
-                let mut bytes = vec![0; 32];
-                bytes[24..].copy_from_slice(&account.f.unwrap().to_be_bytes());
-                bytes[23] = 1; // Suffix with 1 for Some
-                bytes
-            },
+            account.f.to_byte_array().unwrap().to_vec(),

Also, the comment says “Suffix” but the marker byte is placed just before the value bytes (effectively a prefix/marker slot). Using the trait avoids this ambiguity.


198-209: Use Option::to_byte_array() in zero-values case for consistency

Replace the manual Some(0) encoding with the trait implementation.

             {
-                let mut bytes = [0u8; 32];
-                bytes[24..].copy_from_slice(&zero_account.f.unwrap().to_be_bytes());
-                bytes[23] = 1; // Suffix with 1 for Some
-                bytes
+                zero_account.f.to_byte_array().unwrap()
             },

609-626: Prefer Option::{to_byte_array} over manual marker/value encoding for primitives

Replace both u32/u64 manual encodings with the canonical trait impls to avoid skew if encoding rules evolve.

-            {
-                let mut bytes = [0u8; 32];
-                bytes[28..].copy_from_slice(&42u32.to_be_bytes());
-                bytes[27] = 1; // Prefix with 1 for Some
-                bytes
-            },
-            {
-                let mut bytes = [0u8; 32];
-                bytes[24..].copy_from_slice(&u64::MAX.to_be_bytes());
-                bytes[23] = 1; // Prefix with 1 for Some
-                bytes
-            },
+            test_struct.small.to_byte_array().unwrap(),
+            test_struct.large.to_byte_array().unwrap(),

828-835: MixedOptions: use Option::to_byte_array() for u32

Same rationale as above: fewer chances of divergence and cleaner tests.

-            {
-                let mut bytes = [0u8; 32];
-                bytes[28..].copy_from_slice(&42u32.to_be_bytes());
-                bytes[27] = 1;
-                bytes
-            },
+            test_struct.basic.to_byte_array().unwrap(),

918-923: OuterStruct: use Option::to_byte_array() for u64

Let the trait handle marker placement and padding.

-        {
-            let mut bytes = [0u8; 32];
-            bytes[24..].copy_from_slice(&u64::MAX.to_be_bytes());
-            bytes[23] = 1;
-            bytes
-        }
+        test_struct.basic.to_byte_array().unwrap()

1-10: Fix: replace manual to_be_bytes() encodings with ToByteArray in hasher tests

rg found lingering manual encodings that should use ToByteArray to guarantee 32-byte inputs for the hasher. Please migrate these usages.

  • Files/locations to fix:
    • sdk-libs/macros/tests/hasher.rs — lines ~172, 204–205, 611–618, 830–831, 919–920 (manual to_be_bytes() + building 32-byte buffers / copy_from_slice)
  • Suggested replacement pattern:
    • Replace
      let mut bytes = [0u8; 32];
      bytes[24..].copy_from_slice(&value.to_be_bytes());
    • With
      let bytes = value.to_byte_array();

Also note: the repo-wide search shows other to_be_bytes() / to_bytes_be() occurrences — audit those as well and migrate where the value is intended as a 32-byte hasher input.

🧹 Nitpick comments (2)
sdk-libs/macros/tests/hasher.rs (2)

161-164: Remove duplicate assertion

assert_eq!(nested_hash_result, manual_nested_hash); appears twice; keep one.

         assert_eq!(nested_hash_result, manual_nested_hash);
-        assert_eq!(manual_nested_hash, nested_reference_hash);
-        assert_eq!(nested_hash_result, manual_nested_hash);
+        assert_eq!(manual_nested_hash, nested_reference_hash);

396-399: Remove stray debug prints in tests

These println!("1"); calls add noise to test output and provide no value.

-            println!("1");
             assert_eq!(with_none.to_byte_array().unwrap(), none_bytes[0]);
-            println!("1");
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these settings in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 10710fc and 2bbb4d9.

📒 Files selected for processing (14)
  • program-libs/batched-merkle-tree/tests/merkle_tree.rs (2 hunks)
  • program-libs/compressed-account/src/compressed_account.rs (2 hunks)
  • program-libs/hasher/src/poseidon.rs (1 hunks)
  • program-libs/hasher/src/to_byte_array.rs (1 hunks)
  • program-libs/indexed-merkle-tree/src/array.rs (5 hunks)
  • program-libs/indexed-merkle-tree/tests/tests.rs (1 hunks)
  • program-tests/create-address-test-program/src/create_pda.rs (1 hunks)
  • program-tests/merkle-tree/tests/indexed.rs (2 hunks)
  • program-tests/system-cpi-test/src/create_pda.rs (1 hunks)
  • program-tests/utils/src/mock_batched_forester.rs (2 hunks)
  • program-tests/utils/src/test_batch_forester.rs (2 hunks)
  • prover/client/src/proof_types/batch_update/proof_inputs.rs (2 hunks)
  • prover/client/tests/batch_update.rs (2 hunks)
  • sdk-libs/macros/tests/hasher.rs (3 hunks)
🚧 Files skipped from review as they are similar to previous changes (13)
  • program-tests/create-address-test-program/src/create_pda.rs
  • program-tests/utils/src/test_batch_forester.rs
  • program-libs/compressed-account/src/compressed_account.rs
  • program-libs/indexed-merkle-tree/src/array.rs
  • program-tests/merkle-tree/tests/indexed.rs
  • program-libs/indexed-merkle-tree/tests/tests.rs
  • prover/client/tests/batch_update.rs
  • program-libs/hasher/src/to_byte_array.rs
  • program-tests/system-cpi-test/src/create_pda.rs
  • prover/client/src/proof_types/batch_update/proof_inputs.rs
  • program-libs/hasher/src/poseidon.rs
  • program-tests/utils/src/mock_batched_forester.rs
  • program-libs/batched-merkle-tree/tests/merkle_tree.rs
🧰 Additional context used
🧬 Code Graph Analysis (1)
sdk-libs/macros/tests/hasher.rs (1)
program-libs/hasher/src/to_byte_array.rs (4)
  • to_byte_array (6-6)
  • to_byte_array (32-47)
  • to_byte_array (55-59)
  • to_byte_array (152-161)
🔇 Additional comments (2)
sdk-libs/macros/tests/hasher.rs (2)

144-150: Switch to ToByteArray for primitive fields — aligns with 32-byte invariant (LGTM)

Using to_byte_array().unwrap() for i32 and u32 ensures each Poseidon input is exactly 32 bytes and consistent with the enforced length check.


498-510: Expected-hash inputs now use 32-byte ToByteArray — matches Poseidon::hashv enforcement (LGTM)

Good update to ensure each input slice is exactly 32 bytes.

To avoid misunderstandings related to padding in Poseidon hash inputs,
enforce them to have size 32, which is the number of bytes in `Fr`
modulus.
@vadorovsky vadorovsky force-pushed the enforce-modulus-size branch from 2bbb4d9 to adc8d1c Compare August 15, 2025 15:51
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
prover/client/tests/batch_update.rs (1)

36-39: Switched index encoding to 32 bytes for Poseidon — compliant with new validation

  • Using (i as usize).to_byte_array().unwrap() produces a 32-byte big-endian representation, ensuring Poseidon::hashv receives only 32-byte slices.
  • Replacing the prior to_be_bytes() (8 bytes) removes the prior mismatch with the hash input validator.

Minor optional: for consistency, you could also derive the leaf preimage via ToByteArray (outside this hunk), e.g., let bn = (i as u8).to_byte_array().unwrap(); to avoid manual padding.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 2bbb4d9 and adc8d1c.

📒 Files selected for processing (14)
  • program-libs/batched-merkle-tree/tests/merkle_tree.rs (2 hunks)
  • program-libs/compressed-account/src/compressed_account.rs (2 hunks)
  • program-libs/hasher/src/poseidon.rs (1 hunks)
  • program-libs/hasher/src/to_byte_array.rs (1 hunks)
  • program-libs/indexed-merkle-tree/src/array.rs (5 hunks)
  • program-libs/indexed-merkle-tree/tests/tests.rs (1 hunks)
  • program-tests/create-address-test-program/src/create_pda.rs (1 hunks)
  • program-tests/merkle-tree/tests/indexed.rs (2 hunks)
  • program-tests/system-cpi-test/src/create_pda.rs (1 hunks)
  • program-tests/utils/src/mock_batched_forester.rs (2 hunks)
  • program-tests/utils/src/test_batch_forester.rs (2 hunks)
  • prover/client/src/proof_types/batch_update/proof_inputs.rs (2 hunks)
  • prover/client/tests/batch_update.rs (2 hunks)
  • sdk-libs/macros/tests/hasher.rs (3 hunks)
🚧 Files skipped from review as they are similar to previous changes (12)
  • program-libs/hasher/src/to_byte_array.rs
  • program-libs/compressed-account/src/compressed_account.rs
  • program-tests/create-address-test-program/src/create_pda.rs
  • program-tests/utils/src/mock_batched_forester.rs
  • program-libs/batched-merkle-tree/tests/merkle_tree.rs
  • program-libs/indexed-merkle-tree/tests/tests.rs
  • program-libs/indexed-merkle-tree/src/array.rs
  • program-tests/utils/src/test_batch_forester.rs
  • prover/client/src/proof_types/batch_update/proof_inputs.rs
  • program-libs/hasher/src/poseidon.rs
  • sdk-libs/macros/tests/hasher.rs
  • program-tests/merkle-tree/tests/indexed.rs
🧰 Additional context used
🧬 Code Graph Analysis (2)
program-tests/system-cpi-test/src/create_pda.rs (2)
program-libs/hasher/src/poseidon.rs (1)
  • hashv (85-124)
program-libs/hasher/src/lib.rs (1)
  • hashv (28-28)
prover/client/tests/batch_update.rs (4)
program-libs/hasher/src/hash_chain.rs (1)
  • create_hash_chain_from_slice (23-32)
program-libs/hasher/src/to_byte_array.rs (4)
  • to_byte_array (6-6)
  • to_byte_array (32-47)
  • to_byte_array (55-59)
  • to_byte_array (152-161)
program-libs/hasher/src/poseidon.rs (1)
  • hashv (85-124)
program-tests/merkle-tree/src/lib.rs (1)
  • leaf (291-296)
🔇 Additional comments (2)
program-tests/system-cpi-test/src/create_pda.rs (1)

546-550: Poseidon input now correctly padded to 32 bytes and used in hashing — looks good

  • Left-padding self.data (31 bytes) into a 32-byte buffer and passing the padded data to hashv aligns with the new 32-byte requirement.
  • truncated_user_pubkey is already a 32-byte slice, so both inputs satisfy Poseidon::hashv’s length checks.
prover/client/tests/batch_update.rs (1)

1-3: Correctly imports the ToByteArray trait to enable .to_byte_array()

Bringing to_byte_array::ToByteArray into scope ensures method resolution and aligns the test with the repo-wide 32-byte input standardization.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants