
chore: hasher and light hasher macro support for sha256 #1892

Open
wants to merge 1 commit into base: main

Conversation

@SwenSchaeferjohann (Contributor) commented on Jul 29, 2025

  • The macro includes a test that hashes a large struct with SHA256 (see the sketch below).
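
For context, a hedged sketch of what such a test could look like, assuming the LightHasherSha derive introduced in this PR, the generic DataHasher::hash::<H>() API from light_hasher, and Borsh serialization. The struct, field names, and the zeroed-first-byte assertion follow the truncation behavior discussed in the review below and are illustrative, not copied from the PR:

use borsh::BorshSerialize;
use light_hasher::{DataHasher, Sha256};
use light_sdk_macros::LightHasherSha;

// Illustrative struct; no 12-field Poseidon limit applies to the SHA256 variant.
#[derive(BorshSerialize, LightHasherSha)]
pub struct LargeGameState {
    pub owner: [u8; 32],
    pub scores: [u64; 16],
    pub round: u32,
    pub board: [u8; 64],
    pub flags: u64,
}

#[test]
fn hash_large_struct_with_sha256() {
    let state = LargeGameState {
        owner: [1; 32],
        scores: [0; 16],
        round: 1,
        board: [0; 64],
        flags: 0,
    };
    // The derived impl Borsh-serializes the whole struct and hashes it with SHA256.
    let hash = state.hash::<Sha256>().unwrap();
    // Per the truncation logic discussed in the review, the first byte is zeroed.
    assert_eq!(hash[0], 0);
}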

Summary by CodeRabbit

  • New Features

    • Introduced SHA256-based variants of LightHasher and LightDiscriminator derive macros for struct hashing.
    • Added SHA256-based account wrappers with generic hashing support alongside existing Poseidon-based accounts.
    • Provided SHA256-specific modules, type aliases, and macro re-exports for streamlined integration.
  • Improvements

    • Generalized hashing macros and account wrappers to support multiple hashing algorithms via generics.
    • Removed field count limits and relaxed attribute constraints for the SHA256 hashing variant.
    • Added consistent error handling and truncation logic across hashing implementations.
  • Bug Fixes

    • Enhanced validation to reject unsupported attributes like #[flatten] in SHA256 hashing macros.

@coderabbitai (bot, Contributor) commented on Jul 29, 2025

Walkthrough

This change introduces SHA256 as an alternative hashing algorithm alongside Poseidon throughout the hasher libraries, macro derive system, and SDK. It generalizes the Hasher trait and its implementations, provides SHA256-specific derive macros and validation logic, and updates account abstractions to support generic hashers, including both Poseidon and SHA256 variants.
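
As a rough sketch of the trait change (abbreviated and hedged: the real trait in program-libs/hasher/src/lib.rs has more methods and a richer error type; only the new ID constant and the Poseidon = 0, Sha256 = 1, Keccak = 2 numbering below are taken from this PR):

pub type Hash = [u8; 32];

#[derive(Debug)]
pub enum HasherError {
    // Abbreviated; the real crate defines more variants.
    Borsh,
}

pub trait Hasher {
    /// Compile-time algorithm identifier; per this PR: Poseidon = 0, Sha256 = 1, Keccak = 2.
    const ID: u8;

    fn hash(val: &[u8]) -> Result<Hash, HasherError>;
    fn hashv(vals: &[&[u8]]) -> Result<Hash, HasherError>;
}

// Example implementation shape (illustrative; the real Sha256 impl may use a syscall on-chain).
pub struct Sha256;

impl Hasher for Sha256 {
    const ID: u8 = 1;

    fn hash(val: &[u8]) -> Result<Hash, HasherError> {
        Self::hashv(&[val])
    }

    fn hashv(vals: &[&[u8]]) -> Result<Hash, HasherError> {
        use sha2::{Digest, Sha256 as Sha2};
        let mut hasher = Sha2::new();
        for val in vals {
            hasher.update(val);
        }
        Ok(hasher.finalize().into())
    }
}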

Changes

  • Hasher trait and implementations (program-libs/hasher/src/lib.rs, program-libs/hasher/src/keccak.rs, program-libs/hasher/src/poseidon.rs, program-libs/hasher/src/sha256.rs): Added an associated constant ID to the Hasher trait and to each hasher implementation (Poseidon, Keccak, Sha256) for algorithm identification.
  • Macro: DataHasher and SHA256 support (sdk-libs/macros/src/hasher/data_hasher.rs, sdk-libs/macros/src/hasher/input_validator.rs): Added SHA256-specific data-hasher codegen and input-validation logic, including error handling and field truncation for non-Poseidon hashers.
  • Macro: LightHasher derive and SHA256 variant (sdk-libs/macros/src/hasher/light_hasher.rs, sdk-libs/macros/src/hasher/mod.rs, sdk-libs/macros/src/hasher/to_byte_array.rs, sdk-libs/macros/src/lib.rs): Added the LightHasherSha derive macro for SHA256, generalized internal codegen to support arbitrary hashers, and updated macro exports.
  • Macro: Discriminator SHA256 variant (sdk-libs/macros/src/discriminator.rs): Refactored discriminator generation to support a SHA256 variant via a shared internal function; added tests for the SHA256 discriminator.
  • SDK account abstraction (sdk-libs/sdk/src/account.rs, sdk-libs/sdk/src/lib.rs): Generalized LightAccount to support generic hashers, added SHA256-based variants, and updated methods and type aliases accordingly.

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant Macro
    participant HasherTrait
    participant Poseidon
    participant Sha256

    User->>Macro: #[derive(LightHasher)] or #[derive(LightHasherSha)]
    Macro->>HasherTrait: Use Hasher::ID to select hasher
    alt Poseidon path
        Macro->>Poseidon: Hash struct fields
        Poseidon-->>Macro: Poseidon hash result
    else SHA256 path
        Macro->>Sha256: Hash serialized struct
        Sha256-->>Macro: SHA256 hash result (first byte set to zero)
    end
    Macro-->>User: Implements DataHasher/ToByteArray for struct
sequenceDiagram
    participant User
    participant LightAccount
    participant HasherTrait

    User->>LightAccount: Create LightAccountInner with H: Hasher
    LightAccount->>HasherTrait: H::hash(account data)
    HasherTrait-->>LightAccount: Hash result (first byte zeroed if not Poseidon)
    LightAccount-->>User: Provides account info with hashed data

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

  • Lightprotocol/light-protocol#1795: Introduces AccountInfoTrait and refactors account handling, which is closely related to this PR's generalization of account abstractions and hasher support.

Suggested reviewers

  • sergeytimoshin

Poem

In fields of hash and bytes anew,
The rabbit hops with SHA in view.
Poseidon waves, but SHA256 joins in,
Structs now hashed with a broader grin.
Accounts adapt, macros grow—
The garden of hashes begins to glow!
🐇✨


@coderabbitai (bot, Contributor) left a comment

Actionable comments posted: 3

🧹 Nitpick comments (7)
program-libs/hasher/src/poseidon.rs (1)

81-82: LGTM - Correct trait implementation with minor formatting suggestion.

The ID = 0 assignment for Poseidon correctly implements the required trait constant. Assigning 0 to Poseidon makes sense as the baseline hasher that doesn't require truncation in macro-generated code.

Consider removing the extra blank line for consistency:

 impl Hasher for Poseidon {
     const ID: u8 = 0;
-
     fn hash(val: &[u8]) -> Result<Hash, HasherError> {
sdk-libs/macros/src/lib.rs (1)

259-283: Documentation needs correction.

The example imports light_sdk::sha::LightHasher, but based on the SDK file structure users should import light_sdk_macros::LightHasherSha directly.

 /// ```ignore
-/// use light_sdk::sha::LightHasher;
+/// use light_sdk_macros::LightHasherSha;
 ///
-/// #[derive(LightHasher)]
+/// #[derive(LightHasherSha)]
 /// pub struct GameState {
sdk-libs/macros/src/hasher/data_hasher.rs (2)

40-48: Clarify the truncation comment.

The comment mentions "field size truncation" but the code is zeroing the first byte of the hash result. This should be clarified to explain why this truncation is necessary for non-Poseidon hashers.

-                    // Apply field size truncation for non-Poseidon hashers
+                    // Zero the first byte for non-Poseidon hashers to ensure consistent hash output format
                     if H::ID != 0 {
                         result[0] = 0;
                     }

Also applies to: 69-82


87-118: SHA256 implementation looks good, same comment clarification needed.

The approach of serializing the entire struct for SHA256 hashing is appropriate. Consider updating the truncation comment here as well for consistency.

-                // Truncate field size for non-Poseidon hashers
+                // Zero the first byte for non-Poseidon hashers to ensure consistent hash output format
                 if H::ID != 0 {
                     result[0] = 0;
                 }
sdk-libs/macros/src/hasher/to_byte_array.rs (1)

7-61: Good refactoring to support multiple hashers.

The parameterization with hasher makes the implementation flexible for different hashing algorithms. Consider clarifying the truncation comment as suggested in other files.

-                    // Truncate field size for non-Poseidon hashers
+                    // Zero the first byte for non-Poseidon hashers to ensure consistent hash output format
                     if #hasher::ID != 0 {
                         result[0] = 0;
                     }
sdk-libs/sdk/src/account.rs (1)

88-91: Remove unused Size trait.

The Size trait is defined but not implemented or used anywhere in this file. Consider removing it if it's not needed, or moving it to where it's actually used.

-pub trait Size {
-    fn size(&self) -> usize;
-}
-
sdk-libs/macros/src/hasher/light_hasher.rs (1)

441-563: Comprehensive test coverage for SHA256 variant.

Excellent tests that clearly demonstrate the differences between SHA256 and Poseidon hashers. The tests effectively validate:

  • No field limit for SHA256
  • Different hashing approaches
  • Attribute support differences

Consider adding a test for the #[skip] attribute behavior with SHA256 to fully document attribute coverage.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 50778df and 8634ec1.

📒 Files selected for processing (12)
  • program-libs/hasher/src/keccak.rs (1 hunks)
  • program-libs/hasher/src/lib.rs (1 hunks)
  • program-libs/hasher/src/poseidon.rs (1 hunks)
  • program-libs/hasher/src/sha256.rs (1 hunks)
  • sdk-libs/macros/src/hasher/data_hasher.rs (2 hunks)
  • sdk-libs/macros/src/hasher/input_validator.rs (1 hunks)
  • sdk-libs/macros/src/hasher/light_hasher.rs (4 hunks)
  • sdk-libs/macros/src/hasher/mod.rs (1 hunks)
  • sdk-libs/macros/src/hasher/to_byte_array.rs (2 hunks)
  • sdk-libs/macros/src/lib.rs (2 hunks)
  • sdk-libs/sdk/src/account.rs (10 hunks)
  • sdk-libs/sdk/src/lib.rs (2 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (12)
  • GitHub Check: stateless-js-v2
  • GitHub Check: system-programs (sdk-test-program, ["cargo-test-sbf -p sdk-test"])
  • GitHub Check: system-programs (sdk-anchor-test-program, ["cargo-test-sbf -p sdk-anchor-test", "cargo-test-sbf -...
  • GitHub Check: cli-v1
  • GitHub Check: cli-v2
  • GitHub Check: stateless-js-v1
  • GitHub Check: lint
  • GitHub Check: Test batched-merkle-tree-simulate
  • GitHub Check: Test program-libs-slow
  • GitHub Check: Test sdk-libs
  • GitHub Check: Test program-libs-fast
  • GitHub Check: Test concurrent-merkle-tree
🔇 Additional comments (13)
program-libs/hasher/src/lib.rs (1)

27-27: LGTM - Clean trait extension for hasher identification.

The addition of the ID constant to the Hasher trait provides a clean way to identify different hasher implementations at compile time. This enables conditional logic in macro-generated code and supports the multi-hasher architecture.
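
For illustration, the kind of compile-time branch this enables in generated and generic code (a minimal sketch assuming the light_hasher names used elsewhere in this review; hash_truncated is a hypothetical helper, not part of the crate):

use light_hasher::{Hasher, HasherError};

// Hash `data` with any hasher; zero the first byte for non-Poseidon algorithms,
// mirroring the truncation pattern discussed in this review.
fn hash_truncated<H: Hasher>(data: &[u8]) -> Result<[u8; 32], HasherError> {
    let mut result = H::hash(data)?;
    if H::ID != 0 {
        result[0] = 0;
    }
    Ok(result)
}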

program-libs/hasher/src/keccak.rs (1)

12-13: LGTM - Correct implementation of required trait constant.

The ID = 2 assignment for Keccak properly implements the new Hasher trait requirement and maintains consistency with the numbering scheme (Poseidon=0, SHA256=1, Keccak=2).

program-libs/hasher/src/sha256.rs (1)

12-12: LGTM - Proper trait implementation.

The ID = 1 assignment for SHA256 correctly implements the new Hasher trait requirement and fits well within the established numbering scheme.

sdk-libs/macros/src/hasher/mod.rs (1)

7-7: LGTM - Proper export of new SHA256 derive macro.

The addition of derive_light_hasher_sha export enables the SHA256 variant of the light hasher derive macro to be used throughout the codebase, complementing the existing Poseidon-based derive_light_hasher.

sdk-libs/macros/src/hasher/input_validator.rs (1)

63-91: LGTM! Well-structured SHA256 validation function.

The relaxed constraints for SHA256 (no field count limit) appropriately reflect its ability to handle larger inputs compared to Poseidon. The error messages clearly indicate this is for the SHA256 hasher variant.
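
A hedged sketch of what such a validation pass can look like with syn 2 (not the PR's actual validate_input_sha; it only shows the two properties called out here: named-fields-only plus rejection of #[flatten], with no field-count limit):

use syn::{Fields, ItemStruct};

fn validate_input_sha_sketch(input: &ItemStruct) -> syn::Result<()> {
    let fields = match &input.fields {
        Fields::Named(named) => &named.named,
        _ => {
            return Err(syn::Error::new_spanned(
                input,
                "LightHasherSha only supports structs with named fields",
            ))
        }
    };
    // No field-count check: the SHA256 variant hashes a single Borsh-serialized blob.
    for field in fields {
        for attr in &field.attrs {
            if attr.path().is_ident("flatten") {
                return Err(syn::Error::new_spanned(
                    attr,
                    "#[flatten] is not supported by the SHA256 hasher variant",
                ));
            }
        }
    }
    Ok(())
}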

sdk-libs/macros/src/lib.rs (1)

3-3: Import addition looks good.

The new derive_light_hasher_sha import is properly added alongside the existing hasher derive function.

sdk-libs/sdk/src/lib.rs (1)

106-113: Clean module structure for SHA256 variants.

The new sha submodule provides a clear namespace separation for SHA256-based implementations while maintaining backward compatibility. The re-export pattern is well-structured.

sdk-libs/sdk/src/account.rs (4)

92-99: Well-designed type alias pattern for backward compatibility.

The type alias approach maintains backward compatibility while introducing generic hasher support. The separate sha module provides a clean namespace for SHA256 variants.
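
The pattern described here, reduced to a minimal sketch (illustrative only: the real LightAccount carries a lifetime, additional generics, and trait bounds that are omitted here):

use core::marker::PhantomData;
use light_hasher::Poseidon;

// Inner account type, generic over the hasher used for the data hash.
pub struct LightAccountInner<T, H> {
    account: T,
    _hasher: PhantomData<H>,
}

// Backward compatibility: the old name becomes an alias pinned to Poseidon.
pub type LightAccount<T> = LightAccountInner<T, Poseidon>;

// SHA256 variants live in a dedicated `sha` namespace.
pub mod sha {
    use light_hasher::Sha256;

    pub type LightAccount<T> = super::LightAccountInner<T, Sha256>;
}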


109-110: Clarify the purpose and security implications of should_remove_data.

The should_remove_data field controls whether account data is cleared in to_account_info(). The TODO comment on line 271 suggests security concerns need review. Please document:

  1. When and why this flag should be set
  2. Security implications of removing vs. keeping data
  3. Any validation requirements

119-221: Consistent and correct generic hasher implementation.

All constructor methods properly updated to support generic hashers with consistent initialization of new fields.


288-308: LGTM!

Deref implementations correctly updated for the generic struct.

sdk-libs/macros/src/hasher/light_hasher.rs (2)

51-53: Clean refactoring to support multiple hashers.

Good use of the DRY principle by extracting common logic into derive_light_hasher_with_hasher.


78-120: Well-structured generic hasher implementation.

The helper function cleanly separates hasher-specific logic while reusing common code generation.

Comment on lines +55 to +76
pub(crate) fn derive_light_hasher_sha(input: ItemStruct) -> Result<TokenStream> {
    // Use SHA256-specific validation (no field count limits)
    validate_input_sha(&input)?;

    let generics = input.generics.clone();

    let fields = match &input.fields {
        Fields::Named(fields) => fields.clone(),
        _ => unreachable!("Validation should have caught this"),
    };

    let field_count = fields.named.len();

    let to_byte_array_impl = generate_to_byte_array_impl_sha(&input.ident, &generics, field_count)?;
    let data_hasher_impl = generate_data_hasher_impl_sha(&input.ident, &generics)?;

    Ok(quote! {
        #to_byte_array_impl

        #data_hasher_impl
    })
}

💡 Verification agent

🧩 Analysis chain

Verify SHA256 implementation completeness.

The SHA256 implementation is simpler than Poseidon and doesn't process field attributes. While the tests show #[flatten] is explicitly rejected, what about other attributes like #[hash] and #[skip]?


🏁 Script executed:

#!/bin/bash
# Check if SHA256 validation handles other attributes
rg -A 10 "validate_input_sha" sdk-libs/macros/src/hasher/

Length of output: 2582


🏁 Script executed:

#!/bin/bash
# Display validate_input_sha implementation and surrounding context
sed -n '1,200p' sdk-libs/macros/src/hasher/input_validator.rs

Length of output: 3700


Implement or reject #[skip] and #[hash] in the SHA256 derive

The SHA256 path only validates against #[flatten] and ignores other field attributes, so fields annotated with #[skip] or #[hash] will be treated exactly like un-annotated fields. To avoid surprising behavior or silently including/excluding fields, please clarify and address attribute handling:

• In validate_input_sha (sdk-libs/macros/src/hasher/input_validator.rs), either
– Extend validation to reject unsupported attributes (FieldAttribute::Skip and FieldAttribute::Hash), or
– Allow them explicitly (and document that they’re no-ops).
• In derive_light_hasher_sha (sdk-libs/macros/src/hasher/light_hasher.rs) and its helpers (generate_to_byte_array_impl_sha & generate_data_hasher_impl_sha), update serialization logic if #[skip] should omit a field or #[hash] should customize hashing.
• Otherwise, update the doc-comments on the SHA256 derive to call out exactly which attributes are supported or ignored.

Please implement the intended behavior or surface it clearly in validation/tests.

🤖 Prompt for AI Agents
In sdk-libs/macros/src/hasher/light_hasher.rs around lines 55 to 76, the
derive_light_hasher_sha function and its helpers currently ignore field
attributes #[skip] and #[hash], causing unexpected behavior. To fix this, update
validate_input_sha in sdk-libs/macros/src/hasher/input_validator.rs to either
reject these unsupported attributes or explicitly allow them with clear
documentation. Then, modify derive_light_hasher_sha and its helper functions
generate_to_byte_array_impl_sha and generate_data_hasher_impl_sha to handle
these attributes properly—skip fields annotated with #[skip] during
serialization and apply custom hashing logic for #[hash] if intended. If no
special handling is desired, update the SHA256 derive doc-comments to clearly
state which attributes are supported or ignored to avoid confusion.

Comment on lines +63 to +90
/// SHA256-specific ToByteArray implementation that serializes the whole struct
pub(crate) fn generate_to_byte_array_impl_sha(
    struct_name: &syn::Ident,
    generics: &syn::Generics,
    field_count: usize,
) -> Result<TokenStream> {
    let (impl_gen, type_gen, where_clause) = generics.split_for_impl();

    Ok(quote! {
        impl #impl_gen ::light_hasher::to_byte_array::ToByteArray for #struct_name #type_gen #where_clause {
            const NUM_FIELDS: usize = #field_count;

            fn to_byte_array(&self) -> ::std::result::Result<[u8; 32], ::light_hasher::HasherError> {
                use borsh::BorshSerialize;
                use ::light_hasher::Hasher;

                // For SHA256, we can serialize the whole struct and hash it in one go
                let serialized = self.try_to_vec().map_err(|_| ::light_hasher::HasherError::BorshError)?;
                let mut result = ::light_hasher::Sha256::hash(&serialized)?;

                // Truncate field size for non-Poseidon hashers
                result[0] = 0;

                Ok(result)
            }
        }
    })
}

⚠️ Potential issue

Inconsistency in truncation logic.

This implementation always zeros the first byte (line 84) without checking the hasher ID, while other implementations only do this when ID != 0. This should be consistent across all implementations.

                 // For SHA256, we can serialize the whole struct and hash it in one go
                 let serialized = self.try_to_vec().map_err(|_| ::light_hasher::HasherError::BorshError)?;
                 let mut result = ::light_hasher::Sha256::hash(&serialized)?;

-                // Truncate field size for non-Poseidon hashers
-                result[0] = 0;
+                // Zero the first byte for non-Poseidon hashers to ensure consistent hash output format
+                if ::light_hasher::Sha256::ID != 0 {
+                    result[0] = 0;
+                }

                 Ok(result)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
/// SHA256-specific ToByteArray implementation that serializes the whole struct
pub(crate) fn generate_to_byte_array_impl_sha(
    struct_name: &syn::Ident,
    generics: &syn::Generics,
    field_count: usize,
) -> Result<TokenStream> {
    let (impl_gen, type_gen, where_clause) = generics.split_for_impl();
    Ok(quote! {
        impl #impl_gen ::light_hasher::to_byte_array::ToByteArray for #struct_name #type_gen #where_clause {
            const NUM_FIELDS: usize = #field_count;
            fn to_byte_array(&self) -> ::std::result::Result<[u8; 32], ::light_hasher::HasherError> {
                use borsh::BorshSerialize;
                use ::light_hasher::Hasher;
                // For SHA256, we can serialize the whole struct and hash it in one go
                let serialized = self.try_to_vec().map_err(|_| ::light_hasher::HasherError::BorshError)?;
                let mut result = ::light_hasher::Sha256::hash(&serialized)?;
                // Zero the first byte for non-Poseidon hashers to ensure consistent hash output format
                if ::light_hasher::Sha256::ID != 0 {
                    result[0] = 0;
                }
                Ok(result)
            }
        }
    })
}
🤖 Prompt for AI Agents
In sdk-libs/macros/src/hasher/to_byte_array.rs between lines 63 and 90, the code
always sets the first byte of the result to zero without checking the hasher ID,
unlike other implementations that conditionally truncate only when ID is not
zero. Modify the code to include a check for the hasher ID before zeroing the
first byte, ensuring the truncation logic matches the pattern used in other
ToByteArray implementations.

lint

remove unused _output_account_info

update lightdiscriminator macro
@coderabbitai (bot, Contributor) left a comment

Actionable comments posted: 0

♻️ Duplicate comments (2)
sdk-libs/macros/src/hasher/to_byte_array.rs (1)

83-84: Fix inconsistent truncation logic in SHA256 implementation.

The SHA256 implementation always zeros the first byte without checking the hasher ID, while other implementations only do this conditionally. This should be consistent across all implementations.

Apply this diff to fix the inconsistency:

-                // Truncate field size for non-Poseidon hashers
-                result[0] = 0;
+                // Zero the first byte for non-Poseidon hashers to ensure consistent hash output format
+                if ::light_hasher::Sha256::ID != 0 {
+                    result[0] = 0;
+                }
sdk-libs/macros/src/hasher/light_hasher.rs (1)

55-76: Address incomplete field attribute handling in SHA256 implementation.

The SHA256 derive path only validates against #[flatten] and ignores other field attributes (#[skip] and #[hash]), causing them to be silently ignored rather than properly handled or explicitly rejected.

The current implementation will treat fields with #[skip] or #[hash] attributes exactly like regular fields, which may surprise users. Please either:

  1. Extend validation to explicitly reject unsupported attributes:
// In validate_input_sha, add checks for other attributes
if matches!(attr, FieldAttribute::Skip | FieldAttribute::Hash) {
    return Err(syn::Error::new_spanned(field, "Attribute not supported in SHA256 variant"));
}
  2. Implement proper handling in the serialization logic to respect these attributes

  3. Document clearly which attributes are supported or ignored in the SHA256 variant

This ensures consistent behavior and prevents silent failures.

🧹 Nitpick comments (1)
sdk-libs/macros/src/discriminator.rs (1)

23-25: Remove unused variable and clarify discriminator calculation.

The _variant_marker variable is assigned but never used, creating dead code. Additionally, both discriminator variants use SHA256 hash calculation regardless of the is_sha parameter, which may be confusing.

Apply this diff to remove the unused variable:

-    // For SHA256 variant, we could add specific logic here if needed
-    // Currently both variants work the same way since discriminator is just based on struct name
-    let _variant_marker = if is_sha { "sha256" } else { "poseidon" };

Consider adding a comment to clarify that discriminators always use SHA256 hash of the struct name regardless of the hasher variant:

+    // Discriminators always use SHA256 hash of struct name for consistency
+    // regardless of the hasher variant used for the actual data hashing
     let mut discriminator = [0u8; 8];
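
For reference, the general shape of that derivation, as a hedged sketch using the sha2 crate; the exact preimage the LightDiscriminator macro hashes (for example, whether the struct name is prefixed) is not shown in this excerpt, so treat the input string as an assumption:

use sha2::{Digest, Sha256};

// First 8 bytes of a SHA256 digest of the struct name; illustrative only.
fn discriminator_for(struct_name: &str) -> [u8; 8] {
    let digest = Sha256::digest(struct_name.as_bytes());
    let mut discriminator = [0u8; 8];
    discriminator.copy_from_slice(&digest[..8]);
    discriminator
}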
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8634ec1 and 7f6196f.

📒 Files selected for processing (13)
  • program-libs/hasher/src/keccak.rs (1 hunks)
  • program-libs/hasher/src/lib.rs (1 hunks)
  • program-libs/hasher/src/poseidon.rs (1 hunks)
  • program-libs/hasher/src/sha256.rs (1 hunks)
  • sdk-libs/macros/src/discriminator.rs (3 hunks)
  • sdk-libs/macros/src/hasher/data_hasher.rs (2 hunks)
  • sdk-libs/macros/src/hasher/input_validator.rs (1 hunks)
  • sdk-libs/macros/src/hasher/light_hasher.rs (5 hunks)
  • sdk-libs/macros/src/hasher/mod.rs (1 hunks)
  • sdk-libs/macros/src/hasher/to_byte_array.rs (2 hunks)
  • sdk-libs/macros/src/lib.rs (3 hunks)
  • sdk-libs/sdk/src/account.rs (10 hunks)
  • sdk-libs/sdk/src/lib.rs (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (10)
  • program-libs/hasher/src/keccak.rs
  • program-libs/hasher/src/poseidon.rs
  • program-libs/hasher/src/sha256.rs
  • program-libs/hasher/src/lib.rs
  • sdk-libs/macros/src/hasher/input_validator.rs
  • sdk-libs/macros/src/hasher/mod.rs
  • sdk-libs/macros/src/lib.rs
  • sdk-libs/macros/src/hasher/data_hasher.rs
  • sdk-libs/sdk/src/lib.rs
  • sdk-libs/sdk/src/account.rs
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (12)
  • GitHub Check: lint
  • GitHub Check: Test concurrent-merkle-tree
  • GitHub Check: Test program-libs-slow
  • GitHub Check: Test sdk-libs
  • GitHub Check: Test program-libs-fast
  • GitHub Check: Test batched-merkle-tree-simulate
  • GitHub Check: stateless-js-v1
  • GitHub Check: cli-v2
  • GitHub Check: system-programs (sdk-test-program, ["cargo-test-sbf -p sdk-test"])
  • GitHub Check: system-programs (sdk-anchor-test-program, ["cargo-test-sbf -p sdk-anchor-test", "cargo-test-sbf -...
  • GitHub Check: stateless-js-v2
  • GitHub Check: cli-v1
🔇 Additional comments (7)
sdk-libs/macros/src/discriminator.rs (3)

6-8: LGTM! Clean refactoring to support multiple hasher variants.

The delegation to a shared helper maintains backward compatibility while enabling code reuse for the new SHA256 discriminator variant.


10-12: LGTM! Consistent API design for SHA256 discriminator.

The new function follows the same pattern as the original discriminator, providing a clean API for SHA256-based discriminators.


63-103: Excellent test coverage for SHA256 discriminator functionality.

The new tests comprehensively verify:

  • Basic SHA256 discriminator functionality with expected output matching
  • Ability to handle large structs that would challenge regular discriminators
  • Proper code generation and trait implementation

The large struct test is particularly valuable as it demonstrates the key advantage of SHA256 discriminators for complex data structures.

sdk-libs/macros/src/hasher/to_byte_array.rs (1)

7-61: LGTM! Well-designed generalization to support multiple hashers.

The refactored function successfully:

  • Generalizes ToByteArray implementation for arbitrary hashers via the hasher parameter
  • Simplifies single-field case with direct code parsing
  • Implements correct conditional truncation logic that checks hasher ID before zeroing the first byte
  • Maintains backward compatibility while enabling new functionality

The truncation logic correctly follows the pattern: only truncate for non-Poseidon hashers (ID != 0).

sdk-libs/macros/src/hasher/light_hasher.rs (3)

51-53: LGTM! Clean refactoring maintains backward compatibility.

The delegation to a shared helper with the Poseidon hasher token preserves existing functionality while enabling code reuse for the new SHA256 variant.


78-120: LGTM! Comprehensive shared implementation with full attribute support.

The shared helper properly:

  • Validates input structure
  • Processes field attributes (#[hash], #[skip], #[flatten])
  • Generates implementations using the provided hasher parameter
  • Maintains all existing functionality while enabling hasher flexibility

This demonstrates the full-featured approach that the SHA256 variant should also follow for attribute handling.


441-712: Excellent comprehensive test coverage for SHA256 variant.

The new test suite effectively demonstrates and validates:

  • Large struct handling: SHA256's ability to process structs with >12 fields and multiple Pubkeys
  • Different hashing approaches: Serialization-based (SHA256) vs field-by-field (Poseidon)
  • Attribute validation: Proper rejection of unsupported #[flatten] attribute
  • No field limits: SHA256 handling 20+ fields without constraints
  • Real-world scenarios: Complex game state struct that would be impossible with Poseidon
  • Integration testing: Coordination between hasher and discriminator variants

These tests serve as excellent documentation of when and why to use each variant, and provide confidence in the implementation correctness.

Comment on lines -240 to +282
-        output.data_hash = self.account.hash::<Poseidon>()?;
-        output.data = self
-            .account
-            .try_to_vec()
-            .map_err(|_| LightSdkError::Borsh)?;
+        if self.should_remove_data {
+            // TODO: review security.
+            output.data_hash = DEFAULT_DATA_HASH;
+        } else {
+            output.data_hash = self.account.hash::<H>()?;
+            if H::ID != 0 {
+                output.data_hash[0] = 0;
+            }
+            output.data = self
+                .account
+                .try_to_vec()
+                .map_err(|_| LightSdkError::Borsh)?;
+        }
Contributor comment:

If we already truncate in the hasher, we should revert this.

     Ok(quote! {
         impl #impl_gen ::light_hasher::to_byte_array::ToByteArray for #struct_name #type_gen #where_clause {
-            const NUM_FIELDS: usize = #field_count;
+            const NUM_FIELDS: usize = 1;
Contributor comment:

This looks wrong.

@ananas-block (Contributor) commented:

What are the expected changes of this PR? (It looks like there are duplicate changes in multiple places.)
