Skip to content

Conversation

redmannequin
Copy link

@redmannequin redmannequin commented Apr 11, 2025

Problem

The predefined solana programs(system, token, etc.) only expose invoke and invoke_signed calls. It would be nice to have unsafe calls as well.

Solution

  • Adds Invoke trait that contains default implementations for 'invoke' and 'invoke_signed' as well as for the unsafe counter parts.
  • Invoke is auto implemented for any type that can convert into InvokeParts which contains all the necessary information to run an invoke call.
  • Implements From<T> for InvokeParts for all instructions

NOTES

@redmannequin redmannequin force-pushed the add-unchecked-calls-to-system-program branch from 96fb12e to 60d4104 Compare April 11, 2025 23:54
@redmannequin redmannequin force-pushed the add-unchecked-calls-to-system-program branch from 60d4104 to 755df53 Compare April 11, 2025 23:56
/// callee, then Rust's aliasing rules will be violated and cause undefined
/// behavior.
pub unsafe fn invoke_signed_access_unchecked<const ACCOUNTS: usize>(
instruction: &Instruction,
Copy link
Author

@redmannequin redmannequin Apr 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It could be worth creating a FixedInstruction variant that takes in const generic for the account metas length. constraining the number of accounts to account metas at type level/compile time.

@redmannequin redmannequin marked this pull request as ready for review April 15, 2025 08:24
@redmannequin redmannequin changed the title WIP: add unchecked calls to system program Adds unchecked calls to system program Apr 15, 2025
///
/// `InvokeParts` is a helper structure that encapsulates all parts of a Solana instruction call,
/// including the program ID, account references, account metadata, and the instruction data payload.
pub struct InvokeParts<'a, const ACCOUNTS: usize, Data> {
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this should live in the SDK -- perhaps a pinocchio-client-helpers crate?

But for now let's just copy the trait + types into the other clients and see if it generalizes well?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we'll need at least one more InstructionData for slices

Comment on lines +136 to +139
impl<'a, const ACCOUNTS: usize, T, Data> Invoke<'a, ACCOUNTS, Data> for T
where
T: Into<InvokeParts<'a, ACCOUNTS, Data>>,
Data: InstructionData,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These kinds of blanket implementations are considered bad for extensibility, are they not?
And the fact that it is defined for Into is even more suspicious.
Are you sure this would not cause any issues?

Similarly, the inheritance from Into is somewhat unexpected:

Invoke<'a, const ACCOUNTS: usize, Data>: Into<InvokeParts<'a, ACCOUNTS, Data>>

I am not sure that I fully understand the proposed change end to end, but looking at the usage site, it does not seem to be doing anything more than a simple trait implementation would not do:

impl<'a> From<AdvanceNonceAccount<'a>>
    for InvokeParts<'a, N_ACCOUNTS, FullInstructionData<DATA_LEN>>
{
    fn from(value: AdvanceNonceAccount<'a>) -> Self {

Whey is the above better than

trait<'accounts> Invoke<'accounts, const N_ACCOUNTS: usize> {
    type InstructionData: InstructionData;

    fn accounts(&self) -> Self::InstructionData;
    /* ... */
}

impl<'accounts> Invoke<'accounts, N_ACCOUNTS> for AdvanceNonceAccount<'a>
{
    type InstructionData: FullInstructionData<DATA_LEN>;

    fn accounts(&self) -> Self::InstructionData {

There might be some minor changes required.
But in the end, it seems that an instance of FullInstructionData<DATA_LEN> is being constructed.
And the From/Into conversion is not doing anything, except hiding some logic.

Or am I missing something?

Copy link
Author

@redmannequin redmannequin Jul 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So initially I thought of having the invoke functions implemented under the InvokeParts struct but then I thought it would be nice to be able to invoke from the instruction struct directly:

// probably worth creating a `IntoInvokeParts` trait instead of using `Into/From`
Transfer<'a> {....}.into_parts().invoke()
// vs
Transfer<'a> {....}.invoke()

You raise a valid point about the blanket impl I was imagining the trait more of a ext trait, to prevent miss use we could seal the trait.

The Invoke trait can be removed entirely by implementing the invoke functions under InvokeParts. Meaning anything that can convert into InvokeParts can be invoked. The trait was really only to make the invoke calls a bit more ergonomic.

Now why use InvokeParts? Because it enforces a common structure that generalizes well creating a clear separation of concern between the instruction definition and invoking the instruction. Also It can help preven monomorphization bloating(most likely it wont though lol).

Similarly, the inheritance from Into is somewhat unexpected:

Invoke<'a, const ACCOUNTS: usize, Data>: Into<InvokeParts<'a, ACCOUNTS, Data>>

This isn't inheritance but a bound on the trait, it's saying that in order to implement Invoke you must have implemented Into<InvokeParts> . The bound isn't really necessary I just added it to lock it down a bit but it can be removed. Usually you want to keep traits flexible.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So initially I thought of having the invoke functions implemented under the InvokeParts struct but then I thought it would be nice to be able to invoke from the instruction struct directly:

// probably worth creating a `IntoInvokeParts` trait instead of using `Into/From`
Transfer<'a> {....}.into_parts().invoke()
// vs
Transfer<'a> {....}.invoke()

If you implement a trait, you can call this trait methods directly.

You raise a valid point about the blanket impl I was imagining the trait more of a ext trait, to prevent miss use we could seal the trait.

The Invoke trait can be removed entirely by implementing the invoke functions under InvokeParts. Meaning anything that can convert into InvokeParts can be invoked. The trait was really only to make the invoke calls a bit more ergonomic.

I think the opposite is better.
A trait should implement the behavior, not a struct.
So the invoke function should be part of the Invoke trait.
This would remove the requirement of constructing the intermediate struct.

Now why use InvokeParts? Because it enforces a common structure that generalizes well creating a clear separation of concern between the instruction definition and invoking the instruction. Also It can help preven monomorphization bloating(most likely it wont though lol).

If there is a duplication, it can be extracted into a shared function that does not depend on the parameters that produce the unnecessary duplication.

Also, Invoke does not need to be parameterized by ACCOUNTS and Data.
Those should probably be associated types, as they are defined by a specific instance the Invoke is implemented for.
There is no reason to implement Invoke<1, FullInstructionData<12>> as well as Invoke<2, FullInstructionData<12>> for the same Transfer, is it?

Similarly, the inheritance from Into is somewhat unexpected:

Invoke<'a, const ACCOUNTS: usize, Data>: Into<InvokeParts<'a, ACCOUNTS, Data>>

This isn't inheritance but a bound on the trait, it's saying that in order to implement Invoke you must have implemented Into<InvokeParts>. The bound isn't really necessary I just added it to lock it down a bit but it can be removed. Usually you want to keep traits flexible.

If it is not necessary, there is no need to include it, no?

Copy link
Author

@redmannequin redmannequin Jul 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's take a look at the transfer instruction and the current invoke call:

pub struct Transfer<'a> {
    pub from: &'a AccountInfo,
    pub to: &'a AccountInfo,
    pub lamports: u64,
}

impl Transfer<'_> {
    pub fn invoke_signed(&self, signers: &[Signer]) -> ProgramResult {
    
        // NOTE: the number of account is always static and doesn't change. 
        let account_metas: [AccountMeta; 2] = [
            AccountMeta::writable_signer(self.from.key()),
            AccountMeta::writable(self.to.key()),
        ];
        
        // NOTE: the data is constructed as a byte array but due to `no_std` 
        // we cant use box/vec so we statically size the array 
        let mut instruction_data = [0; 12];
        instruction_data[0] = 2;
        instruction_data[4..12].copy_from_slice(&self.lamports.to_le_bytes());

        // NOTE: the sdk already provides a partial `InvokeParts` but the data is borrowed 
        let instruction = Instruction {
            program_id: &crate::ID,
            accounts: &account_metas,
            data: &instruction_data,
        };

        invoke_signed(&instruction, &[self.from, self.to], signers)
    }
}

Now let's construct the Invoke trait without InvokeParts providing part constructors:

pub trait Invoke {
    const ACCOUNTS: usize;
    type Data:: InstructionData;
    
    // NOTE: here you're just defining the `InvokeParts` struct but using functions
    // NOTE: this isn't defining behavior but pseudo getters
    fn program_id(&self) -> Pubkey;
    fn accounts<'a>(&'a self) -> [&'a AccountInfo; ACCOUNTS];
    fn account_metas(&self) -> [AccountMeta<'a>; Self::ACCOUNTS];
    fn instruction_data(&self) -> Self::Data;
    
    // NOTE: here you are calling function that take `&self` and in particular `x.accounts()` which
    // returns referenced data meaning that the likelihood of rust being able optimize out copies of data 
    // is reduced potentially increasing CU count. By using the intermediate `InvokeParts` you move data
    // which allows rust to optimize out copies and better structure the data layout  
    fn invoke_signed(self, signers: &[Signer]) -> pinocchio::ProgramResult {
        let account_metas = self.account_metas();
        let instruction_data = self.data();
        let accounts = self.accounts();
        
        let instruction = Instruction {
            program_id: &self.program_id(),
            accounts: &account_metas,
            data: instruction_data.as_slice(),
        };
        
        invoke_signed(&instruction, &accounts, signers)
    }
}

Ultimately this comes down to preference. From my experience with rust providing function based views via traits tend to cause less flexibility and its usually better to create an intermediate struct, its also a common pattern used by many crates.

If the use of generics is the issue. As stated before we could add a IntoInvokeParts trait, this can enables associated types removing some of the generics making the syntax overload a bit more manageable.

// NOTE: this ensures one impl per type 
pub trait IntoInvokeParts<'a> {
    const ACCOUNTS: usize;
    type Data:: InstructionData;
    
    fn into_parts(self) -> InvokeParts<'a, Self::ACCOUNTS,  Self::Data>;
}

#[sealed]
pub trait Invoke { .... }

impl Invoke for T where T: IntoInvokeParts { .... }

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[...]

Ultimately this comes down to preference. From my experience with rust providing function based views via traits tend to cause less flexibility and its usually better to create an intermediate struct, its also a common pattern used by many crates.

Why is it less flexible?
I think an intermediate struct only adds complexity.
And while the optimizers are pretty good, it may create more code.

I think it is better to use a callback mechanism, if we want to get better optimization.
If you return a Pubkey or an array like [&'a AccountInfo; ACCOUNTS] the compiler may create unnecessary copies.

You are already using a callback in the invoke_invoker().
So why not just avoid all the intermediate transformations?
I might be missing their purpose, but if they bring no additional value, why not remove them altogether?

pub trait CanInvoke {
    const ACCOUNTS_LEN: usize;

    fn invoke_via<F>(&self, invoke: F) -> ProgramResult
    where 
        invoke: for<'invoke> FnOnce(
            /* program_id: */ &'invoke Pubkey,
            /* accounts: */ &'invoke [&'invoke AccountInfo; Self::ACCOUNTS_LEN],
            /* account_metas: */ &'invoke[AccountMeta<'invoke>; Self::ACCOUNTS_LEN],
            /* data: */ &'invoke [u8],
        )
    ;

}

pub struct Transfer<'a> {
    pub from: &'a AccountInfo,
    pub to: &'a AccountInfo,
    pub lamports: u64,
}

impl CanInvoke for Transfer<'_> {
    const ACCOUNTS_LEN: usize = 2;

    pub fn invoke_via<F>(&self, invoke: F) -> ProgramResult
    where
        invoke: for<'invoke> FnOnce(
            /* program_id: */ &'invoke Pubkey,
            /* accounts: */ &'invoke [&'invoke AccountInfo; Self::ACCOUNTS_LEN],
            /* metas: */ &'invoke[AccountMeta<'invoke>; Self::ACCOUNTS_LEN],
            /* data: */ &'invoke [u8],
        )
    {
        let metas: [AccountMeta; 2] = [
            AccountMeta::writable_signer(self.from.key()),
            AccountMeta::writable(self.to.key()),
        ];
        
        let mut data = [0; 12];
        data[0] = 2;
        data[4..12].copy_from_slice(&self.lamports.to_le_bytes());

        invoke(&crate::ID, &[self.from, self.to], &metas, &data)
    }
}

I do not see why the data needs to be generalized either.
It will always be a slice in the current implementation.
So why make it more complex?

The InstructionData::len() documentation says that it is an optimization that allows one to get the length without getting the slice.
But is this optimization really used?

Copy link
Author

@redmannequin redmannequin Jul 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The CanInvoke trait is a bit too low-level for SDK users from my perspective, as it requires them to manage/understand the mechanics of invocation explicitly. Instead, using IntoInvokeParts for defining how a type maps to invocation components and the SDK managing the invocation via the sealed Invoke trait, gives a better use experience.

From my point of view InvokeParts and the Invoke trait should be public and should abstract away complexity from users of the SDK.

The InstructionData::len() documentation says that it is an optimization that allows one to get the length without > getting the slice.
But is this optimization really used?

This also comes from a point of view of the type being public and part of the SDK or for a pinoccchio-client-helper crate.

Why is it less flexible?
I think an intermediate struct only adds complexity.
And while the optimizers are pretty good, it may create more code.

You don't need a an explicit type to invoke, you can just construct a InvokeParts and call invoke on it. When you only have the trait you need to explicitly define a type to call the invoke method. Also InvokeParts can potentially be re-used for a grpc client via a InvokeGrpc trait.

Though you are correct there is more code but in the end the rust compiler is better able to identify intermediate data structures(Zero-Cost Abstractions) and optimize them away creating smaller binary sizes.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The CanInvoke trait is a bit too low-level for SDK users from my perspective, as it requires them to manage/understand the mechanics of invocation explicitly.

It provides exactly the same values as what you put in InvokeParts.
So, I do not think there is anything more to understand.
The only difference is that you need to be able to understand how a callback works.

Instead, using IntoInvokeParts for defining how a type maps to invocation components and the SDK managing the invocation via the sealed Invoke trait, gives a better use experience.

I can see why you might be seeing it this way.
But I also think it is a subjective opinion to a considerable degree.
And there is likely a difference in the CU consumption between these two approaches.
Though, I would also argue that an API with fewer abstractions is easier to understand for humans as well.

[...]

The InstructionData::len() documentation says that it is an optimization that allows one to get the length without > getting the slice.
But is this optimization really used?

This also comes from a point of view of the type being public and part of the SDK or for a pinoccchio-client-helper crate.

I do not understand this argument.
If this optimization is not useful, there is no benefit for adding an extra abstraction.
It is making the API more complex, which is bad both for users and the compiler.

Why is it less flexible?
I think an intermediate struct only adds complexity.
And while the optimizers are pretty good, it may create more code.

You don't need a an explicit type to invoke, you can just construct a InvokeParts and call invoke on it. When you only have the trait you need to explicitly define a type to call the invoke method.

This does not make it less flexible.
The difference is how you express the conversion.
Either as a trait implementation or as a construction of the InvokeParts instance.
In terms of flexibility, it is the same.

Also InvokeParts can potentially be re-used for a grpc client via a InvokeGrpc trait.

I do not think it would be a good idea to connect these things.
They are only loosely related, and so it is better to keep them described separately in code.

Though you are correct there is more code but in the end the rust compiler is better able to identify intermediate data structures (Zero-Cost Abstractions) and optimize them away creating smaller binary sizes.

This is, unfortunately, not the case, at least for the SBF target.
You can see a discussion here, when we found that even return of a Pubkey rather than a &Pubkey causes unnecessary copies.

I've talked to Lucas about this a bit more outside the PR discussion.
I think it is a bug, or at least a shortcoming, in the LLVM analysis.
LLVM seems to be loosing some information that would allow it to remove the unnecessary copy.
But the fact is: on SBF returning unnecessary objects is not necessary free.
On x86, on a modern superscalar CPU, small copies are essentially free, even if the compiler can not optimize them away.

But Pinocchio is designed as an SDK that is trying to avoid any overhead.
And in the PR that I quoted, the decision was to keep an API that can cause undefined behavior, rather than adding a single Pubkey copy.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree this is mostly subjective, I'm happy to open a new PR exploring the simplified CanInkove trait.

I don't know how I feel about writing code that is tied to shortcoming of the LLVM that may be updated in the future but I do understand that programs may never be updated so it's best to have the best impl available at the moment.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done on #212

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants