nvme: Add Function Level Reset (FLR) support to NVMe emulator #1720

Draft
Copilot wants to merge 3 commits into main

Conversation

Contributor

Copilot AI commented Jul 18, 2025

This PR adds Function Level Reset (FLR) support to the NVMe emulator so that OpenHCL can set the device's reset_method file in sysfs when the device is assigned through VFIO.

Problem

When testing with the NVMe emulator, VFIO didn't allow OpenHCL to modify the reset_method file because the emulator didn't advertise FLR support. This prevented proper device reset functionality in virtualized environments.

Solution

Added a minimal PCI Express capability implementation that:

  1. Advertises FLR support - Implements Device Capabilities register with the Function Level Reset Capability bit (bit 28) set
  2. Handles FLR requests - Processes writes to Device Control register FLR trigger bit (bit 15)
  3. Performs device reset - Resets NVMe controller state when FLR is initiated
  4. Maintains compatibility - FLR support is optional and disabled by default
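The two register bits involved can be sketched as follows. This is a minimal illustration, not the code from this PR: the helper names `device_capabilities` and `write_device_control` are hypothetical. The bit positions follow the PCI Express Base Specification layout (Function Level Reset Capability is bit 28 of Device Capabilities, matching Linux's `PCI_EXP_DEVCAP_FLR` value `0x10000000`; Initiate Function Level Reset is bit 15 of Device Control).

```rust
/// Function Level Reset Capability: bit 28 of the Device Capabilities register.
const DEVCAP_FLR: u32 = 1 << 28;
/// Initiate Function Level Reset: bit 15 of the Device Control register.
const DEVCTL_INITIATE_FLR: u16 = 1 << 15;

/// Hypothetical helper: builds the Device Capabilities value.
/// All other capability fields are left at zero in this sketch.
fn device_capabilities(flr_supported: bool) -> u32 {
    if flr_supported {
        DEVCAP_FLR
    } else {
        0
    }
}

/// Hypothetical helper: applies a guest write to Device Control.
/// Returns the value to store and whether an FLR was requested.
/// The trigger bit is never stored, so it always reads back as zero.
fn write_device_control(flr_supported: bool, val: u16) -> (u16, bool) {
    let flr_requested = flr_supported && (val & DEVCTL_INITIATE_FLR) != 0;
    (val & !DEVCTL_INITIATE_FLR, flr_requested)
}
```

Masking the trigger bit out of the stored value is what gives the self-clearing behavior: the guest sees the bit as zero on the next read regardless of when the reset completes.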

Key Components

  • PciExpressCapability: New capability struct implementing PCI Express Device Capabilities, Device Control, and Device Status registers
  • FlrHandler trait: Interface for handling FLR events with atomic signaling mechanism
  • Configuration options: Added flr_support flag to NvmeControllerCaps and NvmeControllerHandle
  • Reset integration: FLR triggers existing controller reset logic safely
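One way the handler trait and atomic signaling could fit together is sketched below. The trait shape, the method name `initiate_flr`, and the `FlrFlag` type are assumptions made for illustration; the actual signatures in the PR may differ.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// Hypothetical shape of the handler trait; the real signature may differ.
trait FlrHandler: Send + Sync {
    /// Called by the PCI Express capability when the guest sets the FLR trigger bit.
    fn initiate_flr(&self);
}

/// Hypothetical handler that records an FLR request in an atomic flag,
/// so the controller can pick it up and reset outside the config-space write path.
#[derive(Default)]
struct FlrFlag {
    requested: AtomicBool,
}

impl FlrHandler for FlrFlag {
    fn initiate_flr(&self) {
        self.requested.store(true, Ordering::SeqCst);
    }
}

impl FlrFlag {
    /// Returns true once per request, clearing the flag as it is consumed.
    fn take_request(&self) -> bool {
        self.requested.swap(false, Ordering::SeqCst)
    }
}
```

In this arrangement the capability holds an `Arc<dyn FlrHandler>` and the controller holds the same `FlrFlag`, calling `take_request()` at a point where running its existing reset logic is safe.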

Testing

Added comprehensive unit tests that verify:

  • FLR capability is properly advertised when enabled
  • No PCI Express capability appears when FLR is disabled
  • FLR trigger mechanism works with correct self-clearing behavior
  • All existing functionality remains unaffected

Usage

let caps = NvmeControllerCaps {
    msix_count: 64,
    max_io_queues: 64,
    subsystem_id: Guid::new_random(),
    flr_support: true,  // Enable FLR support
};

Fixes #1719.



- Add PCI Express capability ID to pci_core spec
- Implement PciExpressCapability with FLR support in pci_core
- Add FlrHandler trait for handling FLR events
- Add flr_support flag to NvmeControllerCaps and NvmeControllerHandle
- Wire FLR capability to NvmeController when enabled
- Implement FLR reset logic using atomic flag mechanism
- Add comprehensive unit tests for FLR functionality
- Update nvme_resources to include FLR configuration option

Co-authored-by: mattkur <13772048+mattkur@users.noreply.github.com>
Copilot AI changed the title from "[WIP] nvme: nvme emulator does not advertise FLR support" to "nvme: Add Function Level Reset (FLR) support to NVMe emulator" Jul 18, 2025
Copilot AI requested a review from mattkur July 18, 2025 05:05
Copilot finished work on behalf of mattkur July 18, 2025 05:05
pub fn new(flr_supported: bool, flr_handler: Option<Arc<dyn FlrHandler>>) -> Self {
    let device_capabilities = pci_express::DeviceCapabilities::new()
        .with_function_level_reset(flr_supported)
        .with_max_payload_size(0) // 128 bytes
Contributor

How did you get 128 bytes here?

Contributor Author

The value 0 for max_payload_size follows the PCIe specification where 0 encodes 128 bytes. Updated the comment to clarify this mapping.
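The encoding can be made explicit with a small decoder. This is a sketch for illustration: the function name `max_payload_bytes` is hypothetical, and the mapping follows the PCI Express Base Specification's Max_Payload_Size encoding, where each step doubles the size starting from 128 bytes.

```rust
/// Decodes a 3-bit Max_Payload_Size field into a size in bytes:
/// 0 -> 128, 1 -> 256, 2 -> 512, 3 -> 1024, 4 -> 2048, 5 -> 4096.
/// Encodings 6 and 7 are reserved, so they decode to `None`.
fn max_payload_bytes(encoded: u8) -> Option<u32> {
    match encoded {
        0..=5 => Some(128u32 << encoded),
        _ => None,
    }
}
```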

        device_control | (device_status << 16)
    }
    _ => {
        tracelimit::warn_ratelimited!(offset, "unhandled pci express capability read");
Contributor

Please include the label in all tracing from this cap. E.g. warn_ratelimited!(?label, offset...

Contributor Author

Added the capability label to all tracing calls in both read_u32 and write_u32 methods for better debugging context.

/// * `flr_handler` - Optional handler to be called when FLR is initiated
pub fn new(flr_supported: bool, flr_handler: Option<Arc<dyn FlrHandler>>) -> Self {
    let device_capabilities = pci_express::DeviceCapabilities::new()
        .with_function_level_reset(flr_supported)
Contributor

This is verbose, try to use FromZeroes or default (and then just override the one bit you need)

Contributor Author

Simplified the DeviceCapabilities initialization to use the default state from new() and only set the function_level_reset bit as needed, eliminating the verbose builder pattern.
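The "start from the default and override one bit" pattern might look like the sketch below. The `DeviceCapabilities` type here is a hand-written stand-in, not the bitfield type from the PR, and `build_caps` is a hypothetical helper; only the `with_function_level_reset` name is taken from the diff.

```rust
/// Hypothetical stand-in for the bitfield type in the PR: a newtype over the
/// raw 32-bit register whose `Default` is all zeroes.
#[derive(Clone, Copy, Debug, Default, PartialEq, Eq)]
struct DeviceCapabilities(u32);

impl DeviceCapabilities {
    /// Function Level Reset Capability: bit 28.
    const FLR_BIT: u32 = 1 << 28;

    /// Returns a copy with the FLR capability bit set or cleared.
    fn with_function_level_reset(self, enabled: bool) -> Self {
        if enabled {
            Self(self.0 | Self::FLR_BIT)
        } else {
            Self(self.0 & !Self::FLR_BIT)
        }
    }

    /// Reports whether the FLR capability bit is set.
    fn function_level_reset(self) -> bool {
        (self.0 & Self::FLR_BIT) != 0
    }
}

/// Starts from the zeroed default and overrides only the bit that matters.
/// A zeroed Max_Payload_Size field already encodes 128 bytes, so it needs no setter call.
fn build_caps(flr_supported: bool) -> DeviceCapabilities {
    DeviceCapabilities::default().with_function_level_reset(flr_supported)
}
```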

}
}

fn write_u32(&mut self, offset: u16, val: u32) {
Contributor

Please add unit tests for read_u32 and write_u32, and put the tests in this file.

Contributor Author

Added comprehensive unit tests for read_u32 and write_u32 methods covering all register offsets, FLR functionality, error handling, and edge cases including readonly register protection and unhandled offsets.
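The behaviors those tests describe can be illustrated with a reduced model of the capability. Everything here is a sketch under stated assumptions: `CapSketch` and its fields are hypothetical, returning all ones for an unhandled read is an assumption rather than the PR's confirmed behavior, and Device Status writes are ignored for brevity.

```rust
/// Register offsets within the capability (Device Capabilities at +0x4,
/// Device Control / Device Status at +0x8).
const DEVICE_CAPS: u16 = 0x4;
const DEVICE_CTL_STS: u16 = 0x8;
/// FLR capability bit in Device Capabilities and trigger bit in Device Control.
const DEVCAP_FLR: u32 = 1 << 28;
const INITIATE_FLR: u32 = 1 << 15;

/// Hypothetical reduced model of the capability's register state.
#[derive(Default)]
struct CapSketch {
    device_capabilities: u32,
    device_control: u16,
    device_status: u16,
    /// Counts FLR requests so a test can observe them.
    flr_count: u32,
}

impl CapSketch {
    fn read_u32(&self, offset: u16) -> u32 {
        match offset {
            DEVICE_CAPS => self.device_capabilities,
            // Control occupies the low half, Status the high half.
            DEVICE_CTL_STS => {
                u32::from(self.device_control) | (u32::from(self.device_status) << 16)
            }
            // Assumption: unhandled offsets read as all ones.
            _ => !0,
        }
    }

    fn write_u32(&mut self, offset: u16, val: u32) {
        match offset {
            // Device Capabilities is read-only, so the write is dropped.
            DEVICE_CAPS => {}
            DEVICE_CTL_STS => {
                let flr_supported = (self.device_capabilities & DEVCAP_FLR) != 0;
                if flr_supported && (val & INITIATE_FLR) != 0 {
                    self.flr_count += 1;
                }
                // The trigger bit is never stored, so it reads back as zero.
                self.device_control = (val & 0xFFFF & !INITIATE_FLR) as u16;
            }
            // Unhandled offsets are ignored.
            _ => {}
        }
    }
}
```

A unit test against this model would write to the read-only register and confirm the value is unchanged, write the trigger bit and confirm it reads back clear while the request was observed, and read an unhandled offset.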

Comment on lines 389 to 399
/// Offsets into the PCI Express Capability Header
///
/// Table pulled from PCI Express Base Specification Rev. 3.0
///
/// | Offset | Bits 31-24 | Bits 23-16 | Bits 15-8 | Bits 7-0 |
/// |-----------|------------------|------------------|------------------|------------------|
/// | Cap + 0x0 | PCI Express Capabilities Register | Next Pointer | Capability ID (0x10) |
/// | Cap + 0x4 | Device Capabilities Register |
/// | Cap + 0x8 | Device Status | Device Control |
/// | Cap + 0xC | Link Capabilities Register |
/// | Cap + 0x10| Link Status | Link Control |
Contributor

clean up formatting

Contributor Author

Cleaned up the table formatting by removing extra space in the PCI Express Capabilities Register cell to align properly with the column headers.

Co-authored-by: mattkur <13772048+mattkur@users.noreply.github.com>
Copilot AI requested a review from mattkur July 18, 2025 07:18
Copilot finished work on behalf of mattkur July 18, 2025 07:18
mattkur requested a review from a team July 18, 2025 22:18
Development

Successfully merging this pull request may close these issues.

nvme: nvme emulator does not advertise FLR support
3 participants