core: zstd: Implement dispatcher methods for zstd decompression #10697
Signed-off-by: Hiroshi Hatake <hiroshi@chronosphere.io>
Walkthrough
Added ZSTD support: new algorithm constant, opaque ZSTD decompression context and lifecycle/dispatch APIs, integrated ZSTD into core decompression flow, implemented streaming frame-wise ZSTD decompressor, and added tests for multi-chunk streaming and checksum-based corruption detection.
Sequence Diagram(s)
sequenceDiagram
participant Test as Test Case
participant Core as flb_decompression_context
participant ZSTD as flb_zstd_decompression_context
participant Lib as ZSTD Library
Test->>Core: create context (algorithm=ZSTD)
Core->>ZSTD: flb_zstd_decompression_context_create()
ZSTD->>Lib: ZSTD_createDCtx()
Lib-->>ZSTD: DCtx
loop for each input chunk
Test->>Core: append chunk to input buffer
Test->>Core: flb_decompress()
Core->>ZSTD: flb_zstd_decompressor_dispatch()
ZSTD->>Lib: ZSTD_findFrameCompressedSize()
alt full frame available
ZSTD->>Lib: ZSTD_decompressDCtx()
Lib-->>ZSTD: decompressed data
ZSTD-->>Core: return success, advance input
else need more input
ZSTD-->>Core: return INSUFFICIENT_DATA
else corrupted frame
ZSTD-->>Core: return FATAL_ERROR
end
end
Test->>Core: destroy context
Core->>ZSTD: flb_zstd_decompression_context_destroy()
ZSTD->>Lib: ZSTD_freeDCtx()
Lib-->>ZSTD: freed
Estimated code review effort: 4 (Complex) | ~45 minutes
Actionable comments posted: 1
🔭 Outside diff range comments (1)
src/flb_compression.c (1)
134-138: Missing ZSTD context cleanup in destructor
The destructor only handles GZIP decompression context cleanup but not ZSTD. This will cause memory leaks when ZSTD contexts are destroyed.
Apply this fix:
 if (context->inner_context != NULL) {
-    flb_gzip_decompression_context_destroy(context->inner_context);
+    if (context->algorithm == FLB_COMPRESSION_ALGORITHM_GZIP) {
+        flb_gzip_decompression_context_destroy(context->inner_context);
+    }
+    else if (context->algorithm == FLB_COMPRESSION_ALGORITHM_ZSTD) {
+        flb_zstd_decompression_context_destroy(context->inner_context);
+    }
     context->inner_context = NULL;
 }
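The suggested fix amounts to dispatching the inner-context destructor on the algorithm tag. A minimal, self-contained sketch of that pattern follows. All names here (`ALGORITHM_GZIP`, `decompression_context`, `gzip_inner_destroy`, and so on) are hypothetical stand-ins rather than the Fluent Bit API, and the inner contexts are plain heap blocks so the example builds without zlib or libzstd.

```c
#include <stdlib.h>

/* Hypothetical stand-ins; the real constants live in flb_compression.h. */
enum algorithm { ALGORITHM_GZIP = 1, ALGORITHM_ZSTD = 2 };

struct decompression_context {
    enum algorithm algorithm;
    void *inner_context;      /* algorithm-specific state, opaque here */
};

/* Counters that let a caller observe which destructor ran. */
int gzip_destroy_calls = 0;
int zstd_destroy_calls = 0;

/* Stub inner destructors; real ones would release zlib/zstd state. */
void gzip_inner_destroy(void *inner) { free(inner); gzip_destroy_calls++; }
void zstd_inner_destroy(void *inner) { free(inner); zstd_destroy_calls++; }

struct decompression_context *decompression_context_create(enum algorithm algorithm)
{
    struct decompression_context *context = calloc(1, sizeof(*context));

    if (context == NULL) {
        return NULL;
    }
    context->algorithm = algorithm;
    context->inner_context = malloc(16);  /* placeholder inner state */
    if (context->inner_context == NULL) {
        free(context);
        return NULL;
    }
    return context;
}

void decompression_context_destroy(struct decompression_context *context)
{
    if (context == NULL) {
        return;
    }
    if (context->inner_context != NULL) {
        /* Pick the destructor that matches how the inner context was made. */
        switch (context->algorithm) {
        case ALGORITHM_GZIP:
            gzip_inner_destroy(context->inner_context);
            break;
        case ALGORITHM_ZSTD:
            zstd_inner_destroy(context->inner_context);
            break;
        }
        context->inner_context = NULL;
    }
    free(context);
}
```

Creating one context per algorithm and destroying both should bump each counter exactly once. Keying on the tag also keeps a ZSTD context from being handed to the GZIP destructor.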
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
- include/fluent-bit/flb_compression.h (1 hunks)
- include/fluent-bit/flb_zstd.h (1 hunks)
- src/flb_compression.c (4 hunks)
- src/flb_zstd.c (2 hunks)
- tests/internal/zstd.c (2 hunks)
🔇 Additional comments (8)
include/fluent-bit/flb_compression.h (1)
29-29: LGTM!
The new constant follows the sequential pattern and properly extends the compression algorithm enumeration.
include/fluent-bit/flb_zstd.h (1)
26-35: LGTM!
The forward declaration and new function signatures properly extend the zstd API with streaming decompression support. The opaque pointer pattern for context management is appropriate.
src/flb_compression.c (1)
182-184: LGTM!
The ZSTD algorithm integration is properly implemented with appropriate context creation, state initialization, and decompressor dispatch.
Also applies to: 209-211, 227-232
tests/internal/zstd.c (1)
167-317: LGTM!
Comprehensive test coverage for streaming decompression including multi-chunk processing and corruption detection. Proper memory cleanup in all test paths.
src/flb_zstd.c (4)
27-29: LGTM!
Clean and simple context structure encapsulating the ZSTD decompression context.
168-233: LGTM!
Well-implemented streaming decompression dispatcher with proper error handling:
- Distinguishes between recoverable (need more data) and fatal errors
- Handles frame boundary detection correctly
- Updates buffer pointers appropriately after decompression
- Sets failure state on errors for proper error propagation
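To illustrate the three outcomes listed above (success, need more data, fatal error), here is a minimal sketch of a frame-wise dispatcher. It deliberately swaps zstd for a toy frame format (`[magic][payload_len][payload][xor checksum]`) so it compiles with the C standard library alone. Every name in it (`dispatch_one_frame`, `struct stream`, `TOY_MAGIC`, the `DISPATCH_*` codes) is an assumption made for the example, not the code under review.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical result codes mirroring the three outcomes in the review. */
enum dispatch_result {
    DISPATCH_OK                =  0,
    DISPATCH_INSUFFICIENT_DATA =  1,  /* recoverable: wait for more input */
    DISPATCH_FATAL_ERROR       = -1   /* unrecoverable: corrupted frame   */
};

#define TOY_MAGIC 0xF5

struct stream {
    const unsigned char *read_ptr;  /* next unread input byte          */
    size_t input_len;               /* number of unread bytes          */
    int failed;                     /* sticky flag set on fatal errors */
};

/* Toy frame layout: [magic][payload_len][payload...][xor of payload]. */
int dispatch_one_frame(struct stream *s, unsigned char *out,
                       size_t out_cap, size_t *out_len)
{
    size_t payload_len;
    size_t frame_len;
    size_t i;
    unsigned char sum = 0;

    *out_len = 0;
    if (s->failed) {
        return DISPATCH_FATAL_ERROR;
    }
    /* The frame size is unknown until the 2-byte prefix has arrived. */
    if (s->input_len < 2) {
        return DISPATCH_INSUFFICIENT_DATA;
    }
    if (s->read_ptr[0] != TOY_MAGIC) {
        s->failed = 1;
        return DISPATCH_FATAL_ERROR;
    }
    payload_len = s->read_ptr[1];
    frame_len = 2 + payload_len + 1;
    /* Frame boundary check: leave the input untouched until it is whole. */
    if (s->input_len < frame_len) {
        return DISPATCH_INSUFFICIENT_DATA;
    }
    for (i = 0; i < payload_len; i++) {
        sum ^= s->read_ptr[2 + i];
    }
    /* A checksum mismatch or an undersized output buffer is fatal here. */
    if (sum != s->read_ptr[2 + payload_len] || out_cap < payload_len) {
        s->failed = 1;
        return DISPATCH_FATAL_ERROR;
    }
    memcpy(out, s->read_ptr + 2, payload_len);
    *out_len = payload_len;
    /* Advance past the consumed frame so the next call sees the next one. */
    s->read_ptr += frame_len;
    s->input_len -= frame_len;
    return DISPATCH_OK;
}
```

A caller would append each incoming chunk to its input buffer, call the dispatcher until it reports insufficient data, and stop for good on a fatal error. Because the insufficient-data path neither moves `read_ptr` nor sets `failed`, the same call can simply be retried once more bytes arrive.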
235-253: LGTM!
Context creation properly handles allocation failures for both the wrapper structure and the internal ZSTD context.
255-265: LGTM!
Destructor properly checks for null pointers before freeing resources.
2d7131e to 7c09c6f
Actionable comments posted: 0
🧹 Nitpick comments (1)
src/flb_gzip.c (1)
691-692: Nit: Update the stale comment (“header + crc32”) to reflect header-only parsing here.
The logic now only requires the fixed 10-byte header at this point; CRC/footer is handled later. Suggest clarifying the comment to prevent confusion for future readers.
Apply this diff:
-    /* Minimal length: header + crc32 */
+    /* Minimal length for header parsing: GZIP fixed header (10 bytes).
+     * CRC/footer will be processed after the body. */
📒 Files selected for processing (1)
- src/flb_gzip.c (1 hunks)
🔇 Additional comments (1)
src/flb_gzip.c (1)
693-697: Good change: partial-header now returns INSUFFICIENT_DATA (non-fatal), enabling true streaming.
This aligns with the streaming decompressor semantics and avoids noisy error logs while waiting for more bytes. Keeping the state unchanged (still EXPECTING_HEADER) is correct and ensures idempotent re-entry once more data arrives.
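The re-entry behaviour described here can be sketched in isolation. The example below is a simplified illustration, not the code in src/flb_gzip.c. The names (`parse_gzip_header`, `struct gzip_parser`, the `PARSE_*` and `STATE_*` values) are hypothetical, and it only checks the fixed 10-byte header (magic bytes 0x1f 0x8b and compression method 8), ignoring the optional header fields.

```c
#include <stddef.h>

#define GZIP_FIXED_HEADER_SIZE 10

/* Hypothetical result and state names for this sketch. */
enum parse_result {
    PARSE_OK                =  0,
    PARSE_INSUFFICIENT_DATA =  1,
    PARSE_FATAL_ERROR       = -1
};

enum parser_state { STATE_EXPECTING_HEADER, STATE_EXPECTING_BODY };

struct gzip_parser {
    enum parser_state state;
};

int parse_gzip_header(struct gzip_parser *parser, const unsigned char *in,
                      size_t len, size_t *consumed)
{
    *consumed = 0;
    if (parser->state != STATE_EXPECTING_HEADER) {
        return PARSE_FATAL_ERROR;
    }
    /* Partial header: not an error. The state stays EXPECTING_HEADER, so
     * the caller can retry with the same buffer once more bytes arrive. */
    if (len < GZIP_FIXED_HEADER_SIZE) {
        return PARSE_INSUFFICIENT_DATA;
    }
    /* Fixed header starts with ID1=0x1f, ID2=0x8b, CM=8 (deflate). */
    if (in[0] != 0x1f || in[1] != 0x8b || in[2] != 8) {
        return PARSE_FATAL_ERROR;
    }
    *consumed = GZIP_FIXED_HEADER_SIZE;
    parser->state = STATE_EXPECTING_BODY;
    return PARSE_OK;
}
```

Calling the function repeatedly with fewer than 10 bytes keeps returning the non-fatal code without consuming input or changing state. The state advances only once the whole fixed header is present.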
To handle zstd-compressed buffers and chunks robustly, we need a dispatch mechanism for zstd decompression.
Decompressing zstd data in a single pass can easily exceed buffer limits and consume a large amount of memory.
So, we need to implement streaming decompression for zstd, too.
The motivation for this feature is to support zstd compression in the forward input and output plugins, which use the forward protocol.
Fluentd has already implemented this feature. The follow-up work is to use this core feature to handle zstd decompression robustly in the in_forward plugin.
ref: fluent/fluentd#4758