
Standardized Object Detection Benchmarking

Problem

Current object detection benchmarking practices suffer from significant inconsistencies that compromise the reliability of reported performance metrics. Typically, researchers report mAP values from their research code, then export models to ONNX and compile them with TensorRT at fp16 precision to report latency measurements. This approach introduces several sources of error:

  1. Precision Compatibility: Some models do not function correctly when compiled to fp16 precision
  2. Postprocessing Overhead: Complex postprocessing operations significantly impact model performance but are inconsistently handled across implementations
  3. Measurement Methodology: Latencies taken from raw trtexec output are reported interchangeably with measurements from Python inference sessions, although the two are not directly comparable
  4. Thermal Throttling: Inadequate control for GPU power throttling due to thermal saturation, leading to unreproducible latency measurements

Solution

This framework provides an optimized TensorRT Python implementation that translates directly from ONNX graphs to latency/mAP pairs without relying on complex postprocessing for any model. The implementation addresses the identified issues through:

  • Throttling Monitoring: Active detection of GPU thermal throttling to determine measurement reliability
  • Thermal Management: Insertion of cooling buffers between successive inference calls to reduce throttling effects (a sketch of both follows this list)
  • Hosted Model Repository: Centralized hosting of ONNX graphs to ensure model availability and reproducibility
  • Standardized Export: Consistent model export methodology across architectures
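
As an illustration of the throttling-aware measurement loop, the sketch below detects thermal throttling through NVML and inserts a cooling buffer before timing. This is a minimal, hypothetical sketch, assuming the nvidia-ml-py (pynvml) package and a single GPU at index 0; the framework's actual monitoring code may differ.

import time

import pynvml

pynvml.nvmlInit()
handle = pynvml.nvmlDeviceGetHandleByIndex(0)  # assumes GPU index 0

# Bitmask of the NVML throttle reasons that indicate thermal slowdown.
THERMAL_REASONS = (
    pynvml.nvmlClocksThrottleReasonSwThermalSlowdown
    | pynvml.nvmlClocksThrottleReasonHwThermalSlowdown
)

def is_thermally_throttled() -> bool:
    """Return True if NVML currently reports a thermal slowdown."""
    reasons = pynvml.nvmlDeviceGetCurrentClocksThrottleReasons(handle)
    return bool(reasons & THERMAL_REASONS)

def timed_inference(infer, cooldown_s=0.25):
    """Time one inference call, cooling down first if the GPU is throttled.

    Returns the latency in seconds and whether throttling was observed
    afterwards, so unreliable samples can be discarded.
    """
    while is_thermally_throttled():
        time.sleep(cooldown_s)  # cooling buffer between inference calls
    start = time.perf_counter()
    infer()
    return time.perf_counter() - start, is_thermally_throttled()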

Model Export Standards

ONNX graphs are obtained directly from the original authors' repositories for each model type. For YOLO models specifically, export is performed using the command:

yolo export format=onnx nms=True conf=0.001
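
For reference, the same export can be run from Python through the Ultralytics API; the sketch below mirrors the CLI command above, with a placeholder weights filename:

from ultralytics import YOLO

model = YOLO("yolo11n.pt")  # placeholder checkpoint; substitute the model under test
model.export(format="onnx", nms=True, conf=0.001)  # same arguments as the CLI export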

Technical Implementation

A notable distinction from the D-FINE implementation is the inclusion of CUDA graph support. While CUDA graphs are straightforward to enable with trtexec, they require extra care in Python environments. However, they provide meaningful performance improvements for certain model architectures, justifying their inclusion in this framework.
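
To give a sense of what this involves, the sketch below captures a single TensorRT execution into a CUDA graph and replays it. It is an illustrative sketch rather than this framework's exact implementation, assuming the cuda-python (CUDA 12) bindings and a TensorRT IExecutionContext named context whose input/output tensor addresses are already bound:

from cuda import cudart

_, stream = cudart.cudaStreamCreate()
stream_handle = int(stream)  # raw cudaStream_t handle for TensorRT

# Warm-up run outside capture so lazy allocations do not end up in the graph.
context.execute_async_v3(stream_handle)
cudart.cudaStreamSynchronize(stream)

# Capture one inference call into a CUDA graph.
cudart.cudaStreamBeginCapture(
    stream, cudart.cudaStreamCaptureMode.cudaStreamCaptureModeGlobal
)
context.execute_async_v3(stream_handle)
_, graph = cudart.cudaStreamEndCapture(stream)
_, graph_exec = cudart.cudaGraphInstantiate(graph, 0)

# Replay the captured graph; each launch bypasses per-call Python and
# kernel-launch overhead, which is where the speedup comes from.
cudart.cudaGraphLaunch(graph_exec, stream)
cudart.cudaStreamSynchronize(stream)

Because the graph replays fixed device addresses, input data must be copied into the same buffers before each launch.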

Usage

To run the benchmark:

  1. Install dependencies: pip install -r requirements.txt
  2. Execute: python3 benchmark_all.py <path to coco val dir> <path to coco val annotations>
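
For example, with a locally downloaded COCO 2017 validation split (paths are hypothetical):

python3 benchmark_all.py ~/datasets/coco/val2017 ~/datasets/coco/annotations/instances_val2017.json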

Contributions

Contributions of new models to the benchmark suite are welcome. Please submit model additions by opening a pull request to the repository.
