
[DEP-291] NMS params passed at runtime for NMS-fused ONNX graph#2195

Draft
dkosowski87 wants to merge 20 commits into main from
DEP-291/configurable-NMS-threshold-for-YOLO

Conversation

@dkosowski87
Contributor

Work in progress ...

Task: DEP-291

roboflow-train related PR

…ration.py, utilizing new environment variable retrieval functions for strings and comma-separated lists.
…le input names and validating against declared fused NMS input names. Update forward method to dynamically build input tensors based on configuration, improving flexibility and error handling.
… and logging. Update input name checks for fused NMS and modify forward method to conditionally build input tensors based on provided parameters, improving flexibility in model configuration.
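Conditionally building input tensors against the model's declared fused-NMS input names could be sketched as below. The input names, dtypes, and the `build_nms_feed` helper are assumptions for illustration; the real names come from the ONNX graph's declared inputs:

```python
import numpy as np

# Hypothetical fused-NMS input names and dtypes; actual names would be read
# from the model's declared inputs.
NMS_INPUT_DTYPES = {
    "conf_thresh": np.float32,
    "iou_thresh": np.float32,
    "max_detections": np.int64,
}

def build_nms_feed(image_tensor, declared_inputs, conf=None, iou=None, max_det=None):
    """Build the inference feed dict, adding each NMS parameter as a
    single-element tensor only when the graph declares that input and the
    caller supplied a value."""
    feed = {"images": image_tensor}
    supplied = {"conf_thresh": conf, "iou_thresh": iou, "max_detections": max_det}
    for name, value in supplied.items():
        if value is None:
            continue
        if name not in declared_inputs:
            raise ValueError(f"model does not declare fused-NMS input {name!r}")
        feed[name] = np.asarray([value], dtype=NMS_INPUT_DTYPES[name])
    return feed

feed = build_nms_feed(
    np.zeros((1, 3, 640, 640), dtype=np.float32),
    declared_inputs={"images", "conf_thresh", "iou_thresh", "max_detections"},
    conf=0.25, iou=0.45, max_det=300,
)
```

Validating against the declared names up front turns a silent ONNX Runtime shape/name mismatch into an explicit configuration error.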
…he script includes command-line options for specifying image and model paths, as well as parameters for confidence, IOU threshold, and maximum detections. It handles image loading, model initialization, and outputs detection results with bounding box coordinates and confidence scores.
…inference method to utilize the new post_process_nms_fused_model_output function for improved detection results when fused NMS is enabled.
…mmand-line parameters for specifying the number of benchmark iterations and warmup runs, allowing users to measure inference latency with mean, median, and standard deviation statistics. Enhanced the main function to handle these new options while maintaining existing inference functionality.
…ns to generate and write latency and prediction reports in JSON format, including detailed statistics on inference latencies. Updated command-line options to specify target directory for output files, improving usability and organization of results.
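A benchmark loop with warmup runs and mean/median/standard-deviation latency reporting in JSON, as described in the commits above, might look like this minimal sketch (the `benchmark` helper and its report keys are assumptions, not the PR's actual code):

```python
import json
import statistics
import time

def benchmark(run_fn, iterations=50, warmup=5):
    """Time run_fn, discarding warmup runs, and return latency stats in ms."""
    for _ in range(warmup):
        run_fn()
    latencies = []
    for _ in range(iterations):
        start = time.perf_counter()
        run_fn()
        latencies.append((time.perf_counter() - start) * 1000.0)
    return {
        "iterations": iterations,
        "warmup": warmup,
        "mean_ms": statistics.mean(latencies),
        "median_ms": statistics.median(latencies),
        "stdev_ms": statistics.stdev(latencies) if len(latencies) > 1 else 0.0,
    }

# Stand-in workload; in the script this would be a model inference call.
report = benchmark(lambda: sum(range(1000)), iterations=10, warmup=2)
print(json.dumps(report, indent=2))
```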
…MS inference script. Updated output file paths to include run name as a subdirectory, enhancing organization of results and improving usability for reporting.
…troduced a new command-line option for selecting ONNX Runtime execution providers (cpu, cuda, tensorrt) and updated the model loading process to utilize the selected provider and device. Enhanced latency reporting to include execution provider details, improving configurability and transparency in inference performance.
…uced a new command-line option for specifying batch size, allowing users to duplicate the input image for batched inference. Updated the main function and related methods to handle batch inputs, enhancing performance measurement and flexibility in inference execution.
…ages. Updated command-line options to accept an image directory instead of a single image path, enabling batch processing of multiple images. Enhanced latency reporting with additional metrics and improved JSON output structure for better organization of results.
…ency reporting. Introduced a constant for test batch size, enhanced JSON output to include image paths, and updated command-line options for benchmark iterations and warmup runs. Refactored image loading logic to support dynamic batching based on model configuration.
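Batching a directory of preprocessed images to a fixed batch size, padding the final batch by duplicating the last image, could be sketched as follows (the `batch_images` helper and its padding strategy are assumptions for illustration):

```python
import numpy as np

def batch_images(images, batch_size):
    """Stack preprocessed CHW images into fixed-size NCHW batches,
    duplicating the last image to pad the final batch when needed."""
    batches = []
    for i in range(0, len(images), batch_size):
        chunk = images[i:i + batch_size]
        while len(chunk) < batch_size:
            chunk.append(chunk[-1])  # pad by repeating the last image
        batches.append(np.stack(chunk))
    return batches

imgs = [np.zeros((3, 640, 640), dtype=np.float32) for _ in range(5)]
batches = batch_images(imgs, batch_size=2)
```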
…-element tensors for confidence, IOU threshold, and max detections. This change improves compatibility with the input builders and streamlines the inference process.
…ludes command-line options for image and model paths, as well as parameters for confidence, IOU threshold, and maximum detections. Handles image loading, model initialization, and outputs detection results with bounding box coordinates and confidence scores.
…. Includes command-line options for run name, image directory, model path, target directory, and parameters for confidence, IOU threshold, and maximum detections. Implements latency reporting and JSON output for detailed performance metrics, enhancing usability and configurability for inference tasks.
… error messaging. Updated input validation to allow tensors with dimension 0 equal to 1 and enhanced error messages for incompatible batch sizes. Adjusted batch processing logic to ensure compatibility with dynamic input sizes in YOLOv8 model.
…. Introduced a new command-line option for selecting execution providers (cpu, cuda, tensorrt) and updated model loading to utilize the selected provider and device. Enhanced logging to include provider details for improved configurability and transparency in inference performance.
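Mapping the cpu/cuda/tensorrt command-line choice to an ONNX Runtime provider list might be sketched like this; the `PROVIDER_MAP` contents and fallback ordering are assumptions, though the provider name strings are ONNX Runtime's standard identifiers:

```python
# Map the CLI choice to ONNX Runtime provider lists. TensorRT falls back to
# CUDA, and CUDA falls back to CPU, mirroring ORT's usual fallback chain.
PROVIDER_MAP = {
    "cpu": ["CPUExecutionProvider"],
    "cuda": ["CUDAExecutionProvider", "CPUExecutionProvider"],
    "tensorrt": [
        "TensorrtExecutionProvider",
        "CUDAExecutionProvider",
        "CPUExecutionProvider",
    ],
}

def providers_for(choice: str) -> list[str]:
    try:
        return PROVIDER_MAP[choice]
    except KeyError:
        raise ValueError(f"unknown execution provider: {choice!r}") from None

# The session would then be created as (requires onnxruntime installed):
#   import onnxruntime as ort
#   session = ort.InferenceSession(model_path, providers=providers_for("cuda"))
```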
@dkosowski87 dkosowski87 mentioned this pull request Apr 7, 2026
