```python
import time
import statistics

import numpy as np
import fastdeploy as fd

if __name__ == '__main__':
    option = fd.RuntimeOption()
    option.use_gpu(0)
    option.use_trt_backend()
    option.trt_option.enable_fp16 = True
    option.trt_option.set_shape('images', [1, 3, 640, 640],
                                [1, 3, 640, 640], [40, 3, 640, 640])
    option.trt_option.serialize_file = 'weights/yolov8m.engine'
    model = fd.vision.detection.YOLOv8('weights/yolov8m.onnx', runtime_option=option)

    ims = [np.random.randint(0, 256, (360, 640, 3), dtype=np.uint8) for _ in range(20)]
    model.enable_record_time_of_runtime()
    costs = []
    for i in range(500):
        # Skip the first 100 iterations as warmup.
        if 100 <= i:
            begin = time.perf_counter()
        results = model.batch_predict(ims)
        if 100 <= i:
            costs.append(time.perf_counter() - begin)
    model.print_statis_info_of_runtime()
    print(f'{int(1000 * statistics.mean(costs))}ms')
```
```
$ python benchmark.py
[INFO] fastdeploy/runtime/backends/tensorrt/trt_backend.cc(719)::CreateTrtEngineFromOnnx Detect serialized TensorRT Engine file in weights/yolov8m.engine, will load it directly.
[INFO] fastdeploy/runtime/backends/tensorrt/trt_backend.cc(108)::LoadTrtCache Build TensorRT Engine from cache file: weights/yolov8m.engine with shape range information as below,
[INFO] fastdeploy/runtime/backends/tensorrt/trt_backend.cc(111)::LoadTrtCache Input name: images, shape=[-1, 3, -1, -1], min=[1, 3, 640, 640], max=[40, 3, 640, 640]
[INFO] fastdeploy/runtime/runtime.cc(339)::CreateTrtBackend Runtime initialized with Backend::TRT in Device::GPU.
============= Runtime Statis Info(yolov8) =============
Total iterations: 500
Total time of runtime: 29.7184s.
Warmup iterations: 100
Total time of runtime in warmup step: 6.63981s.
Average time of runtime exclude warmup step: 57.6966ms.
118ms
```
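The two figures in the log can be cross-checked with a little arithmetic: the runtime's "average exclude warmup" follows from the totals it prints, and the difference from the Python-side 118ms gives the per-batch time spent outside the engine. A quick sketch using only the numbers shown above:

```python
# Figures taken from the benchmark log above.
total_s, warmup_s = 29.7184, 6.63981
iters, warmup_iters = 500, 100
batch = 20  # images per batch_predict call

# Reproduce the runtime's "average exclude warmup" figure.
engine_ms = 1000 * (total_s - warmup_s) / (iters - warmup_iters)
print(f'engine average: {engine_ms:.4f}ms')  # matches the ~57.6966ms in the log

# The Python-side wall clock per batch was 118ms; the remainder is
# spent outside the engine (preprocessing, postprocessing, copies).
python_ms = 118.0
overhead_ms = python_ms - engine_ms
print(f'per-batch overhead: {overhead_ms:.1f}ms, '
      f'per-image: {overhead_ms / batch:.2f}ms')
```

So roughly 60ms of each 118ms batch, about 3ms per image, is non-engine overhead.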
The model's built-in statistic measures only the inference engine's time. The Python-side measurement includes data pre/post-processing in addition to the inference engine time.
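That decomposition can be confirmed by timing each stage separately instead of only the whole `batch_predict` call. A minimal sketch with placeholder stage functions (`preprocess`, `infer`, `postprocess` here are stand-ins, not the FastDeploy API):

```python
import time

def timed(fn, *args):
    """Run fn and return (result, elapsed milliseconds)."""
    begin = time.perf_counter()
    out = fn(*args)
    return out, 1000 * (time.perf_counter() - begin)

# Placeholder stages; in a real measurement these would be the
# preprocessing, engine inference, and postprocessing steps.
def preprocess(x):  return x
def infer(x):       return x
def postprocess(x): return x

x = [0] * 1000
stages = {}
for name, fn in [('pre', preprocess), ('infer', infer), ('post', postprocess)]:
    x, ms = timed(fn, x)
    stages[name] = ms

total = sum(stages.values())
print({k: round(v, 3) for k, v in stages.items()}, f'total={total:.3f}ms')
```

With the real stages plugged in, the `infer` entry should line up with the runtime's built-in statistic, and `pre` + `post` should account for the remaining gap.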
YOLOv8's preprocessor currently does not inherit from ProcessorManager, so CVCUDA acceleration is not supported.
If that part of the code is adapted, how do I correctly replace the default preprocessing with CVCUDA in Python? Is it enough to initialize the model and then call:

model.preprocessor.use_cuda(True, 0)
```python
model = fd.vision.detection.YOLOv8(...)
# model.preprocessor.use_cuda(True, 0)

# CPU (default)
model.preprocessor.use_cuda(False, 0)  # CUDA
model.preprocessor.use_cuda(True, 0)   # CVCUDA
```
Environment

Performance question: what explains the gap between the 57.6966ms reported by the runtime statistics and the 118ms measured on the Python side?