- ONNX Runtime Custom Ops
Perform soft NMS on `boxes` with `scores`. Read Soft-NMS -- Improving Object Detection With One Line of Code for details.
| Type | Parameter | Description |
|---|---|---|
| float | iou_threshold | IoU threshold for NMS |
| float | sigma | hyperparameter for gaussian method |
| float | min_score | score filter threshold |
| int | method | method to do the nms, (0: `naive`, 1: `linear`, 2: `gaussian`) |
| int | offset | boxes width or height is (x2 - x1 + offset). (0 or 1) |
- boxes: T
- Input boxes. 2-D tensor of shape (N, 4). N is the number of boxes.
- scores: T
- Input scores. 1-D tensor of shape (N, ).
- dets: T
- Output boxes and scores. 2-D tensor of shape (num_valid_boxes, 5), [[x1, y1, x2, y2, score], ...]. num_valid_boxes is the number of valid boxes.
- indices: tensor(int64)
- Output indices. 1-D tensor of shape (num_valid_boxes, ).
- T:tensor(float32)
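
The parameters above can be illustrated with a minimal NumPy sketch of soft NMS. It is not the operator's actual kernel: the function name `soft_nms`, the default values, and the handling of boundary cases (for example, whether an IoU exactly equal to `iou_threshold` is decayed) are assumptions made for illustration.

```python
import numpy as np

def soft_nms(boxes, scores, iou_threshold=0.3, sigma=0.5,
             min_score=1e-3, method=1, offset=0):
    """Illustrative soft NMS. Returns (dets, indices) as described above."""
    boxes = np.asarray(boxes, dtype=np.float32)
    scores = np.asarray(scores, dtype=np.float32).copy()
    areas = ((boxes[:, 2] - boxes[:, 0] + offset)
             * (boxes[:, 3] - boxes[:, 1] + offset))
    # Candidates whose score passes the filter threshold.
    remaining = np.flatnonzero(scores >= min_score)
    keep = []
    while remaining.size > 0:
        best = remaining[np.argmax(scores[remaining])]
        keep.append(best)
        remaining = remaining[remaining != best]
        # IoU between the selected box and every remaining box.
        w = np.maximum(0.0, np.minimum(boxes[best, 2], boxes[remaining, 2])
                       - np.maximum(boxes[best, 0], boxes[remaining, 0]) + offset)
        h = np.maximum(0.0, np.minimum(boxes[best, 3], boxes[remaining, 3])
                       - np.maximum(boxes[best, 1], boxes[remaining, 1]) + offset)
        inter = w * h
        union = np.maximum(areas[best] + areas[remaining] - inter, 1e-12)
        iou = inter / union
        if method == 0:    # naive: hard suppression
            weight = np.where(iou > iou_threshold, 0.0, 1.0)
        elif method == 1:  # linear decay
            weight = np.where(iou > iou_threshold, 1.0 - iou, 1.0)
        else:              # gaussian decay
            weight = np.exp(-(iou ** 2) / sigma)
        scores[remaining] = scores[remaining] * weight
        # Drop boxes whose decayed score fell below min_score.
        remaining = remaining[scores[remaining] >= min_score]
    keep = np.array(keep, dtype=np.int64)
    dets = np.concatenate([boxes[keep], scores[keep][:, None]], axis=1)
    return dets, keep
```

With `method=0` the sketch behaves like ordinary NMS, while `method=1` and `method=2` keep overlapping boxes with reduced scores until they fall below `min_score`.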
Perform RoIAlign on output feature, used in bbox_head of most two-stage detectors.
| Type | Parameter | Description |
|---|---|---|
| int | output_height | height of output roi |
| int | output_width | width of output roi |
| float | spatial_scale | used to scale the input boxes |
| int | sampling_ratio | number of input samples to take for each output sample. `0` means to take samples densely for current models. |
| str | mode | pooling mode in each bin. `avg` or `max` |
| int | aligned | If `aligned=0`, use the legacy implementation in MMDetection. Otherwise, align the results more precisely. |
- input: T
- Input feature map; 4-D tensor of shape (N, C, H, W), where N is the batch size, C is the number of channels, H and W are the height and width of the data.
- rois: T
- RoIs (Regions of Interest) to pool over; 2-D tensor of shape (num_rois, 5) given as [[batch_index, x1, y1, x2, y2], ...]. The RoI coordinates are expressed in the coordinate system of input.
- feat: T
- RoI pooled output, 4-D tensor of shape (num_rois, C, output_height, output_width). The r-th batch element feat[r-1] is a pooled feature map corresponding to the r-th RoI RoIs[r-1].
- T:tensor(float32)
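
How `spatial_scale`, `sampling_ratio` and `aligned` interact can be sketched as follows. This assumes the convention used by common RoIAlign implementations (a half-pixel shift when `aligned` is nonzero, and a minimum RoI size of 1 in legacy mode); the helper name `roi_bins` and its return format are hypothetical, not part of the operator.

```python
import math

def roi_bins(roi, output_height, output_width,
             spatial_scale=1.0, sampling_ratio=0, aligned=1):
    """Map one RoI [batch_index, x1, y1, x2, y2] to feature-map bins."""
    _, x1, y1, x2, y2 = roi
    # Assumption: aligned mode shifts coordinates by half a pixel.
    shift = 0.5 if aligned else 0.0
    start_w = x1 * spatial_scale - shift
    start_h = y1 * spatial_scale - shift
    roi_w = x2 * spatial_scale - shift - start_w
    roi_h = y2 * spatial_scale - shift - start_h
    if not aligned:
        # Assumption: legacy mode forces RoIs to be at least 1x1.
        roi_w = max(roi_w, 1.0)
        roi_h = max(roi_h, 1.0)
    # sampling_ratio == 0 means "sample densely": derive the count from the bin size.
    grid_h = sampling_ratio if sampling_ratio > 0 else math.ceil(roi_h / output_height)
    grid_w = sampling_ratio if sampling_ratio > 0 else math.ceil(roi_w / output_width)
    return {
        "start": (start_h, start_w),
        "bin_size": (roi_h / output_height, roi_w / output_width),
        "samples_per_bin": (grid_h, grid_w),
    }
```

Each output cell then averages (or takes the maximum of) the bilinearly interpolated samples inside its bin, depending on `mode`.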
Filter out boxes that have a high IoU overlap with previously selected boxes.
| Type | Parameter | Description |
|---|---|---|
| float | iou_threshold | The threshold for deciding whether boxes overlap too much with respect to IoU. Value range [0, 1]. Defaults to 0. |
| int | offset | 0 or 1, boxes' width or height is (x2 - x1 + offset). |
- bboxes: T
- Input boxes. 2-D tensor of shape (num_boxes, 4). num_boxes is the number of input boxes.
- scores: T
- Input scores. 1-D tensor of shape (num_boxes, ).
- indices: tensor(int32, Linear)
- Selected indices. 1-D tensor of shape (num_valid_boxes, ). num_valid_boxes is the number of valid boxes.
- T:tensor(float32)
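
A minimal NumPy sketch of the greedy selection described above, assuming boxes are visited in descending score order and a box is dropped when its IoU with an already selected box exceeds `iou_threshold`. The function name `nms` and the tie handling are illustrative assumptions.

```python
import numpy as np

def nms(bboxes, scores, iou_threshold=0.0, offset=0):
    """Illustrative greedy NMS. Returns the indices of the kept boxes."""
    bboxes = np.asarray(bboxes, dtype=np.float32)
    scores = np.asarray(scores, dtype=np.float32)
    areas = ((bboxes[:, 2] - bboxes[:, 0] + offset)
             * (bboxes[:, 3] - bboxes[:, 1] + offset))
    order = np.argsort(-scores, kind="stable")  # highest score first
    keep = []
    while order.size > 0:
        i, rest = order[0], order[1:]
        keep.append(i)
        # IoU between the selected box and the remaining candidates.
        w = np.maximum(0.0, np.minimum(bboxes[i, 2], bboxes[rest, 2])
                       - np.maximum(bboxes[i, 0], bboxes[rest, 0]) + offset)
        h = np.maximum(0.0, np.minimum(bboxes[i, 3], bboxes[rest, 3])
                       - np.maximum(bboxes[i, 1], bboxes[rest, 1]) + offset)
        inter = w * h
        iou = inter / np.maximum(areas[i] + areas[rest] - inter, 1e-12)
        # Keep only candidates that do not overlap the selected box too much.
        order = rest[iou <= iou_threshold]
    # int32 here follows the dtype documented for `indices` above.
    return np.array(keep, dtype=np.int32)
```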
Sample from `input` at the pixel locations given by `grid`.
| Type | Parameter | Description |
|---|---|---|
| int | interpolation_mode | Interpolation mode to calculate output values. (0: `bilinear`, 1: `nearest`) |
| int | padding_mode | Padding mode for outside grid values. (0: `zeros`, 1: `border`, 2: `reflection`) |
| int | align_corners | If `align_corners=1`, the extrema (`-1` and `1`) are considered as referring to the center points of the input's corner pixels. If `align_corners=0`, they are instead considered as referring to the corner points of the input's corner pixels, making the sampling more resolution agnostic. |
- input: T
- Input feature; 4-D tensor of shape (N, C, inH, inW), where N is the batch size, C is the number of channels, inH and inW are the height and width of the data.
- grid: T
- Input offset; 4-D tensor of shape (N, outH, outW, 2), where outH and outW are the height and width of offset and output.
- output: T
- Output feature; 4-D tensor of shape (N, C, outH, outW).
- T:tensor(float32, Linear)
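
The effect of `align_corners` can be made concrete with a small sketch that converts a normalized grid coordinate in [-1, 1] to a pixel coordinate. It follows the convention documented for `torch.nn.functional.grid_sample`; the helper name `unnormalize` is hypothetical.

```python
def unnormalize(coord, size, align_corners):
    """Map a normalized coordinate in [-1, 1] to a pixel index along an axis of length `size`."""
    if align_corners:
        # -1 and 1 refer to the centers of the first and last pixels.
        return (coord + 1) / 2 * (size - 1)
    # -1 and 1 refer to the outer edges of the first and last pixels.
    return ((coord + 1) * size - 1) / 2

# For an axis with 4 pixels (centers at 0, 1, 2, 3):
print(unnormalize(-1.0, 4, align_corners=1))  # 0.0
print(unnormalize(1.0, 4, align_corners=1))   # 3.0
print(unnormalize(-1.0, 4, align_corners=0))  # -0.5
print(unnormalize(1.0, 4, align_corners=0))   # 3.5
```

Coordinates that land outside the valid pixel range are then handled according to `padding_mode`.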
Perform CornerPool on `input` features. Read CornerNet -- Detecting Objects as Paired Keypoints for more details.
| Type | Parameter | Description |
|---|---|---|
| int | mode | corner pool mode, (0: `top`, 1: `bottom`, 2: `left`, 3: `right`) |
- input: T
- Input features. 4-D tensor of shape (N, C, H, W). N is the batch size.
- output: T
- Output the pooled features. 4-D tensor of shape (N, C, H, W).
- T:tensor(float32)
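
Corner pooling can be expressed as a directional running maximum. The NumPy sketch below assumes the CornerNet definition, in which `top` pooling takes the maximum over everything from the current row down to the bottom of the map (and analogously for the other modes); the function name `corner_pool` is hypothetical.

```python
import numpy as np

def corner_pool(x, mode):
    """Illustrative corner pooling on an (N, C, H, W) array."""
    x = np.asarray(x)
    # mode -> (axis to scan, whether to scan in reverse)
    axis, reverse = {
        0: (2, True),   # top: max over rows i..H-1
        1: (2, False),  # bottom: max over rows 0..i
        2: (3, True),   # left: max over columns j..W-1
        3: (3, False),  # right: max over columns 0..j
    }[mode]
    if reverse:
        x = np.flip(x, axis=axis)
    out = np.maximum.accumulate(x, axis=axis)
    if reverse:
        out = np.flip(out, axis=axis)
    return out
```

The output has the same shape as the input, matching the (N, C, H, W) shape listed above.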
Returns a tuple (`values`, `indices`) where `values` is the cumulative maximum elements of `input` in the dimension `dim`, and `indices` is the index location of each maximum value found in the dimension `dim`. Read torch.cummax for more details.
| Type | Parameter | Description |
|---|---|---|
| int | dim | the dimension to do the operation over |
- input: T
- The input tensor, which may have any shape. Empty tensors are also supported.
- output: T
- Output the cumulative maximum elements of `input` in the dimension `dim`, with the same shape and dtype as `input`.
- indices: tensor(int64)
- Output the index location of each cumulative maximum value found in the dimension `dim`, with the same shape as `input`.
- T:tensor(float32)
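
The relationship between `output` and `indices` can be sketched in NumPy. The function name `cummax` is chosen for illustration; when the running maximum is tied, this sketch reports the latest index, which is an assumption and may not match the operator in every case.

```python
import numpy as np

def cummax(x, dim):
    """Illustrative cumulative maximum along `dim`. Returns (values, indices)."""
    x = np.asarray(x)
    values = np.maximum.accumulate(x, axis=dim)
    # Positions 0..n-1 along `dim`, shaped so they broadcast against x.
    shape = [1] * x.ndim
    shape[dim] = x.shape[dim]
    pos = np.arange(x.shape[dim]).reshape(shape)
    # A position holds the running maximum exactly when x equals `values` there;
    # carrying the largest such position forward gives the index of the maximum so far.
    indices = np.maximum.accumulate(np.where(x == values, pos, 0), axis=dim)
    return values, indices.astype(np.int64)

values, indices = cummax(np.array([[1.0, 3.0, 2.0, 5.0, 4.0]]), dim=1)
print(values)   # [[1. 3. 3. 5. 5.]]
print(indices)  # [[0 1 1 3 3]]
```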
Returns a tuple (`values`, `indices`) where `values` is the cumulative minimum elements of `input` in the dimension `dim`, and `indices` is the index location of each minimum value found in the dimension `dim`. Read torch.cummin for more details.
| Type | Parameter | Description |
|---|---|---|
| int | dim | the dimension to do the operation over |
- input: T
- The input tensor, which may have any shape. Empty tensors are also supported.
- output: T
- Output the cumulative minimum elements of `input` in the dimension `dim`, with the same shape and dtype as `input`.
- indices: tensor(int64)
- Output the index location of each cumulative minimum value found in the dimension `dim`, with the same shape as `input`.
- T:tensor(float32)
Perform Modulated Deformable Convolution on the input feature. Read Deformable ConvNets v2: More Deformable, Better Results for details.
| Type | Parameter | Description |
|---|---|---|
| list of ints | stride | The stride of the convolving kernel. (sH, sW) |
| list of ints | padding | Paddings on both sides of the input. (padH, padW) |
| list of ints | dilation | The spacing between kernel elements. (dH, dW) |
| int | deformable_groups | Groups of deformable offset. |
| int | groups | Split input into groups. input_channel should be divisible by the number of groups. |
- inputs[0]: T
- Input feature; 4-D tensor of shape (N, C, inH, inW), where N is the batch size, C is the number of channels, inH and inW are the height and width of the data.
- inputs[1]: T
- Input offset; 4-D tensor of shape (N, deformable_groups * 2 * kH * kW, outH, outW), where kH and kW are the height and width of weight, outH and outW are the height and width of offset and output.
- inputs[2]: T
- Input mask; 4-D tensor of shape (N, deformable_groups * kH * kW, outH, outW), where kH and kW are the height and width of weight, outH and outW are the height and width of offset and output.
- inputs[3]: T
- Input weight; 4-D tensor of shape (output_channel, input_channel, kH, kW).
- inputs[4]: T, optional
- Input bias; 1-D tensor of shape (output_channel).
- outputs[0]: T
- Output feature; 4-D tensor of shape (N, output_channel, outH, outW).
- T:tensor(float32, Linear)
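
The shapes listed above are tied together by the usual convolution output-size formula. The helper below is a sketch for checking that the offset and mask tensors are consistent with the input and weight; the function name `modulated_deform_conv_shapes` is hypothetical, and the formula is the standard one for strided, padded, dilated convolutions rather than something stated in this document.

```python
def modulated_deform_conv_shapes(input_shape, weight_shape, stride,
                                 padding, dilation, deformable_groups=1):
    """Return the expected offset, mask and output shapes."""
    n, _, in_h, in_w = input_shape
    out_c, _, k_h, k_w = weight_shape
    # Standard convolution output size (assumed to apply to this operator).
    out_h = (in_h + 2 * padding[0] - dilation[0] * (k_h - 1) - 1) // stride[0] + 1
    out_w = (in_w + 2 * padding[1] - dilation[1] * (k_w - 1) - 1) // stride[1] + 1
    return {
        # Two offset values (one per spatial axis) per kernel element and deformable group.
        "offset": (n, deformable_groups * 2 * k_h * k_w, out_h, out_w),
        # One modulation scalar per kernel element and deformable group.
        "mask": (n, deformable_groups * k_h * k_w, out_h, out_w),
        "output": (n, out_c, out_h, out_w),
    }

shapes = modulated_deform_conv_shapes(
    (2, 16, 32, 32), (8, 16, 3, 3), stride=(1, 1), padding=(1, 1), dilation=(1, 1))
print(shapes["offset"])  # (2, 18, 32, 32)
print(shapes["mask"])    # (2, 9, 32, 32)
print(shapes["output"])  # (2, 8, 32, 32)
```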
Perform Deformable Convolution on the input feature. Read Deformable Convolutional Network for details.
| Type | Parameter | Description |
|---|---|---|
| list of ints | stride | The stride of the convolving kernel. (sH, sW) |
| list of ints | padding | Paddings on both sides of the input. (padH, padW) |
| list of ints | dilation | The spacing between kernel elements. (dH, dW) |
| int | deformable_group | Groups of deformable offset. |
| int | group | Split input into groups. input_channel should be divisible by the number of groups. |
| int | im2col_step | DeformableConv2d uses im2col to compute the convolution. im2col_step splits the input and offset into chunks to reduce the memory usage of the column buffer. |
- inputs[0]: T
- Input feature; 4-D tensor of shape (N, C, inH, inW), where N is the batch size, C is the number of channels, inH and inW are the height and width of the data.
- inputs[1]: T
- Input offset; 4-D tensor of shape (N, deformable_group * 2 * kH * kW, outH, outW), where kH and kW are the height and width of weight, outH and outW are the height and width of offset and output.
- inputs[2]: T
- Input weight; 4-D tensor of shape (output_channel, input_channel, kH, kW).
- outputs[0]: T
- Output feature; 4-D tensor of shape (N, output_channel, outH, outW).
- T:tensor(float32, Linear)