v1.11.0
yosefe released this 26 Jul 21:35
1.11.0 (July 26, 2021)
Features:
Core
Added support for UCX monitoring using virtual file system (VFS)/FUSE
Added support for applications with static CUDA runtime linking
Added support for a configuration file
Updated clang format configuration
UCP
Added rendezvous API for active messages
Added user-defined name to context, worker, and endpoint objects
Added flag to silence request leak check
Added API for endpoint performance evaluation
Added API - ucp_request_query
Added API - ucp_lib_query
Ported connection manager to a new UCT API
Added multi-lane bandwidth optimizations for new protocols
Added support for multi-rail over lanes with BW ratio >= 1/4
Added support for tracking outstanding requests and aborting those in case of connection failure
Refactored keep-alive protocol
Added device id to wireup protocol
Added support for up to 128 transport layer resources in UCP context
Added support for CUDA memory allocations with ucp_mem_map
Increased UCP_WORKER_MAX_EP_CONFIG to 64
Adjusted memory type zcopy threshold when UCX_ZCOPY_THRESH set
Refactored wireup protocols, rendezvous, get, zcopy protocols
Added put zcopy multi-rail
Improved logging for new protocols
Added system topology information
Added new protocols for eager offload
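As an illustration of the "BW ratio >= 1/4" criterion in the multi-rail item above, here is a minimal sketch, not the UCP implementation. It assumes the ratio is measured against the fastest lane; the function name, the `MIN_BW_RATIO` constant, and the array-based interface are hypothetical.

```c
#include <stddef.h>

/* Assumption: a lane qualifies for multi-rail if its bandwidth is at least
 * 1/4 of the fastest lane's bandwidth. */
#define MIN_BW_RATIO 0.25

/* Hypothetical helper: sets selected[i] to 1 for each qualifying lane and
 * returns the number of qualifying lanes. */
size_t select_multirail_lanes(const double *lane_bw, size_t num_lanes,
                              int *selected)
{
    double max_bw = 0.0;
    size_t count  = 0;
    size_t i;

    /* Find the fastest lane */
    for (i = 0; i < num_lanes; ++i) {
        if (lane_bw[i] > max_bw) {
            max_bw = lane_bw[i];
        }
    }

    /* Keep lanes whose bandwidth is within the allowed ratio */
    for (i = 0; i < num_lanes; ++i) {
        selected[i] = (max_bw > 0.0) &&
                      (lane_bw[i] >= (max_bw * MIN_BW_RATIO));
        count      += (size_t)selected[i];
    }

    return count;
}
```

With lane bandwidths of 100, 50, 25, and 24 (in any common unit), the first three lanes would be selected under this assumption and the last one skipped.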
UCT
Extended connection establishment API
Added active message (AM) alignment in iface params
Added active message short IOV API
Added support for interface query by operation and memory type
Added API to get allocation base address and length
Added md_dereg_v2 API
UCS
Added log filter by source file name
Added checking for last element in fraglist queue
Added a method to get IP address from sockaddr
Added memory usage limits to registration cache
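To show what getting an IP address from a sockaddr involves, the sketch below uses only POSIX calls. It is not the UCS function itself; the helper name `sockaddr_ip_str` and its signature are hypothetical.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stddef.h>
#include <sys/socket.h>

/* Hypothetical helper (not the UCS API): writes the textual IP address of
 * sa into buf. Returns buf on success, or NULL if the address family is
 * unsupported or the buffer is too small. */
const char *sockaddr_ip_str(const struct sockaddr *sa, char *buf, size_t len)
{
    const void *addr;

    switch (sa->sa_family) {
    case AF_INET:
        addr = &((const struct sockaddr_in*)sa)->sin_addr;
        break;
    case AF_INET6:
        addr = &((const struct sockaddr_in6*)sa)->sin6_addr;
        break;
    default:
        return NULL;
    }

    /* inet_ntop returns NULL on failure, e.g. if len is too small */
    return inet_ntop(sa->sa_family, addr, buf, (socklen_t)len);
}
```

A buffer of `INET6_ADDRSTRLEN` bytes is large enough for both IPv4 and IPv6 results.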
UCM
Improved x86 parser to recognize some mov flavors
CUDA
Added registration for whole CUDA allocations
Added CUDA-IPC keepalive
Adjusted performance estimations
Improved logging
Added allocation methods for CUDA pinned/managed memory
Added support for a global cuda_ipc cache
RDMA CORE (IB, ROCE, etc.)
Added report of QP info in case of completion with error
Refactored FC send operations
Added support for DevX unique QPN allocation
Optimized endpoint lookup for DCI
Added support for RDMA sub-function (SF)
Added support for DCI via DEVX
Added DCI pool per LAG port
Added support for RoCE IP reachability check using a subnet mask
Added active message short IOV for UD/DC/RC mlx, UD/RC verbs
Added endpoint keep alive check for UD
Suppressed warning if device can't be opened
Added support for multiple flush cancel without completion
Added ignoring of devices with invalid GID
Added support for SRQ linked list reordering
Added flush by flow control on old devices
Added support for configurable rdma_resolve_addr/route timeout
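The subnet-mask reachability check listed above can be pictured as follows. This is a generic IPv4 sketch under stated assumptions, not the UCT code: the function name `ipv4_same_subnet` and the prefix-length parameter are hypothetical, and two addresses are treated as reachable when they share the same masked network part.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <sys/socket.h>

/* Hypothetical helper: returns 1 if both IPv4 addresses fall in the same
 * subnet for the given prefix length (0..32), 0 if they do not, and -1 on
 * a parse error or an invalid prefix length. */
int ipv4_same_subnet(const char *local_ip, const char *remote_ip,
                     unsigned prefix_len)
{
    struct in_addr local, remote;
    uint32_t mask;

    if ((prefix_len > 32) ||
        (inet_pton(AF_INET, local_ip, &local) != 1) ||
        (inet_pton(AF_INET, remote_ip, &remote) != 1)) {
        return -1;
    }

    /* Build the mask in network byte order; a zero-length prefix is handled
     * separately because shifting a 32-bit value by 32 is undefined. */
    mask = (prefix_len == 0) ?
           0 : htonl((uint32_t)(UINT32_MAX << (32 - prefix_len)));

    return (local.s_addr & mask) == (remote.s_addr & mask);
}
```

For example, 10.0.1.5 and 10.0.2.5 differ under a /24 mask but match under a /16 mask.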
Shared memory
Added active message short IOV support for posix, sysv, and self transports
TCP
Added support for peer failure in case of CONNECT_TO_EP
Added support for active message short IOV
Java
Added full support for UCP Java API
Tests
Added length/mem_type for UCP client server example
Added port sockaddr tests for a new API
Added send-recv test between client/server with different UCX_IB_NUM_PATHS
Added support for CUDA and CUDA managed memory in io_demo
Added support for a custom watchdog timeout from command line
Extended memtype hook tests
Tools
Added UCP active message support to perftest
Added error handling option to perftest
Added wakeup option
Added performance tests for am short iov
CI
Added RHEL 7.6 with MOFED 4.7
Added Fedora 34, RHEL 7.2, 7.4
Added PGI support from HPC-SDK module
Added docker image with CUDA 11.2
Added IODEMO test
Added Ubuntu 20.04
Added test for connection manager fallback in client-server testing
Added loopback interface for tcp testing
Bugfixes:
Build
Fixes in libnuma detection macro
Fixes for cross compilation support
Fixes for --without-dc compilation
Continuous Integration
Fixes in Azure pipeline build system
Fixes in Coverity CI
Fixes in Azure release pipeline
Packaging
Fixes in DEB package: added essential system dependencies
Documentation
Fixes in UCP, UCT, Readme, FAQ, and Read-the-docs documentation
Tests
Fixes in CMA peer failure test
Fixes in SRQ tests
Fixes in the usage of requests_wait
Fixes in test_uct_query
Fixes addressing race conditions on client user data in test_uct_sockaddr
Fixes in IODEMO app
Fixes in error handling flow for perftest
Fixes in perftest batch tests
Fixes addressing hang issues for rendezvous protocol in UCP client server example
UCP
Fixes in endpoint error handling
Fixes in error reporting for failed CM lanes
Fixes in progress worker flush
Fixes in rendezvous pipeline flow
Fixes in recursive protocol selection
Fixes in error handling for AM_ZCOPY
Fixes in length check condition in RMA PUT short
Fixes in failure handling for rendezvous offload send
Fixes in offload completion with inlined data
Fixes in statistics calculations for rendezvous protocol
Fixes in ucp_worker_query() thread mode for SERIALIZED
Fixes preventing leaks of UCP requests
ROCM
Fixes in device memory registration and de-registration
Fixes in missing mem_query definition for rocm_copy
Fixes addressing build failure due to const violation
Fixes in sockaddr_accessibility test for rocm_copy and rocm_ipc
Fixes in bandwidth estimation for rocm_ipc
RDMA CORE (IB, ROCE, etc.)
Fixes addressing deadlock between DCI resources and RDMA_READ credits
Fixes in DSCP for RoCE DCT
Fixes in flush(cancel) flow
Fixes preventing segfault in uct_rdmacm_cm_ep_str
Fixes in scatter-gather entries logging
Fixes for compilation with experimental verbs
Fixes in UD dgid filtering
Fixes in destroying domain resources
Fixes in PCIe bandwidth calculation
Fixes addressing CQ creation failure using legacy ibv API
Fixes in iov2sge converter
Fixes in port width check on HDR100
Fixes in SL selection
Fixes in hardware tag matching compilation
Fixes in uct_rdmacm_cm_cqs hash key
Fixes for compilation with rdma-core 20
Java
UCT
Fixes in reachability of loopback ifaces
Fixes addressing possible uninitialized memory accesses
Fixes in error flow for endpoints created upon receiving connection request
Fixes in TCP keepalive to avoid false-positive error detection
UCM
Fixes addressing heap corruption caused by ucm_set_event_handler()
Fixes in mmap events test