forked from wolfpld/tracy
-
Notifications
You must be signed in to change notification settings - Fork 0
/
NEWS
1034 lines (960 loc) · 49.5 KB
/
NEWS
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
790
791
792
793
794
795
796
797
798
799
800
801
802
803
804
805
806
807
808
809
810
811
812
813
814
815
816
817
818
819
820
821
822
823
824
825
826
827
828
829
830
831
832
833
834
835
836
837
838
839
840
841
842
843
844
845
846
847
848
849
850
851
852
853
854
855
856
857
858
859
860
861
862
863
864
865
866
867
868
869
870
871
872
873
874
875
876
877
878
879
880
881
882
883
884
885
886
887
888
889
890
891
892
893
894
895
896
897
898
899
900
901
902
903
904
905
906
907
908
909
910
911
912
913
914
915
916
917
918
919
920
921
922
923
924
925
926
927
928
929
930
931
932
933
934
935
936
937
938
939
940
941
942
943
944
945
946
947
948
949
950
951
952
953
954
955
956
957
958
959
960
961
962
963
964
965
966
967
968
969
970
971
972
973
974
975
976
977
978
979
980
981
982
983
984
985
986
987
988
989
990
991
992
993
994
995
996
997
998
999
1000
Note: There is no guarantee that version mismatched client and server will
be able to talk with each other. Network protocol breakages won't be listed
here.
v0.x.x (xxxx-xx-xx)
-------------------
- Support for pre-0.8 traces has been dropped.
- Profiled programs will ignore dlclose() calls.
- Added warning when the profiler interface is run with privilege elevation.
Advice is given to instead run the client with admin rights.
- Switched to official ZEN4 uarch data.
- Handle cases when thread name is set, but not through Tracy facilities.
- Allow customization of source location data through the following macros:
- TracyFunction - defaults to __FUNCTION__
- TracyFile - defaults to __FILE__
- TracyLine - defaults to __LINE__
- Tracy on Linux now targets and requires Wayland by default.
- Please don't ask about window decorations on Gnome. Current behavior is
the intended behavior. Gnome does not want windows to have decorations,
and Tracy respects this choice. If you find this problematic, use a
desktop environment that actually listens to its users.
- Pass LEGACY=1 parameter to make, if you want to instead rely on the GLFW
library, like before.
- Other platforms still use GLFW.
- Compare traces menu can now display source code differences between two
traces.
- Files containing assembly listings are now annotated with source line
information.
- Histogram tooltip will now also show left/right counts.
v0.9.0 (2022-10-26)
-------------------
- Attention! All the header and source files used for integrating Tracy with
applications were moved to the public/ directory. This will break your
integration!
- To fix this, update the source and include directories lists to point to
the new location.
- Tracy include files directly referenced by the client were moved to
tracy/ subdirectory, to facilitate setups which previously had Tracy
checkout parent directory in the include paths list (i.e. when you
included "tracy/Tracy.hpp").
- Previously, if you have included the Tracy checkout directory in your
project include directories list (i.e. you could include "Tracy.hpp"),
this could result in third-party library conflicts, e.g. with ImGui.
Such scenarios are no longer the case.
- Tracy macros now require to be terminated with a semicolon.
- The undocumented ___tracy_demangle() function API has been changed. Please
refer to the source code for further instructions.
- The parameter callback and its registration macro have been extended to
include user data pointer. You will need to update your code accordingly.
- Plots visualization has been improved.
- Each plot now has its own color, which can also be defined by the user.
- The area below the plot is now optionally filled with a color.
- Plots can now also be configured to be staircase instead of smooth. This
new setting is appropriate for many inputs where only discrete values
make sense, e.g. the memory allocation plot.
- The API for TracyPlotConfig() macro has been changed. Please refer to
the manual to see how you can fix this.
- Some text labels in the user interface are now more easy to read.
- The profiler will now instruct the user in the UI on what can be done, if
the send queue is slow to process (typically due to symbol resolution).
- If a client with an incompatible protocol is discovered, Tracy will now
try to show which versions can be used to handle the connection.
- Messages list in zone info window can now show messages exclusive to the
zone, filtering out the messages emitted from child zones.
- Added capture of vertical synchronization timings on Linux.
- The range of frame bar colors in the frames overview on top of the screen
can be now controlled with the "Target FPS" entry box in the options menu.
- The "Draw frame targets" option does not need to be selected.
- Previously the hardcoded FPS target thresholds were: 30, 60, 144 FPS.
- Currently the FPS target threshold is: half of target, target, twice the
target.
- Reworked the way zone names are shortened.
- Previously shortening supported only namespace removal, in a way that
didn't consider function parameters or template arguments.
- Shortening to one-letter namespace chains is no longer available.
- The new shortening rules first perform normalization of the function name.
- The function const qualifier is removed.
- Common return types are removed.
- All function parameters and all template arguments are removed.
- The next steps consist of repeated removal of namespaces, starting with
the most outermost one.
- While the old process was all or nothing, the new implementation by
default will dynamically adjust to the space available, trying to show
the most context possible.
- It is also possible to completely disable shortening, or require that it
is always performed in full.
- Function name normalization is enabled by default, even if there is space
to show full function name. This can be changed in options.
- Previously shortening was only applied to the zone names displayed on the
timeline. Currently this process will also apply to all other places in
the UI where function names are displayed. However, in these cases the
function names will only be normalized.
- Full function names are still available as tooltips, or in fine print if
the normalized name is already displayed in a tooltip.
- This functionality is disabled if zone name shortening is disabled.
- Added context menu for timeline labels. Currently the only option is to hide
the selected thread, plot, etc.
- You can now provide custom source file contents through a profiler callback.
- Exposed Tracy version to client applications (available through the
common/TracyVersion.hpp header file).
- D3D12 instrumentation is now thread-safe.
- Timeline can be now navigated with WASD keys.
- Symbol file paths are now normalized on libbacktrace systems. For example,
instead of "/usr/bin/../lib64/gcc/x86_64-pc-linux-gnu/12.2.0/../../../../
include/c++/12.2.0/bits/std_mutex.h" Tracy will now report such file as
"/usr/include/c++/12.2.0/bits/std_mutex.h".
- The import-chrome utility interprets Instant (`i`/`I`) events where the
`name` field contains the word `frame` as a frame event. The `name` is the
frame set name.
- Frame data won't be displayed if there was no frame instrumentation in the
profiling session.
- Note that some automated functionality (e.g. vertical synchronization
capture) may automatically generate frame data, which will force frames to
be displayed.
- Tracy threads will now be collapsed by default on the timeline.
- Clicking on a local thread in the CPU data view will make the thread visible
and uncollapsed on the timeline.
- Assembly view is now in color.
- The profiler UI will no longer unnecessarily redraw the screen if nothing
was changed. This should have a profound impact on power usage.
- Added microarchitecture data for Zen 4.
- Implemented optional propagation of inline cost down the local call stack.
- This feature may be useful when trying to get a general outlook of the
cost at the top-level function in the symbol.
- It is possible to get nonsense data when this is enabled, for example
total cost exceeding 100%. This is by design.
- Assembly line costs are not affected.
- Available clients now also broadcast their PID.
- Reversed mouse button assignments for jumping to source / assembly line in
symbol view. The left mouse button will now focus the target line.
- Assembly lines tooltip will now display local call stack of inline functions
(within the symbol).
- Right-clicking the source location entry in assembly line will show the
local call stack, along with source code preview of each entry and ability
to navigate to any selected inline function.
- The profiler UI will now indicate that it needs attention if the window is
not focused and something interesting happens. For example when a connection
is established, or when a saved trace finishes loading, etc. How the
attention request is indicated depends on the operating system.
- Clicking on the red microarchitecture icon in the symbol view assembly pane
will switch the selected microarchitecture to one the profiled application
was running on.
- Removed option to display instruction latencies in a graphical form. Latency
data is still available in instruction tooltip.
v0.8.2 (2022-06-28)
-------------------
- Added support for debuginfod debug information services. Note that
since this depends on proper system configuration, vendors providing
the debug information, and network retrieval, it is disabled by
default. To enable, compile the profiled application with the
TRACY_DEBUGINFOD define and link with libdebuginfod.
- When Tracy server-side utilities are build with MSVC, the required
libraries will be now automatically retrieved and built with vcpkg.
- Added microarchitecture data for: Bonnell, Airmont, Goldmont, Goldmont
Plus, Tremont.
- Recognize additional CPUIDs of Zen 3, Alder Lake, Ice Lake
microarchitectures.
- Assembly line width will be now extended, if needed. Previously the line
width was calculated for the initial layout and changing amount of
displayed data (especially listing the read/written registers) didn't
affect this, which may have made some lines partially unreadable.
- Added ability to filter call stacks in memory tab by inactive allocations.
Filtering by inactive allocations helps to pinpoint wasteful allocations
in the program.
- Plot graph will no longer display min/max values interpolated for
animation, but rather true values.
- The CPU topology tree structure was replaced by a CPU schematic showing
the same thing in a more concise way.
v0.8.1 (2022-04-21)
-------------------
- Support for pre-0.7 traces has been dropped.
- Update utility can now scan for source files missing in the trace cache,
if the '-c' parameter is given. Found files will be added to the cache.
- Added high-priority queue for fast queries to bypass slow symbol queries.
- Fixed Android documentation to show how to enable context switch tracing.
- Workaround MSVC 2015 stupidity which prevented compilation as C++11.
- Added support for showing branch cost data for CPUs that don't report
branch retirement events (but do report branch misses).
- The right-click context menu available for jump arrows in the symbol view
window will now additionally display jump context, i.e. jump sources and
jump target source code fragments.
- Added freedesktop.org compliant desktop entry and MIME type definition.
- The call stack column in list of messages will now be only displayed when
at least one message on the list has call stack data.
- File dialogs on Unix will be now native to the desktop environment you are
using. Note that this relies on xdg-desktop-portal and dbus.
v0.8.0 (2022-03-28)
-------------------
- Support for Cygwin has been dropped. It was not working for a very long
time and nobody had complained about it.
- Mingw is deprecated due to lack of interest.
- Added TRACY_NO_CALLSTACK_INLINES macro to disable inline functions
resolution in call stacks on Windows.
- Improved function matching algorithm in compare traces view.
- Added CMake integration.
- Reworked rpmalloc initialization.
- Fixed display of messages with newlines on messages list.
- Excluded some uninteresting wrapper functions from call stacks (for
example SIMD pass-through intrinsics to the compiler built-ins).
- Adjusted coloring of instruction hotness in symbol view.
- Properly handle rare cases when sampling on Linux is momentary not able to
resolve time stamps.
- Added Rocket Lake microarchitectural data.
- Updated CPU identifier lists.
- Implemented GPU timer overflow handling heuristics.
- Assembly instructions are now assigned to inline symbols.
- You can not only see the assembly source file and line, but also the
originating function.
- If symbol view is restricted to a single inline function, all assembly
instructions not in this context will be dimmed out.
- Likewise, the navigation in assembly code will be limited just to the
inline context, if a single function is selected.
- Kernel call stacks will be now properly captured and displayed in the
profiler. Kernel functions are marked with the red color.
- The CPU hardware performance counters can be now sampled on Linux.
- Three inferred statistics are displayed for lines in both source and
assembly code in the symbol view window:
- Instructions executed per cycle.
- Branch miss rate.
- Cache miss rate.
- Instruction cost estimation method is no longer tied to software call
stack sampling.
- The image name filter entry field is now providing a list of available
images.
- Reentrant function calls may be now excluded from calculations in the
statistics view.
- Crash handler is now properly removed during profiler destruction.
- Repeatedly right-clicking on the same source line in the symbol view
window will now cycle through assembly blocks associated with this source
line.
- Vulkan headers must be now explicitly included before including
TracyVulkan.hpp.
- The capture utility may now limit capture time to a specified number of
seconds.
- Fixed message thread assignment in the import-chrome utility.
- Sampling data can be now also found in the find zone menu.
- Instrumentation failures may now display their context, e.g. the zone text
that was to be set.
- A warning is now displayed when sampling data is out-of-order.
- Average value for plots can be now viewed.
- Moved symbol resolution to a separate thread. Profiling will no longer be
stuck when there is a large number of symbols to resolve. This not only
improves user experience, but also prevents buildup of data (and memory
consumption) on the client side.
- Android device name will be now reported.
- Added support for capturing fibers.
- Fibers require additional processing, which has to be enabled by adding
the TRACY_FIBERS define on the client side.
- Client code requires additional instrumentation using the new macros
TracyFiberEnter and TracyFiberLeave (or the corresponding C API
variants).
- Fibers are represented in traces as separate threads, and are
distinguished by green color. Faux context switch regions are used to
indicate when a fiber is being run by the worker thread.
- Continuous frame marks no longer need to be issued from a single thread.
- Context switch call stacks are now captured on Windows and Linux.
- Hovering the context switch wait region will now display wait stack,
which may provide additional insight into why the switch happened.
- Wait stacks inspection can be performed in a new view.
- Stacks can be limited to certain threads and to a selected time range.
- Stacks are presented either as a sorted list, or as a bottom-up and
top-down trees.
- Entry call stacks can be now also viewed as a bottom-up and top-down
trees.
- Updated project build files to MSVC 2022.
- Call stack tooltips now also show the executable image name.
- Playback frames can be now changed by interacting with the frame image
slider using the mouse wheel.
- Signal used to handle crashes on Linux can be now redefined.
- Various DPI scaling improvements.
- User interface can be now scaled in run time.
- Symbol code retrieval now also supports kernel on Windows.
- Added low-level C API interface for GPU zones.
- Symbol child calls can be now listed.
- Replaced "restrict time" in memory window with a proper time range limit.
- Added Alder Lake microarchitectural data.
- Added GPU zone statistics.
- Universal Windows Platform support.
- All call stack related functionality can be now disabled with the
TRACY_NO_CALLSTACK macro.
- Added ability to add full-view annotations from the annotations list
window.
v0.7.8 (2021-05-19)
-------------------
- Updated Zen 3 and added Tiger Lake microarchitectural data.
- Manually disconnecting from the server will no longer display erroneous
warning message.
- Added ability to display sample time spent in child function calls.
- Fixed issue which may have prevented sampling on ARM64.
- Added TRACY_NO_FRAME_IMAGE macro to disable frame image compression
thread.
- Ctrl and shift keys will now modify mouse wheel zoom speed.
- Improved user experience in the symbol view window.
- Added support for Direct3D 11 instrumentation.
- Vulkan contexts can be now calibrated on Linux.
- Support loading zstd-compressed chrome traces.
- Chrome traces with multiple PID entries (and possibly conflicting TIDs)
can be now imported.
- Added support for custom source location tag ("loc") in chrome traces.
- Sampling frequency can be now controlled using TRACY_SAMPLING_HZ macro.
- Trace compression can be now selected when saving a trace.
- If a trace cannot be saved, a failure dialog will be displayed.
- Run-time memory usage of frame images can be reduced by calculating
a compression dictionary. This can be only performed when a trace is saved
or through the update utility.
v0.7.7 (2021-04-01)
-------------------
- Linux crash handler will now also catch SIGABRT.
- Fixed invalid name assignment to source files discovered client-side.
- Added ability to check if a zone is active (which may be used to avoid
preparing zone text, etc., as it wouldn't be used anyway).
- Improved sorting behavior of internal vectors.
- Some data will now be always properly displayed during live capture.
This was not particularly visible before, as it mainly concerns edge
cases.
- Sorting is performed only as needed.
- In case of plots the performance during live capture may be decreased,
as these were sorted with at least 0.25 second intervals before. Now
the sorting is performed every frame.
- Some other data, which previously was not sorted, is sorted now.
- In headless capture mode sorting will be only performed when the trace
is saved to disk.
- Fixed some typos in macros.
- Fixed handling of non-ANSI file names on Windows. You can now name your
traces 'ęśąćż.tracy' and it should work as intended. This is supported on
Windows 10 release 1903 and newer.
- Fixed sending GPU context name in on-demand mode.
- Fixed color channel order in ZoneColor() macro.
- Handle failure state when a memory pointer allocation is reported twice,
without an intermediate free.
- Renamed "call stack parents" to "entry call stacks".
- Display number of entry call stacks in assembly line sample count tooltip.
- Added tooltips with preview of source code in various places in the UI.
v0.7.6 (2021-02-06)
-------------------
- Various fixes in build scripts.
- Fixed a faulty rpmalloc initialization path when the first thing the
thread did was sending a message with call stack.
- Added fallback timer define for various virtualized environments, which
may not be able to access the hardware timer registers. This will result
in usage of timer provided by the standard library, with reduced
resolution.
- Further OpenCL improvements.
- Updated libbacktrace.
- Adds Mach-O 64-bit FAT support.
- Fixes memory corruption when processing Mach-O data.
- Fixes missing matching entries during binary search.
- Adds support for MiniDebugInfo.
- Adds fallback to ELF symbol table if no debug info is available.
- Various other fixes.
- Store build time of profiled program in captures.
- GPU contexts can be now named.
- Implemented client -> server source code transfer.
v0.7.5 (2021-01-23)
-------------------
- More robust handling of system tracing on Android.
- Added warning dialog when the connection is lost before all needed data
can be retrieved.
- Fixed handling of NaN plot entries (by skipping them).
- Dynamic zone colors are now supported through the ZoneColor() macro.
- Fixed Arm machine code printout to match the one printed by objdump.
- Fixed client memory corruption when using colored messages.
- Switched to the next-gen ImGui table UI.
- Table columns can have their order rearranged, can be hidden, can be
sorted both in ascending and descending order (where appropriate).
- Table columns state is now preserved between runs.
- Various fixes related to restricting listening to localhost.
- Improved compatibility of ETW tracing with non-MSVC compilers.
- Fixed Vulkan call stack transfer.
- Added support for transient GPU zones (OpenGL, Vulkan, Direct3D 12).
- OpenCL fixes for assert-less builds and non-active zones.
- Added support for thread names and title bar description in traces
imported from chrome tracing format.
v0.7.4 (2020-11-15)
-------------------
- Added support for user-provided locks to keep dbghelp calls thread-safe.
- Call stacks can be now copied to clipboard.
- Allow more control over which automated captures are performed.
- Added textual descriptions for some assembly instructions.
- Profiler memory usage is now also displayed as a percentage of available
physical memory.
- Microarchitecture mismatch is now clearly displayed in the source view
window.
- Added Zen 3 and Cascade Lake microarchitectural data.
- Ghost zones are now supporting all zone coloring modes and namespace
shortening.
- Extend C API to support memory pools.
- Frame rate targets can be now visually represented on the timeline view.
v0.7.3 (2020-10-06)
-------------------
- Properly support DPI scaling on Linux (requires GLFW 3.3).
- Added early checks for output file validity in the capture utility.
- Improvements to presence broadcast handling.
- Custom zone colors can be optionally ignored.
- Added support for tracking multiple memory pools.
- Memory free failure dialog can now show call stack pointing to the failure
location.
- Added support for Wayland on Linux.
- If during the first 5 seconds of the trace there are no frames being
reported, the profiler will switch to following last 5 seconds of the
trace, instead of displaying three last frames.
v0.7.2 (2020-09-14)
-------------------
- Note: the bitbucket repository is obsolete and will soon stop receiving
updates. Migrate to https://github.com/wolfpld/tracy, if you haven't
already.
- The "waiting for connection" dialog no longer has "cancel" button. To
abort connection attempt just use the "close window" button.
- Added update notification.
- The most recent traced events can be now viewed regardless of timeline
zoom level.
- Fixed going-to-line in source view (again).
- Crash handling on client is now not performed, if there is no active
connection.
- Added ability to listen only on IPv4 interfaces.
v0.7.1 (2020-08-24)
-------------------
- Dropped support for pre-v0.6 traces.
- Fixed regression on non-AVX2 CPUs.
- Fixed incorrect calculation of some ghost zones.
- Added list of cached source files.
- Added import of plot data.
- Secure versions of alloc/free macros.
- Automated tracing of vertical synchronization on Windows.
- Fixed attachment of postponed frame images.
- Source location data can be now copied to clipboard from zone info window.
- Zones in find zones menu can be now grouped by zone name.
- Vulkan and D3D12 GPU contexts can be now calibrated.
- Added CSV export utility.
- "Go to frame" popup no longer has a dedicated button. To show it, click on
the frame counter.
- Added macro for checking if profiler is connected.
- Implemented optional data removal from traces in the update utility.
- Allow manual management of profiler lifetime.
- Adjusted priority of ETW threads to time critical.
- Annotations can be now freely adjusted on the timeline.
- Limiting time range for find zone functionality has been significantly
improved.
- Added time range limits for statistics and symbol view.
- Implemented call stack sampling on Linux (including Android).
- Exact time from start of profiling session can be now viewed by hovering
the mouse over the time scale.
- Code transfer can be now compiled-out.
- Added support for zone markup in unloadable modules.
- Added image name filter to sampling statistics results window.
v0.7 (2020-06-11)
-----------------
This is the last release which will be able to load pre-v0.6 traces. Use the
update utility to convert your old traces now!
- chrome:tracing importer now imports zone metadata from "args" key.
- Added display of statistical mode to find zone menu.
- Automatic stack sampling is now available on windows.
- Properly handle tracing on long-running systems.
- Message list entries can now show associated frame image.
- Call stack window will now display module names.
- Symbol location in call stack window may now also display symbol address.
- Statistics menu can now be used to display call stack sampling data or
list available symbols.
- All call paths leading to the sampled instruction in a call stack can be
now displayed.
- Frame image compression ratio (lossless in-memory compression, not taking
into account DXT compression) is displayed in playback window.
- Allow reconnection straight from the discard data dialog.
- Added ability to set custom names for locks.
- Improved handling of network ports.
- Added time percentage display to instrumentation statistics.
- Display of ghost zones (generated from automated call stack sampling).
- Notify when empty labels display is enabled.
- Small fragments of executable code will be now sent from client to server.
- Added notification about query backlog.
- Fixed performance problem with query backlog.
- Display number of in-flight queries, in addition to query backlog.
- Improved failure reports.
- The capture utility will connect to localhost by default.
- Added optional support for QPC timer on windows.
- Complete rewrite of source file viewer. It is now 100% reliable when going
to a source location.
- Symbol source view was added.
- Extension of source file viewer.
- Can display source file, assembly view, or both at the same time.
- May include display of statistical profiling data.
- Ability to switch between source files which were used to build the
symbol.
- Ability to switch between inlined functions which are incorporated into
the symbol.
- Graphical representation of control flow in program.
- Display of micro-architectural data for each assembly instruction.
- Tracking register dependencies between assembly instructions.
- Disassembly may be saved to a file, in order to be processed by external
tools.
- If the default listening port is occupied, profiler will now try listening
on other ports.
- Added possibility to perform source file names substitution.
- Profiler windows can be now docked.
- CPU usage tooltip now displays a list of running threads.
- Added possibility to filter discovered clients list.
- Source files are now cached during capture.
- Profiler will now display a popup when application crashes.
- Added ability to send simple integral values as extra payload for zones.
- Per-frame zone times on the frames plot can now display self time.
- Ability to bind only on localhost interface.
- OpenCL profiling.
- Direct3D 12 profiling.
v0.6.3 (2020-02-13)
-------------------
- Fixed performance issues with loading saved traces on Ryzen CPUs.
- Profiler window contents are now properly updated during window resize.
- Improved tid to pid mapping on windows.
- Zero length and unfinished zones are no longer taken into account for
statistics.
- Build files for shared library are now available (experimental).
- GPU zones now also have "active" parameter.
- Further reduction of memory usage and on-disk trace size.
- Replaced ska::flat_hash_map with robin-hood-hashing.
- Speed-up rendering of long lists of items.
- Exact event time is displayed in some places in the UI.
- Memory allocation lists can now be sorted.
- Added display of trace file compression ratio.
- Optional Zstd compression of trace files.
- Frame images are now internally compressed using Zstd (instead of LZ4).
- Fix display of continuous frame set tooltips.
v0.6.2 (2019-12-30)
-------------------
- Improved call stack decoding on OSX.
- Collection of CPU topology data.
- C API now supports allocated source locations.
- Added chrome:tracing importer.
- Allow merging of ZoneText() strings.
- Time distribution can now show both exclusive and inclusive times.
- Display proper value of selection time in find zone menu.
- Implemented limiting find zone search to a specified time range.
- Highlight hovered zone from find zone menu zone list on the histogram.
- Allow copying user data directory location to the clipboard.
v0.6.1 (2019-11-28)
-------------------
- Dropped support for pre-v0.5 traces.
- Improve BSD support.
- GPU zone CPU thread highlight will now highlight whole thread, not only
the thread name.
- Added CPU thread highlight for CPU data items.
- Client parameters may be now set from the server.
- Minor UI fixes.
v0.6 (2019-11-17)
-----------------
This is the last release which will be able to load pre-v0.5 traces. Use the
update utility to convert your old traces now!
- Dropped support for pre-v0.4 traces.
- Major memory usage decrease.
- Significant network bandwidth decrease.
- Implemented context switch capture on selected platforms.
- Zone timings in various UI places can now take into account only the
time when the thread was executing.
- Zone information window can now display regions in which thread was
suspended by the operating system.
- CPUs on which the zone was running are enumerated.
- Thread activity regions can be graphed on the timeline.
- API breakage: SetThreadName() now only works on current thread.
- Fixed thread name retrieval after thread is destroyed.
- Added number of CPU cores to host info.
- Limited number of possible source locations to 64K.
- Limited supported capture length to 1.6 days.
- CPU cores are now displayed on the timeline.
- Thread execution workload is displayed, including threads from external
programs.
- Thread migrations across CPU cores can be graphed.
- System-wide workload distribution is now plotted on the timeline.
- Added "CPU data" window showing programs competing for CPU during the
capture.
- Switched to using native thread identifiers (relatively small numbers), as
opposed to pthreads identifiers, which in reality were pointers.
- Improved thread name discovery if context switch capture is enabled.
- Per-trace state is now preserved between profiling sessions:
- Timeline view position.
- Item categories draw/hide settings.
- Timeline zones will be highlighted using a different color, when a
matching time range is selected on histogram.
- Per-frame zone times are now displayed on the frames plot when a zone is
selected in the find zone menu.
- Zone color is now displayed in zone information window.
- Zone colors can now be determined basing on depth and thread or source
location.
- Thread colors are displayed across the profiler application.
- Frame times can be now compared.
- Expose more lock handling functionality.
- Network port can be now specified by the user.
- Proper handling of multithreaded Vulkan code.
- Added extreme compression level in update utility.
- Added time distribution data in the zone information window.
- Trace file name is now displayed in trace information window.
- Annotations can be now added to the timeline.
- Server now performs network data retrieval and decompression on a dedicated
thread.
- Added examples of Tracy integration.
- Allow grouping of zones in the find zone menu by zone parent or with no
grouping.
- Zone list in the statistics window can be now filtered.
- Implemented configuration of plots.
- Messages can now collect call stacks.
v0.5 (2019-08-10)
-----------------
This is the last release which will be able to load pre-v0.4 traces. Use the
update utility to convert your old traces now!
- Major decrease of trace dump file size.
- Major optimizations across the board.
- Vcpkg is now used for library management on Windows.
- Display dump file size change in the update utility.
- Added notification area.
- Display trace loading time.
- Display background processing tasks after trace is loaded.
- Display trace save notification.
- Show crash icon, if there was a crash.
- Added C API.
- Profiling session may now gracefully terminate, due to incorrect
instrumentation. A popup with termination reason will be displayed.
- Call stack improvements.
- Call stack frames now have a proper source file and file line
information on Linux.
- Single call stack frame may now have multiple entries, representing
inlined function calls.
- Call stack grouping in the find zone menu now has a special display
mode.
- Call stack memory allocations tree improvements:
- Add top-down variant to complement the previously available bottom-up
one.
- Add ability to group tree nodes by function name.
- Allow restricting tree to display only active allocations.
- Added support for Lua call stack capture.
- Self time of zones may be now displayed in the find zone menu.
- Added ability to disconnect from a client.
- Find zone groups can now be sorted by mean time per call.
- Zones displayed in the find zone menu can be now grouped by order of
appearance, execution time or name.
- Time is now displayed without trailing fractional zeros (e.g. "2.5 ms"
instead of "2.50 ms").
- Child zones displayed in zone info window can be now grouped by source
location.
- Selected or hovered lock is now highlighted on the timeline.
- Locks are now grouped into single and multithreaded (contended and
uncontended) in the options menu locks list.
- On broken platforms the profiler can now be initialized as needed (and
possible), taking a performance and functionality hit.
- User experience improvements in the graphical profiler.
- Thread position and height is now animated, to eliminate flickering that
was happening when depth of displayed zones was changing.
- Zooming in/out using the mouse wheel is now animated.
- Plot range adjustment is now animated.
- Various other UI improvements.
- System CPU usage is now being monitored.
- Threads that have nothing to display in the current view are now hidden by
default.
- Dimmed-out the timeline outside the profiling area.
- Source file view can now be opened also from statistics menu.
- Display standard deviation in find zone and compare traces menus.
- Display zone messages in zone information window.
- Display order of threads can be changed in the options menu.
- Prevent deadlocks by querying socket send buffer size.
- Frame set statistics can be now limited to frames visible on the screen.
- Messages can be now colored.
- Zone selection in compare traces menu can be now linked to the other
trace.
- Added support for frame image (screen shot) storage.
- Implemented ability to cut off outliers on histograms.
- Zone or frame that is currently hovered by the mouse cursor will be
highlighted on the histogram.
- Server now displays available clients in the local network.
- Source code whitespace visibility can now be enabled or disabled.
- Profiler will now check if proper timer readings can be performed on
x86/x64.
- Application can now log app-specific information, similarly to how the
host info reports system information.
- Message list will automatically scroll down to the most recent message.
- Feature will disable when the list is scrolled by user.
- To re-enable, scroll to the bottom of the list.
- Message list can be now filtered.
- A notification popup will be displayed during trace cleanup.
- Source file view won't be available if a source file is newer than the
capture.
- Added ability to set custom trace descriptions.
- Added frame time target lines.
- FPS counts are now displayed next to frame times.
- GPU drift value can be now automatically measured.
- Connection window is now a popup hidden under a dedicated button.
v0.4.1 (2018-12-30)
-------------------
- Active frame set can be now switched by clicking on a frame set on the
timeline.
- Add ability to go to a specified frame.
- Most commonly used addresses can be now selected from the drop-down menu.
- Fixed corner case problem with profiler initialization on Windows.
- Added third state (stopped) to the pause/resume button. It will be used
after the connection to the client is terminated.
- Active trace can be discarded.
- Call stack capture may be forced through TRACY_CALLSTACK define.
- Lock info window has been added.
- Time of lock creation and termination is now being tracked.
- Menu bar buttons are now toggles that can also close their corresponding
windows.
- Find zone and compare menu improvements.
- Ability to ignore case during search.
- Pressing enter key will now start search, just like pressing the "find"
button.
- Using the ^F keyboard shortcut will open the find zone menu and focus
the input box.
- Added ability to automatically connect to an IP address in the graphical
profiler application (use "-a address" argument to enable).
- Pressing enter key after entering client address in the welcome dialog
will now automatically begin connection process.
v0.4 (2018-10-09)
-----------------
- Renamed "standalone" utility to "profiler".
- Added trace update utility, which will convert files saved in previous
versions of tracy to be up-to-date.
- Optional high compression (--hc) mode is available that will increase
the compression level, at the cost of considerably longer compression
time.
- Fix regression causing varying size of profiler window for different
captures.
- Added support for on-demand tracing.
- If a client application is compiled with the TRACY_ON_DEMAND macro
defined, tracing will not begin until a connection to server is
established.
- Since data is not fully captured in this mode, the resulting trace will
be less precise, until application state is appropriately reset. For
example, locks need to be fully released, zone stacks need to be
flushed. This is an automatic process.
- All tracing macros are able to work in the on-demand mode.
- Improved compatibility with various system setups.
- Aside from using TRACY_NO_EXIT define you can also set the same-named
environmental variable to 1 to get the same effect.
- Added ability to show/hide all threads and plots.
- Performance improvements.
- Improvements to memory data presentation.
- Added memory allocation info window.
- Selecting memory allocation on a plot will draw time range of the
allocation.
- Middle clicking on an memory allocation address (or on a button in
memory allocation info window) will zoom the view to the allocation
range.
- Find zone menu improvements:
- Zones can be now also grouped by call stacks.
- Zone groups can be now also sorted by time spend in each zone.
- Zone groups list now displays group times.
- Average and median zone times are now displayed on the histogram.
- Selected zones will be highlighted on the timeline view.
- Added named versions of tracing macros that allow specifying scoped
variable name.
- The main profiler window is now kept at the bottom of windows stack.
- The "profiler" utility will now use a custom embedded font.
- Microseconds are now displayed using correct symbol ('μ' instead of 'u').
- Unix builds of the "profiler" utility will now ask for a file name when
saving a trace.
- Progress popup is now displayed when a trace file is loading.
- Zones that share source location with a zone that is hovered over are now
highlighted.
- Added ability to zoom-in to a selection range made using middle mouse
button.
- Holding the ctrl key will switch to zoom-out mode.
- The "profiler" utility will use less resources when its window is
out-of-focus or minimized.
- Added support for cross-DLL profiling.
- Items in options menu (locks, threads, etc.) are now described with number
of events.
- Source location of lock declaration is also provided.
- Created an extensive user manual for the profiler.
- Added ability to capture multiple frame sets.
- Viewer will display multiple frame ranges at once.
- Only one frame set can be active at once. The selected one is used for
the frame navigation graph, frame navigation buttons and drawing frame
separators.
- The active frame set will be highlighted, and the rest will be dimmed
out.
- Frames can now also be discontinuous.
- Frames and zones too small to be displayed will be marked with a zig-zag
pattern.
- General improvements to message list and message markers.
- Hovering over message on a list will highlight its marker (previously it
only worked the other way).
- Left clicking on a message marker will focus the message list on the
selected message.
- Middle clicking on a message marker will center it on screen.
- Added trace information window.
- This includes frame time statistics and histogram.
- Displayed memory sizes are now properly formatted.
- Added call stack tree for memory allocations.
- You can display allocations list for each call stack tree entry.
- The source code of the profiled application may now be viewed in the
profiler.
- BIG FAT WARNING: The actual profiled program source code is not known to
the profiler. It only checks if there is a file on your disk that
matches the file name of the captured source location. Even if the file
is displayed, it may be out of date.
- CPU and GPU zones will have "Source" button, if source file can be
opened.
- Source files for call stack traces can be opened by right-clicking on
the file name. Since in this case there is no button that can be hidden,
a small animation will be played to notify user if the source cannot be
opened.
- The main profiler view will now occupy the whole window. Previous behavior
is still available for embedded use cases.
- Many button labels are now accompanied by icons.
- Fonts should now be less blurry.
- "Go to parent" button in zone info window won't be displayed if there is
no parent to go to.
- Improvements to the compare traces menu.
- There are now colored markers to make it easier to distinguish "this" and
"external" traces.
- The amount of saved time is now displayed (a difference between total
run times of both traces).
- Tracy will now collect host information, like CPU name, amount of system
memory, etc.
- Windows builds of the "profiler" utility will perform a check of supported
CPU instruction set and match it against the one required by the binary
(by default AVX2 is used). If the program cannot be executed on the
processor, a message dialog with workaround instructions will be
displayed.
- Tracy can intercept crashes and finish sending data from a dying process.
- Currently this is only implemented on Windows, Linux and Android.
- Call stack window may now display addresses of the frames, instead of
source file locations.
- Memory events will now properly register their thread.
- Profiler settings are now stored in a persistent location.
- On Windows settings are stored in %APPDATA%/tracy.
- On other platforms settings are stored in $XDG_CONFIG_HOME/tracy or
$HOME/.config/tracy, if the variable is not set.
- The main profiler window position, size and maximized state are saved
and restored.
- The size and position of internal windows now doesn't depend on the
runtime directory of the profiler executable.
- Added connection handshake.
- Server won't be able to connect to client if there's a protocol version
mismatch.
- Client not in on-demand mode will refuse connections after the first
connection was made and the initial event buffers were cleared.
- A single server will no longer try to connect to multiple clients.
- The capture utility will now display time span of the ongoing capture.
v0.3 (2018-07-03)
-----------------
- Breaking change: the format of trace files has changed.
- Previous tracy version will crash when trying to open new traces.
- Loading of traces saved by previous version is supported.
- Tracy will no longer crash when trying to load traces saved by future
versions. Instead, a dialog advising to update will be displayed.
- Tracy will no longer crash in most cases when trying to open files that
are not traces. Some crashes are still possible, due to support of old,
header-less traces.
- Ability to track every memory allocation in profiled program.
- Allocation event queuing must be done in order, which requires exclusive
access to the serialized queue on the client side. This has no effect on
the rest of events, which are stored in a concurrent queue, as before.
- You can search for a memory address and see where it was allocated, for
how long, etc. This lists all matching allocations since the program was
started.
- All active (non-freed) allocations may be listed. This shows the current
memory state by default, but can go back to any point in time.
- Graphical representation of process memory map may be displayed. New
allocations/frees are displayed in a bright color and fade out with
time. This feature also can look back in time.
- Memory usage plot is automatically generated.
- Basic allocation information is displayed in memory plot tooltips.
- A summary of memory events within a zone (and its children) is now
printed in zone info window.
- Support loading profile dumps with no memory allocation data (generated by
v0.2).
- Added ability to display global statistics of a selected zone from the
zone info window.
- Fixed regression with lock announce processing that appeared during
worker/viewer split.
- Allow selecting/unselecting all locks for display.
- Performance improvements.
- Don't save unneeded lock information in trace file.
- Don't save thrash in message list data.
- Allow expanding view span up to one hour, instead of one minute.
- Added trace comparison window.
- An external trace has to be loaded first.
- Zone query in both traces (current and external).
- Both results are overlaid on the same histogram.
- Graphs can be adjusted as-if there was the same number of zones
collected.
- Read time directly from a hardware register on ARM/ARM64, if possible.
- User-space access to the timer needs to be enabled in the kernel, so
tracy will perform run-time checks and fallback to the old method if the
check fails.
- Prevent connections in a TIME-WAIT state from blocking new listen
connections.
- Display y-range of plots.
- Added ability to unload traces loaded from files. To do so close the main
profiler window. You will return to the connect/open selection dialog.
Live captures cannot be terminated this way.
- Zones previously displayed in zone info window are remembered and you can
go back to them. Closing the zone info window or switching between CPU and
GPU zones will clear the memory.
- Improved message list window.
- Messages are now displayed in columns.
- Originating thread of each message is now included in the list.
- Messages can be filtered by the originating thread.
- You can now navigate to next and previous frame.
- Zone statistics can be now displayed using only self times.
- Support for tracing GPU events using Vulkan.
- Timeline will now display "OpenGL context" or "Vulkan context" instead of
"GPU context".
- Fixed regression causing invalid display of GPU context appearance time.
- Fixed regression causing invalid reporting of an active CPU in zone end
events, if MSVC rdtscp optimization was not enabled.
- Ability to collect true call stacks.
- Supported on Windows, Linux, Android.
- The following events can collect call stacks:
- Memory alloc/free.
- Zone begin.
- GPU zone begin.
- Zone stack trace now also displays frames from a real call trace.
- On Linux call stack frame name resolution requires a call to dladdr,
which in turn requires linking with libdl.
- Allow manual entry of GPU time drift value.
- Unix build system no longer shares object files between different build
units.
- Fixes inability to build debug and release versions of a single utility
without "make clean".
- Fixes incompatibility between "standalone" and "capture" utilities due
to different set of used feature flags.
- On Windows "standalone" utility now adapts to system DPI setting.
- Optional per-call zone naming.
v0.2 (2018-04-05)
-----------------
- Fixed broken TRACY_NO_EXIT behavior.
- Visual refresh (new color scheme).
- Added optional support for live in-depth zone analysis.
- Ability to search for zones matching a query.
- Histogram of zone time spans.
- List occurrences of a zone, grouped by thread, or by user text.
- Zone groups can be selected and highlighted on histogram graph.