Skip to content

yohanchatelain/pytracer

Repository files navigation

pytracer log

Pytracer

Pytracer is a python tracer designed to assess the numerical quality of python codes. Pytracer automatically instruments python modules to trace the inputs and outputs of targeted modules' functions. Generated traces are then aggregated and visualized through a Plotly dashboard server.

Usage

Pytracer is divided in three steps: trace, parse and visualize.

usage: pytracer [-h] {trace,parse,visualize,info,clean} ...

Pytracer

optional arguments:
  -h, --help            show this help message and exit

pytracer modules:
  {trace,parse,visualize,info,clean}
                        pytracer modules
    trace               trace functions
    parse               parse traces
    visualize           visualize traces
    info                get info about current traces
    clean               clean pytracer cache

The module "trace" takes as input the python application to trace with the option --command. The list of modules to trace must be added to the configuration file, section modules_to_load. The execution of the application will generate a pickle file that contains all inputs/outputs encountered, called a trace. Pytracer is designed to detect numerical instabilities and works under a stochastic arithmetic environment (see fuzzy). Therefore, the different executions of the targeted application under this environment should result in floating-point values that differ. The parse module attempts to gather these traces and compute statistics on the observed differences. The parse module converts a set of traces into a hfd5 file used for the visualization. Then the visualize module opens a dash server to visualize the traces on a browser and interact with them.

Trace module

The trace module instruments modules in config.modules_to_load and execute the module application.

usage: pytracer trace [-h] --command ... [--dry-run] [--report {on,off,only}] [--report-file FILE]

optional arguments:
  -h, --help            show this help message and exit
  --command ...         command to trace
  --dry-run             Run the module wihtout tracing it
  --report {on,off,only}
                        Report call and memory usage
  --report-file FILE    Write report to <FILE>

The trace command creates a cache folder (\__pytracercache\__) that contains the traces.

Parse module

The parse module aggregates traces and produce a HDF5 file.

usage: pytracer parse [-h] [--filename FILENAME | --directory DIRECTORY] [--format {pickle}] [--batch-size BATCH_SIZE] [--method {cnh,general}] [--online]

optional arguments:
  -h, --help            show this help message and exit
  --filename FILENAME   only parse <filename>
  --directory DIRECTORY
                        parse all files in <directory>and merge them
  --format {pickle}     format of traces (auto-detected by default)
  --batch-size BATCH_SIZE
                        Number of elements to process per batch. Increasing this number requires more memory RAM
  --method {cnh,general}
                        Method used to compute the significant digits: Centered Normal Hypothesis (CNH) or General (see significantdigits package)
  --online              Do not bufferized parsing

Visualize module

The visualize module opens the plotly dashboard server.

usage: pytracer visualize [-h] [--directory DIRECTORY] [--debug] --filename FILENAME --callgraph CALLGRAPH [--host HOST] [--threaded [THREADED]]

optional arguments:
  -h, --help            show this help message and exit
  --directory DIRECTORY
                        directory with traces
  --debug               run dash server in debug mode
  --filename FILENAME   file to visualize
  --callgraph CALLGRAPH
                        Call graph file
  --host HOST           IP to run on
  --threaded [THREADED]
                        Multithreading yes/no

This command produces the following output:

Dash is running on http://127.0.0.1:8050/

 * Serving Flask app "pytracer.gui.app" (lazy loading)
 * Environment: production
   WARNING: This is a development server. Do not use it in a production deployment.
   Use a production WSGI server instead.
 * Debug mode: off
 * Running on http://127.0.0.1:8050/ (Press CTRL+C to quit)

You must open the address in a browser to see the trace.

Info module

The info module list the available traces and aggregation files.

usage: pytracer info [-h] [--directory DIRECTORY] [--trace] [--aggregation]

optional arguments:
  -h, --help            show this help message and exit
  --directory DIRECTORY
                        Directory to get information from
  --trace               Print traces information
  --aggregation         Print aggregations information

Here an example:

========== Traces ==========

           Date:        Mon Nov  1 12:24:20 2021
           Name:        211101122420.494927.pkl
           Path:        /Work/pytracer/.__pytracercache__/traces/211101122420.494927.pkl
           Size:        435.8KB
           Args:        Namespace(command=['pytracer/test/internal/test_basic.py', '--test2', '1'], 
                                  dry_run=False, pytracer_module='trace', report='OFF', report_file='')
     ReportName:        None
     ReportPath:        None
PytracerLogName:        pytracer.log.494927
PytracerLogPath:        /home/yohan/Work/pytracer/pytracer.log.494927

           Date:        Mon Nov  1 12:49:51 2021
           Name:        211101124951.500061.pkl
           Path:        /Work/pytracer/.__pytracercache__/traces/211101124951.500061.pkl
           Size:        435.8KB
           Args:        Namespace(command=['pytracer/test/internal/test_basic.py', '--test2', '1'], 
                                  dry_run=False, pytracer_module='trace', report='OFF', report_file='')
     ReportName:        None
     ReportPath:        None
PytracerLogName:        pytracer.log.500061
PytracerLogPath:        /Work/pytracer/pytracer.log.500061

========== Aggregation ==========

           Date:        Mon Nov  1 12:24:30 2021
           Name:        stats.495010.h5
           Path:        /Work/pytracer/.__pytracercache__/stats/stats.495010.h5
           Size:        5.6MB
           Args:        Namespace(batch_size=5, directory='.__pytracercache__/traces', filename=None, format=None, 
                                  method='cnh', online=False, pytracer_module='parse')
         Traces:        ['/Work/pytracer/.__pytracercache__/traces/211101122420.494927.pkl',
                         '/Work/pytracer/.__pytracercache__/traces/211101124951.500061.pkl']
  CallgraphName:        callgraph.495010.pkl
  CallgraphPath:        /Work/pytracer/.__pytracercache__/stats/callgraph.495010.pkl
PytracerLogName:        pytracer.log.495010
PytracerLogPath:        /Work/pytracer/pytracer.log.495010

Complete example

  pytracer clean # if run before
  pytracer trace --command ./pytracer/test/internal/test_basic.py
  pytracer parse
  pytracer info # List the traces and aggregation files available
  pytracer visualize --filename <stats> --callgraph <callgraph>

Config file

Before being able to use pytracer, do

export PYTRACER_CONFIG=~/Workspace/pytracer/pytracer/data/config/config.json

where the value is the absolute path to the config.json

The configuration file is a json file containing several entries:

  • module_to_load: String list. List of the modules to trace
  • module_to_exclude: String list. List of modules to exclude
  • include_file: String list. List of the inclusion files. An inclusion file restrict the function to trace for a given module.
  • exlude_file: String list. List of the inclusion files. An inclusion file list the function to do not trace for a given module.
  • logger: Suboption for the logger.
    • format: {print,logger}. Which logger format used.
    • output: String. File where storing the log outputs.
    • color: Boolean. Enable colors.
    • level: {debug,info,warning}. Minimum level of information.
  • io: Suboption for the trace.
    • type: {pickle}. Specify the format of the trace.
    • backtrace: Boolean. Enable backtracing.
    • cache: Suboption for the trace directory:
      • root: String. Name of the directory to store traces

About

Pytracer: Automatic python code instrumentation.

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages