RFC: Python 3 and speedup #26
Open
maage wants to merge 20 commits into hellman:master from maage:py3-numpy
Conversation
Commit highlights:
- Accept 0x format (0x00); raise an error if char is empty; handle None in parse_char
- Also, it is string.ascii_letters
- Update charset and routines; drop unused routines
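The parse_char changes in those commits can be illustrated with a minimal sketch. This is not the PR's actual code; the function name matches the commit message, but the exact signature and error messages are assumptions:

```python
def parse_char(ch):
    """Parse a key character given as '0x41', 'A', or an int like 65.

    Illustrative sketch of the behavior the commits describe:
    accept 0x format, raise on an empty char, and pass None through.
    """
    if ch is None:
        return None               # handle None in parse_char
    if isinstance(ch, int):
        return ch                 # already a byte value
    if not ch:
        raise ValueError("empty char")
    if ch.startswith("0x"):
        return int(ch, 16)        # accept 0x format (e.g. "0x00")
    if len(ch) == 1:
        return ord(ch)
    raise ValueError("char must be a single character or a 0x byte")
```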
This was referenced Sep 18, 2019
@hellman bump
I've submitted PR #28 with most of the minor issues, as there are conflicts.
The speedup is nice! It would be good to detect numpy before using it, because requiring numpy as a hard dependency seems like overkill.
I had problems getting xortool to handle all the files in test/data, and I also wanted to use Python 3.
After some initial fixes, I noticed it was somewhat slow, which limited how big a file it could handle.
This PR is a draft, as I've dropped some features and changed some outputs. For example, it now outputs the true byte repr (b'secret') instead of trying to beautify it; that is handy to paste from if you need the value in your Python code. Some output messages are also removed or changed.
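The point of the changed output is that a bytes repr is already valid Python source. A tiny illustration (the byte values here are just an example):

```python
# The recovered plaintext stays as bytes; printing its repr() yields a
# literal like b'secret' that can be pasted straight into Python code.
recovered = bytes([0x73, 0x65, 0x63, 0x72, 0x65, 0x74])
print(repr(recovered))  # b'secret'
```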
To improve performance, and to fix some bugs along the way, I used several approaches:

1. Cleaned up the code to remove issues linters usually warn about (naming and docstring warnings were mostly ignored).
2. Used Python 3 built-in functions where suitable. This should remove some bugs, and there is less generic code to maintain.
3. Used bytes as the internal representation; strings are only used where they are actually needed.
4. Chose numpy as the engine. This XORing is a good target for matrix operations, and it should also keep memory usage somewhat in check.
5. Used generators to lower memory usage further; xortool now writes keys and files as it discovers new keys.
6. Tried to limit the branching factor, as xortool can get stuck on totally random files where multiple characters tie for the top occurrence count. This is not carefully tuned, so you might want to check it; you can still end up filling your disk.
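The numpy idea above can be sketched as follows. This is a minimal illustration of treating repeating-key XOR as a matrix operation, not the PR's actual implementation; the function name and padding strategy are my own:

```python
import numpy as np

def xor_with_key(data: bytes, key: bytes) -> bytes:
    """XOR data against a repeating key using numpy broadcasting.

    Pads data to a multiple of len(key), reshapes it into rows of
    key-length bytes, XORs every row against the key in one
    vectorized operation, then trims the padding off again.
    """
    keylen = len(key)
    pad = (-len(data)) % keylen
    buf = np.frombuffer(data + b"\x00" * pad, dtype=np.uint8)
    rows = buf.reshape(-1, keylen)
    out = rows ^ np.frombuffer(key, dtype=np.uint8)  # key broadcasts over rows
    return out.tobytes()[: len(data)]
```

Because XOR is its own inverse, applying the same key twice recovers the original data, which makes this easy to sanity-check.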
I can currently discover the key for a file of over 100 MiB, with a known key length, in under 2 minutes of CPU time and about 1.1 GiB of memory. It takes a couple more minutes to write 25 GiB of files out.
For comparison, the original 'xortool -b -l 65 test/data/ls_xored' takes more time than that, while my version has the results within a couple of seconds.
Because my algorithm is subtly different, you can get different keys than with the old version when there is no single winning key. Mostly this is due to the branching-factor limitation.
I've tested this only on Linux (Fedora 30) with:
python3-3.7.4-1.fc30.x86_64
python3-numpy-1.16.4-2.fc30.x86_64