X86 16 bit platform support #70

Open
wants to merge 4 commits into
base: master

xor2003

@xor2003 xor2003 commented Jun 8, 2024

The x86 16-bit platform is still useful for DOS-era software.
There is no acceptable decompiler for this platform,
so I'm trying to use angr to create one.
I feel the implementation is about 85% complete.
I'm able to decompile some small examples.
I feel I need help from people interested in it to finish it, so I decided to publish it.

Small fixes are required in the "angr" repo to enable 16-bit addressing, so I will also create the corresponding PRs.

@ltfish
Member

ltfish commented Jun 9, 2024

The angr decompiler has a lot of places where it expects the architecture to be either 32-bit or 64-bit. Did you not have to change them?

@xor2003
Author

xor2003 commented Jun 9, 2024

I could get results with this minimal patch: angr/angr#4683

@twizmwazin
Member

Have you looked into pypcode's support for x86-16? Depending on where you are in your project, it might be easier to leverage that than to maintain a separate lifter here.

@xor2003
Author

xor2003 commented Jun 11, 2024

I couldn't find 16-bit support in pcode. Could you give the URL?

@xor2003
Author

xor2003 commented Jun 14, 2024

Ghidra has long-unresolved issues with x86 16-bit: NationalSecurityAgency/ghidra#981
But thanks, I can try to use it as a reference to fix this implementation.

@xor2003
Author

xor2003 commented Jun 14, 2024

Moreover, I tried the decompiler with pcode:

Block at 0x100, size: 11
     _start:
100  PUSH    BP
101  MOV     BP, SP
103  MOV     AX, wordptr[BP+0x4]
106  MUL     wordptr[BP+0x6]
109  POP     BP
10a  RET     
IRSB {
   00 | ------ 00000100, 1 ------
   01 | unique[b080:2] = BP
   02 | SP = SP - 0x2
   03 | unique[b200:4] = CALLOTHER 0x0, SS, SP
   04 | *[ram]unique[b200:4] = unique[b080:2]
   05 | ------ 00000101, 2 ------
   06 | BP = SP
   07 | ------ 00000103, 3 ------
   08 | unique[2380:2] = BP + 0x4
   09 | unique[4680:4] = CALLOTHER 0x0, SS, unique[2380:2]
   10 | unique[9200:2] = *[ram]unique[4680:4]
   11 | AX = unique[9200:2]
   12 | ------ 00000106, 3 ------
   13 | unique[2380:2] = BP + 0x6
   14 | unique[4680:4] = CALLOTHER 0x0, SS, unique[2380:2]
   15 | unique[2d000:4] = zext(AX)
   16 | unique[9200:2] = *[ram]unique[4680:4]
   17 | unique[2d080:4] = zext(unique[9200:2])
   18 | unique[2d180:4] = unique[2d000:4] * unique[2d080:4]
   19 | DX = SUBPIECE unique[2d180:4], 0x2
   20 | AX = SUBPIECE unique[2d180:4], 0x0
   21 | CF = DX != 0x0
   22 | OF = CF
   23 | ------ 00000109, 1 ------
   24 | unique[2e380:2] = 0x0
   25 | unique[bb00:4] = CALLOTHER 0x0, SS, SP
   26 | unique[2e380:2] = *[ram]unique[bb00:4]
   27 | SP = SP + 0x2
   28 | BP = unique[2e380:2]
   29 | ------ 0000010a, 1 ------
   30 | unique[bb00:4] = CALLOTHER 0x0, SS, SP
   31 | IP = *[ram]unique[bb00:4]
   32 | SP = SP + 0x2
   33 | EIP = CALLOTHER 0x0, CS, IP
   34 | return EIP
   NEXT: None; Ijk_Ret
}

WARNING  | 2024-06-14 09:30:08,510 | angr.analyses.calling_convention | _analyze_function(): Cannot find a calling convention for <Function _start (0x100)> that fits the given arguments.
WARNING  | 2024-06-14 09:30:08,510 | angr.analyses.calling_convention | Cannot determine calling convention for <Function _start (0x100)>.
WARNING  | 2024-06-14 09:30:08,511 | ailment.converter_pcode | Unsupported opcode: CALLOTHER emulation not currently supported
Traceback (most recent call last):
  File "/home/xor/vextest/decompile.py", line 69, in <module>
    dec = project.analyses[Decompiler].prep()(func, cfg=cfg.model)
  File "/home/xor/vextest/venv/lib/python3.10/site-packages/angr/analyses/analysis.py", line 202, in wrapper
    oself.__init__(*args, **kwargs)
  File "/home/xor/vextest/venv/lib/python3.10/site-packages/angr/analyses/decompiler/decompiler.py", line 103, in __init__
    self._decompile()
  File "/home/xor/vextest/venv/lib/python3.10/site-packages/angr/analyses/decompiler/decompiler.py", line 164, in _decompile
    clinic = self.project.analyses.Clinic(
  File "/home/xor/vextest/venv/lib/python3.10/site-packages/angr/analyses/analysis.py", line 217, in __call__
    r = w(*args, **kwargs)
  File "/home/xor/vextest/venv/lib/python3.10/site-packages/angr/analyses/analysis.py", line 202, in wrapper
    oself.__init__(*args, **kwargs)
  File "/home/xor/vextest/venv/lib/python3.10/site-packages/angr/analyses/decompiler/clinic.py", line 168, in __init__
    self._analyze_for_decompiling()
  File "/home/xor/vextest/venv/lib/python3.10/site-packages/angr/analyses/decompiler/clinic.py", line 210, in _analyze_for_decompiling
    if not (ail_graph := self._decompilation_graph_recovery()):
  File "/home/xor/vextest/venv/lib/python3.10/site-packages/angr/analyses/decompiler/clinic.py", line 247, in _decompilation_graph_recovery
    self._convert_all()
  File "/home/xor/vextest/venv/lib/python3.10/site-packages/angr/utils/timing.py", line 43, in timed_func
    return func(*args, **kwargs)
  File "/home/xor/vextest/venv/lib/python3.10/site-packages/angr/analyses/decompiler/clinic.py", line 825, in _convert_all
    ail_block = self._convert(block_node)
  File "/home/xor/vextest/venv/lib/python3.10/site-packages/angr/analyses/decompiler/clinic.py", line 869, in _convert
    ail_block = ailment.IRSBConverter.convert(block.vex, self._ail_manager)
  File "/home/xor/vextest/venv/lib/python3.10/site-packages/ailment/__init__.py", line 50, in convert
    return PCodeIRSBConverter.convert(irsb, manager)
  File "/home/xor/vextest/venv/lib/python3.10/site-packages/ailment/converter_pcode.py", line 99, in convert
    return PCodeIRSBConverter(irsb, manager)._convert()
  File "/home/xor/vextest/venv/lib/python3.10/site-packages/ailment/converter_pcode.py", line 153, in _convert
    self._convert_current_op()
  File "/home/xor/vextest/venv/lib/python3.10/site-packages/ailment/converter_pcode.py", line 172, in _convert_current_op
    self._special_op_handlers[self._current_behavior.opcode]()
  File "/home/xor/vextest/venv/lib/python3.10/site-packages/ailment/converter_pcode.py", line 453, in _convert_store
    off = self._get_value(self._current_op.inputs[1])
  File "/home/xor/vextest/venv/lib/python3.10/site-packages/ailment/converter_pcode.py", line 361, in _get_value
    return self._convert_varnode(varnode, False)
  File "/home/xor/vextest/venv/lib/python3.10/site-packages/ailment/converter_pcode.py", line 290, in _convert_varnode
    assert unique_offset is not None, "Cannot find the source unique variable"
AssertionError: Cannot find the source unique variable

@mborgerson
Member

03 | unique[b200:4] = CALLOTHER 0x0, SS, SP

It looks like we just need to implement segment to linear address resolution for this p-code arch.
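
For reference, real-mode x86 forms a linear address by shifting the 16-bit segment value left by four bits and adding the 16-bit offset. The sketch below shows only that arithmetic. The function name `segment_to_linear` and the `a20_enabled` flag are hypothetical and are not part of angr or pypcode; where such a handler would plug into the p-code engine is not established in this thread.

```python
def segment_to_linear(segment: int, offset: int, a20_enabled: bool = False) -> int:
    """Real-mode translation: linear = segment * 16 + offset.

    Hypothetical helper for illustration; not an angr or pypcode API.
    """
    segment &= 0xFFFF  # segment registers are 16 bits wide
    offset &= 0xFFFF   # offsets (SP, BP+disp, IP) are 16 bits wide
    linear = (segment << 4) + offset
    if not a20_enabled:
        # Assumption: model an 8086-style 20-bit address bus that wraps at 1 MiB.
        linear &= 0xFFFFF
    return linear


# Example: the address that a store through SS:SP would resolve to
# for SS = 0x1234 and SP = 0x0100.
print(hex(segment_to_linear(0x1234, 0x0100)))  # 0x12440
```

The mask on the offset matters as much as the shift: the offset is computed modulo 0x10000 before the segment base is added, so the two steps cannot be merged into one flat addition.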

@xor2003
Author

xor2003 commented Jun 15, 2024

  1. No, there are more issues: Incorrect relative CALL decoding on x86-16 bit targets NationalSecurityAgency/ghidra#981
    Segmented addressing is not well supported.
    It works when segments are aligned to 0x1000, but when they are not, the address calculation is wrong, so most CALL / JUMP references are wrong.
  2. Where can I try to implement segment -> linear translation?
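
To illustrate why alignment to 0x1000 could hide such a problem, here is a small sketch comparing two ways of computing a near relative CALL target. The first wraps the new IP inside the 64 KiB segment and then translates it with CS, which is how the CPU behaves in real mode. The second is a purely hypothetical faulty variant that wraps the low 16 bits of the linear address instead; it is an assumption made for illustration, not a confirmed description of what Ghidra or pypcode does. Both function names are invented here.

```python
def near_call_target(cs: int, ip_next: int, rel16: int) -> int:
    """Add the displacement to IP, wrap to 16 bits, then apply the segment base."""
    new_ip = (ip_next + rel16) & 0xFFFF
    return ((cs << 4) + new_ip) & 0xFFFFF


def naive_call_target(cs: int, ip_next: int, rel16: int) -> int:
    """Hypothetical bug: wrap the low 16 bits of the linear address instead of IP."""
    linear_next = (cs << 4) + ip_next
    return (linear_next & ~0xFFFF) | ((linear_next + rel16) & 0xFFFF)


# rel16 = 0xFFF0 encodes a displacement of -16; IP wraps below zero here.
for cs in (0x1000, 0x1234):
    good = near_call_target(cs, 0x0003, 0xFFF0)
    bad = naive_call_target(cs, 0x0003, 0xFFF0)
    print(f"CS={cs:#06x} correct={good:#07x} naive={bad:#07x}")
```

With CS = 0x1000 the segment base is a multiple of 0x10000, so wrapping the linear address and wrapping IP coincide and both functions agree. With CS = 0x1234 they diverge, which matches the symptom described in the list above.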
