We need some help to finally fix Darwin PowerPC assembler so that not just build succeeds but also tests pass #211
Comments
It seems that a NULL pointer is dereferenced. This might happen because the calling convention is violated. |
My PPC32 hardware is running OpenBSD, but I have a QEMU running Darwin. I can't compile Boost for Darwin, because my gcc 3.3 can't do C++11, but I can compile ppc32_sysv_macho.S and call them from C. I found a problem: This diff seems to fix make_context, but ontop_context might still be broken.
I put a few other changes in the diff.
Someone else, who has the C++ compiler, can check the diff, edit it, and git commit it.
My diff doesn't fix an alignment problem. make_fcontext sets the correct alignment of 16, but jump_fcontext does addi r1, r1, 244, and 244 isn't a multiple of 16. When you jump to the new context, you will have a misaligned stack pointer. This will break altivec code (but PPC32 compilers tend to disable altivec) and slow down code that has 8-byte floats on the stack. I fixed the alignment for ELF in df8fb6b by rearranging the frame from 244 to 240 bytes. A similar fix might help Mach-O, but ELF and Mach-O have different stack layouts, so you can't copy ELF's layout. |
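The alignment arithmetic above can be checked with a small C sketch. It assumes the saved context pointer is 16-byte aligned (as make_fcontext arranges) and that the frame is popped by adding its size to the stack pointer. The helper names `align_down`, `is_aligned` and `sp_after_pop` are hypothetical and are not part of Boost.Context.

```c
#include <assert.h>
#include <stdint.h>

/* Round an address down to a multiple of `alignment`.
   `alignment` must be a nonzero power of two. */
uintptr_t align_down(uintptr_t addr, uintptr_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return addr & ~(alignment - 1);
}

/* Nonzero when `addr` is a multiple of `alignment`. */
int is_aligned(uintptr_t addr, uintptr_t alignment) {
    return align_down(addr, alignment) == addr;
}

/* Hypothetical model of "addi r1, r1, frame_size":
   the stack pointer after the saved frame has been popped. */
uintptr_t sp_after_pop(uintptr_t saved_sp, uintptr_t frame_size) {
    return saved_sp + frame_size;
}
```

With a 16-byte-aligned `saved_sp`, `is_aligned(sp_after_pop(saved_sp, 244), 16)` is false because 244 = 15 * 16 + 4, while a 240-byte frame keeps the result aligned because 240 = 15 * 16.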
@kernigh Thank you very much! I will try building it now and running tests. For the stack, I think we can make it 240, since now it is saving “hidden”, which was copied from ELF version or wherever. In fact an earlier P. S. If you can run at least Tiger in QEMU, @catap has fixed GCC10 for it. |
I cannot comment on much here (no time to load context myself, so not sure exactly what frame you are discussing) but ...
Darwin ppc32 ABI includes altivec, it cannot be disabled (although 10.4 will run on a G3, it does so by specific code that tests if altivec is present and skips the save/restore - it does not remove the space from the ABI). So this needs fixing - we know what the stack layout of a saved context should be ... if the synthesised one does not preserve alignment, it is probably still not quite right. |
@barracuda156 I believe that we need the "hidden" pointer to return a transfer_t, as The stack layout is in the 32-bit PowerPC Function Calling Conventions of the old OS X ABI Function Call Guide in Apple's documentation archive. The calling routine reserves 8(SP) for the link register and 4(SP) for the condition register. The called routine may save its LR and CR in the calling routine's frame. One might shrink jump_fcontext's frame from 244 to 240 bytes by saving its LR and CR in its caller's frame. @iains The 244-byte stack frame belongs to jump_fcontext, where it does When I said, "PPC32 compilers tend to disable altivec", I meant that they don't emit altivec instructions. In the ABI, each function must preserve altivec registers v20 to v31 (AltiVec Technology Programming Interface Manual ALTIVECPIM.pdf, 3.2 Register Usage Conventions; example in Apple's _setjmp). Our jump_fcontext is missing code to save and restore v20..v31, but if the compiler didn't emit altivec, then the program never uses v20..v31, and never needs jump_fcontext to preserve v20..v31. Some libraries, like libjpeg-turbo and pixman, have code that checks if the CPU has altivec, then calls altivec code. After they return from the altivec code, they would stop using v20..v31, so there would be no problem with v20..v31. I fear a problem with a misaligned stack pointer. A fiber with a misaligned SP might fail to do jpeg or pixman. |
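As a rough illustration of the linkage area described above, here is a C sketch of the words at the bottom of a caller's frame on 32-bit Darwin PowerPC. Only the 4(SP) condition register slot and the 8(SP) link register slot come from this thread. The back chain at 0(SP) and the three trailing words (24 bytes in total) are an assumption based on Apple's Function Call Guide and should be checked against it. The struct and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the linkage area that a calling routine
   reserves at the bottom of its own frame (offsets relative to SP). */
typedef struct ppc32_darwin_linkage {
    uint32_t back_chain;   /* 0(SP): assumed, pointer to previous frame */
    uint32_t saved_cr;     /* 4(SP): callee may save CR here */
    uint32_t saved_lr;     /* 8(SP): callee may save LR here */
    uint32_t reserved[3];  /* assumed: remaining linkage words */
} ppc32_darwin_linkage;

/* Verify that the model matches the offsets quoted in the thread. */
void check_linkage_layout(void) {
    assert(offsetof(ppc32_darwin_linkage, saved_cr) == 4);
    assert(offsetof(ppc32_darwin_linkage, saved_lr) == 8);
}
```

Because these two slots live in the caller's frame, a routine that stores its LR and CR there does not need room for them in its own frame, which is the saving the comment above alludes to.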
@kernigh With your patch (and also adjusting 244 to 240, I did that earlier before your comment and wanted to see what happens anyway), Fibonacci test passes fine:
No more Bus errors, awesome! Two questions:
@barracuda156 The fibonacci program should print out the first 10 numbers from the Fibonacci sequence. I'm not seeing them in your output, so I wouldn't yet call the test passing. |
All right, I've been twisting my brain up with this code this morning trying to make sense of it... Disclaimer, I'm not an assembly programmer or systems expert, just making sense of what I can from the links in this thread. I think the additional change needed in ; restore CTR
mtctr r6
+ ; set the first arg to the on-top function (r5 is already correct)
+ mr r4, r7
+
; jump to ontop-function
bctr
Basically, the on-top function takes a I'll test this code locally once I get my Boost environment set up on my test machine... In the meantime, @barracuda156 feel free to try it out since you seem to have everything up and running. Btw I strongly suggest NOT making the 244 -> 240 change you mentioned without fully understanding all the implications. |
@evanmiller I am temporarily away from native hardware, but ppc32 can be tested in Rosetta with reasonable reliability. Will try that in coming days. |
@kernigh When you have time, could you please comment on ppc64 versions? Is there something special to be done or just adjust byte sizes? Hopefully we can fix the whole Darwin PPC implementation in one go (after all) and not leave ppc64 for later. |
Good news! Well, mixed... Compiling with the two changes above (@kernigh's plus my register tweak), I am able to get the Fibonacci sequence to print!
This means that control is successfully passed several times between a fiber and the main program. Hooray! However, as you can see, there's a segfault when the program ends... I am guessing it is triggered during |
For make_fcontext, use the diff provided here: boostorg#211 (comment) For ontop_context, adapt the Linux PPC32 fixes from here: boostorg@df8fb6b Co-authored-by: George Koehler <[email protected]>
OK, I think I've gotten it working now... see #215 for the latest and greatest. In addition to @kernigh's The Fibonacci test now prints correct output, and doesn't crash, so I'm pretty pleased with it. Once I finish fiddling with the file structure I'll remove the Draft status. Stack alignment and PPC64 I will leave to others, or else save for another holiday. |
Btw it looks like the entire Darwin PPC64 ba35720#diff-4812b5eb7dffd2775492b2b706f826bf1463a239399169eb1711ffec9085a49bL53 So PPC64 "needs some work", to say the least. |
@evanmiller @kernigh @olk A quick update. Specifically, these components of
@evanmiller Sorry, where in particular? |
See line linked in above comment – the entire PPC64 file was accidentally commented out in 2016, so no one has compiled it in years. Line 53 of src/asm/make_ppc64_sysv_macho_gas.S |
Is there a way to cross-compile to that target from a modern macOS? It would be nice to have a CI job. |
Yes, it can be done - but requires:
Having said that, I do this reasonably often from x86-64-darwin2* (and earlier). It is also actually possible at least to run compile tests on the cross-compiler (with TCL and dejagnu needed), and if you have a real hardware box, execute tests can be run remotely (although it is much better performance to do the build locally if you have decent hardware) edit: "binutils" == cctools, ld64 and a version of dsymutil built from the LLVM sources (with powerpc patched in)... hence the "bit fiddly" - but, nevertheless doable. |
Has PPC support been stripped from AppleClang? I don't have the hardware, nor would I use it. I'm planning on reworking asm compiling to dispatch targets via preprocessing (develop...Kojoley:context:feature/autodispatch), and having as many targets as possible on CI would reduce the chances of breaking your targets. I think my case would be covered by just testing if |
Apple clang never supported PPC (at least, no released or published version did). Neither does upstream clang (without patches and that support is incomplete). The last Apple toolchain with PPC support is the Xcode 3.2.6 gcc-4.2.1. I was talking about using a GCC cross toolchain (which continues to support powerpc-darwin, even with current trunk [14.0.0]). Having said that, I realise you need a pretty restricted set of functionality - essentially preprocessor + assembler, right? (I will see if there's some easier sub-set of tools to do that - but you can be certain that there is nothing available 'out of the box'). |
Yes, though I forgot that you guys added
I will just ping here to test the PR then. |
Are these |
Here is the PR #228 if you care enough to test it. |
It should work now |
@olk I guess, |
I assumed it is fixed ... maybe I was confused by the lengthy discussion |
@olk Last time I looked at it – it was broken to the point of one of the source files being accidentally commented out, and that for years went unnoticed :) It is trivial to make it build, however Fibonacci test fails. Someone with a better understanding of assembler and ABI has to look into it. |
In a couple of recent PRs I have made some fixes to the Darwin PPC assembler here (nothing big, just fixing what was actually breaking the build), which fixed building of context and made it somewhat functional (in the sense that certain dependencies that require context now build successfully, folly and friends being the case).
However, tests do not pass: macports/macports-ports#16407 (comment)
From earlier PRs by @kernigh and @DaoWen for PPC ELF it looks like the issue is quite non-trivial and requires a thorough understanding of both what Boost is doing and the magic code :)
Unfortunately, my knowledge of PPC assembler is rather limited and does not suffice here. I am not going to drop the case, since we really want this fixed properly, but any advice or help will be greatly appreciated.
This has been broken apparently forever – can we fix it now? :)
(Considering a recent case with Boost shared pointer (boostorg/smart_ptr#105), I cannot be entirely sure that it is assembler which is to blame and not again some wrong alignment, but probably it is assembler, given that it was also broken for *BSD.)
P. S. For the record, an alternative version of the PPC code for context from the following repo also fails, at least with the Fibonacci test: https://github.com/twlostow/libcontext
P. P. S. @iains I am aware you are very busy atm, so please feel free to completely ignore the issue. But still tagging you as the person who understands Darwin magic code best.