-
Notifications
You must be signed in to change notification settings - Fork 911
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
reflect.*Header uses a different representation than in the standard Go implementation #1284
Comments
Same as #691 The layout of That said, we did consider using |
Does this imply that this is a bug in murmur and needs an addition here: https://github.com/twmb/murmur3/blob/v1.1.3/murmur.go#L69 ? |
It is not necessarily a bug in that library per se; it is just non-portable code. In general Go code is written to target the standard Go implementation, and in practice the layout of string headers and slice headers in standard Go don't change, so people treat them as stable in their code. Rob Pike mentions this when explaining one of his "Go proverbs", which is "With the unsafe package there are no guarantees." - https://www.youtube.com/watch?v=PAAkCSZUG1c&t=13m49s In the library you linked to, it looks to me like that function is just an optimization, and the same result could be achieved in a portable way (or at least provide a portable alternative using build tags etc) |
Note that v1.1.5 avoids the problem this issue is about. |
Are you sure? It doesn't look like it: https://github.com/twmb/murmur3/blob/v1.1.5/murmur.go#L69 var str string
strhdr := (*reflect.StringHeader)(unsafe.Pointer(&str))
strhdr.Data = ((*reflect.SliceHeader)(unsafe.Pointer(&slice))).Data
strhdr.Len = len(slice)
return str
For that reason, I'm using github.com/spaolacci/murmur3 instead. For reference, this is the TinyGo type: Lines 756 to 759 in c303266
And here the upstream Go type: // StringHeader is the runtime representation of a string.
// It cannot be used safely or portably and its representation may
// change in a later release.
// Moreover, the Data field is not sufficient to guarantee the data
// it references will not be garbage collected, so programs must keep
// a separate, correctly typed pointer to the underlying data.
type StringHeader struct {
Data uintptr
Len int
} Note this part:
Therefore, if you want portable code you shouldn't use |
Oh, correction: this follows the unsafe rule (6) in https://golang.org/pkg/unsafe/, follows the rules laid out by @mdempsky in golang/go#40701, and passes https://github.com/jlauinger/go-safer. I misread this issue. I'm not sure why tinygo's int may be longer than the machine's uintptr, because the purpose of an int is to be the size of a machine word. The Also fwiw, spaolacci's does an unsafe unaligned conversion from of four slice bytes to an int32, this was originally flagged in go1.14 but Go removed the check (since on any realistic arch, the unaligned read is fine). ref: spaolacci/murmur3#29 |
No. Many people think this, but it's false. The
So it would be perfectly legal for a Go implementation to implement (Sidenote, the otherwise unrelated type Because a string can never be longer than what can be stored in a pointer-sized integer, this is the implementation of the string type in TinyGo: Lines 9 to 13 in e02f308
As you can see, it uses So what to do about In summary, if you want to write portable code that uses
Sure, but that doesn't make them portable (as pointed out in the documentation for these two types).
TinyGo supports the Cortex-M0 chip, which does not support unaligned accesses. So this conversion would be illegal on some targets that TinyGo supports. I happen to be using a Cortex-M4 so that's probably why I haven't had any issues with it. Oh and note that x86 also mandates proper alignment in some cases: https://pzemtsov.github.io/2016/11/06/bug-story-alignment-on-x86.html |
Those are reasonable points. While the This basically means that Is it worth it to consider having tinygo's |
Yes, unfortunately that would be the consequence. To repeat, this is what the comment says: // StringHeader is the runtime representation of a string.
// It cannot be used safely or portably and its representation may
// change in a later release. Code that uses it should expect it to be non portable.
Technically, maybe. In practice, this would either be very difficult and/or leave a trail of subtle bugs. |
I've made a PR upstream: twmb/murmur3#9 |
Looks like this was resolved, so now closing. Please reopen if needed. Thanks! |
Assuming that this remains an unappealing solution for TinyGo to adopt, I'm wondering if there could be additional documentation created with suggested fixes for projects that don't work with TinyGo but might want to. Maybe @aykevl's contribution in twmb/murmur3#9 could be documented as an alternative to one otherwise unportable pattern. I was hoping to use TinyGo to build a WASM binary with https://github.com/apache/arrow. This doesn't work due to the use of I see that other related TinyGo tickets that have been opened have been closed as duplicates of this issue, so I didn't want to reopen this or open a new issue. But I would be willing to contribute documentation if someone could help suggest more alternatives to existing uses of |
I have created a docs PR here: tinygo-org/tinygo-site#289 |
Thanks, @aykevl. For reference, here is the draft set of changes for Apache Arrow to work with TinyGo: apache/arrow#14026. Curious if you might have any suggestions for improvement. |
Maybe, instead of this: return unsafe.Slice((*int32)(unsafe.Pointer(h.Data)), cap(b)/Int32SizeBytes)[:len(b)/Int32SizeBytes] You could do this: return unsafe.Slice((*int32)(unsafe.Pointer(h.Data)), cap(b)/unsafe.Sizeof(int32(0)))[:len(b)/unsafe.Sizeof(int32(0))] ...but I'm not sure the latter is any easier to read. In any case, I think Ideally you could do this all in a single function using generics, but I don't know whether you still need to support Go 1.17. When you use generics, you can use Also, just FYI: casting slices like this is not risk-free, even on x86. You could get a segmentation fault if the alignment isn't correct. |
Note that as of Go 1.20 (assuming that TinyGo is up to date with the API changes in that release) you can use the |
Yes, and I'm thankful for those functions :) they are supported in TinyGo. |
Here is an update to the documentation that describes how to use these newer and more portable functions: tinygo-org/tinygo-site#346 (They're still not fully portable of course, because casting slices and strings like that can reveal architecture specific details). |
TLDR
There seems to be an implicit cast in go-proper. See below.
Version
tinygo version 0.14.0 darwin/amd64 (using go version go1.14.7 and LLVM version 10.0.0)
Failure
Expectation
Testcase
The text was updated successfully, but these errors were encountered: