Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SIGABRT and core dump when trying to check pollin/pollout on a destroyed sock #291

Open
aHugues opened this issue Nov 24, 2020 · 1 comment

Comments

@aHugues
Copy link

aHugues commented Nov 24, 2020

Hi there, I've ran into an issue when working with this project.

Due to an error on my part, I've tried to access the method Sock.Pollin on a sock that was previously destroyed. Doing so resulted in a SIGABRT and a core dump from the underlying czmq lib.

Steps to reproduce

This simple example would result in a similar crash

package main

import (
	"fmt"
	"log"
	"time"

	goczmq "gopkg.in/zeromq/goczmq.v4"
)

func checkPollin(sock *goczmq.Sock) {
	for {
		fmt.Printf("pollin: %t\n", sock.Pollin())
		time.Sleep(100 * time.Millisecond)
	}
}

func main() {
	sock, err := goczmq.NewSub("inproc://test", "test")
	if err != nil {
		log.Fatal("Impossible to create sock")
	}

	go checkPollin(sock)
	time.Sleep(150 * time.Millisecond)
	sock.Destroy()
	time.Sleep(500 * time.Millisecond)
}

Running this example would result in the following output:

pollin: false
pollin: false
panic_zmq: src/zsock_option.inc:3760: zsock_events: Assertion `self' failed.
SIGABRT: abort
PC=0x7f34f5bac18b m=0 sigcode=18446744073709551610

goroutine 0 [idle]:
runtime: unknown pc 0x7f34f5bac18b
stack: frame={sp:0x7ffd70922c60, fp:0x0} stack=[0x7ffd70123fb8,0x7ffd70922ff0)

... (omitted core dump)

Reason

The czmq lib calls assert (self); in the beginning of each method, which would fail when the sock has been destroyed because resources have been freed. When the assert fails, a SIGABRT is sent, which triggers the core dump.

The goczmq lib does not perform additional verification before calling the C method

// Events returns the current value of the socket's events option
func Events(s *Sock) int {
	val := C.zsock_events(unsafe.Pointer(s.zsockT))
	return int(val)
}

// Pollin returns true if there is a Pollin
// event on the socket
func (s *Sock) Pollin() bool {
	return Events(s)&Pollin == Pollin
}

Solutions

My fix was to make sure that the case was not possible (so joining all goroutines before destroying the sock), but two solutions would make this approach more robust, as a SIGABRT is not recoverable:

  • Adding a way to verify that a socket has not been destroyed, as it is not possible at the moment
  • Adding a verification that the socket has not been destroyed during the function call, before calling the C method

The second solution might modify the Pollin signature with something like func (s *Sock) Pollin() (bool, error) which would make it easy to process errors during the Pollin call

Additional remarks

For now, I'm pretty sure this affects the Pollin and Pollout methods. I suspect that this affects all methods calling directly the C library as well, for instance RecvFrame, but I have not checked that.

@Elvis339
Copy link

Hey @aHugues have you implemented and tested your solution?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants