DNS issue should return UNKNOWN or CRITICAL #268

Open
tatref opened this issue Sep 12, 2022 · 6 comments

Comments


tatref commented Sep 12, 2022

Hi,

nrpe/src/utils.c

Lines 156 to 157 in b226fe4

fprintf(output, "Could not resolve hostname %.100s: %s\n", host, gai_strerror(gaierr));
exit(1);

At the moment, a DNS issue returns a WARNING.

This should probably be either UNKNOWN or CRITICAL.

Also, when check_nrpe is used as a host check in Nagios, this comes back as OK instead of WARNING, which can be problematic.
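
For context, Nagios plugins report their state through the process exit code (0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN), which is why the `exit(1)` quoted above surfaces as a WARNING. A minimal sketch of that mapping follows; `state_name` is a hypothetical helper written for illustration and is not part of NRPE:

```c
/* Standard Nagios plugin exit codes. */
enum plugin_state {
    STATE_OK       = 0,
    STATE_WARNING  = 1,
    STATE_CRITICAL = 2,
    STATE_UNKNOWN  = 3
};

/* Hypothetical helper (not in NRPE): name the state an exit code maps to. */
const char *state_name(int exit_code)
{
    switch (exit_code) {
    case STATE_OK:       return "OK";
    case STATE_WARNING:  return "WARNING";   /* what exit(1) produces today */
    case STATE_CRITICAL: return "CRITICAL";
    case STATE_UNKNOWN:  return "UNKNOWN";
    default:             return "OUT OF RANGE"; /* not a valid plugin state */
    }
}
```

With this mapping, `state_name(1)` yields "WARNING", matching the behaviour described above.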

I know that this project is no longer developed. Can this still be fixed? I can make the PR.

Thanks

@ericloyd

I argue that it should not be considered CRITICAL, and UNKNOWN is not really the case - it is known that it is not resolvable, so it is not unknown. In essence, I believe that WARNING remains the proper response.

If you want to check for proper DNS resolution, you should be using the check_dns plugin outside of NRPE. To be super pedantic, you could make a dependency that requires the check_dns result to be in an OK state before using a FQDN in an NRPE-based check, so that the NRPE check doesn't execute unless DNS is responding properly.

In short - my vote would be to leave it as is.

Author

tatref commented Sep 12, 2022

Thanks for your feedback!

Well, the goal of the command is to check for something on the remote host. Since it didn't even get to execute the command, the result could be UNKNOWN: in essence, the result is not known.

I know I can add a check_dns, but adding this to every host is going to be cumbersome. Moreover, I may have an /etc/hosts entry for this host, so check_dns is not necessarily the way to go.

@ericloyd

I still believe that it is not a CRITICAL condition for whatever is being checked. It is a failure in NRPE's ability to connect, and there are ways to ensure that it can connect before executing the check. I named one.

And if you're using /etc/hosts, then DNS failure isn't an actual option here, is it? It's basic connectivity issues, which should then issue an UNKNOWN. But not a CRITICAL.

@ericloyd

By the way, service dependencies are "smart" in that, if configured properly, you don't need to specify them for all hosts. You leave that part blank, make the master service your check_dns and your dependent service your check_nrpe (with no commands, just connectivity checking). That way, DNS must be working for check_nrpe to work. Then you make all your NRPE-based checks dependent upon check_nrpe (with no commands) so that they only run if NRPE is working. It's two dependencies.
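
The two dependencies described above might look roughly like this in Nagios object configuration. This is a sketch under assumptions: the hostgroup `nrpe-hosts` and the service descriptions `DNS`, `NRPE`, `Disk Usage` and `Load` are placeholders, and each host in the group is assumed to define all of those services. Leaving out `dependent_host_name` is what makes each dependency apply within the same host.

```
# Dependency 1: the bare check_nrpe service only runs if check_dns is OK.
define servicedependency {
    hostgroup_name                  nrpe-hosts          ; placeholder hostgroup
    service_description             DNS                 ; master: check_dns
    dependent_service_description   NRPE                ; check_nrpe with no command
    execution_failure_criteria      w,u,c
    notification_failure_criteria   w,u,c
}

# Dependency 2: NRPE-based checks only run if the bare check_nrpe is OK.
define servicedependency {
    hostgroup_name                  nrpe-hosts
    service_description             NRPE
    dependent_service_description   Disk Usage,Load     ; placeholder NRPE checks
    execution_failure_criteria      w,u,c
    notification_failure_criteria   w,u,c
}
```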

@ericloyd

So if we've narrowed the code snippet in question down to being a DNS issue, as opposed to a general connectivity issue, then it definitely shouldn't be returning CRITICAL, and in this case, I'll agree that UNKNOWN would be more appropriate than WARNING.
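
A rough sketch of what that change could look like, assuming the standard plugin exit codes (3 = UNKNOWN). `resolve_state` is a hypothetical function written for this illustration and is not NRPE's actual code; only the error message is taken from the snippet quoted at the top of the issue.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L   /* for getaddrinfo() under strict -std modes */
#endif

#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>

/* Standard Nagios plugin exit codes. */
enum { STATE_OK = 0, STATE_WARNING = 1, STATE_CRITICAL = 2, STATE_UNKNOWN = 3 };

/*
 * Hypothetical helper (not in NRPE): try to resolve host/port and return the
 * plugin state the caller should exit with. A resolution failure maps to
 * STATE_UNKNOWN instead of the current exit(1), which Nagios reads as WARNING.
 */
int resolve_state(const char *host, const char *port, FILE *output)
{
    struct addrinfo hints;
    struct addrinfo *res = NULL;
    int gaierr;

    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_UNSPEC;      /* IPv4 or IPv6 */
    hints.ai_socktype = SOCK_STREAM;  /* TCP */

    gaierr = getaddrinfo(host, port, &hints, &res);
    if (gaierr != 0) {
        fprintf(output, "Could not resolve hostname %.100s: %s\n",
                host, gai_strerror(gaierr));
        return STATE_UNKNOWN;
    }

    freeaddrinfo(res);
    return STATE_OK;
}
```

A caller would then do something like `int st = resolve_state(host, "5666", stderr); if (st != STATE_OK) exit(st);` so that the failure reaches Nagios as UNKNOWN.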

Author

tatref commented Mar 18, 2024

Also note that failing to connect to port 5666 results in CRITICAL:

Critical : (No output on stdout) stderr: connect to address 1.2.3.4 port 5666: Connection refused


2 participants