Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open Telemetry prototype #769

Open
wants to merge 5 commits into
base: main
Choose a base branch
from
Open

Conversation

jskeet
Copy link
Collaborator

@jskeet jskeet commented Mar 13, 2024

The intention of this PR is that we merge some parts of it, but not all of it - ideally whole commits should be in or out.

@jskeet jskeet requested a review from a team as a code owner March 13, 2024 11:26
Copy link
Collaborator

@amanda-tarafa amanda-tarafa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've left a few comments, but of the things we knew we had to chat about anyway.

/// <summary>
/// Formats the version of the assembly containing the specified type, in a "generally informative" way.
/// </summary>
internal static string FormatAssemblyVersion(System.Type type)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a nit, but is there somewhere else we can move this method to? A general purpose utils class or similar? It's just weird to call VersionHeaderBuilder.FormatAssemblyVersion from the CreateActivitySource method.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, but I don't have a better place for now. Will add a TODO for now. One thing that's worth noting is that the "is it a valid header value" may not be relevant in a more general case (and may not be reasonable for OTel, even).

/// Returns an <see cref="ActivitySource"/> named after the given client type.
/// </summary>
/// <remarks>
/// Multiple calls to this method (or <see cref="FromClientType{T}()"/>) for the same
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't leave a suggestion but on this one:

FromClientType{T} => FromClientType(System.Type)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

/// </remarks>
/// <param name="type">The client type to obtain an activity source for. Must not be null.</param>
/// <returns>An <see cref="ActivitySource"/> named after the given client type.</returns>
public static ActivitySource FromClientType(System.Type type)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I'd drop the Client from the method name. We are not enforcing that, so these methods could be used to get activity sources for any type.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was wondering about that. They could be used for any type, but we don't want to give the impression that they would be used by any other system by default. Let's talk about it more - I know you're not 100% convinced by having this at all.

internal static class ApiCallTracingExtensions
{
// Unary async
internal static Func<TRequest, CallSettings, Task<TResponse>> WithRetry<TRequest, TResponse>(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be named WithRetry => WithTracing

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doh, thanks. Done.

@@ -104,6 +108,7 @@ public ClientHelper(Options options)
// These operations are applied in reverse order.
// I.e. Version header is added first, then retry is performed.
return ApiCall.Create(methodName, asyncGrpcCall, syncGrpcCall, baseCallSettings, Clock)
.WithTracing(_activitySource)
Copy link
Collaborator

@amanda-tarafa amanda-tarafa Mar 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should call WithTracing after calling WithLogging, so that logs are withtin the activity.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thereby hangs a tail... currently we're using the method name of the delegate in WithTracing to get the activity name, which would be messed up by WithLogging. We should almost certainly stop doing that, basically - it makes all of this code really brittle. But we can decide all of that when we revisit the activity (and activity source) name.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yes! Definetely a strong reason to stop using the name of the delegate in my view. As in I think it's more important (an expected) that the logs are contained in the trace. The methods themselves can access the current activity and set their own name as a tag, so we can still make that information available easily.

@@ -104,6 +108,7 @@ public ClientHelper(Options options)
// These operations are applied in reverse order.
// I.e. Version header is added first, then retry is performed.
return ApiCall.Create(methodName, asyncGrpcCall, syncGrpcCall, baseCallSettings, Clock)
.WithTracing(_activitySource)
.WithLogging(Logger)
.WithRetry(Clock, Scheduler, Logger)
Copy link
Collaborator

@amanda-tarafa amanda-tarafa Mar 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And I think we should maybe change WithRetry so that it also receives the activity source, and creates an activity per attempt.
Alternatively, we call WithTracing after WithRetry so that the whole thing, including retries are within a single activity, and the logs from retrying are withing our activity.
Alternatively, we have a single activity that contains the retries, and each attempt is an event?
I think I'd prefer the first one, but any of these seem fine, as long as attempts are explicitly included in the trace.

return async (request, callSettings) =>
{
using var activity = activitySource.StartActivity(activityName, ActivityKind.Client);
activity?.SetTag(GrpcCallTypeTag, UnaryCallType);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should have a tag with the name of the client, in case users pass in their own activity source, we may want to still be able to identify this activity as produced by us.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe. I've added a TODO so that we remember to look at it - basically I think we'll need a whole design doc just about what names/tags we want to expose.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, agreed we need to desing this elsewhere: naming, tags, and events and spans, and what corresponds to what, for instance, is a retry attempt represented in it's own span or is it an event? And what name it has, etc.

// (e.g. the full RPC name, the client name etc).
private static string FormatActivityName(Delegate fn, string methodName) => $"{fn.Method.Name}/{methodName}";

// TODO: See if there's a standard way of doing this. It seems odd to have to do it ourselves.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking the same.
This actually looks more like a logging concern than a tracing one? But yes, let's see what others do or if there are best practices etc.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think setting the status to be an error is probably important for tracing though.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, the status is relevant, but the exception message and the stack trace seem odd in tracing, at least to me.

}
catch (Exception ex) when (SetActivityException(activity, ex))
{
// We'll never actually get here, because SetActivityException always returns false.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it'll be way clearer to have the SetActivityException method be void, and just call it here, and then thrwo. And that's not just because that's actually what happens, but also because the bool SetActivityException combination makes me thingkthat this method will return true if it set the info, and false otherwise, when what happens is, it always sets the info but always returns false.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was worried about throw losing context in some async situations.

We could use ExceptionDispatchInfo.Capture(e.InnerExceptions.FirstOrDefault() ?? e).Throw(); instead, perhaps.
(I'm aware that using an exception filter here is a bit odd.)

Will add a comment so we don't forget to come back to this.

This wires everything up, but doesn't actually start any activities. The implementation will all be in ApiCallTracingExtensions, in another commit.
While we *could* remove ActivitySource from ClientHelper.Options, that would require the settings to be cloned if the user hasn't specified an activity source. That could potentially be done in ClientBuilderBase, but keeping the default separate from the user-specified activity source feels simpler.
There's just a single test so far - more required. We should discuss this commit in detail before going further.
Streaming calls are trickier, because we don't want to stop the activity before the call has actually completed. (We may not even know that in some cases.)
@jskeet
Copy link
Collaborator Author

jskeet commented Mar 14, 2024

New commits pushed - I hope they don't mess up the context of the comments too much.

Copy link
Collaborator

@amanda-tarafa amanda-tarafa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good, I think we mostly agree on everything including in what needs further discussion. And thanks for the TODOs!

@@ -104,6 +108,7 @@ public ClientHelper(Options options)
// These operations are applied in reverse order.
// I.e. Version header is added first, then retry is performed.
return ApiCall.Create(methodName, asyncGrpcCall, syncGrpcCall, baseCallSettings, Clock)
.WithTracing(_activitySource)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yes! Definetely a strong reason to stop using the name of the delegate in my view. As in I think it's more important (an expected) that the logs are contained in the trace. The methods themselves can access the current activity and set their own name as a tag, so we can still make that information available easily.

@@ -214,7 +214,8 @@ public sealed class Options
public string ApiVersion { get; set; }

/// <summary>
/// The activity source to use for tracing, if any. This may be null.
/// The activity source to use for tracing, if any. This may be null. This is ignored
/// if <see cref="Settings"/> specifies an activity source.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yes, this always trips me, even when I'm writing docs myself.

I think as is is fine then.

return async (request, callSettings) =>
{
using var activity = activitySource.StartActivity(activityName, ActivityKind.Client);
activity?.SetTag(GrpcCallTypeTag, UnaryCallType);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, agreed we need to desing this elsewhere: naming, tags, and events and spans, and what corresponds to what, for instance, is a retry attempt represented in it's own span or is it an event? And what name it has, etc.

// (e.g. the full RPC name, the client name etc).
private static string FormatActivityName(Delegate fn, string methodName) => $"{fn.Method.Name}/{methodName}";

// TODO: See if there's a standard way of doing this. It seems odd to have to do it ourselves.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, the status is relevant, but the exception message and the stack trace seem odd in tracing, at least to me.

@jskeet
Copy link
Collaborator Author

jskeet commented Mar 18, 2024

Great, thanks. We can talk in a VC about how we want to make progress. (We could potentially make everything internal and start merging the bits we're happier with, for example.)

@googleapis googleapis deleted a comment May 20, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants