- 
                Notifications
    You must be signed in to change notification settings 
- Fork 406
Be mindful of other logging context filters in 3rd-party code #19068
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Be mindful of other logging context filters in 3rd-party code #19068
Conversation
01d945b    to
    0256915      
    Compare
  
    | def safe_set(attr: str, value: Any) -> None: | ||
| """ | ||
| Only write the attribute if it hasn't already been set or we actually have | ||
| a Synapse logcontext (indicating that this log record is relevant to | ||
| Synapse). | ||
| """ | ||
| if context is not SENTINEL_CONTEXT or not hasattr(record, attr): | ||
| setattr(record, attr, value) | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the abstraction worth it? Should I just inline the usage?
Originally, I thought I would have to use it more for all of the request attributes below but turns out we can do a little optimization to avoid it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not move setting the server_name under the request attribute? Then you'd only have one usage of safe_set, and then it'd be even more clear that it's not necessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not following 🙇
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I meant structuring the code to be be something like the following:
        context = current_context()
        record.request = self._default_request
        # Avoid overwriting an existing `server_name` on the record. This is running in
        # the context of a global log record filter so there may be 3rd-party code that
        # adds their own `server_name` and we don't want to interfere with that.
        if not hasattr(record, "server_name"):
            record.server_name = "unknown_server_from_no_logcontext"
        # context should never be None, but if it somehow ends up being, then
        # we end up in a death spiral of infinite loops, so let's check, for
        # robustness' sake.
        if context is not None:
            # Add some data from the HTTP request.
            request = context.request
            # The sentinel logcontext has no request so if we get past this point, we
            # know we have some actual Synapse logcontext and don't need to worry about
            # using `safe_set`. We'll consider this an optimization since this is a
            # pretty hot-path.
            if request is None:
                return True
            def safe_set(attr: str, value: Any) -> None:
                """
                Only write the attribute if it hasn't already been set.
                """
                if not hasattr(record, attr):
                    setattr(record, attr, value)
            safe_set("server_name", context.server_name)
            # Logging is interested in the request ID. Note that for backwards
            # compatibility this is stored as the "request" on the record.
            safe_set("request", str(context))
            record.ip_address = request.ip_address
            record.site_tag = request.site_tag
            record.requester = request.requester
            record.authenticated_entity = request.authenticated_entity
            record.method = request.method
            record.url = request.url
            record.protocol = request.protocol
            record.user_agent = request.user_agent
        return Truewhere you get the check for SENTINEL out of the way early, and then run code that assumes a non-sentinel logcontext.
Though, I now realise safe_set was checking both that this wasn't the SENTINEL_CONTEXT, as well as checking that the attribute wasn't already set. So the above doesn't completely eliminate the need for safe_set.
Though if safe_set is now only checking that the attribute is not already set, then it becomes even less necessary, and can probably be removed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I'd also refactor this block like so to save on indentation:
        # context should never be None, but if it somehow ends up being, then
        # we end up in a death spiral of infinite loops, so let's check, for
        # robustness' sake.
        if context is None:
            return True
        # Add some data from the HTTP request.
        request = context.request
        ...
        return TrueThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I meant structuring the code to be be something like the following: [...] where you get the check for SENTINEL out of the way early, and then run code that assumes a non-sentinel logcontext.
This won't work because not everything is logged with a context.request (like start-up messages, background jobs, etc)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Merged but if we come up with a better pattern, I can make a follow-up PR ⏩
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I was wondering if there was something like that. Thanks for explaining, looks fine to me then.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No blocking comments - overall this looks good to me.
| def safe_set(attr: str, value: Any) -> None: | ||
| """ | ||
| Only write the attribute if it hasn't already been set or we actually have | ||
| a Synapse logcontext (indicating that this log record is relevant to | ||
| Synapse). | ||
| """ | ||
| if context is not SENTINEL_CONTEXT or not hasattr(record, attr): | ||
| setattr(record, attr, value) | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not move setting the server_name under the request attribute? Then you'd only have one usage of safe_set, and then it'd be even more clear that it's not necessary.
| Thanks for the review @anoadragon453 🦨 | 
Be mindful that Synapse can be run alongside other code in the same Python process. We shouldn't overwrite fields on given log record unless we know it's relevant to Synapse.
(no clobber)
Background
As part of Element's plan to support a light form of vhosting (virtual host) (multiple instances of Synapse in the same Python process), we're currently diving into the details and implications of running multiple instances of Synapse in the same Python process.
"Per-tenant logging" tracked internally by https://github.com/element-hq/synapse-small-hosts/issues/48
Pull Request Checklist
EventStoretoEventWorkerStore.".code blocks.