Introduce `Clock.call_when_running(...)` to wrap startup code in a
logcontext, ensuring we can identify which server generated the logs.
Background:
> Ideally, nothing from the Synapse homeserver would be logged against the `sentinel`
> logcontext as we want to know which server the logs came from. In practice, this is not
> always the case yet especially outside of request handling.
>
> Global things outside of Synapse (e.g. Twisted reactor code) should run in the
> `sentinel` logcontext. It's only when it calls into application code that a logcontext
> gets activated. This means the reactor should be started in the `sentinel` logcontext,
> and any time an awaitable yields control back to the reactor, it should reset the
> logcontext to be the `sentinel` logcontext. This is important to avoid leaking the
> current logcontext to the reactor (which would then get picked up and associated with
> the next thing the reactor does).
>
> *-- `docs/log_contexts.md`
Also adds a lint to prefer `Clock.call_when_running(...)` over
`reactor.callWhenRunning(...)`
Part of https://github.com/element-hq/synapse/issues/18905
This can be reviewed commit by commit
There are a few improvements over the experimental support:
- authorisation of Synapse <-> MAS requests is simplified, with a single
shared secret, removing the need for provisioning a client on the MAS
side
- the tests actually spawn a real server, allowing us to test the rust
introspection layer
- we now check that the device advertised in introspection actually
exist, making it so that when a user logs out, the tokens are
immediately invalidated, even if the cache doesn't expire
- it doesn't rely on discovery anymore, rather on a static endpoint
base. This means users don't have to override the introspection endpoint
to avoid internet roundtrips
- it doesn't depend on `authlib` anymore, as we simplified a lot the
calls done from Synapse to MAS
We still have to update the MAS documentation about the Synapse setup,
but that can be done later.
---------
Co-authored-by: reivilibre <oliverw@element.io>
Bulk refactor `Counter` metrics to be homeserver-scoped. We also add
lints to make sure that new `Counter` metrics don't sneak in without
using the `server_name` label (`SERVER_NAME_LABEL`).
All of the "Fill in" commits are just bulk refactor.
Part of https://github.com/element-hq/synapse/issues/18592
### Testing strategy
1. Add the `metrics` listener in your `homeserver.yaml`
```yaml
listeners:
# This is just showing how to configure metrics either way
#
# `http` `metrics` resource
- port: 9322
type: http
bind_addresses: ['127.0.0.1']
resources:
- names: [metrics]
compress: false
# `metrics` listener
- port: 9323
type: metrics
bind_addresses: ['127.0.0.1']
```
1. Start the homeserver: `poetry run synapse_homeserver --config-path
homeserver.yaml`
1. Fetch `http://localhost:9322/_synapse/metrics` and/or
`http://localhost:9323/metrics`
1. Observe response includes the `synapse_user_registrations_total`,
`synapse_http_server_response_count_total`, etc metrics with the
`server_name` label
We do this by shoving it into Rust. We believe our python http client is
a bit slow.
Also bumps minimum rust version to 1.81.0, released last September (over
six months ago)
To allow for async Rust, includes some adapters between Tokio in Rust
and the Twisted reactor in Python.
Since MAS 0.13.0, the provisionning of devices and users is done
synchronously and reliably enough that we don't need to auto-provision
on the Synapse side anymore.
It's important to remove this behaviour if we want to start caching
token introspection results.
Evolution of
cd78f3d2ee
This cache does not have any explicit invalidation, but this is deemed
acceptable (see code comment).
We may still prefer to add it eventually, letting us bump up the
Time-To-Live (TTL) on the cache as we currently set a 2 minute expiry
to balance the fact that we have no explicit invalidation.
This cache makes several things more efficient:
- reduces number of outbound requests from Synapse, reducing CPU
utilisation + network I/O
- reduces request handling time in Synapse, which improves
client-visible latency
- reduces load on MAS and its database
---
Other than that, this PR also introduces support for `expires_in`
(seconds) on the introspection response.
This lets the cached responses expire at the proper expiry time of the
access token, whilst avoiding clock skew issues.
Corresponds to:
https://github.com/element-hq/matrix-authentication-service/pull/4241
---------
Signed-off-by: Olivier 'reivilibre <oliverw@matrix.org>
The context for this is that the Matrix spec allows basically anything
in the device ID. With MSC3861, we're restricting this to strings that
can be represented as scopes.
Whilst this works well for next-gen auth sessions, compatibility/legacy
sessions still can have characters that can't be encoded (mainly spaces)
in them.
To work around that, we added in MAS a behaviour where the device_id is
given as an explicit property of the token introspection response, and
remove it from the scope.
Because we don't expect users to rollout new Synapse and MAS versions in
sync, we needed a way to 'advertise' support for this behaviour: the
easiest way to do that was through an extra header in the introspection
response.
On the longer term, I expect MAS and Synapse to move away from the
introspection endpoint, and instead define a specific API for Synapse ->
MAS communication.
PR on the MAS side:
https://github.com/element-hq/matrix-authentication-service/pull/4067
This has been a problem with Element Web, as it will proble /register
with an empty body, which gave this error:
```
curl -d '{}' -HContent-Type:application/json /_matrix/client/v3/register
{"errcode": "M_UNKNOWN",
"error": "Invalid username"}
```
And Element Web would choke on it. This changes that so we reply
instead:
```
{"errcode": "M_FORBIDDEN",
"error": "Registration has been disabled. Only m.login.application_service registrations are allowed."}
```
Also adds a test for this.
See https://github.com/element-hq/element-web/issues/27993
---------
Co-authored-by: Andrew Morgan <andrew@amorgan.xyz>
Another PR on my quest to a `*_path` variant for every secret. Adds two
config options `admin_token_path` and `client_secret_path` to the
experimental config under `experimental_features.msc3861`. Also includes
tests.
I tried to be a good citizen here by following `attrs` conventions and
not rewriting the corresponding non-path variants in the class, but
instead adding methods to retrieve the value.
Reading secrets from files has the security advantage of separating the
secrets from the config. It also simplifies secrets management in
Kubernetes. Also useful to NixOS users.
This is an implementation of MSC4190, which allows appservices to manage
their user's devices without /login & /logout.
---------
Co-authored-by: Andrew Morgan <1342360+anoadragon453@users.noreply.github.com>
During the migration the automated script to update the copyright
headers accidentally got rid of some of the existing copyright lines.
Reinstate them.
* Describe `insert_client_ip`
* Pull out client_ips and MAU tracking to BaseAuth
* Define HAS_AUTHLIB once in tests
sick of copypasting
* Track ips and token usage when delegating auth
* Test that we track MAU and user_ips
* Don't track `__oidc_admin`