Skip to content

Create ServiceToService package to standardize S2S calls authorization#842

Open
Dragemil wants to merge 2 commits intov10.0-previewfrom
feature/service-to-service
Open

Create ServiceToService package to standardize S2S calls authorization#842
Dragemil wants to merge 2 commits intov10.0-previewfrom
feature/service-to-service

Conversation

@Dragemil
Copy link
Contributor

@Dragemil Dragemil commented Feb 26, 2026

An attempt to standardize service to service authentication and authorization, using for authorization the same patterns that are used for the end client calls. Comments regarding anything are welcome.

The authentication heavily relies on some infrastructural entity (eg. an Istio service mesh) to enforce that incoming requests from all allowed services have a particular header with appropriate value attached, and I know this is possible. Without it (or with some misconfiguration of it), it's obviously very insecure, as such thing could be very easily spoofed.

On properly configured infrastructure, a request without the appropriate header would be short circuited with 403 status by the proxy, never reaching the destination service. It is also assumed that the external ingress will cut out and provide correct LNCD-Caller-Id headers.

This implementation is mainly concerned by the authorization on quite granular, application level, while also providing some defense in depth, thanks to not accepting requests from unknown services, even though it is mainly the infrastructure responsibility.

It works out of the box on the local envs, it's just not secure at all there, but we don't need the security there.

Some accompanying infrastructure with this would look a bit like this:
leancodepl/terraform-kubernetes-cluster#43
https://github.com/leancodepl/exampleapp/pull/386
https://github.com/leancodepl/exampleapp-deployment/pull/37

@github-actions
Copy link

Summary

Tests 📝 Passed ✅ Failed ❌ Skipped ⏭️ Other ❓ Flaky 🍂 Duration ⏱️
945 944 0 1 0 0 1m 52s

Skipped

Tests Status
LeanCode.Azure.Tests.AzureStorageAuditLogIntegrationTests.Ensure_that_all_logs_are_correctly_uploaded_to_multiple_files skipped ⏭️

🎉 No failed tests in this run.

Github Test Reporter by CTRF 💚

@codecov
Copy link

codecov bot commented Feb 26, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 85.38%. Comparing base (a927943) to head (ad5c454).

Additional details and impacted files
@@                Coverage Diff                @@
##           v10.0-preview     #842      +/-   ##
=================================================
+ Coverage          85.13%   85.38%   +0.25%     
=================================================
  Files                241      246       +5     
  Lines               5091     5180      +89     
  Branches             374      382       +8     
=================================================
+ Hits                4334     4423      +89     
  Misses               674      674              
  Partials              83       83              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR introduces a new LeanCode.ServiceToService package that standardizes service-to-service (S2S) authentication and authorization across microservices. The package leverages infrastructure-level header enforcement (e.g., service mesh, ingress middleware) for security, using the LNCD-Caller-Id header to identify calling services and map them to roles for authorization using the existing CQRS security framework.

Changes:

  • New ServiceToService package with incoming authentication handler and outgoing HTTP client extensions
  • Authentication handler that validates service callers via LNCD-Caller-Id header and assigns roles
  • Policy scheme support to route between S2S and default authentication based on caller identity
  • Comprehensive test coverage for all components

Reviewed changes

Copilot reviewed 13 out of 13 changed files in this pull request and generated 12 comments.

Show a summary per file
File Description
LeanCode.CoreLibrary.slnx Added new ServiceToService project and test project to solution
src/Infrastructure/LeanCode.ServiceToService/LeanCode.ServiceToService.csproj Project file with dependencies on ASP.NET Core and CQRS.Security
src/Infrastructure/LeanCode.ServiceToService/ServiceToServiceDefaults.cs Constants for header name, authentication scheme, and policy scheme
src/Infrastructure/LeanCode.ServiceToService/Outgoing/ServiceToServiceHttpClientExtensions.cs Extension to add caller identity header to outgoing HTTP requests
src/Infrastructure/LeanCode.ServiceToService/Incoming/ServiceToServiceAuthenticationOptions.cs Configuration options for authentication including caller-to-roles mapping
src/Infrastructure/LeanCode.ServiceToService/Incoming/ServiceToServiceAuthenticationHandler.cs Authentication handler that validates caller ID and creates claims principal
src/Infrastructure/LeanCode.ServiceToService/Incoming/ServiceToServiceAuthenticationExtensions.cs DI registration extensions for S2S authentication and policy scheme
src/Infrastructure/LeanCode.ServiceToService/Incoming/ServiceToServiceCallerRolesValidator.cs Startup validator ensuring configured roles exist in role registry
test/Infrastructure/LeanCode.ServiceToService.Tests/LeanCode.ServiceToService.Tests.csproj Test project file
test/Infrastructure/LeanCode.ServiceToService.Tests/Outgoing/ServiceToServiceHttpClientExtensionsTests.cs Tests for outgoing HTTP client header injection
test/Infrastructure/LeanCode.ServiceToService.Tests/Incoming/ServiceToServiceAuthenticationHandlerTests.cs Tests for authentication handler covering various scenarios
test/Infrastructure/LeanCode.ServiceToService.Tests/Incoming/ServiceToServiceAuthenticationExtensionsTests.cs Tests for policy scheme routing logic
test/Infrastructure/LeanCode.ServiceToService.Tests/Incoming/ServiceToServiceCallerRolesValidatorTests.cs Tests for startup validation of caller roles

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +8 to +14
{
public string NameClaimType { get; set; } = "sub";
public string RoleClaimType { get; set; } = "role";
public FrozenDictionary<string, FrozenSet<string>> CallerRoles { get; set; } =
FrozenDictionary<string, FrozenSet<string>>.Empty;
public bool RejectUnknownCallers { get; set; } = true;
public bool RejectMissingCallerId { get; set; } = true;
Copy link

Copilot AI Feb 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ServiceToServiceAuthenticationOptions properties lack XML documentation explaining their purpose and default values. Consider adding XML doc comments for each property to explain: what NameClaimType and RoleClaimType control, how CallerRoles maps service identifiers to roles, and the behavior of the rejection flags.

Suggested change
{
public string NameClaimType { get; set; } = "sub";
public string RoleClaimType { get; set; } = "role";
public FrozenDictionary<string, FrozenSet<string>> CallerRoles { get; set; } =
FrozenDictionary<string, FrozenSet<string>>.Empty;
public bool RejectUnknownCallers { get; set; } = true;
public bool RejectMissingCallerId { get; set; } = true;
{
/// <summary>
/// Gets or sets the claim type that contains the caller identifier used as the logical service name.
/// </summary>
/// <remarks>
/// Defaults to <c>"sub"</c>. This claim is used to look up entries in <see cref="CallerRoles"/>.
/// </remarks>
public string NameClaimType { get; set; } = "sub";
/// <summary>
/// Gets or sets the claim type that contains the roles assigned to the caller.
/// </summary>
/// <remarks>
/// Defaults to <c>"role"</c>. Values of this claim are interpreted as application roles.
/// </remarks>
public string RoleClaimType { get; set; } = "role";
/// <summary>
/// Gets or sets the mapping between caller identifiers and the roles they are allowed to assume.
/// </summary>
/// <remarks>
/// The key is the caller identifier (as taken from the claim specified by <see cref="NameClaimType"/>),
/// and the value is the set of roles that the caller is permitted to use. Defaults to an empty mapping,
/// meaning no callers are explicitly configured.
/// </remarks>
public FrozenDictionary<string, FrozenSet<string>> CallerRoles { get; set; } =
FrozenDictionary<string, FrozenSet<string>>.Empty;
/// <summary>
/// Gets or sets a value indicating whether callers that are not present in <see cref="CallerRoles"/> are rejected.
/// </summary>
/// <remarks>
/// Defaults to <see langword="true" />. When <see langword="true" />, any caller whose identifier is not
/// configured in <see cref="CallerRoles"/> will be denied access.
/// </remarks>
public bool RejectUnknownCallers { get; set; } = true;
/// <summary>
/// Gets or sets a value indicating whether requests that do not contain a caller identifier claim are rejected.
/// </summary>
/// <remarks>
/// Defaults to <see langword="true" />. When <see langword="true" />, requests without a claim of type
/// <see cref="NameClaimType"/> will be denied access.
/// </remarks>
public bool RejectMissingCallerId { get; set; } = true;
/// <summary>
/// Gets or sets a value indicating whether <see cref="CallerRoles"/> should be validated when the application starts.
/// </summary>
/// <remarks>
/// Defaults to <see langword="true" />. When enabled, misconfigurations in <see cref="CallerRoles"/> may cause
/// startup-time failures instead of runtime authorization errors.
/// </remarks>

Copilot uses AI. Check for mistakes.
namespace LeanCode.ServiceToService.Incoming;

/// <summary>
/// Authenticates callers based on <c>LNCD-Caller-Id</c>.
Copy link

Copilot AI Feb 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The XML documentation comment uses <c> tags for the header name, but does not mention that the header value should be the service identifier. Consider adding a parameter or return value documentation to clarify what values are expected in this header and how they map to the CallerRoles configuration.

Suggested change
/// Authenticates callers based on <c>LNCD-Caller-Id</c>.
/// Authenticates callers based on <c>LNCD-Caller-Id</c>.
/// The header value must be the service identifier of the calling service, which is used as a key in
/// <see cref="ServiceToServiceAuthenticationOptions.CallerRoles" /> to resolve the caller's roles.

Copilot uses AI. Check for mistakes.
Comment on lines +53 to +65
if (
!context.Request.Headers.TryGetValue(
ServiceToServiceDefaults.CallerIdHeaderName,
out var values
) || string.IsNullOrWhiteSpace(values.ToString())
)
{
return fallbackToDefaultSchemeOnMissingHeader
? defaultScheme
: ServiceToServiceDefaults.AuthenticationScheme;
}

var callerId = values.ToString();
Copy link

Copilot AI Feb 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The expression values.ToString() is called twice (lines 57 and 65). Consider extracting it to a local variable after line 57 to avoid redundant string allocation and improve performance.

Copilot uses AI. Check for mistakes.
Comment on lines +10 to +28
[Fact]
public void Adds_caller_identity_header_to_configured_http_client()
{
const string ServiceName = "notifications-service";

using var provider = new ServiceCollection()
.AddHttpClient("outgoing")
.AddCallerIdentity(ServiceName)
.Services.BuildServiceProvider();

var clientFactory = provider.GetRequiredService<IHttpClientFactory>();
var client = clientFactory.CreateClient("outgoing");

client
.DefaultRequestHeaders.Should()
.ContainSingle(kv => kv.Key == ServiceToServiceDefaults.CallerIdHeaderName)
.Which.Value.Should()
.ContainSingle(ServiceName);
}
Copy link

Copilot AI Feb 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider adding test coverage for edge cases such as empty strings, whitespace-only strings, and null values for the serviceName parameter. This would ensure the method handles invalid input gracefully or fails with clear error messages.

Copilot uses AI. Check for mistakes.
Comment on lines +2 to +6

public static class ServiceToServiceDefaults
{
public const string CallerIdHeaderName = "LNCD-Caller-Id";
public const string AuthenticationScheme = "ServiceToService";
Copy link

Copilot AI Feb 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ServiceToServiceDefaults class and its constants lack XML documentation. Consider adding documentation that explains the purpose of each constant, especially noting that CallerIdHeaderName is the header that must be enforced by infrastructure and PolicyScheme is for combining S2S with another authentication scheme.

Suggested change
public static class ServiceToServiceDefaults
{
public const string CallerIdHeaderName = "LNCD-Caller-Id";
public const string AuthenticationScheme = "ServiceToService";
/// <summary>
/// Provides default names and configuration values used for service-to-service (S2S) authentication.
/// </summary>
public static class ServiceToServiceDefaults
{
/// <summary>
/// The name of the HTTP header that identifies the calling service in S2S requests.
/// </summary>
/// <remarks>
/// This header must be enforced and populated by the infrastructure (for example, an API gateway
/// or reverse proxy) so that downstream services can reliably determine the authenticated caller.
/// </remarks>
public const string CallerIdHeaderName = "LNCD-Caller-Id";
/// <summary>
/// The default authentication scheme name used for service-to-service authentication handlers.
/// </summary>
public const string AuthenticationScheme = "ServiceToService";
/// <summary>
/// The name of the policy scheme used to combine S2S authentication with another authentication scheme.
/// </summary>
/// <remarks>
/// This policy scheme can be configured to select between the S2S authentication scheme and another
/// scheme (for example, a user-based authentication scheme), depending on the incoming request.
/// </remarks>

Copilot uses AI. Check for mistakes.
Comment on lines +22 to +36
if (
!Request.Headers.TryGetValue(ServiceToServiceDefaults.CallerIdHeaderName, out var values)
|| string.IsNullOrWhiteSpace(values.ToString())
)
{
if (Options.RejectMissingCallerId)
{
LogMissingCallerIdHeader(Logger, ServiceToServiceDefaults.CallerIdHeaderName);
return Task.FromResult(AuthenticateResult.Fail("Missing caller identity header."));
}

return Task.FromResult(AuthenticateResult.NoResult());
}

var callerId = values.ToString();
Copy link

Copilot AI Feb 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to the PolicyScheme implementation, values.ToString() is called twice in this method (lines 24 and 36). Consider extracting it to a local variable after the null check on line 24 to avoid redundant string allocation.

Copilot uses AI. Check for mistakes.
Comment on lines +80 to +114
[Fact]
public async Task Returns_success_with_subject_and_role_claims_for_known_caller()
{
var handler = ConfigureServices(options =>
{
options.NameClaimType = "sub";
options.RoleClaimType = "role";
options.CallerRoles = new Dictionary<string, FrozenSet<string>>
{
["notifications-service"] = ["system_notifications_service", "system_admin_panel"],
}.ToFrozenDictionary();
});

var result = await AuthenticateAsync(
handler,
new() { [ServiceToServiceDefaults.CallerIdHeaderName] = "notifications-service" }
);

using var _ = new AssertionScope();
result.Succeeded.Should().BeTrue();
result.None.Should().BeFalse();
result.Failure.Should().BeNull();
result.Ticket.Should().NotBeNull();
result.Principal.Should().NotBeNull();

result
.Principal!.Claims.Should()
.ContainSingle(c => c.Type == "sub")
.Which.Value.Should()
.Be("notifications-service");

var roleClaims = result.Principal.Claims.Where(c => c.Type == "role");
roleClaims.Should().HaveCount(2);
roleClaims.Select(c => c.Value).Should().BeEquivalentTo("system_notifications_service", "system_admin_panel");
}
Copy link

Copilot AI Feb 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The authentication handler uses values.ToString() which concatenates multiple header values with commas. Consider adding test coverage for the scenario where multiple values are present in the LNCD-Caller-Id header to verify the expected behavior (rejection or specific handling).

Copilot uses AI. Check for mistakes.
Comment on lines +9 to +32
public ValidateOptionsResult Validate(string? name, ServiceToServiceAuthenticationOptions options)
{
if (!options.ValidateCallerRolesAtStartup)
{
return ValidateOptionsResult.Skip;
}

var availableRoles = roleRegistry.All.Select(role => role.Name).ToHashSet(StringComparer.Ordinal);
var unknownRoles = options
.CallerRoles.Values.SelectMany(roles => roles)
.Distinct(StringComparer.Ordinal)
.Where(role => !availableRoles.Contains(role))
.Order(StringComparer.Ordinal)
.ToArray();

if (unknownRoles.Length == 0)
{
return ValidateOptionsResult.Success;
}

return ValidateOptionsResult.Fail(
$"Unknown role names configured in CallerRoles: {string.Join(", ", unknownRoles)}."
);
}
Copy link

Copilot AI Feb 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The validator correctly validates CallerRoles, but it doesn't validate whether CallerRoles is empty when ValidateCallerRolesAtStartup is true. Consider whether an empty CallerRoles dictionary (which would reject all service calls) should trigger a warning or validation failure to catch potential misconfigurations.

Copilot uses AI. Check for mistakes.
Comment on lines +7 to +12
public static IHttpClientBuilder AddCallerIdentity(this IHttpClientBuilder builder, string serviceName)
{
return builder.ConfigureHttpClient(client =>
client.DefaultRequestHeaders.Add(ServiceToServiceDefaults.CallerIdHeaderName, serviceName)
);
}
Copy link

Copilot AI Feb 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The AddCallerIdentity method lacks XML documentation explaining what it does and how the serviceName parameter should be configured. Consider adding documentation that explains this method adds the LNCD-Caller-Id header to all requests made by this HttpClient, and that the serviceName should match a key in the CallerRoles configuration on the receiving service.

Copilot uses AI. Check for mistakes.
Copy link
Member

@marcin-dardzinski marcin-dardzinski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Without it (or with some misconfiguration of it), it's obviously very insecure, as such thing could be very easily spoofed.

So is there anything the app can do about it now, or we just pray it's configured?

It works out of the box on the local envs, it's just not secure at all there, but we don't need the security there.

Istio on local clusters? 🫦

return Task.FromResult(AuthenticateResult.NoResult());
}

var claims = new List<Claim> { new(Options.NameClaimType, callerId) };
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This allows only to set name / role claims on the generated identity. Maybe we should allow more control here, e.g. instead CallerRoles we should have, TryCreateClient(string clientId, out ClaimsIdentity) or sth.

With a helper / default method that just reads the roles from a dict.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, but would we ever need anything more? I guess maybe, but I'm trying to think of some examples.


public static class ServiceToServiceHttpClientExtensions
{
public static IHttpClientBuilder AddCallerIdentity(this IHttpClientBuilder builder, string serviceName)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe AddServiceToServiceCallerIdentity?

Also, unless ServiceToServiceDefaults are really defaults and not consts. I'd do sth like this AddIdentity(IHttpClientBuilder builder, string callerId, string headerName = ServiceToServiceDefaults.CallerIdHeaderName)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fine for the name.

Well, the ServiceToServiceDefaults sometimes are defaults (authorization scheme) and sometimes are consts (header name).

@Dragemil
Copy link
Contributor Author

So is there anything the app can do about it now, or we just pray it's configured?

Well' that's the main point of this RFC, cause I don't have any better idea. The good thing is that it's actually very easy to have a secure by default Istio setup, that will lean into disallowing requests you haven't explicitly allowed.

Istio on local clusters? 🫦

Hell no.
I mean, we could, but I wouldn't absolutely go this way. The local envs should be more frictionless, rather than secure.

@Dragemil
Copy link
Contributor Author

So is there anything the app can do about it now, or we just pray it's configured?

I wonder if this S2S auth shouldn't also check for some project wide secret, and only if it is correct, then check for the well known header values. It would mitigate the ease of shooting one in the foot at least, in terms of just providing a header from the outside of the ingress that wouldn't be stripped.

@jakubfijalkowski
Copy link
Member

Maybe we can see if the Istio is running and the namespace is managed by Istio and fail if that is not the case? Plus allow that on development environments.

@Dragemil
Copy link
Contributor Author

Maybe we can see if the Istio is running and the namespace is managed by Istio and fail if that is not the case? Plus allow that on development environments.

You would need to give the workload some additional k8s RBAC to check on the namespace label, and maybe some other roles to check whether some Istio pods are running. And it'd still be just a heuristic, since you probably wouldn't check if there is a default deny policy on the namespace (but if you'd check for that, then that would stop being just a heuristic I think).

So I think it's quite more complex solution, while still is just a heuristic, so I think i like the project x env api key more, since it's simpler and would work simply on the local env.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants