fix: remove # from ingress snippet #110

Merged
vigneshrajsb merged 1 commit into main from fix-snippet
Feb 24, 2026

@vigneshrajsb commented Feb 24, 2026

Problem

The nginx-ingress controller pods were consuming excessive memory and getting OOMKilled.

Root Cause: # in PR value corrupts nginx-ingress cleanConf depth tracking

The nginx-ingress controller generates nginx.conf by rendering a Go template and then post-processing it through a function called cleanConf. This function re-indents the config by tracking brace depth — it increments depth on { and decrements on }, writing that many tabs at the start of each line.

Critical flaw: cleanConf treats # as a comment character in all contexts (it has no awareness of nginx string literals). Once it encounters #, it enters comment mode for the rest of that line — meaning any { or } characters after the # are ignored for depth tracking.

The ingressBannerSnippet function generated a configuration-snippet containing:

sub_filter "</head>" '<script>window.LFC_BANNER = [{"label":"uuid",...},{"label":"PR","value":"#20845","url":"..."},...];</script></head>';

On the single long line containing the JSON array, the sequence is:

...{"label":"PR","value":"#20845","url":"..."},{"label":"sha"...},{...},...];
                          ^
                   cleanConf enters comment mode here

When cleanConf hits the # inside "value":"#20845", it stops counting braces for the rest of the line. The { that opened the PR object has already been counted, but its matching } has not. The subsequent {} pairs (sha, branch, service name, build) are skipped as well, but since they are balanced, skipping them has no net effect. The result is that the depth counter is left permanently +1 after each such location block is processed.
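
The behaviour described above can be reproduced with a small model. This is a simplified sketch, not the actual Go implementation of cleanConf: it assumes only what is stated here, namely that # starts a comment running to the end of the line and that braces inside a comment are not counted. The function name finalDepth and the shortened banner lines are hypothetical.

```typescript
// Simplified model of the depth tracking described in this PR.
// NOT the real cleanConf: it only mirrors the stated behaviour that '#'
// starts a comment in every context, including inside string literals.
function finalDepth(conf: string): number {
  let depth = 0;
  for (const line of conf.split("\n")) {
    let inComment = false; // comment mode resets at every newline
    for (let i = 0; i < line.length; i++) {
      const ch = line.charAt(i);
      if (inComment) continue; // braces after '#' are ignored
      if (ch === "#") inComment = true;
      else if (ch === "{") depth++;
      else if (ch === "}") depth--;
    }
  }
  return depth;
}

// Hypothetical, shortened banner lines (the real snippet has more fields).
const withHash = `sub_filter "</head>" '<script>window.LFC_BANNER = [{"label":"uuid"},{"label":"PR","value":"#20845"},{"label":"sha"}];</script></head>';`;
const withoutHash = withHash.replace("#20845", "20845");

console.log(finalDepth(withHash));    // 1: the PR object's closing brace is never counted
console.log(finalDepth(withoutHash)); // 0: all braces are balanced
```

In this model the only difference between the two inputs is the # character, which is enough to leave one level of depth behind per line.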

Impact

  • 211 ingresses across lifecycle environments had this pattern (all PR-based deployments)
  • Each ingress generates 2 server blocks (HTTP + HTTPS), each with a location containing the sub_filter
  • ~422 depth increments accumulated → 431 levels of tab indentation at the end of the file
  • The generated nginx.conf grew to 35MB (vs. a normal <1MB)
  • During config reload, NGINX loads the new 35MB config alongside the old one in memory, causing transient spikes that breached the 1Gi limit and triggered OOMKills
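
The arithmetic behind these figures can be sketched as follows. The ingress and server-block counts come from the list above; the growth model is an assumption for illustration only: it supposes that each affected block is followed by a fixed number of lines (the hypothetical parameter linesPerBlock) that all inherit the surplus indentation. It is not a measurement of the real file.

```typescript
// Figures taken from the Impact list.
const ingresses = 211;
const serverBlocksPerIngress = 2; // HTTP + HTTPS

// Each affected location block leaves the depth counter one level higher.
const leakedDepth = ingresses * serverBlocksPerIngress;
console.log(leakedDepth); // 422

// Rough model (assumption): after the k-th affected block, the next
// `linesPerBlock` lines each carry k surplus tabs, so the total surplus is
// linesPerBlock * (1 + 2 + ... + blocks).
function extraTabBytes(blocks: number, linesPerBlock: number): number {
  return (linesPerBlock * (blocks * (blocks + 1))) / 2;
}

console.log(extraTabBytes(leakedDepth, 1)); // 89253 surplus tabs per line-per-block
```

Under this model the surplus grows quadratically with the number of affected blocks, which is why a leak of a single level per block can inflate the file by orders of magnitude.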

This explains why the old cluster handled 500+ ingresses within 1Gi but this cluster could not — the old cluster did not have lifecycle environment ingresses with this #-in-value pattern at this scale.

Fix

Remove the # prefix from the PR value in the LFC_BANNER JSON:

- value: `#${pullRequestNumber}` || '',
+ value: `${pullRequestNumber}` || '',

The PR number without # is still meaningful since the full PR URL is already present in the url field. This eliminates the # character from the nginx snippet entirely, preventing cleanConf from entering comment mode mid-line.
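
A template literal always evaluates to a string, so the `|| ''` fallback in the diff only applies when the interpolated value is itself empty; an undefined pullRequestNumber would render as the text "undefined". A more defensive variant might look like the sketch below. The helper formatPrValue is hypothetical and is not part of the merged change.

```typescript
// Hypothetical helper, not the code merged in this PR.
// Returns the PR number as a string, guaranteed to contain no leading '#'.
function formatPrValue(pullRequestNumber?: number | string | null): string {
  if (pullRequestNumber === undefined || pullRequestNumber === null) {
    return ""; // explicit fallback instead of relying on `|| ''`
  }
  // Strip any leading '#' so it cannot reach the generated nginx snippet.
  return String(pullRequestNumber).replace(/^#+/, "");
}

console.log(formatPrValue(20845));     // "20845"
console.log(formatPrValue("#20845"));  // "20845"
console.log(formatPrValue(undefined)); // ""
```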

Verification

After this change is deployed, the nginx.conf size should drop from ~35MB back to a normal range (<2MB), and memory usage per pod should stabilize well below 1Gi even during reloads.
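
Besides file size, the indentation depth itself is a quick signal. The sketch below assumes the rendered nginx.conf has already been read into a string (how it is fetched from the pod is left out) and reports the deepest run of leading tabs. The function name maxLeadingTabs is hypothetical.

```typescript
// Returns the largest number of leading tab characters on any line.
// A healthy config should report a small value; the broken config in this
// PR reportedly reached 431 levels.
function maxLeadingTabs(conf: string): number {
  let max = 0;
  for (const line of conf.split("\n")) {
    let tabs = 0;
    while (tabs < line.length && line.charAt(tabs) === "\t") tabs++;
    if (tabs > max) max = tabs;
  }
  return max;
}

console.log(maxLeadingTabs("http {\n\tserver {\n\t\tlisten 80;\n\t}\n}")); // 2
```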

@vigneshrajsb marked this pull request as ready for review February 24, 2026 01:29
@vigneshrajsb requested a review from a team as a code owner February 24, 2026 01:29
@vigneshrajsb merged commit 8723704 into main Feb 24, 2026
1 check passed