feat: add certificate metrics to agent for NGINXaaS #1731

Open
vivki wants to merge 4 commits into nginx:main from vivki:naas-1315-certificate-receiver

@vivki vivki commented Jun 11, 2026

Contributor

NGINXAAS-1315: Certificate expiry metric receiver

Motivation

As a platform engineer managing NGINXaaS deployments, I want to be alerted before a certificate expires. This alert should come from the same monitoring stack, following existing metrics patterns, and the metric labels should help identify which cert is the problem: common name, file path, algorithm, serial number.

nginx-agent already indexes every certificate nginx is using as part of config parsing. This change makes that data useful by exporting it as a metric, giving operators a simple threshold alert on nginx.certificate.expiry without any additional tooling.

The receiver is separate from the existing nginx/nginxplus receivers because it covers a distinct concern (TLS hygiene vs. traffic metrics), it can emit a lot of data points on cert-heavy deployments, and it should be easy to enable or disable independently.

Implementation

Adds a certificate OTel receiver that scrapes cert files via crypto/x509 every 15s and emits nginx.certificate.expiry, a gauge of the Unix timestamp at which each cert expires.

Attributes: file_path, public_key_algorithm, serial_number, subject.common_name
Resource attribute: instance.id | Gated on: FeatureCertificates
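
Because the gauge carries an absolute Unix timestamp, a threshold alert has to subtract the current time before comparing against a window. A minimal sketch of that arithmetic follows; the names `secondsRemaining` and `expiresWithin` and the 14-day window are illustrative assumptions, not part of this PR.

```go
package main

import (
	"fmt"
	"time"
)

// day is the number of seconds in 24 hours.
const day = int64(24 * 60 * 60)

// secondsRemaining converts an expiry timestamp (the value carried by
// nginx.certificate.expiry) into the time left, in seconds. The result is
// negative once the certificate has expired.
// Hypothetical helper, not taken from the PR.
func secondsRemaining(expiryUnix, nowUnix int64) int64 {
	return expiryUnix - nowUnix
}

// expiresWithin reports whether the certificate expires inside the alert
// window. Certificates that have already expired also match.
// Hypothetical helper, not taken from the PR.
func expiresWithin(expiryUnix, nowUnix, windowSeconds int64) bool {
	return secondsRemaining(expiryUnix, nowUnix) <= windowSeconds
}

func main() {
	now := time.Now().Unix()
	expiry := now + 10*day // a certificate that expires in ten days

	fmt.Println("alert with 14-day window:", expiresWithin(expiry, now, 14*day))
	fmt.Println("alert with 7-day window: ", expiresWithin(expiry, now, 7*day))
}
```

In a real deployment the same subtraction would normally live in the alerting rule of the monitoring backend rather than in Go.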

Renewals (same path, new cert) are reflected on the next scrape without a collector restart. The collector only restarts when the set of watched paths changes.
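
One way to decide whether the set of watched paths has changed is to compare the old and new lists as sets, so that reordering or duplicate entries alone do not trigger a restart. The sketch below illustrates that idea only; `samePathSet` is a hypothetical name, and the comparison the agent actually performs may differ.

```go
package main

import "fmt"

// samePathSet reports whether a and b name the same set of files,
// ignoring order and duplicate entries.
// Hypothetical helper, not taken from the PR.
func samePathSet(a, b []string) bool {
	want := make(map[string]struct{}, len(a))
	for _, p := range a {
		want[p] = struct{}{}
	}

	seen := make(map[string]struct{}, len(b))
	for _, p := range b {
		if _, ok := want[p]; !ok {
			return false // b contains a path that a does not
		}
		seen[p] = struct{}{}
	}

	// Every path in b is in a; the sets match if b also covers all of a.
	return len(seen) == len(want)
}

func main() {
	before := []string{"/etc/nginx/a.pem", "/etc/nginx/b.pem"}
	reordered := []string{"/etc/nginx/b.pem", "/etc/nginx/a.pem"}
	added := []string{"/etc/nginx/a.pem", "/etc/nginx/b.pem", "/etc/nginx/c.pem"}

	fmt.Println("restart after reorder:", !samePathSet(before, reordered))
	fmt.Println("restart after new path:", !samePathSet(before, added))
}
```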

Commit Descriptions

| Commit | Description |
| --- | --- |
| 3793b30 | Add mdatagen stability annotations to existing receivers (prerequisite, no behaviour change) |
| 0e8a6bf | Define metric schema (metadata.yaml) and config types |
| dfc308f | Run mdatagen, generated boilerplate only |
| ce0e4ec | Implement scraper, factory, plugin wiring, tests |

Checklist

Before creating a PR, run through this checklist and mark each as complete.

  • I have read the CONTRIBUTING document
  • I have run make install-tools and have attached any dependency changes to this pull request
  • If applicable, I have added tests that prove my fix is effective or that my feature works
  • If applicable, I have checked that any relevant tests pass after adding my changes
  • If applicable, I have updated any relevant documentation (README.md)
  • If applicable, I have tested my cross-platform changes on Ubuntu 22, Redhat 8, SUSE 15 and FreeBSD 13

@vivki vivki requested a review from a team as a code owner June 11, 2026 21:04
@github-actions github-actions Bot added the chore and documentation labels Jun 11, 2026

codecov Bot commented Jun 11, 2026

Codecov Report

❌ Patch coverage is 73.26007% with 73 lines in your changes missing coverage. Please review.
✅ Project coverage is 84.64%. Comparing base (e10a0d3) to head (54501f1).

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| internal/collector/otel_collector_plugin.go | 4.87% | 39 Missing ⚠️ |
| ...atereceiver/internal/metadata/generated_metrics.go | 82.85% | 17 Missing and 1 partial ⚠️ |
| ...catereceiver/internal/metadata/generated_config.go | 73.33% | 4 Missing and 4 partials ⚠️ |
| internal/collector/certificatereceiver/factory.go | 84.00% | 2 Missing and 2 partials ⚠️ |
| internal/collector/certificatereceiver/scraper.go | 91.83% | 2 Missing and 2 partials ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1731      +/-   ##
==========================================
- Coverage   84.88%   84.64%   -0.24%     
==========================================
  Files         105      111       +6     
  Lines       13632    13897     +265     
==========================================
+ Hits        11571    11763     +192     
- Misses       1538     1602      +64     
- Partials      523      532       +9     

| Files with missing lines | Coverage Δ |
| --- | --- |
| internal/collector/certificatereceiver/config.go | 100.00% <100.00%> (ø) |
| ...tereceiver/internal/metadata/generated_resource.go | 100.00% <100.00%> (ø) |
| ...r/cpuscraper/internal/metadata/generated_config.go | 75.75% <100.00%> (ø) |
| ...emoryscraper/internal/metadata/generated_config.go | 73.33% <100.00%> (ø) |
| internal/collector/factories.go | 100.00% <100.00%> (ø) |
| internal/config/types.go | 86.66% <100.00%> (-0.44%) ⬇️ |
| internal/collector/certificatereceiver/factory.go | 84.00% <84.00%> (ø) |
| internal/collector/certificatereceiver/scraper.go | 91.83% <91.83%> (ø) |
| ...catereceiver/internal/metadata/generated_config.go | 73.33% <73.33%> (ø) |
| ...atereceiver/internal/metadata/generated_metrics.go | 82.85% <82.85%> (ø) |
... and 1 more

status:
  class: receiver
  stability:

Contributor Author

Note: the mdatagen schema requires stability to be defined at both the receiver level (status.stability: beta: [metrics]) and the metric level (stability.level: development)

@vivki vivki force-pushed the naas-1315-certificate-receiver branch from e563b15 to 4ba54d3 Compare June 12, 2026 19:43
@vivki vivki changed the title from "Naas 1315 certificate receiver" to "feat: add certificate metrics to agent for NGINXaaS" Jun 15, 2026
	}

	for _, path := range c.cfg.CertFilePaths {
		cert, err := parseCertFile(path)

Member

A couple of issues to think about here:

  1. There are potentially a lot of certs, and parsing them can be non-trivial work to do every 15s.
  2. A path may contain more than one certificate.

Do we have any notification mechanism for when c.cfg changes?

One option: go through all the file paths once to extract all the certs, and keep a list of them with the data we need for each one (expiration, path, pubkeyalgo, serial, etc.) as well as the file's mtime.

Then on each scrape we just iterate through that list and stat each file to see whether it has changed and needs to be reparsed.

Contributor Author

Made some changes to the scraper to address your feedback:

  1. there are potentially a lot of certs and parsing them can be non-trivial work to do every 15s.

Added an mtime-based cache. Each scrape does os.Stat per file; if mtime is unchanged we skip the read+parse and use cached certs.
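
A minimal sketch of the mtime check, under stated assumptions: `fileCache`, `cacheEntry`, `load`, and `demoReads` are hypothetical names, and the cached payload is the raw file bytes standing in for parsed certificates. The scraper in this PR may be structured differently.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"time"
)

// cacheEntry remembers what was loaded from a file and the mtime it had then.
type cacheEntry struct {
	modTime time.Time
	data    []byte // stand-in for the parsed certificates
}

// fileCache skips the read when a file's mtime has not changed.
// Hypothetical type, not taken from the PR.
type fileCache struct {
	entries map[string]cacheEntry
	reads   int // number of real file reads, kept for illustration
}

func newFileCache() *fileCache {
	return &fileCache{entries: make(map[string]cacheEntry)}
}

// load stats the file on every call and re-reads it only when the mtime
// differs from the cached one.
func (c *fileCache) load(path string) ([]byte, error) {
	info, err := os.Stat(path)
	if err != nil {
		delete(c.entries, path) // drop stale data for a missing file
		return nil, err
	}
	if e, ok := c.entries[path]; ok && e.modTime.Equal(info.ModTime()) {
		return e.data, nil // unchanged: reuse the cached result
	}
	data, err := os.ReadFile(path)
	if err != nil {
		return nil, err
	}
	c.reads++
	c.entries[path] = cacheEntry{modTime: info.ModTime(), data: data}
	return data, nil
}

// demoReads loads a temp file twice, simulates a renewal, loads it again,
// and returns how many real reads happened.
func demoReads() (int, error) {
	dir, err := os.MkdirTemp("", "certcache")
	if err != nil {
		return 0, err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "server.pem")
	if err := os.WriteFile(path, []byte("v1"), 0o600); err != nil {
		return 0, err
	}
	c := newFileCache()
	for i := 0; i < 2; i++ { // the second call is served from the cache
		if _, err := c.load(path); err != nil {
			return 0, err
		}
	}

	// Simulate a renewal: new content and a clearly different mtime.
	if err := os.WriteFile(path, []byte("v2"), 0o600); err != nil {
		return 0, err
	}
	future := time.Now().Add(time.Hour)
	if err := os.Chtimes(path, future, future); err != nil {
		return 0, err
	}
	data, err := c.load(path)
	if err != nil {
		return 0, err
	}
	if string(data) != "v2" {
		return 0, fmt.Errorf("stale data after renewal: %q", data)
	}
	return c.reads, nil
}

func main() {
	reads, err := demoReads()
	fmt.Println("file reads:", reads, "error:", err)
}
```

An mtime comparison can miss a rewrite that lands within the filesystem's timestamp granularity; comparing file size as well is a common extra guard.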

  2. A path may contain more than one certificate

parseCertFile now loops pem.Decode until exhausted instead of stopping at the first block. Each cert gets its own data point.
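
The decode loop can be sketched as follows. This is an illustration rather than the PR's code: `parseCertsPEM` and `selfSignedPEM` are hypothetical names, and skipping non-CERTIFICATE blocks and failing on the first unparsable certificate are assumptions about error handling.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"encoding/pem"
	"fmt"
	"math/big"
	"time"
)

// parseCertsPEM returns every certificate found in a PEM bundle.
// Hypothetical helper, not taken from the PR.
func parseCertsPEM(data []byte) ([]*x509.Certificate, error) {
	var certs []*x509.Certificate
	for {
		var block *pem.Block
		block, data = pem.Decode(data) // data becomes the undecoded remainder
		if block == nil {
			break // no further PEM blocks
		}
		if block.Type != "CERTIFICATE" {
			continue // assumption: skip keys and other block types
		}
		cert, err := x509.ParseCertificate(block.Bytes)
		if err != nil {
			return nil, fmt.Errorf("parse certificate: %w", err)
		}
		certs = append(certs, cert)
	}
	return certs, nil
}

// selfSignedPEM builds a throwaway self-signed certificate for the demo.
func selfSignedPEM(commonName string, validHours int) ([]byte, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, err
	}
	now := time.Now()
	tmpl := &x509.Certificate{
		SerialNumber: big.NewInt(now.UnixNano()),
		Subject:      pkix.Name{CommonName: commonName},
		NotBefore:    now.Add(-time.Minute),
		NotAfter:     now.Add(time.Duration(validHours) * time.Hour),
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return nil, err
	}
	return pem.EncodeToMemory(&pem.Block{Type: "CERTIFICATE", Bytes: der}), nil
}

func main() {
	leaf, err := selfSignedPEM("leaf.example", 24)
	if err != nil {
		panic(err)
	}
	issuer, err := selfSignedPEM("issuer.example", 48)
	if err != nil {
		panic(err)
	}

	// One file holding two certificates, as in a chain bundle.
	certs, err := parseCertsPEM(append(leaf, issuer...))
	if err != nil {
		panic(err)
	}
	for _, c := range certs {
		// Each certificate would become its own data point.
		fmt.Println(c.Subject.CommonName, c.PublicKeyAlgorithm, c.SerialNumber, c.NotAfter.Unix())
	}
}
```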

@vivki vivki force-pushed the naas-1315-certificate-receiver branch from 4ba54d3 to f7f48ee Compare June 23, 2026 00:29
@github-actions github-actions Bot added the enhancement label Jun 23, 2026
@vivki vivki force-pushed the naas-1315-certificate-receiver branch from f7f48ee to 9b7c983 Compare June 23, 2026 00:32
vivki added 4 commits June 22, 2026 17:42

Required by mdatagen for nginxplusreceiver, nginxreceiver, and
containermetricsreceiver metrics. No behaviour change.

Add metadata.yaml defining nginx.certificate.expiry (gauge, Unix timestamp)
with attributes file_path, public_key_algorithm, serial_number, subject.common_name.
Add CertificateReceiver config type with InstanceID and CertFilePaths []string.

Run: cd internal/collector/certificatereceiver && mdatagen metadata.yaml

Scraper reads cert files via crypto/x509 on each 15s scrape and emits
nginx.certificate.expiry (Unix timestamp) per cert; renewals are picked
up immediately without a collector restart. Gated on FeatureCertificates.
Collector restarts only when the set of watched cert file paths changes.

@vivki vivki force-pushed the naas-1315-certificate-receiver branch from 9b7c983 to 54501f1 Compare June 23, 2026 00:43