feat(dns): NXDOMAIN fallback to an upstream resolver#1979

Open
s3rj1k wants to merge 1 commit into
NVIDIA:mainfrom
s3rj1k:feat/dns-upstream-fallback

@s3rj1k s3rj1k commented May 28, 2026

Description

carbide-dns answered only from carbide-api's zone, so names outside it (public hostnames a VM needs) returned NXDOMAIN. Add an upstream_resolver config field + --upstream-resolver flag, and on NXDOMAIN/Refused consult that resolver before returning the negative answer. Wire the value through the nico-dns chart.
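The fallback described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `Answer`, `resolve_with_fallback`, and the two stub resolvers are hypothetical names, and returning the original negative answer when the upstream is unreachable is an assumption the description does not state. Only the response code values (NXDOMAIN = 3, REFUSED = 5) come from the DNS specification.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple

# DNS response codes as defined in RFC 1035.
NOERROR = 0
NXDOMAIN = 3
REFUSED = 5


@dataclass
class Answer:
    """Hypothetical, simplified DNS answer: a response code plus record data."""
    rcode: int
    records: Tuple[str, ...] = ()


# Hypothetical resolver signature: (query name, record type) -> Answer.
Resolver = Callable[[str, str], Answer]


def resolve_with_fallback(
    name: str,
    rtype: str,
    authoritative: Resolver,
    upstream: Optional[Resolver] = None,
) -> Answer:
    """Answer from the local zone first; consult upstream only on NXDOMAIN/REFUSED."""
    answer = authoritative(name, rtype)
    if upstream is None or answer.rcode not in (NXDOMAIN, REFUSED):
        return answer
    try:
        return upstream(name, rtype)
    except OSError:
        # Assumption: if the upstream cannot be reached, keep the negative answer.
        return answer


# Stub resolvers standing in for the local zone and an upstream resolver.
def local_zone(name: str, rtype: str) -> Answer:
    records = {"host.nico.example": ("10.0.0.5",)}
    if name in records:
        return Answer(NOERROR, records[name])
    return Answer(NXDOMAIN)


def public(name: str, rtype: str) -> Answer:
    return Answer(NOERROR, ("192.0.2.10",))
```

With these stubs, a name in the local zone is answered locally, while any other name is answered by `public` only when an upstream is supplied; without one, the NXDOMAIN is returned unchanged.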

Type of Change

  • Add - New feature or capability
  • Change - Changes in existing functionality
  • Fix - Bug fixes
  • Remove - Removed features or deprecated functionality
  • Internal - Internal changes (refactoring, tests, docs, etc.)

Related Issues (Optional)

Breaking Changes

  • This PR contains breaking changes

Testing

  • Unit tests added/updated
  • Integration tests added/updated
  • Manual testing performed
  • No testing required (docs, internal refactor, etc.)

Additional Notes

@s3rj1k s3rj1k requested review from a team as code owners May 28, 2026 09:00

copy-pr-bot Bot commented May 28, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

carbide-dns answered only from carbide-api's zone, so names outside it
(public hostnames a VM needs) returned NXDOMAIN. Add an upstream_resolver
config field + --upstream-resolver flag, and on NXDOMAIN/Refused consult
that resolver before returning the negative answer. Wire the value
through the nico-dns chart.

Signed-off-by: s3rj1k <evasive.gyron@gmail.com>
@s3rj1k s3rj1k force-pushed the feat/dns-upstream-fallback branch from 69ab098 to 2156705 Compare May 28, 2026 10:08
@ianderson-nvidia
Contributor

carbide-dns is intended to only be authoritative for nico zones. It is not a recursive DNS server. It is also not good practice that you have your authoritative DNS server act as a recursive DNS server, so I am not comfortable merging this feature.

If you need recursive DNS, you can use something like unbound and forward nico-related queries to carbide-dns.

forward-zone:
  name: <nico zone name>
  forward-addr: <carbide-dns-ip>
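
To illustrate the split suggested here, the sketch below models how a forwarding resolver might choose where to send a query: names under a configured forward zone go to that zone's address, and everything else is left to ordinary recursion. It is a simplified model, not unbound's implementation; `pick_forwarder`, the zone `nico.example`, and the address `10.0.0.53` are hypothetical placeholders.

```python
from typing import Dict, Optional


def pick_forwarder(qname: str, forward_zones: Dict[str, str]) -> Optional[str]:
    """Return the forward address of the most specific zone containing qname.

    forward_zones maps lowercase zone names (no trailing dot) to an address.
    Returns None when no zone matches, meaning the resolver recurses normally.
    """
    labels = qname.rstrip(".").lower().split(".")
    # Walk from the full name towards the TLD so the longest suffix wins.
    for i in range(len(labels)):
        candidate = ".".join(labels[i:])
        if candidate in forward_zones:
            return forward_zones[candidate]
    return None


# Hypothetical zone and address standing in for the placeholders above.
zones = {"nico.example": "10.0.0.53"}
```

Here `pick_forwarder("api.nico.example.", zones)` yields the carbide-dns address, while a public name such as `10.0.0.1.sslip.io` yields `None` and would be resolved recursively. Matching is done on whole labels, so `notnico.example` does not match `nico.example`.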

Author

s3rj1k commented May 28, 2026

> carbide-dns is intended to only be authoritative for nico zones. It is not a recursive DNS server. It is also not good practice that you have your authoritative DNS server act as a recursive DNS server, so I am not comfortable merging this feature.
>
> If you need recursive DNS, you can use something like unbound and forward nico-related queries to carbide-dns.
>
> forward-zone:
>   name: <nico zone name>
>   forward-addr: <carbide-dns-ip>

The use case here is only to be able to resolve sslip.io names.

I've tried using unbound with the charts' setup, but it looks like this does not work without rebuilding both the unbound and unbound exporter images.

Collaborator

ajf commented May 28, 2026

Anyone can really use any recursive resolver they want, if they're willing to add the couple of hostnames that are the same in every site that map to the site specific IP addresses on Kubernetes services. Maybe we should think about documenting the hostnames required and not try to ship an unbound.

But yeah, carbide-dns is only intended to be an authoritative service and not do recursor functions; so I think this problem is better solved at the recursor service.

Author

s3rj1k commented May 28, 2026

> Maybe we should think about documenting the hostnames required and not try to ship an unbound.

I would also prefer a working solution based on unbound over maintaining some extra code.

Let's keep this PR around for reference until a proper solution emerges.

Collaborator

ajf commented May 28, 2026

> I would also prefer a working solution based on unbound

Do you mean you expect NICo helm charts to include an unbound deployment?

Author

s3rj1k commented May 28, 2026

> Do you mean you expect NICo helm charts to include an unbound deployment?

I mean, they already do, no? There are no images, but unbound placeholders already exist. I guess it would be enough to have the setup documented and CI-tested at some level, so that the next rebase or refactor does not break this feature.
