Configuring systemd-resolved to Work with dnsmasq on Ubuntu
Overview
While attempting to run the openshift-installer locally using libvirt, I ran into a peculiar problem
with NetworkManager’s packaged version of dnsmasq and systemd-resolved. After a good amount of
troubleshooting (most of it spent trying to understand the relationship between all three components),
I worked out how to get the OpenShift installer running on Ubuntu.
Problem
As suggested by the OpenShift documentation, I added the configuration settings for NetworkManager to
configure dnsmasq and allow my local machine to resolve the hostnames of the VMs. Unfortunately, it
wasn’t working. The installer was timing out while trying to connect to the cluster after spinning it up:
time="2020-02-06T14:30:08-05:00" level=debug msg="Still waiting for the Kubernetes API: Get https://api.test-cluster.tt.testing:6443/version?timeout=32s: dial tcp: lookup api.test-cluster.tt.testing on 127.0.0.53:53: no such host"
Strange, because this works out of the box on RHEL-flavored Linux systems. After some digging and
increasing the logging on systemd-resolved, I discovered that queries were going to an unexpected address:
Feb 06 15:35:26 my-pc systemd-resolved[32657]: Looking up RR for api.test-cluster.tt.testing IN A.
Feb 06 15:35:26 my-pc systemd-resolved[32657]: Cache miss for api.test-cluster.tt.testing IN A
Feb 06 15:35:26 my-pc systemd-resolved[32657]: Transaction 30245 for <api.test-cluster.tt.testing IN A> scope dns on enp5s0/*.
Feb 06 15:35:26 my-pc systemd-resolved[32657]: Using feature level UDP+EDNS0 for transaction 30245.
Feb 06 15:35:26 my-pc systemd-resolved[32657]: Using DNS server 192.168.29.1 for transaction 30245.
Feb 06 15:35:26 my-pc systemd-resolved[32657]: Sending query packet with id 30245.
Feb 06 15:35:26 my-pc systemd-resolved[32657]: Processing query...
Feb 06 15:35:26 my-pc systemd-resolved[32657]: Processing incoming packet on transaction 30245. (rcode=NXDOMAIN)
Feb 06 15:35:26 my-pc systemd-resolved[32657]: Server returned error NXDOMAIN, mitigating potential DNS violation DVE-2018-0001, retrying transaction with reduced feature level UDP.
Feb 06 15:35:26 my-pc systemd-resolved[32657]: Retrying transaction 30245.
Feb 06 15:35:26 my-pc systemd-resolved[32657]: Cache miss for api.test-cluster.tt.testing IN A
Feb 06 15:35:26 my-pc systemd-resolved[32657]: Transaction 30245 for <api.test-cluster.tt.testing IN A> scope dns on enp5s0/*.
Feb 06 15:35:26 my-pc systemd-resolved[32657]: Using feature level UDP for transaction 30245.
Feb 06 15:35:26 my-pc systemd-resolved[32657]: Sending query packet with id 30245.
Feb 06 15:35:26 my-pc systemd-resolved[32657]: Processing incoming packet on transaction 30245. (rcode=NXDOMAIN)
Feb 06 15:35:26 my-pc systemd-resolved[32657]: Not caching negative entry for: api.test-cluster.tt.testing IN A, cache mode set to no-negative
Feb 06 15:35:26 my-pc systemd-resolved[32657]: Transaction 30245 for <api.test-cluster.tt.testing IN A> on scope dns on enp5s0/* now complete with <rcode-failure> from network (unsigned)
192.168.29.1 is similar to the address I configured according to the OpenShift documentation
(192.168.126.1), but not quite the same.
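One way to get debug output like the log above is a systemd drop-in that sets SYSTEMD_LOG_LEVEL=debug
for the service. This is a sketch rather than a record of the exact method used here: the drop-in name
debug.conf and the helper write_resolved_debug_dropin are made up for illustration, and the optional
argument is only a scratch root for a dry run.

```shell
# Sketch: write a drop-in that turns on debug logging for systemd-resolved.
# $1 is an optional scratch root for a dry run; with no argument the file
# lands under /etc, so the function has to run as root.
write_resolved_debug_dropin() {
    root="${1:-}"
    dir="$root/etc/systemd/system/systemd-resolved.service.d"
    mkdir -p "$dir" || return 1
    printf '[Service]\nEnvironment=SYSTEMD_LOG_LEVEL=debug\n' > "$dir/debug.conf"
}

# On the real machine (as root), followed by a restart and a log tail:
#   write_resolved_debug_dropin
#   systemctl daemon-reload
#   systemctl restart systemd-resolved
#   journalctl -u systemd-resolved -f
```

Remove the drop-in and restart the service again once you are done, since debug logging is noisy.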
It turns out that NetworkManager on Ubuntu automatically creates extra dnsmasq settings, further
confusing things:
Feb 06 12:12:10 my-pc NetworkManager[13250]: <info> [1581009130.5539] dnsmasq[0x5604c3209e20]: dnsmasq appeared as :1.200
Feb 06 12:12:10 my-pc dnsmasq[13325]: setting upstream servers from DBus
Feb 06 12:12:10 my-pc dnsmasq[13325]: using nameserver 192.168.126.1#53 for domain tt.testing
Feb 06 12:12:10 my-pc dnsmasq[13325]: using nameserver 192.168.29.1#53(via enp5s0)
Feb 06 12:12:10 my-pc dnsmasq[13325]: using nameserver 192.168.29.1#53 for domain 29.168.192.in-addr.arpa
Feb 06 12:12:10 my-pc dnsmasq[13325]: using nameserver 192.168.29.1#53 for domain 254.169.in-addr.arpa
Feb 06 12:12:10 my-pc dhclient[13322]: DHCPREQUEST of 192.168.29.186 on enp5s0 to 255.255.255.255 port 67 (xid=.......)
Feb 06 12:12:10 my-pc dhclient[13322]: DHCPACK of 192.168.29.186 from 192.168.29.1
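With three resolvers in play, it can help to ask each one for the same name directly and see which hop
returns NXDOMAIN. A rough sketch, assuming nslookup is installed and using the addresses from this post:
127.0.0.53 is the systemd-resolved stub, 127.0.1.1 is the dnsmasq address used in the Solution section,
192.168.126.1 is the server configured for tt.testing, and 192.168.29.1 is the LAN server from the logs.
The resolver_checks helper is hypothetical.

```shell
# Sketch: ask each resolver in the chain for the same name.
# The addresses are the ones from this post; adjust them for your network.
NAME=api.test-cluster.tt.testing

# Print one nslookup command per hop so they can be reviewed before running.
resolver_checks() {
    for server in 127.0.0.53 127.0.1.1 192.168.126.1 192.168.29.1; do
        printf 'nslookup %s %s\n' "$1" "$server"
    done
}

resolver_checks "$NAME"          # review the commands
# resolver_checks "$NAME" | sh   # then actually run them
```

The first server in the list that fails to return the VM addresses is the hop to look at.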
Solution
I tried several solutions [1] [2], but in the end what worked for me was to change the default DNS server
for my connection to 127.0.1.1 using systemd-resolve.
First, find the link that’s causing problems and note its DNS server:
$ systemd-resolve --status
...
Link 2 (enp5s0)
Current Scopes: DNS
LLMNR setting: yes
MulticastDNS setting: no
DNSSEC setting: no
DNSSEC supported: no
DNS Servers: 192.168.29.1
DNS Domain: ~.
...
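If the status output is long, the DNS server for a single link can be pulled out with a little awk. This
is a minimal sketch: link_dns_servers is a hypothetical helper, and it only reads the server that sits on
the "DNS Servers:" line itself, so any additional servers printed on continuation lines are ignored.

```shell
# Sketch: print the DNS server listed for one link, reading
# `systemd-resolve --status` output from stdin.
link_dns_servers() {
    awk -v link="$1" '
        # A "Link N (name)" header starts a new per-link section.
        /^Link [0-9]+ \(/ { in_link = (index($0, "(" link ")") > 0) }
        # Inside the wanted section, keep only what follows the label.
        in_link && /DNS Servers:/ { sub(/.*DNS Servers:[ \t]*/, ""); print }
    '
}

# usage: systemd-resolve --status | link_dns_servers enp5s0
```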
Now update your dnsmasq configuration to add that DNS server as the default upstream:
$ sudo vim /etc/NetworkManager/dnsmasq.d/openshift.conf
server=/tt.testing/192.168.126.1
server=192.168.29.1
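If you would rather not edit the file by hand, the same two lines can be written from a script. This is
only a sketch: write_openshift_dnsmasq_conf is a hypothetical helper, its second argument is a scratch
root for trying it out safely, and the NetworkManager restart at the end is an assumption about what is
needed for its dnsmasq instance to pick up the change.

```shell
# Sketch: generate the dnsmasq drop-in shown above.
#   $1  upstream DNS server reported by `systemd-resolve --status`
#   $2  optional scratch root for a dry run (empty means the real /etc)
write_openshift_dnsmasq_conf() {
    upstream="$1"
    root="${2:-}"
    dir="$root/etc/NetworkManager/dnsmasq.d"
    mkdir -p "$dir" || return 1
    {
        printf 'server=/tt.testing/192.168.126.1\n'
        printf 'server=%s\n' "$upstream"
    } > "$dir/openshift.conf"
}

# As root on the real machine:
#   write_openshift_dnsmasq_conf 192.168.29.1
#   systemctl restart NetworkManager
```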
And now update your systemd-resolved configuration to use dnsmasq for resolution:
$ sudo systemd-resolve --set-dns=127.0.1.1 --interface=enp5s0
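On newer systemd releases the same operation is spelled resolvectl dns, and the systemd-resolve command
may not be installed at all. A small sketch that uses whichever front end is present; set_link_dns is a
hypothetical wrapper, not something either tool provides.

```shell
# Sketch: point one link at NetworkManager's dnsmasq, using whichever
# front end this systemd version ships. Needs root on a real system.
set_link_dns() {
    iface="$1"
    server="$2"
    if command -v resolvectl >/dev/null 2>&1; then
        resolvectl dns "$iface" "$server"
    else
        systemd-resolve --set-dns="$server" --interface="$iface"
    fi
}

# usage (as root): set_link_dns enp5s0 127.0.1.1
```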
And that should do the trick:
$ nslookup api.test-cluster.tt.testing
Server: 127.0.0.53
Address: 127.0.0.53#53
Non-authoritative answer:
Name: api.test-cluster.tt.testing
Address: 192.168.126.10
Name: api.test-cluster.tt.testing
Address: 192.168.126.11
Unfortunately, those changes won’t survive a reboot. To make them permanent, you’ll need to create a
file in /etc/systemd/network/, such as /etc/systemd/network/enp5s0.conf, with contents similar to:
[Match]
Name=enp5s0
[Resolve]
DNS=127.0.1.1
Now with that set, your changes should survive a reboot.