Stealth secondaries and Cisco Jabber

2020-04-30 - News - Tony Finch

The news part of this item is that I've updated the stealth secondary documentation with a warning: do not configure your servers as secondaries for zones that aren't mentioned in the sample configuration files.

One exception to that is the special Cisco Jabber zones supported by the phone service. There is now a link from our stealth secondary DNS documentation to the Cisco Jabber documentation, but there are tricky requirements and caveats, so you need to take care.

The rest of this item is the story of how we discovered the need for these warnings.

The context

Cisco Jabber is designed around a classic enterprise-style internal/external network architecture with firewalls and DNS views, which doesn't fit the University very well. The special Jabber DNS SRV records (_cisco-uds etc.) have been set up on the phone system's own DNS servers, which are able to support the special split views more easily than the central DNS servers.

If the network requirements are satisfied then you can see the Jabber internal view records, but in practice most clients should see the DNS SRV records for the external view.

The problem

With many people working from home, our colleagues in the telecoms office found that Jabber was not working as expected. After some investigation it became apparent that the internal view _cisco-uds DNS SRV records were leaking: often Virgin Media's DNS servers would return the wrong answers, and sometimes the various public DNS resolvers would as well.

This was very mysterious.

We could not find any configuration problems with the phone system's DNS servers, nor with the central DNS servers, nor with the contents of the DNS zones.

The answer

After much head-scratching and many red herrings and blind alleys, I worked out that one of the public DNS servers for the cam.ac.uk zone was configured as a secondary for Jabber's special internal _cisco-uds view. There was a 1-in-6 chance that people outside the University would get the wrong records, depending on which of our 6 public DNS servers their resolver happened to talk to.
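To put a number on why the failures looked intermittent, here is a rough sketch of the arithmetic. It assumes a resolver picks one of the public servers uniformly at random for each uncached lookup, which is a simplification: real resolvers tend to prefer servers with lower latency, and caching means one bad answer can persist for a while. The function name is made up for illustration.

```python
from fractions import Fraction


def chance_of_wrong_answer(total_servers: int, bad_servers: int, lookups: int = 1) -> Fraction:
    """Probability that at least one of `lookups` independent queries
    reaches a misconfigured server, assuming uniform random selection."""
    if total_servers <= 0 or not 0 <= bad_servers <= total_servers:
        raise ValueError("need 0 <= bad_servers <= total_servers and total_servers > 0")
    if lookups < 0:
        raise ValueError("lookups must not be negative")
    # Chance that a single query lands on a correctly configured server.
    p_good = Fraction(total_servers - bad_servers, total_servers)
    # All lookups must land on good servers to avoid a wrong answer.
    return 1 - p_good ** lookups


# One bad server out of six, single lookup.
print(chance_of_wrong_answer(6, 1))  # 1/6
```

Under the same assumption, three independent lookups would hit the bad server at least once with probability 91/216, a little over 40%, which fits a problem that shows up often but not reliably.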

The fix

So we've corrected the configuration mistake, and improved our documentation to reduce the risk of it happening again. But there's a bit more we can do.

One of the things that made this hard to debug was that the usual consistency checking tools such as Zonemaster did not spot the mistake. DNSViz encountered the problem, which gave me a bit of a clue, but DNSViz isn't designed to systematically examine all of a zone's nameservers in the way that Zonemaster does.

The reason Zonemaster didn't find the problem is that it examines a zone's own nameservers for consistency, but it doesn't check that all the zone's parent's nameservers have consistent delegations. In our Jabber case it was one of the parent zone (cam.ac.uk) servers that was doing the wrong thing with the child _cisco-uds zone.
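The kind of extra check described here could be sketched as follows. This is not Zonemaster code and not our script, only an illustration of the comparison step: it assumes something else has already asked each of the parent zone's nameservers about the child zone and summarised each response as a set of strings. The server names and record strings in the example are hypothetical.

```python
from collections import Counter


def find_inconsistent_servers(responses):
    """Return the servers whose response differs from the most common one.

    `responses` maps a nameserver name to an iterable of strings that
    summarise what that server said about the child zone.
    """
    # Order within a response should not matter, so compare as sets.
    normalised = {server: frozenset(records) for server, records in responses.items()}
    if not normalised:
        return {}
    # Treat the most common response as the expected one.
    expected, _count = Counter(normalised.values()).most_common(1)[0]
    return {server: records for server, records in normalised.items() if records != expected}


# Hypothetical data: five parent servers return a referral to the child
# zone's servers, one answers from a copy of the internal view.
referral = ["referral NS ns.phones.example."]
leaked = ["answer SRV 0 0 8443 internal.phones.example."]
responses = {f"ns{n}.example.": referral for n in range(1, 6)}
responses["ns6.example."] = leaked

for server, records in find_inconsistent_servers(responses).items():
    print(server, sorted(records))
```

Taking the majority response as the expected one is a design shortcut. It would point at the wrong servers if most of them were misconfigured, so a real check would more likely compare against the intended delegation.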

We have a Zonemaster script for checking all our zones, but it currently uses a rather out-of-date version. I'm hoping that after some operating system upgrades it will be more convenient to use a recent version of Zonemaster, and it will make sense to add some extra checks so that Zonemaster can spot and complain about mistakes like our Cisco Jabber leakage.