Notes on work in progress

First ops page ported

2019-05-15

Yesterday I reached a milestone: I have ported the first "ops" page from the old IP Register web user interface on Jackdaw to the new one that will live on the DNS web servers. It's a trivial admin page for setting the message of the day, but it demonstrates that the infrastructure is (mostly) done.

Security checks

I have spent the last week or so trying to get from a proof of concept to something workable. Much of this work has been on the security checks. The old UI has:

  • Cookie validation (for Oracle sessions)

  • Raven authentication

  • TOTP authentication for superusers

  • Second cookie validation for TOTP

  • CSRF checks

There was an awkward split between the Jackdaw framework and the ipreg-specific parts which meant I needed to add a second cookie when I added TOTP authentication.

In the new setup I have upgraded the cookie to modern security levels, and it handles both Oracle and TOTP session state.

    my @cookie_attr = (
            -name     => '__Host-Session',
            -path     => '/',
            -secure   => 1,
            -httponly => 1,
            -samesite => 'strict',
        );

The various "middleware" authentication components have been split out of the main HTTP request handler so that the overall flow is much easier to see.

State objects

There is some fairly tricky juggling in the old code between:

  • CGI request object

  • WebIPDB HTTP request handler object

  • IPDB database handle wrapper

  • Raw DBI handle

The CGI object is gone. The mod_perl Apache2 APIs are sufficient replacements, and the HTML generation functions are being replaced by mustache templates. (Though there is some programmatic form generation in table_ops that might be awkward!)

I have used Moo roles to mixin the authentication middleware bits to the main request handler object, which works nicely. I might do the same for the IPDB object, though that will require some refactoring of some very old skool OO perl code.
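
To give a concrete idea of the shape of this, here is a minimal sketch of the mixin pattern (the package and method names are hypothetical, not the real ipreg code):

package IpregWeb::Role::RavenAuth;
use Moo::Role;

# the consuming request handler must provide this
requires 'request';

sub check_raven {
    my ($self) = @_;
    # inspect the request and bounce to Raven if there is no valid session
    ...
}

package IpregWeb::Request;
use Moo;

has request => (is => 'ro', required => 1);

with 'IpregWeb::Role::RavenAuth';

1;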

Next

The plan is to port the rest of the ops pages as directly as possible. There is going to be a lot of refactoring, but it will all be quite superficial. The overall workflow is going to remain the same, just more purple.

[Image: mobile message of the day form with an error]

Oracle connection timeouts

2019-05-07

Last week while I was beating mod_perl code into shape, I happily deleted a lot of database connection management code that I had inherited from Jackdaw's web server. Today I had to put it all back again.

Apache::DBI

There is a neat module called Apache::DBI which hooks mod_perl and DBI together to provide a transparent connection cache: just throw in a use statement, throw out dozens of lines of old code, and you are pretty much done.
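
In a mod_perl startup file that amounts to something like the following sketch (the key point being that Apache::DBI must be loaded before DBI so it can install its hooks):

# startup.pl, loaded by mod_perl via PerlRequire in the Apache config
use Apache::DBI;   # must come before "use DBI" so it can hook connect()
use DBI;

# optionally, open the connection when each child starts so the first
# request does not pay the connection latency:
# Apache::DBI->connect_on_init($data_source, $username, $auth, \%attr);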

Connection hangs

Today the clone of Jackdaw that I am testing against was not available (test run for some maintenance work tomorrow, I think) and I found that my dev web server was no longer responding. It started OK but would not answer any requests. I soon worked out that it was trying to establish a database connection and waiting at least 5 minutes (!) before giving up.

DBI(3pm) timeouts

There is a long discussion about timeouts in the DBI documentation which specifically mentions DBD::Oracle as a problem case, with some lengthy example code for implementing a timeout wrapper around DBI::connect.

This is a terrible documentation anti-pattern. Whenever I find myself giving lengthy examples of how to solve a problem I take it as a whacking great clue that the code should be fixed so the examples can be made a lot easier.

In this case, DBI should have connection timeouts as standard.

Sys::SigAction

If you read past the examples in DBI(3pm) there's a reference to a more convenient module which provides a timeout wrapper that can be used like this:

use Sys::SigAction qw(timeout_call);

# timeout_call() returns true if the wrapped code was interrupted by the
# timeout; "moan" is a local error-reporting helper.
if (timeout_call($connect_timeout, sub {
    $dbh = DBI->connect(@connect_args);
    moan $DBI::errstr unless $dbh;
})) {
    moan "database connection timed out";
}

Undelete

The problem is that there isn't a convenient place to put this timeout code where it should be, so that Apache::DBI can use it transparently.

So I resurrected Jackdaw's database connection cache. But not exactly - I looked through it again and I could not see any extra timeout handling code. My guess is that hung connections can't happen if the database is on the same machine as the web server.

Reskinning IP Register

2019-05-01

At the end of the item about Jackdaw and Raven I mentioned that when the web user interface moves off Jackdaw it will get a reskin.

The existing code uses Perl CGI functions for rendering the HTML, with no styling at all. I'm replacing this with mustache templates using the www.dns.cam.ac.uk Project Light framework. So far I have got the overall navigation structure working OK, and it's time to start putting forms into the pages.
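
For a flavour of the change, rendering a page body from a mustache template looks roughly like this (a sketch assuming the Template::Mustache module from CPAN; the template name and data are made up):

use Template::Mustache;

# hypothetical example - the real templates and variables differ
open my $fh, '<', 'templates/motd.mustache' or die "open: $!";
my $template = do { local $/; <$fh> };

my $motd = 'Scheduled maintenance at 18:00';
my $html = Template::Mustache->new(template => $template)
    ->render({ title => 'Message of the day', motd => $motd });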

I fear this reskin is going to be disappointing, because although it's superficially quite a lot prettier, the workflow is going to be the same - for example, the various box_ops etc. links in the existing user interface become Project Light local navigation tabs in the new skin. And there are still going to be horrible Oracle errors.

Jackdaw and Raven

2019-04-16

I've previously written about authentication and access control in the IP Register database. The last couple of weeks I have been reimplementing some of it in a dev version of this DNS web server.

Read more ...

Bootstrapping Let's Encrypt on Debian

2019-03-15

I've done some initial work to get the Ansible playbooks for our DNS systems working with the development VM cluster on my workstation. At this point it is just for web-related experimentation, not actual DNS servers.

Of course, even a dev server needs a TLS certificate, especially because these experiments will be about authentication. Until now I have obtained certs from the UIS / Jisc / QuoVadis, but my dev server is using Let's Encrypt instead.

Chicken / egg

In order to get a certificate from Let's Encrypt using the http-01 challenge, I need a working web server. In order to start the web server with its normal config, I need a certificate. This poses a bit of a problem!

Snakeoil

My solution is to install Debian's ssl-cert package, which creates a self-signed certificate. When the web server does not yet have a certificate (if the QuoVadis cert isn't installed, or dehydrated has not been initialized), Ansible temporarily symlinks the self-signed cert for use by Apache, like this:

- name: check TLS certificate exists
  stat:
    path: /etc/apache2/keys/tls-web.crt
  register: tls_cert
- when: not tls_cert.stat.exists
  name: fake TLS certificates
  file:
    state: link
    src: /etc/ssl/{{ item.src }}
    dest: /etc/apache2/keys/{{ item.dest }}
  with_items:
    - src: certs/ssl-cert-snakeoil.pem
      dest: tls-web.crt
    - src: certs/ssl-cert-snakeoil.pem
      dest: tls-chain.crt
    - src: private/ssl-cert-snakeoil.key
      dest: tls.pem

ACME dehydrated boulders

The dehydrated and dehydrated-apache2 packages need a little configuration. I needed to add a cron job to renew the certificate, a hook script to reload Apache when the cert is renewed, and a domains list telling dehydrated which names should be in the cert. (See below for details of these bits.)

After installing the config, Ansible initializes dehydrated if necessary - the creates check stops Ansible from running dehydrated again after it has created a cert.

- name: initialize dehydrated
  command: dehydrated -c
  args:
    creates: /var/lib/dehydrated/certs/{{inventory_hostname}}/cert.pem

Once a cert has been obtained, the temporary symlinks get overwritten with links to the Let's Encrypt cert. This task is very similar to the snakeoil links, but without the existence check.

- name: certificate links
  file:
    state: link
    src: /var/lib/dehydrated/certs/{{inventory_hostname}}/{{item.src}}
    dest: /etc/apache2/keys/{{item.dest}}
  with_items:
    - src: cert.pem
      dest: tls-web.crt
    - src: chain.pem
      dest: tls-chain.crt
    - src: privkey.pem
      dest: tls.pem
  notify:
    - restart apache

After that, Apache is working with a proper certificate!

Boring config details

The cron script chatters into syslog, but if something goes wrong it should trigger an email (tho not a very informative one).

#!/bin/bash
set -eu -o pipefail
( dehydrated --cron
  dehydrated --cleanup
) | logger --tag dehydrated --priority cron.info

The hook script only needs to handle one of the cases:

#!/bin/bash
set -eu -o pipefail
case "$1" in
(deploy_cert)
    apache2ctl configtest &&
    apache2ctl graceful
    ;;
esac

The configuration needs a couple of options added:

- copy:
    dest: /etc/dehydrated/conf.d/dns.sh
    content: |
      EMAIL="hostmaster@cam.ac.uk"
      HOOK="/etc/dehydrated/hook.sh"

The final part is to tell dehydrated the certificate's domain name:

- copy:
    content: "{{inventory_hostname}}\n"
    dest: /etc/dehydrated/domains.txt

For production, domains.txt needs to be a bit more complicated. I have a template like the one below. I have not yet deployed it; that will probably wait until the cert needs updating.

{{hostname}} {% if i_am_www %} www.dns.cam.ac.uk dns.cam.ac.uk {% endif %}

KSK rollover project status

2019-02-07

I have spent the last week working on DNSSEC key rollover automation in BIND. Or rather, I have been doing some cleanup and prep work. With reference to the work I listed in the previous article...

Done

  • Stop BIND from generating SHA-1 DS and CDS records by default, per

  • Teach dnssec-checkds about CDS and CDNSKEY

Started

  • Teach superglue to use CDS/CDNSKEY records, with similar logic to dnssec-checkds

The "similar logic" is implemented in dnssec-dsfromkey, so I don't actually have to write the code more than once. I hope this will also be useful for other people writing similar tools!

Some of my small cleanup patches have been merged into BIND. We are currently near the end of the 9.13 development cycle, so this work is going to remain out of tree for a while until after the 9.14 stable branch is created and the 9.15 development cycle starts.

Next

So now I need to get to grips with dnssec-coverage and dnssec-keymgr.

Simple safety interlocks

The purpose of the dnssec-checkds improvements is to let it be used as a safety check.

During a KSK rollover, there are one or two points when the DS records in the parent need to be updated. The rollover must not continue until this update has been confirmed, or the delegation can be broken.

I am using CDS and CDNSKEY records as the signal from the key management and zone signing machinery for when DS records need to change. (There's a shell-style API in dnssec-dsfromkey -p, but that is implemented by just reading these sync records, not by looking into the guts of the key management data.) I am going to call them "sync records" so I don't have to keep writing "CDS/CDNSKEY"; "sync" is also the keyword used by dnssec-settime for controlling these records.

Key timing in BIND

The dnssec-keygen and dnssec-settime commands (which are used by dnssec-keymgr) schedule when changes to a key will happen.

There are parameters related to adding a key: when it is published in the zone, when it becomes actively used for signing, etc. And there are parameters related to removing a key: when it becomes inactive for signing, when it is deleted from the zone.

There are also timing parameters for publishing and deleting sync records. These sync times are the only timing parameters that say when we must update the delegation.
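
For example, the sync times on a key can be set and inspected with something like the following (a sketch assuming a reasonably recent BIND; see the dnssec-settime man page for the exact option syntax):

# schedule CDS/CDNSKEY publication for this key an hour from now
dnssec-settime -P sync +1h Kexample.org.+013+12345

# print all of the key's timing metadata to check the result
dnssec-settime -p all Kexample.org.+013+12345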

What can break?

The point of the safety interlock is to prevent any breaking key changes from being scheduled until after a delegation change has been confirmed. So what key timing events need to be forbidden from being scheduled after a sync timing event?

Events related to removing a key are particularly dangerous. There are some cases where it is OK to remove a key prematurely, if the DS record change is also about removing that key, and there is another working key and DS record throughout. But it seems simpler and safer to forbid all removal-related events from being scheduled after a sync event.

However, events related to adding a key can also lead to nonsense. If we blindly schedule creation of new keys in advance, without verifying that they are also being properly removed, then the zone can accumulate a ridiculous number of DNSKEY records. This has been observed in the wild surprisingly frequently.

A simple rule

There must be no KSK changes of any kind scheduled after the next sync event.

This rule applies regardless of the flavour of rollover (double DS, double KSK, algorithm rollover, etc.)

Applying this rule to BIND

Whereas for ZSKs, dnssec-coverage ensures rollovers are planned for some fixed period into the future, for KSKs, it must check correctness up to the next sync event, then ensure nothing will occur after that point.

In dnssec-keymgr, the logic should be:

  • If the current time is before the next sync event, ensure there is key coverage until that time and no further.

  • If the current time is after all KSK events, use dnssec-checkds to verify the delegation is in sync.

  • If dnssec-checkds reports an inconsistency and we are within some sync interval dictated by the rollover policy, do nothing while we wait for the delegation update automation to work.

  • If dnssec-checkds reports an inconsistency and the sync interval has passed, report an error because operator intervention is required to fix the failed automation.

  • If dnssec-checkds reports everything is in sync, schedule keys up to the next sync event. The timing needs to be relative to this point in time, since any delegation update delays can make it unsafe to schedule relative to the last sync event.

Caveat

At the moment I am still not familiar with the internals of dnssec-coverage and dnssec-keymgr so there's a risk that I might have to re-think these plans. But I expect this simple safety rule will be a solid anchor that can be applied to most DNSSEC key management scenarios. (However I have not thought hard enough about recovery from breakage or compromise.)

Superglue with WebDriver

2019-01-25

Earlier this month I wrote notes on some initial experiments in browser automation with WebDriver. The aim is to fix my superglue DNS delegation update scripts to work with currently-supported tools.

In the end I decided to rewrite the superglue-janet script in Perl, since most of superglue is already Perl and I would like to avoid rewriting all of it. This is still work in progress; superglue is currently an unusable mess, so I don't recommend looking at it right now :-)

My WebDriver library

Rather than using an off-the-shelf library, I have a very thin layer (300 lines of code, 200 lines of docs) that wraps WebDriver HTTP+JSON calls in Perl subroutines. It's designed for script-style usage, so I can write things like this (quoted verbatim):

# Find the domain's details page.

click '#commonActionsMenuLogin_ListDomains';

fill '#MainContent_tbDomainNames' => $domain,
    '#MainContent_ShowReverseDelegatedDomains' => 'selected';

click '#MainContent_btnFilter';

This has considerably less clutter than the old PhantomJS / CasperJS code!
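
Each of those helpers is just a thin wrapper around one or two WebDriver HTTP endpoints. A rough sketch of what click might look like (hypothetical code, not the real library; it assumes a $session_url pointing at an existing WebDriver session, and uses HTTP::Tiny and JSON::PP):

use HTTP::Tiny;
use JSON::PP qw(encode_json decode_json);

our $http = HTTP::Tiny->new;
our $session_url;   # e.g. http://localhost:4444/session/<session id>

sub webdriver {
    my ($method, $path, $body) = @_;
    my $r = $http->request($method, $session_url . $path, {
        headers => { 'Content-Type' => 'application/json' },
        content => encode_json($body // {}),
    });
    die "webdriver $path: $r->{status} $r->{reason}\n" unless $r->{success};
    return decode_json($r->{content})->{value};
}

sub click {
    my ($selector) = @_;
    # find the element by CSS selector, then click it
    my $elem = webdriver(POST => '/element',
        { using => 'css selector', value => $selector });
    my ($id) = values %$elem;   # the element object has a single id value
    webdriver(POST => "/element/$id/click", {});
}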

Asynchrony

I don't really understand the concurrency model between the WebDriver server and the activity in the browser. It appears to be remarkably similar to the way CasperJS behaved, so I guess it is related to the way JavaScript's event loop works (and I don't really understand that either).

The upshot is that in most cases I can click on a link, and the WebDriver response comes back after the new page has loaded. I can immediately interact with the new page, as in the code above.

However there are some exceptions.

On the JISC domain registry web site there are a few cases where selecting from a drop-down list triggers some JavaScript that causes a page reload. The WebDriver request returns immediately, so I have to manually poll for the page load to complete. (This also happened with CasperJS.) I don't know if there's a better way to deal with this than polling...

The WebDriver spec

I am not a fan of the WebDriver protocol specification. It is written as a description of how the code in the WebDriver server / browser behaves, in spaghetti pseudocode.

It does not have any abstract syntax for JSON requests and responses - no JSON schema or anything like that. Instead, the details of parsing requests and constructing responses are interleaved with details of implementing the semantics of the request. It is a very unsafe style.

And why does the WebDriver spec include details of how to HTTP?

Next steps

This work is part of two ongoing projects:

  • I need to update all our domain delegations to complete the server renaming.

  • I need automated delegation updates to support automated DNSSEC key rollovers.

So I'm aiming to get superglue into a usable state, and hook it up to BIND's dnssec-keymgr.

Preserving dhcpd leases across reinstalls

2019-01-14

(This is an addendum to December's upgrade notes.)

I have upgraded the IP Register DHCP servers twice this year. In February they were upgraded from Ubuntu 12.04 LTS to 14.04 LTS, to cope with 12.04's end of life, and to merge their setup into the main ipreg git repository (which is why the target version was so old). So their setup was fairly tidy before the Debian 9 upgrade.

Statefulness

Unlike most of the IP Register systems, the dhcp servers are stateful. Their dhcpd.leases files must be preserved across reinstalls. The leases file is a database (in the form of a flat text file in ISC dhcp config file format) which closely matches the state of the network.

If it is lost, the server no longer knows about IP addresses in use by existing clients, so it can issue duplicate addresses to new clients, and hilarity will ensue!

So, just before rebuilding a server, I have to stop the dhcpd and take a copy of the leases file. And before the dhcpd is restarted, I have to copy the leases file back into place.

This isn't something that happens very often, so I have not automated it yet.

Bad solutions

In February, I hacked around with the Ansible playbook to ensure the dhcpd was not started before I copied the leases file into place. This is an appallingly error-prone approach.

Yesterday, I turned that basic idea into an Ansible variable that controls whether the dhcpd is enabled. This avoids mistakes when fiddling with the playbook, but it is easily forgettable.

Better solution

This morning I realised a much neater way is to disable the entire dhcpd role if the leases file doesn't exist. This prevents the role from starting the dhcpd on a newly reinstalled server before the old leases file is in place. After the server is up, the check is a no-op.
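
In Ansible terms this is just a stat of the leases file and a when guard on the role, roughly like this (a sketch; the real task names, paths, and variables may differ):

- name: check for existing dhcpd leases file
  stat:
    path: /var/lib/dhcp/dhcpd.leases
  register: dhcpd_leases

- name: configure and start dhcpd
  include_role:
    name: dhcpd
  when: dhcpd_leases.stat.exists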

This is a lot less error-prone. The only requirement for the admin is knowledge about the importance of preserving dhcpd.leases...

Further improvements

The other pitfall in my setup is that monit will restart dhcpd if it is missing, so it isn't easy to properly stop it.

My dhcpd_enabled Ansible variable takes care of this, but I think it would be better to make a special shutdown playbook, which can also take a copy of the leases file.

Review of 2018

2019-01-11

Some notes looking back on what happened last year...

Stats

1457 commits

4035 IP Register / MZS support messages

5734 cronspam messages

Projects

  • New DNS web site (Feb, Mar, Jun, Sep, Oct, Nov)

    This was a rather long struggle with a lot of false starts, e.g. February / March finding that Perl Template Toolkit was not very satisfactory; realising after June that the server naming and vhost setup was unhelpful.

    End result is quite pleasing

  • IP Register API extensions (Aug)

    API access to xlist_ops

    MWS3 API generalized for other UIS services

    Now in active use by MWS, Drupal Falcon, and to a lesser extent by the HPC OpenStack cluster and the new web Traffic Managers. When old Falcon is wound down we will be able to eliminate Gossamer!

  • Server upgrade / rename (Dec)

    Lots of Ansible review / cleanup. Satisfying.

Future of IP Register

  • Prototype setup for PostgreSQL replication using repmgr (Jan)

  • Prototype infrastructure for JSON-RPC API in Typescript (April, May)

Maintenance

  • DHCP servers upgraded to match rest of IP Register servers (Feb)

  • DNS servers upgraded to BIND 9.12, with some serve-stale related problems. (March)

    Local patches all now incorporated upstream :-)

  • git.uis continues, hopefully not for much longer

IETF

  • Took over as the main author of draft-ietf-dnsop-aname. This work is ongoing.

  • Received thanks in RFC 8198 (DNSSEC negative answer synthesis), RFC 8324 (DNS privacy), RFC 8482 (minimal ANY responses), RFC 8484 (DNS-over-HTTPS).

Open Source

  • Ongoing maintenance of regpg. This has stabilized and reached a comfortable feature plateau.

  • Created doh101, a DNS-over-TLS and DNS-over-HTTPS proxy.

    Initial prototype in March at the IETF hackathon.

    Revamped in August to match final IETF draft.

    Deployed in production in September.

  • Fifteen patches committed to BIND9.

    CVE-2018-5737; extensive debugging work on the serve-stale feature.

    Thanked by ISC.org in their annual review.

  • Significant clean-up and enhancement of my qp trie data structure, used by Knot DNS. This enabled much smaller memory usage during incremental zone updates.

    https://gitlab.labs.nic.cz/knot/knot-dns/issues/591

What's next?

  • Update superglue delegation maintenance script to match the current state of the world. Hook it in to dnssec-keymgr and get automatic rollovers working.

  • Rewrite draft-ietf-dnsop-aname again, in time for IETF104 in March.

  • Server renumbering, and xfer/auth server split, and anycast. When?

  • Port existing ipreg web interface off Jackdaw.

  • Port database from Oracle on Jackdaw to PostgreSQL on my servers.

  • Develop new API / UI.

  • Re-do provisioning system for streaming replication from database to DNS.

  • Move MZS into IP Register database.

Notes on web browser automation

2019-01-08

I spent a few hours on Friday looking in to web browser automation. Here are some notes on what I learned.

Context

I have some old code called superglue-janet which drives the JISC / JANET / UKERNA domain registry web site. The web site has some dynamic JavaScript behaviour, and it looks to me like the browser front-end is relatively tightly coupled to the server back-end in a way that I expected would make reverse engineering unwise. So I decided to drive the web site using browser automation tools. My code is written in JavaScript, using PhantomJS (a headless browser based on QtWebKit) and CasperJS (convenience utilities for PhantomJS).

Rewrite needed

PhantomJS is now deprecated, so the code needs a re-work. I also want to use TypeScript instead, where I would previously have used JavaScript.

Current landscape

The modern way to do things is to use a full-fat browser in headless mode and control it using the standard WebDriver protocol.

For Firefox this means using the geckodriver proxy which is a Rust program that converts the WebDriver JSON-over-HTTP protocol to Firefox's native Marionette protocol.

[Aside: Marionette is a full-duplex protocol that exchanges JSON messages prefixed by a message length. It fits into a similar design space to Microsoft's Language Server Protocol, but LSP uses somewhat more elaborate HTTP-style framing and JSON-RPC message format. It's kind of a pity that Marionette doesn't use JSON-RPC.]

The WebDriver protocol came out of the Selenium browser automation project where earlier (incompatible) versions were known as the JSON Wire Protocol.

What I tried out

I thought it would make sense to write the WebDriver client in TypeScript. The options seemed to be:

  • selenium-webdriver, which has Selenium's bindings for node.js. This involves a second proxy written in Java which goes between node and geckodriver. I did not like the idea of a huge wobbly pile of proxies.

  • webdriver.io aka wdio, a native node.js WebDriver client. I chose to try this, and got it going fairly rapidly.

What didn't work

I had enormous difficulty getting anything to work with wdio and TypeScript. It turns out that the wdio typing was only committed a couple of days before my attempt, so I had accidentally found myself on the bleeding edge. I can't tell whether my failure was due to lack of documentation or brokenness in the type declarations...

What next

I need to find a better WebDriver client library. The wdio framework is very geared towards testing rather than general automation (see the wdio "getting started" guide for example) so if I use it I'll be talking to its guts rather than the usual public interface. And it won't be very stable.

I could write it in Perl but that wouldn't really help to reduce the amount of untyped code I'm writing :-)

The missing checklist

2019-01-07

Before I rename/upgrade any more servers, this is the checklist I should have written last month...

For rename

  • Ensure both new and old names are in the DNS

  • Rename the host in ipreg/ansible/bin/make-inventory and run the script

  • Run ipreg/ansible/bin/ssh-knowhosts to update ~/.ssh/known_hosts

  • Rename host_vars/$SERVER and adjust the contents to match a previously renamed server (mutatis mutandis)

  • For recursive servers, rename the host in ipreg/ansible/roles/keepalived/files/vrrp-script and ipreg/ansible/inventory/dynamic

For both

  • Ask infra-sas@uis to do the root privilege parts of the netboot configuration - rename and/or new OS version as required

For upgrade

  • For DHCP servers, save a copy of the leases file by running:

    ansible-playbook dhcpd-shutdown-save-leases.yml \
        --limit $SERVER
    
  • Run the preseed.yml playbook to update the unprivileged parts of the netboot config

  • Reboot the server, tell it to netboot and do a preseed install

  • Wait for that to complete

  • For DHCP servers, copy the saved leases file to the server.

  • Then run:

    ANSIBLE_SSH_ARGS=-4 ANSIBLE_HOST_KEY_CHECKING=False \
        ansible-playbook -e all=1 --limit $SERVER main.yml
    

For rename

  • Update the rest of the cluster's view of the name

    git push
    ansible-playbook --limit new main.yml
    

Notes on recent DNS server upgrades

2019-01-02

I'm now most of the way through the server upgrade part of the rename / renumbering project. This includes moving the servers from Ubuntu 14.04 "Trusty" to Debian 9 "Stretch", and renaming them according to the new plan.

Done:

  • Live and test web servers, which were always Stretch, so they served as a first pass at getting the shared parts of the Ansible playbooks working

  • Live and test primary DNS servers

  • Live x 2 and test x 2 authoritative DNS servers

  • One recursive server

To do:

  • Three other recursive servers

  • Live x 2 and test x 1 DHCP servers

Here are a few notes on how the project has gone so far.

Read more ...

Postcronspam

2018-11-30

This is a postmortem of an incident that caused a large amount of cronspam, but not an outage. However, the incident exposed a lot of latent problems that need addressing.

Description of the incident

I arrived at work late on Tuesday morning to find that the DHCP servers were sending cronspam every minute from monit. monit thought dhcpd was not working, although it was.

A few minutes before I arrived, a colleague had run our Ansible playbook to update the DHCP server configuration. This was the trigger for the cronspam.

Cause of the cronspam

We are using monit as a basic daemon supervisor for our critical services. The monit configuration doesn't have an "include" facility (or at least it didn't when we originally set it up) so we are using Ansible's "assemble" feature to concatenate configuration file fragments into a complete monit config.

The problem was that our Ansible setup didn't have any explicit dependencies between installing monit config fragments and reassembling the complete config and restarting monit.
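
What was missing is the sort of thing Ansible handlers are for; a fixed version would look roughly like this (a sketch with made-up paths, not our actual playbook):

# task in the role that installs a fragment
- name: install monit config fragment
  copy:
    src: monit-dhcpd.conf
    dest: /etc/monit/conf-fragments/dhcpd
  notify:
    - reassemble monit config
    - restart monit

# handlers
- name: reassemble monit config
  assemble:
    src: /etc/monit/conf-fragments
    dest: /etc/monit/monitrc

- name: restart monit
  service:
    name: monit
    state: restarted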

Running the complete playbook caused the monit config to be reassembled, so an incorrect but previously inactive config fragment was activated, causing the cronspam.

Origin of the problem

How was there an inactive monit config fragment on the DHCP servers?

The DHCP servers had an OS upgrade and reinstall in February. This was when the spammy broken monit config fragment was written.

What were the mistakes at that time?

  • The config fragment was not properly tested. A good monit config is normally silent, but in this case we didn't check that it sent cronspam when things are broken, which would have revealed that the config fragment was not actually installed properly.

  • The Ansible playbook was not verified to be properly idempotent. It should be possible to wipe a machine and reinstall it with one run of Ansible, and a second run should be all green. We didn't check the second run properly. Check mode isn't enough to verify idempotency of "assemble".

  • During routine config changes in the nine months since the servers were reinstalled, the usual practice was to run the DHCP-specific subset of the Ansible playbook (because that is much faster) so the bug was not revealed.

Deeper issues

There was a lot more anxiety than there should have been when debugging this problem, because at the time the Ansible playbooks were going through a lot of churn for upgrading and reinstalling other servers, and it wasn't clear whether or not this had caused some unexpected change.

This gets close to the heart of the matter:

  • It should always be safe to check out and run the Ansible playbook against the production systems, and expect that nothing will change.

There are other issues related to being a (nearly) solo developer, which makes it easier to get into bad habits. The DHCP server config has the most contributions from colleagues at the moment, so it is not really surprising that this is where we find out the consequences of the bad habits of soloists.

Resolutions

It turns out that monit and dhcpd do not really get along. The monit UDP health checker doesn't work with DHCP (which was the cause of the cronspam) and monit's process checker gets upset by dhcpd being restarted when it needs to be reconfigured.

The monit DHCP UDP checker has been disabled; the process checker needs review to see if it can be useful without sending cronspam on every reconfig.

There should be routine testing to ensure the Ansible playbooks committed to the git server run green, at least in check mode. Unfortunately it's risky to automate this because it requires root access to all the servers; at the moment root access is restricted to admins in person.

We should be in the habit of running the complete playbook on all the servers (e.g. before pushing to the git server), to detect any differences between check mode and normal (active) mode. This is necessary for Ansible tasks that are skipped in check mode.

Future work

This incident also highlights longstanding problems with our low bus protection factor and lack of automated testing. The resolutions listed above will make some small steps to improve these weaknesses.

DNS-OARC and RIPE

2018-10-23

Last week I visited Amsterdam for a bunch of conferences. The 13th and 14th were the joint DNS-OARC and CENTR workshop, and the 15th - 19th was the RIPE77 meeting.

I have a number of long-term projects which can have much greater success within the University and impact outside the University by collaborating with people from other organizations in person. Last week was a great example of that, with significant progress on CDS (which I did not anticipate!), ANAME, and DNS privacy, which I will unpack below.

Read more ...

DNS-over-TLS snapshot

2018-10-10

Some quick stats on how much the new DNS-over-TLS service is being used:

At the moment (Wednesday mid-afternoon) we have about

  • 29,000 - 31,000 devices on the wireless network

  • 3900 qps total on both recursive servers

  • about 15 concurrent DoT clients (s.d. 4)

  • about 7qps DoT (s.d. 5qps)

  • 5s TCP idle timeout

  • 6.3s mean DoT connection time (s.d. 4s - most connections are just over 5s, they occasionally last as long as 30s; mean and s.d. are not a great model for this distribution)

  • DoT connections very unbalanced, 10x fewer on 131.111.8.42 than on 131.111.12.20

The rule of thumb that number of users is about 10x qps suggests that we have about 70 Android Pie users, i.e. about 0.2% of our userbase.
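
(Roughly: 7 qps × 10 ≈ 70 users, and 70 out of the ~30,000 devices on the wireless network is about 0.2%.)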

IPv6 DAD-die issues

2018-03-26

Here's a somewhat obscure network debugging tale...

Read more ...

Deprocrastinating

2018-02-16

I'm currently getting several important/urgent jobs out of the way so that I can concentrate on the IP Register database project.

Read more ...

An interesting bug in BIND

2018-01-12

(This item isn't really related to progress towards a bright shiny future, but since I'm blogging here I might as well include other work-related articles.)

This week I have been helping Mark Andrews and Evan Hunt to track down a bug in BIND9. The problem manifested as named occasionally failing to re-sign a DNSSEC zone; the underlying cause was access to uninitialized memory.

It was difficult to pin down, partly because there is naturally a lot of nondeterminism in uninitialized memory bugs, but there is also a lot of nondeterminism in the DNSSEC signing process, and it is time-dependent so it is hard to re-run a failure case, and normally the DNSSEC signing process is very slow - three weeks to process a zone, by default.

Timeline

  • Oct 9 - latent bug exposed

  • Nov 12 - first signing failure

    I rebuild and restart my test DNS server quite frequently, and the bug is quite rare, which explains why it took so long to appear.

  • Nov 18 - Dec 6 - Mark fixes several signing-related bugs

  • Dec 28 - another signing failure

  • Jan 2 - I try adding some debugging diagnostics, without success

  • Jan 9 - more signing failures

  • Jan 10 - I make the bug easier to reproduce

    Mark and Evan identify a likely cause

  • Jan 11 - I confirm the cause and fix

The debugging process

The incremental re-signing code in named is tied into BIND's core rbtdb data structure (the red-black tree database). This is tricky code that I don't understand, so I mostly took a black-box approach to try to reproduce it.

I started off by trying to exercise the signing code harder. I set up a test zone with the following options:

    # signatures valid for 1 day (default is 30 days)
    # re-sign 23 hours before expiry
    # (whole zone is re-signed every hour)
    sig-validity-interval 1 23;
    # restrict the size of a batch of signing to examine
    # at most 10 names and generate at most 2 signatures
    sig-signing-nodes 10;
    sig-signing-signatures 2;

I also populated the zone with about 500 records (not counting DNSSEC records) so that several records would get re-signed each minute.

This helped a bit, but I often had to wait a long time before it went wrong. I wrote a script to monitor the zone using rndc zonestatus, so I could see if the "next resign time" matches the zone's earliest expiring signature.
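
The check boils down to comparing two timestamps, roughly like this (a sketch rather than the actual script):

#!/bin/sh
zone=test.example

# when named thinks it next needs to re-sign
rndc zonestatus "$zone" | grep -i 'next resign'

# the earliest RRSIG expiry actually present in the zone
dig +noall +answer "$zone" AXFR @localhost |
    awk '$4 == "RRSIG" { print $9 }' | sort | head -1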

There was quite a lot of flailing around trying to exercise the code harder, by making the zone bigger and changing the configuration options, but I was not successful at making the bug appear on demand.

To make it churn faster, I used dnssec-signzone to construct a version of the zone in which all the signatures expire in the next few minutes:

    rndc freeze test.example
    dig axfr test.example | grep -v RRSIG |
    dnssec-signzone -e now+$((86400 - 3600 - 200)) \
            -i 3600 -j 200 \
            -f signed -o test.example /dev/stdin
    rm -f test.example test.example.jnl
    mv signed test.example
    # re-load the zone
    rndc thaw test.example
    # re-start signing
    rndc sign test.example

I also modified BIND's re-signing co-ordination code; normally each batch will re-sign any records that are due in the next 5 seconds; I reduced that to 1 second to keep batch sizes small, on the assumption that more churn would help - which it did, a little bit.

But the bug still took a random amount of time to appear, sometimes within a few minutes, sometimes it would take ages.

Finding the bug

Mark (who knows the code very well) took a bottom-up approach; he ran named under valgrind which identified an access to uninitialized memory. (I don't know what led Mark to try valgrind - whether he does it routinely or whether he tried it just for this bug.)

Evan had not been able to reproduce the bug, but once the cause was identified it became clear where it came from.

The commit on the 9th October that exposed the bug was a change to BIND's memory management code, to stop it from deliberately filling newly-allocated memory with garbage.

Before this commit, the missing initialization was hidden by the memory fill, and the byte used to fill new allocations (0xbe) happened to have the right value (zero in the bottom bit) so the signer worked correctly.

Evan builds BIND in developer mode, which enables memory filling, which stopped him from being able to reproduce it.

Verifying the fix

I changed BIND to fill memory with 0xff which (if we were right) should provoke signing failures much sooner. And it did!

Then applying the one-character fix to remove the access to uninitialized memory made the signer work properly again.

Lessons learned

BIND has a lot of infrastructure that tries to make C safer to use, for instance:

  • Run-time assertions to ensure that internal APIs are used correctly;

  • Canary elements at the start of most objects to detect memory overruns;

  • buffer and region types to prevent memory overruns;

  • A memory management system that keeps statistics on memory usage, and helps to debug memory leaks and other mistakes.

The bug was caused by failing to use buffers well, and hidden by the memory management system.

The bug occurred when initializing an rdataslab data structure, which is an in-memory serialization of a set of DNS records. The records are copied into the rdataslab in traditional C style, without using a buffer. (This is most blatant when the code manually serializes a 16 bit number instead of using isc_buffer_putuint16.) This code is particularly ancient which might explain the poor style; I think it needs refactoring for safety.

It's ironic that the bug was hidden by the memory management code - it's supposed to help expose these kinds of bug, not hide them! Nowadays, the right approach would be to link to jemalloc or some other advanced allocator, rather than writing a complicated wrapper around standard malloc. However that wasn't an option when BIND9 development started.

Conclusion

Memory bugs are painful.

The first Oracle to PostgreSQL trial

2017-12-24

I have used ora2pg to do a quick export of the IP Register database from Oracle to PostgreSQL. This export included an automatic conversion of the table structure, and the contents of the tables. It did not include the more interesting parts of the schema such as the views, triggers, and stored procedures.

Oracle Instant Client

Before installing ora2pg, I had to install the Oracle client libraries. These are not available in Debian, but Debian's ora2pg package is set up to work with the following installation process.

  • Get the Oracle Instant Client RPMs

    from Oracle's web site. This is a free download, but you will need to create an Oracle account.

    I got the basiclite RPM - it's about half the size of the basic RPM and I didn't need full i18n. I also got the sqlplus RPM so I can talk to Jackdaw directly from my dev VMs.

    The libdbd-oracle-perl package in Debian 9 (Stretch) requires Oracle Instant Client 12.1. I matched the version installed on Jackdaw, which is 12.1.0.2.0.

  • Convert the RPMs to debs (I did this on my workstation)

    $ fakeroot alien oracle-instantclient12.1-basiclite-12.1.0.2.0-1.x86_64.rpm
    $ fakeroot alien oracle-instantclient12.1-sqlplus-12.1.0.2.0-1.x86_64.rpm
    
  • Those packages can be installed on the dev VM, with libaio1 (which is required by Oracle Instant Client but does not appear in the package dependencies), and libdbd-oracle-perl and ora2pg.

  • sqlplus needs a wrapper script that sets environment variables so that it can find its libraries and configuration files. After some debugging I found that although the documentation claims that glogin.sql is loaded from $ORACLE_HOME/sqlplus/admin/ in fact it is loaded from $SQLPATH.

    To configure connections to Jackdaw, I copied tnsnames.ora and sqlnet.ora from ent.

Running ora2pg

By default, ora2pg exports the table definitions of the schema we are interested in (i.e. ipreg). For the real conversion I intend to port the schema manually, but ora2pg's automatic conversion is handy for a quick trial, and it will probably be a useful guide to translating the data type names.

The commands I ran were:

$ ora2pg --debug
$ mv output.sql tables.sql
$ ora2pg --debug --type copy
$ mv output.sql rows.sql

$ table-fixup.pl <tables.sql >fixed.sql
$ psql -1 -f functions.sql
$ psql -1 -f fixed.sql
$ psql -1 -f rows.sql

The fixup script and SQL functions were necessary to fill in some gaps in ora2pg's conversion, detailed below.

Compatibility problems

  • Oracle treats the empty string as equivalent to NULL but PostgreSQL does not.

    This affects constraints on the lan and mzone tables.

  • The Oracle substr function supports negative offsets which index from the right end of the string, but PostgreSQL does not.

    This affects subdomain constraints on the unique_name, maildom, and service tables. These constraints should be replaced by function calls rather than copies.

  • The ipreg schema uses raw columns for IP addresses and prefixes; ora2pg converted these to bytea.

    The v6_prefix table has a constraint that relies on implicit conversion from raw to a hex string. PostgreSQL is stricter about types, so this expression needs to work on bytea directly.

  • There are a number of cases where ora2pg represented named unique constraints as unnamed constraints with named indexes. This unnecessarily exposes an implementation detail.

  • There were a number of Oracle functions which PostgreSQL doesn't support (even with orafce), so I implemented them in the functions.sql file (see the sketch after this list).

    • regexp_instr()
    • regexp_like()
    • vsize()
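
The shims are mostly one-liners; for example, the two-argument regexp_like() can be defined in terms of PostgreSQL's ~ operator, something like this (a sketch of the sort of thing functions.sql contains, not its exact contents):

-- two-argument form; the three-argument form with flags is similar
CREATE FUNCTION regexp_like(text, text) RETURNS boolean
    LANGUAGE sql IMMUTABLE
    AS $$ SELECT $1 ~ $2 $$;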

Other gotchas

  • The mzone_co, areader, and registrar tables reference the pers table in the jdawadm schema. These foreign key constraints need to be removed.

  • There is a weird bug in ora2pg which mangles the regex [[:cntrl:]] into [[cntrl:]]

    This is used several times in the ipreg schema to ensure that various fields are plain text. The regex is correct in the schema source and in the ALL_CONSTRAINTS table on Jackdaw, which is why I think it is an ora2pg bug.

  • There's another weird bug where a regexp_like(string,regex,flags) expression is converted to string ~ regex, flags which is nonsense.

    There are other calls to regexp_like() in the schema which do not get mangled in this way, but they have non-trivial string expressions whereas the broken one just has a column name.

Performance

The export of the data from Oracle and the import to PostgreSQL took an uncomfortably long time. The SQL dump file is only 2GB so it should be possible to speed up the import considerably.

How to get a preseed file into a Debian install ISO

2017-12-12

Goal: install a Debian VM from scratch, without interaction, and with a minimum of external dependencies (no PXE etc.) by putting a preseed file on the install media.

Sadly the documentation for how to do this is utterly appalling, so here's a rant.

Starting point

The Debian installer documentation, appendix B.

https://www.debian.org/releases/stable/amd64/apbs02.html.en

Some relevant quotes:

Putting it in the correct location is fairly straightforward for network preseeding or if you want to read the file off a floppy or usb-stick. If you want to include the file on a CD or DVD, you will have to remaster the ISO image. How to get the preconfiguration file included in the initrd is outside the scope of this document; please consult the developers' documentation for debian-installer.

Note there is no link to the developers' documentation.

If you are using initrd preseeding, you only have to make sure a file named preseed.cfg is included in the root directory of the initrd. The installer will automatically check if this file is present and load it.

For the other preseeding methods you need to tell the installer what file to use when you boot it. This is normally done by passing the kernel a boot parameter, either manually at boot time or by editing the bootloader configuration file (e.g. syslinux.cfg) and adding the parameter to the end of the append line(s) for the kernel.

Note that we'll need to change the installer boot process in any case, in order to skip the interactive boot menu. But these quotes suggest that we'll have to remaster the ISO, to edit the boot parameters and maybe alter the initrd.

So we need to guess where else to find out how to do this.

Wiki spelunking

https://wiki.debian.org/DebianInstaller

This suggests we should follow https://wiki.debian.org/DebianCustomCD or use simple-cdd.

simple-cdd

I tried simple-cdd but it failed messily.

It needs parameters to select the correct version (it defaults to Jessie) and a local mirror (MUCH faster).

$ time simple-cdd --dist stretch \
        --debian-mirror http://ftp.uk.debian.org/debian
[...]
ERROR: missing required packages from profile default:  less
ERROR: missing required packages from profile default:  simple-cdd-profiles
WARNING: missing optional packages from profile default:  grub-pc grub-efi popularity-contest console-tools console-setup usbutils acpi acpid eject lvm2 mdadm cryptsetup reiserfsprogs jfsutils xfsprogs debootstrap busybox syslinux-common syslinux isolinux
real    1m1.528s
user    0m34.748s
sys     0m1.900s

Sigh, looks like we'll have to do it the hard way.

Modifying the ISO image

Eventually I realise the hard version of making a CD image without simple-cdd is mostly about custom package selections, which is not something I need.

This article is a bit more helpful...

https://wiki.debian.org/DebianInstaller/Preseed

It contains a link to...

https://wiki.debian.org/DebianInstaller/Preseed/EditIso

That requires root privilege and is a fair amount of faff.

That page in turn links to...

https://wiki.debian.org/DebianInstaller/Modify

And then...

https://wiki.debian.org/DebianInstaller/Modify/CD

This has a much easier way of unpacking the ISO using bsdtar, and instructions on rebuilding a hybrid USB/CD ISO using xorriso. Nice.

Most of the rest of the page is about changing package selections which we already determined we don't need.

Boot configuration

OK, so we have used bsdtar to unpack the ISO, and we can see various boot-related files. We need to find the right ones to eliminate the boot menu and add the preseed arguments.

There is no syslinux.cfg in the ISO so the D-I documentation's example is distressingly unhelpful.

I first tried editing boot/grub/grub.cfg but that had no effect.

There are two boot mechanisms on the ISO, one for USB and one for CD/DVD. The latter is in isolinux/isolinux.cfg.

Both must be edited (in similar but not identical ways) to get the effect I want regardless of the way the VM boots off the ISO.
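
The edits themselves amount to adding the preseed parameters to the kernel command line (plus a default entry and a short timeout so the menu is skipped), roughly like this (a sketch; the exact labels and file layout depend on the installer image):

# parameters added to the append/linux lines in both boot configs
auto=true priority=critical preseed/file=/cdrom/preseed.cfg

# e.g. the isolinux boot entry ends up looking something like
label install
  menu label ^Install
  kernel /install.amd/vmlinuz
  append vga=788 initrd=/install.amd/initrd.gz auto=true priority=critical preseed/file=/cdrom/preseed.cfg --- quiet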

Unpacking and rebuilding the ISO takes less than 3 seconds on my workstation, which is acceptably fast.

Ongoing DNSSEC work

2017-10-05

We reached a nice milestone today which I'm pretty chuffed about, so I wanted to share the good news. This is mostly of practical interest to the Computer Lab and Mathematics, since they have delegated DNSSEC signed zones, but I hope it is of interest to others as well.

I have a long-term background project to improve the way we manage our DNSSEC keys. We need to improve secure storage and backups of private keys, and updating public key digests in parent zones. As things currently stand it requires tricky and tedious manual work to replace keys, but it ought to be zero-touch automation.

We now have most of the pieces we need to support automatic key management.

regpg

For secure key storage and backup, we have a wrapper around GPG called regpg which makes it easier to repeatably encrypt files to a managed set of "recipients" (in GPG terminology). In this case the recipients are the sysadmins and they are able to decrypt the DNS keys (and other secrets) for deployment on new servers. With regpg the key management system will be able to encrypt newly generated keys but not able to decrypt any other secrets.

At the moment regpg is in use and sort-of available (at the link below) but this is a temporary home until I have released it properly.

Edited to link to the regpg home page

dnssec-cds

There are a couple of aspects to DNSKEY management: scheduling the rollovers, and keeping delegations in sync.

BIND 9.11 has a tool called dnssec-keymgr which makes rollovers a lot easier to manage. It needs a little bit of work to give it proper support for delegation updates, but it's definitely the way of the future. (I don't wholeheartedly recommend it in its current state.)

For synchronizing delegations, RFC 7344 describes special CDS and CDNSKEY records which a child zone can publish to instruct its parent to update the delegation. There's some support for the child side of this protocol in BIND 9.11, but it will be much more complete in BIND 9.12.

I've written dnssec-cds, an implementation of the parent side, which was committed to BIND this morning. (Yay!) My plan is to use this tool for managing our delegations to the CL and Maths. BIND isn't an easy codebase to work with; the reason for implementing dnssec-cds this way is (I hope) to encourage more organizations to deploy RFC 7344 support than I could achieve with a standalone tool.

https://gitlab.isc.org/isc-projects/bind9/commit/ba37674d038cd34d0204bba105c98059f141e31e

Until our parent zones become enlightened to the ways of RFC 7344 (e.g. RIPE, JANET, etc.) I have a half-baked framework that wraps various registry/registrar APIs so that we can manage delegations for all our domains in a consistent manner. It needs some work to bring it up to scratch, probably including a rewrite in Python to make it more appealing.

Conclusion

All these pieces need to be glued together, and I'm not sure how long that will take. Some of this glue work needs to be done anyway for non-DNSSEC reasons, so I'm feeling moderately optimistic.

DNS server rollout report

2015-02-16

Last week I rolled out my new DNS servers. It was reasonably successful - a few snags but no showstoppers.

Read more ...

Recursive DNS rollout plan - and backout plan!

2015-01-30

The last couple of weeks have been a bit slow, being busy with email and DNS support, an unwell child, and surprise 0day. But on Wednesday I managed to clear the decks so that on Thursday I could get down to some serious rollout planning.

My aim is to do a forklift upgrade of our DNS servers - a tier 1 service - with negligible downtime, and with a backout plan in case of fuckups.

Read more ...

BIND patches as a byproduct of setting up new DNS servers

2015-01-17

On Friday evening I reached a BIG milestone in my project to replace Cambridge University's DNS servers. I finished porting and rewriting the dynamic name server configuration and zone data update scripts, and I was - at last! - able to get the new servers up to pretty much full functionality, pulling lists of zones and their contents from the IP Register database and the managed zone service, and with DNSSEC signing on the new hidden master.

There is still some final cleanup and robustifying to do, and checks to make sure I haven't missed anything. And I have to work out the exact process I will follow to put the new system into live service with minimum risk and disruption. But the end is tantalizingly within reach!

In the last couple of weeks I have also got several small patches into BIND.

Read more ...

Recursive DNS server failover with keepalived --vrrp

2015-01-09

I have got keepalived working on my recursive DNS servers, handling failover for testdns0.csi.cam.ac.uk and testdns1.csi.cam.ac.uk. I am quite pleased with the way it works.

Read more ...

Network setup for Cambridge's new DNS servers

2015-01-07

The SCCS-to-git project that I wrote about previously was the prelude to setting up new DNS servers with an entirely overhauled infrastructure.

Read more ...

Uplift from SCCS to git

2014-11-27

My current project is to replace Cambridge University's DNS servers. The first stage of this project is to transfer the code from SCCS to Git so that it is easier to work with.

Ironically, to do this I have ended up spending lots of time working with SCCS and RCS, rather than Git. This was mainly developing analysis and conversion tools to get things into a fit state for Git.

If you find yourself in a similar situation, you might find these tools helpful.

Read more ...

The early days of the Internet in Cambridge

2014-10-30

I'm currently in the process of uplifting our DNS development / operations repository from SCCS (really!) to git. This is not entirely trivial because I want to ensure that all the archival material is retained in a sensible way.

I found an interesting document from one of the oldest parts of the archive, which provides a good snapshot of academic computer networking in the UK in 1991. It was written by Tony Stonely, aka <ajms@cam.ac.uk>. AJMS is mentioned in RFC 1117 as the contact for Cambridge's IP address allocation. He was my manager when I started work at Cambridge in 2002, though he retired later that year.

The document is an email discussing IP connectivity for Cambridge's Institute of Astronomy. There are a number of abbreviations which might not be familiar...

  • Coloured Book: the JANET protocol suite
  • CS: the University Computing Service
  • CUDN: the Cambridge University Data Network
  • GBN: the Granta Backbone Network, Cambridge's duct and fibre infrastructure
  • grey: short for Grey Book, the JANET email protocol
  • IoA: the Institute of Astronomy
  • JANET: the UK national academic network
  • JIPS: the JANET IP service, which started as a pilot service early in 1991; IP traffic rapidly overtook JANET's native X.25 traffic, and JIPS became an official service in November 1991, about when this message was written
  • PSH: a member of IoA staff
  • RA: the Mullard Radio Astronomy Observatory, an outpost at Lords Bridge near Barton, where some of the dishes sit on the old Cambridge-Oxford railway line. (Not the Rutherford Appleton Laboratory, a national research institute in Oxfordshire - I originally misunderstood the reference.)
  • RGO: The Royal Greenwich Observatory, which moved from Herstmonceux to the IoA site in Cambridge in 1990
  • Starlink: a UK national DECnet network linking astronomical research institutions

Edited to correct the expansion of RA and to add Starlink

    Connection of IoA/RGO to IP world
    ---------------------------------

This note is a statement of where I believe we have got to and an initial
review of the options now open.

What we have achieved so far
----------------------------

All the Suns are properly connected at the lower levels to the
Cambridge IP network, to the national IP network (JIPS) and to the
international IP network (the Internet). This includes all the basic
infrastructure such as routing and name service, and allows the Suns
to use all the usual native Unix communications facilities (telnet,
ftp, rlogin etc) except mail, which is discussed below. Possibly the
most valuable end-user function thus delivered is the ability to fetch
files directly from the USA.

This also provides the basic infrastructure for other machines such as
the VMS hosts when they need it.

VMS nodes
---------

Nothing has yet been done about the VMS nodes. CAMV0 needs its address
changing, and both IOA0 and CAMV0 need routing set for extra-site
communication. The immediate intention is to route through cast0. This
will be transparent to all parties and impose negligible load on
cast0, but requires the "doit" bit to be set in cast0's kernel. We
understand that PSH is going to do all this [check], but we remain
available to assist as required.

Further action on the VMS front is stalled pending the arrival of the
new release (6.6) of the CMU TCP/IP package. This is so imminent that
it seems foolish not to await it, and we believe IoA/RGO agree [check].

Access from Suns to Coloured Book world
---------------------------------------

There are basically two options for connecting the Suns to the JANET
Coloured Book world. We can either set up one or more of the Suns as
full-blown independent JANET hosts or we can set them up to use CS
gateway facilities. The former provides the full range of facilities
expected of any JANET host, but is cumbersome, takes significant local
resources, is complicated and long-winded to arrange, incurs a small
licence fee, is platform-specific, and adds significant complexity to
the system managers' maintenance and planning load. The latter in
contrast is light-weight, free, easy to install, and can be provided
for any reasonable Unix host, but limits functionality to outbound pad
and file transfer either way initiated from the local (IoA/RGO) end.
The two options are not exclusive.

We suspect that the latter option ("spad/cpf") will provide adequate
functionality and is preferable, but would welcome IoA/RGO opinion.

Direct login to the Suns from a (possibly) remote JANET/CUDN terminal
would currently require the full Coloured Book package, but the CS
will shortly be providing X.29-telnet gateway facilities as part of
the general infrastructure, and can in any case provide this
functionality indirectly through login accounts on Central Unix
facilities. For that matter, AST-STAR or WEST.AST could be used in
this fashion.

Mail
----

Mail is a complicated and difficult subject, and I believe that a
small group of experts from IoA/RGO and the CS should meet to discuss
the requirements and options. The rest of this section is merely a
fleeting summary of some of the issues.

Firstly, a political point must be clarified. At the time of writing
it is absolutely forbidden to emit smtp (ie Unix/Internet style) mail
into JIPS. This prohibition is national, and none of Cambridge's
doing. We expect that the embargo will shortly be lifted somewhat, but
there are certain to remain very strict rules about how smtp is to be
used. Within Cambridge we are making best guesses as to the likely
future rules and adopting those as current working practice. It must
be understood however that the situation is highly volatile and that
today's decisions may turn out to be wrong.

The current rulings are (inter alia)

        Mail to/from outside Cambridge may only be grey (Ie. JANET
        style).

        Mail within Cambridge may be grey or smtp BUT the reply
        address MUST be valid in BOTH the Internet AND Janet (modulo
        reversal). Thus a workstation emitting smtp mail must ensure
        that the reply address contained is that of a current JANET
        mail host. Except that -

        Consenting machines in a closed workgroup in Cambridge are
        permitted to use smtp between themselves, though there is no
        support from the CS and the practice is discouraged. They
        must remember not to contravene the previous two rulings, on
        pain of disconnection.

The good news is that a central mail hub/distributer will become
available as a network service for the whole University within a few
months, and will provide sufficient gateway function that ordinary
smtp Unix workstations, with some careful configuration, can have full
mail connectivity. In essence the workstation and the distributer will
form one of those "closed workgroups", the workstation will send all
its outbound mail to the distributer and receive all its inbound mail
from the distributer, and the distributer will handle the forwarding
to and from the rest of Cambridge, UK and the world.

There is no prospect of DECnet mail being supported generally either
nationally or within Cambridge, but I imagine Starlink/IoA/RGO will
continue to use it for the time being, and whatever gateway function
there is now will need preserving. This will have to be largely
IoA/RGO's own responsibility, but the planning exercise may have to
take account of any further constraints thus imposed. Input from
IoA/RGO as to the requirements is needed.

In the longer term there will probably be a general UK and worldwide
shift to X.400 mail, but that horizon is probably too hazy to rate more
than a nod at present. The central mail switch should in any case hide
the initial impact from most users.

The times are therefore a'changing rather rapidly, and some pragmatism
is needed in deciding what to do. If mail to/from the IP machines is
not an urgent requirement, and since they will be able to log in to
the VMS nodes it may not be, then the best thing may well be to await
the mail distributer service. If more direct mail is needed more
urgently then we probably need to set up a private mail distributer
service within IoA/RGO. This would entail setting up (probably) a Sun
as a full JANET host and using it as the one and only (mail) route in
or out of IoA/RGO. Something rather similar has been done in Molecular
Biology and is thus known to work, but setting it up is no mean task.
A further fall-back option might be to arrange to use Central Unix
facilities as a mail gateway in similar vein. The less effort spent on
interim facilities the better, however.

Broken mail
-----------

We discovered late in the day that smtp mail was in fact being used
between IoA and RA, and the name changing broke this. We regret having
thus trodden on existing facilities, and are willing to help try to
recover any required functionality, but we believe that IoA/RGO/RA in
fact have this in hand. We consider the activity to fall under the
third rule above. If help is needed, please let us know.

We should also report sideline problem we encountered and which will
probably be a continuing cause of grief. CAVAD, and indeed any similar
VMS system, emits mail with reply addresses of the form
"CAVAD::user"@....  This is quite legal, but the quotes are
syntactically significant, and must be returned in any reply.
Unfortunately the great majority of Unix systems strip such quotes
during emission of mail, so the reply address fails. Such stripping
can occur at several levels, notably the sendmail (ie system)
processing and the one of the most popular user-level mailers. The CS
is fixing its own systems, but the problem is replicated in something
like half a million independent Internet hosts, and little can be done
about it.

Other requirements
------------------

There may well be other requirements that have not been noticed or,
perish the thought, we have inadvertently broken. Please let us know
of these.

Bandwidth improvements
----------------------

At present all IP communications between IoA/RGO and the rest of the
world go down a rather slow (64Kb/sec) link. This should improve
substantially when it is replaced with a GBN link, and to most of
Cambridge the bandwidth will probably become 1-2Mb/sec. For comparison,
the basic ethernet bandwidth is 10Mb/sec. The timescale is unclear, but
sometime in 1992 is expected. The bandwidth of the national backbone
facilities is of the order of 1Mb/sec, but of course this is shared with
many institutions in a manner hard to predict or assess.

For Computing Service,
Tony Stoneley, ajms@cam.cus
29/11/91