Changes to the NDF (.sdf) Data Format

Those of you who carefully read our Starlink release notes (everyone, right?) will have noticed that our last release (2018A) included a new HDF5-based data format.

The 2018A release could read data in the new format, but by default still wrote data in the old format. In preparation for our next release we have now changed the default data format in our codebase. This means if you download the Linux* nightly build from EAO (see http://starlink.eao.hawaii.edu/starlink/rsyncStarlink) you will now start producing data files in the new format.

Our data files will still have the same ‘.sdf’ extension. You can tell if a file is in the old or new format most easily by using the unix command ‘file’. For an old-style .sdf file you will see ‘data’. for the new-style .sdf files you will see: ‘Hierarchical Data Format (version 5) data.

If you need to convert a new style .sdf file to the old format, you can set the environmental variable ‘HDS_VERSION’ to 4, and then run the ndfcopy command on your file. The output file will now be in the old style format. You’ll need to unset the variable to go back to writing your output in the new format.

E.g. in bash, you could do:

kappa
export HDS_VERSION=4
ndfcopy myfile.sdf myfile-hdsv4.sdf
unset HDS_VERSION

If you need to use the old format all the time you could set HDS_VERSION=4 in your login scripts.

If you want to know more detail, you should know that the NDF data structures you use with Starlink are written to disk in a file-format known as HDS. This new file format we are now using is version 5 of the HDS file format, and is based on the widely used Hierarchical Data Format (HDF) standard. It is possible to open and read the new-style files with standard HDF5 libraries — e.g. the python H5py library. There are a few unusual features you may find, particularly if looking at the history or provenance structures. The new format is described in Tim Jenness’ paper (https://ui.adsabs.harvard.edu/#abs/2015A&C….12..221J/abstract). Our file format and data model are described in the HDS manual ( SUN/92) and the NDF manual (SUN/33). If you’re interested in the history of NDF and how it compares to the FITS file format/data model you could also read Jenness et al’s paper on Learning from 25 years of the extensible N-Dimensional Data Format

We haven’t changed the format that the JCMT or UKIRT write raw data in, so all raw data from both these telescopes will continue to be written in the old format. We also haven’t yet changed the format for new JCMT reduced data available through the CADC archive, but that is likely to happen at some point. We don’t intend to remove the ability for Starlink to read old files. The version of the starlink-pyhds python library available through pypi can read the new-format data files already.

If you have any questions or issues with the new file format you can contact the Starlink developers list https://www.jiscmail.ac.uk/cgi-bin/webadmin?A0=STARDEV, or if you prefer you can directly email the observatory via the  helpdesk AT eaobservatory.org email address. If you’re subscribed to the JCMT/EAO software-discussion@eaobservatory.org  email address you can send comments there (see https://www.eaobservatory.org/jcmt/help/softwareblog/ for subscription links/instructions).

*Only the Linux build because we currently have an issue with our OSX nightly build and it is not updating right now– our hardware has died and I haven’t got the temporary replacement up and running yet.