New IP models for POL2 data

Summary:

New IP models are now available when reducing POL2 data. At 450 um the new model is a distinct improvement over the previous model. At 850 um the new and old models show little difference. To use the new models, it is necessary to add ipmodel=AUG2019 to the configuration when running pol2map:

% pol2map config='ipmodel=AUG2019'

Details:

Some POL2 users have seen evidence of significant inaccuracy in the model used to correct 450 um data for Instrumental Polarisation (IP). For instance, this can be seen in the following plots, which compare Q maps for DR21, created from observations taken in 2015 and 2017. The bottom left image shows the 450 um Q map from 2017 data, the bottom centre image shows the 450 um Q map from 2015 data and the bottom right image shows the difference. All these images use the same scaling. The difference appears to be a scaled version of the total intensity image (shown at top right), as confirmed by the scatter plot shown in the top centre.

The fact that the difference between the 2015 and 2017 Q maps looks like a scaled version of the total intensity map suggests strongly that the difference is caused by an inaccuracy in the IP correction (the IP correction subtracts a fraction of the total intensity map from the Q and U maps).

The current model used for IP correction is based on a set of observations of a bright unpolarised point source (Uranus). However, the high noise and low Q/U levels at 450 um in these Uranus observations make it difficult to determine the model accurately. So a procedure has been developed that allows the IP model to be determined instead from a set of observations of any bright extended (possibly polarised) sources. This procedure has been applied to existing observations of OMC1/OrionA, DR21 and G034.257+0.155 to determine new IP models at both 850 um and 450 um. Observations of OrionB and Serpens main field2 were also used, but failed to produce any usable data at 450 um.

The new procedure is described in detail in the attached file.

Using the new 450 um IP model, there seems to be no significant difference between the Q/U maps made from the 2015 data and the 2017 data for DR21, as shown below:

and for completeness the corresponding U maps are shown below – again there is no significant difference between 2015 and 2017:

Controlling the masks used by pol2map

The map-making process used by the pol2map command uses two masks, each of which divides the field up into source and background regions:

  • The ‘AST’ mask: this is used to define the background regions that are to be forced to zero after each iteration of the map-maker algorithm (except the last iteration). This form of masking helps prevent the growth of artificial large scale structures within the map. Any real astronomical signal present within the masked background regions will tend to be suppressed in the final map, so it is important that the AST mask correctly identifies regions of significant emission down to a low level.
  • The ‘PCA’ mask: this is used to define the source regions that are to be excluded from the Principal Component Analysis. This analysis is used to remove the correlated backgrounds in the bolometer time stream data. The time-stream data for astronomical sources are not correlated across bolometers, and so tend to disrupt the PCA. For this reason source regions are excluded from the analysis.

Two separate masks are used because experience has shown that disruption of the PCA is caused mainly by the brighter central source regions. Consequently, the source regions within the PCA mask can be smaller than the source regions within the AST mask.

A total intensity map of DR 21 showing the AST mask in green and the slightly smaller PCA mask in blue.

Default masks are created automatically by pol2map in a manner specified by the MASK parameter.

  1. On ‘step 1’ of a typical POL2 data reduction, MASK is left at its default value of “Auto”, causing new masks to be generated automatically at the end of each iteration of the map-making algorithm. This ‘auto-masking’ process identifies an initial set of sources by thresholding the current map estimate at the SNR value specified by configuration parameter “xxx.ZERO_SNR”, where “xxx” is either “AST” or “PCA”, depending on which mask is being generated. Each of these initial source regions is then expanded to include adjoining pixels down to the SNR level specified by configuration parameter “xxx.ZERO_SNRLO”.
  2. On ‘step 2’ and ‘step 3’ of a typical POL2 data reduction, MASK is set to the coadd of all the total intensity maps created at step 1. The pol2map script first creates a pair of AST and PCA masks from the supplied coadded map and then uses these masks on all iterations of the map-making algorithm. The findclumps command in the Starlink CUPID package is used by pol2map to create the masks. The process used by findclumps is the same as described above for step 1 – initial sources are defined by a fixed SNR threshold within the supplied coadd and these are then extended down to a lower SNR threshold. The pol2map command sets these threshold values to the values of the four configuration parameters listed above – “AST.ZERO_SNR”, “AST.ZERO_SNRLO”, “PCA.ZERO_SNR” and “PCA.ZERO_SNRLO”.

These configuration parameters all default to the following values, which are specified in the pol2map script:

AST.ZERO_SNR = 3 
AST.ZERO_SNRLO = 2 
PCA.ZERO_SNR = 5 
PCA.ZERO_SNRLO = 3

If you wish to investigate the effects of changing these values, you should supply new values using the CONFIG parameter of the pol2map command. For instance:

% more conf
ast.zero_snr = 2.5
ast.zero_snrlo = 1.5
% pol2map config=^conf

Any values not specified will retain the default value listed above.
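The override behaviour described above (supplied values replace defaults, everything else keeps its default) can be sketched in a few lines of Python. This is only an illustration of the merging rule, not the code used by pol2map. The function name merge_config is hypothetical, and treating parameter names as case-insensitive is an assumption based on the lower-case names used in the example config file.

```python
# Defaults quoted in the text above (names stored in lower case).
DEFAULTS = {
    "ast.zero_snr": 3.0,
    "ast.zero_snrlo": 2.0,
    "pca.zero_snr": 5.0,
    "pca.zero_snrlo": 3.0,
}


def merge_config(text, defaults=DEFAULTS):
    """Return the defaults, overridden by any 'name = value' lines in text.

    Hypothetical helper: it only illustrates the merging rule and is
    not how pol2map itself parses its CONFIG parameter.
    """
    merged = dict(defaults)
    for line in text.splitlines():
        if "=" not in line:
            continue  # skip blank or malformed lines
        name, value = line.split("=", 1)
        # Assumption: parameter names are matched case-insensitively.
        merged[name.strip().lower()] = float(value)
    return merged


# The contents of the example "conf" file shown above.
conf = """
ast.zero_snr = 2.5
ast.zero_snrlo = 1.5
"""

settings = merge_config(conf)
for name in sorted(settings):
    print(name, "=", settings[name])
```

Here the two AST thresholds take the supplied values, while the two PCA thresholds are left at their defaults.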

Note, prior to 10th July 2019 the above method could only be used at step 1 (the supplied settings were ignored if supplied at step 2 or 3). Later versions of pol2map do not suffer from this problem – the supplied values are honoured at all steps.
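To make the two-threshold masking scheme more concrete, here is a minimal Python sketch of the idea: seed pixels are those above the higher SNR threshold, and each seed region is then grown through adjoining pixels that are above the lower threshold. This is not the algorithm used by makemap or findclumps. The function name two_threshold_mask and the toy SNR array are made up for illustration, and taking “adjoining” to mean the four edge-sharing neighbours is an assumption.

```python
from collections import deque

import numpy as np


def two_threshold_mask(snr, snr_hi, snr_lo):
    """Return a boolean source mask from a 2-D SNR map.

    Seeds are pixels with snr >= snr_hi. Each seed region is grown
    through 4-connected neighbours that have snr >= snr_lo.
    (Hypothetical helper; 4-connectivity is an assumption.)
    """
    candidate = snr >= snr_lo
    mask = snr >= snr_hi
    queue = deque(zip(*np.nonzero(mask)))
    ny, nx = snr.shape
    while queue:
        y, x = queue.popleft()
        for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            yy, xx = y + dy, x + dx
            inside = 0 <= yy < ny and 0 <= xx < nx
            if inside and candidate[yy, xx] and not mask[yy, xx]:
                mask[yy, xx] = True
                queue.append((yy, xx))
    return mask


# Toy SNR map: one bright source plus a faint isolated blob (lower right).
snr = np.array([
    [0.5, 1.0, 2.2, 1.0, 0.2],
    [1.0, 2.5, 3.5, 2.1, 0.3],
    [0.4, 2.1, 6.0, 2.4, 0.1],
    [0.2, 0.9, 2.3, 1.0, 2.6],
    [0.1, 0.3, 0.8, 0.4, 2.2],
])

ast_mask = two_threshold_mask(snr, snr_hi=3.0, snr_lo=2.0)  # AST defaults
pca_mask = two_threshold_mask(snr, snr_hi=5.0, snr_lo=3.0)  # PCA defaults

print("AST source pixels:", int(ast_mask.sum()))
print("PCA source pixels:", int(pca_mask.sum()))
```

With the default thresholds the PCA source region comes out smaller than the AST source region, as described earlier. The faint blob in the lower right corner is above the lower threshold but never reaches the higher one, so it is not treated as source in either mask.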

Footnote – for completeness it should be mentioned that the COM and FLT models are also masked, in addition to the AST and PCA models. At step 1 (the auto-masking stage) the masking of COM and FLT is controlled by a similar set of configuration parameters to AST or PCA, except that “xxx” becomes “COM” or “FLT”. At steps 2 and 3 (the external-masking stages), the COM and FLT models are masked using the PCA mask generated by findclumps, and so COM and FLT masking cannot be controlled independently of the PCA mask.

Investigating the effects of uncertainties in the POL2 IP model

Vector maps created from POL2 data need to have the instrumental polarisation (IP) removed to be useful. Determining a good model for the IP has been a challenge, and not surprisingly the currently accepted model has somewhat uncertain parameter values. These uncertainties in the IP model translate into uncertainties in the final vector maps in complicated ways that cannot easily be estimated analytically. However, a possible approach to estimating the uncertainties in the vector map is to produce several vector maps using slightly different IP models and then look at the differences between the resulting maps.

The pol2map command provides two options for doing this:

  1. The level of IP – as a percentage of the incoming total intensity – removed by the IP model can be raised or lowered by a specified amount from its default value. To do this, use configuration parameter “ipoffset”. For instance, to create a vector map using an IP level that is 0.3% greater than the default, do:

    % cat conf
    ipoffset=0.3
    % pol2map config=^conf ...

  2. The IP removed by the IP model can be rotated by a specified angle from its default orientation. To do this, use configuration parameter “ipangoff”. For instance, to create a vector map using an IP at an angle of 0.5 degrees to its default angle, do:

    % cat conf
    ipangoff=0.5
    % pol2map config=^conf ...

The above changes should be applied when running pol2map to create the Q and U maps, together with the final vector catalogue (“step 3”).

Previous investigation of the IP model suggests that the above values for ipoffset and ipangoff (i.e. 0.3 % and 0.5 deg.) are reasonable estimates of the uncertainties in the IP model parameters. Rotating the IP by up to 0.5 degree should usually have little effect on the final map, so it is probably reasonable to ignore ipangoff and concentrate on the effects of changing ipoffset. A possible strategy is to use pol2map (step 3) three times to generate three sets of Q and U maps, with corresponding vector catalogues, using ipoffset values of -0.3, 0 and +0.3. The visual differences between the vector maps should give a handle on the uncertainties caused by the IP model.
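To get a feel for why a change of 0.3 % in the IP level matters, here is a toy Python calculation for a single pixel. It assumes a deliberately simplified IP model in which a fixed percentage of I, at a fixed angle, is subtracted from Q and U. The real POL2 IP model is more complicated than this, and the pixel values, the nominal IP level of 1.5 % and the function names are all made up for illustration.

```python
import math


def remove_ip(i, q, u, ip_percent, ip_angle_deg):
    """Subtract a toy IP term (a fixed fraction of I at a fixed angle)."""
    frac = ip_percent / 100.0
    ang = math.radians(ip_angle_deg)
    return (q - frac * i * math.cos(2.0 * ang),
            u - frac * i * math.sin(2.0 * ang))


def pol_percent_and_angle(i, q, u):
    """Percentage polarisation and polarisation angle (degrees)."""
    percent = 100.0 * math.hypot(q, u) / i
    angle = 0.5 * math.degrees(math.atan2(u, q))
    return percent, angle


# Made-up raw pixel values and a made-up nominal IP (assumptions).
I_RAW, Q_RAW, U_RAW = 100.0, 3.5, 1.0
NOMINAL_IP_PERCENT = 1.5
NOMINAL_IP_ANGLE = 0.0

results = {}
for ipoffset in (-0.3, 0.0, 0.3):
    q, u = remove_ip(I_RAW, Q_RAW, U_RAW,
                     NOMINAL_IP_PERCENT + ipoffset, NOMINAL_IP_ANGLE)
    results[ipoffset] = pol_percent_and_angle(I_RAW, q, u)
    print(f"ipoffset={ipoffset:+.1f}: "
          f"p={results[ipoffset][0]:.2f}%  angle={results[ipoffset][1]:.1f} deg")
```

For a weakly polarised pixel like this one, both the percentage polarisation and the angle shift noticeably across the three ipoffset values. The shift becomes proportionally smaller as the true polarisation of the source increases.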

Checking for convergence in pol2map log files

When processing POL2 data, it is important to know that the map-making algorithm converged correctly for all observations. This information is available in the pol2map log file, along with the rest of the makemap or skyloop output. However, the log file can be very long and so finding the relevant information may not be straightforward, particularly if you are unfamiliar with the screen output usually created by makemap or skyloop.

To help with this, I have written a simple Python script called pol2logcheck.py, which searches a specified pol2map log file for the relevant information and reports any observations that did not converge. Use it as in the following example:

% pol2logcheck.py omc1/pol2map.log.3

omc1/pol2map.log.3: 

 Looks like a step 1 log file
  The following observation(s) failed to converge:
    20190104 #14


Combining multiple POL2 fields

If you wish to combine POL2 observations for multiple overlapping fields, the best way to proceed is probably as follows (note, for this to work correctly you will need a Starlink build from 9th June 2019 or later):

    1. Run “step 1” independently for each field. In other words, use pol2map to create an auto-masked total intensity map for each field. The following assumes that pol2map is run within directories called “field1”, “field2”, etc, to create the auto-masked maps and the I/Q/U time-stream data for each field.
    2. Co-add the auto-masked total intensity maps for all fields. First create a text file holding the paths to the separate auto-masked total intensity maps, and then run pol2map as follows:
      % more infiles
      field1/iauto
      field2/iauto
      % pol2map in=^infiles iout=iauto_mosaic \
                qout=! uout=! multiobject=yes
      
      
    3. Run “step 2” and “step 3” for each field, using the mosaic created above as the mask field for all fields. For instance, for the first field:
      % cd field1
      % pol2map in=qudata/\* iout=iext qout=! \
                uout=! mapdir=maps mapvar=yes \
                skyloop=yes mask=../iauto_mosaic 
      % pol2map in=qudata/\* iout=! qout=qext \
                uout=uext mapdir=maps mapvar=yes \
                skyloop=yes mask=../iauto_mosaic \
                ipref=iext cat=mycat debias=yes

      Then do the same for field2, field3, etc. If preferred, steps 2 and 3 can be combined into a single invocation of pol2map:

      % cd field1
      % pol2map in=qudata/\* iout=iext qout=qext \
                uout=uext mapdir=maps mapvar=yes \
                skyloop=yes mask=../iauto_mosaic \
                ipref=iext cat=mycat debias=yes
    4. Co-add the external-masked I/Q/U maps for all fields and create a vector catalogue from the co-added maps. First create a text file holding the paths to the external-masked I/Q/U maps for all fields, and then run pol2map as follows:
      % more infiles
      field1/iext
      field1/qext
      field1/uext
      field2/iext
      field2/qext
      field2/uext
      % pol2map in=^infiles iout=iext_mosaic \
                qout=qext_mosaic uout=uext_mosaic \
                multiobject=yes cat=mycat_mosaic \
                debias=yes ipcor=no
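If you have many fields, typing the list of input maps by hand gets tedious. The text file used in the final step above could be generated with a short Python snippet like the one below. The function name mosaic_inputs is hypothetical, and the directory and map names simply follow the naming used in the example commands.

```python
def mosaic_inputs(field_dirs, map_names=("iext", "qext", "uext")):
    """List the external-masked I/Q/U map paths for each field directory."""
    return [f"{field}/{name}" for field in field_dirs for name in map_names]


# Field directories as used in the example commands above.
lines = mosaic_inputs(["field1", "field2"])

# Print one path per line; redirect the output to a file such as
# "infiles" and pass it to pol2map using in=^infiles.
print("\n".join(lines))
```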

Fix for skyloop convergence problem

A bug in the smurf:skyloop command has recently been found and fixed. This bug could cause negative or zero values to be included in the extinction (“EXT”) model. This in turn could cause effectively random behaviour in the other models, resulting in very poor convergence and some spurious values being introduced into the final map (if convergence does eventually happen). This bug is triggered by one or more observations having some time slices for which no extinction values are available (indicated by the presence of the string “ EXT:” in the skyloop log file). If such holes in the extinction data extend over only a few time slices, skyloop interpolates across them using the extinction values on either side of the hole. However, there was an error in this interpolation that led to the holes being filled with negative or zero values. This bug has now been fixed (as of 22nd May 2019). This fix affects both direct use of skyloop and indirect use via the pol2map command.
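For illustration, the kind of gap filling described above can be sketched in Python as follows. This is not the skyloop code. It assumes that missing extinction values are flagged as NaN, uses simple linear interpolation, and introduces a hypothetical max_gap parameter to stand in for “only a few time slices”.

```python
import numpy as np


def fill_short_gaps(ext, max_gap=5):
    """Linearly interpolate across short interior runs of missing values.

    Missing samples are assumed to be NaN. Runs longer than max_gap,
    and runs touching either end of the array, are left untouched.
    (Hypothetical helper, not the skyloop implementation.)
    """
    ext = np.asarray(ext, dtype=float)
    filled = ext.copy()
    good = np.isfinite(ext)
    if not good.any():
        return filled  # nothing to interpolate from
    idx = np.arange(ext.size)
    interp = np.interp(idx, idx[good], ext[good])

    i = 0
    while i < ext.size:
        if good[i]:
            i += 1
            continue
        j = i
        while j < ext.size and not good[j]:
            j += 1  # find the end of this run of missing samples
        if i > 0 and j < ext.size and (j - i) <= max_gap:
            filled[i:j] = interp[i:j]
        i = j
    return filled


ext = np.array([1.10, 1.12, np.nan, np.nan, 1.18, 1.20])
filled = fill_short_gaps(ext)
print(filled)
```

Because each filled value lies between the good values on either side of the hole, correctly interpolated extinction values can never be zero or negative if the surrounding values are positive.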

David

A new dimmconfig that uses PCA

The $STARLINK_DIR/share/smurf directory includes several “dimmconfig” files that package up commonly used groups of configuration parameter values for use by the makemap command. A new file called dimmconfig_pca.lis has recently been added, which can be combined with other dimmconfig files to tell makemap to include a PCA model in its iterative algorithm (the PCA model removes noise signals that are correlated between multiple bolometer time-streams). For instance, to use a PCA model when creating a map of a bright extended source, you could run makemap as follows:

% more conf
^$STARLINK_DIR/share/smurf/dimmconfig_bright_extended.lis
^$STARLINK_DIR/share/smurf/dimmconfig_pca.lis
% makemap config=^conf

To process compact sources, change “bright_extended” above to “bright_compact”.

Using a PCA model can help to reduce the spurious extended structures that often appear in SCUBA-2 maps (although this benefit is bought at the cost of a much extended run time). For instance, below are four 850 um maps of DR 21 – the top row shows the maps made with the basic “bright extended” dimmconfig, and the bottom row shows the results of adding in the new PCA dimmconfig:

Below are the mosaics of the four observations, with the difference map shown in between:

As another example, the following panes show similar maps for three observations of the Serpens South field:

Faster PCA

By default, the SMURF makemap command identifies and removes a single correlated background signal – called the common-mode – from all bolometer time-streams. However, in some cases there is clear evidence that there is more than one correlated background signal present in the bolometer time streams. This is particularly evident in POL2 data, where the varying instrumental polarisation causes different parts of the focal plane to see different levels of sky polarisation. For POL2, better maps are created if multiple correlated background signals are identified and removed. This is achieved within makemap using a process called Principal Component Analysis (PCA). The downside to using PCA is that it is very slow – it is just about practical in the case of POL2 data because of the very low scan speed and consequent very low sample rate.

A change introduced into SMURF on 1st April 2019 should speed up PCA by a factor of 2 or 3, making POL2 reductions quicker and maybe making PCA background removal practical for non-POL2 data. Maps created using the new PCA system will not be identical to maps made with the old system, but the differences should be well within the noise levels (the pixel variances remain largely unaffected).

To use PCA within a normal run of makemap it is recommended to add the following to your config file:

modelorder = (com,gai,pca,ext,flt,ast,noi) 
pca.pcathresh = -50

It is usually a good idea to mask out the source when calculating the PCA model on the first few iterations, since this seems to aid convergence. The same sort of mask should be used with PCA as is used with AST, but it should only be applied for a few iterations. So for instance for a point source you could add:

pca.zero_circle = 0.01667
pca.zero_niter = 2

This masks the PCA model on the first two iterations using a circle of radius 60 arc-seconds (0.01667 degrees).
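The unit conversion, and the shape of the resulting mask, can be checked with a few lines of Python. The pixel size of 4 arc-seconds and the map size used here are assumptions made purely for illustration, as are the function names. The circle is simply centred on a chosen pixel.

```python
import numpy as np


def arcsec_to_deg(arcsec):
    """Convert an angle from arc-seconds to degrees."""
    return arcsec / 3600.0


def circle_mask(shape, centre, radius_deg, pixel_arcsec):
    """Boolean mask that is True inside a circle of the given radius."""
    radius_pix = radius_deg * 3600.0 / pixel_arcsec
    yy, xx = np.indices(shape)
    return np.hypot(yy - centre[0], xx - centre[1]) <= radius_pix


# 60 arc-seconds, rounded to the precision used in the config above.
radius = round(arcsec_to_deg(60.0), 5)
print("pca.zero_circle =", radius)

# Assumed 51x51 map with 4 arc-second pixels, circle centred mid-map.
mask = circle_mask((51, 51), (25, 25), radius, pixel_arcsec=4.0)
print("masked pixels:", int(mask.sum()))
```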

Using PCA usually causes makemap to converge more slowly, but often produces maps with lower levels of artificial structures. If pca.pcathresh is negative, the absolute value indicates the number of correlated signals to remove as the background in each bolometer. Smaller numbers result in a lower level of noise reduction in the final map, but faster convergence. Larger numbers result in a higher level of noise reduction in the final map, but slower convergence. The default value is 50, which usually seems to be a reasonable compromise.
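For readers unfamiliar with PCA, the following Python sketch shows the basic idea of removing a fixed number of the strongest correlated signals from a set of bolometer time-streams. It uses a plain singular value decomposition on simulated data and is not the SMURF implementation. The function name, the simulated backgrounds and the array sizes are all made up. The ncomp argument plays the role of the absolute value of a negative pca.pcathresh.

```python
import numpy as np


def remove_principal_components(data, ncomp):
    """Remove the ncomp strongest correlated signals from the data.

    data has shape (nbolo, ntime). Each time-stream is mean-subtracted,
    then the ncomp largest singular components are zeroed before the
    array is reconstructed. (Hypothetical helper, not the SMURF code.)
    """
    centred = data - data.mean(axis=1, keepdims=True)
    u, s, vt = np.linalg.svd(centred, full_matrices=False)
    s_clean = s.copy()
    s_clean[:ncomp] = 0.0
    return (u * s_clean) @ vt


# Simulate 40 bolometers that each see two shared background signals
# (with different gains) plus independent white noise.
rng = np.random.default_rng(0)
nbolo, ntime = 40, 500
t = np.linspace(0.0, 1.0, ntime)
background1 = np.sin(2.0 * np.pi * 3.0 * t)
background2 = np.cos(2.0 * np.pi * 7.0 * t)
gains1 = rng.uniform(0.5, 1.5, nbolo)
gains2 = rng.uniform(-1.0, 1.0, nbolo)
noise = 0.05 * rng.standard_normal((nbolo, ntime))
data = 10.0 * np.outer(gains1, background1) \
    + 10.0 * np.outer(gains2, background2) + noise

cleaned = remove_principal_components(data, ncomp=2)
print("RMS before:", data.std())
print("RMS after: ", cleaned.std())
```

In this toy case a single common-mode signal could not remove both backgrounds, because the second background has a different gain pattern across the bolometers. Removing two components leaves little more than the white noise. Note that an SVD of the full time-stream array is expensive, which gives some idea of why PCA is slow on real data.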

The left map below was made with a default common-mode background (no PCA)  and the centre map was made with PCA background removal as shown above. Each observation took about 6 minutes to create without PCA and about 25 minutes with the new faster PCA. The right map shows the difference between the other two maps. All three use the same scaling.

Combining POL2 observations with different reference positions

A bug has recently been fixed in the SMURF pol2map command that could cause the maps created by pol2map to be blurred or show bad negative bowling around bright sources if the input observations do not all use the same reference point. So if you are using pol2map to combine data from POL2 observations with different reference positions, then you should ensure that you are using a version of Starlink that includes the bug fix – i.e. was created on or after 10th January 2019. To check this, do:

% more $STARLINK_DIR/manifests/starlink.version

This will show details about your starlink build, including the date. To update it using rsync from Hilo, see these instructions.

The bug caused the pointing corrections determined at “step 1” to be in error by an amount roughly equal to the change in reference position between observations. These bad pointing corrections in turn lead to various problems (bad bowling or blurring) in the maps created at “step 2”. Consequently, it is not possible to correct an existing bad reduction – the only option is to do the whole reduction again from the start (having first ensured that your starlink installation is up to date).

POL-2: Beware of using ‘mapvar=yes’ at ‘step 1’

Summary: Do not use mapvar=yes on step 1 of a POL-2 reduction as it can cause negative bowling around bright sources in the I map created at step 2.

Production of polarisation maps from POL-2 data is usually performed using the pol2map command within the Starlink SMURF package, as described in the POL-2 Data Reduction cookbook – SC22. An option is available to create the final I, Q and U variances based on the spread of pixel values in a set of maps created from different individual observations. This option is selected by running pol2map with the parameter setting mapvar=yes. Without this option, output variances are calculated using simple error propagation from the variance values stored within the maps made from individual observations.
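The difference between the two ways of estimating the output variances can be illustrated with a short Python sketch. The formulae used here are the standard ones for an inverse-variance weighted mean and are an assumption. They are not necessarily the exact expressions used by pol2map, and the function name and pixel values are made up.

```python
import numpy as np


def coadd_with_variances(maps, variances):
    """Coadd per-observation maps and estimate the variance two ways.

    Returns (coadd, var_propagated, var_spread), where var_propagated
    comes from error propagation of the input variances and var_spread
    comes from the weighted spread of the input values about the coadd.
    (Standard textbook formulae, assumed here for illustration.)
    """
    maps = np.asarray(maps, dtype=float)
    variances = np.asarray(variances, dtype=float)
    weights = 1.0 / variances
    wsum = weights.sum(axis=0)
    coadd = (weights * maps).sum(axis=0) / wsum
    var_propagated = 1.0 / wsum
    nobs = maps.shape[0]
    var_spread = (weights * (maps - coadd) ** 2).sum(axis=0) / wsum / (nobs - 1)
    return coadd, var_propagated, var_spread


# One pixel seen in four observations (made-up values). The scatter
# between observations is larger than the quoted variances suggest.
maps = np.array([[10.0], [12.0], [9.0], [13.0]])
variances = np.full(maps.shape, 0.25)

coadd, var_prop, var_spread = coadd_with_variances(maps, variances)
print("coadd:", coadd[0])
print("variance from error propagation:", var_prop[0])
print("variance from spread of values: ", var_spread[0])
```

In this example the variance based on the spread of values is much larger than the propagated variance, because the observations disagree with each other by more than their individual variances would suggest.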

When creating POL-2 polarisation maps, it is usual first to run pol2map in “auto-masking” mode to create an initial map of total intensity (I). This initial map is used simply to define a mask outlining the areas on the sky where bright emission is expected. This mask is then used in subsequent runs of pol2map to (amongst other things) exclude bright bolometer samples from the estimation of the COM, PCA and FLT background levels – including bright samples would bias these estimates. The first run of pol2map is usually referred to as “step 1”, with subsequent runs being “step 2” and “step 3”. The I map created by step 1 is first converted from total intensity (in pW) into a map of signal-to-noise ratio (SNR). This is done by dividing the pixel data values by the square root of the pixel variance values. This SNR map is then thresholded at a specific value to define the mask containing the source areas.

But the size of the pixel variance values, and consequently the SNR values and the resulting mask, depends on whether the mapvar=yes option was used when running pol2map. In general, using mapvar=yes causes the variances to be higher, meaning that the SNR values are lower. So there are fewer source pixels above the SNR threshold value. Thus, the mask is less effective at excluding bright bolometer samples from the various background estimates. This means the background is overestimated in the vicinity of bright sources, resulting in negative bowling around bright sources when the background is removed. For obvious reasons this effect is worse for brighter sources than for weaker sources.
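The effect on the mask can be seen in a toy one-dimensional example, sketched below in Python. The data values, the two sets of variances and the threshold of 3 are made up for illustration, and snr_mask is a hypothetical helper.

```python
import numpy as np


def snr_mask(data, variance, threshold):
    """True where the signal-to-noise ratio reaches the threshold."""
    return data / np.sqrt(variance) >= threshold


# A cut through a toy source (made-up values, arbitrary units).
data = np.array([0.2, 1.0, 2.0, 6.0, 2.0, 1.0, 0.2])

# Made-up variances: lower without mapvar, higher with mapvar=yes.
var_without_mapvar = np.full(data.shape, 0.16)
var_with_mapvar = np.full(data.shape, 0.64)

mask_without = snr_mask(data, var_without_mapvar, threshold=3.0)
mask_with = snr_mask(data, var_with_mapvar, threshold=3.0)

print("source pixels without mapvar:", int(mask_without.sum()))
print("source pixels with mapvar:   ", int(mask_with.sum()))
```

With the larger variances only the peak pixel survives the threshold, so the wings of the source would be treated as background in the later steps.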

Below are examples of the externally masked I map created at step 2, based on step 1 I maps created using mapvar=yes (left) and mapvar=no (right). The negative bowling in the centre of the left image is obvious (images provided by Kate Pattle).