The photometer section of SPIRE is one of the key instruments on board Herschel. Its legacy depends very much on how well the scan-map observations that it carried out during the Herschel mission can be converted into high-quality maps. To obtain a comprehensive assessment of the current status of SPIRE map-making, and to provide guidance for future development of the SPIRE scan-map data reduction pipeline, we carried out a test campaign on SPIRE map-making. In this report, we present the results of the tests in this campaign. The goals are: (1) Compare the map-makers in the SPIRE pipeline with other map-makers. (2) In particular, identify the strengths and limitations of different map-makers in dealing with the known SPIRE map-making issues, such as the "cooler burp" effect. (3) Assess the resolution-enhancement capabilities of the super-resolution mappers, as compared to the destriper (the pipeline default), and investigate their applicability to various kinds of data as well as caveats or pitfalls to avoid. (4) Enable users to choose the right map-maker for their science. (5) Provide guidance for future development of the SPIRE scan-map data reduction pipeline.
For these purposes, 13 test cases were generated, including data sets obtained in different observational modes and scan speeds, with different map sizes, source brightnesses, and levels of complexity of the extended emission. They also include observations suffering from the "cooler burp" effect, and observations with strong large-scale gradients in the background radiation. The input data for these test cases are time-ordered data (TODs). In this report, TOD is used in the broad sense of a collection of samples containing time, flux density and position information. The data were not formatted as a single HIPE Tod product, but rather consisted of many FITS files, one per scan. Each file is known within HIPE as a Photometer Scan Product (PSP) and contains tables of the calibrated signal, right ascension and declination, with each row corresponding to a time sample and with separate columns for each bolometer. The map-making process turns the TODs into maps.
Among the test cases, 8 are simulated and 5 are real observations. Compared to real observations, a simulated test case has the advantage of possessing the "truth", namely the sky model on which the simulation is based. The truth map provides an unbiased standard against which test maps made by different map-makers can be compared. Allowing for the effects of noise in a given map, deviations from the truth can be used as objective measures of the bias introduced by the map-making process. In the simulations, TODs were generated using two layers of data: a noise layer taken from real SPIRE observations of dark fields (this allows the simulation to include both instrumental noise and confusion noise), and a truth layer, which is a sky-model map based either on a real Spitzer 24 $\mu$m map or on a map of artificial sources.
Seven map-makers participated in this test campaign: (1) Naive mapper (default of the SPIRE standard pipeline until HIPE 8); (2) Destriper in two flavors: (i) Destriper-P0: Destriper with polynomial order = 0 (default of the SPIRE standard pipeline since HIPE 9), and (ii) Destriper-P1: Destriper with polynomial order = 1; (3) Scanamorphos; (4) SANEPIC (GLS map-maker); (5) Unimap (GLS map-maker); (6) HiRes (super-resolution map-maker); (7) SUPREME (super-resolution map-maker). Because of time constraints, not all map-makers processed all the test cases (see Table 0.1 for details).
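
To make the step from TODs to a map concrete, the following is a minimal sketch of a naive mapper with median baseline removal. It is not the HIPE implementation: the function name `naive_map`, its arguments, and the choice of one median offset per scan are assumptions made for illustration, and the samples are assumed to have been projected onto integer pixel indices beforehand.

```python
import numpy as np

def naive_map(x_pix, y_pix, signal, scan_id, shape):
    """Bin time-ordered samples into a map by simple averaging.

    All names here are hypothetical, chosen for this sketch only.
    x_pix, y_pix : integer map-pixel indices of each sample
    signal       : calibrated flux density of each sample
    scan_id      : integer label of the scan each sample belongs to
    shape        : (ny, nx) of the output map
    """
    signal = np.asarray(signal, dtype=float).copy()
    scan_id = np.asarray(scan_id)

    # Median baseline removal: subtract one constant offset per scan.
    for s in np.unique(scan_id):
        sel = scan_id == s
        signal[sel] -= np.median(signal[sel])

    # Accumulate the sum of samples and the number of hits per map pixel.
    flat = np.ravel_multi_index((np.asarray(y_pix), np.asarray(x_pix)), shape)
    npix = shape[0] * shape[1]
    total = np.bincount(flat, weights=signal, minlength=npix)
    hits = np.bincount(flat, minlength=npix)

    # Average where there is coverage; leave uncovered pixels as NaN.
    with np.errstate(invalid="ignore", divide="ignore"):
        image = np.where(hits > 0, total / hits, np.nan)
    return image.reshape(shape), hits.reshape(shape)
```

Because the median is taken over all samples of a scan, emission that fills a large part of the scan raises the estimated baseline, which is one way the over-subtraction of the background by the Naive mapper can arise.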
Table 0.1: Test Cases Processed by Different Map-Makers. [Table Here]
Results of the tests are presented in the framework of four sets of metrics:
(1) Deviation from the truth. These metrics include: (i) visual examination of the difference map $\mathrm{Map} - \mathrm{Map}_{\rm true}$; (ii) a scatter plot of $(S - S_{\rm true})$ vs $S_{\rm true}$ for individual pixels; (iii) slopes of these plots; (iv) absolute deviations: mean and standard deviation of $S - S_{\rm true}$; (v) relative deviations: mean and standard deviation of $(S - S_{\rm true})/S_{\rm true}$. They are applied to maps of the 5 simulated test cases (Cases 2, 4, 6, 9, 10) that are based on real MIPS 24 $\mu$m maps (simulated cases based on artificial sources are excluded). The results clearly demonstrate the applicability and limitations of the individual map-makers. Destriper-P0 produces the smallest deviations in most cases, but its maps show artificial stripes for the cases with the "cooler burp" effect. Scanamorphos, running with the "galactic option" and without the "relative gain corrections", can minimize the "cooler burp" effect. However, bright pixels in Scanamorphos maps display large deviations, likely due to a slight positional offset introduced by the mapper and a slight change in the beam size. Destriper-P1, SANEPIC, and Unimap introduce different types of noise on large spatial scales. For SANEPIC, this is likely due to mismatches between the assumptions made in the map-maker and the properties of the test data. For example, SANEPIC assumes that the data are circulant, which is not true for Case 9. For Unimap, the large-scale distortion in maps of Case 6 is triggered by the "cooler burp" effect, which the map-maker has no mechanism to handle. For the Naive mapper (with simple median background removal), many maps show large deviations due to the over-subtraction of the background when extended emission is present.
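
The pixel-level deviation metrics (iii) to (v) can be sketched as follows. This is an illustrative outline rather than the code used in the campaign: the function name `deviation_metrics`, the straight-line least-squares fit used for the slope, and the `min_truth` cut that guards the relative deviation against near-zero truth values are all assumptions.

```python
import numpy as np

def deviation_metrics(test_map, truth_map, min_truth=0.0):
    """Compare a test map with the truth map, pixel by pixel.

    min_truth is a hypothetical threshold: only pixels with
    S_true > min_truth enter the relative-deviation statistics.
    """
    s = np.asarray(test_map, dtype=float).ravel()
    t = np.asarray(truth_map, dtype=float).ravel()

    # Use only pixels that are defined in both maps.
    good = np.isfinite(s) & np.isfinite(t)
    s, t = s[good], t[good]
    diff = s - t

    # Slope of (S - S_true) versus S_true from a linear least-squares fit.
    slope, intercept = np.polyfit(t, diff, 1)

    # Relative deviation, restricted to pixels with sufficient truth flux.
    sel = t > min_truth
    rel = diff[sel] / t[sel]

    return {
        "slope": slope,
        "intercept": intercept,
        "mean_abs": diff.mean(),
        "std_abs": diff.std(),
        "mean_rel": rel.mean(),
        "std_rel": rel.std(),
    }
```

A slope near zero together with a small mean deviation indicates that the map-maker neither rescales nor offsets the flux; a map that is, say, 10% too bright everywhere would give a slope of about 0.1.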
(2) Spatial (2-D) power spectra. These metrics include (i) plots and comparisons of power spectra of maps made by different map-makers; (ii) for simulated cases, plots and comparisons of the divergence from the truth power spectrum of the maps made by different map-makers. Most of the power spectra, whether from real or simulated data, noise-only or with extended emission, show very similar results. In the "middle part" ($k = [0.1, 1]$ arcmin$^{-1}$), results among different map-makers vary little: $\sim 1\%$ for cases where a truth map was available as a benchmark. At smaller scales ($k > 1$ arcmin$^{-1}$), the standard Naive mapper produces higher power than the other map-makers, presumably due to the fine stripes (baseline removal errors) found in its maps. Meanwhile, at the same scales, results of Destriper-P0, Destriper-P1, Unimap, and SANEPIC are always very close, and those of Scanamorphos are usually lower. The low power at high spatial frequencies in Scanamorphos maps is likely due to the fact that, unlike other map-makers, Scanamorphos distributes the signal measured at a sky position among multiple adjacent map pixels. This is equivalent to smoothing the map, which removes high-frequency power. At larger scales ($k < 0.1$ arcmin$^{-1}$), the Naive mapper again produces higher power because of the poor baseline removal, while the results of the other map-makers are all comparable. In the special cases with the "cooler burp", the power spectra of the Naive and Destriper-P0 maps are clearly affected, showing much higher power at $k < 0.1$ arcmin$^{-1}$ and a peak at $k \sim 1.5$ arcmin$^{-1}$ in the PLW map. No significant effects due to the "cooler burp" are found in the results of the other map-makers. (It should be noted that the Naive mapper and the Destriper were not designed to treat the cooler burp. Earlier stages of the standard pipeline will do this in future HIPE versions.)
(3) Point source and extended source photometry.
These metrics include (i) astrometry of point sources; (ii) point source and extended source photometry; (iii) detection rates of faint point sources, obtained using Starfinder (a point source extractor); (iv) PSF profiles. They are applied to the simulated test cases with artificial sources (Cases 1, 5 and 8). The results show that bright sources in maps made by Scanamorphos have systematically larger position errors ($\gtrsim 0.1$ pixel) than those in maps made by other map-makers, consistent with the position offsets in Scanamorphos maps found in Metrics (1) for the deviation from the truth. Photometry for bright point sources in all maps has small errors, indicating good energy conservation by all map-makers. On the other hand, photometry of extended sources in the Naive mapper maps is significantly affected by a known bias due to the over-subtraction of baselines, while other maps have no such issue. For faint point sources ($f = 30$ mJy), no significant difference is found among the results of different map-makers in either detection rate or photometry. Also, there is no significant difference between the beam profiles of sources in maps made by different map-makers.
(4) Metrics for super-resolution maps. These metrics are applied to maps made by HiRes and SUPREME, the two super-resolution mappers, and compare them to maps made by the destriper (the pipeline default). They include: (i) visual examination of the maps; (ii) spatial power spectra; (iii) point source profiles. The results show that SUPREME and HiRes yield similar resolution enhancements (factors of 2-3) at spatial frequencies around 2 arcmin$^{-1}$ for the limited datasets tested at 250 microns.
At higher spatial frequencies, corresponding to spatial scales smaller than the beam size, there is less power in the SUPREME maps (intentionally, to smooth and reduce the noise at scales smaller than the beam). HiRes maps contain more power than either SUPREME or Destriper-P0 maps at spatial scales of 15-20 arcseconds. The differences between SUPREME and HiRes arise mainly because SUPREME is tuned to enhance extended emission features, whereas HiRes essentially performs a deconvolution in image space.
Summary of Results:
• The Destriper with polynomial order 0 (Destriper-P0), which has been the default map-maker in the SPIRE scan-map pipeline since HIPE 9, performed remarkably well and compared favorably with all other map-makers in all test cases except those suffering from the "cooler burp" effect, which it has no mechanism to deal with. In particular, it handles observations with complex extended emission structures and with large-scale background gradients very well.
• In contrast, the Destriper with polynomial order 1 (Destriper-P1) compared poorly with its peers, introducing significant artificial large-scale gradients in many cases.
• Scanamorphos showed noticeable differences in all comparisons. On the positive side, its maps have the smallest deviation from the truth for faint pixels ($f < 0.2$ Jy) in nearly all cases. In particular, as shown in both the difference maps and the power spectra, it handles the "cooler burp" effect very well. On the negative side, for bright pixels ($f > 0.2$ Jy), its maps show significant deviations from the truth, likely due to a slight positional offset introduced by the mapper as well as a slight change in the beam size. This effect is also seen in the astrometric errors of the bright sources.
However, the offset is very small ($\sim 0.1$ pixel), so it does not affect the photometry of either point sources or extended sources, and does not show up in the comparison between beam profiles (resolution: 0.2 pixels). The power spectrum analysis indicates some smoothing of the data compared to the other map-makers.
• The GLS mapper SANEPIC can also minimize the "cooler burp" effect. It performed quite well in most cases. However, for those cases with strong variations on very large scales (i.e. comparable to the map size), its maps show significant deviations from the truth. This is because some of its assumptions (e.g. that TODs are circulant) are invalid for the data.
• Unimap, the other participating GLS mapper, is among the best performers in most cases. However, because it does not include a mechanism for handling the "cooler burp", its maps show significant deviations from the truth in the cases affected by this artifact.
• The Naive mapper (with simple median background removal) is in general inferior to its peers. The most severe bias it introduces is the over-subtraction of the background when extended emission is present. In cases where the extended emission has a complex structure, this bias cannot be avoided by applying simple masks in the background removal.
• The two super-resolution map-makers, SUPREME and HiRes, yield similar resolution enhancements (factors of 2-3) at spatial frequencies around 2 arcmin$^{-1}$ for the limited datasets tested at 250 microns. At higher spatial frequencies, corresponding to spatial scales smaller than the beam size, there is less power in the SUPREME maps (intentionally, to smooth and reduce the noise at scales smaller than the beam). HiRes maps contain more power than either SUPREME or Destriper-P0 maps at spatial scales of 15-20 arcseconds.
The differences between SUPREME and HiRes arise mainly because SUPREME is tuned to enhance extended emission features, whereas HiRes essentially performs a deconvolution in image space.
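
The spatial power spectra used throughout this comparison can be illustrated with a short sketch of an azimuthally averaged 2-D power spectrum. It is not the code used in the campaign: the function name `radial_power_spectrum`, the linear binning in $k$, the normalization, and the replacement of undefined pixels by zero are assumptions made for illustration.

```python
import numpy as np

def radial_power_spectrum(image, pixel_arcmin, nbins=20):
    """Azimuthally averaged power spectrum of a 2-D map.

    pixel_arcmin is the pixel size in arcmin, so the returned
    spatial frequencies k are in arcmin^-1 (names are hypothetical).
    """
    img = np.nan_to_num(np.asarray(image, dtype=float))  # NaN -> 0
    img = img - img.mean()                                # remove the k = 0 term
    ny, nx = img.shape

    # 2-D power per Fourier mode (normalization chosen for this sketch).
    power = np.abs(np.fft.fft2(img)) ** 2 / (nx * ny)

    # Radial spatial frequency of every Fourier mode.
    kx = np.fft.fftfreq(nx, d=pixel_arcmin)
    ky = np.fft.fftfreq(ny, d=pixel_arcmin)
    k = np.hypot(*np.meshgrid(kx, ky))

    # Average the power in linearly spaced annuli of k.
    edges = np.linspace(0.0, k.max(), nbins + 1)
    which = np.clip(np.digitize(k.ravel(), edges) - 1, 0, nbins - 1)
    psum = np.bincount(which, weights=power.ravel(), minlength=nbins)
    count = np.bincount(which, minlength=nbins)
    with np.errstate(invalid="ignore", divide="ignore"):
        spectrum = np.where(count > 0, psum / count, np.nan)

    centers = 0.5 * (edges[:-1] + edges[1:])
    return centers, spectrum
```

For a simulated case, dividing the spectrum of a test map by that of the truth map, bin by bin, gives one possible measure of the divergence from the truth power spectrum; a ratio below one at high $k$ would indicate smoothing, and a ratio above one would indicate added small-scale structure such as residual stripes.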