HoaxEye
Senior Member
I made a small helper script for checking the pipeline status per tile: https://github.com/jannefi/vasco/blob/main/scripts/tile_status.py (note: it assumes data is in ./data and tiles are under ./data/tiles; modify if needed). I went through the data and noticed a large number of coordinates without a FITS file. It's a known issue that POSS-I doesn't cover the whole northern sky, and that is handled, but looking at the coordinates it became obvious that something was wrong.
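For anyone who wants a quick check without the full script, here is a minimal sketch of the same idea. It is not the code from tile_status.py. It assumes each tile is a subdirectory of ./data/tiles and that a downloaded tile contains at least one file ending in .fits or .fit. The function name tiles_missing_fits is made up for this example.

```python
from pathlib import Path


def tiles_missing_fits(tiles_root):
    """Return sorted names of tile directories that contain no FITS file.

    Assumption: one subdirectory per tile, FITS files end in .fits or .fit.
    """
    root = Path(tiles_root)
    missing = []
    for tile_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        # Search the tile directory recursively for any FITS file.
        has_fits = any(
            f.is_file() and f.suffix.lower() in {".fits", ".fit"}
            for f in tile_dir.rglob("*")
        )
        if not has_fits:
            missing.append(tile_dir.name)
    return missing


if __name__ == "__main__":
    tiles_root = Path("./data/tiles")  # same default layout the post describes
    if tiles_root.is_dir():
        missing = tiles_missing_fits(tiles_root)
        print(f"{len(missing)} tile(s) without a FITS file")
        for name in missing:
            print(name)
    else:
        print(f"{tiles_root} not found, adjust the path")
```

Compressed files such as .fits.gz are not detected by this sketch, so adjust the suffix check if your tiles are stored that way.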
The script downloads tiles from STScI (https://stdatu.stsci.edu/dss/script_usage.html). Sometimes the service returns images from other surveys if the area is not covered by POSS-I, but it also didn't work correctly for many coordinates inside the POSS-I coverage. The plate finder (https://archive.stsci.edu/cgi-bin/dss_plate_finder) revealed that the service does contain many POSS-I plates, but those are simply not returned via the cgi-bin interface. After searching GitHub and other code repositories, I found an undocumented value for the -v parameter (survey number): poss1_red. It's not a survey number mentioned in the documentation, just a string, but it works. I hacked the downloader code so that it always uses the poss1_red parameter. In doing so I broke the old, "clean" internal survey parameter handling, which is fine because this pipeline won't need images from any other survey for now.
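To illustrate the change, here is a rough sketch of how a request URL with the survey forced to poss1_red could be built. This is not the actual downloader code. The endpoint URL, the query parameter names other than v (r, d, e, h, w, f, c), the use of decimal degrees, and the 30 arcmin tile size are my assumptions about the cgi-bin interface, so check them against the script usage page before relying on this. The function name build_dss_url is made up for the example.

```python
from urllib.parse import urlencode

# Assumed endpoint of the DSS cgi-bin search service.
DSS_SEARCH_URL = "https://archive.stsci.edu/cgi-bin/dss_search"


def build_dss_url(ra_deg, dec_deg, size_arcmin=30.0, survey="poss1_red"):
    """Build a DSS cutout URL with the survey pinned to POSS-I red.

    The survey string "poss1_red" is the undocumented value from the post.
    All other parameter names and values are assumptions for illustration.
    """
    params = {
        "v": survey,                 # survey selector, always poss1_red here
        "r": f"{ra_deg:.6f}",        # right ascension (assumed decimal degrees)
        "d": f"{dec_deg:.6f}",       # declination (assumed decimal degrees)
        "e": "J2000",                # equinox
        "h": f"{size_arcmin:.2f}",   # cutout height in arcminutes
        "w": f"{size_arcmin:.2f}",   # cutout width in arcminutes
        "f": "fits",                 # output format
        "c": "none",                 # no compression
    }
    return DSS_SEARCH_URL + "?" + urlencode(params)


if __name__ == "__main__":
    # Example coordinates only; nothing is downloaded here.
    print(build_dss_url(10.684700, 41.269000))
```

Hard-coding the survey as the default keeps the call sites simple. Passing a different survey argument would be the place to restore the old multi-survey handling later.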
I have to backfill all tiles that failed to download for any reason and run them through the whole pipeline (steps 1-6 + post steps). That's over 3000 tiles and will probably take 24-48 hours to complete. In practice even longer, because I can't leave my laptop running unattended.
If any of you have used the software, please use the latest version, rebuild your docker image, and run ./scripts/backfill-complete.sh. Before that you might want to take a look at your tile status using the new small script that revealed this latest issue.
