Transients in the Palomar Observatory Sky Survey

I got some of the code running and experimented with visualizing the filtering.

What the pipeline is doing is finding all the blobs of light in the plates, and then rejecting ones that are known stars or that don't look like points of light. The first step is filtering against the Gaia catalog; something like 90% of all detected blobs of light are rejected. Here I show the 15 most tenuous rejections, and five others at random.

Of note there WAS a cyan "spike" category, which was for rejecting blobs that were detected in things like diffraction effects from nearby bright stars. This had a couple of issues. Firstly, the range of a "nearby" star was coded as 90' when it should be 90". Correcting that surfaced some spike rejections where the star had simply moved a bit (proper motion) and was effectively flagging itself as a "spike" source. So I instructed my robot to add proper-motion calculations where possible, to figure out where the star WAS in the 1950s.

There were a few other issues, but I think this has been a good exercise in validating and improving @HoaxEye's pipeline. I'm documenting everything I do. I'd like to get it into a trivially replicable Docker build.
Thank you.
"Firstly, the range of a "nearby" star was coded as 90' when it should be 90"
That 90' figure is a clear bug in the MNRAS 2022 paper itself. I have posted and commented about it. The correct range is 90" (arcsecs) and not 90' (arcmins). 90' is huge, the size of a small continent, so it does not make any sense.

Quote from the paper, spikes' removal:
External Quote:

(a) For each SEXTRACTOR source, we look for counterparts in the USNO-B1.0 (Monet et al. 2003) in a circular region of 90-arcmin radius.
Source: https://academic.oup.com/mnras/article/515/1/1380/6607509

"So I instructed my robot to add the Proper Motion calculations where possible to figure out where the star WAS in the 1950s."
Proper motion part is addressed by the paper and implemented as-is in Vasco60:
External Quote:
For each one of the SExtractor sources fulfilling all the conditions described in the previous steps, we cross-matched them with Gaia EDR3 in a 180 arcmin radius, keeping all the counterparts in that radius. For these Gaia counterparts, we kept those having proper motion information, which was used to correct the position of the Gaia counterparts to the POSS I epoch. The adopted epoch for Gaia was J2016.0. SExtractor sources having a Gaia counterpart (corrected at POSS I epoch) at less of 5 arcsec were flagged as high proper motion sources and, therefore, removed from the list of candidates
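For intuition, the epoch correction the paper describes is small arithmetic. Below is a minimal sketch using a small-angle linear approximation; the function name and units are mine, and a real pipeline would more likely use astropy's `SkyCoord.apply_space_motion`:

```python
import math

def propagate_to_epoch(ra_deg, dec_deg, pm_ra_cosdec_masyr, pm_dec_masyr,
                       from_epoch=2016.0, to_epoch=1951.0):
    """Linearly propagate a catalog position to another epoch.

    pm_ra_cosdec_masyr is mu_alpha* (Gaia's pmra, which already includes
    the cos(dec) factor); proper motions are in mas/yr, positions in degrees.
    """
    dt_yr = to_epoch - from_epoch          # negative when going back to POSS I
    mas_per_deg = 3.6e6
    ddec = pm_dec_masyr * dt_yr / mas_per_deg
    dra = (pm_ra_cosdec_masyr * dt_yr / mas_per_deg) / math.cos(math.radians(dec_deg))
    return ra_deg + dra, dec_deg + ddec

# A star with 100 mas/yr of PM drifts 6.5" over the 65 years between
# J2016.0 and 1951 -- more than the paper's 5" match radius.
ra_1951, dec_1951 = propagate_to_epoch(100.0, 0.0, 0.0, 100.0)
```

This ignores parallax and radial velocity, which is negligible at the arcsecond scales involved here.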
 
"Firstly, the range of a "nearby" star was coded as 90' when it should be 90"
That 90' figure is a clear bug in the MNRAS 2022 paper itself. I have posted and commented about it. The correct range is 90" (arcsecs) and not 90' (arcmins). 90' is huge, the size of a small continent, so it does not make any sense.
Your code also uses 90', is this to replicate their code?

https://github.com/jannefi/vasco60/blob/5dbf452/vasco/mnras/spikes.py#L117
Python:
class SpikeConfig:
    rmag_key: str = "rMeanPSFMag"
    rules: List[Any] | None = None
    search_radius_arcmin: float = 90.0
    rmag_max_catalog: float = 16.0
...
        # If none within search radius (convert arcmin->arcsec), keep
        if not (dmin_arcsec <= cfg.search_radius_arcmin * 60.0 and m_near is not None):
            r2 = dict(r); r2["spike_reason"] = ""
            kept.append(r2)
            continue
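To make the unit bug concrete: with the 90.0 default above, the gate the code actually applies is 5400 arcseconds, sixty times the 90-arcsecond scale the spike rule intends. A minimal sanity check (values taken from the snippet above):

```python
# Values from the SpikeConfig snippet; "intended" per the arcsecond-scale
# spike rule discussed in this thread.
search_radius_arcmin = 90.0                   # SpikeConfig default
gate_arcsec = search_radius_arcmin * 60.0     # conversion done in the code
intended_gate_arcsec = 90.0                   # the arcsecond-scale rule

assert gate_arcsec == 5400.0
assert gate_arcsec / intended_gate_arcsec == 60.0
```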
 
Yes. I fixed it in the calling code without touching the original SpikeConfig.

https://github.com/jannefi/vasco60/commit/6c9e393c3b79406ec3734661b44dce485994f3e8 - this is the fix commit
 
Proper motion part is addressed by the paper and implemented as-is in Vasco60:
But it seems in your code, the 5-arcsecond Gaia cone match is done before the PM back-propagation, and the back-propagation is then only applied to the sources that matched — i.e., the ones already eliminated as known Gaia stars. The sources that drifted more than 5″ from their Gaia J2016 positions (the high-PM ones that actually needed epoch correction) never reach the back-propagation step, because they failed the upstream cone match.

My change was to do the back-propagation before the 5" check.

This came up because of:
some spike rejections where the star had simply moved a bit (the proper motion) and was detecting itself as a "spike" source
In those cases, the PM was more than 5", so they were not filtered by the 5" check.
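The fix is an ordering change: epoch-correct first, then apply the 5" test. A self-contained sketch with hypothetical field names (a flat-sky separation is adequate at these scales):

```python
import math

def propagate(ra, dec, pmra_cosdec, pmdec, from_epoch=2016.0, to_epoch=1951.0):
    # Linear back-propagation; proper motions in mas/yr, positions in degrees.
    dt = to_epoch - from_epoch
    dec2 = dec + pmdec * dt / 3.6e6
    ra2 = ra + pmra_cosdec * dt / 3.6e6 / math.cos(math.radians(dec))
    return ra2, dec2

def sep_arcsec(ra1, dec1, ra2, dec2):
    # Small-angle flat-sky separation; fine at arcsecond scales.
    dra = (ra1 - ra2) * math.cos(math.radians(dec1))
    return math.hypot(dra, dec1 - dec2) * 3600.0

def is_known_star(cand, gaia_rows, match_arcsec=5.0):
    """Epoch-correct each Gaia counterpart BEFORE the 5" cone test.
    Matching first would miss exactly the high-PM stars that moved
    more than 5" between 1951 and J2016."""
    for g in gaia_rows:
        ra, dec = propagate(g["ra"], g["dec"], g["pmra"], g["pmdec"])
        if sep_arcsec(cand["ra"], cand["dec"], ra, dec) <= match_arcsec:
            return True
    return False

# A star with 100 mas/yr of PM sits 6.5" from its J2016 position in 1951:
# a naive match-first order sees "no Gaia counterpart within 5 arcsec",
# while the corrected order identifies it as the same known star.
cand = {"ra": 10.0, "dec": -6.5 / 3600.0}
gaia = [{"ra": 10.0, "dec": 0.0, "pmra": 0.0, "pmdec": 100.0}]
```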

 
4. Astroquery's Vizier class has an issue where setting ROW_LIMIT as a class attribute doesn't propagate to instances. If your cone queries are only returning 50 rows, the veto is effectively not running. This will inflate survivor counts.
That's fixed in my version of the code, vasco/external_fetch_usnob_vizier.py

Code:
  Hunk 1 — default cap bumped:
  -    row_limit: int = 20000,
  +    row_limit: int = 200000,

  Hunk 2 — the actual row_limit bug fix:
  -    # Configure Vizier
  -    Vizier.ROW_LIMIT = int(row_limit) if row_limit and row_limit > 0 else -1
  +    # row_limit must be passed to the Vizier() ctor — setting Vizier.ROW_LIMIT as a class attr is shadowed by the instance default (50).
       cols = columns or _USNOB_COLUMNS
  -    viz = Vizier(columns=cols)
  +    rl = int(row_limit) if row_limit and row_limit > 0 else -1
  +    viz = Vizier(columns=cols, row_limit=rl)

Before the fix it was returning 50 rows; after, 32,383 on my test tile.
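The shadowing itself is plain Python attribute lookup, not anything Vizier-specific. A generic sketch (FakeVizier is mine, not astroquery's actual internals):

```python
class FakeVizier:
    ROW_LIMIT = 50                       # library-style class default
    def __init__(self, row_limit=50):
        # The ctor writes an instance attribute, so for this object the
        # class attribute is never consulted again.
        self.ROW_LIMIT = row_limit

FakeVizier.ROW_LIMIT = 200000            # class-level "fix"...
viz = FakeVizier()                       # ...shadowed by the ctor default
assert viz.ROW_LIMIT == 50

viz = FakeVizier(row_limit=200000)       # passing it to the ctor works
assert viz.ROW_LIMIT == 200000
```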
 
No, it's not. It's reducing the PS1 bright star fetch from 35' to 3.0' (which should actually be 45' to cover the 60x60' tile). The 90' is still there.
You're right to question the radii here, my bad. I said that this commit "fixed the MNRAS 2022 arcmin/arcsec bug" without re‑reviewing the actual spike code. That unit bug (arcmin vs arcsec) is real, but it was already addressed earlier in the MNRAS spike logic (the spike rejection itself is arcsecond‑based). I might have fixed it back in the "Vasco30" project already, and that commit history isn't visible in "Vasco60".

The commit I posted does not change the spike physics or the arcsec‑scale rejection rule. It only affects how bright stars are queried/filtered around candidates, and it mixes legacy tile‑level fetch radii with candidate‑level proximity checks. I should have documented this.

The 3′ value is intended as a per‑candidate proximity limit, not a tile‑coverage radius. The 90′ value is a legacy catalog prefetch radius. They serve different purposes. I should have explained that more clearly. The commit message/comment was misleading, thanks for noticing this.
 
The commit I posted does not change the spike physics or the arcsec‑scale rejection rule. It only affects how bright stars are queried/filtered around candidates, and it mixes legacy tile‑level fetch radii with candidate‑level proximity checks.
It seems to be a query around the tile center, not stars.

That unit bug (arcmin vs arcsec) is real, but it was already addressed earlier in the MNRAS spike logic (the spike rejection itself is arcsecond‑based). I might have fixed it back in "Vasco30" project already and that commit history isn't visible in "Vasco60".
Vasco60 uses the 90', as you have it as a default:

search_radius_arcmin: float = 90.0

in class SpikeConfig, and cli_pipeline.py:610 does not override it to 1.5'.
 
search_radius_arcmin: float = 90.0 - yes, that's in the code. But:
- It is not used as the spike cutoff
- It is not the distance used in the rejection rule

Spike rejection is governed by arcsecond‑scale rules. Arcminute radii only decide which stars are even considered.

I think we're talking past each other because there are three different radii involved, and they serve different purposes.
  • Spike physics is arcsecond‑scale (the MNRAS rule uses d_arcsec). That part is correct in the code and was already fixed earlier; the commit I linked earlier does not touch it.
  • The 90′ value in SpikeConfig is a matching gate, a "don't bother evaluating the spike rules if the nearest bright star is farther than this". It is never reached in practice because the upstream fetch (radius_arcmin=3.0) only ever supplies stars within 3′. The gate is dead code relative to the fetch constraint.
  • The 3′ value introduced here is a per‑candidate proximity prefilter: stars farther away physically cannot generate spikes affecting that detection, so they are ignored for efficiency.
So: 3′ is not intended to "cover a 60×60′ tile", and it does not replace the arcsecond‑scale spike rule. I realise my earlier comment was misleading about this.
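Spelled out as constants (names are mine; values from this discussion):

```python
# Three radii, three jobs -- none of them interchangeable:
PREFETCH_RADIUS_ARCMIN = 90.0  # legacy tile-level catalog prefetch (SpikeConfig default)
PROXIMITY_ARCMIN = 3.0         # per-candidate bright-star prefilter at fetch time
SPIKE_RULE_ARCSEC = 90.0       # the arcsecond-scale MNRAS rejection rule

# The 90' gate is dead code: the 3' fetch means no star it could
# reject ever reaches it.
assert PROXIMITY_ARCMIN < PREFETCH_RADIUS_ARCMIN
assert SPIKE_RULE_ARCSEC < PROXIMITY_ARCMIN * 60.0   # 90" < 180"
```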

I tried to clarify all this with a new code comment:
https://github.com/jannefi/vasco60/blob/main/vasco/cli_pipeline.py#L585
External Quote:

# 3) Bright-star spike removal via PS1 (within ~3′, r<=16)
# The spike physics operates on arcsecond separations
# the "90 arcsec" mentioned in MNRAS 2022 is the intended scale (the "90 arcmin" wording in the paper is a confirmed typo).
# The 3 arcmin value here is a per-candidate prefilter to limit which bright stars
# are even considered; it does NOT replace or redefine the arcsecond-scale spike rule.
# A small margin is used while staying physically motivated for Schmidt-plate spikes.
 
A new paper has been submitted to ArXiv:

Independent Recovery of Vanishing Sources on POSS-I Photographic Plates Using Automated Source Detection and Cross-Epoch Matching by Zachary Hayes
Source:
https://arxiv.org/abs/2604.04810
PDF attached.

I have a question concerning section 3.4, Temporal Correlation with Nuclear Tests (page 4 of the paper and PDF); apologies in advance if it is naïve or a non-issue for you guys who have been following this subject:

External Quote:
The baseline rate—days more than 3 days from any test—is 13.3% (271/2,038 days).
Why is the baseline rate 3 days from any test, not 2 (i.e. excluding days -1 and +1)?

Unrelated to this question: if the times of the relevant tests and plate exposures were known (I'm not sure they are), would it have been more methodologically elegant to use a window 36 hours each side of each test, as opposed to calendar days?
(This still takes it as axiomatic that it makes sense to include a span of time before successful tests, indicating a very high level of understanding of human affairs, or precognition, on the part of ETI).
 
I recall there being a comment in the initial nuclear-association paper that the authors blindly chose the +/- 1 day date range to register their target ahead of running the numbers, presumably to avoid p-hacking -- picking data after the fact to bring the association up to something statistically meaningful.

The problem, as you note, is that using dates is wildly imprecise, as all the observations are clustered around midnight Pacific time (and some plates whose 40-minute exposure crosses midnight effectively belong to both days), while the British and Soviet tests were in time zones offset by about 12 hours. I don't want to repeat the discussion from the other thread -- but we have both the timestamps for when the plates were exposed and the precise times of all the nuclear tests, so the analysis could have been done with regularized UTC times. However, having to pick an interval of hours rather than days for testing for associations would highlight the arbitrariness of the interval, given the lack of any underlying physical mechanism.
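How misleading calendar days can be is easy to show with stdlib datetimes; the specific times below are invented purely for illustration:

```python
from datetime import datetime, timedelta, timezone

# Invented example: a Soviet-site test (~UTC+6 local time) and a Palomar
# exposure (UTC-8) separated by well under 24 hours of real time.
test_utc = datetime(1957, 8, 22, 2, 0, tzinfo=timezone.utc)
plate_utc = datetime(1957, 8, 21, 7, 30, tzinfo=timezone.utc)

# Local calendar days put them two days apart...
test_day = (test_utc + timedelta(hours=6)).date()    # Aug 22 local
plate_day = (plate_utc - timedelta(hours=8)).date()  # Aug 20 local
day_gap = (test_day - plate_day).days                # 2 "calendar days"

# ...while the true separation is 18.5 hours, well inside a +/-36 h window.
hours_gap = abs((test_utc - plate_utc).total_seconds()) / 3600.0
```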
 
Why is the baseline rate 3 days from any test, not 2 (i.e. excluding days -1 and +1)?
I think the headline here should be that this new paper seems to support the null hypothesis. As it says:
External Quote:

A Bruehl-style calendar-day comparison gives a descriptive post-test asymmetry (RR = 1.35, 95% CI [0.91, 2.00]), but that statistic is tied to the survey schedule in our dataset. A negative binomial model of nightly counts with nightly patch coverage as exposure is null (IRR = 1.03, 95% CI [0.89, 1.18], p = 0.71).
The RR = 1.35 number looks interesting at first glance, but it's actually meaningless. Hayes's catalog is so contaminated (2.85 million candidates from loose detection parameters) that every single one of the 368 observation nights has at least one candidate. So the "calendar-day" metric ("did this day have a vanished source?") degenerates into "did Palomar happen to be observing that day?" You can reproduce the RR = 1.35 and p = 0.17 purely from the observation schedule, with no transient data at all. It's just that POSS-I happened to observe on 17.9% of [+1] days vs. 13.3% of baseline days.
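That schedule-only reproduction is one line of arithmetic, using the two percentages just quoted:

```python
# Every observing night has >=1 candidate in the contaminated catalog, so
# "day with a vanished source" reduces to "day Palomar was observing".
p_post = 0.179   # fraction of [+1] days on which POSS-I was observing
p_base = 0.133   # baseline fraction (271/2,038 days)

rr = p_post / p_base
print(round(rr, 2))   # reproduces RR = 1.35 with no transient data at all
```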

When Hayes uses the correct statistical test (a negative binomial model that looks at actual nightly counts with sky coverage as an exposure term) the result is flatly null across every window tested. ±1 day: IRR = 1.03, p = 0.71. Post-test only: IRR = 1.06, p = 0.60. Pre-test only: IRR = 1.01, p = 0.95.

So, as discussed earlier, the nuclear correlation is just a function of which days Palomar was observing on. And even that is not statistically significant. It's just noise.
 
(My emphasis).

The authors presuppose the existence of "transients", their chosen term for hypothesized, previously undetected physical objects in space near the Earth (and it is clear from the general discourse that they are at least amenable, without any other evidence whatsoever, to the idea that these "transients", if they ever existed, might be ETI technological artefacts), and state it is a
External Quote:
...reasonable working assumption
that both photographic defects and "transients" are present.

They do not state why this is a "reasonable" assumption.

A reasonable assumption, based on everything that we know about astronomical observations, both optical and radio, is that there are no technological artefacts of unknown origin near the Earth now, or in the past.
Another reasonable assumption is that the authors have, at best, misinterpreted historical photographic records, made unrealistic assumptions and drawn unrealistic conclusions in line with their pre-existing beliefs and/or wants. At best, a sort of type-1 experimental error.
Yes. The assumption of the 'mysterious transient' crowd seems to be that before 1950, aliens placed many, many thousands of small, reflective satellites in geosynchronous orbit for unknown reasons, and they have all since disappeared.
 
"Seventy years later, in 2021, Dr. Beatriz Villarroel and her VASCO project team identified a puzzling anomaly in its digitized version. Within a 10×10 arcminute section — about the size of a dime held at arm's length — they spotted nine stars, only to see them disappear 30 minutes later in the subsequent blue-sensitive plate."

Source: https://medium.com/@izabelamelamed/not-seeing-the-star-cloud-for-the-stars-a010af28b7d6


DUH! Using plates sensitive to different colors is deliberate; it's how astronomers determine the color of stars. Comparing a star's image on the red-sensitive and blue-sensitive plates enables astronomers to determine its color, and a star that is very red can be expected to be bright on the red-sensitive plate and to disappear on the blue-sensitive one, and vice-versa. This is not a bug, it's a feature; it is the whole reason two different color-sensitive plates are used in the first place.

Really, this whole argument sounds like it was invented by someone who did not understand why astronomers were using two different color-sensitive plates.
 
Maybe you smarter people can help a dullard out here. All of these papers floating around seem to use two related sets of "transients", either 5,399 or 107,000. Am I correct that the 5,399 number is a subset of the 107,000, which is a subset of the 298,165 original candidates, as described by Villarroel:

External Quote:

The 5,399 catalogue was therefore not created for completeness but for the search for a candidate vanishing star.
Despite multiple threads, multiple attempts and a whole lot of coding by what seem like smart people, no one has yet been able to replicate the exact filtering process, right? The authors have not shared the coding pipeline they used to arrive at these numbers, but have apparently shared some of the resulting data with other "independent researchers".

The above claim is that the 5,399 number was not about "...completeness but the search for a candidate vanishing star". At face value, I would take that to mean one or maybe a few of the 5,399 transients left after all the filtering and cross matching took place, might be a star that was successfully photographed in the '50s and was/is no longer visible in the sky when later surveys were conducted. Not an unreasonable idea I guess.

But, any quick read of the VASCO website makes it clear that's not what it means. "Vanishing stars" is a euphemism for glinting or flashing objects in Earth orbit during the '50s, but not now. Or as they call them, transients. Further reading shows that the VASCO project strongly suggests that these transients are in fact techno-signatures in Earth orbit. A dog-whistle term for alien UFOs. Something Villarroel has confirmed in interviews.

So, 5,399 transients that don't correspond to any known celestial body, are not defects of any kind, and might be alien craft/probes orbiting Earth in the early-to-mid 20th century. Having established this in the original Solano paper, Villarroel and others then looked for correlations between these transients/techno-signatures and UFO-related sightings, stories and dubious UFO nuclear-testing lore.

But wait, when they went looking for correlations, Villarroel recommends further filtering of the 5,399 transients:

External Quote:

Anyone using the 5,399 transients is also recommended to add similar additional filtering steps, corresponding to what was done for the 298,165 to ∼ 107,000 objects, to remove potential duplicates and sources that escaped the Gaia and PanSTARRS filtering.
Just me, but I would think adding "...additional filtering steps" would further dilute the pool of possible transients to a number less than 5,399. However, it seems the opposite is the case. When looking for correlations, one is to use a much larger 107,000 sample size:

External Quote:

Thus, the 107,000-object sample is far more appropriate for statistical inference than the 5,399-object subset used by Watters et al., both because of its substantially larger size, removal of certain false positives and because it avoids cross-matching against a zoo of catalogues across the electromagnetic spectrum with differing and uneven sky coverages.
https://arxiv.org/pdf/2602.15171, Villarroel et al - A Response to Watters et al (2026).

So, if one were to search for correlations with the 5,399 sample, would none be found? It seems that's what happened when Villarroel's former co-author, W. Watters, tried to search for a transient-nuclear test relationship using the smaller number he helped arrive at. Thus, Villarroel lets Watters and others know that they should use a sample set nearly 20 times larger than the original one.

If the point is to filter out as many known celestial bodies and as much noise as possible from the original POSS 1 plates to find the true transients, then isn't the smaller number more likely to be more accurate? If so, why fool around with the much larger one? IIRC, somewhere back in this thread, the suggestion was made that a larger number was needed to be more statistically meaningful, which is what Villarroel suggests in the above quote.

But if there is a finite number of nuclear tests in a given time period, and theoretically a finite number of UFO reports in the same time frame, why can the number of transients being correlated with those finite counts be ~20 times larger than the original and presumably more thoroughly filtered sample? If the 107,000 sample had failed to produce a meaningful correlation, would the suggestion then be to use the full 298,165 sample they started with? Are they just using a larger sample to get the results they want, and then sending the larger sample to other researchers so they arrive at the same conclusions? The above quotes from Villarroel suggest that further sampling and filtering will change the 5,399 transients into the more useful 107,000 transients.

I guess it may be that after finding the 5,399 transients, they went back and did it again on a bigger sample set and found a lot more transients using the same technique. Maybe?

I'm not getting something.
 
Despite multiple threads, multiple attempts and a whole lot of coding by what seem like smart people, no one has yet been able to replicate the exact filtering process, right? The authors have not shared the coding pipeline they used to arrive at these numbers, but have apparently shared some of the resulting data with other "independent researchers".
I very much doubt that the process will ever be independently fully replicated. I've been doing a bit of work with @HoaxEye on this, and there are numerous pitfalls and sensitivities. Unless you have the original source code with pinned modules, the original data, and online databases that still work and are unchanged, there will inevitably be differences.

There's also significant subjectiveness in the code. What number do you use for certain criteria? 5" vs 6" can make a big difference. How do you determine if something is star-like when it's <5 pixels across?

That said, if there's a signal in there, it might be possible to get through that noise. I'm not seeing it yet.
 