Deinterlacing The Navy UAP Videos

logicbear

Member
This is what frame 984 of Gimbal looks like, showing the object's geometric properties detected by the method described here.

1712837679172.png


Clearly the accuracy of the method is heavily degraded on frames like this, where the object's sudden movements lead to more pronounced combing artifacts from interlacing. The automated motion tracking of the clouds filters and averages errors over a large area, which counteracts this to a degree, but its accuracy is also affected. So what follows is a lengthy but still inconclusive investigation into what can be done to improve the accuracy of these methods in the face of such artifacts. This post mainly focuses on describing the problem in detail and figuring out what happened to the video.

1. Attempts to use existing deinterlacing algorithms


Many different deinterlacing algorithms exist that can often do a decent job of alleviating such artifacts. Among the methods natively supported by FFMPEG, for example, I tried BWDIF (Bob Weaver Deinterlacing Filter).

The frames from the original WMV file were extracted to the current folder, cropped and stored in a lossless image format with the following FFMPEG command:
Bash:
FORMAT_CROP="format=gray, crop=428:428:104:27"
OUT_IMAGES="-start_number 0 gimbal%04d.png"
ffmpeg -i "../2 - GIMBAL.wmv" -vf "$FORMAT_CROP" $OUT_IMAGES
With BWDIF deinterlacing the command is as follows. Note that the command requires some of the variables from above to be defined.
Bash:
BWDIF="bwdif=mode=send_frame:parity=bff:deint=all" # send_frame keeps the same frame rate, and bottom field first (bff) looks better on frame 984 than parity=tff
ffmpeg -i "../2 - GIMBAL.wmv"  -vf "$FORMAT_CROP, $BWDIF" $OUT_IMAGES

Here's a comparison between the original frame 1002 and the same frame deinterlaced with BWDIF:
[compare width=400]
1002_orig.png

1002_bwdif.png

[/compare]
And for frame 984:
[compare width=400]
981_orig.png

981_bwdif.png

[/compare]
So BWDIF helps but the results are still significantly wavy.

QTGMC is another widely recommended deinterlacing method that I tried. It has many dependencies, but for Windows someone made a package that contains all of them, so you can just extract it and start using it. I changed the last four lines of "qtgmc.avs", going for the highest quality possible, although many settings remain to be tweaked:
Code:
FFMpegSource2("2 - GIMBAL.wmv")
QTGMC(preset="Very Slow", ShowSettings=false, NNSize=3, NNeurons=4, EdiQual=2)
SelectOdd()
Then the following command runs the filter:
Bash:
../ffmpeg.exe -i ../qtgmc.avs -vf "$FORMAT_CROP" $OUT_IMAGES

It does substantially improve the results on some frames. Here's a comparison between BWDIF and QTGMC on frame 1002:
[compare width=400]
1002_bwdif.png

1002_qtgmc.png

[/compare]
But on other frames like 984 the difference is much smaller.
[compare width=400]
981_bwdif.png

981_qtgmc.png

[/compare]

None of the methods I tried was able to completely remove the artifacts, and they all tended to blur the images. Perhaps state-of-the-art deinterlacing methods like this one could do even better, but after digging deeper it quickly became clear that the video actually has artifacts that these methods were not designed to handle.

2. A closer look at the frames


Normally interlaced video should just consist of one snapshot of the scene on the even lines and another snapshot at a different time on the odd lines of the frames. But that is not the case in Gimbal and GoFast. Mick illustrated this by overlaying a grill of alternating horizontal lines over frame 1002.
[compare width=600]
mick_1002_grill_1.jpg


mick_1002_grill_2.jpg

[/compare]
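For anyone who wants to reproduce that kind of overlay, here is a minimal Python sketch. It is not necessarily the exact method used for the images above; it assumes the gimbal1002.png frame extracted with the command in section 1, and simply blacks out alternating lines so that each field can be inspected on its own.
Python:
import numpy as np
from PIL import Image

frame = np.array(Image.open("gimbal1002.png").convert("L"))

grill_top = frame.copy()
grill_top[1::2, :] = 0     # black out the odd lines, leaving only the top field
grill_bottom = frame.copy()
grill_bottom[0::2, :] = 0  # black out the even lines, leaving only the bottom field

Image.fromarray(grill_top).save("gimbal1002_grill_top.png")
Image.fromarray(grill_bottom).save("gimbal1002_grill_bottom.png")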

Alternatively we can use the following commands to extract the top/bottom fields of the video:
Bash:
SCALE_VERT="scale=iw:2*ih:sws_flags=neighbor"
FIELD_TOP="setfield=tff, field=0" # make sure the field dominance is set to 'tff' to get consistent results
FIELD_BOTTOM="setfield=tff, field=1"
ffmpeg -i "../2 - GIMBAL.wmv" -vf "$FORMAT_CROP, $FIELD_TOP, $SCALE_VERT" -start_number 0 gimbal%04d_0.png
ffmpeg -i "../2 - GIMBAL.wmv" -vf "$FORMAT_CROP, $FIELD_BOTTOM, $SCALE_VERT" -start_number 0 gimbal%04d_1.png

[compare width=600]
gimbal1002_0.png

gimbal1002_1.png

[/compare]
The object shows some degree of combing artifacts on both the top and bottom fields, but the track bar is solid in the bottom field, while in the top field it fades in and out.

Arranging the fields to appear in sequence, we can see that the bottom field (B) of a frame always shows events that happened before the top field (T) of the same frame. This appears to match what the Video4Linux documentation says about NTSC:
Article:
The first line of the top field is the first line of an interlaced frame, the first line of the bottom field is the second line of that frame. ... The temporal order of the fields (whether the top or bottom field is first transmitted) depends on the current video standard. M/NTSC transmits the bottom field first, all other standards the top field first.

field_order(2).gif

Presumably the object is constantly moving to the left during this time, but it remains unclear why the combing distortion affecting it is always much greater in the top fields.
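As a side note, here is a minimal Python sketch of how the fields can be arranged into that sequence. It assumes the per-field PNGs produced by the commands above and uses an example frame range; it just copies the files into a single bottom-first sequence for playback, matching the B, T, B, T order shown in the animation.
Python:
import shutil

# bottom field (_1) first, then top field (_0), for each frame in an example range
seq = 0
for n in range(980, 986):
    for field in (1, 0):
        shutil.copy(f"gimbal{n:04d}_{field}.png", f"field_seq_{seq:04d}.png")
        seq += 1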

Comparing the top fields of frames 980 and 981, we see alternating dark and bright horizontal bands that are separated by about 7-8 lines.
[compare width=600]
gimbal0980_0.png

gimbal0981_0.png

[/compare]
But comparing the bottom fields of the same two frames we don't see the same bands (maybe some barely noticeable ones). Instead we just see a significant change in the overall brightness of the image.
[compare width=600]
gimbal0980_1.png

gimbal0981_1.png

[/compare]
In fact those clear bands related to illumination changes never appear on the bottom field. This observation already has actionable consequences. Tracking algorithms should work a bit better without these illumination bands, so they can either be run on the bottom field only, or the illumination changes can be measured more accurately on the bottom field, and that information might then allow the illumination bands to be removed from the top fields as well.
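As a starting point, here is a minimal Python sketch (assuming the per-field PNGs extracted above and an example frame range) that measures the frame-to-frame illumination change on the band-free bottom fields and a per-row banding signal on a top field. These are just measurements; how best to use them to actually remove the bands from the top fields is left open.
Python:
import numpy as np
from PIL import Image

def load(name):
    return np.array(Image.open(name).convert("L"), dtype=np.float64)

# illumination estimate from the band-free bottom fields, one value per frame
illum = [load(f"gimbal{n:04d}_1.png").mean() for n in range(978, 984)]
print("bottom-field mean brightness:", np.round(illum, 2))

# banding signal on a top field: row means minus a 15-row moving average
row_means = load("gimbal0980_0.png").mean(axis=1)
band = row_means - np.convolve(row_means, np.ones(15) / 15, mode="same")
print("top-field banding signal:", np.round(band[:20], 2))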

There's an oddity related to the top field of frame 373, just when the video changes from white hot to black hot. We can see alternating bands affecting most of the image, but it's unclear why the last 22 rows, the bottom ~10% of the field, are unaffected. It looks like the recording is not synchronized with the cockpit display, so the display already starts showing part of the next frame while the current one is being recorded. Here I show the fields of frames 372-374 in sequence:



The above is where this desync is most clearly visible, but it also occurs over the rest of the video if we look more closely. For example on frame 1022, just before a change in brightness, we see that the lower part of the following field is darker than the upper part:



We see the same interlacing patterns in GoFast as well. Here's frame 38:

1713116520384.png


In FLIR1 the compression makes it more difficult to see these patterns, but they do appear on some frames:



3. Simulating the artifacts


I came across a thread in another forum where someone had a similar problem. The difference is that in their clip the alternating pattern is more jagged rather than smoothly fading in and out, but there are thick bands showing a combing pattern instead of simple alternating lines, vaguely similar to what we see in Gimbal's top/bottom fields.

1712132271986.png

In that case they found the cause and were even able to write some code to correct the artifact.
Article:
Artifacts like this are caused by taking an interlaced source and resizing down with a scaler that expects progressive input.

After some investigation it seems very likely that this is indeed what happened in Gimbal as well, just involving a different interpolation method used while resizing. The sequence of events for how the video originated was summarized here based on an ATFLIR manual. A paper describing the ATFLIR also confirms that its sensor has 640x480 pixels. So the original frames might've been 480x480 in size, recorded onto modified RS-170 video, but in Gimbal the cropped frames are 428x428 in size. The video has 30 progressive frames per second but it originated from an interlaced video with 60 fields per second. It's unclear why it got resized or by whom, but perhaps some hardware/software was used that did not properly take into account that the video is interlaced, and that downscaled the interwoven frames instead of scaling the fields separately or deinterlacing first.
The theory is that 60 full frames per second (480x480), image I, were recorded as 60 fields per second (480x240). These were converted to 60 interlaced frames per second (480x480), image I1, by simply weaving together consecutive fields. Each of those interlaced frames was downscaled to 428x428, image I2, with some interpolation that smoothly blended the alternating lines together. Finally the video was reinterlaced, weaving alternating lines from the current and next frame into image I3, padded to 640x480, and every other frame was discarded to arrive at the 30 FPS Gimbal video. More precisely this is described by the following formulas:
1713489762897.png

Here t is a frame number, with I2, I1 and I having twice as many frames as I3. To compute I2 from I1, the coordinates of the center of each pixel (+0.5 offset) are scaled up into I1's coordinate space, and I2 samples pixels (in, jn) from I1 in a neighborhood N around those scaled coordinates. The image scaling is typically implemented as a horizontal scaling followed by a vertical scaling, with lookup tables for the horizontal and vertical scaling coefficients ch and cv. The size of the neighborhood depends on the scaling method used; for ffmpeg's "bilinear" scaling mode it varies, typically 2x2 but up to 3x3 blocks of pixels with nonzero coefficients. Whether it's the even or odd rows of I1 that sample the next frame of I alternates between frames.
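Before getting to the ffmpeg-based simulation below, here is a rough numpy/PIL sketch of those formulas. It only approximates them: PIL's bilinear scaling coefficients differ slightly from ffmpeg's swscale, and the parity choices (which rows sample the next frame at each interlacing step) are exactly the open questions discussed later, so the flags here are only illustrative.
Python:
import numpy as np
from PIL import Image

def weave(even_src, odd_src):
    # interlaced frame: even rows from one image, odd rows from another
    out = even_src.copy()
    out[1::2, :] = odd_src[1::2, :]
    return out

def simulate(I, even_rows_from_current=True):
    # I: list of 480x480 uint8 frames at 60 FPS; returns the simulated 30 FPS output I3
    # I -> I1: weave fields of consecutive frames into interlaced frames
    I1 = [weave(I[t], I[t + 1]) if even_rows_from_current else weave(I[t + 1], I[t])
          for t in range(len(I) - 1)]
    # I1 -> I2: downscale each interlaced frame as if it were progressive
    I2 = [np.array(Image.fromarray(f).resize((428, 428), Image.BILINEAR)) for f in I1]
    # I2 -> I3: re-weave lines from the current and next frame, keep every other frame
    I3 = [weave(I2[t + 1], I2[t]) for t in range(0, len(I2) - 1, 2)]
    return I3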

To simulate the effect I first created the following 480x480 2FPS test video. I posted the code for all of this here, on Github, with links to run it on either Google Colab or Kaggle.

Then I applied the following filters to it, with tests to make sure that their result matches the formula above:
Bash:
# using tinterlace filters (https://ffmpeg.org/ffmpeg-filters.html#tinterlace)
# that implement going from I to I1 and going from I2 to I3
INTERLACE_BOTTOM="tinterlace=interleave_bottom, tinterlace=interlacex2"
INTERLACE_TOP="tinterlace=interleave_top, tinterlace=interlacex2"
# Using the 'bilinear' method for the scaling, but it remains to be seen which one matches best
SCALE="scale=428:428:sws_flags=bilinear"
SELECT_EVEN="select='eq(mod(n,2),0)'"
FULL_FILTERS="$INTERLACE_TOP, $SCALE, $INTERLACE_BOTTOM, $SELECT_EVEN"
INPUT_PNG="-start_number 0 -i test_%04d.png"
ENCODE_WMV="-codec wmv1 -b:v 3M" # try to match the original quality
ffmpeg -r 2 $INPUT_PNG -vf "$FULL_FILTERS" $ENCODE_WMV -r 1 test_vid_full.wmv


And here's a video of its top/bottom fields in the same temporal order as Gimbal:


This appears to reproduce most of the effects seen in the Gimbal frames above. We can see clear horizontal bands in the top fields when the background brightness changes. The bottom field doesn't show those clear bands, since I made sure to only change the background color on even frames when generating the input video. But the WMV1 codec also adds some faint horizontal lines to the bottom fields, since the full interlaced image is compressed in blocks and the information from one line bleeds over into the next. When the bar moves it appears duplicated, and one of the copies is more solid. In the top field the bar fades in and out, while it's always solid in the bottom one. I made the object only move on even input frames, leading to much greater distortion of the object in the top field, with some slight distortion in the bottom field as well due to the compression, but some further research is needed to see if we can get closer to the degree of combing seen in Gimbal's bottom fields. The idea is that the overlay, e.g. the position of the bars, is only updated 30 times per second or less, so you only see it move during the period captured by the top field. Similarly, the pod's auto level gain (ALG) algorithm might only be updating the level/gain settings at most 30 times per second. But it's unclear whether the ATFLIR's sensor is only able to capture 30 full images of the scene per second.

To quantify whether the simulation is comparable to Gimbal's actual frames I looked at the horizontal bands seen when, in both cases, the image suddenly becomes less bright. At the top of the following plot I show the mean pixel intensity over the even and odd rows of Gimbal's frame 980. We can see that the even row intensity periodically dips below the odd row intensity. At the bottom of the plot I show the difference between consecutive even/odd rows of Gimbal, and compare that to the same difference calculated for the simulated test frame. In the test video I made the background color change a lot more than in Gimbal, so I scaled the amplitude and shifted the mean of the test signal for a better match, but it's a remarkable result that the frequency and phase of the test signal match what is observed in Gimbal.
plot_bands.png

In Gimbal's frame 981 and the test frame 12 the image is about to become brighter again so the phase of the signals is reversed.
1713603767533.png
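For reference, here is a minimal Python sketch of this measurement, under my reading of "difference between consecutive even/odd rows" as the even-row mean minus the odd-row mean for each row pair. The Gimbal file name follows the extraction command in section 1; the test-frame file name is just a hypothetical placeholder.
Python:
import numpy as np
from PIL import Image

def row_signals(path):
    frame = np.array(Image.open(path).convert("L"), dtype=np.float64)
    even = frame[0::2, :].mean(axis=1)   # mean intensity of each even row
    odd = frame[1::2, :].mean(axis=1)    # mean intensity of each odd row
    n = min(len(even), len(odd))
    return even[:n], odd[:n], even[:n] - odd[:n]

_, _, diff_obs = row_signals("gimbal0980.png")        # observed Gimbal frame
_, _, diff_sim = row_signals("test_frame_0011.png")   # simulated test frame (placeholder name)

# scale/shift the simulated signal onto the observed one (least squares) before comparing phase
a, b = np.polyfit(diff_sim, diff_obs, 1)
print("scaled simulated signal:", np.round(a * diff_sim[:10] + b, 2))
print("observed signal:        ", np.round(diff_obs[:10], 2))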

The frequency depends on the ratio between the source and destination height of the image, 480/428. If either of those dimensions were off by just a few pixels we'd see the peaks of the test signal drifting away from the peaks in the observed signal over time. The phase also depends on whether it's the even or odd rows of I1 that sample pixels from the next frame. So in the filter chain for the ffmpeg command above I chose two opposing filters, "interleave_top" at first then "interleave_bottom" later. If the first one were also "interleave_bottom" then the phase of the test signal would be the exact opposite, with the peaks of the test signal matching the troughs of the observed signal.

The reason for this difference in filters remains unclear. One possibility is that the final interlacing is done on the full 640x480 frames, and the Gimbal frames start from an odd offset (27), so the even rows of the cropped Gimbal frames are the odd rows of the full video, and the first interlacing of the 480x480 frames would've happened without this padding. But that would lead to a temporal order in which the top fields are transmitted first, which is not how NTSC should work, so it's unclear. Another vague idea is that it might somehow be related to the desync in the recording of the frames that is most prominent when switching from WH to BH, affecting the bottom 10% of the fields as shown above. If the frames had already been resized (improperly) but then a recording of that happened to start on an odd frame rather than an even one, then perhaps that might also cause the phase reversal that we see.

4. Future work

The next step should be to try to turn the information above into something that can correct some of the observed artifacts and improve the accuracy of some of the methods used to analyze Gimbal. Some possibilities:
• It should be possible to correct the horizontal bands. I found a thread where someone was able to do something like that, in their case due to upscaling interlaced footage.
• It should be possible to correct the recording desync by taking information from a previous frame/field for the bottom 10% of the image.
• Just running most algorithms on the bottom field might help. While still imperfect, it at least combines information from fewer consecutive frames.
• A brute force method of figuring out how some things moved might be to generate several motion hypotheses, resize/interlace the generated frames with the method above, then compare that with the observations until the best match is found (a rough sketch of this idea follows below).
• It remains an open question whether it's possible to actually deinterlace the image and fully remove the artifacts from the object itself, whether it's possible to generate temporally consistent frames, or whether it's possible to modify the equations of the motion tracking algorithms in such a way that they produce more accurate results in spite of blending together information from consecutive frames.
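Here is a toy Python sketch of that brute-force idea. The blend_consecutive function below is only a crude stand-in for the full resize/interlace pipeline from section 3, and all names, offsets and placements are hypothetical; a real test would push each hypothesis through the actual simulation.
Python:
import numpy as np

def blend_consecutive(a, b):
    # crude stand-in for the resize/interlace pipeline: even rows from frame a, odd rows from frame b
    out = a.copy()
    out[1::2, :] = b[1::2, :]
    return out

def render(background, template, x, y=100):
    # place a small template onto a float copy of the background at (x, y)
    frame = background.astype(np.float64)
    frame[y:y + template.shape[0], x:x + template.shape[1]] = template
    return frame

def best_shift(background, template, observed, x0, candidates):
    # pick the horizontal motion hypothesis whose simulated blend best matches the observed frame
    errors = {}
    for dx in candidates:
        simulated = blend_consecutive(render(background, template, x0),
                                      render(background, template, x0 + dx))
        errors[dx] = np.mean((simulated - observed.astype(np.float64)) ** 2)
    return min(errors, key=errors.get), errors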
 
Last edited by a moderator:
Correct me if I am wrong, but it is my understanding that this and all frames in the video show what the glare looks like: what the actual object looks like is lost in the glare.
His goal is to produce a clean and consistent geometry, regardless of what it represents, so other algorithms would not be thrown off by artefacts, improving the accuracy of the analysis of the Gimbal footage.
 
His goal is to produce a clean and consistent geometry, regardless of what it represents, so other algorithms would not be thrown off by artefacts, improving the accuracy of the analysis of the Gimbal footage.
It's important to make sure the terminology is consistent, lest it be used to misrepresent.
 
His goal is to produce a clean and consistent geometry, regardless of what it represents, so other algorithms would not be thrown off by artefacts, improving the accuracy of the analysis of the Gimbal footage.

This analysis was basically all over my head. What are the summary conclusions? Is there any indication that the shape is or is not the actual shape of the object?
 
Is there any indication that the shape is or is not the actual shape of the object?
I'll let logicbear speak to whether there are indications in the analysis, but just for anybody stopping by the site who has not read the threads on this UAP vid: yes, there are very strong indications that the shape is not the shape of the object but is glare, largely based on the conclusion that the perceived rotation is an artifact of the imaging system, not the object rotating. Reading the other threads here on Gimbal is very interesting, to see how thinking on the video has evolved and what conclusions have been reached.

For those not wanting to read the admittedly now-voluminous threads on this, a summary of the observables leading to that conclusion is here:
Source: https://youtu.be/qsEjV8DdSbs


Key points are the four observables, described here --

Capture.JPG


Timecodes there will allow anybody interested in a specific point to skip directly to that one, but the whole video is worth watching. Highlighting is mine. EDIT: Forgot to stick the picture in, so did so in an edit!

Sorry to have horned in, I too am interested in what logicbear's analysis reveals. Just didn't want a casual visitor to miss what has already been shown to be the case with this video.

EDIT: referenced wrong member, many apologies!
 
Last edited:
This analysis was basically all over my head. What are the summary conclusions? Is there any indication that the shape is or is not the actual shape of the object?
There are already plenty of other indications that it's not the actual shape of the object, that it's glare which obscures the real object, but as others have correctly noted, this particular analysis sets that question aside and just tries to gain a deeper understanding of the various video artifacts that degraded the quality of the video we have. So in this context when I refer to "the object" really I'm just talking about a blob of pixels in the middle of the frame, some 2D pattern in the infrared radiation that the ATFLIR's sensor received, not the physical object which emitted that radiation. A main conclusion so far is that improperly resizing interlaced video is very likely the cause of many of the artifacts we observe. The ultimate goal is to take this knowledge and somehow improve the accuracy of the algorithms used to detect the motion of the blob of pixels in the middle as well as the motion of the pixels representing the clouds in the background. And the reason why I decided to try to further improve their accuracy is that I suspect it will lead to even more evidence that what we see is glare rotating due to the ATFLIR, and that the object behind the glare is just flying along without making any sudden movements. In particular, I have a method of calculating the rotation of the glare from the slight rotation of the clouds that happens at the same time, something that you would not expect to be possible if this was really an object rotating independently of the ATFLIR, but those calculations are quite sensitive so I figured I should try to improve their accuracy a bit first.
 
Last edited:
This analysis was basically all over my head. What are the summary conclusions? Is there any indication that the shape is or is not the actual shape of the object?
The conclusion is that some of the artefacts that are making it hard for other software to automatically track the rotation of the geometry frame by frame, for the analysis of the Gimbal footage, were caused by resizing an interlaced video using a technique that assumes the video is progressive, i.e. the video should have been deinterlaced before it was resized by a 3rd party.

The aim of this analysis was to establish what caused some of the variation in the geometry between frames in order to either fix it, or account for it, and then rerun the software used for the analysis of the Gimbal footage.
 
resized by a 3rd party.
At this time I'm leaning more towards the idea that it was resized by some hardware/software on the jet, prior to making a tape recording of it, and not by some 3rd party afterwards. E.g. when switching from WH to BH there are frames where 90% of a field has horizontal bands while the bottom 10% of it does not. So the resizing that causes the bands should occur prior to whatever causes that split, as otherwise the entire field would be affected. That split looks more like an analog recording issue, not something I'd expect to happen after it's digitized, so if that is indeed the case then the resizing must occur prior to the recording. Perhaps the jet's recording system might do this because the tapes have some limited bandwidth and they need to make a recording of multiple displays at the same time. Maybe we can find some F/A-18 or ATFLIR technician somewhere who can confirm whether the frames on the tape are already resized when they receive them.
 
Last edited:
The conclusion is that some of the artefacts that are making it hard for other software to automatically track the rotation of the geometry frame by frame, for the analysis of the Gimbal footage, were caused by resizing an interlaced video using a technique that assumes the video is progressive, i.e. the video should have been deinterlaced before it was resized by a 3rd party.

The aim of this analysis was to establish what caused some of the variation in the geometry between frames in order to either fix it, or account for it, and then rerun the software used for the analysis of the Gimbal footage.
I was working for a company that did military imaging h/w and s/w back in the mid-90s, and I'm 100% sure nothing that came out of our systems was interlaced. Interlacing is a way of cheapening things and hoping that the humans don't notice what you've done. The military didn't care much for cheap solutions; they'd happily pay for the latest and greatest. We were so cutting edge, we were even using TI's chips before their EVBs (evaluation boards) became available - that's one of the benefits of being the company that designed the EVBs. (Then again, I think much of the military stuff was SHARC-based, not TI's DSPs.) So there really ought to be a non-interlaced original to work with, had people known or cared about what they were dealing with.
 
So there really ought to be a non-interlaced original to work with, had people known or cared about what they were dealing with.
Marik von Rennenkampff mentioned they had cameras recording the instrument's screen in the cockpit in 2004, with respect to the FLIR1 footage:

(01:43:03 - 01:43:28) - "This is back when they actually had cameras recording the screen, like an actual camera, it wasn't digital, right?"


Source: https://www.youtube.com/watch?v=lbTh4bX0dYw&t=6183s

Moreover, Post #15 in the Reverse Engineering the ATFLIR to find Range and Temp to/of the Gimbal UAP thread provides a summary of the system:

  • The raw signal is analog (i.e. essentially the waveform of a voltage level).
  • This gets converted to 14-bit and sent to the DNUC
  • The DNUC does some math on this for level and gain, where it will clip values out of range, and sends a 10-bit version to the (internal) SBNUC for more processing
  • The symbology (numbers, etc) is added on top of this digital signal, and it's re-converted to 14-bit
  • This gets converted to modified RS-170. RS-170 is the US standard interlaced monochrome TV signal format in use since 1954. The "modification" is likely to restrict it to 480 lines. But essentially it's black and white NTSC. Interlaced.
  • That mono NTSC signal is what is seen in the cockpit on the DDI screens
  • The DDI's mono NTSC signal is recorded using (I think) TEAC 8mm hi-8 analog tapes, part of the CVRS (Cockpit Video Recording System)
  • These tapes are then physically transported elsewhere, and digitized from taped NTSC into 8-bit WMV files
  • The WMV file is leaked to NYT and TTSA
  • Unknown conversions and operations are done, and the files are loaded on YouTube
  • Someone downloads from YouTube, resulting in smoothed, low-information versions
  • Taylor uses this.
 
Last edited:
At this time I'm leaning more towards the idea that it was resized by some hardware/software on the jet, prior to making a tape recording of it, and not by some 3rd party afterwards.
If that's the case we'd expect the banding in the Nimitz video to be similar.

Comparing frame 373 with frame 568 of the video as released by the Navy.

The Nimitz frame was cropped to 241x235 starting at 54:17.

nimitz_crop.png


The GIMBAL frame was cropped to 428x428 starting at 104:27 and scaled down to 235x235 using bilinear interpolation to match the vertical resolution.

gimal_crop.png


Full frame mean line brightness comparison shows a possible correlation.
full_frame.png


First 80 lines scaled and shifted for easier comparison shows a close match.
close_matched.png


Notably the comparison is between visible light and thermal modes, meaning that the sensor behavior isn't directly implicated.
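For what it's worth, here is a minimal Python sketch of this comparison as I understand it, with hypothetical file names for the full uncropped frames; the crop and scale parameters follow the description above.
Python:
import numpy as np
from PIL import Image

# crop both frames (offsets as described above), then rescale Gimbal to 235x235
nimitz = Image.open("nimitz_frame_0568.png").convert("L").crop((54, 17, 54 + 241, 17 + 235))
gimbal = Image.open("gimbal_frame_0373.png").convert("L").crop((104, 27, 104 + 428, 27 + 428))
gimbal = gimbal.resize((235, 235), Image.BILINEAR)

n_rows = np.array(nimitz, dtype=np.float64).mean(axis=1)   # mean brightness of each line
g_rows = np.array(gimbal, dtype=np.float64).mean(axis=1)

# scale and shift the Gimbal signal onto the Nimitz one over the first 80 lines (least squares)
a, b = np.polyfit(g_rows[:80], n_rows[:80], 1)
print("correlation over first 80 lines:", np.corrcoef(n_rows[:80], g_rows[:80])[0, 1])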
 
If that's the case we'd expect the banding in the Nimitz video to be similar.
The bands might also be similar in Nimitz if it got resized after being recorded to the tapes, so not by some hardware/software on the jet but during or after the digitization of the tapes, if both in 2004 and 2015 there was some other common reason for resizing it to that same specific size.

Based on the F/A-18 E/F NATOPS, every version of the Cockpit Video Recording System (CVRS) made two recordings at the same time. Almost all versions used two 8mm tape recorders to do so, although a later version used a solid state recording solution. For two-seater aircraft there were two CVRS control panels with switches. The front control panel always had two switches to control what should be recorded on each tape. The aft control panel's one or two switches (depending on the version) would always override the front control panel to record one or two of the displays from the back seat instead, so in total there would still only be two tape recordings at the same time.
Article:
The two CVRS video tape recorders (VTRs) are located behind the ejection seat in the F/A−18E and behind the rear cockpit ejection seat in the F/A−18F. Each VTR provides a minimum of 2 hours recording time on removable 8 mm video tape cartridges.
...
The aft cockpit switches override the front cockpit switches for any selection on LOT 26 AND UP aircraft.
...
1714603219451.png

Forward CVRS Control Panel (LOT 26 AND UP):
HMD Selects HMD video camera.
LDDI Selects LDDI direct video.
RDDI Selects RDDI direct video.
HUD Selects HUD video camera.
RDDI Selects RDDI direct video.
MPCD Selects MPCD direct video.
...
1714602587326.png

Aft CVRS Control Panel (LOTs 26-29 After AFC 445 AND LOT 30 AND UP):
CNTR Selects center display direct video. Electrically held in this position.
FWD Selects front cockpit video. Default start−up position.
LDDI Selects LDDI direct video. Electrically held in this position.
HMD Selects HMD video camera.
FWD Selects front cockpit video. Default start−up position.
RDDI Selects RDDI direct video. Electrically held in this position


There's a video from last year of an EA-18G Growler's cockpit, showing that it still has the same CVRS control panels that the NATOPS describes.
1714593241702.png

1714593380343.png


Here's a video of a pilot inserting the two tapes into an F/A-18 C.

Video8 was introduced in 1984, Hi8 in 1989, so you'd think they would use the newer analog standard available to them. Digital8 used a digital encoding to record more data onto the same physical tapes as Hi8, but that format was not as popular.
1714594330943.png

If it was Hi8 then according to Wikipedia its resolution should be ~560×486 (NTSC) so it could comfortably fit a single 480x480 display. Video8 was only ~320×486 (NTSC) so if perhaps some earlier CVRS models used Video8 then the video would have to be downscaled, by more than the amount we ultimately got, but only horizontally. So could using Video8 instead of Hi8 explain the lower horizontal resolution that Mick noticed below ?
For example, the text and the horizon indicator are not really blurred; they are just aliased due to the low resolution. We can draw a single pixel line in photoshop, pixel perfect, and it looks the same as the horizon line:

View attachment 45459

Notice, though, that the vertical lines have spread much more; this does not seem to be a result of the reduced resolution but rather a limitation of the system, possibly just reduced horizontal resolution. Here's the same thing in GoFast
View attachment 45461
Notice horizontal lines are sharp, and vertical lines are blurry as if they are half the resolution. This is also visible in the text, again in Gimbal:
View attachment 45463
Note the brighter, mostly 2-pixel thick horizontal lines and the darker 4-pixel thick vertical lines. We see this repeated (at a much lower resolution) in Nimitz/Flir1
View attachment 45464

So what we see is that the reduction in resolution in the video is not adding significant blurring to the image beyond what is expected from aliasing. Vertical spread is one pixel. The horizontal spread is two pixels.

If two displays were recorded separately then naturally one might want to synchronize the videos and show the two displays side by side, for everyone to see during the debrief. This video from 2012 shows they used video projectors aboard aircraft carriers, possibly for this purpose.
1714593989460.png


This blog post from 2005 said that FWVGA, 854x480, was a "popular" resolution format for projectors at the time, and even argued against using higher resolution ones.
Article:
There are three popular 16:9 resolution formats at the moment. The first is 854x480, the second is 1024x576, and the third is 1280x720. ... stepping up in price to the next highest resolution, 1024x576, does not give you a sharper picture from DVD ... There is no price point at which it would make sense to step up from 480p, but forego the incremental expense to get to 720p

So if they wanted to resize the videos to fit two of them side by side onto the native resolution of common/popular projectors at the time, then 428x428 is almost exactly what they'd end up with. It should be 427 horizontally (854/2 = 427), to be more precise, so it's a little unclear why we'd see that extra column, but still it's so close that it's certainly worth considering. In this case they would've had to remove the other display afterwards, perhaps because that contained the SA page which would've been classified, then pad it to 640x480 for some reason to arrive at the video that we have. We also see some nonzero pixels outside of the 428x428 cropped area, but perhaps that could just be related to compression artifacts ? Here I flood filled all of the (0,0,0) pixels to show this:
1714598210329.png
But in this case it's still unclear how the desync related to the bottom 10% of the image might've occurred since it had to happen after the resize. One idea is that the two analog video tapes might've been digitized and merged directly onto one digital FWVGA video, so then the hardware that did that could've had some desync issues near the end of that processing pipeline. We also see this desync in GoFast below, but it only seems to affect the bottom ~5% of the image during its WH/BH transitions, so perhaps it slowly goes more and more out of sync, or this was from a different tape recording even if it happened during the same flight.


 
Last edited:
The bands might also be similar in Nimitz if it got resized after being recorded to the tapes, so not by some hardware/software on the jet but during or after the digitization of the tapes, if both in 2004 and 2015 there was some other common reason for resizing it to that same specific size.
True. However what I find interesting is that the match is between 1x zoom in 2004 and 2x zoom in 2015. I'm not sure I understand the proposed pipeline well enough to say whether that is expected behavior if the effect is induced after the fact.
If it was Hi8 then according to Wikipedia its resolution should be ~560×486 (NTSC) so it could comfortably fit a single 480x480 display. Video8 was only ~320×486 (NTSC) so if perhaps some earlier CVRS models used Video8 then the video would have to be downscaled, by more than the amount we ultimately got, but only horizontally. So could using Video8 instead of Hi8 explain the lower horizontal resolution that Mick noticed below ?
It's very plausible. Even at higher bandwidth we'd expect horizontal detail to be softer as analog video simply doesn't deal with horizontal detail in discrete units.
We also see some nonzero pixels outside of the 428x428 cropped area, but perhaps that could just be related to compression artifacts ?
I haven't found enough detail on WMV1 to say for sure that it's impossible. However, I extracted the outer boundaries of the nonzero pixels using the following:
Python:
import numpy as np

def get_limits(img):
    # return (first row, last row, first column, last column) containing any nonzero pixel
    assert 2 < img.ndim < 4  # grayscale (H, W) or color (H, W, C)
    axes = (1, 0) if img.ndim == 2 else ((1, 2), (0, 2))
    indices = [np.any(img, axis=axis).nonzero()[0] for axis in axes]
    return indices[0][0], indices[0][-1], indices[1][0], indices[1][-1]

# `frames` holds the decoded full 640x480 video frames as numpy arrays
limits = np.array([get_limits(img) for img in frames])
y0, y1, x0, x1 = [limits[:, i] for i in range(4)]
extremes = [y0.min(), y1.max(), x0.min(), x1.max()]
The result was [16, 463, 96, 543]. With a presumed actual content area of 428x428 placed at 104:27 and 8x8 encoding blocks we shouldn't see static encoding artifacts outside of [24, 456, 104, 534]. Perhaps motion vectors could push some macroblocks a little outside of that. But how do we get all the way up to 16 pixels from the top?

outer_top.png
 