NUFORC sightings categorized by decade and shape

MonkeeSage

I found that someone had scraped the NUFORC database (https://nuforc.org/databank/) a couple of years ago and thought it would be interesting to categorize the reports by decade and shape.

The NUFORC database is completely based on self-reporting, and sighting records are often added years after the reported event occurred, so take this with a big grain of salt. It may reflect inaccurate reporting and may not represent the true trends due to missing or incorrect data. Some reports don't give the actual date the event occurred and just use the epoch date of '1970-01-01 00:00 Local' so the actual sighting date is unknown, e.g., https://nuforc.org/sighting/?id=103828. But even if the data is imperfect I thought it would still be interesting to look.

The data set I used is here (I used nuforc.json):
External Quote:
147,890 UFO sightings from NUFORC, scraped on January 16, 2024.
Source: https://huggingface.co/datasets/kcimc/NUFORC

I wrote a Python script to dump the JSON into an SQLite database and a second script to query the database for this information, excluding ambiguous shape categories like "changing" and "fireball" and filtering out any reports that use the epoch date as the date the event occurred. The number of reports with unambiguous shapes and valid occurrence dates comes out to 72,566. The last decade, 2021-2031, is incomplete because the data was only scraped up to the start of 2024 and due to the galactic federation ban on time machine usage.
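For anyone who wants to follow the bucketing without reading the full scripts further down the thread, here is a minimal sketch of the two rules described above: drop the ambiguous shapes, and assign each report to a decade running from year xxx1 to year xxx1 + 10. The names `AMBIGUOUS_SHAPES`, `decade_bounds` and `is_counted` are made up for illustration, and treating any "Occurred" value that does not start with a four-digit year as unusable is my assumption, not necessarily how the actual scripts filter.

```python
# Shapes the post treats as ambiguous, upper-cased for comparison.
AMBIGUOUS_SHAPES = {
    "CHANGING", "FIREBALL", "FLASH", "FORMATION", "LIGHT", "OTHER", "UNKNOWN",
}


def decade_bounds(occurred):
    """Return (start, end) ISO dates for the decade containing `occurred`.

    Decades follow the post's convention, e.g. 2011-01-01 to 2021-01-01.
    Returns None when the value does not begin with a four-digit year
    (assumption: such values carry no usable date).
    """
    year_text = (occurred or "")[:4]
    if len(year_text) != 4 or not (year_text.isascii() and year_text.isdigit()):
        return None
    year = int(year_text)
    start = (year - 1) // 10 * 10 + 1  # 2014 -> 2011, 2010 -> 2001
    return (f"{start:04d}-01-01", f"{start + 10:04d}-01-01")


def is_counted(shape, occurred):
    """True if a report has an unambiguous shape and a usable date."""
    if shape is None or shape.upper() in AMBIGUOUS_SHAPES:
        return False
    return decade_bounds(occurred) is not None


if __name__ == "__main__":
    print(decade_bounds("2014-09-21 13:00:00 Local"))
    print(is_counted("Fireball", "2014-09-21 13:00:00 Local"))
```

The example record shown in the scripts (Huntsville, TX, 2014) would land in the 2011-2021 bucket under this rule.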

NOTE: All report counts below are excluding these shapes:

Changing
Fireball
Flash
Formation
Light
Other
Unknown

Total number of reports by shape:

CIRCLE : 14375
TRIANGLE : 13104
DISK : 8716
SPHERE : 7659
OVAL : 6371
ORB : 5924
CIGAR : 3756
RECTANGLE : 2614
CYLINDER : 2487
DIAMOND : 2118
CHEVRON : 1742
EGG : 1290
TEARDROP : 1238
CONE : 605
CROSS : 501
STAR : 184
CUBE : 37
------------
TOTAL : 72566

Number of reports by shape per decade:

Decade: 1901-01-01 to 1911-01-01 (5 reports)

DISK : 2
CIGAR : 1
SPHERE : 1
TRIANGLE : 1

Decade: 1911-01-01 to 1921-01-01 (3 reports)

DISK : 2
CIGAR : 1

Decade: 1921-01-01 to 1931-01-01 (6 reports)

DISK : 3
ORB : 2
CIRCLE : 1

Decade: 1931-01-01 to 1941-01-01 (13 reports)

DISK : 3
CIGAR : 2
CIRCLE : 2
OVAL : 2
CHEVRON : 1
CYLINDER : 1
RECTANGLE : 1
SPHERE : 1

Decade: 1941-01-01 to 1951-01-01 (129 reports)

DISK : 60
CIRCLE : 21
CIGAR : 12
OVAL : 12
SPHERE : 9
CYLINDER : 6
RECTANGLE : 3
TRIANGLE : 3
CHEVRON : 1
CONE : 1
ORB : 1

Decade: 1951-01-01 to 1961-01-01 (427 reports)

DISK : 173
CIRCLE : 65
CIGAR : 55
OVAL : 39
SPHERE : 31
CYLINDER : 15
TRIANGLE : 14
DIAMOND : 10
ORB : 7
RECTANGLE : 6
EGG : 5
TEARDROP : 4
CHEVRON : 2
CONE : 1

Decade: 1961-01-01 to 1971-01-01 (1261 reports)

DISK : 462
CIRCLE : 183
SPHERE : 131
OVAL : 130
CIGAR : 118
TRIANGLE : 75
CYLINDER : 36
ORB : 32
EGG : 28
RECTANGLE : 20
CHEVRON : 17
DIAMOND : 12
TEARDROP : 7
CONE : 6
CROSS : 3
STAR : 1

Decade: 1971-01-01 to 1981-01-01 (2268 reports)

DISK : 738
CIRCLE : 331
TRIANGLE : 281
OVAL : 221
SPHERE : 194
CIGAR : 182
RECTANGLE : 85
CYLINDER : 56
ORB : 51
CHEVRON : 36
EGG : 31
DIAMOND : 29
TEARDROP : 14
CONE : 13
CROSS : 5
STAR : 1

Decade: 1981-01-01 to 1991-01-01 (1771 reports)

DISK : 368
TRIANGLE : 361
CIRCLE : 222
SPHERE : 197
OVAL : 144
CIGAR : 101
RECTANGLE : 87
ORB : 72
CYLINDER : 55
CHEVRON : 54
DIAMOND : 48
EGG : 30
TEARDROP : 14
CONE : 13
CROSS : 4
STAR : 1

Decade: 1991-01-01 to 2001-01-01 (6596 reports)

TRIANGLE : 1624
DISK : 990
CIRCLE : 976
SPHERE : 721
OVAL : 529
CIGAR : 338
ORB : 226
CYLINDER : 215
RECTANGLE : 205
DIAMOND : 203
CHEVRON : 188
EGG : 172
TEARDROP : 116
CONE : 56
CROSS : 34
STAR : 3

Decade: 2001-01-01 to 2011-01-01 (22098 reports)

TRIANGLE : 4509
CIRCLE : 3850
DISK : 2855
SPHERE : 2378
OVAL : 2149
CIGAR : 1192
ORB : 1122
RECTANGLE : 747
CYLINDER : 721
DIAMOND : 699
CHEVRON : 560
TEARDROP : 489
EGG : 472
CONE : 210
CROSS : 145

Decade: 2011-01-01 to 2021-01-01 (30965 reports)

CIRCLE : 7153
TRIANGLE : 5370
ORB : 3449
SPHERE : 3396
OVAL : 2631
DISK : 2455
CIGAR : 1266
RECTANGLE : 1156
CYLINDER : 980
DIAMOND : 955
CHEVRON : 701
TEARDROP : 520
EGG : 435
CONE : 246
CROSS : 243
STAR : 7
CUBE : 2

Decade: 2021-01-01 to 2031-01-01 (7018 reports)

CIRCLE : 1536
ORB : 958
TRIANGLE : 840
SPHERE : 591
DISK : 572
OVAL : 503
CIGAR : 472
CYLINDER : 394
RECTANGLE : 301
CHEVRON : 181
STAR : 171
DIAMOND : 159
EGG : 111
TEARDROP : 72
CROSS : 64
CONE : 58
CUBE : 35
 
Interesting. I've been working on a report rating scheme to grade UFO reports as to quality of location info, date/time info, etc.
 
That is fascinating, but the bit that caught my eye was 72,566 cases, EXCLUDING those with a shape that wasn't really determined, and STILL no compelling evidence much less proof! I'd expect some darned good film and video by now if nothing else, given this many cases were close enough to see what the shape of the thing was -- something where you could clearly see what you were looking at, how it was generally put together and the like.

PS: I'd guess there would be a lot of overlap among "circle," "disk," "sphere," "oval," "orb" and possibly "egg" shaped UFOs depending on viewing angle and word choice of the witness, assuming actual objects with actual shapes were generating such reports. Possibly "triangle" and "chevron" as well, especially in cases based on lights plus assumed but not seen structure connecting them...
 
First thing I noticed was location only specified down to the level of the nearest city. That makes any sort of disambiguation or deconfliction difficult to say the least.
 
I put this info into a spreadsheet for easier visualization and I noticed that the numbers don't quite add up.

NUFORC_spreadsheet1.png


You list a total of 72,566 reports in your original post, but if you sum the decade totals you only add up to 72,560 reports and if you sum the total of shapes reported you get 72,711. In addition, the reports by shape in each decade do not sum to the total number of reports of each shape over time.
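The same check can be repeated without a spreadsheet. This is only a sketch: the numbers are typed in by hand from the first post, so any transcription slip on my side (or in the spreadsheet) will shift the sums, and the variable names are just for illustration.

```python
# Per-decade report counts as listed in the first post, 1901-1911 through 2021-2031.
decade_totals = [5, 3, 6, 13, 129, 427, 1261, 2268, 1771, 6596, 22098, 30965, 7018]

# "Total number of reports by shape" as listed in the first post.
shape_totals = {
    "CIRCLE": 14375, "TRIANGLE": 13104, "DISK": 8716, "SPHERE": 7659,
    "OVAL": 6371, "ORB": 5924, "CIGAR": 3756, "RECTANGLE": 2614,
    "CYLINDER": 2487, "DIAMOND": 2118, "CHEVRON": 1742, "EGG": 1290,
    "TEARDROP": 1238, "CONE": 605, "CROSS": 501, "STAR": 184, "CUBE": 37,
}

posted_total = 72566  # the TOTAL line in the first post

decade_sum = sum(decade_totals)
shape_sum = sum(shape_totals.values())

print(f"posted TOTAL      : {posted_total}")
print(f"sum of decades    : {decade_sum} ({decade_sum - posted_total:+d})")
print(f"sum of shape rows : {shape_sum} ({shape_sum - posted_total:+d})")
```

With these inputs the decade totals come to 72,560, six short of the posted TOTAL, and the shape rows sum to more than the posted TOTAL.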
 

Likely due to the way I filtered the data. Here are the scripts used to generate the sqlite database and then query the database to produce that report.

To run them you just need the JSON file from here: https://huggingface.co/datasets/kcimc/NUFORC/resolve/main/nuforc.json?download=true

nuforc_db.py
Python:
#!/usr/bin/env python3

import json
import sqlite3

with open("nuforc.json", "r") as fh:
    nuforc_json = sorted(json.load(fh), key=lambda d: d["Sighting"])

# Example record
#  {
#    "Sighting": 114864,
#    "Occurred": "2014-09-21 13:00:00 Local",
#    "Location": "Huntsville, TX, USA",
#    "Shape": "Rectangle",
#    "Duration": "several seconds",
#    "No of observers": 1,
#    "Reported": "2014-10-23 11:11:17 Pacific",
#    "Posted": "2014-11-06 00:00:00",
#    "Characteristics": [
#      "Lights on object",
#      "Aura or haze around object"
#    ],
#    "Summary": "Rectangle shaped UFO ...",
#    "Text": "I observed a rectangle..."
#  },


con = sqlite3.connect("nuforc.db")
cur = con.cursor()
# IF EXISTS lets the script run on a fresh database where the table is absent.
cur.execute(
    """
DROP TABLE IF EXISTS sightings;
"""
)
cur.execute(
    """
CREATE TABLE sightings (id INTEGER PRIMARY KEY,
                        occurred TEXT,
                        location TEXT,
                        shape TEXT,
                        duration TEXT,
                        observers INTEGER,
                        reported TEXT,
                        posted TEXT,
                        characteristics TEXT,
                        summary TEXT,
                        report TEXT);
"""
)
cur.execute(
    """
CREATE INDEX idx_occured_shape ON sightings (occurred, shape);
"""
)

for sighting in nuforc_json:
    # print(sighting)
    cur.execute(
        """
    INSERT INTO sightings VALUES (?, ?, ?, ?, ?, ?, ?, ?, ? ,?, ?)
    """,
        (
            sighting.get("Sighting"),
            sighting.get("Occurred"),
            sighting.get("Location"),
            sighting.get("Shape"),
            sighting.get("Duration"),
            sighting.get("No of observers"),
            sighting.get("Reported"),
            sighting.get("Posted"),
            str(sighting.get("Characteristics", [])),
            sighting.get("Summary"),
            sighting.get("Text"),
        ),
    )
# Commit once after all rows are inserted; committing per row is much slower.
con.commit()
con.close()

nuforc_shapes.py
Python:
#!/usr/bin/env python3

import sqlite3

con = sqlite3.connect("nuforc.db")
cur = con.cursor()

shapes = []

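# SQL string literals take single quotes; double quotes denote identifiers and
# only work as strings through a legacy SQLite fallback.
res = cur.execute(
    """SELECT DISTINCT shape FROM sightings WHERE shape IS NOT NULL
       AND UPPER(shape) NOT IN (
         'CHANGING',
         'FIREBALL',
         'FLASH',
         'FORMATION',
         'LIGHT',
         'OTHER',
         'UNKNOWN'
       );"""
)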
for shape in res.fetchall():
    shapes.append(shape[0].upper())

shapes = sorted(list(set(shapes)))
shape_counts = {}


print(
    "NOTE: All report counts below are excluding these shapes:\n"
    """
    Changing
    Fireball
    Flash
    Formation
    Light
    Other
    Unknown\n\n"""
    "Total number of reports by shape:\n"
)
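# Note: this per-shape total has no "occurred != 'Local'" filter, unlike the
# TOTAL and per-decade queries below, so the shape totals can sum to more
# than TOTAL.
for shape in shapes:
    res.execute(
        "SELECT COUNT(*) FROM sightings WHERE UPPER(shape) = ?", (shape,)
    )
    shape_counts[shape] = res.fetchone()[0]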

for k, v in sorted(
    shape_counts.items(), key=lambda item: item[1], reverse=True
):
    print(f"{k:<12}: {v}")

res = cur.execute(
    """SELECT COUNT(*) FROM sightings WHERE shape IS NOT NULL
       AND occurred != 'Local'
       AND UPPER(shape) NOT IN (
         'CHANGING',
         'FIREBALL',
         'FLASH',
         'FORMATION',
         'LIGHT',
         'OTHER',
         'UNKNOWN'
       );"""
)
print("------------")
# Single quotes inside the f-string keep this valid on Python versions before 3.12.
print(f"{'TOTAL':<12}: {res.fetchone()[0]}")


decades = (
    ("1901-01-01", "1911-01-01"),
    ("1911-01-01", "1921-01-01"),
    ("1921-01-01", "1931-01-01"),
    ("1931-01-01", "1941-01-01"),
    ("1941-01-01", "1951-01-01"),
    ("1951-01-01", "1961-01-01"),
    ("1961-01-01", "1971-01-01"),
    ("1971-01-01", "1981-01-01"),
    ("1981-01-01", "1991-01-01"),
    ("1991-01-01", "2001-01-01"),
    ("2001-01-01", "2011-01-01"),
    ("2011-01-01", "2021-01-01"),
    ("2021-01-01", "2031-01-01"),
)

print("\nNumber of reports by shape per decade:")
for decade in decades:
    start, end = decade
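    # The bounds are compared as text against 'YYYY-MM-DD ...' strings. Rows
    # dated outside every bucket are counted in TOTAL above but in no decade.
    res.execute(
        """SELECT COUNT(*) from sightings
           WHERE occurred != 'Local'
           AND occurred > ?
           AND occurred < ?
           AND UPPER(shape) NOT IN (
             'CHANGING',
             'FIREBALL',
             'FLASH',
             'FORMATION',
             'LIGHT',
             'OTHER',
             'UNKNOWN'
           );""",
        (start, end),
    )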
    decade_total = res.fetchone()[0]
    print(f"\nDecade: {start} to {end} ({decade_total} reports)\n")

    shape_counts = {}
    for shape in shapes:
        res.execute(
            """SELECT COUNT(*) from sightings WHERE UPPER(shape) = ?
               AND occurred != 'Local'
               AND occurred > ?
               AND occurred < ?""",
            (shape, start, end),
        )
        count = res.fetchone()[0]
        if count > 0:
            # print(f"{shape:>12}: {count}")
            shape_counts[shape] = count

    for k, v in sorted(
        shape_counts.items(), key=lambda item: item[1], reverse=True
    ):
        print(f"{k:<12}: {v}")
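One possible way to make the shape totals, decade totals and grand total agree is to derive all three from a single filtered GROUP BY, so every number passes through the same WHERE clause. The following is only a sketch under my own assumptions, not the script that produced the report above: it runs against an in-memory database with made-up sample rows, the helper names `build_sample_db` and `counts_by_decade_and_shape` are hypothetical, and treating a date as usable only when it starts with a four-digit year is my guess at a reasonable filter.

```python
import sqlite3

EXCLUDED = ("CHANGING", "FIREBALL", "FLASH", "FORMATION", "LIGHT", "OTHER", "UNKNOWN")

# Hypothetical sample rows (id, occurred, shape); not taken from the dataset.
SAMPLE_ROWS = [
    (1, "2014-09-21 13:00:00 Local", "Rectangle"),
    (2, "2015-01-02 21:30:00 Local", "rectangle"),  # case differs, same shape
    (3, "1999-07-04 22:00:00 Local", "Triangle"),
    (4, "Local", "Triangle"),                       # no usable date
    (5, "2003-03-03 03:00:00 Local", "Light"),      # excluded shape
    (6, None, "Disk"),                              # missing date
    (7, "2020-12-31 23:59:00 Local", None),         # missing shape
]


def build_sample_db():
    """Create an in-memory table with the three columns the query needs."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE sightings (id INTEGER PRIMARY KEY, occurred TEXT, shape TEXT)"
    )
    con.executemany("INSERT INTO sightings VALUES (?, ?, ?)", SAMPLE_ROWS)
    return con


def counts_by_decade_and_shape(con):
    """Return (decade_start_year, shape, count) rows under one shared filter."""
    placeholders = ", ".join("?" for _ in EXCLUDED)
    query = f"""
        SELECT (CAST(substr(occurred, 1, 4) AS INTEGER) - 1) / 10 * 10 + 1
                   AS decade_start,
               UPPER(shape) AS shape_name,
               COUNT(*) AS n
        FROM sightings
        WHERE shape IS NOT NULL
          AND UPPER(shape) NOT IN ({placeholders})
          AND occurred GLOB '[0-9][0-9][0-9][0-9]-*'
        GROUP BY decade_start, shape_name
        ORDER BY decade_start, n DESC, shape_name
    """
    return con.execute(query, EXCLUDED).fetchall()


if __name__ == "__main__":
    rows = counts_by_decade_and_shape(build_sample_db())
    for decade_start, shape, n in rows:
        print(f"{decade_start}-{decade_start + 10}  {shape:<12}: {n}")
    # Every total is a sum over the same rows, so they cannot disagree.
    print(f"{'TOTAL':<12}: {sum(n for _, _, n in rows)}")
```

Summing the result rows by shape gives the per-shape totals, summing by decade gives the per-decade totals, and summing everything gives the grand total, so the three always reconcile. The trade-off is that reports with unusable dates vanish from the shape totals as well.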
 