Perspective is difficult, both to understand and to explain. This was recently brought home to me while listening to flat earthers Nathan Oakley and Anthony Riley attempt to explain this photo of mountains in Washington and Oregon:
All decent analyses show that it aligns perfectly well with a globe earth model and is completely incompatible with being on a flat plane, yet the gentlemen in question demonstrated that they are unaware of how viewing angles are calculated, and how perspective works.
The question, then, is how do we explain how perspective works mathematically; show that calculators do include perspective; and help people who don't understand what perspective is, who struggle with maths, and who are mistakenly convinced that they already have the answers?
Theory
The theory and methodology are fairly straightforward: viewing angles (angles of elevation) between two points can be calculated using trigonometry. These angles show where something will appear in a photograph, or in our actual field of vision. Larger angles will appear higher and smaller angles will appear lower. To calculate the angle, all we require is distance and elevation. Here is a picture demonstrating this (angles, distances, and elevations are not to scale):
This shows how the line of sight from the observer in the bottom left corner to each of the peaks forms the hypotenuse of a right-angled triangle, whose angle x can then be calculated from tan(x) = opposite/adjacent (that is, elevation over distance). The angle to Mt Rainier is the largest not because it's the tallest mountain, but because of a combination of distance and elevation: if we bring Mt Adams closer to the observer, for example, to the position where Mt Hood is, the angle to its peak will be larger than the angle to Rainier, and it will be predicted to appear highest in a photograph (assuming a flat plane):
Verifying the theory using distant mountains, however, is difficult to do.
So we need a way to do this that is available to anyone.
Example
To demonstrate that calculators include perspective, all we need is a list of distances and elevations of some known landmarks and a photograph of these landmarks. Any photograph of a flat-ish street will do, but it would be best if it included buildings of different heights, with some taller buildings in the background. This one of 5th Avenue in New York, taken at the intersection of E/W 17th Street, should be a good candidate (building numbers added):
Here are some of the landmarks seen in this picture:
Empire State Building - distance, 4270 feet; height, 1454 feet to tip, 1250 feet to roof
HSBC, 145 5th Avenue - distance, 867 feet to nearest corner, 959 feet to turret; height, 165 feet (roof), 200 feet (turret)
119 5th Avenue - distance, 250 feet to near corner, 455 feet to far corner; estimated height, 130 feet
Flatiron Building, 175 5th Avenue - distance, 1240 and 1462 feet; height, 285 feet
245 5th Avenue - distance, 2697 feet; height, 308 feet
Langham Place, 400 5th Avenue - distance, 4940 feet; height, 632 feet
425 5th Avenue - distance, 5460 feet; height, 618 feet
Camera height I believe to be very close to 6 feet - certainly within a foot or so - based on the parallel lines in the images, the vehicles, and the people's heads.
Now let's put those figures into a calculator and find the predicted viewing angles by using tan(x) = elevation/distance. The largest angle indicates which building will appear highest in the photo, and so on:
Looking at the photo, we see that the predicted apparent height order and the actual apparent height order are the same:
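The flat-plane angle calculation described above can be sketched in Python. This is only an illustration, not the attached spreadsheet: the function and variable names are made up, the elevation is taken relative to the 6 foot camera height, and the pairing of each height with a particular distance (for example, the HSBC roof with the near corner, or "near" and "far" for the two Flatiron distances) is an assumption.

```python
import math

CAMERA_HEIGHT_FT = 6.0  # the post's estimate of the camera height

# (label, distance in feet, height in feet), from the landmark list.
# The corner/height pairings and the near/far labels are assumptions.
LANDMARKS = [
    ("Empire State Building (tip)", 4270, 1454),
    ("Empire State Building (roof)", 4270, 1250),
    ("145 5th Avenue (roof, near corner)", 867, 165),
    ("145 5th Avenue (turret)", 959, 200),
    ("119 5th Avenue (near corner)", 250, 130),
    ("119 5th Avenue (far corner)", 455, 130),
    ("Flatiron Building (near)", 1240, 285),
    ("Flatiron Building (far)", 1462, 285),
    ("245 5th Avenue", 2697, 308),
    ("Langham Place", 4940, 632),
    ("425 5th Avenue", 5460, 618),
]


def viewing_angle_deg(distance_ft, height_ft, camera_ft=CAMERA_HEIGHT_FT):
    """Flat-plane angle of elevation: atan((height - camera) / distance)."""
    return math.degrees(math.atan((height_ft - camera_ft) / distance_ft))


angles = {name: viewing_angle_deg(d, h) for name, d, h in LANDMARKS}

# Largest angle first: this is the predicted apparent height order.
ranked = sorted(angles, key=angles.get, reverse=True)

for name in ranked:
    print(f"{angles[name]:6.2f} deg  {name}")
```

Sorting by angle gives the predicted top-to-bottom order, which can then be compared by eye with the photo.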

The above post represents the easy version, but we can take it to the next level and look at the angles in more detail. If we place a line at the camera height to represent zero degrees and one at the tip of the Empire State Building to represent 18.7°, we can create a scale, such as might be seen in a theodolite:
This is based on the pixel height of eye level (417) minus the pixel height of the tip of the Empire State Building (8), with the difference divided by the angle to the tip (about 18.73° before rounding). This gives a count of 21.83 pixels per degree, which allows us to calculate the approximate apparent height at which each point is predicted to appear:
I would add each of the predicted points to the image, to compare with the actual points, but it seems a bit redundant, given how close they are: most of them are pretty much bang on, with only three of the ten points varying by more than four pixels, and none by anything approaching significance.
I think this well and truly proves the point that perspective can be calculated; that apparent height order can be determined; that both curve and plane calculators do already account for perspective; and that very accurate positions in photos can be predicted merely from the distances to and elevations of landmarks.
The spreadsheet containing the calculations used above is attached.
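The pixel scale described in this post can be sketched as below. The names are hypothetical and this is not the attached spreadsheet. It assumes a simple linear pixels-per-degree scale, calibrated from the eye-level row (417) and the row of the Empire State Building tip (8), with the unrounded flat-plane angle to the tip.

```python
import math

EYE_LEVEL_PX = 417  # pixel row of the zero degree line (camera height)
MARKER_PX = 8       # pixel row of the Empire State Building tip
CAMERA_FT = 6.0     # estimated camera height in feet


def angle_deg(distance_ft, height_ft):
    """Flat-plane angle of elevation relative to the camera height."""
    return math.degrees(math.atan((height_ft - CAMERA_FT) / distance_ft))


# Calibrate the scale on the marker: tip at 4270 ft distance, 1454 ft high.
marker_angle = angle_deg(4270, 1454)
PX_PER_DEG = (EYE_LEVEL_PX - MARKER_PX) / marker_angle


def predicted_row(distance_ft, height_ft):
    """Predicted pixel row, counted from the top of the image."""
    return EYE_LEVEL_PX - angle_deg(distance_ft, height_ft) * PX_PER_DEG


print(round(PX_PER_DEG, 2))             # pixels per degree
print(round(predicted_row(4940, 632)))  # Langham Place
```

Each predicted row can then be compared with the row where that point actually appears in the photo.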

For completeness, here is the above method applied to the South Sister photo, flat first:
While the above photo shows the complete incompatibility of the flat earth 'model' with reality, the results for the spherical earth are almost perfectly aligned:
Calculator attached.

This is what I consider a "simple explanation" of "perspective calculations".
To calculate perspective "size" from an observer (point A) to a mountain peak (point B), you divide the height by the distance.
Height ÷ Distance
Perspective means objects look smaller the further from you they are. The greater the distance from the observer, the smaller the object will appear.
Ex. Both mountains are 1000 feet high.
Mountain 1 peak is 1000 feet high. Mountain is 5000 feet from the observer.
1000 ÷ 5000 = 0.2
Mountain 2 peak is 1000 feet high. Mountain is 2000 feet from the observer.
1000 ÷ 2000 = 0.5
0.5 is bigger than 0.2
Mountain 2 looks bigger than Mountain 1 because it is closer to the observer.
If you want to determine the elevation angle in degrees from the observer to the mountain peak, you use a scientific calculator. https://www.desmos.com/scientific

Or Google - which I find much easier nowadays. https://www.google.com/search?q=atan(1000/5000)+in+degrees
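The same two steps (divide height by distance, then take the arctangent to get degrees) can also be done in a few lines of Python. This is a minimal sketch with made-up function names, using the two 1000 foot mountains from the example above.

```python
import math


def size_ratio(height_ft, distance_ft):
    """The 'divide by distance' step: height over distance."""
    return height_ft / distance_ft


def elevation_angle_deg(height_ft, distance_ft):
    """Same as typing atan(height/distance) in degrees into a calculator."""
    return math.degrees(math.atan(height_ft / distance_ft))


for name, distance in [("Mountain 1", 5000), ("Mountain 2", 2000)]:
    ratio = size_ratio(1000, distance)
    angle = elevation_angle_deg(1000, distance)
    print(name, ratio, round(angle, 2))
# Mountain 1 0.2 11.31
# Mountain 2 0.5 26.57
```

The bigger ratio always goes with the bigger angle, so either number gives the same order.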

That's quite useful, and does work for giving the predicted apparent height order on a flat plane. For example, doing "elevation above/below observer" over "distance" gives: All in the same order, either way one does it.

While that is indeed the simplest math, and the way I normally do it (divide by distance!!!), that diagram is just gobbledygook to most people.

So if I see something like that "P=(x,f)", that comma means divide? That's why I showed a real calculator, because a lot of people (i.e. me) only know 'divide' by the division sign or if the numbers are on top of each other.

Those are coordinates of a point. P isn't a value, it's a point, defined by two values (x and f, or X and Z).
P=(x,f) means "the point P is a distance f horizontally from O, and a height x above O".
It's a bad diagram in many ways. The "image plane" is essentially the back of the pinhole camera, where the image is projected. But here it's in front, which is perfectly normal when you are doing the math, but it's not really clear what is going on.

The division is here:
It's using a key (and very simple) thing called "similar triangles". You can see there are two triangles in the diagram: this one, made by the actual point,
and the one where the line to the O point (the pinhole) intersects the image plane.
They are "similar", meaning they have all the same angles. That means the ratio of the sides is the same (it's just the same triangle scaled up). Hence you get x/f = X/Z, or you could say x/X = f/Z, same thing. Then you just rearrange to get a solution for x: x = f × X / Z.
x gives you the size of the object in the image plane. f is like the focal length in a camera. x is like the size of the object projected onto the sensor.
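The similar-triangles relation x/f = X/Z can be written as a one-line function. This is a minimal sketch: the function name and the example numbers (two 1000 foot objects, f = 1) are illustrative only, not taken from the diagram.

```python
def project(X, Z, f):
    """Similar triangles: x / f = X / Z, so x = f * X / Z.

    X is the real height of the point, Z its distance from the pinhole O,
    f the distance from O to the image plane (like a focal length).
    """
    if Z <= 0:
        raise ValueError("the point must be in front of the pinhole (Z > 0)")
    return f * X / Z


# Illustrative numbers: the same 1000 ft height at two distances.
near = project(1000, 2000, f=1.0)
far = project(1000, 5000, f=1.0)
print(near, far)  # 0.5 0.2
```

With f = 1 this is just "divide by distance"; a different f scales every projected size by the same factor.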

Perspective comes from projection. Projection in the sense of a camera is kind of like a slide projector in reverse. Instead of the light from the slide being projected on a wall, the light from a scene is projected (via the lens) onto the film.
Projection is all about straight lines and similar triangles. There's a view cone called a "view frustum", which is a good term to search for explanatory images. Here I'm projecting a view of NYC from across the river. The green is the frustum. The red lines go through the tops of some buildings.
You could stick a view plane anywhere across the frustum, and you'd get exactly the same image (just scaled), because the red lines go through the same points. Have a look at it in GE with the attached.

At large distances, the result is essentially the same. Just saying "distance" should cover the basics. Here we are concerned mostly with longer distances (like, to the horizon and beyond).
For a while today I was typing fustrum, then changed to frustrum. I've been getting this word wrong in a variety of ways for decades.

The calculator I've used above on the 5th Avenue photo is based purely on flat plane trig without refraction. But I thought I'd see how the sphere earth calculator fared, with refraction. Predicted apparent height order is the same (i.e., correct) and the pixel height predictions were very similar.
What I immediately notice, though, is that the sphere earth calculator predicts the objects as being higher than the flat plane calculator, whereas they're usually predicted lower. What I found was that something like the tip of the Empire State Building is predicted to be higher with the sphere earth calculator until it's 3.1 miles away, where they're the same. More distant than that, and it will be predicted to be lower. Not sure whether that reflects reality, or a need to fine tune the equations.
The second thing I notice is that they're all still predicted very accurately - I'd call within ten pixels "very accurate" - except the same one that the flat plane calculator has as an outlier, the turret of 145. This leads me to believe that the turret is probably a little bit higher, or a little bit closer, than the figures I first used.
Changing the refraction coefficient makes no difference to the angles. Also, if I change my marker pixel to the roof of the Empire State, rather than the tip, the results are even more accurate. So perhaps a clearer landmark for the marker is best.
5th Avenue, by the way, slopes up very, very slightly. Not much for most of this length, but around 20 feet higher up by the Empire State. It doesn't make any significant difference to these calculations though. https://caltopo.com/map.html#ll=40.74087,-73.99031&z=15&b=t
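As an independent cross-check on the flat versus sphere comparison, here is a rough Python sketch. It is not the attached calculator, and its equations may differ from the ones used there. It assumes a perfect sphere of radius 3959 miles, models refraction by enlarging the radius to R / (1 - k) with an assumed coefficient k = 0.13, and treats the distance as measured along the surface. All names are hypothetical.

```python
import math

EARTH_RADIUS_FT = 3959 * 5280  # assumed mean radius, about 3959 miles


def flat_angle_deg(distance_ft, height_ft, camera_ft=6.0):
    """Flat-plane angle of elevation."""
    return math.degrees(math.atan((height_ft - camera_ft) / distance_ft))


def sphere_angle_deg(distance_ft, height_ft, camera_ft=6.0, k=0.13):
    """Angle of elevation above the observer's horizontal on a sphere.

    Refraction is approximated by using an effective radius R / (1 - k).
    """
    r_eff = EARTH_RADIUS_FT / (1.0 - k)
    theta = distance_ft / r_eff  # angle subtended at the sphere's centre
    across = (r_eff + height_ft) * math.sin(theta)
    up = (r_eff + height_ft) * math.cos(theta) - (r_eff + camera_ft)
    return math.degrees(math.atan2(up, across))


# Tip of the Empire State Building: 4270 ft away, 1454 ft high.
flat = flat_angle_deg(4270, 1454)
sphere = sphere_angle_deg(4270, 1454)
print(flat, sphere, flat - sphere)
```

In this sketch the sphere angle for the tip comes out very slightly lower than the flat-plane angle, by well under a hundredth of a degree, and changing k barely moves it at this distance.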

@Rory says: I think it has been mentioned before that a simple ratio of pixels to visual angular size is a very good approximation when the angles are relatively small, but may be inaccurate (as compared with a photograph) when the angles are large. Very tall buildings, close to the observer, may be a case in point. In your New York photo some of the angles are over 15 degrees. I haven't worked out the maths, but this might be why some of your predictions are a bit off.
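One way to put rough numbers on this is to compare a linear pixels-per-degree scale with a tangent-based one. The sketch below assumes an ideal rectilinear (pinhole-like) camera held level, so that the pixel offset from eye level is proportional to tan(angle). That is an assumption about the photo, not something established in this thread. Both scales are calibrated on the same marker (eye level at row 417, Empire State Building tip at row 8).

```python
import math

EYE_LEVEL_PX = 417
MARKER_PX = 8
SPAN_PX = EYE_LEVEL_PX - MARKER_PX
# Flat-plane angle to the Empire State Building tip, camera at 6 ft.
MARKER_ANGLE_DEG = math.degrees(math.atan((1454 - 6) / 4270))


def offset_linear(angle_deg):
    """Pixels above eye level if pixels were proportional to the angle."""
    return SPAN_PX * angle_deg / MARKER_ANGLE_DEG


def offset_tangent(angle_deg):
    """Pixels above eye level if pixels are proportional to tan(angle)."""
    marker_tan = math.tan(math.radians(MARKER_ANGLE_DEG))
    return SPAN_PX * math.tan(math.radians(angle_deg)) / marker_tan


for a in (2, 5, 10, 15):
    diff = offset_linear(a) - offset_tangent(a)
    print(f"{a:2d} deg: linear scale is {diff:.1f} px higher")
```

Under these assumptions the two scales agree at eye level and at the marker, and differ by a few pixels in between, which is the same order as the small mismatches reported above.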

Hang five: I'm pretty sure it's the latter. I had a bit of an issue with the obstruction calculator that I did, where it worked differently for points above or below eye level, and the same may be happening here (mountains below, buildings above). Creases being ironed as we speak.