The Once and Future Wallace

 

Simulations: Randomly Generated Classes on Spheroid Surfaces.


     Early on in the simulation studies I generated two structures that I needed for attacking problems related to the distribution of various elements across the surface of the earth. The first was a reasonably simple algorithm for identifying surface locations on a spheroid (that is, for finding the distance from the center of the spheroid to points along a given latitude). As is well known, the earth is not quite a sphere, being flattened at the poles by about 26 miles--nor, strictly speaking, is it a spheroid, but the difference for my purposes was trivial. In the end I modified a program for locating points on a sphere into one for a spheroid that was accurate to within about one mile of the actual condition; I am thus accepting an error of one part in 26, or nearly four percent, relative to the flattening itself. Relative to the whole earth, however, the error is at most about two parts in 7914 (the earth's mean diameter in miles), and is very likely inconsequential at the scale of sampling I will be describing.
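     The kind of surface-location calculation described above can be sketched as follows. The essay gives only the ~26-mile flattening; the equatorial radius used below is an assumed round figure, and the parametric placement is one simple model, not necessarily the author's:

```python
import math

# The ~26-mile polar flattening comes from the text; the equatorial
# radius is an assumed approximate figure (not given in the essay).
EQUATORIAL_RADIUS_MI = 3963.0
POLAR_RADIUS_MI = EQUATORIAL_RADIUS_MI - 13.0   # 26-mile flattening in diameter

def spheroid_point(lat_deg, lon_deg):
    """Return (x, y, z), in miles, of a surface point on an oblate spheroid.

    Uses a simple parametric placement: a circle of radius a*cos(lat)
    at height b*sin(lat).  Cruder than a geodetic-latitude model, but
    within the roughly one-mile tolerance the text accepts."""
    lat, lon = math.radians(lat_deg), math.radians(lon_deg)
    x = EQUATORIAL_RADIUS_MI * math.cos(lat) * math.cos(lon)
    y = EQUATORIAL_RADIUS_MI * math.cos(lat) * math.sin(lon)
    z = POLAR_RADIUS_MI * math.sin(lat)
    return x, y, z
```

     A point at the pole then sits at the polar radius, and one on the equator at the equatorial radius, with the one-mile-scale discrepancy confined to intermediate latitudes.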

     The second and much more difficult problem (at least for someone with no one to turn to for help at the time, which was my situation) was to generate a maximally spaced sampling grid of 1000 points spread over the entire spheroid surface. I began with a rough initial assignment of 1000 points at particular whole-number longitude/latitude combinations, then made successive manual adjustments based on the mean arc distances (and their variances) calculated from each point to all the others. After just a few iterations, those general regions of the spheroid that clearly had "too many" sample points were quickly thinned out, leaving perhaps another dozen iterations to reach a close approximation of maximally spaced sampling.
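     An automated analogue of this manual adjust-and-recheck procedure is iterative mutual repulsion on the sphere; the sketch below is not the author's method, and the step size and iteration count are illustrative assumptions:

```python
import math
import random

def normalize(p):
    """Project a 3-vector back onto the unit sphere."""
    n = math.sqrt(sum(c * c for c in p))
    return tuple(c / n for c in p)

def spread_points(n, iterations=300, step=0.02, seed=1):
    """Spread n points toward maximal spacing on a unit sphere by
    repeatedly pushing each point away from all the others and
    re-projecting onto the surface."""
    rng = random.Random(seed)
    pts = [normalize((rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)))
           for _ in range(n)]
    for _ in range(iterations):
        moved = []
        for i, p in enumerate(pts):
            force = [0.0, 0.0, 0.0]
            for j, q in enumerate(pts):
                if i == j:
                    continue
                d = [p[k] - q[k] for k in range(3)]
                dist2 = sum(c * c for c in d) + 1e-12
                for k in range(3):
                    force[k] += d[k] / dist2   # inverse-square repulsion
            moved.append(normalize(tuple(p[k] + step * force[k]
                                         for k in range(3))))
        pts = moved
    return pts
```

     With six points this converges toward the octahedral arrangement; with 1000 it plays the same role as the dozen-odd manual iterations described above, though a pure-Python pairwise loop at that scale is slow.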

     I do not mean to suggest that these exact approaches should be applied in a possible next round of more precise analyses, but I am quite confident that these starting points were more than adequate for present purposes: (1) I only wanted to get a general idea of what would play out, and (2) the data I would connect to them in the real-world analyses are in any case less precise than this.

     Only one kind of simulation was attempted here. I first randomly assigned class membership in one of four groups to each of the 1000 points, then calculated the spatial autocorrelation properties of the resulting sets. The resulting four by four matrices of values were then double-standardized.

     I employed three different variations of the spatial autocorrelation algorithm, but with this many points involved, random assignment is a very powerful influence, and very little organization emerged in the relation of the groups to one another. This was easily seen: regardless of the spatial autocorrelation algorithm, every input matrix calculated consisted of values that varied from one another by only a few percent. A typical input matrix was:

1.728 1.698 1.745 1.703
1.698 1.729 1.712 1.719
1.745 1.712 1.695 1.720
1.703 1.719 1.720 1.720
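     A summary matrix of this general kind can be sketched as the mean arc distance between members of each pair of classes. The essay does not specify the three autocorrelation variants actually used, so this measure is only one plausible stand-in:

```python
import math
import random

def arc_distance(p, q):
    """Great-circle distance, in radians, between unit-sphere points."""
    dot = max(-1.0, min(1.0, sum(a * b for a, b in zip(p, q))))
    return math.acos(dot)

def class_distance_matrix(points, labels, k=4):
    """Mean arc distance between members of each pair of classes -- one
    plausible stand-in for the (unspecified) spatial autocorrelation
    measures described in the text."""
    sums = [[0.0] * k for _ in range(k)]
    counts = [[0] * k for _ in range(k)]
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            d = arc_distance(points[i], points[j])
            a, b = labels[i], labels[j]
            sums[a][b] += d
            counts[a][b] += 1
            if a != b:                      # keep the matrix symmetric
                sums[b][a] += d
                counts[b][a] += 1
    return [[sums[a][b] / counts[a][b] for b in range(k)] for a in range(k)]

# Random points and random class labels, as in the simulation described
# (400 points here rather than 1000, for speed; seed is arbitrary).
rng = random.Random(42)
pts = []
for _ in range(400):
    v = (rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1))
    n = math.sqrt(sum(c * c for c in v))
    pts.append(tuple(c / n for c in v))
labels = [rng.randrange(4) for _ in range(400)]
m = class_distance_matrix(pts, labels)
```

     Under random assignment, every cell of such a matrix hovers near the overall mean inter-point distance, which is why the entries above differ by only a few percent.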

     I ultimately ran a total of 1500 double-standardizations (500 for each spatial autocorrelation measure); 16, 16, and 19 of the four by four input matrices eventually passed the test of producing the right type of symmetric double-standardized values. This comes to 51 of the 1500, or about 3.4 percent. Thus, rather more pass the test on this kind of surface than under conditions in which purely random numbers are fed into the double-standardization algorithm, and probably around the same proportion as on arbitrarily delimited flat surfaces.
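     One common form of double-standardization alternately z-scores the rows and columns of a matrix until both are standardized at once; that this matches the author's exact procedure is an assumption, and the tolerance below is illustrative:

```python
import statistics

def double_standardize(matrix, tol=1e-9, max_iter=500):
    """Alternately standardize rows and columns (mean 0, s.d. 1) until
    both conditions hold simultaneously."""
    m = [row[:] for row in matrix]
    for _ in range(max_iter):
        # z-score each row
        m = [[(v - sum(r) / len(r)) / statistics.pstdev(r) for v in r]
             for r in m]
        # z-score each column (transpose, standardize, transpose back)
        cols = [[(v - sum(c) / len(c)) / statistics.pstdev(c) for v in c]
                for c in map(list, zip(*m))]
        m = [list(r) for r in zip(*cols)]
        # stop when the rows remain standardized after the column pass
        if all(abs(sum(r)) < tol and abs(statistics.pstdev(r) - 1.0) < tol
               for r in m):
            break
    return m

# The "typical input matrix" quoted in the text.
typical = [[1.728, 1.698, 1.745, 1.703],
           [1.698, 1.729, 1.712, 1.719],
           [1.745, 1.712, 1.695, 1.720],
           [1.703, 1.719, 1.720, 1.720]]
ds = double_standardize(typical)
```

     Whether a given matrix then counts as "passing" depends on the further structural test the text alludes to (symmetric values projecting into three dimensions), which is not reproduced here.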

     There are several important things to take away from this, but none of them involve absolute verdicts. First, and most importantly, even in the face of near-total randomness of pattern, some of the resulting pattern summaries (i.e., 4 by 4 matrices) calculate out as the kind of structure that projects into a three dimensional space.

     Second, however, the exact number that will do so under these basic conditions may vary widely depending on the distance metric used (e.g., straight-line Euclidean distance or arc distance along the surface) and on the density of sampling of the surface. Concerning the latter, it should be apparent that if I attempted to set up four classes of points using only, say, a total of fifty evenly spaced sample points, by chance some of the classes would end up with only a handful of points, greatly affecting the relative magnitudes of the values going into the summary 4 by 4 matrix. One must therefore be careful to consider how conditions will change with the density of sampling.
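     The sampling-density point is easy to check by simulation: randomly assigning fifty points among four classes regularly leaves some class with only a handful of members. The trial count and seed below are arbitrary choices:

```python
import random

def smallest_class_size(n_points=50, k=4, trials=200, seed=7):
    """Smallest class size observed over repeated random assignments
    of n_points into k classes."""
    rng = random.Random(seed)
    smallest = n_points
    for _ in range(trials):
        counts = [0] * k
        for _ in range(n_points):
            counts[rng.randrange(k)] += 1
        smallest = min(smallest, min(counts))
    return smallest
```

     With an expected class size of 12.5 and a standard deviation of about 3, runs producing a class of eight or fewer points are common, which is exactly the distortion of the summary matrix warned of above.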

     Further, it is very likely that other class assignment algorithms will produce very different results altogether. I only used a random form of assignment here, but one can imagine any variety of other types (and, of course, on variously shaped surfaces). One such exercise follows in the next section.

_________________________



Copyright 2006-2014 by Charles H. Smith. All rights reserved.
Materials from this site, whole or in part, may not be reposted or otherwise reproduced for publication without the written consent of Charles H. Smith.

Feedback: charles.smith@wku.edu
http://people.wku.edu/charles.smith/once/spa4.htm