Now technically speaking, we are not dealing with perfectly two-dimensional systems here: there is vertical range in the topographical component of any drainage basin, and beyond this the general curvature of the earth. Nevertheless, if one begins with relatively small systems the curvature complication becomes negligible (and in any case may be argued irrelevant on gravitational grounds), and in this instance the topography itself is the measured variable whose pattern in two dimensions is really what we are looking at. (For those who are still not convinced, I actually did run a parallel set of analyses to the ones described below in which I calculated distances between sample points not on the basis of their relative two-dimensional coordinates, but instead within three dimensions, treating each sample point as intersecting the earth at x feet above sea level. The results differed only to the most trivial degree from the strictly two-dimensional ones.) What we are left with is a system composed of areally distributed potential energies; that is, a "field" of varying elevations above sea level, every location within which has a potential for doing work that is directly proportional to its elevation. In the case of the earth's internal zonation described in the last write-up, the four zones had evolved as a function of very large-scale forces exerted rather evenly and consistently over many millions, even billions, of years. Stream basins, however, are dynamic over a much shorter period of time and of course are less closed: with changes in sea level due to glaciation episodes or major tectonic events (or even local events such as stream piracy), energy conditions across any basin may change relatively rapidly. Thus, a basin's equilibrium with its surrounding environment, even if it becomes possible under just the right conditions, may prove fleeting.
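The two- versus three-dimensional distance check mentioned in the parenthesis above can be illustrated with a minimal sketch. This is not the code used for the original analyses: the function names and the synthetic sample points are hypothetical, and the sketch assumes that horizontal coordinates and elevations are expressed in the same units (feet).

```python
import numpy as np

def pairwise_distances(coords):
    """Euclidean distance between every pair of rows in `coords`."""
    diff = coords[:, None, :] - coords[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def compare_2d_3d(x, y, z):
    """Return the mean and maximum relative shortfall of the 2-D
    distances with respect to the 3-D ones, over all distinct pairs."""
    d2 = pairwise_distances(np.column_stack([x, y]))
    d3 = pairwise_distances(np.column_stack([x, y, z]))
    upper = np.triu_indices(len(x), k=1)      # each pair counted once
    rel = (d3[upper] - d2[upper]) / d3[upper]
    return rel.mean(), rel.max()

# Hypothetical basin: 300 sample points over a 10,000 ft square,
# with elevations between 500 and 1,000 ft above sea level.
rng = np.random.default_rng(0)
x = rng.uniform(0, 10_000, 300)
y = rng.uniform(0, 10_000, 300)
z = rng.uniform(500, 1_000, 300)

mean_rel, max_rel = compare_2d_3d(x, y, z)
print(f"mean relative difference: {mean_rel:.4%}, max: {max_rel:.4%}")
```

When relief is small compared with horizontal extent, most pairs of points are separated far more horizontally than vertically, which is one way to see why the two sets of distances would differ little.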
As a result, one finds few systems within which all portions of the basin are evenly balanced in the sense of being exactly transitional between erosional and depositional environments. As we will see in the next write-up, the conditions of non-equilibrium within stream basins provide a number of potential secondary tests of the model under discussion here, but for the moment we will concentrate on the primary issue. If indeed the areal distribution of potential energies (elevations) within such systems can be interpreted through the model, we should expect that a(n efficient) classification of the elevations into four maximally different ranges of elevation will yield patterns whose class-level spatial autocorrelation properties, again represented as a four-by-four matrix of values, will double-standardize into a symmetric arrangement of z scores. This should be true of
In this instance there will be no
[A methodological note: Those familiar with the use of information statistic-based nonhierarchical clustering algorithms are aware that the solutions are reached through an iterative process. This means they are prone to accidentally becoming "entrapped" within local minima, and reporting suboptimal results. The way around this problem is to initialize the process at a variety of starting configurations, and ultimately select the particular solution that accounts for the most variation in the data. In the many analyses I performed here (two- through six-class solutions for each of the twenty-five basins) this was done
Once this basic plan of investigation was thought out, twenty-five drainage basins were chosen for examination. Over half of the basins selected were taken from 7.5-minute (1:24,000) USGS quadrangle series maps of the northeastern United States, with the remainder from other U.S. areas and map series at various scales. Data collection was painfully manual, using transparent triangularly gridded overlays: each pinpointed sample location fell randomly between successive contour lines, requiring a careful act of interpolation to retrieve an actual (estimated) value. In this first round of studies, the number of points sampled varied from fewer than three hundred on the least sampled drainage basin to 550 on the most sampled one. These choices were a gamble: I figured that for an initial look at the matter a range of sources, geographical locations, and sample grid densities should be employed, lest complaints be raised on this basis. However, it was unclear initially just how dense a sampling of the surfaces would be needed to reasonably capture the essence of the organization postulated. Table 1 provides basic stats on the basins studied:
Table 1. Drainage basins used in the study.
In the regression studies described in the next write-up, the dependent variable (Y) consists of the means of the correlation coefficients associated with each system's corresponding set of spatial autocorrelation scores; the X
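The restart strategy described in the methodological note above (run the iterative classification from several starting configurations and keep the solution that accounts for the most variation) can be sketched as follows. Note the substitution: this sketch uses ordinary one-dimensional k-means as a stand-in, not the information statistic-based algorithm used in the study, and all function names, parameter values, and the synthetic elevations are hypothetical.

```python
import numpy as np

def kmeans_1d(values, k, rng, max_iter=100):
    """One k-means run on 1-D data from a random starting configuration.
    Returns (labels, centers, proportion of variation explained)."""
    centers = rng.choice(values, size=k, replace=False)
    for _ in range(max_iter):
        labels = np.argmin(np.abs(values[:, None] - centers[None, :]), axis=1)
        # Recompute each class mean; keep the old center if a class is empty.
        new_centers = np.array([
            values[labels == j].mean() if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    labels = np.argmin(np.abs(values[:, None] - centers[None, :]), axis=1)
    within = np.sum((values - centers[labels]) ** 2)
    total = np.sum((values - values.mean()) ** 2)
    return labels, centers, 1.0 - within / total

def best_of_restarts(values, k, n_starts=25, seed=0):
    """Run k-means from n_starts random starts; keep the run that
    explains the most variation, guarding against local minima."""
    rng = np.random.default_rng(seed)
    runs = [kmeans_1d(values, k, rng) for _ in range(n_starts)]
    return max(runs, key=lambda run: run[2])

# Hypothetical sample of 400 elevations (feet above sea level).
elevations = np.random.default_rng(1).uniform(500, 1_000, 400)
labels, centers, explained = best_of_restarts(elevations, k=4)
print(f"four-class solution explains {explained:.3f} of the variation")
```

Because the best run is chosen from the whole set of restarts, adding more starting configurations can never make the selected solution worse, only (possibly) better.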
Results? 24 of the 25 basins yielded double-standardized scores that were in fact symmetric in the sense anticipated. The one exception "nearly converged" (especially after denser sampling was applied), but it was the most problematic of the 25 to begin with because (1) it was by far the smallest basin looked at (only about 0.6 square miles in extent); (2) its drainage divides with surrounding basins were among the most poorly defined of the 25; and, most importantly, (3) ten or twenty percent of its area (per an on-site field check) had been artificially re-landscaped to support the building of a high school at the crest of a hill. Inasmuch as the basins studied included a wide variety of system sizes, and evolved under a considerable range of climate types and geologies, I have little doubt at this point that just about any such system, if sampled adequately, will yield the same results (though special cases such as karst landscapes and/or areas of internal drainage may present additional challenges to predictive modelling).
A further interesting discovery concerns the relative lack of variation among the arrays of z scores produced from the 25 analyses: most of them look pretty much like one another. Were much denser samplings of the basins taken, the resulting refinements in characterization of internal organization might well expose even greater similarities--and, conceivably, lack of variation altogether.
As mentioned above, it should not be surprising to find here that, unlike the internal zonation patterns of the earth, there is no obvious division of each basin into four zones of elevation: simply, drainage basins are much more open systems that continually respond to all sorts of upsetting conditions that would prevent them from attaining conditions of dynamic equilibrium resulting in obvious, permanent zonations. That said, however, a close enough look at a basin's organization might yet reveal some tendencies in that direction. As it turns out, this particular data set does exhibit evidence of such. First, there is the overall average of the mean r values connected with the correlation matrices for the twenty-five spatial autocorrelation matrices that ultimately were double-standardized. In the two-dimensional simulations reported earlier, the mean r's across the eighteen groups of four-class solutions ranged from .029 to .198, with only one mean being below .050 (see the table in the 'Simulations: Two-Dimensional Systems' essay). For the main model (the first of the two "model #2" variations shown in Table 2 below) across the twenty-five analyses reported here, the mean r values were: for three-class classifications of the sample elevations, .0572; for four-class classifications, .0348; for five-class classifications, .0326; and for six-class classifications, .0298. Notice, then, that the mean for the four-class real-world systems is rather lower than the means for the simulated systems.
Table 2. Statistics comparing the spatial and aspatial models.
See text for explanation.
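As a worked check on the figures quoted in this essay, each step from one classification to the next can be expressed as the share of the remaining gap that it closes: the gap to zero for the mean r values, and the gap to 1.0 for the variation explained. The helper name below is hypothetical, and only values stated in the text are used.

```python
def share_of_remaining(old, new, limit):
    """Fraction of the gap between `old` and `limit` closed by moving to `new`."""
    return (new - old) / (limit - old)

# Mean r values for s.a. model #2, three- through six-class solutions (from the text).
mean_r = {3: 0.0572, 4: 0.0348, 5: 0.0326, 6: 0.0298}
# Mean variation explained by the cluster analyses (from the text).
explained = {2: 0.705, 3: 0.853, 4: 0.9122}

for k in (4, 5, 6):
    s = share_of_remaining(mean_r[k - 1], mean_r[k], limit=0.0)
    print(f"mean r, {k - 1} -> {k} classes: {s:.1%} of the way to zero")

for k in (3, 4):
    s = share_of_remaining(explained[k - 1], explained[k], limit=1.0)
    print(f"variation explained, {k - 1} -> {k} classes: {s:.1%} of the remainder")
```

On these figures the three- to four-class step closes roughly 39% of the distance to zero in mean r, while the four- to five-class and five- to six-class steps close roughly 6% and 9% respectively; the variation-explained shares, at about 50% and 40%, decline far more gradually.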
The lower portion of Table 2 puts into better perspective the importance of the results obtained here. Its top half gives the actual amount of reduction in the mean correlation/variation values as one increases the number of classes of elevations in the analyses. For example, for spatial autocorrelation model #2, the "mean" for the three-class classification is .0572, and for the four-class classification, .0348 (both as shown in the upper portion of Table 2); the difference is .0224, and this latter value appears in the corresponding spot in the top half of the lower portion of the table. In the column at the far right of the lower portion of the table, the number '0.148' appears; this corresponds to the increase in variation explained in the initial series of cluster analyses as one moves from the two-class analysis (.705) to the three-class one (.853). In the bottom half of the lower portion of Table 2, these values are translated into percentage improvements as the mean spatial autocorrelation values approach zero and the variation explained approaches 1.0. In the latter instance, for example, the increase from a two-class to a three-class clustering model has absorbed just over fifty percent (50.20%) of the remaining variation (that is, .148, or .502 of the .295 difference between 1.0 and .705).
What all of this shows is a clear difference between the spatial (spatial autocorrelation) and aspatial (nonhierarchical cluster analysis) models in the pattern of reduction of unexplained variation as the number of classes imposed increases. Over the twenty-five analyses, the pattern of increase in variation explained as more classes are added to the classification exercise is a rather smooth one, with the proportion of the remaining unexplained variation decreasing at a fairly uniformly decreasing rate--this is not surprising. However, notice what happens in the spatial autocorrelation models (I have included the results from all three I used here, though it is the second such model that invariably gives the most efficient and consistent results): the increase from three- to four-class models is large, but further addition of classes produces relatively small increases. In two of the models, there are actual
If one turns to the four-class solutions alone (for s.a. model #2) and examines them individually, from basin to basin, something else very interesting emerges. For the twenty-four basins whose patterns double-standardized to symmetric results, I correlated each basin's mean r value (i.e., of the correlation matrices for the spatial autocorrelation scores, as listed in Table 2) with its "additional variation explained going from three-class solution to four-class solution" value. On the average, and per Table 2, this latter number is .9122 - .8530 = .0592, but of course it varies from basin to basin. The question was: is there a significant relationship between our measure of system equilibrium and the degree to which the statistical explanation of the classification structure deviates from an assumption of no structural controls? The correlation coefficient of the relationship turns out to be a rather high r = -.7125 (with the sign being the anticipated negative), significant at alpha = .001. By contrast, the parallel relationship for going from a four-class solution to a five-class one produces an r of -.2777, which is not significant at alpha = 0.1. So, in these data at least, it appears that in the four-class solutions in particular there is a heightened general connection between structural peculiarity of the system and its level of measurable equilibrium.
Entropy maximization modelling within geography has a long history, extending back to the derivations of Alan G. Wilson in the late 1960s. The work described here differs starkly from this tradition not in its basic statistical methodology, but in looking toward an
In any case, I submit that these constitute very interesting results--results that strikingly support the spatial structure model being explored here. In the next write-up, I report a series of secondary analyses on the same basins that are equally revealing.
_________________________
Copyright 2006-2014 by Charles H. Smith.
All rights reserved. Feedback: charles.smith@wku.edu