Binned probability distributions (BPDs) summarize big datasets.
This page is a tutorial/discussion on fractal and lacunarity analyses using BPDs with FracLac. The page has judged you and thinks you know how to use FracLac but could do with some discussion of a few nitty-gritty details. Should the page be wrong and you need less grit and more background, try the general explanations of box counting and lacunarity.
Are you wondering about certain unpronounceable abbreviations in this manual and in your results files? Do the odd combinations of letters "BPDλ", "BPDL", and "BPD" strike fear in your heart? Worry no longer. Help is here. You won't ever find out how to pronounce them, but here you will learn how to use BPD for fractal and lacunarity analyses.
Despite what I just said above, you actually can pronounce BPD, if you consider saying "binned probability distributions" a reasonable pronunciation. But what is a BPD? The binned probability distribution is a summary of all of the data for pixel mass from a box count. The data have been sorted and poured into their respective bins so you can keep track of them in big, general groups instead of as thousands and thousands and thousands…well, anyway, lots of individual values. What you get when you order the BPD with your FracLac is a separate distribution for each ε.
The binned distribution is used much like the raw data for finding fractal dimensions and lacunarity for an image. The only real difference is that we use statistics from the distribution instead of the raw data. That statement, however, has some important caveats attached.
To elaborate, whereas probability distributions are very popular and we see them everywhere trying to convince us to believe in fanciful ideas like "the mean income" of a nation, there is no definitive distribution for any set of data. The sizes of the bins, for instance, define the way the distribution spreads out. In FracLac, the user can manipulate the bins in the options panel for the relevant scan.
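To make the "no definitive distribution" point concrete, here is a minimal sketch of binning the pixels-per-box values from one box size. It is not FracLac's own code: the function name binned_distribution, the equal-width bins spanning the data range, and the sample masses are all assumptions made for illustration.

```python
def binned_distribution(masses, n_bins):
    """Return (midpoints, probabilities) for equal-width bins over the data range.

    Assumption: bins are equal width and run from the smallest to the
    largest observed mass. FracLac's actual binning rules may differ.
    """
    lo, hi = min(masses), max(masses)
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 if all values are equal
    counts = [0] * n_bins
    for m in masses:
        # The largest value would land one past the end, so clamp it into the last bin.
        i = min(int((m - lo) / width), n_bins - 1)
        counts[i] += 1
    total = len(masses)
    midpoints = [lo + (i + 0.5) * width for i in range(n_bins)]
    probabilities = [c / total for c in counts]
    return midpoints, probabilities


# Hypothetical pixel counts from ten boxes at a single box size.
masses = [1, 2, 2, 3, 5, 8, 8, 9, 12, 13]
for n_bins in (2, 4):
    mids, probs = binned_distribution(masses, n_bins)
    print(n_bins, mids, probs)
```

The same ten values give two equally likely bins when split in two, and a lopsided four-bin distribution when split in four, which is why the bin settings matter for everything computed later.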
Bins are important, but before we can really know how they affect our results, we need to know a bit about those results. The next section walks you through the basic steps of using BPDs in FracLac.
Distributions of Pixels Per Box
Some of the data from the BPD file for a 136 x 141 pixel image. Each curve represents the binned distribution of pixels per box at a particular box size.
What you see in the image above is a graph of some of the data you would find in the BPD file for a 136 x 141 pixel image. Each coloured curve in the image represents the binned distribution of pixels per box at a particular box size. Now that's a lot of pixel distributions to be messing around with, no? This section tells you how all those curves can be distilled into a number for the fractal dimension and one for lacunarity for the image. It also gives you some pointers and shows you where to find all the data you might need as you analyze away. You will see various calculations along the way. They will help if you need them to write your paper, but you don't need them to use the program.
The fractal dimension from BPDs that you are going to learn about here is not the regular FracLac Dʙ, but a type of DF called a mass dimension or BPDDʍ. For brevity, we can drop the BPD and use "Dʍ" here. It is a mass dimension because it comes from mass rather than from count, the number of boxes it took to cover all the foreground pixels. Count is what serves as a proxy for detail in the basic method of deducing a fractal dimension, which finds the limit as the slope of the ln-ln plot.
The mass you use is the mean of the BPD for any ε. Don't be frightened. You don't have to calculate it - FracLac does that. You just need to know about it in case someone asks what the Dʍ you are feeding them is made of.
In the case that someone does ask, tell them all you did was to add up all the probabilities times their midpoints from the BPD. Or if they are someone grand and complicated, show them the complicated looking equation for the mean of the probability distribution at some ε:
BPDμ(ε) = (for i=1 to Bins)Σ (m(i,ε)× p(i,ε))
where m is the mass or midpoint and p the Probability reported in the Probabilities and Masses file.
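If you would like to check a BPD mean by hand, the sum is short enough to sketch. The function name bpd_mean and the sample midpoints and probabilities below are made up for illustration; in practice the values would come from the Probabilities and Masses file.

```python
def bpd_mean(midpoints, probabilities):
    """Mean of a binned distribution: sum of midpoint times probability."""
    return sum(m * p for m, p in zip(midpoints, probabilities))


# Hypothetical bin midpoints and probabilities for one box size.
midpoints = [2.5, 5.5, 8.5, 11.5]
probabilities = [0.4, 0.1, 0.3, 0.2]
print(bpd_mean(midpoints, probabilities))  # about 6.4
```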
The next step toward finding the Dʍ, one number for all box sizes at this orientation, is to find the limit or the slope of the ln-ln regression line. You don't have to do that yourself, because FracLac can graph the regression line for you. And you could use all those BPDμε to find the slope for the data set, but you don't have to calculate that yourself, either, of course. It will be waiting for you in the Data File.
If there are multiple grid orientations used, there will be several columns of μ for BPD. In that case, to find one value for the entire image, you average all of the Dʍs over all grid orientations.
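For the curious, the slope and the averaging step can be sketched as follows. This assumes Dʍ is the ordinary least-squares slope of ln(mean mass) against ln(ε), which is one reading of the description above and not a statement of FracLac's exact regression settings. The function name lnln_slope and the synthetic means are hypothetical.

```python
import math


def lnln_slope(epsilons, means):
    """Least-squares slope of ln(mean mass) against ln(box size)."""
    xs = [math.log(e) for e in epsilons]
    ys = [math.log(mu) for mu in means]
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx


# Synthetic BPD means for two grid orientations, built so that the
# expected slopes are 1.5 and 1.7 (illustrative values only).
epsilons = [2, 4, 8, 16]
means_by_orientation = [
    [e ** 1.5 for e in epsilons],
    [e ** 1.7 for e in epsilons],
]

slopes = [lnln_slope(epsilons, means) for means in means_by_orientation]
d_m = sum(slopes) / len(slopes)  # one value for the whole image
print(slopes, d_m)
```

Averaging the per-orientation slopes, instead of pooling all the means into one regression, mirrors the instruction above to average the Dʍs over all grid orientations.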
What we call BPD lacunarity is a measure of relative variation, essentially the same as the other basic forms of λ. It depends on the coefficient of variation, except, of course, the data have been sorted prior to calculating. BPDλ is found from the same mean, BPDμε, that the BPDDʍ is found from, along with the standard deviation, BPDσε, of the BPDε.
The standard deviation is:
BPDσ(ε) = √BPDv(ε)
where BPDv(ε) is the variance at ε, and the equation for the variance is:
BPDv(ε) = (for i=1 to Bins)Σ (m(i,ε) − BPDμ(ε))² × p(i,ε)
The whole thing, BPDλ for one ε, is calculated as:
BPDλ(ε) = ( (for i=1 to Bins)Σ [m(i,ε)² × p(i,ε)] − BPDμ(ε)² ) / BPDμ(ε)²
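A small sketch can confirm that the formula for BPDλ is just the variance divided by the squared mean, that is, the squared coefficient of variation. The function names and sample values here are assumptions for illustration and are not taken from FracLac.

```python
import math


def bpd_mean(midpoints, probabilities):
    """Mean of the binned distribution."""
    return sum(m * p for m, p in zip(midpoints, probabilities))


def bpd_variance(midpoints, probabilities):
    """Variance: probability-weighted squared deviation from the mean."""
    mu = bpd_mean(midpoints, probabilities)
    return sum((m - mu) ** 2 * p for m, p in zip(midpoints, probabilities))


def bpd_lacunarity(midpoints, probabilities):
    """Lacunarity for one box size: (second moment - mean^2) / mean^2."""
    mu = bpd_mean(midpoints, probabilities)
    second_moment = sum(m ** 2 * p for m, p in zip(midpoints, probabilities))
    return (second_moment - mu ** 2) / mu ** 2


# Hypothetical bin midpoints and probabilities for one box size.
midpoints = [2.5, 5.5, 8.5, 11.5]
probabilities = [0.4, 0.1, 0.3, 0.2]

lam = bpd_lacunarity(midpoints, probabilities)
sigma = math.sqrt(bpd_variance(midpoints, probabilities))
cv = sigma / bpd_mean(midpoints, probabilities)
print(lam, cv ** 2)  # the two values agree up to rounding
```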
You can't stop there, though. Now that you know essential BPDλ, you can take your new knowledge to learn how to find the various levels of λ including the slope and mean for the whole image. Before you do, however, there is one last point to ponder…
The second-to-last topic on this page is the number of bins. Earlier, we noted that there is no definitive distribution. What does this mean for the results you get with FracLac? There are two points to be aware of when answering this question. First, it is common practice in statistics to use no fewer than 5 bins when making frequency distributions. This practical rule translates into FracLac parlance well enough. Too few bins and you get strange results.
Second, it is important to note that although the number of bins selected does not change with each ε, the bin midpoints are different for each grid calibre. That is, FracLac uses a different set of bins for each ε, rather than one set of bins for the entire image. FracLac uses 1 as the smallest possible bin size and determines the maximum based on the user's choice and the image itself. The actual bin midpoints at each ε are printed in the data file, as discussed above. The point of calculating bins this way is to generate relative distributions for all εs over an image, so that the combined results from the distributions are robust. The images shown below illustrate this using 4 bins and 40 bins for the same image.
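The idea of "same number of bins, different midpoints at each ε" can be sketched like this. The details are guesses made for illustration: the sketch reads "smallest possible bin size" as a minimum bin width of 1 pixel, takes the range from hypothetical smallest and largest masses at each box size, and uses a made-up function name, bin_midpoints. FracLac's actual rule may differ.

```python
def bin_midpoints(min_mass, max_mass, n_bins):
    """Midpoints of n_bins equal-width bins starting at min_mass.

    Assumption: bin width is never allowed to drop below 1 pixel.
    """
    width = max(1.0, (max_mass - min_mass) / n_bins)
    return [min_mass + (i + 0.5) * width for i in range(n_bins)]


# Hypothetical smallest and largest pixels-per-box at two box sizes.
mass_range_by_epsilon = {4: (1, 16), 8: (4, 64)}
n_bins = 5  # the same bin count is used at every box size

for eps, (lo, hi) in mass_range_by_epsilon.items():
    print(eps, bin_midpoints(lo, hi, n_bins))
```

The bin count stays at 5 for both box sizes, but the midpoints stretch to follow the range of masses at each one.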
An important related topic is EBPD and EBPDΛ. This discussion has not touched on the empties BPD, which is another way of looking at an image with box counting. In sum, it accounts for the 2-d space an image occupies a bit differently than a regular box count does.