Tuesday, November 11, 2014

The Power Law of Slavery

This post may not be about what you think. The words "power", "law", and "slavery" are accurate but polyvalent. Here I discuss a type of mathematical distribution that appears in slave ownership in the antebellum southern United States. The form the distribution takes is called a power law.

A few weeks ago I was watching the Antiques Roadshow (American edition) on public television, and they showed a historic pre-Civil War map (I think it was of a county in Virginia). In the legend I saw a table that listed the number of slave owners who owned given numbers of slaves. I only got a glimpse, but I noticed that there seemed to be a lot of people who owned a few slaves and a tiny number of people who owned large numbers of slaves.

I am interested in the distribution of wealth, in which the right tail--representing the wealthy--is a power law, or Pareto, distribution, while the left part of the curve seems to fit an exponential distribution, although other mathematical functions have been proposed. Of course, I'm particularly interested in historical and especially archaeological examples. Archaeologically, the emergence--presumably through self-organization--of the power-law tail in the distribution of wealth appears related to the evolution of economic complexity.

Of course, in the Old South, slaves were economically significant assets, and I thought that slave ownership might be an interesting proxy for wealth, ghoulish as that certainly is.

The table on the map suggested a very skewed distribution, which could be a power law. So, I did some Googling, looking for data, and found the 1860 census, which lists the number of slave owners by the number of slaves owned. In the following table, the first and fourth columns come from the 1860 Census (Vol. III, Agriculture, p. 247).


No. of Slaves  Midpoint of Bin  Width of Bin  No. of Slaveholders  Normalized Frequency  Log10(Midpoint)  Log10(Normalized Frequency)
1              1                1             77333                0.2025251             0                -0.693521113
2              2                1             43105                0.1128864             0.301029996      -0.947358321
3              3                1             34859                0.0912912             0.477121255      -1.039571046
4              4                1             28979                0.0758922             0.602059991      -1.119802576
5              5                1             24278                0.0635809             0.698970004      -1.196673064
6              6                1             20632                0.0540325             0.77815125       -1.267344642
7              7                1             17280                0.0452541             0.84509804       -1.344342233
8              8                1             14864                0.0389269             0.903089987      -1.409750274
9              9                1             12522                0.0327935             0.954242509      -1.484212271
10-14          12               5             40388                0.0211542             1.079181246      -1.674603628
15-19          17               5             21322                0.0111679             1.230448921      -1.952028036
20-29          24.5             10            20796                0.0054462             1.389166084      -2.263906162
30-39          34.5             10            9648                 0.0025267             1.537819095      -2.597448676
40-49          44.5             10            5179                 0.0013563             1.648360011      -2.86764006
50-69          59.5             20            5218                 0.0006833             1.774516966      -3.165411892
70-99          79.5             20            3149                 0.0004123             1.900367129      -3.384743306
100-199        149.5            100           1980                 5.185E-05             2.174641193      -4.285220781
200-299        249.5            100           224                  5.866E-06             2.39707055       -5.231637952
300-499        399.5            200           74                   9.69E-07              2.601516784      -6.013684247
500-999        749.5            500           13                   6.809E-08             2.874771637      -7.166912623
>1000*         1499.5           1000          1                    2.619E-09             3.17594647       -8.581885971

*Actually, the largest slaveholder is said to be the Estate of JOSHUA J. WARD of Georgetown, SC (roll 1235, page 212), holding 1,130 slaves (http://freepages.genealogy.rootsweb.ancestry.com/~ajac/biggest16.htm).

The data are in a mixed format. If you look at the first column, you will see that some of the "intervals" are single integers (1, 2, 3...) while the higher categories are intervals or ranges (called "bins" by statisticians). To adjust for the effects of the different bin widths I normalized the data, dividing each count by the total number of slaveholders and by the width of its bin (see Brown and Liebovitch 2010: 11-14). Graphing the last two columns on a double logarithmic graph (in which both axes are logarithmic) is a simple way of checking whether the distribution is a power law: if the data form a straight line on a double log graph, the relationship is a power law. (Actually, I logged the data rather than logging the axes of the graph.)
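For readers who want to reproduce the last two columns, the normalization can be sketched in Python. This is only an illustration under assumptions: the names `BINS` and `normalize` are made up here, the midpoints and bin widths are copied from the table as listed, and each count is assumed to be divided by the total number of slaveholders and by its bin width, which appears to reproduce the table's values.

```python
import math

# (bin label, midpoint, bin width, number of slaveholders), copied from the table.
# Midpoints and widths are taken as listed there, not recomputed from the labels.
BINS = [
    ("1", 1, 1, 77333), ("2", 2, 1, 43105), ("3", 3, 1, 34859),
    ("4", 4, 1, 28979), ("5", 5, 1, 24278), ("6", 6, 1, 20632),
    ("7", 7, 1, 17280), ("8", 8, 1, 14864), ("9", 9, 1, 12522),
    ("10-14", 12, 5, 40388), ("15-19", 17, 5, 21322),
    ("20-29", 24.5, 10, 20796), ("30-39", 34.5, 10, 9648),
    ("40-49", 44.5, 10, 5179), ("50-69", 59.5, 20, 5218),
    ("70-99", 79.5, 20, 3149), ("100-199", 149.5, 100, 1980),
    ("200-299", 249.5, 100, 224), ("300-499", 399.5, 200, 74),
    ("500-999", 749.5, 500, 13), (">1000", 1499.5, 1000, 1),
]


def normalize(bins):
    """Return (log10 midpoint, log10 normalized frequency) for each bin."""
    total = sum(count for _, _, _, count in bins)
    points = []
    for _, midpoint, width, count in bins:
        # Share of all slaveholders in this bin, per unit of bin width.
        freq = count / (total * width)
        points.append((math.log10(midpoint), math.log10(freq)))
    return points


points = normalize(BINS)
for (label, *_), (log_mid, log_freq) in zip(BINS, points):
    print(f"{label:>8}  {log_mid:.6f}  {log_freq:.6f}")
```

Dividing by the bin width is what makes the wide bins comparable to the single-integer ones: without it, a bin such as 10-14 would look too tall simply because it pools five values.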

So, here is a graph of the last two columns (which were already logged in the table above).

Figure 1. Double logarithmic graph showing the distribution of slave ownership.


The left tail of the graph has a gentle curve while the right tail is a straight line. Let's look at that right tail more closely by isolating just the last few rows of the table.

Figure 2. Double logarithmic graph showing the power-law right tail of the distribution.

The coefficient of determination (the R-squared) for the regression line, which measures the goodness of fit between the straight line and the data, is quite high. The maximum is 1, so our fit is nearly perfect.
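The regression behind a graph like Figure 2 can be sketched as follows. A power law y = C * x^(-a) becomes log y = log C - a * log x after logging both sides, so the slope of the fitted line estimates -a. This is a minimal sketch, not the original calculation: the helper `linear_fit` is hypothetical, and using the six largest bins is an assumption, since the text says only "the last few rows".

```python
# Logged values for the six largest bins (70-99 through >1000), from the table.
# Choosing exactly six points is an assumption made for this illustration.
log_mid = [1.900367129, 2.174641193, 2.39707055,
           2.601516784, 2.874771637, 3.17594647]
log_freq = [-3.384743306, -4.285220781, -5.231637952,
            -6.013684247, -7.166912623, -8.581885971]


def linear_fit(xs, ys):
    """Ordinary least squares; returns (slope, intercept, R-squared)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # For a straight-line fit, R-squared is the squared correlation.
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


slope, intercept, r_squared = linear_fit(log_mid, log_freq)
print(f"slope = {slope:.3f}, intercept = {intercept:.3f}, R^2 = {r_squared:.4f}")
```

A high R-squared on a handful of logged points is suggestive rather than conclusive, so the number is best read alongside the graph itself.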

Curiously, though, the left tail is not a good fit to an exponential. In Figure 3, below, the left tail would be a straight line on this semi-logarithmic graph if it were an exponential. (A semi-logarithmic graph is one in which only a single axis is logarithmic, while the other is linear.)

Figure 3. Semi-logarithmic graph of the left tail of the slave ownership distribution, which exhibits a poor fit to an exponential distribution.



In fact, there may be another power-law segment in the middle reach of the data. If we take just the middle few points of the data set, above (to the left of) the ones in Figure 2, and plot them on a double logarithmic graph, we see another straight line, but with a different slope from the first.

Figure 4. Double logarithmic graph showing a power law relation for the middle portion of the slave ownership distribution.

Now if we take just the points to the left of (above) these, we do get an exponential distribution for the left tail, provided we drop the first point, which is weird for some reason that I haven't figured out.

Figure 5. Semi-logarithmic plot showing an exponential distribution for the left tail of the slave ownership distribution excluding the first (leftmost) point.


The fit to an exponential in Figure 5 is really quite excellent.
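The effect of dropping the first point can be checked with a short Python sketch. An exponential y = A * exp(-b * x) becomes a straight line when only y is logged, so the test is a linear regression of log frequency on the unlogged number of slaves. The helper `r_squared` is hypothetical, and treating the rows for 1 through 9 slaves as the left tail is an assumption; the points actually plotted in Figure 5 may differ.

```python
import math

# Normalized frequencies for owners of 1 through 9 slaves, from the table.
# Treating exactly these rows as the "left tail" is an assumption.
slaves = [1, 2, 3, 4, 5, 6, 7, 8, 9]
freq = [0.2025251, 0.1128864, 0.0912912, 0.0758922, 0.0635809,
        0.0540325, 0.0452541, 0.0389269, 0.0327935]


def r_squared(xs, ys):
    """Squared correlation, which equals R-squared for a straight-line fit."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy ** 2 / (sxx * syy)


# Semi-log: the x values stay linear, only the frequency is logged.
log_freq = [math.log10(f) for f in freq]

r2_all = r_squared(slaves, log_freq)
r2_without_first = r_squared(slaves[1:], log_freq[1:])
print(f"R^2 with all nine points:    {r2_all:.4f}")
print(f"R^2 without the first point: {r2_without_first:.4f}")
```

On these rows the fit is noticeably tighter once the one-slave category is left out, which is consistent with the first point sitting above the line the others follow.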

Perhaps if we were to include the number of non-slaveholders, that is, those with 0 slaves, we would be able to extend the distribution farther to the left and understand that part of the process better.

Why should we care about the mathematical form of the distribution? The form of the distribution is a clue to the type of process that created it. There is a pretty substantial literature on the economic processes that create distributions of wealth. Unfortunately, there is no consensus on either the best mathematical forms of the distributions or the dynamical processes that have created them. They do, however, seem to share a general shape much like the one discussed here. I find that interesting. 

There is a large literature on the economics of slavery, both in the Old South and elsewhere in the world. Perhaps someone who is immersed in that literature will understand the historical significance of these results. I, unfortunately, don't have the time to delve into the literature and figure it out.

Please let me know, by commenting on this post if nothing else, if you have any thoughts.