---
title: "Vignette >Extended discussion<"
author: "Nils Müller-Scheeßel"
date: "October 2017"
output: rmarkdown::html_vignette
bibliography: ../inst/REFERENCES.bib
vignette: >
  %\VignetteIndexEntry{Vignette >Extended discussion<}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

## Introduction

Life tables have only a marginal position in archaeological research, for several reasons. Columns with rows of multidigit numbers carrying strange labels can be daunting, and indeed the mathematics behind the construction of life tables based on modern census data is not for the faint of heart.

Compared to life tables based on real census data, archaeological ones are more straightforward, but their interpretation carries its own problems. First and foremost, they suggest a degree of exactness which is not supported by the underlying data. This is another reason why life tables are regarded with a certain degree of suspicion by archaeologists.

And finally, although the construction of archaeological life tables can be straightforward, it involves a lot of computation and number-crunching compared to other methods used in the archaeological discourse. To our knowledge, an accessible and easy-to-use tool for computing life tables has been missing. This is why we sat down and created mortAAR for R. We hope it will be of use for archaeologists worldwide.

On the surface, life tables from modern census data and archaeological ones, such as those generated by mortAAR, look very similar. For modern census data, though, far more sophisticated measures are available. We cannot go into more detail here but refer the reader to the abundant literature on demography and life table construction.

However, perhaps even more important than the different methods of computation are the differences in input data. Due to infrastructure, modern census data of developed countries can be regarded as fairly complete and comprehensive.
This is, of course, not true for archaeological data. Life tables from modern census data are important tools for inferring the prospective development of a given population. This is, of course, not possible for archaeological data sets. Most of the time, we have to assume that we are dealing with a stationary population.

---

## Overview of features

In our view, mortAAR shines in the following areas:

### Ease and flexibility of the input

We worked hard to allow the user as many customisations as feasible. First and foremost, the tables can have many different kinds of formatting. Data on single individuals are supported as well as already pooled data (e.g., from an existing life table). In the latter case, the user has to specify a numerical variable which defines the count of each age class.

In most data sets, the age ranges are defined rather loosely, reflecting the belief that one year more or less does not matter given the basic methods of age estimation. However, for the construction of life tables it should be obvious that 20--40 is not the same as 20--39. In the former case the individual in question will also contribute to the age class of 40--44 (using 5-year classes). Therefore, the user should carefully check the methodology the data set uses, if they did not generate it themselves. To spare the user the inconvenience of transforming the data to match a given methodology, both possibilities -- labelled "included" (e.g., 20--39) and "excluded" (e.g., 20--40) -- are made available.

The default method of constructing the life table follows the approach used for modern data. It uses 5-year classes, except for the youngest class. Following Acsádi and Nemeskéri [-@acsadi_history_1970], for Central Europe it has become practice to publish either the full life table or a collapsed form with five-year intervals [@herrmann_prahistorische_1990, 304ff.; @kokkotidis_graberfeld-_1991].
However, in the international demographic discourse [e.g., @keyfitz_world_population_1968], the first 5-year age group is subdivided into those who died in their first year of life (0) and those who died in the second to fifth year (1--4). Because the differences are highly relevant in many archaeological contexts and the information is also -- in contrast to the higher age groups -- anthropologically available, we decided to choose this as the default methodology. With that, we hope to enhance the comparability of historic or present-day demographic data with data of archaeological origin. However, if the user has good reasons to do it differently, they can also define their own age classes.

Furthermore, the user is encouraged to define any suitable variable as a grouping variable. For an individual cemetery, this will mostly be the sex of the deceased, but it might as well mark different populations.

### Sophisticated means of computation

To get discrete age classes from the age spans of individuals or groups of individuals, as they are usually provided by anthropological studies, the age span is spread across single years; the individual values for these years are afterwards pooled into whichever age classes the user chose initially. It is tempting to define different kernels for the spreading process (e.g., Gaussian), but in our view this solution would not reflect the anthropological diagnosis correctly: an individual diagnosed as adult in Central European terminology -- that is, 20 to 39 years old -- has the same chance of being in reality, say, 22 or 38. The Gaussian kernel, however, would imply that an age of 30 years is the most probable. This is why we decided to follow an equi-divisional approach [called "proportional" by J. L. @boldsen_methods_1988].
Each year of the reported age interval of an individual or a group of individuals is assigned the same value, essentially the same probability of being the true age.

The general steps for computing an archaeological life table are straightforward: leaving idiosyncrasies like the choice of the size of the radix and the exact formulas aside, all formulas circulating in archaeological textbooks reach the same results (apart from possible rounding errors). One detail, however, tends to be overlooked. It concerns the average number of years lived by an individual that died within a specific age class (here called A(x) following the UN tables, elsewhere specified as "nax"). A(x) is needed for computing L(x), the overall number of years lived within an age class, which is ultimately used to calculate life expectancy and to reconstruct the population.

In virtually all archaeological textbooks, A(x) is defined as a/2, essentially splitting the length of the age interval in question in half. As we cannot know the exact age at death, given the current set of age estimation methods, this seems a reasonable starting point. The verdict of N. Keyfitz [-@keyfitz_applied_1977, 13] -- "The uniform distribution of deaths […] leads to nax = n/2, but this is not acceptable when the age interval n is as wide as 5 years." -- certainly has its validity when dealing with modern census data, but might be disregarded for older individuals, as the empirical value for age classes above 5 years is very close to 2.5, the computed value (empirically, it mostly hovers between 2.45 and 2.65, as older individuals tend to die later than the middle of their class; this is far more difficult to generalize than for the younger age classes [see below]).

Interestingly, the problem also applies to the first age group (0--1 years) of empirical modern life tables where the age at death is recorded only in years, not in months.
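
As a hedged illustration of why A(x) matters, its relation to L(x) can be written out with the standard life table identities. The symbols $l(x)$ (survivors at the beginning of the age class), $d(x)$ (deaths within the class), $a$ (length of the class), $T(x)$ and $e(x)$ follow common demographic notation and are assumptions here, not quotations from the textbooks discussed above:

$$L(x) = a \cdot l(x) - \bigl(a - A(x)\bigr) \cdot d(x)$$

$$T(x) = \sum_{y \ge x} L(y), \qquad e(x) = \frac{T(x)}{l(x)}$$

With purely illustrative numbers for an age class of length $a = 4$, $l(x) = 100$ and $d(x) = 30$, splitting the interval in half gives

$$A(x) = \tfrac{a}{2} = 2: \qquad L(x) = 400 - 2 \cdot 30 = 340,$$

whereas a smaller value, here one third of the interval chosen only for the sake of the example, gives

$$A(x) = \tfrac{a}{3} = \tfrac{4}{3}: \qquad L(x) = 400 - \tfrac{8}{3} \cdot 30 = 320.$$

The smaller A(x), the fewer years are credited to those who died within the class, so the choice matters most where many deaths fall into one class and are concentrated early in the interval.
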
It seems that Keyfitz [-@keyfitz_probabilities_1970, 31] was the first to assume a value for nax to overcome this problem. From empirical data, he arrived at the following values for the youngest age classes: $a_0 = 0.07 + 1.7M_0$ for the first year of life, and $_4a_1 = 1.5$ for ages 1 to 4. Elsewhere [@keyfitz_applied_1977] he called this the "separation factor". Contrary to intuition, this separation factor for the youngest age groups is relatively small in developed countries (e.g., ibid. p. 15), but large in developing countries or historical data (and, supposedly, also in prehistoric times). This means that if young children die at all in developed countries, they do so very young.

In life tables of developing countries, the average number of years lived in the age classes 0 and 1--4 hovers around 0.3 and 1.3, respectively. We therefore decided to approximate A(x) for age classes <5 with a/3, as we think that taking the separation factor into account is well backed by empirical data [also implied by @chamberlain_demography_2006, 30; for anthropological data already: @weiss_demography_1973, 37]. The computation with this factor is the default setting. However, if the user does not agree, she is, of course, able to supply her own factor values or to disregard the factor approach altogether.

### Comprehensiveness of output

mortAAR provides life tables for every group the user specified and, automatically, one table for all groups combined. Furthermore, it allows the user to visualize the resulting data sets via curve plots. The user can simply click through the plots for the different key values and thereby get a clear picture of the specificities of each data set and -- if more than one category was specified -- of the differences between the data sets.

Finally, we added the reconstruction of the age distribution of the given population as an additional column (rel_popx). This is usually not provided by modern standard life tables.
Following Kokkotidis and Richter [-@kokkotidis_graberfeld-_1991, 227], this is achieved by dividing L(x) by T0 [for a different equation see @weiss_demography_1973, 38].

---

## Archaeological example

### Münsingen

#### Archaeological background

The cemetery of Münsingen-Rain was discovered in 1904 during quarrying and then extensively excavated in 1906. The dead were buried singly and supine in wooden coffins, often lavishly equipped with jewelry or weapons. The documentation and subsequent publication of the graves in 1908 [@wiedmer-stern_graeberfeld_1908] was outstanding for that time. This and the fact that Münsingen-Rain, with 220--230 individuals and an occupation of roughly 300 years, covers much of the Latène period secured it one of the most important places in typological and chronological studies of that period [@hodson_cemetery_1968; see also @mueller_muensingen_1998].

Recently, in a number of studies, the anthropological and archaeological material has been reanalysed from different angles [@jud_untersuchungen_1998; @alt_et_al_verwandtschaft_2005; @mueller_et_al_artefacts_2008; @scheeres_et_al_investigations_in_press; @moghaddam_et_al_muensingen_2016]. Though some papers explicitly dealt with the age and sex structure of the cemetery [e.g., @hinton_analysis_1986; @jud_untersuchungen_1998, 125ff.], interestingly, none has yet presented a thorough demographic analysis of the individuals from Münsingen-Rain.

#### The data set

The data originate from a published report of recent scientific analyses of the surviving skeletal material of 71 individuals [@moghaddam_et_al_muensingen_2016]. The data are available as supplemental online material and were taken directly from the respective xlsx file. Besides the age range of each individual, the sex, the archaeological phase and a binary indicator of the presence of grave goods are included. Further columns that are not of interest in the present context were omitted.
The authors do not compare their results with those of an older investigation of the same individuals. However, a comparison with Hug [-@hug_sammlung_1956] shows that there is generally very good agreement in age as well as sex determination.

#### Constructing the life table

```{r, message=FALSE}
# load package dependencies
library(mortAAR)
library(magrittr)
```

```{r}
# load dataset muensingen
muen <- mortAAR::muensingen
```

First of all, we have to get an overview of the age classes that were used. The easiest way would be `muen$age`. However, this results in a roughly formatted vector of strings. We used the following code to get an ordered and well-arranged list without duplicated age classes.

```{r}
# list the distinct age ranges, shortest strings first
muen %>%
  dplyr::select(age) %>%
  dplyr::arrange(nchar(age), age) %>%
  unique()
```

As can easily be seen, the data set comprises 31 different age ranges. Skimming through the lines, it is obvious that the anthropologists were not completely consistent in their age ranges. While "20--40" clearly should be read as "from 20 to 39 years", this is certainly not the case with, for example, "7--8". The first one is, in our terminology, exclusive of the maximal age, the second inclusive. Less clear are cases like "12--14" or "13--15". If we want to define 5-year classes, the differentiation does not really matter in the first case, as the individual would fall completely within the age range of 10 to 14. In the second case, the differentiation is unfortunately of importance. Taking it as inclusive, the individual would count with one third (the year 15 out of 13, 14 and 15) towards the age class of 15 to 19, while reading it as exclusive the individual would be associated entirely with the younger age class. Unfortunately, we cannot solve this issue; for future reports we can only ask the authors to be more precise in their terminology.

For the sake of the example, we assumed that age ranges only one year apart (e.g., "7--8") are meant as inclusive and had to be adapted, but otherwise treated the age ranges as exclusive.
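
The effect of the two readings can be made tangible with a small sketch in base R. The helper `spread_range()` is hypothetical and not part of mortAAR; it only mimics the equi-divisional idea for plain 5-year classes (without the separate classes 0 and 1--4) and should not be taken as the package's actual implementation.

```{r}
# Hypothetical helper (not part of mortAAR): spread one individual evenly
# over the single years of its age range and pool the shares into classes.
spread_range <- function(from, to, inclusive = FALSE, width = 5) {
  # last single year of age covered by the range
  last <- if (inclusive) to else to - 1
  years <- from:last
  # every single year receives the same share of the individual
  share <- rep(1 / length(years), length(years))
  # lower bound of the class each single year falls into
  class_start <- (years %/% width) * width
  tapply(share, class_start, sum)
}

# "20-40" read as exclusive: four classes (20, 25, 30, 35) with 1/4 each
spread_range(20, 40, inclusive = FALSE)

# "20-40" read as inclusive: 1/21 of the individual moves to the class 40-44
spread_range(20, 40, inclusive = TRUE)
```

For wide ranges the shifted share is small, but for narrow ranges that straddle a class boundary the reading can move a substantial part of an individual into the neighbouring class.
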
For those individuals with an open end, we assume a maximal age of 70.

In the next step, not only do the aforementioned inconsistencies have to be corrected, but non-numerical characters (e.g., ">") must also be replaced to be able to feed the data into mortAAR.

```{r}
# correct inconsistencies: open-ended and inclusive ranges are rewritten
# as exclusive "from-to" ranges; all other values are kept unchanged
muen <- muen %>%
  dplyr::mutate(
    age = dplyr::case_when(
      age == ">50"    ~ "50-70",
      age == ">60"    ~ "60-70",
      age == "10+/-1" ~ "9-12",
      age == "10"     ~ "10-11",
      age == "7-8"    ~ "7-9",
      age == "9-10"   ~ "9-11",
      TRUE            ~ age
    )
  )
```

After adjusting the age class categories, the column "age" can be split into two separate columns "from" and "to", which mark the age span. These are then further transformed into numeric columns.

```{r}
# split string column and convert both bounds to numeric
muen <- muen %>%
  tidyr::separate(age, c("from", "to")) %>%
  transform(from = as.numeric(from), to = as.numeric(to))
```

All the steps above were necessary to prepare the raw data. We are now ready to feed it into the function `prep.life.table`, which rearranges it for the actual life table function. We want to have 5-year age categories, except for the first 5 years, which are further split into the first and the subsequent 4 years. Therefore, we choose "Standard" as the `method`. As explained above, we assume that the age ranges are meant to be exclusive of the closing year (e.g., "20--40" as "20--39"). We control this by choosing "excluded" for `agerange`. The beginning and end of each age range are given by the columns "from" (`agebeg`) and "to" (`ageend`), respectively. For the first run, we choose "sex" as the grouping variable `group`, but we could also have selected "latene_phase" (the archaeological phasing) or "grave_goods" (existence or non-existence of grave goods).

```{r}
# apply data preparation function
muen_prep <- muen %>%
  prep.life.table(
    group = "sex",
    agebeg = "from",
    ageend = "to",
    method = "Standard",
    agerange = "excluded"
  )
```

The resulting data frame `muen_prep` is then fed into the life table function `life.table`.
```{r}
# calculate life.table
muen_result <- muen_prep %>%
  life.table()
```

```{r, echo=FALSE}
options(width = 999)
```

First, the tabular output is created.

```{r}
muen_result
```

A visual inspection can be of great help, especially when two or more groups are to be compared.

```{r}
muen_result %>%
  plot(display = c("dx", "qx", "lx", "ex", "rel_popx"))
```

#### Mortality patterns

The maximum within the proportion of deaths (dx) of all individuals from Münsingen-Rain is around 40 years. Younger individuals below 20 are largely missing, and the ages of 50 and above are also only weakly represented. The apparent lack of subadult individuals in particular raises questions concerning the representativeness of the data. Even a superficial look at any modern life table leads to the expectation of far higher numbers of younger individuals. Therefore, the obvious conclusion seems to be that not everybody was buried within the cemetery.

However, there are also differences between the adult individuals. The maximum for females lies in their 40s, while male individuals apparently died somewhat later, in their 50s. There are virtually no female individuals older than 55, but many males above 60. Accordingly, female individuals had a much higher probability of death (qx) and lower survivorship (lx) in all age categories. Life expectancy differs by nearly 13 years (35.0 for females vs. 47.9 for males), but this is also due to the fact that it was obviously possible to sex six subadult females, but nearly no subadult males.

As a result, the analysis leaves more questions open than it answers: The cemetery is obviously not representative, as is demonstrated by the largely missing subadult individuals. However, even among the adult individuals, the highly differing mortality pattern between males and females raises the question of whether this was really due to differing life expectancy or to differing cultural preferences in burying males and females.
We invite the reader to regroup the data with the variables "latene_phase" and "grave_goods" and compare her results with the conclusions by P. Hinton [-@hinton_analysis_1986].

### Magdalenenberg

#### Archaeological background

One of the biggest burial mounds in Central Europe is the so-called ['Magdalenenberg'](https://en.wikipedia.org/wiki/Magdalenenberg) near Villingen-Schwenningen in Baden-Württemberg. Within the Early Iron Age it dates to the beginning of the Upper Hallstatt period (c. 620--450 calBC). It had a diameter of over 100 metres and still has a height of 6.5 metres. For its complete excavation, 45,000 cubic metres of earth and stone had to be moved.

The first excavations took place in 1887 and 1890 and are in an excellent state of preservation. A modern excavation was carried out between 1970 and 1973 by Konrad Spindler. The central grave was probably erected in 616 BC, as established by dendrochronology [@billamboz_neyses_fuerstengrab_1999]; it had already been reopened in ancient times, but the wooden chamber was still in pristine condition when excavated. Furthermore, more than 100 additional burials were found which had been dug into the burial mound a relatively short period after the installation of the central grave. These proved to be of great importance for relative chronology as well as for the social interpretation of the Hallstatt period.

The importance of the 'Magdalenenberg' derives not only from the excellent preservation and the many burials that were uncovered, but also from the fact that the excavator Spindler did not wait with the publication until everything was excavated or studied. He started to publish the raw results as early as 1971, when the excavation was still under way. By 1980 all graves were published in six volumes [@spindler_magdalenenberg_i; -@spindler_magdalenenberg_ii; -@spindler_magdalenenberg_iii; -@spindler_magdalenenberg_iv; -@spindler_magdalenenberg_v; -@spindler_magdalenenberg_vi].
In recent years the anthropological material has been subjected to stable isotope analyses [@oelze_et_al_analysis_2012]. In this context the whole material was analysed again by established anthropological methods to get more information on the age and sex of the deceased. This resulted in a paper on the demographic structure of the population of 'Magdalenenberg' [@zaeuner_wahl_magdalenenberg_2013]. The new analyses were able to refine the results of the 1970s in some respects, but for the purposes of this paper it is important to emphasize that, in general, the two are congruent.

#### The data set

The aforementioned paper [@zaeuner_wahl_magdalenenberg_2013] provides a complete table of the findings, but as this is very detailed, it is easier to refer to the life table the authors generated. They refrained from including an age range of 0--1, so we followed them in this. As explained above, mortAAR allows this easily. For the oldest age group, we assumed, similar to Münsingen, a maximal age of 69 (because the age ranges are in this case inclusive).

Because of one peculiarity, their life table differs from the one mortAAR generates: they estimated the years lived by the oldest age group, 4.72 individuals, as 0, which means that they assumed that these individuals died immediately upon reaching their 60th birthday. This does not seem very plausible to us, and we think that the result of mortAAR is more realistic (average death in the middle of the age class). The differences apply specifically to the sum of lived years (Lx) and thus also to Tx, ex and the relative population measure (rel_popx).

#### Constructing the life table

The preparatory steps for the construction of the life table of the population from 'Magdalenenberg' are less extensive than for Münsingen-Rain because the data already came from a life table and are thus better prepared.

```{r}
# load dataset magdalenenberg
mag <- magdalenenberg
```

The only replacement necessary concerns the oldest age group.
```{r}
# close the open-ended oldest age group at 69 (inclusive)
mag <- mag %>%
  replace(mag == "60-x", "60-69")
```

```{r}
# split string column and convert both bounds to numeric
mag <- mag %>%
  tidyr::separate(a, c("from", "to")) %>%
  transform(from = as.numeric(from), to = as.numeric(to))
```

As `method` we choose "Equal5" this time. This way we get 5-year intervals throughout, as in the original life table. The closing year is meant to be counted as well, therefore `agerange` is set to "included". The biggest difference to the Münsingen example concerns the number of deceased. Within the Münsingen data each line concerns one individual, while within the Magdalenenberg data each line (= age range) concerns more than one individual. This is why it is mandatory this time to define `dec`. Because we analyze all individuals together, no grouping variable is specified and therefore only one life table will be computed.

```{r}
mag_prep <- mag %>%
  prep.life.table(
    dec = "Dx",
    agebeg = "from",
    ageend = "to",
    method = "Equal5",
    agerange = "included"
  )
```

Finally, the prepared table is fed into the function `life.table`, and the result is printed as well as plotted.

```{r}
mag_result <- mag_prep %>%
  life.table()

mag_result
```

For the purpose of this example, we limit the graphical output to the proportion of deaths (dx), probability of death (qx) and life expectancy (ex).

```{r}
mag_result %>%
  plot(display = c("dx", "qx", "ex"))
```

#### Mortality patterns

Very similar to the Münsingen example, the maximum within the proportion of deaths (dx) of all individuals lies in early adulthood, this time around 30 years. Again, younger individuals below 20 are largely missing, as are those aged 50 and above. Because of the lack of younger individuals, life expectancy at birth is relatively high at 32.2 years. As already observed by Zäuner and Wahl, the fact that life expectancy declines steadily from birth onwards is completely atypical.
Remarkable, but certainly an artefact of the necessarily artificial age ranges, is the steep decline of the probability of death (qx) after 40 and 60, respectively.

### Comparing Early Iron Age (Magdalenenberg) and Late Iron Age (Münsingen-Rain)

As a last step of our analysis, we will compare the curves of the life tables of the cemeteries from the 'Magdalenenberg' and Münsingen-Rain.

#### Constructing the life table

The elements of the output of the function `prep.life.table` can be addressed separately by their grouping names. We use this option to collect the results of the above examples Münsingen-Rain and Magdalenenberg. For 'Magdalenenberg', with no grouping variable, the name of the corresponding `data.frame` within the list of results is "Deceased" by default. Because we want to have comparable data from Münsingen-Rain, we choose the output for all individuals (`data.frame` named "All" by default).

```{r}
comp <- list(mag_prep$Deceased, muen_prep$All)
```

To have meaningful names for the graphical output, we change these.

```{r}
names(comp) <- c("Magdalenenberg", "Muensingen")
```

Again, the newly formed and renamed list is then fed into the function `life.table`.

```{r}
comp_result <- comp %>%
  life.table()
```

We can skip the tabular output because this has already been provided above and go straight to the diagrams.

```{r}
comp_result %>%
  plot(display = c("dx", "qx", "lx", "ex"))
```

#### Mortality patterns

The similarities, but also the differences between the data sets are striking. On the one hand, the general shape of the curves of the proportion of deaths (dx) is the same, with very low values in the younger and older age classes. By far the most prominent age group in both data sets are the middle-aged individuals. However, it seems that the peak for the Magdalenenberg individuals is in the 30s, while for the Münsingen population it is 10 years later.
This difference is also reflected in the other diagrams: The probability of death (qx) of the Magdalenenberg individuals is higher in nearly all age categories from 20 onwards, and their survivorship (lx) decreases faster. Consequently, while the Münsingen population has a life expectancy of 38.0 years at birth, that of the population interred in Magdalenenberg is only 32.2 years. This difference in the remaining life expectancy stays more or less the same up to the highest age group.

The differences are difficult to explain. First, of course, we have to assess the possibility that the age difference reflects differences present in real life. However, as a selection is obvious in the case of the subadult individuals, it seems hard to exclude a similar selection among the older individuals. Therefore, another possible reason for the differences could be different cultural preferences in selecting the individuals to be interred in the cemetery. This would mean that younger adult individuals were preferred for interment during the Upper Hallstatt period and older ones during the Latène period. Finally, since for both data sets it has been emphasized that the skeletal material underwent repeated studies which reached very similar results, it seems hard to attribute the differences to divergent age estimation by different anthropologists. However, a possible reason could be the varying material that survived. For Münsingen, due to selection processes after excavation, only skulls are available for analysis, while for the Magdalenenberg whole skeletons exist.

In the past, scholars have taken very different stances on these peculiar age distributions and differences. P. Jud [-@jud_untersuchungen_1998, 129] already regarded the male individuals from Münsingen-Rain as too old in comparison with the population from Magdalenenberg. However, from a paleodemographic perspective the opposite could also be argued, namely that the population of Magdalenenberg is too young.
It is neither the aim nor within the scope of this paper to solve this issue. However, in line with a recent overview by Eggl [-@eggl_ueberlegungen_2009], we would like to emphasize the need for more systematic studies that take into account not only the paleodemographic, but also the archaeological uncertainties.

## Conclusion

Currently, paleodemography seems to be undergoing a sea change. While the criticism from the early 1980s about the representation of aging characteristics [@bocquet-appel_masset_farewell_1982] was fervently rebutted at the time [e.g., @van_gerven_armelagos_rumors_1983], the problems of arriving at mortality patterns are now common knowledge. Some recent solutions by paleodemographers have been to remodel life tables, to introduce complex statistical methods such as Bayesian statistics or to invert the logic of aging skeletons [see the examples in @hoppa_vaupel_paleodemography_2002].

Our own attempt at life tables is (still) a far cry from the recent course of research. The reason why we still think this method is valid is simple: While we do not dismiss the problems of life tables with their assumption of stationary populations, we still think that even in their simple form they provide useful information, not necessarily for the paleodemographer, but for the archaeologist. What tends to be forgotten is that the treatment of the deceased is always a conscious decision of the surviving members of the society in question. Even the non-treatment of the dead, e.g., by excarnation, still demands a decision. Therefore, even if the age pattern is as distorted as in the examples above -- and the question remains whether such a situation should not be regarded as the norm for much of prehistory --, it still provides highly relevant information about cultural behaviour and preferences.
To glean even more information on prehistoric practices, it is of paramount importance that archaeologists, anthropologists and paleodemographers work more closely together than they have in the past to get to grips with actual population developments, the representativeness of burial data and possible sources of bias. If our package helps in this process, we would be more than happy.

---

## References