For most sets of species, particularly invertebrates and other small cryptic taxa, the number of species in any site or habitat can only be estimated from the number found in a sample (of n replicates) taken at some spatial scale in that site or habitat. Here we explore methods to scale up estimates of the number of species from a set of samples (about 1.25 m²) to a site (5 m x 4 m) and then to a whole rocky shore (say 150 m x 50 m). If one knows, for any single site (i), the true number of species (Ni) and the number found in a sample of a particular size (ni), one can calculate the probability of finding a species in a sample of size n (p = ni/Ni) and use this probability to estimate the number of species in any other site (j) from a similar sample (Nj = nj/p = (Ni/ni) x nj). This procedure assumes (a) that the relationship between Ni and ni is consistent among times of sampling and among sites and shores (i.e. that spatial relationships among species do not change), (b) that n is large enough to minimise sampling error and (c) that the probability of finding a species in a sample is well-measured by whether a species is or is not found in a single sample.
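The first method can be sketched in a few lines of Python. This is only an illustration of the ratio described above: the function names and the example counts are hypothetical and are not data from this study.

```python
def detection_probability(n_i: int, N_i: int) -> float:
    """Probability p = ni/Ni of finding a species in a sample,
    calibrated at a site where the true richness Ni is known."""
    if N_i <= 0 or not 0 < n_i <= N_i:
        raise ValueError("require 0 < n_i <= N_i")
    return n_i / N_i


def estimate_richness(n_j: int, p: float) -> float:
    """Estimated richness Nj = nj/p at another site, assuming the
    same p applies there (assumption (a) in the text)."""
    if not 0 < p <= 1:
        raise ValueError("p must lie in (0, 1]")
    return n_j / p


# Hypothetical counts: 25 of 50 known species were sampled at site i,
# and 20 species were sampled at site j.
p = detection_probability(n_i=25, N_i=50)
estimate = estimate_richness(n_j=20, p=p)
print(f"p = {p:.2f}, estimated Nj = {estimate:.1f}")
```

Because the estimate is a simple ratio, any error in p carries over proportionally to Nj, which is why assumptions (a) to (c) matter.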
It should be possible to improve this estimate by taking several samples (say 4) in a site. These can be used to estimate the probability of a species being found in all 4 samples (p4), in 3 of them (p3), in only 2 (p2) or in only 1 sample (p1). These probabilities can then be used in other sites to estimate the total number of species in the site from the number found in a sample. This sort of method takes into account the ways different species are distributed (spatial variance) and therefore their frequency of occurrence in samples. It should provide a more reliable estimate of the relationship between the true number of species and the number sampled. One can therefore predict that this method should under- or overestimate the true number of species to a smaller degree than a method that assumes all species are equally likely to be sampled. There is, however, more effort (and thus cost) in establishing the relationship between Ni and ni using multiple sets of samples. If this additional effort is substantial and brings little gain in accuracy, it may not be worthwhile.
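One possible way to compute these occurrence probabilities is sketched below in Python. The text does not specify how p1 to p4 are combined into an estimate, so the scaling step here (dividing the number of species observed by 1 - p0, where p0 is the proportion of known species found in no sample) is an assumption made for illustration. The function names, species labels and counts are all hypothetical.

```python
from collections import Counter


def occurrence_profile(presence: dict, total_species: int) -> dict:
    """Return {k: proportion of the site's species found in exactly k samples}.

    presence maps each sampled species to a list of booleans, one per sample.
    total_species is the known true richness Ni of the calibration site.
    """
    lengths = {len(flags) for flags in presence.values()}
    if len(lengths) != 1:
        raise ValueError("every species needs one flag per sample")
    n_samples = lengths.pop()
    found = sum(1 for flags in presence.values() if any(flags))
    if total_species < found:
        raise ValueError("total_species is smaller than the number sampled")
    counts = Counter(sum(flags) for flags in presence.values())
    profile = {k: counts.get(k, 0) / total_species
               for k in range(1, n_samples + 1)}
    # Species known to be in the site but missed by every sample.
    profile[0] = (total_species - found) / total_species
    return profile


def estimate_from_profile(species_observed: int, profile: dict) -> float:
    """Assumed scaling step: Nj = (species seen in the samples) / (1 - p0)."""
    if profile[0] >= 1:
        raise ValueError("no species were detected at the calibration site")
    return species_observed / (1 - profile[0])


# Hypothetical calibration site: 5 species known, 4 samples taken.
calibration = {
    "sp_A": [True, True, True, True],
    "sp_B": [True, True, False, False],
    "sp_C": [True, False, False, False],
    "sp_D": [False, True, False, False],
}
profile = occurrence_profile(calibration, total_species=5)
estimate = estimate_from_profile(species_observed=8, profile=profile)
print(profile, round(estimate, 1))
```

Unlike the single-ratio method, the profile records how unevenly species turn up across samples, so rarely sampled species (those contributing to p1 and p0) are represented explicitly in the calibration.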
This was evaluated by examining relationships between N (number of species found on 4 intertidal rocky shores), Ni (number of species per shore), Nij (number of species in each of 4 sites on a shore) and nijk (the number of species found in a set of replicate samples in each site on each shore).
Preliminary results indicate that there can be substantial error in scaling up from samples of organisms on rocky shores to estimate numbers of species. Using the first method of calculating probabilities, the number of species was under- or overestimated by more than 20%, whether one was estimating species richness at the scale of sites or of the entire shore. Although increasing the number of samples from which the probabilities were calculated decreased the magnitude of the over- or underestimation, this decrease was not very large, even when the effort was increased more than 4-fold.
As expected, the second method of calculating the probabilities estimated the number of species more reliably, at the scale of sites and shores. Nevertheless, a substantial increase in reliability seems to require very large sample sizes.
A summary of the results obtained will be added to this report as soon as they become available.