Reply To: Two Step vs Bray & Curtis

May 9, 2008 at 4:41 am #509

Keymaster

With 380 species (and a high ordination stress), it seems appropriate to deal with the species first. There are two ways that this can be done.

You could run a classification of the sites (using Bray and Curtis or Kulczynski, as the two are very similar) and then look at the Box and Whisker plots to get an idea of how the species are discriminating the groups (and, to some extent, relating to each other). You could also run a classification of species using two-step to produce a dendrogram and species groups. In reality, PATN does both in a single ‘analysis’ run anyway.
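To see why the two site measures behave so similarly, here is a minimal sketch of both in plain Python. It uses the standard textbook definitions, not PATN's own code, and the function names and example sites are made up for illustration.

```python
def bray_curtis(x, y):
    """Bray and Curtis dissimilarity between two sites (non-negative values)."""
    total = sum(x) + sum(y)
    if total == 0:
        return 0.0  # assumption: two empty sites are treated as identical
    return sum(abs(a - b) for a, b in zip(x, y)) / total


def kulczynski(x, y):
    """Quantitative Kulczynski dissimilarity between two sites."""
    shared = sum(min(a, b) for a, b in zip(x, y))
    sx, sy = sum(x), sum(y)
    if sx == 0 or sy == 0:
        # assumption: an empty site is identical only to another empty site
        return 0.0 if sx == sy else 1.0
    return 1.0 - 0.5 * (shared / sx + shared / sy)


# Hypothetical abundances for five species at three sites.
site_a = [3, 0, 1, 5, 0]
site_b = [1, 2, 0, 4, 0]
site_c = [0, 4, 0, 0, 2]

print(bray_curtis(site_a, site_b))           # 0.375
print(round(kulczynski(site_a, site_b), 3))  # 0.365
print(bray_curtis(site_a, site_c))           # 1.0 (no shared species)
```

When two sites have the same total abundance the two measures give exactly the same value; they only drift apart as the site totals become unequal.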

In fact, when you put the site and species classification together in a two-way table, you have probably the best way of looking at this style of dataset. This is the table that I would spend a fair time with – it will show you the interaction between sites and species. If you can’t make a good story from that, you have other problems.

Before I go further, what values are you using? Counts, abundance, presence/absence? With this style of data, a large proportion of the ‘information’ in the dataset is in the presence/absence component. Adding counts/cover/abundance adds a lot of noise, so if you really want to take some form of abundance into account, I’d recommend considering a transformation onto a 0-5 type scale. This reduces the noise (and the stress!). Just use the most appropriate transformation to get the data into this form. I usually transform such data into integers 0,1,2,3,4 and 5.
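As an illustration only, one way to get counts onto a 0-5 scale is with doubling class boundaries. The cut points below (1, 2, 4, 8, 16) are an assumption for the sketch, not a PATN default; pick whatever boundaries suit your data.

```python
from bisect import bisect_right

# Assumed lower bounds of classes 1..5; adjust to suit the data.
CUTS = [1, 2, 4, 8, 16]


def recode_0_to_5(value):
    """Map a raw count or cover value onto the integers 0..5."""
    if value <= 0:
        return 0  # absent stays absent
    # Any positive value scores at least 1, so presence is never lost.
    return max(1, bisect_right(CUTS, value))


def presence_absence(value):
    """Reduce a raw value to 1 (present) or 0 (absent)."""
    return 1 if value > 0 else 0


raw = [0, 1, 3, 12, 250]  # hypothetical counts for one species
print([recode_0_to_5(v) for v in raw])     # [0, 1, 2, 4, 5]
print([presence_absence(v) for v in raw])  # [0, 1, 1, 1, 1]
```

Note how the count of 250 ends up only one class above the count of 12: the presence/absence signal is kept while very large counts are compressed.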

You would not normally use two-step for site classification, nor anything other than two-step for species classification! Also, whatever association measure is used for the SITE classification is the one used for the ordination. Full stop.

As my help notes state, I’d like to see ordination stress below ~0.15 to be happy about publishing anything. To achieve this, a range of options is available (and they can be combined in various ways):

Recode as noted above.

Eliminate redundant and noisy species using the species results information noted above.

Produce species groups and then use these as new variables for a site classification. This last option is pretty neat, especially when combined with the elimination of noisy, redundant or ubiquitous/common species.
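The last two options can be sketched roughly as follows. This is done outside PATN, and everything in it is hypothetical: the species names, the group labels (which in practice would come from the two-step species classification), and the simple frequency thresholds, which only stand in for a proper judgement of which species are noisy or redundant.

```python
def drop_by_frequency(table, min_sites=2, max_fraction=0.9):
    """Keep species found at >= min_sites sites and <= max_fraction of all sites.

    table maps species name -> list of values, one per site.
    """
    kept = {}
    for species, values in table.items():
        occupied = sum(1 for v in values if v > 0)
        if occupied >= min_sites and occupied / len(values) <= max_fraction:
            kept[species] = values
    return kept


def sum_by_group(table, groups):
    """Collapse species into group variables by summing values per site."""
    out = {}
    for species, values in table.items():
        label = groups[species]
        current = out.get(label, [0] * len(values))
        out[label] = [a + b for a, b in zip(current, values)]
    return out


# Hypothetical recoded (0-5) data: five species at four sites.
table = {
    "sp_a": [1, 2, 0, 0],
    "sp_b": [2, 1, 0, 1],
    "sp_c": [0, 0, 3, 2],
    "sp_d": [1, 1, 1, 1],  # at every site: ubiquitous
    "sp_e": [0, 0, 0, 4],  # at a single site only
}
groups = {"sp_a": "G1", "sp_b": "G1", "sp_c": "G2"}  # assumed species groups

filtered = drop_by_frequency(table)
print(sorted(filtered))                # ['sp_a', 'sp_b', 'sp_c']
print(sum_by_group(filtered, groups))  # {'G1': [3, 3, 0, 1], 'G2': [0, 0, 3, 2]}
```

The group totals then become the new variables for the site classification, so 380 species might collapse to a few dozen columns.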

I’ve never found these techniques to fail, but expect to run several iterations of the analysis, which is normal anyway.

Hope this helps.

Lee

PATN - Finding Patterns in Data