I tend to like non-hierarchical classification. Hierarchical classification is designed to optimize the hierarchy, while non-hierarchical classification optimizes the groups.
Your approach (non-hierarchical classification – take the centroids and use them in hierarchical classification & ordination) is standard. It is by far the most expedient strategy when you have a lot of objects or complexity.
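To make the two-stage idea concrete, here is a minimal sketch in plain Python. It is not PATN's algorithm: the k-means style allocation, the Euclidean distance, the average-linkage fusion and the synthetic data are all assumptions chosen for illustration, and the function names are hypothetical. The ordination step on the centroids is left out.

```python
import math
import random

def dist(a, b):
    """Euclidean distance between two points (an assumed association measure)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def centroid(points):
    """Attribute-wise mean of a non-empty list of points."""
    return tuple(sum(col) / len(points) for col in zip(*points))

def allocate(objects, centroids):
    """Assign each object to the index of its nearest centroid."""
    return [min(range(len(centroids)), key=lambda g: dist(o, centroids[g]))
            for o in objects]

def kmeans(objects, k, iterations=50, seed=0):
    """Stage 1: non-hierarchical grouping (simple k-means style allocation)."""
    centroids = random.Random(seed).sample(objects, k)
    for _ in range(iterations):
        labels = allocate(objects, centroids)
        updated = []
        for g in range(k):
            members = [o for o, lab in zip(objects, labels) if lab == g]
            # Keep the old centroid if a group ends up empty.
            updated.append(centroid(members) if members else centroids[g])
        if updated == centroids:
            break
        centroids = updated
    return centroids, allocate(objects, centroids)

def agglomerate(centroids):
    """Stage 2: average-linkage fusion of the group centroids."""
    clusters = {i: [i] for i in range(len(centroids))}
    merges = []
    while len(clusters) > 1:
        best = None
        keys = sorted(clusters)
        for i, a in enumerate(keys):
            for b in keys[i + 1:]:
                pairs = [(p, q) for p in clusters[a] for q in clusters[b]]
                d = sum(dist(centroids[p], centroids[q]) for p, q in pairs) / len(pairs)
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merges.append((sorted(clusters[a]), sorted(clusters[b]), d))
        clusters[a] = clusters[a] + clusters.pop(b)
    return merges

# Synthetic data: 200 objects scattered around five made-up centres.
rng = random.Random(1)
centres = [(0, 0), (5, 5), (0, 5), (5, 0), (10, 10)]
objects = [(rng.gauss(cx, 0.5), rng.gauss(cy, 0.5))
           for cx, cy in centres for _ in range(40)]

centroids, labels = kmeans(objects, k=5)
merges = agglomerate(centroids)
for left, right, d in merges:
    print(f"fuse groups {left} and {right} at distance {d:.2f}")
```

The point of the design is that the expensive pairwise comparison in stage 2 runs over 5 centroids rather than 200 objects, which is why the strategy scales to large data sets.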
But as you say, how many groups? I don’t find this all that hard, as I tend to focus on what results/outcomes I’m trying to communicate. If, for example, I’m dealing with ‘management’, I produce 5 groups, as 5 different things is about as much as most managers want to hear about.
It comes down to ‘naming’ the groups. There is good reason to take the number of groups up to the point where naming is no longer easy, then back off, knowing that you have a good notion of what the variation is and whether the groups are, say, ‘ecologically justifiable’, reproducible, etc.
CANOCO – I don’t like it for a number of reasons, the most important of which is that it blends exploratory analysis (ordination) with confirmatory analysis (regression). I think this is philosophically abhorrent. In PATN, I take the approach of letting the data produce the patterns (ordination/groups) and then interpreting these patterns with extrinsic/environmental data. I don’t believe in mixing the two. Anyway, that’s my rationale and, for the foreseeable future, that’s what PATN will look like.
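As a rough illustration of keeping the two steps separate, the sketch below takes group labels that were produced from the intrinsic data alone and only afterwards summarises an extrinsic variable by group. It is an assumed, simplified stand-in, not a description of PATN's evaluation tools; the function name, the labels and the rainfall values are all made up.

```python
from statistics import mean

def summarise_by_group(labels, values):
    """Per-group (count, mean, min, max) of an extrinsic variable.

    The labels come from a classification that never saw `values`;
    the variable is used only to interpret the groups afterwards.
    """
    groups = {}
    for label, value in zip(labels, values):
        groups.setdefault(label, []).append(value)
    return {g: (len(v), mean(v), min(v), max(v))
            for g, v in sorted(groups.items())}

# Hypothetical example: six sites already allocated to three groups.
group_labels = [0, 0, 1, 1, 2, 2]
rainfall_mm = [410.0, 450.0, 820.0, 790.0, 1200.0, 1260.0]  # made-up values

summary = summarise_by_group(group_labels, rainfall_mm)
for group, (n, avg, low, high) in summary.items():
    print(f"group {group}: n={n}, mean={avg:.0f} mm, range={low:.0f}-{high:.0f} mm")
```

If the groups separate cleanly on a variable they were never built from, that is some evidence the pattern is real rather than an artefact of the analysis.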
My aim is to make the analysis more powerful (yet simple in concept), more robust, the tools easier to use and the interface/graphics fun.