lee

Forum Replies Created

Viewing 15 posts - 1 through 15 (of 59 total)
    in reply to: A tip for reducing stress #527
    lee
    Keymaster

    Once outliers are identified, they disappear as a problem. Admittedly, you need to get rid of them for ordination to be most effective.

    Re Monte-Carlo, ok. I should have referred to ‘MCAO’ as a permutation test rather than MC. Oh well, it’s in the system now.

    Re CANOCO, it is up to you in PATN which variables you use in an ordination (SSH), the intrinsic variables, and which you treat as ‘environmental’ (extrinsic) variables. Standardizing variables is a serious issue in itself. Abundance data usually needs some transformation, but not standardization. Environmental data is the opposite.

    in reply to: A tip for reducing stress #525
    lee
    Keymaster

    Classification in general is far less badly affected by outliers than ordination is. MDS is based on a regression (between input associations and reduced space associations). Non-hierarchical classification should not be unduly affected by outliers. PATN should identify an outlier as its own ‘group’, and this group should only attract objects that are closer to it than to any other group.

    Not sure what Monte-Carlo process you are referring to.

    The non-hierarchical process in PATN v3+ uses random seeds (see the help for the complete process). While this sounds odd, it works fine in practice. In the DOS version of PATN, you could use any seed file.

    Lee

    in reply to: A tip for reducing stress #523
    lee
    Keymaster

    I tend to like non-hierarchical classification. Hierarchical classification is designed to optimize the hierarchy while the non-hierarchical classification optimizes the groups.

    Your approach (non-hierarchical classification – take centroids and use them in hierarchical classification & ordination) is a standard one. It is by far the most expedient strategy when you have a lot of objects or complexity.

    But as you say, how many groups? I don’t find this all that hard as I tend to focus on what results/outcomes I’m trying to communicate. If, for example, I’m dealing with ‘management’, I produce 5 groups, as 5 different things is about as much as most managers want to hear about.

    It comes down to ‘naming’ the groups. There is good reason to take the number of groups to the point where naming is not easy. Then back off, knowing that you have a good notion of what the variation is, and whether the groups are, say, ‘ecologically justifiable’, reproducible etc.

    CANOCO – I don’t like it for a number of reasons – the most important of which is that it blends exploratory analysis (ordination) with confirmatory analysis (regression). I think this is philosophically abhorrent. In PATN, I take the approach of letting the data produce the patterns (ordination/groups) and then interpreting these patterns with extrinsic/environmental data. I don’t believe in mixing the two. Anyway – that’s my rationale and, for the foreseeable future, that’s what PATN will look like.

    My aim is to make the analysis more powerful (yet simple in concept), more robust, the tools easier to use and the interface/graphics fun.

    Lee

    in reply to: A tip for reducing stress #521
    lee
    Keymaster

    Hi Mark

    Masking out (intrinsic) variables with low PCC correlation should reduce stress in ordinations. Ditto if you used the Kruskal-Wallis values from the box and whisker plots (but this is one step removed).

    It is important however that the process should not be totally ‘mechanical’. Always seek to understand the trends in the ordination space and the identity of any groups from a classification. In both cases, examination of both intrinsic and extrinsic variables is important.

    in reply to: Reducing Stress #514
    lee
    Keymaster

    Mike

    You should take a look at Austin, M.P., Williams, O. & Belbin, L. (1981). Grassland dynamics under sheep grazing in an Australian Mediterranean type climate. Vegetatio 47, 201-211. This gives you an idea of how you could analyze temporal and disturbance factors in ‘noisy’ datasets.

    If there are independent environmental/disturbance factors, then it is possible that more than 3 dimensions are required for the ordination. An overlaid MST will also give a good indication of this. See my notes on it.

    I would never publish an (MDS) ordination result without the stress value (or variance in a PCA/PCoA).

    If you have very disjunct communities, then ordination is difficult. Take a look at the histogram of association measures and see the notes on this in the help. It will tell you if there are disjunctions. If there are, then there is a strong case for independent analysis of each disjunct group. BTW: Disjunctions = high stress.

    Lee

    in reply to: Reducing Stress #510
    lee
    Keymaster

    Mark, I think my reply to your first post on Bray and Curtis vs Kulczynski addresses at least a fair part of this.

    Stress of 0.2 is still too high for my comfort.

    I’ll have to go over the fidelity stuff to remember the details, but I’m unsure whether you are referring to my DOS PATN approach to constancy and fidelity (with presence/absence data) or Mike Bedwood’s fidel stuff. I think Mike extended my work.

    Lee

    in reply to: Two Step vs Bray & Curtis #509
    lee
    Keymaster

    With 380 species (and a high ordination stress), it seems appropriate to deal with the species first. There are two ways that this can be done.

    You could run a classification of the sites (using Bray and Curtis or Kulczynski as both are very similar) and then look at the Box and Whisker plots to get an idea of how the species are discriminating the groups (and to some extent, relating to each other). You could also run a classification of species using two-step to produce a dendrogram and species groups. In reality, PATN does both in a single ‘analysis’ run anyway.
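
    As a rough illustration of why the two measures tend to agree, here is a minimal Python sketch using the textbook definitions of Bray and Curtis and (quantitative) Kulczynski dissimilarity. The function names and the two site vectors are made up for the example, and PATN's own implementation may differ in detail.

```python
def bray_curtis(x, y):
    """Bray and Curtis dissimilarity: sum|x - y| / sum(x + y)."""
    num = sum(abs(a - b) for a, b in zip(x, y))
    den = sum(a + b for a, b in zip(x, y))
    return num / den if den else 0.0


def kulczynski(x, y):
    """Quantitative Kulczynski: 1 - 0.5 * (shared/sum(x) + shared/sum(y))."""
    shared = sum(min(a, b) for a, b in zip(x, y))
    sx, sy = sum(x), sum(y)
    if sx == 0 or sy == 0:
        # Two empty sites are treated as identical, one empty site as fully different.
        return 0.0 if sx == sy else 1.0
    return 1.0 - 0.5 * (shared / sx + shared / sy)


# Hypothetical abundances of five species at two sites.
site_a = [3, 0, 2, 5, 0]
site_b = [1, 4, 2, 0, 0]

print(round(bray_curtis(site_a, site_b), 3))  # 0.647
print(round(kulczynski(site_a, site_b), 3))   # 0.636
```

    When the two sites have the same total abundance, the two formulas give exactly the same value; they drift apart only as the site totals become more unequal.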

    In fact, when you put the site and species classification together in a two-way table, you have probably the best way of looking at this style of dataset. This is the table that I would spend a fair time with – it will show you the interaction between sites and species. If you can’t make a good story from that, you have other problems.

    Before I go further, what values are you using? Counts, abundance, presence/absence? With this style of data, a large proportion of the ‘information’ in the dataset is in the presence/absence component. Adding counts/cover/abundance adds a lot of noise, so if you really want to take some form of abundance into account, I’d recommend considering a transformation onto a 0-5 type scale. This reduces the noise (and the stress!). Just use the most appropriate transformation to get the data into this form. I usually transform such data into integers 0,1,2,3,4 and 5.
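
    One possible way to do such a recode is sketched below in Python. The class boundaries are hypothetical (percent cover is assumed purely for illustration); choose whatever boundaries suit your data.

```python
from bisect import bisect_left

# Hypothetical upper bounds (percent cover) for classes 0..4.
# Anything above the last bound falls into class 5.
UPPER_BOUNDS = [0, 5, 25, 50, 75]


def recode(values, bounds=UPPER_BOUNDS):
    """Map raw values onto integer classes 0-5, where 0 means absent."""
    return [bisect_left(bounds, v) for v in values]


site = [0, 3, 5, 6, 25, 50, 75, 100]
print(recode(site))  # [0, 1, 1, 2, 2, 3, 4, 5]
```

    A value equal to a boundary stays in the lower class (5 maps to class 1, 75 to class 4), which is a design choice of this sketch rather than a rule.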

    You would not normally use two-step for site classification or anything other than two-step for species classifications! Also, whatever association measure is used for the SITE classification IS used for the ordination. Full stop.

    As my help notes state, I’d like to see ordination stress below ~0.15 to be happy about publishing anything. To achieve this, a range of options are available (which can be combined in various configurations):

    1. Recode as noted above.

    2. Eliminate redundant and noisy species using the species results information noted above.

    3. Produce species groups and then use these as new variables for a site classification. This last option is pretty neat, especially when combined with the elimination of noisy, redundant or ubiquitous/common species.

    I’ve never found these techniques to fail, but iterations of the analysis are required, which is normal anyway.

    Hope this helps.

    Lee

    in reply to: Alocation Radius in ALOC #508
    lee
    Keymaster

    Hi Mark

    The allocation radius does control the number of groups produced in the non-hierarchical classification in PATN.

    Experience suggested however that most people wanted to select the number of groups rather than having a stab at an allocation radius that would produce a reasonable number of groups. Most tended to manipulate the radius to get a desired number of groups.

    What PATN does is to use a branch and bound approach to manipulate the allocation radius to home in on the user-selected number of groups. In some cases, hitting this number is not possible.
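
    The general idea of searching for a radius that yields a requested number of groups can be sketched as follows. This is not PATN's ALOC code: it assumes a simplified leader-style allocation on one-dimensional data and uses plain bisection as a stand-in for the branch and bound search. All names are hypothetical.

```python
def allocate(points, radius):
    """Leader-style allocation: join the nearest seed within radius, else start a new group."""
    seeds, labels = [], []
    for p in points:
        dists = [abs(p - s) for s in seeds]
        if dists and min(dists) <= radius:
            labels.append(dists.index(min(dists)))
        else:
            seeds.append(p)
            labels.append(len(seeds) - 1)
    return labels


def radius_for_groups(points, target, max_iter=50):
    """Bisect on the radius; return the (radius, n_groups) pair closest to the target."""
    lo, hi = 0.0, max(points) - min(points)
    best = None
    for _ in range(max_iter):
        mid = (lo + hi) / 2.0
        n = len(set(allocate(points, mid)))
        if best is None or abs(n - target) < abs(best[1] - target):
            best = (mid, n)
        if n == target:
            break
        if n > target:
            lo = mid  # too many groups: try a larger radius
        else:
            hi = mid  # too few groups: try a smaller radius
    return best


points = [0.0, 0.1, 0.2, 5.0, 5.1, 9.0]
print(radius_for_groups(points, 3))  # (2.25, 3)
```

    With data such as `[0.0, 0.0, 1.0, 1.0]` and a target of 3 groups, no radius in the search range gives exactly 3, so the sketch returns the closest result it found, mirroring the point that hitting the requested number is not always possible.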

    Is there any reason you want to work with the allocation radius directly?

    Lee

    in reply to: Re- SUpply of PATN 3.1 #507
    lee
    Keymaster

    Hi Max

    That depends on a few factors. Here are the situations and associated options:

    1. You have stored the downloaded PATN distribution file. If so, simply run this on your new computer. Once PATN is installed, if it is not the latest version, use File | Check for Update and follow the instructions to update it.

    2. You don’t have the file, but you do have the email that was sent to you at the time of purchase and you did purchase the eSellerate Download Service. The latter stores the PATN distribution file for you for a year after the date of purchase. In the email that was sent, there is a URL to redownload PATN. In the email is also your licence key. Again, if it is not the latest version, do as in (1) to update it.

    3. You have lost everything or most of it! In this case, it is still no problem as we have your purchase records and can try to get PATN and your licence key etc. to you. The hassle with this is that most email systems will block anything resembling executable code. If you have an ftp site, let me know as that will work. Just email me.

    If your old system is junked, let me know. I can update the authentication records.

    If you have any problems, email me at patninfo@patn.com.au and I’ll see what I can do to help.

    Lee

    in reply to: Comparing dendrograms #506
    lee
    Keymaster

    Hi Tom

    It depends on the data values and the number of variables (and in this case, those would be different), assuming the same association measure and classification method/beta value.

    If you display the association matrix and look at the histograms of both datasets, you will see the ‘raw material’ for the classification. The similarity of the shapes, minimum and maximum values will be the determinants of dendrogram similarity.

    What many have done (self included) is to a) standardise the two association matrices 0-1 (say), and then b) subtract them to create a |absolute| difference matrix…then c) classify that!
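
    Steps a) and b) can be sketched in Python as below. The sketch assumes both matrices cover the same objects in the same order, and it reads ‘standardise 0-1’ as range-scaling the off-diagonal values, which is only one possible interpretation. The function names and example matrices are hypothetical.

```python
def rescale01(matrix):
    """Range-scale the off-diagonal values of a square association matrix to 0-1."""
    n = len(matrix)
    vals = [matrix[i][j] for i in range(n) for j in range(n) if i != j]
    lo, hi = min(vals), max(vals)
    span = (hi - lo) or 1.0  # avoid dividing by zero if all values are equal
    return [[0.0 if i == j else (matrix[i][j] - lo) / span for j in range(n)]
            for i in range(n)]


def abs_difference(a, b):
    """Absolute difference between two rescaled association matrices."""
    ra, rb = rescale01(a), rescale01(b)
    n = len(ra)
    return [[abs(ra[i][j] - rb[i][j]) for j in range(n)] for i in range(n)]


# Two made-up dissimilarity matrices for the same three objects.
first = [[0, 2, 4], [2, 0, 6], [4, 6, 0]]
second = [[0, 1, 3], [1, 0, 2], [3, 2, 0]]

for row in abs_difference(first, second):
    print(row)
```

    The resulting difference matrix is itself symmetric with a zero diagonal, so it can be passed to a classification in step c) like any other association matrix.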

    The last fusion values (association values) in your case, 1.3165 and 1.3473, suggest (but don’t prove) that the two association matrices were similarly structured, and do prove that space dilation was operating.

    Lee

    in reply to: Variance explained by ordination axes #504
    lee
    Keymaster

    Multidimensional scaling (MDS) works totally differently to Principal Component Analysis. Variance doesn’t come into MDS and for very good reason. Variance assumes normality and one is wise not to assume that with most ecological data.

    MDS (SSH in PATN) starts by distributing your objects randomly in your selected number of dimensions, usually 3. It then iterates between the association matrix and the Euclidean Distances in the 3d-space to maximise the relationship. Basically MDS moves the objects each iteration to improve the relationship between the two matrices. No variances involved, at all. Zip.
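
    The mismatch between the two matrices that each iteration tries to reduce can be illustrated with a simplified, metric form of Kruskal's stress formula 1. This is only a sketch: it compares the input dissimilarities directly with the configuration distances and skips the regression step, so it is not PATN's SSH algorithm. The function name and data are made up.

```python
import math
from itertools import combinations


def stress1(dissim, coords):
    """Metric stress: sqrt(sum (d - delta)^2 / sum d^2) over all object pairs,
    where d is the distance in the configuration and delta the input dissimilarity."""
    num = den = 0.0
    for i, j in combinations(range(len(coords)), 2):
        d = math.dist(coords[i], coords[j])
        num += (d - dissim[i][j]) ** 2
        den += d ** 2
    return math.sqrt(num / den)


# Made-up dissimilarities between three objects.
dissim = [[0, 1, 3], [1, 0, 2], [3, 2, 0]]

perfect = [(0, 0, 0), (1, 0, 0), (3, 0, 0)]  # distances reproduce dissim exactly
poorer = [(0, 0, 0), (1, 0, 0), (2, 0, 0)]   # third object placed too close

print(stress1(dissim, perfect))           # 0.0
print(round(stress1(dissim, poorer), 3))  # 0.577
```

    Note that only the distances between objects enter the calculation. Rotating or reflecting the configuration leaves the stress unchanged, which is why the axes themselves carry no meaning.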

    The result of SSH is, hopefully, the best configuration possible. The axes are 100% arbitrary because they relate to the random coordinates allocated at step 1. PATN’s strength is that you can view the configuration dynamically in 3 (or whatever) dimensions. There is absolutely no implied relationship between the axes you may care to optionally view and any trends or gradients you may interpret. The axes are there for perspective.

    Varimax in the current version of PATN would be a waste of time. First, we aren’t using variance, and second, you can do far better visually than any mathematical attempt to maximise (or minimise) variance anyway. Another way of saying it is – variance is applicable to PCA but PATN v3 doesn’t use PCA for very good reasons.

    Any better?

    Lee

    in reply to: Problems adjusting print size of dendrograms and Box&Whi #502
    lee
    Keymaster

    Hi Michael

    PATN is not fancy in its printing options, but there are strategies available to get the most out of the system. My philosophy with PATN graphical outputs is to assume that users have access to a reasonable image editor. But before that, there are a few things to check:

    1. Make sure that your printer drivers are accurate and up to date.

    2. Make sure that your Page Setup is appropriate for what you have available and what you want to do.

    3. You can right mouse click most graphic images and change the font type and reduce its size. With box and whisker plots, two-way tables, dendrograms and association measures etc, this is extremely useful.

    4. Do a Print Preview (available as a button and from right click mouse menus) to see what it is going to look like. Adjust accordingly.

    5. Always have a good image editor available to get graphical outputs tweaked for publication. You can ‘slice and dice’ and add annotations and other graphics very easily these days. Applications such as Paintshop Pro, Photoshop Elements, ACDSee Photo Editor, GIMP etc are all fine.

    Lee

    in reply to: create new variables from variable group function #501
    lee
    Keymaster

    Hi Michael

    We have fixed the problem and tested the new version. PATN 3.11 is now available as a fix update through the usual ‘File | Check for Update’ option in PATN. Just follow the update instructions.

    Lee

    in reply to: create new variables from variable group function #499
    lee
    Keymaster

    We have found the problem (4 missing lines of code!) and corrected it. We are now re-testing PATN and the updater etc before releasing the update as v3.11. I’ll post an announcement as soon as it is ready.

    Sorry about that. It is amazing how a year of testing still can’t test everything.

    Lee

    in reply to: create new variables from variable group function #498
    lee
    Keymaster

    Hi Michael

    You are right. It is taking far too long, and it shouldn’t. I’m looking into it now, and will get back with an update as soon as we isolate the issue.

    Lee
