lee

Forum Replies Created

Viewing 14 posts - 46 through 59 (of 59 total)
  • in reply to: non hierarchical process stopping #454
    lee
    Keymaster

    Hi Tiffany

    Sounds strange. A test on a PATN-generated 300,00 by 20 dataset with 10 groups and the Gower Metric worked fine. Random data will be slow to converge. Your 3 minutes seems too fast to me, even on a fast PC. Even after a few iterations, the groups should be ok.

    It would be best to send me the dataset as a zipped csv file (you can export from PATN if you don’t have a csv file). That way I can better test your problem.

    Lee

    in reply to: deriving GSTA statistics #452
    lee
    Keymaster

    Hi Ross

    DOS PATN used two strategies for summarising variables across groups. Measures of central tendency (mean, median) and dispersion (range, standard deviation) were the norm for ‘continuous data’.

    The alternative of using ‘constancy’ (the tendency for a variable to occur in all objects in a group) and ‘fidelity’ (the tendency for a variable to be isolated to one group) were more useful as group statistical summaries for presence/absence data.

    At the moment, PATN uses only measures of central tendency and dispersion. I’ll look into including an option for constancy and fidelity.
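
    As a rough illustration of the two ideas for presence/absence data, here is a minimal Python sketch. It assumes simple proportion-based definitions: constancy as the fraction of a group's objects that contain the variable, and fidelity as the fraction of a variable's occurrences that fall in that group. The exact DOS PATN formulas are not given in this post, and the function name is hypothetical.

```python
from collections import defaultdict


def constancy_and_fidelity(presence, groups):
    """Summarise presence/absence variables by group.

    presence: list of objects, each a list of 0/1 values (one per variable).
    groups:   list of group ids, one per object.
    Returns (constancy, fidelity), each a dict of group id -> list per variable.
    NOTE: proportion-based definitions are an assumption, not PATN's formulas.
    """
    n_vars = len(presence[0])
    group_sizes = defaultdict(int)
    occurrences = defaultdict(lambda: [0] * n_vars)

    # Count how often each variable occurs in each group.
    for row, group in zip(presence, groups):
        group_sizes[group] += 1
        for j, value in enumerate(row):
            if value:
                occurrences[group][j] += 1

    # Total occurrences of each variable across all groups.
    totals = [sum(occurrences[g][j] for g in occurrences) for j in range(n_vars)]

    constancy = {
        g: [occurrences[g][j] / group_sizes[g] for j in range(n_vars)]
        for g in occurrences
    }
    fidelity = {
        g: [occurrences[g][j] / totals[j] if totals[j] else 0.0 for j in range(n_vars)]
        for g in occurrences
    }
    return constancy, fidelity
```

    A variable with constancy near 1 occurs in almost every object of the group; a variable with fidelity near 1 occurs almost nowhere else.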

    Re exports, I thought it was better to export the lot just in case you wanted to continue the analyses. I figured it was far easier to delete the extrinsics than to add them.

    Lee

    in reply to: wrong number of groups in classification output #451
    lee
    Keymaster

    Hi Tiffany

    PATN’s convergence to the requested number of groups in non-hierarchical classification can be sensitive to starting conditions. In some cases, the exact number of groups cannot be achieved. This usually means that the requested number sits on an unstable point.

    Requesting a different number of groups changes the scenario for the first phase, usually resulting in PATN generating a different suite of seed objects. This condition can certainly result in generating the number of groups you originally requested.

    We use a ‘branch and bound’ algorithm to try and converge on your selected number of groups. This usually ‘works’, but in the real world, there are no guarantees that it will converge to your requested number.

    We will however see if we can improve the convergence.

    Lee

    in reply to: Printing Box and Whisker Diagrams #450
    lee
    Keymaster

    Hi Andrew

    I’ve reproduced your error. Seems it relates to the number of groups exceeding one print page. 100 groups was larger than we ever tested so we didn’t pick the problem up.

    We have now fixed this for PATN v3.03.

    Leon had also picked up another error that eluded us on smaller test datasets. Transformations and standardizations were taking longer than they should.

    We have also fixed this in PATN v3.03.

    I also noted when diagnosing your problem that the smooth scrolling of the Data Table was not smooth.

    We have also fixed this in PATN v3.03.

    We are now therefore running a fuller test protocol over PATN v3.03 to see if there is anything else that slipped through. This could take up to a week. I’ll announce it in the Forum when it is ready. We may take the opportunity to add another function or two planned for v3.03.

    I apologise for these problems, but at least you can’t say we are tardy in trying to identify and fix them!

    Lee

    in reply to: Printing Box and Whisker Diagrams #448
    lee
    Keymaster

    Hi Andrew

    Does the page look correct when you do a Print Preview on the box and whisker plots (right mouse click in b&w plot window)? It should be WYSIWYG. I’ve printed many b&w plots on a range of laser and inkjet printers and have had no problems.

    How many groups and variables do you have, and what printer are you using? Numbers shouldn’t matter but knowing this and the printer, I may be able to try and reproduce the situation.

    Lee

    in reply to: PATN vs. Other Software #447
    lee
    Keymaster

    Hi Pete

    SAS and SPSS are general statistical packages. PATN specialises in the exploratory aspects of statistics (pattern finding, hypothesis generation), not the confirmatory part (hypothesis testing) such as modelling.

    PATN has association measures that are proven to recover underlying structure far better than the usual ones available in stats packages (eg Bray and Curtis, Kulczynski and two-step). Also, PATN’s clustering techniques (both non-hierarchical and hierarchical) are adapted to maximise recovery by linking to known characteristics of the association measures. Ditto the ordination technique in PATN: SSH. This is a hybrid of BOTH metric and non-metric scaling and is well proven over traditional methods such as factor analysis and principal components/coordinates analysis.
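
    For readers who have not met these association measures, here is a minimal Python sketch of the Bray and Curtis measure in its usual textbook form for non-negative data (sum of absolute differences divided by the sum of all values). This only illustrates the formula; it is not PATN's code, and the function name is hypothetical.

```python
def bray_curtis(a, b):
    """Bray and Curtis dissimilarity between two objects.

    a, b: equal-length sequences of non-negative values (e.g. abundances).
    Returns 0.0 for identical objects and 1.0 for objects with nothing in common.
    """
    if len(a) != len(b):
        raise ValueError("objects must have the same number of variables")
    numerator = sum(abs(x - y) for x, y in zip(a, b))
    denominator = sum(x + y for x, y in zip(a, b))
    # Two all-zero objects are treated as identical here (an assumption).
    return numerator / denominator if denominator else 0.0
```

    For example, objects (6, 7, 4) and (10, 0, 6) differ by 4 + 7 + 2 = 13 out of a combined total of 33, giving 13/33.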

    But, aside from the ‘engines’ inside PATN, the real difference is in HOW PATN interacts with the user. A (very) complete analysis in PATN takes about a minute from the time you import data till the time you have the results. PATN scans incoming data and picks an analysis strategy. General packages cannot do that. This is literally the trivial part of PATN (but as stated above, it is a ‘robust’ part).

    The really interesting – and fun – part is the interactive 3d display where you really get to understand the patterns. This is where the results all come together into one dynamic, interactive display. As far as I am aware, there is no statistical package that goes anywhere near PATN in interactive visualization. If you haven’t experienced this, it’s hard to explain what it is like. The interrogation/evaluation options on this display are very powerful in getting users to think about data structures, causes, processes. It is here where most PATN users spend 99% of their time. Correctly so. This is how PATN v3 was designed.

    Hope that helps a little. I could say a lot more about the differences (natural colour, object comparisons, association histograms…) but I’d be interested to see what other PATN users would say.

    Lee

    in reply to: PATN v3.02 and a question to users #446
    lee
    Keymaster

    Hi Janet

    Good to hear from you. Actually, we have just implemented what I hope is the best solution! We can either import the analysis files (group compositions and ordination) with header AND labels or without header AND without labels. PATN will auto-detect which. You will not be able to mix the two forms.

    This will be in PATN V3.02. We have closed off development on v3.02 and have just to complete the help file and a little further testing before release. We want to get this out as quickly as we can. Consequently, we will have to leave some features to v3.03.

    In relation to the export of colours, the ordination coordinates ARE the colours (x=red, y=green and z=blue). You can export these as is of course. If you need them in the range 0-1, either use Excel or pull them back into PATN as data and standardize them!
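
    If you would rather do the 0-1 rescaling outside PATN or Excel, here is a minimal Python sketch. It assumes a simple range standardization applied to each axis separately, which may not match PATN's own standardization options; the function names are hypothetical.

```python
def rescale_to_unit(values):
    """Rescale a sequence of numbers linearly onto the range 0-1."""
    low, high = min(values), max(values)
    if high == low:
        # A constant axis carries no colour information; map it to 0.
        return [0.0 for _ in values]
    return [(v - low) / (high - low) for v in values]


def coordinates_to_rgb(coords):
    """Turn exported ordination coordinates into colours.

    coords: list of (x, y, z) tuples, one per object.
    Returns a list of (red, green, blue) tuples in the range 0-1,
    using x=red, y=green and z=blue.
    """
    xs, ys, zs = zip(*coords)
    return list(zip(rescale_to_unit(xs), rescale_to_unit(ys), rescale_to_unit(zs)))
```

    Multiply each value by 255 and round if your other software expects 8-bit colour values.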

    Hope that helps.

    in reply to: PATN v3.02 and a question to users #444
    lee
    Keymaster

    While I note that some have read my previous posting, no one seems to have been game to vote on the question I posed. (It’s easy by the way!) The results of the vote will be incorporated into v3.02 so I was hoping for some feedback.

    The issue I’ve noted is that when you export data from PATN such as the data table, group composition, ordination coordinates and most other analysis data, PATN will put some header information at the top of the file to help you interpret the content, particularly a year or two later when you may come back to it. BUT, should we assume that if you IMPORT the same file, it will be identical in format? That is – with the header?

    The point here is if you import data from another application, you will then have to include something that looks like the header. This is no big deal, but will be necessary.

    To me, it makes sense to be able to re-import an exported file without changing anything.

    Do you agree? If we don’t get feedback on this, we will opt for expecting the header on imports.

    Lee

    in reply to: 0 values for groups (post non-hierarchical clustering) #442
    lee
    Keymaster

    Hi Tiffany

    No problems. In the Windows version of PATN, group numbers go sequentially from group “0” to group “k-1” where “k” is the number of groups.

    This is different to the old DOS version that went from group “1” to group “k”.
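
    If you need to line up group numbers from the two versions, the shift is just one. A minimal Python sketch (the function names are hypothetical):

```python
def windows_to_dos_groups(groups):
    """Convert Windows PATN group numbers (0 to k-1) to DOS numbering (1 to k)."""
    if any(g < 0 for g in groups):
        raise ValueError("Windows group numbers start at 0")
    return [g + 1 for g in groups]


def dos_to_windows_groups(groups):
    """Convert DOS PATN group numbers (1 to k) to Windows numbering (0 to k-1)."""
    if any(g < 1 for g in groups):
        raise ValueError("DOS group numbers start at 1")
    return [g - 1 for g in groups]
```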

    Lee

    in reply to: PATN 3.01 near! #441
    lee
    Keymaster

    Hi Derek

    Sorry for the delay in my response (thanks Adam!) but I’ve been away camping for three weeks. A much needed break.

    Anyway, did Adam’s solution work ok for you?

    I’ll hopefully post a generic solution within a few days as v3.02 will be released shortly. Hopefully within days.

    Best wishes,

    Lee

    in reply to: Dataset size limitations? #440
    lee
    Keymaster

    Hi Andrew

    Sorry for the delay in getting back to you. I’ve been camping for three weeks. Mea culpa but I seriously needed a break after 3 months of 18 hour days.

    Your comments have been noted and we have addressed speedier data import for huge datasets in the soon to be released v3.02 (along with fixing the row label sizing that Leon has noted in another BB entry).

    God, 35 hours is certainly not acceptable, but 1.4 million records is definitely biggish. The DOS version of PATN was (by comparison with the Windows code) minimal and enabled maximal use of memory. I’m not sure what the limits are on v3+ but I would hope that a few million records could be handled. Once the physical memory is exhausted, disk I/O will certainly slow it down. So if you plan to do this regularly, a few GB of memory will be a must.

    The Excel limit doesn’t relate at all to PATN – there are no fixed limits in PATN.

    csv should be the easiest and fastest import method.

    We will also address the slider bar issue for 3.02 – which will be released within days I hope. I’ll keep all posted on this given that our auto-update via eSellerate is not as we would have wished. Stay posted on that one.

    Cheers,

    Lee

    in reply to: Some initial comments on formats and a PCC question #439
    lee
    Keymaster

    Hi Leon

    I apologise for not responding earlier – I’ve been camping in Denmark and Sweden for the past three weeks (no electricity – let alone Internet).

    I was already aware of the row labels issue and that is now addressed in the soon to be released 3.02.

    Re updates – as you have seen, we have run into some problems with what we thought was an automatic procedure via our eSellerate gateway. Bugger. We have been working on a solution for a few weeks. I’ll announce some type of solution tomorrow.

    Re your PCC question, easy! PATN v3 and later has the concept of extrinsics built in to the Data Table. Just read THE WHOLE dataset in, select the environmental columns (extrinsics) and press the 5th button on the toolbar to move them to the right hand side of the red line, run the analysis with ‘all evaluations’ selected in the analysis dialog box and … you have it! No more multiple datasets!

    Lee

    in reply to: some initial thoughts on moving from DOS PATN to PATNv3 #434
    lee
    Keymaster

    Hi Ross

    I think we should be able to add the IMPORT DOS PATN dataset and GSTA fixes by Monday.

    Figuring the BMP issue with the dendrogram needs some sleuthing but we should have a solution by then. PATN should work fine with your dataset. In discussing this issue, we may also add the ability to save in other formats such as jpg.

    As for the ASCII version of DOS PATN – not sure at this stage. It feels uncomfortably like a step back in time and could require considerable resources.

    When updated, we will add a link to the update that you can download (and update the shop download version).

    I’ll keep you posted.

    Lee

    in reply to: some initial thoughts on moving from DOS PATN to PATNv3 #432
    lee
    Keymaster

    Hi Ross

    I value the feedback. I’ll address each issue you raised. Thanks for the feedback on the purchase/download. We have improved the comments through the process a little.

    Import. Yes, I really had concentrated so hard on getting Excel files into PATN, I hadn’t really fully considered the migration from the DOS version. We had realised your point about the limitations of Excel, but not all of the implications. This is certainly an oversight on my part. We do need to add this as an import option. I figure we could ask for the xxx.prm file name and then pick up the data and labels from the stem filename (assume that the data file is xxx.dat, the labels xxx.rlb and xxx.clb). This should be quite easy. I’ll talk to our programmer tonight about this.
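
    To make the stem idea concrete, here is a minimal Python sketch of how the companion file names could be derived from the chosen xxx.prm file. It only builds the names described above; it does not read DOS PATN files, and the function name and dictionary keys are hypothetical.

```python
from pathlib import Path


def dos_patn_files(prm_path):
    """Derive the companion DOS PATN file names from a .prm file name.

    Assumes the data and label files share the stem of the .prm file:
    xxx.prm -> xxx.dat (data), xxx.rlb and xxx.clb (labels).
    """
    prm = Path(prm_path)
    return {
        "prm": prm,
        "dat": prm.with_suffix(".dat"),
        "rlb": prm.with_suffix(".rlb"),
        "clb": prm.with_suffix(".clb"),
    }


def missing_files(files):
    """Return the keys of any expected files that do not exist on disk."""
    return [key for key, path in files.items() if not path.exists()]
```

    An importer could call `missing_files` first and tell the user which of the four files it could not find before attempting the import.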

    1. It looks like a problem with the size of the dendrogram bitmap. I hadn’t run into this before. I could save your dendrogram image using the option ‘Save As …bitmap – reduced width’. I’ll look into this as the dendrogram is not huge. Once you have the BMP file, you can embed it in anything such as Excel.

    2. OK, point taken. Is the easy solution to fill out the matrix with the species labels? This would enable sorting across all table columns in Excel. As far as I can tell, this would be the same as GSTA. This would also be very easy to do.

    3. OK. I’ve dealt with this above. Fully agree.

    As mentioned in a previous e-mail, I’m keen to see what the priorities of the PATN community are regarding other optional modules such as NNB and RIND. I’ll put this response on the BB as these issues are ideal to share there. Once we have a ‘critical mass’ on the BB, we can use the ‘voting’ functions.

    I’ll get onto seeking the solutions right away.

    Lee
