Next: Quick Reference Up: Use of the DODECANTS Previous: CSF Volume Measurement   Contents

KNN Classifier

The final stage of the algorithm comprises the normalisation for age-related atrophy and the KNN classifier. This stage can also be run simply to obtain the age-normalised CSF volumes; it is anticipated that most users will adopt this approach.

To run the classifier, enter the directory name and filename of the .tvs file, and a name for the output files, into the relevant fields of the DODECANTS tool. Then enter the number of records in the .tvs file into the "KNEAR" field. Finally, press the "Run classifier" button. The tool will then run through the .tvs file, pick up the CSF volume measurements, normalise them for head size and age-related atrophy (see TINA Memo no. 2004-002), compute the reduced variables W1 to W5 (see above), and run a leave-one-out KNN classifier on the resulting data.
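As an illustration of the leave-one-out scheme, the following is a minimal sketch and not the DODECANTS implementation. It assumes Euclidean distance in the space of reduced variables and a simple majority vote, whereas the tool itself uses kernel scales and probabilities. The function name leave_one_out_knn and the parameter k are hypothetical.

```python
import math
from collections import Counter

def leave_one_out_knn(points, labels, k):
    """Classify each subject from its k nearest neighbours among the others.

    points: list of equal-length tuples (e.g. the reduced variables).
    labels: list of disease codes, one per point.
    k:      number of neighbours that vote (hypothetical parameter).
    """
    predictions = []
    for i, p in enumerate(points):
        # Distances from subject i to every other subject; i itself is left out.
        others = [(math.dist(p, q), labels[j])
                  for j, q in enumerate(points) if j != i]
        others.sort(key=lambda pair: pair[0])
        # Assumption: unweighted majority vote over the k closest subjects.
        votes = Counter(label for _, label in others[:k])
        predictions.append(votes.most_common(1)[0][0])
    return predictions
```

Because each subject is excluded from its own neighbour set, the predicted codes give an estimate of classifier performance without needing a separate test list.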

Two switches control aspects of this process. The normalisation for age-related atrophy corrects each CSF volume by multiplying it by the ratio of the average CSF volume in normal subjects at a standard age to the average CSF volume in normal subjects at the subject's age. By default, the standard age is the mean of the ages of the subjects listed in the .tvs file. The "Fix mean age" switch instead allows this mean age to be entered manually, so that it can be held fixed across analyses of multiple data lists and the results compared directly. The second switch, "Optimise scales", dictates whether to optimise the kernel sizes used in the KNN classifier to give the best diagnosis, or to use those determined in our own work (see [8]).
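The age correction can be sketched as follows. This is an illustration under stated assumptions only: normal_csf_volume is a hypothetical stand-in for the normal-subject curve described in TINA Memo no. 2004-002, and its linear form and coefficients are invented for the example. The fixed_mean_age argument mimics the effect of the "Fix mean age" switch.

```python
from statistics import mean

def normal_csf_volume(age):
    """Hypothetical average CSF volume in normal subjects at a given age.

    The linear form and the coefficients are made up for illustration;
    they are not taken from the TINA memo.
    """
    return 100.0 + 2.0 * age

def age_normalise(volumes, ages, fixed_mean_age=None):
    """Scale each volume by normal(standard age) / normal(subject's age)."""
    # Standard age: mean of the listed ages unless a value is fixed manually.
    standard_age = mean(ages) if fixed_mean_age is None else fixed_mean_age
    reference = normal_csf_volume(standard_age)
    return [v * reference / normal_csf_volume(a)
            for v, a in zip(volumes, ages)]
```

Passing the same fixed_mean_age for every data list keeps the reference volume identical across runs, which is what makes the normalised volumes directly comparable.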

The output is written to five files, called name, name.v12_norm, name.dat, name.colours and name.row, where name is the filename entered in the "Output files" field. The name.dat file contains one row per subject, holding the variables W1 to W5 (see above) followed by the subject's age. The name file contains a list of the patients, each with the disease code specified in the original data list and the most likely disease code as determined by the nearest neighbour classifier. This list is followed by two matrices. The first is a matrix of the original diagnosis (columns) against the predicted diagnosis (rows). The second is the same matrix, but summed over the probabilities associated with the predicted diagnoses rather than quantised to the most likely diagnosis. Note that both matrices have rows and columns for the non-existent disease code 0. The name.v12_norm file has the same entries as the .v12 file described above, but with the volume measurements normalised for age-related atrophy. The name.row file lists the subject labels from the original data list, and the name.colours file lists the colours associated with each disease code. These last two files are used for data display in xgobi: if the name.dat file is loaded into that program, the points will automatically be labelled and coloured using these files.
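The relationship between the two matrices in the name file can be illustrated with a short sketch. It assumes that each subject comes with a table of per-diagnosis probabilities; the function name diagnosis_matrices and the input layout are hypothetical, and the actual file format is not reproduced here.

```python
def diagnosis_matrices(original, probabilities, n_codes):
    """Build the quantised and probability-summed diagnosis matrices.

    original:      original disease code for each subject (1..n_codes).
    probabilities: one dict per subject mapping disease code -> probability.
    Rows are predicted diagnoses and columns are original diagnoses.
    """
    size = n_codes + 1  # keep a row and column for the unused code 0
    quantised = [[0] * size for _ in range(size)]
    summed = [[0.0] * size for _ in range(size)]
    for orig, probs in zip(original, probabilities):
        # Quantised matrix: count only the most likely diagnosis.
        best = max(probs, key=probs.get)
        quantised[best][orig] += 1
        # Summed matrix: spread the subject over all predicted diagnoses.
        for code, p in probs.items():
            summed[code][orig] += p
    return quantised, summed
```

If the probabilities for each subject sum to one, each column total in the second matrix equals the corresponding column total in the first, both giving the number of subjects with that original diagnosis.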


root 2017-09-24