[Fwd: eigenvectors differences with matlab on high dimensional data.]


Sender: Miguel Garcia Torres <mgarciat@ull.es>
Subject: eigenvectors differences with matlab on high dimensional data.


Hi all,

I am using Jama to perform PCA via SVD, and I am checking the results
against those obtained with Matlab. When the number of points is greater
than or equal to the number of variables, I get the same results as Matlab
(to very good precision). But if the number of variables is greater than
the number of points, the last component of the eigenvectors has poor
precision.

To clarify, let A be the [m,n] data matrix, where m is the number of points
and n is the number of variables. If m >= n, the eigenvectors are the
columns of V, and in this case the values match the Matlab results.
If m < n, I transpose the matrix before performing the decomposition, so the
eigenvectors are the columns of U. In this case, the last element of the
eigenvectors has a value which is far from the one obtained in Matlab.
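
One way to judge whether a given result is wrong, independently of which
library produced it, is to test the eigenpair against its definition. The
sketch below is only an illustration: the class and method names are
hypothetical, and it assumes that the data are already centered and that the
eigenvalues refer to the sample covariance C = A'A / (m - 1), which may not be
the normalization used in the files. It returns the norm of C*v - lambda*v
without forming C, which is n by n.

```java
final class EigenpairCheck {

    // Returns ||C v - lambda v||_2 for C = A^T A / (m - 1), where A is the
    // centered data (m points by n variables). The (m - 1) normalization is
    // an assumption. C is never built explicitly.
    static double residual(double[][] centered, double[] v, double lambda) {
        int m = centered.length;
        int n = v.length;

        // t = A v (length m)
        double[] t = new double[m];
        for (int r = 0; r < m; r++) {
            double s = 0.0;
            for (int c = 0; c < n; c++) {
                s += centered[r][c] * v[c];
            }
            t[r] = s;
        }

        // Accumulate the squared entries of A^T t / (m - 1) - lambda v
        double sumSq = 0.0;
        for (int c = 0; c < n; c++) {
            double s = 0.0;
            for (int r = 0; r < m; r++) {
                s += centered[r][c] * t[r];
            }
            double d = s / (m - 1) - lambda * v[c];
            sumSq += d * d;
        }
        return Math.sqrt(sumSq);
    }

    public static void main(String[] args) {
        // Toy centered data: 3 points, 2 variables, varying only along (1,1)
        double[][] a = { {-1, -1}, {0, 0}, {1, 1} };
        double inv = 1.0 / Math.sqrt(2.0);
        // A true eigenpair gives a residual near zero
        System.out.println(residual(a, new double[] {inv, inv}, 2.0));
        // A vector that is not an eigenvector gives a clearly nonzero residual
        System.out.println(residual(a, new double[] {1.0, 0.0}, 2.0));
    }
}
```

A residual that is small relative to lambda for one set of eigenvectors and
large for the other would point to the set that is in error.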

You can download the matrices from
_http://webpages.ull.es/users/mgarciat/pca_hd.tgz_
This archive contains the following files:

pca_hd_p20.csv -> the high dimensional (hd) data, with 20 decimals.
pca_hd_eigenvectors.csv -> the eigenvectors obtained with Matlab.
pca_hd_eigenvaluess.csv -> the eigenvalues obtained with Matlab.

When I compare the results, I get differences such as:

[604,6]:  expected: 0.015720974334793        -0.004136528914528481
[604,10]:  expected: 0.052244404045946        -0.046277291821854284
[604,11]:  expected: -0.021269147636881        -0.0011694770500378223
[604,12]:  expected: 0.052382661544416        0.010050589203392619
[604,13]:  expected: -0.021673352215208        0.012748585718630039
[604,17]:  expected: 0.022376430522196        -0.01744958012069039
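
Eigenvectors are only defined up to sign, so an entry-by-entry comparison can
report differences that are not errors. The sketch below shows one way to
compare a column of two eigenvector matrices while ignoring a global sign
flip. The class and method names are hypothetical, and it assumes both
matrices store eigenvectors as columns in the same order.

```java
final class EigenvectorCompare {

    // Largest absolute entry-wise difference between column j of 'expected'
    // and column j of 'actual', taking the better of the two possible signs
    // for the actual column.
    static double columnDiffUpToSign(double[][] expected, double[][] actual, int j) {
        double same = 0.0;    // max |e - a| over the column
        double flipped = 0.0; // max |e + a| over the column
        for (int i = 0; i < expected.length; i++) {
            same = Math.max(same, Math.abs(expected[i][j] - actual[i][j]));
            flipped = Math.max(flipped, Math.abs(expected[i][j] + actual[i][j]));
        }
        return Math.min(same, flipped);
    }

    public static void main(String[] args) {
        // Column 0 of 'actual' is column 0 of 'expected' with the sign flipped;
        // column 1 differs in a single entry, which is a genuine mismatch.
        double[][] expected = { {0.6, 0.8}, {0.8, -0.6} };
        double[][] actual = { {-0.6, 0.8}, {-0.8, 0.6} };
        for (int j = 0; j < 2; j++) {
            System.out.println("column " + j + ": "
                    + columnDiffUpToSign(expected, actual, j));
        }
    }
}
```

Differences that remain after this check, such as a single entry that
disagrees, cannot be explained by the sign convention alone.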

I would be very grateful if someone could check this and explain whether
these differences indicate an error or not. I am still writing some of the
methods (in Java), but I can send the code if anyone requests it for
testing.

Thank you in advance,

MiguelGT

PS. I attach some of the Java code below.

-----------------To read a csv file into an array of doubles-----------------------
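// Imports needed at the top of the enclosing file
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Reads a comma-separated file of numbers into a double[rows][columns] array.
private static double[][] readMatrix(String fname) throws IOException {
    List<double[]> lst = new ArrayList<double[]>();
    // try-with-resources closes the reader even if parsing fails
    try (BufferedReader br = new BufferedReader(new FileReader(fname))) {
        String line;
        while ((line = br.readLine()) != null) {
            if (line.trim().isEmpty()) {
                continue; // skip blank lines, e.g. at the end of the file
            }
            String[] svalues = line.split(",");
            double[] row = new double[svalues.length];
            for (int i = 0; i < svalues.length; i++) {
                row[i] = Double.parseDouble(svalues[i]);
            }
            lst.add(row);
        }
    }
    return lst.toArray(new double[lst.size()][]);
}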
---------------------------------------------------------------------------------------------

------To obtain the mean value of each column----------
// Returns the mean of each column (variable) over all rows (points).
public static double[] columnMeans(double[][] data) {
    // Java zero-initializes new arrays, so no explicit reset is needed
    double[] mean = new double[data[0].length];
    for (int r = 0; r < data.length; r++) {
        for (int c = 0; c < data[r].length; c++) {
            mean[c] += data[r][c];
        }
    }
    for (int e = 0; e < mean.length; e++) {
        mean[e] /= data.length;
    }
    return mean;
}
------------------------------------------------------------------------
----To center the data-----------------------
// Subtracts the column means from every row; the input array is not modified.
public static double[][] centerData(final double[][] data, double[] mean) {
    double[][] cdata = new double[data.length][];
    for (int r = 0; r < data.length; r++) {
        cdata[r] = new double[data[r].length];
        for (int c = 0; c < data[r].length; c++) {
            cdata[r][c] = data[r][c] - mean[c];
        }
    }
    return cdata;
}
------------------------------------------------------------------





