SDDPACK: Software for the Semidiscrete Decomposition.
Copyright © 1999 Tamara G. Kolda and Dianne P. O'Leary.
This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
Tamara G. Kolda
Computational Sciences and Mathematics Research Department
Sandia National Labs
Livermore, CA 94551
Tamara.Kolda@na-net.ornl.gov
http://www.epm.ornl.gov/~kolda
Dianne P. O'Leary
Department of Computer Science
University of Maryland
College Park, MD 20742
oleary@cs.umd.edu
http://www.cs.umd.edu/users/oleary
Kolda's research supported by the Applied Mathematical Sciences Research Program of Office of Energy Research in the U.S. Department of Energy, under contract DE-AC05-96OR22464 with Lockheed Martin Energy Research Corporation; the National Physical Sciences Consortium in conjunction with the University of Maryland and the National Security Agency; and the IDA Center for Computing Sciences.
O'Leary's research supported by the National Science Foundation under Grant CCR-97-32022 and by the Departement Informatik, ETH Zurich, Switzerland.
Reports of bugs should be sent to Tamara.Kolda@na-net.ornl.gov.
The semidiscrete decomposition (SDD) approximates a matrix as a weighted sum of outer products formed by vectors with entries constrained to be in the set {-1, 0, 1}.
It is useful for image compression and for latent semantic indexing (LSI) in information retrieval.
The primary advantage of the SDD over other types of matrix approximations such as the truncated singular value decomposition (SVD) is that it typically provides a more accurate approximation for far less storage.
This package provides software in Matlab and in C to compute the SDD.
The software is available for download at http://www.cs.umd.edu/users/oleary/SDDPACK in the file SDDPACK.tgz.
After downloading the file SDDPACK.tgz
, type
"gunzip -c sdd.tgz | tar xvf -
". This creates
a directory called SDDPACK containing the following:
SDDPACK/README.html SDDPACK/Makefile SDDPACK/gpl.txt SDDPACK/include/sdd.h SDDPACK/include/qsortopt.h SDDPACK/src/sdd.c SDDPACK/src/decomp.c SDDPACK/src/convertmtx.c SDDPACK/src/convertsdd.c SDDPACK/matlab/mtxread.m SDDPACK/matlab/mtxwrite.m SDDPACK/matlab/sdd.m SDDPACK/matlab/sddweight.m SDDPACK/matlab/sddtensor.m SDDPACK/matlab/sddfun.m SDDPACK/matlab/sddweightfun.m SDDPACK/testdata/test1.mtx SDDPACK/testdata/test1.gif SDDPACK/doc/sddpack.ps
Most documentation to run the software is given in this file. Also, each file contains comments with information on the code. General information about the SDD and its variants is provided in the included paper:
Tamara G. Kolda and Dianne P. O'Leary, Computation and Uses of the Semidiscrete Matrix Decomposition, Computer Science Department Report CS-TR-4012 Institute for Advanced Computer Studies Report UMIACS-TR-99-22, University of Maryland, April 1999.
Matlab codes are included in the directory
SDDPACK/matab
. To use these codes, be
sure to be in the SDDPACK/matlab
directory (or add this
directory to your Matlab path).
We have included a small 5 x 5 test problem in testdata/test1.mtx
. The matlab
routine mtxread
, included
in SDDPACK, reads the matrix file.
>> A = mtxread('../testdata/test1.mtx'); >> full(A) ans = 0.5828 0.2259 0.2091 0.5678 0.4154 0.4235 0.5798 0.3798 0.7942 0.3050 0.5155 0.7604 0.7833 0.0592 0.8744 0.3340 0.5298 0.6808 0.6029 0.0150 0.4329 0.6405 0.4611 0.0503 0.7680
Using the routine sdd
,
included in SDDPACK, we can compute the 10-term SDD using the default
settings as follows:
>> [D, X, Y] = sdd(A, 10) D = 0.4797 0.3262 0.1939 0.0847 0.1017 0.0588 0.0387 0.0324 0.0551 0.0237 X = 1 0 -1 -1 1 1 -1 0 0 1 1 1 0 1 -1 1 -1 -1 0 -1 1 -1 1 1 0 -1 1 -1 0 -1 1 1 1 -1 -1 -1 0 1 0 -1 1 -1 0 0 0 1 1 0 1 0 Y = 1 0 0 0 1 0 0 -1 -1 0 1 0 1 1 0 1 1 0 1 0 1 0 1 0 0 -1 1 -1 0 1 1 1 -1 1 0 -1 -1 1 0 0 1 -1 0 1 0 0 -1 -1 0 0
We have also included a utility called
sddfun
that
compares the SVD to the SDD with different starting criteria.
We show the text below; the graphic is in
SDDPACK/testdata/test1.gif
.
>> sddfun(A) Computing SVD Rank of matrix: 5 Computing SDD with option 1 Computing SDD with option 2 Computing SDD with option 3 Computing SDD with option 4 Method Color Line Dot ------ ----- ---- --- SVD g - x SDD-1 b - o SDD-2 m : ^ SDD-3 r -. + SDD-4 c -- * Method Rel Resid Inn Its Density ------ --------- ------- ------- SDD-1 11.92 3.20 66.00 SDD-2 11.85 2.60 72.00 SDD-3 13.95 2.80 84.00 SDD-4 11.92 2.80 66.00
>> A = full(mtxread('../testdata/test1.mtx')); >> W = ones(5) - eye(5); >> [D, X, Y] = sddweight(A, W, 5) D = 0.7833 0.6029 0.7680 0.5828 0.5798 X = 0 0 0 1 0 0 0 0 0 1 1 0 0 0 0 0 1 0 0 0 0 0 1 0 0 Y = 0 0 0 1 0 0 0 0 0 1 1 0 0 0 0 0 1 0 0 0 0 0 1 0 0SDDPACK also contains sddweightfun, similar to sddfun described above. Sample output is given below. This also produces a figure, but we do not include it here.
>> sddweightfun(A,W) Rank of matrix: 5 Computing SDDWEIGHT with option 1 Computing SDDWEIGHT with option 2 Computing SDDWEIGHT with option 3 Computing SDDWEIGHT with option 4 Method Color Line Dot ----------- ----- ---- --- SDDWEIGHT-1 b - o SDDWEIGHT-2 m : ^ SDDWEIGHT-3 r -. + SDDWEIGHT-4 c -- * Method Rel Resid Inn Its Density ----------- --------- ------- ------- SDDWEIGHT-1 13.11 3.10 52.00 SDDWEIGHT-2 13.77 2.80 52.00 SDDWEIGHT-3 9.85 3.40 76.00 SDDWEIGHT-4 10.03 3.40 66.00
>> rand('state', 0); >> A = rand(3,4,2); >> [D, X] = sddtensor(A, 5) D = 0.5712 0.3364 0.2278 0.2379 0.1797 X = [3x5 double] [4x5 double] [2x5 double] >> X{1} ans = 1 -1 1 0 0 1 1 -1 -1 1 1 -1 -1 1 0 >> X{2} ans = 1 0 1 1 1 1 0 -1 0 1 1 1 0 1 -1 1 1 0 0 0 >> X{3} ans = 1 0 1 1 1 1 1 1 -1 1
Change directories into SDDPACK
.
To compile the code, check the Makefile
. You should check the
location of the remove program and set the flags CC, CFLAGS, and LIB
for your system. Several examples have been included. Once that is
fixed, type "make
" at the command prompt.
The flag -DQSORTOPT
is optional. It provides an faster
qsort that the default on most machines.
Special Note: If you do not have a 32- or 64-bit
architecture, than you must be sure to include the following flag in
the CFLAGS definition, e.g., -DIDXSHIFT=7
if 128-bit architecture.
> decomp -k 10 testdata/test1.mtx testdata/test1.sdd *** output from decomp *** matrix file : testdata/test1.mtx SDD file : testdata/test1.sdd existing SDD file :terms : 10 accuracy : 0.000000 tolerance : 0.010000 max inner its : 100 y init choice : 1 input type : text 1 2 1.40882e+00 5.75170e+00 3 3 2 4 5.57350e-01 8.51472e-01 3 6 3 7 2.18833e-01 3.38517e-01 3 9 4 10 1.32799e-01 8.60342e-02 2 11 5 11 1.01791e-01 3.10079e-02 2 13 6 12 4.99871e-02 5.18036e-02 4 17 7 17 2.59716e-02 2.40155e-02 2 19 8 19 1.33357e-02 1.26359e-02 3 22 9 22 7.27355e-03 6.06216e-03 2 24 10 23 5.02049e-03 2.25306e-03 2 26 -- SDD information -- final residual norm : 7.0855e-02 final relative residual norm: 0.026 total outer iterations : 10 average inner iterations : 2.600 average init iterations : 2.300 > more testdata/test1.sdd %% Semidiscrete Decomposition (SDD) %% Matrix: testdata/test1.mtx Terms: 10 Accr: 0.00e+00 Tol: 1.00e-02 InnIts: 100 Init: 1 10 5 5 4.7965389490127563476562500e-01 3.2624220848083496093750000e-01 1.9394071400165557861328125e-01 8.4672987461090087890625000e-02 1.0166592150926589965820312e-01 5.8767084032297134399414062e-02 3.8742337375879287719726562e-02 3.2449815422296524047851562e-02 5.5055230855941772460937500e-02 2.3733202368021011352539062e-02 1 1 1 1 1 0 1 -1 1 -1 -1 0 1 1 0 -1 1 1 -1 0 1 -1 0 -1 0 1 1 -1 -1 1 -1 -1 1 0 1 0 -1 -1 1 0 0 0 0 0 1 1 -1 -1 -1 0 1 1 1 1 1 0 0 0 1 -1 0 1 1 -1 0 0 1 0 1 1 1 0 0 0 0 0 1 -1 -1 0 0 1 1 -1 -1 -1 0 -1 1 -1 -1 1 0 0 0 0 0 1 0 0
Note that these results match what we got with the matlab version of SDD on the same matrix.
By default, the SDD performs i/o in text mode. However, to get the advantage that the SDD has in terms of size, it is best to use binary i/o. We have included conversion programs for the matrices and SDD's. Their usage is shown below.
> convertmtx testdata/test1.mtx testdata/test1.mtx.dat *** output from convertmtx *** Input File : testdata/test1.mtx Output File : testdata/test1.mtx.dat > decomp -k 10 -b testdata/test1.mtx.dat testdata/test1.sdd.dat *** output from decomp *** matrix file : testdata/test1.mtx.dat SDD file : testdata/test1.sdd.dat existing SDD file :terms : 10 accuracy : 0.000000 tolerance : 0.010000 max inner its : 100 y init choice : 1 input type : binary 1 2 1.40882e+00 5.75170e+00 3 3 2 4 5.57350e-01 8.51472e-01 3 6 3 7 2.18833e-01 3.38517e-01 3 9 4 10 1.32799e-01 8.60342e-02 2 11 5 11 1.01791e-01 3.10079e-02 2 13 6 12 4.99871e-02 5.18036e-02 4 17 7 17 2.59716e-02 2.40155e-02 2 19 8 19 1.33357e-02 1.26359e-02 3 22 9 22 7.27355e-03 6.06216e-03 2 24 10 23 5.02049e-03 2.25306e-03 2 26 -- SDD information -- final residual norm : 7.0855e-02 final relative residual norm: 0.026 total outer iterations : 10 average inner iterations : 2.600 average init iterations : 2.300 > convertsdd -b testdata/test1.sdd.dat test1.sdd *** output from convertsdd *** Input File : testdata/test1.sdd.dat Output File : test1.sdd > more testdata/test1.sdd %% Semidiscrete Decomposition (SDD) %% Matrix: testdata/test1.mtx Terms: 10 Accr: 0.00e+00 Tol: 1.00e-02 InnIts: 100 Init: 1 10 5 5 4.7965389490127563476562500e-01 3.2624220848083496093750000e-01 1.9394071400165557861328125e-01 8.4672987461090087890625000e-02 1.0166592150926589965820312e-01 5.8767084032297134399414062e-02 3.8742337375879287719726562e-02 3.2449815422296524047851562e-02 5.5055230855941772460937500e-02 2.3733202368021011352539062e-02 1 1 1 1 1 0 1 -1 1 -1 -1 0 1 1 0 -1 1 1 -1 0 1 -1 0 -1 0 1 1 -1 -1 1 -1 -1 1 0 1 0 -1 -1 1 0 0 0 0 0 1 1 -1 -1 -1 0 1 1 1 1 1 0 0 0 1 -1 0 1 1 -1 0 0 1 0 1 1 1 0 0 0 0 0 1 -1 -1 0 0 1 1 -1 -1 -1 0 -1 1 -1 -1 1 0 0 0 0 0 1 0 0
Even this example demonstrates how much room can be saved using
binary mode rather than text: The file test1.sdd
requires
772 bytes of storage, while test1.sdd.dat
requires only
236 bytes.
mtxfiles.tgz
.
Type "gunzip -c mtxfiles.tgz | tar xvf -" to unzip them.
Copyright © 1999 Tamara G. Kolda and Dianne P. O'Leary
Comments? Tamara.Kolda@na-net.ornl.gov
Last modified: Tue Apr 6 11:23:57 EDT 1999