Introduction
single-svdlib is a high-performance Rust library for computing Singular Value Decomposition (SVD) on sparse matrices. It provides efficient implementations of both Lanczos and randomized SVD algorithms, making it particularly well-suited for large-scale data analysis and machine learning applications.
What is SVD?
Singular Value Decomposition (SVD) is a fundamental matrix factorization technique that decomposes a matrix $A$ into three matrices:
$$A = U \Sigma V^T$$
Where:
- $U$ contains the left singular vectors
- $\Sigma$ is a diagonal matrix containing the singular values
- $V^T$ contains the transposed right singular vectors
SVD is widely used in dimensionality reduction, latent semantic analysis, matrix approximation, and as a key component in many machine learning algorithms.
Why single-svdlib?
Traditional SVD algorithms struggle with large sparse matrices common in applications like single-cell genomics, recommender systems, and natural language processing. single-svdlib addresses these challenges by:
- Optimizing for sparse matrices: Taking advantage of sparsity patterns to reduce computation and memory usage
- Providing multiple algorithm options: Choose between Lanczos or randomized methods based on your specific needs
- Supporting parallel processing: Built-in multithreading via Rayon for better performance on modern hardware
- Offering precision control: Works with both
f32
andf64
precision
Key Features
-
Multiple SVD algorithms:
- Lanczos algorithm: Based on SVDLIBC, excellent for moderate-sized sparse matrices
- Randomized SVD: Optimized for very large and sparse matrices with configurable accuracy
-
Sparse matrix support:
- Compressed Sparse Row (CSR) format
- Compressed Sparse Column (CSC) format
- Coordinate (COO) format
-
Performance optimizations:
- Parallel execution with Rayon
- Adaptive tuning for highly sparse matrices
- Column masking for subspace SVD
-
Complete API:
- Consistent interface for all matrix formats
- Comprehensive error handling and diagnostics
- Well-documented parameters for algorithm tuning
Architecture
single-svdlib has a modular design with these main components:
- Core traits and types: Common interfaces for all SVD algorithms
- Lanczos module: Implementation of the LAS2 algorithm from SVDLIBC
- Randomized module: Randomized SVD with power iteration and normalization
- Matrix abstractions: Adapters for different sparse matrix formats
- Utilities: Helper functions for linear algebra operations
Performance Tips
-
Choose the right algorithm:
- For matrices up to ~10,000 x 10,000 with moderate sparsity, use the Lanczos algorithm
- For larger matrices or very high sparsity (>99%), use randomized SVD
-
Matrix format matters:
- Convert COO matrices to CSR or CSC before computation
- CSR typically performs better for row-oriented operations
-
Tune randomized SVD parameters:
- Increase
n_power_iterations
for better accuracy (at the cost of speed) - Use
PowerIterationNormalizer::QR
for numerical stability - Adjust
n_oversamples
based on desired accuracy
- Increase
-
Use column masking for operations that only need a subset of the data
Diagnostics
Each SVD computation returns detailed diagnostics to help understand algorithm behavior:
let svd = lanczos::svd(&matrix)?;
// Examine computation details
println!("Non-zero elements: {}", svd.diagnostics.non_zero);
println!("Transposed during computation: {}", svd.diagnostics.transposed);
println!("Lanczos steps: {}", svd.diagnostics.lanczos_steps);
println!("Significant values found: {}", svd.diagnostics.significant_values);
License
single-svdlib is licensed under the BSD License, the same as the original SVDLIBC implementation.