Skip to main content

Introduction

single-svdlib is a high-performance Rust library for computing Singular Value Decomposition (SVD) on sparse matrices. It provides efficient implementations of both Lanczos and randomized SVD algorithms, making it particularly well-suited for large-scale data analysis and machine learning applications.

What is SVD?

Singular Value Decomposition (SVD) is a fundamental matrix factorization technique that decomposes a matrix $A$ into three matrices:

$$A = U \Sigma V^T$$

Where:

  • $U$ contains the left singular vectors
  • $\Sigma$ is a diagonal matrix containing the singular values
  • $V^T$ contains the transposed right singular vectors

SVD is widely used in dimensionality reduction, latent semantic analysis, matrix approximation, and as a key component in many machine learning algorithms.

Why single-svdlib?

Traditional SVD algorithms struggle with large sparse matrices common in applications like single-cell genomics, recommender systems, and natural language processing. single-svdlib addresses these challenges by:

  1. Optimizing for sparse matrices: Taking advantage of sparsity patterns to reduce computation and memory usage
  2. Providing multiple algorithm options: Choose between Lanczos or randomized methods based on your specific needs
  3. Supporting parallel processing: Built-in multithreading via Rayon for better performance on modern hardware
  4. Offering precision control: Works with both f32 and f64 precision

Key Features

  • Multiple SVD algorithms:

    • Lanczos algorithm: Based on SVDLIBC, excellent for moderate-sized sparse matrices
    • Randomized SVD: Optimized for very large and sparse matrices with configurable accuracy
  • Sparse matrix support:

    • Compressed Sparse Row (CSR) format
    • Compressed Sparse Column (CSC) format
    • Coordinate (COO) format
  • Performance optimizations:

    • Parallel execution with Rayon
    • Adaptive tuning for highly sparse matrices
    • Column masking for subspace SVD
  • Complete API:

    • Consistent interface for all matrix formats
    • Comprehensive error handling and diagnostics
    • Well-documented parameters for algorithm tuning

Architecture

single-svdlib has a modular design with these main components:

  1. Core traits and types: Common interfaces for all SVD algorithms
  2. Lanczos module: Implementation of the LAS2 algorithm from SVDLIBC
  3. Randomized module: Randomized SVD with power iteration and normalization
  4. Matrix abstractions: Adapters for different sparse matrix formats
  5. Utilities: Helper functions for linear algebra operations

Performance Tips

  1. Choose the right algorithm:

    • For matrices up to ~10,000 x 10,000 with moderate sparsity, use the Lanczos algorithm
    • For larger matrices or very high sparsity (>99%), use randomized SVD
  2. Matrix format matters:

    • Convert COO matrices to CSR or CSC before computation
    • CSR typically performs better for row-oriented operations
  3. Tune randomized SVD parameters:

    • Increase n_power_iterations for better accuracy (at the cost of speed)
    • Use PowerIterationNormalizer::QR for numerical stability
    • Adjust n_oversamples based on desired accuracy
  4. Use column masking for operations that only need a subset of the data

Diagnostics

Each SVD computation returns detailed diagnostics to help understand algorithm behavior:

let svd = lanczos::svd(&matrix)?;

// Examine computation details
println!("Non-zero elements: {}", svd.diagnostics.non_zero);
println!("Transposed during computation: {}", svd.diagnostics.transposed);
println!("Lanczos steps: {}", svd.diagnostics.lanczos_steps);
println!("Significant values found: {}", svd.diagnostics.significant_values);

License

single-svdlib is licensed under the BSD License, the same as the original SVDLIBC implementation.