
Linear Algebra Operations

single-algebra provides linear algebra routines built on several interchangeable backends. These operations underpin dimensionality reduction, matrix decomposition, and related analytical techniques for high-dimensional data.

SVD (Singular Value Decomposition)

SVD is a matrix factorization that decomposes a matrix A into three factors, A = UΣV^T, where the columns of U and V are orthonormal and Σ is a diagonal matrix of non-negative singular values. single-algebra implements SVD with multiple backend options:

  • LAPACK-based SVD: Leverages the industry-standard LAPACK library through the nalgebra-lapack crate with OpenBLAS backend for high performance.
  • Faer-based SVD: Uses the Faer library, a pure Rust implementation optimized for modern CPU architectures.
  • Single-SVDLib: A specialized implementation for large sparse matrices, supporting both the Lanczos algorithm and randomized approaches.

All SVD implementations expose a consistent interface for retrieving:

  • Left singular vectors (U)
  • Singular values (Σ)
  • Right singular vectors (V^T)
  • Matrix reconstruction

// Example of using LAPACK-based SVD
let mut svd = crate::svd::lapack::SVD::new();
svd.compute(matrix.view())?;

// Access components
let u = svd.u().unwrap();
let s = svd.s().unwrap();
let vt = svd.vt().unwrap();

// Reconstruct the original matrix
let reconstructed = svd.reconstruct().unwrap();
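
To make the reconstruction step concrete, the following is a minimal std-only sketch of what rebuilding a matrix from its factors means, A = U · diag(s) · V^T. The function name `reconstruct_from_svd` and the row-major `Vec<Vec<f64>>` layout are assumptions made for illustration; they are not part of the single-algebra API, whose `reconstruct()` may be implemented differently.

```rust
/// Rebuild A = U * diag(s) * V^T from SVD factors.
/// Hypothetical helper for illustration only (not single-algebra's API).
/// Shapes: `u` is m x k, `s` has k entries, `vt` is k x n (all row-major).
fn reconstruct_from_svd(u: &[Vec<f64>], s: &[f64], vt: &[Vec<f64>]) -> Vec<Vec<f64>> {
    let m = u.len();
    let n = vt.first().map_or(0, |row| row.len());
    let mut a = vec![vec![0.0; n]; m];
    for i in 0..m {
        for (k, &sigma) in s.iter().enumerate() {
            // Scale the k-th left singular direction by its singular value...
            let scaled = u[i][k] * sigma;
            // ...and accumulate its outer product with the k-th row of V^T.
            for j in 0..n {
                a[i][j] += scaled * vt[k][j];
            }
        }
    }
    a
}
```

Passing only the first few singular values and the matching columns of U and rows of V^T yields a low-rank approximation rather than the exact matrix.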

PCA (Principal Component Analysis)

PCA is implemented as a higher-level abstraction built on SVD, with multiple implementations optimized for different use cases:

Dense PCA

Full-matrix PCA implementation with comprehensive features:

// Create a PCA model with LapackSVD backend
let mut pca = PCABuilder::new(LapackSVD)
    .n_components(2)
    .center(true)
    .scale(false)
    .build();

// Fit the model to data
pca.fit(data.view())?;

// Transform data to the principal component space
let transformed = pca.transform(data.view())?;
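
As a rough illustration of what centering and transforming involve, the sketch below subtracts column means and projects the centered rows onto a set of component vectors. It assumes row-major `Vec<Vec<f64>>` data and components stored one per row (as the rows of V^T); the helpers `center_columns` and `project` are hypothetical and do not reflect how single-algebra implements `fit` or `transform` internally.

```rust
/// Subtract each column's mean. Returns (centered data, column means).
/// Hypothetical helper; mirrors the idea behind `.center(true)`.
fn center_columns(data: &[Vec<f64>]) -> (Vec<Vec<f64>>, Vec<f64>) {
    let n_rows = data.len();
    let n_cols = data.first().map_or(0, |row| row.len());
    let mut means = vec![0.0; n_cols];
    for row in data {
        for (j, &x) in row.iter().enumerate() {
            means[j] += x;
        }
    }
    for m in means.iter_mut() {
        *m /= n_rows as f64;
    }
    let centered: Vec<Vec<f64>> = data
        .iter()
        .map(|row| row.iter().zip(&means).map(|(x, m)| x - m).collect())
        .collect();
    (centered, means)
}

/// Project centered rows onto principal components.
/// `components` holds one component per row, each of length n_features.
fn project(centered: &[Vec<f64>], components: &[Vec<f64>]) -> Vec<Vec<f64>> {
    centered
        .iter()
        .map(|row| {
            components
                .iter()
                .map(|pc| row.iter().zip(pc).map(|(x, w)| x * w).sum::<f64>())
                .collect()
        })
        .collect()
}
```

The means returned by `center_columns` would need to be kept so that new samples can be centered the same way before projection.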

Sparse PCA

Specialized version optimized for sparse matrices, with better memory efficiency for high-dimensional, sparse data:

// Create a Sparse PCA model
let mut sparse_pca = SparsePCABuilder::<f64>::new()
    .n_components(50)
    .center(true)
    .alpha(1.0)
    .svd_method(SVDMethod::Lanczos)
    .build();

// Fit and transform in one step
let transformed = sparse_pca.fit_transform(&sparse_matrix)?;

Masked Sparse PCA

Integrates feature selection into the PCA computation so that only a chosen subset of features is analyzed:

// Create a boolean mask for feature selection
let feature_mask = vec![true, false, true, true, false]; // Only use 1st, 3rd, and 4th features

// Create a Masked Sparse PCA model
let mut masked_pca = MaskedSparsePCABuilder::<f64>::new()
    .n_components(2)
    .mask(feature_mask)
    .svd_method(SVDMethod::Random {
        n_oversamples: 10,
        n_power_iterations: 5,
        normalizer: PowerIterationNormalizer::QR,
    })
    .build();

// Fit and transform
let transformed = masked_pca.fit_transform(&sparse_matrix)?;
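
Conceptually, the mask restricts the computation to the columns marked `true`. The sketch below shows that selection on dense rows for brevity; `select_columns` is a hypothetical helper, and the library presumably applies the mask to its sparse representation without materializing a dense copy.

```rust
/// Keep only the columns whose mask entry is `true`.
/// Hypothetical dense illustration of feature masking (not single-algebra's API).
/// Columns beyond the end of `mask` are dropped because `zip` stops at the shorter input.
fn select_columns(data: &[Vec<f64>], mask: &[bool]) -> Vec<Vec<f64>> {
    data.iter()
        .map(|row| {
            row.iter()
                .zip(mask)
                .filter(|(_, keep)| **keep)
                .map(|(x, _)| *x)
                .collect()
        })
        .collect()
}
```

With the mask from the example above, a five-feature row keeps its 1st, 3rd, and 4th values, so the principal components are computed in a three-dimensional feature space.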

PCA Features

All PCA implementations provide:

  • Configurable Components: Control the number of principal components to extract
  • Centering and Scaling: Options for data preprocessing
  • Variance Analysis: Calculate explained variance ratios and cumulative explained variance
  • Feature Importance: Determine the contribution of each feature to the principal components
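
The variance analysis mentioned above follows from the singular values: for centered data, the variance captured by each component is proportional to its squared singular value. The sketch below is a minimal illustration with hypothetical function names, and it assumes the slice contains all singular values; if only the top few are passed, the ratios are relative to the retained components rather than to the total variance.

```rust
/// Fraction of variance explained by each component, computed as
/// s_i^2 / sum_j s_j^2. Hypothetical helper, not single-algebra's API.
fn explained_variance_ratio(singular_values: &[f64]) -> Vec<f64> {
    let total: f64 = singular_values.iter().map(|s| s * s).sum();
    if total == 0.0 {
        // A zero matrix has no variance to explain.
        return vec![0.0; singular_values.len()];
    }
    singular_values.iter().map(|s| s * s / total).collect()
}

/// Running total of the ratios, useful for choosing n_components.
fn cumulative_explained_variance(ratios: &[f64]) -> Vec<f64> {
    let mut acc = 0.0;
    ratios
        .iter()
        .map(|r| {
            acc += r;
            acc
        })
        .collect()
}
```

A common use is to pick the smallest number of components whose cumulative ratio exceeds a chosen threshold such as 0.9.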

The modular architecture allows for extension with different SVD implementations via the SVDImplementation trait, enabling easy switching between backends.

SVD Method Selection

single-algebra supports multiple SVD computation methods:

// Lanczos algorithm - good for sparse matrices
let svd_method = SVDMethod::Lanczos;

// Randomized SVD - faster for very large matrices
let svd_method = SVDMethod::Random {
    n_oversamples: 10, // Additional dimensions for better accuracy
    n_power_iterations: 7, // Power iterations for enhanced accuracy
    normalizer: PowerIterationNormalizer::QR, // Normalization method
};
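
To give some intuition for why power iterations improve accuracy, the sketch below runs plain power iteration on A^T A to estimate the largest singular value. This is a simplified stand-in, not the randomized SVD used by the library: it tracks a single vector, normalizes by its length instead of using QR, and starts from a fixed all-ones vector (a real implementation would start from random vectors, since a fixed start can be orthogonal to the dominant direction). All function names are hypothetical.

```rust
/// y = A x for a row-major matrix.
fn mat_vec(a: &[Vec<f64>], x: &[f64]) -> Vec<f64> {
    a.iter()
        .map(|row| row.iter().zip(x).map(|(aij, xj)| aij * xj).sum::<f64>())
        .collect()
}

/// x = A^T y for a row-major matrix.
fn mat_t_vec(a: &[Vec<f64>], y: &[f64]) -> Vec<f64> {
    let n = a.first().map_or(0, |row| row.len());
    let mut out = vec![0.0; n];
    for (row, &yi) in a.iter().zip(y) {
        for (j, &aij) in row.iter().enumerate() {
            out[j] += aij * yi;
        }
    }
    out
}

fn norm(x: &[f64]) -> f64 {
    x.iter().map(|v| v * v).sum::<f64>().sqrt()
}

/// Estimate the largest singular value of `a` by power iteration on A^T A.
/// Each iteration damps the weaker directions by a factor of
/// (sigma_i / sigma_1)^2, which is why a few iterations sharpen the estimate.
fn top_singular_value(a: &[Vec<f64>], n_iterations: usize) -> f64 {
    let n = a.first().map_or(0, |row| row.len());
    if n == 0 {
        return 0.0;
    }
    // Fixed, normalized starting vector (assumption: not orthogonal to the answer).
    let mut v = vec![1.0 / (n as f64).sqrt(); n];
    for _ in 0..n_iterations {
        let w = mat_t_vec(a, &mat_vec(a, &v));
        let len = norm(&w);
        if len == 0.0 {
            return 0.0;
        }
        v = w.iter().map(|x| x / len).collect();
    }
    norm(&mat_vec(a, &v))
}
```

When the singular values decay slowly, more iterations are needed for the same accuracy, which is the trade-off that `n_power_iterations` exposes.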

Matrix Operations

The library provides comprehensive operations for both sparse and dense matrices:

Matrix Similarity Measures

// Calculate cosine similarity between vectors
let cosine_sim = CosineSimilarity;
let similarity = cosine_sim.calculate(vector_a.view(), vector_b.view());

// Calculate Euclidean similarity
let euclidean_sim = EuclideanSimilarity::new(1.0);
let similarity = euclidean_sim.calculate(vector_a.view(), vector_b.view());

// Other similarity measures
let pearson = PearsonSimilarity;
let manhattan = ManhattanSimilarity::default();
let jaccard = JaccardSimilarity::new(0.001);
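
For reference, cosine similarity is the dot product of two vectors divided by the product of their Euclidean norms. The std-only sketch below shows that definition on plain slices; `cosine_similarity` is a hypothetical function, and its handling of mismatched lengths and zero vectors (returning `None`) is an assumption that may differ from how `CosineSimilarity` behaves.

```rust
/// Cosine similarity: dot(a, b) / (|a| * |b|).
/// Returns `None` for mismatched lengths or when either vector has zero norm,
/// since the quantity is undefined in those cases.
/// Hypothetical helper, not single-algebra's API.
fn cosine_similarity(a: &[f64], b: &[f64]) -> Option<f64> {
    if a.len() != b.len() {
        return None;
    }
    let dot: f64 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f64>().sqrt();
    let norm_b = b.iter().map(|y| y * y).sum::<f64>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return None;
    }
    Some(dot / (norm_a * norm_b))
}
```

The result lies in [-1, 1] up to rounding: parallel vectors score 1, orthogonal vectors score 0, and the measure ignores vector magnitude.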

Sparse Matrix Utilities

Efficient operations optimized for sparse matrices:

// Count non-zero elements per column
let nnz_per_col: Vec<u32> = sparse_matrix.nonzero_col()?;

// Calculate column sums
let col_sums: Vec<f64> = sparse_matrix.sum_col()?;

// Calculate column variances
let variances: Vec<f64> = sparse_matrix.var_col::<u32, f64>()?;

// Find min/max values per column
let (min_vals, max_vals): (Vec<f64>, Vec<f64>) = sparse_matrix.min_max_col()?;

// Normalize columns to sum to 1.0
sparse_matrix.normalize(&col_sums, 1.0, &Direction::COLUMN)?;
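
Column statistics are cheap on a compressed sparse column (CSC) layout because each column's stored values are contiguous. The sketch below assumes the usual CSC arrays, where `indptr[j]..indptr[j + 1]` delimits column j inside `data`; row indices are not needed for these statistics and are omitted. The functions are hypothetical illustrations, they assume `n_rows > 0`, and the variance shown is the population variance with implicit zeros included, which may not match the library's convention.

```rust
/// Number of stored entries in each column.
fn nnz_per_col(indptr: &[usize]) -> Vec<usize> {
    indptr.windows(2).map(|w| w[1] - w[0]).collect()
}

/// Sum of each column (implicit zeros contribute nothing).
fn sum_per_col(indptr: &[usize], data: &[f64]) -> Vec<f64> {
    indptr
        .windows(2)
        .map(|w| data[w[0]..w[1]].iter().sum::<f64>())
        .collect()
}

/// Population variance of each column, counting implicit zeros.
fn var_per_col(indptr: &[usize], data: &[f64], n_rows: usize) -> Vec<f64> {
    indptr
        .windows(2)
        .map(|w| {
            let vals = &data[w[0]..w[1]];
            let n = n_rows as f64;
            let mean = vals.iter().sum::<f64>() / n;
            let stored: f64 = vals.iter().map(|v| (v - mean).powi(2)).sum();
            // Each implicit zero deviates from the mean by exactly `mean`.
            let n_zeros = (n_rows - vals.len()) as f64;
            (stored + n_zeros * mean * mean) / n
        })
        .collect()
}

/// Scale each column in place so that it sums to `target`.
/// Columns that sum to zero are left unchanged.
fn normalize_cols(indptr: &[usize], data: &mut [f64], target: f64) {
    for w in indptr.windows(2) {
        let col = &mut data[w[0]..w[1]];
        let sum: f64 = col.iter().sum();
        if sum != 0.0 {
            for v in col.iter_mut() {
                *v *= target / sum;
            }
        }
    }
}
```

None of these functions touch the zero entries, so their cost scales with the number of stored values rather than with the full matrix size.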

Performance Features

single-algebra includes numerous optimizations for high-performance computing:

  • Parallel Processing: Multi-threaded implementations using Rayon for critical operations
  • Chunk-wise Processing: Efficient data handling for better cache utilization
  • Specialized Sparse Algorithms: Algorithms tailored to exploit sparsity patterns
  • SIMD Acceleration: Optional SIMD support through the simba feature
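
The idea behind chunk-wise parallel processing can be sketched with the standard library alone: split the data into contiguous chunks, reduce each chunk on its own thread, then combine the partial results. The example below uses `std::thread::scope` and spawns one thread per chunk, which is a simplification; the library is described above as using Rayon, which schedules work on a thread pool instead. The function name `parallel_sum_of_squares` is hypothetical.

```rust
use std::thread;

/// Sum of squares computed chunk by chunk on scoped threads.
/// Hypothetical illustration; spawns one thread per chunk, so `chunk_size`
/// should be large enough to keep the thread count reasonable.
fn parallel_sum_of_squares(values: &[f64], chunk_size: usize) -> f64 {
    // `chunks` panics on a size of zero, so clamp to at least 1.
    let chunk_size = chunk_size.max(1);
    thread::scope(|scope| {
        // Each worker reduces one contiguous chunk, which is cache-friendly.
        let handles: Vec<_> = values
            .chunks(chunk_size)
            .map(|chunk| scope.spawn(move || chunk.iter().map(|v| v * v).sum::<f64>()))
            .collect();
        // Combine the partial sums in chunk order.
        handles
            .into_iter()
            .map(|h| h.join().expect("worker thread panicked"))
            .sum::<f64>()
    })
}
```

Because floating-point addition is not associative, the chunked result can differ from a sequential sum in the last few bits for general inputs.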

Integration with External Libraries

single-algebra provides seamless integration with multiple linear algebra ecosystems:

  • nalgebra: Core integration for general-purpose linear algebra operations
  • ndarray: Integration for n-dimensional array processing
  • Faer: Modern, SIMD-optimized implementations
  • BLAS/LAPACK: Industry-standard high-performance routines through OpenBLAS

Application Areas

These linear algebra operations serve as building blocks for:

  • Dimensionality reduction in high-dimensional data (like single-cell RNA-seq)
  • Feature extraction in machine learning pipelines
  • Signal processing and data compression
  • Network and graph analysis
  • Statistical modeling and inference

The modular design of single-algebra allows users to select the most appropriate implementation for their specific use case, balancing accuracy, performance, and memory requirements.