Utilities Module

The memory/utils module provides utility functions for working with in-memory data representations in SingleRust. It covers numeric type conversion, conversion between sparse and dense matrix representations, and DataFrame construction from collections.

Type Conversion Utilities

One of the primary roles of this module is to handle type conversions between different numerical formats.

Converting to Float Types

When performing numerical operations like normalization, data often needs to be in floating-point format. The convert_to_float_if_non_float_type function handles this conversion:

use single_rust::memory::utils::convert_to_float_if_non_float_type;
use single_rust::shared::Precision;
use anndata_memory::IMArrayElement;

// Convert matrix to floating-point (f32 or f64)
convert_to_float_if_non_float_type(&matrix, Some(Precision::Single))?;

This function will convert integer or other numeric types to floating-point format, with options for:

  • Single precision (f32) - Faster, uses less memory
  • Double precision (f64) - Higher accuracy, but more memory-intensive
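
To make the trade-off between the two precisions concrete, here is a minimal standalone sketch that uses only the standard library. The helper names (`to_f32`, `to_f64`, `buffer_bytes`) are hypothetical and not part of SingleRust; they only illustrate what an integer-to-float conversion costs and what it can lose.

```rust
/// Converts integer counts to single-precision floats.
/// f32 has a 24-bit significand, so integers above 2^24 may be rounded.
fn to_f32(values: &[i32]) -> Vec<f32> {
    values.iter().map(|&v| v as f32).collect()
}

/// Converts integer counts to double-precision floats.
/// Every i32 value is exactly representable as an f64.
fn to_f64(values: &[i32]) -> Vec<f64> {
    values.iter().map(|&v| f64::from(v)).collect()
}

/// Bytes needed to store `n` elements of type `T`.
fn buffer_bytes<T>(n: usize) -> usize {
    n * std::mem::size_of::<T>()
}
```

For typical count data the rounding is irrelevant, but a value such as 16_777_217 (2^24 + 1) becomes 16_777_216.0 in f32 while staying exact in f64. `buffer_bytes::<f32>(n)` is always half of `buffer_bytes::<f64>(n)`.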

Checking for Conversion Needs

To check if a matrix needs conversion before operations:

use single_rust::memory::utils::target_type_float_need_conversion_in_memory;

let needs_conversion = target_type_float_need_conversion_in_memory(&matrix_datatype)?;
if needs_conversion {
    // Perform conversion
}
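
The check-then-convert pattern can be sketched without the library as follows. The `ElementType` enum and both functions are assumptions made for illustration; the real crate uses its own data type representation and may decide differently.

```rust
/// Hypothetical element types, standing in for the crate's own data type enum.
#[derive(Debug, Clone, Copy, PartialEq)]
enum ElementType {
    I32,
    U32,
    I64,
    F32,
    F64,
}

/// Returns true when values of this type must be converted before float math.
fn needs_float_conversion(dtype: ElementType) -> bool {
    !matches!(dtype, ElementType::F32 | ElementType::F64)
}

/// Converts to F64 only when required and reports whether a conversion happened.
fn ensure_float(dtype: &mut ElementType) -> bool {
    if needs_float_conversion(*dtype) {
        *dtype = ElementType::F64;
        true
    } else {
        false
    }
}
```

Checking first avoids copying a matrix that is already in floating-point format.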

Matrix Format Conversions

Converting Between Array Types

Functions for converting between different array representations:

// Convert CSR matrix to Array2<f64>
let array2d = convert_to_array_f64_csr(csr_matrix, shape)?;

// Convert CSC matrix to Array2<f64>
let array2d = convert_to_array_f64_csc(csc_matrix, shape)?;

// Convert dynamic array to Array2<f64>
let array2d = convert_to_array_f64_array(dyn_array)?;
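
As an illustration of what a CSR-to-dense conversion does, the following sketch expands a CSR matrix into a row-major buffer using only the standard library. `CsrMatrix` and `csr_to_dense` are hypothetical names; the library functions above return an `Array2<f64>` and may be implemented differently.

```rust
/// Minimal CSR matrix: `indptr[r]..indptr[r + 1]` indexes the entries of row `r`.
struct CsrMatrix {
    n_rows: usize,
    n_cols: usize,
    indptr: Vec<usize>,
    indices: Vec<usize>,
    data: Vec<f64>,
}

/// Expands a CSR matrix into a dense row-major buffer of length n_rows * n_cols.
fn csr_to_dense(m: &CsrMatrix) -> Vec<f64> {
    // Every position starts as an implicit zero.
    let mut dense = vec![0.0; m.n_rows * m.n_cols];
    for row in 0..m.n_rows {
        for k in m.indptr[row]..m.indptr[row + 1] {
            dense[row * m.n_cols + m.indices[k]] = m.data[k];
        }
    }
    dense
}
```

A CSC matrix can be expanded the same way by iterating over columns instead of rows.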

Selective Conversions

You can also convert only selected parts of matrices:

use single_rust::shared::convert_to_array_f64_selected;
use anndata::data::SelectInfoElem;

// Convert only specific rows and columns
let dense_submatrix = convert_to_array_f64_selected(
    &array_data,
    shape,
    &row_selection, // SelectInfoElem for rows
    &col_selection, // SelectInfoElem for columns
)?;
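
The idea behind a selective conversion is that only the requested rows are visited, so the full matrix is never densified. The sketch below shows this for raw CSR arrays with plain index lists; `csr_select_to_dense` is a hypothetical helper, and it assumes the selected column indices are unique (the real function takes `SelectInfoElem` values instead).

```rust
/// Densifies only the requested rows and columns of a CSR matrix.
/// The result is row-major with shape rows.len() x cols.len().
fn csr_select_to_dense(
    indptr: &[usize],
    indices: &[usize],
    data: &[f64],
    rows: &[usize],
    cols: &[usize],
) -> Vec<f64> {
    let mut out = vec![0.0; rows.len() * cols.len()];
    for (i, &row) in rows.iter().enumerate() {
        for k in indptr[row]..indptr[row + 1] {
            // Map the stored column to its position in the selection, if any.
            if let Some(j) = cols.iter().position(|&c| c == indices[k]) {
                out[i * cols.len() + j] = data[k];
            }
        }
    }
    out
}
```

The output follows the order of `rows` and `cols`, so the selection can also reorder the submatrix. The linear search over `cols` keeps the sketch short; a lookup table would be a better fit for large selections.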

DataFrame Creation

Create DataFrames from collections:

use single_rust::memory::utils::{create_dataframe_from_map, create_string_dataframe_from_map};
use std::collections::HashMap;

// Create DataFrame from numeric data
let mut values_map: HashMap<String, Vec<f64>> = HashMap::new();
// ... populate map ...
let df = create_dataframe_from_map(&values_map)?;

// Create DataFrame from string data
let mut string_map: HashMap<String, Vec<String>> = HashMap::new();
// ... populate map ...
let df = create_string_dataframe_from_map(&string_map)?;
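
A DataFrame needs every column to have the same number of rows, so it can help to validate a map before passing it on. The following sketch is a hypothetical pre-check (`common_len` is not part of SingleRust, and the library may already perform an equivalent check internally):

```rust
use std::collections::HashMap;

/// Returns the shared column length, or an error naming a column whose
/// length differs from the others. An empty map has length 0.
fn common_len(map: &HashMap<String, Vec<f64>>) -> Result<usize, String> {
    let mut expected: Option<usize> = None;
    for (name, values) in map {
        match expected {
            None => expected = Some(values.len()),
            Some(n) if n != values.len() => {
                return Err(format!(
                    "column '{}' has {} values, expected {}",
                    name,
                    values.len(),
                    n
                ));
            }
            Some(_) => {}
        }
    }
    Ok(expected.unwrap_or(0))
}
```

Because `HashMap` iteration order is unspecified, which column the error names can vary between runs. For the same reason, column order in a DataFrame built from a `HashMap` should not be relied on unless the library documents it.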

Sparse Matrix Conversion

Convert the numeric element type of a sparse matrix while keeping its storage format (CSR or CSC):

// Convert CSR matrix with specific numeric type
let f32_csr_matrix = convert_csr_sparse_matrix::<i32, f32>(i32_csr_matrix)?;
let f64_csr_matrix = convert_csr_sparse_matrix::<f32, f64>(f32_csr_matrix)?;

// Convert CSC matrix with specific numeric type
let f32_csc_matrix = convert_csc_sparse_matrix::<i32, f32>(i32_csc_matrix)?;
let f64_csc_matrix = convert_csc_sparse_matrix::<f32, f64>(f32_csc_matrix)?;
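
Conceptually, such a conversion only touches the stored values; the sparsity pattern (row pointers and column indices) can be reused as is. A minimal generic sketch, with a hypothetical `Csr<T>` container and `convert_values` function that are not the crate's actual types:

```rust
/// Minimal CSR container, generic over the stored value type.
#[derive(Debug, Clone, PartialEq)]
struct Csr<T> {
    n_rows: usize,
    n_cols: usize,
    indptr: Vec<usize>,
    indices: Vec<usize>,
    data: Vec<T>,
}

/// Converts the stored values with `f`, moving the sparsity pattern over unchanged.
fn convert_values<T, U>(m: Csr<T>, f: impl Fn(T) -> U) -> Csr<U> {
    Csr {
        n_rows: m.n_rows,
        n_cols: m.n_cols,
        indptr: m.indptr,
        indices: m.indices,
        data: m.data.into_iter().map(f).collect(),
    }
}
```

For example, `convert_values(ints, |v| v as f32)` turns a `Csr<i32>` into a `Csr<f32>`, and `convert_values(single, f64::from)` widens it to `Csr<f64>`. Only the value buffer is reallocated.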

Performance Considerations

When working with the utility functions in this module, keep these performance considerations in mind:

  1. Memory Usage: Converting to a wider type (for example, i32 or f32 to f64) increases memory usage.
  2. Precision Tradeoffs: Precision::Single (f32) uses half the memory of Precision::Double (f64), at the cost of lower precision.
  3. Copy Operations: Many conversion functions create copies of the data, which can be memory-intensive for large matrices.
  4. Sparse vs. Dense: Converting sparse matrices to dense formats significantly increases memory usage.
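
The sparse-versus-dense cost can be estimated with simple arithmetic. The sketch below assumes f64 values and `usize` indices for the CSR layout; both function names are hypothetical, and real containers add a small amount of overhead on top.

```rust
/// Bytes for a dense f64 matrix.
fn dense_bytes(n_rows: usize, n_cols: usize) -> usize {
    n_rows * n_cols * std::mem::size_of::<f64>()
}

/// Bytes for a CSR matrix with f64 values and usize indices:
/// one value and one column index per stored entry, plus n_rows + 1 row pointers.
fn csr_bytes(n_rows: usize, nnz: usize) -> usize {
    nnz * (std::mem::size_of::<f64>() + std::mem::size_of::<usize>())
        + (n_rows + 1) * std::mem::size_of::<usize>()
}
```

For a 10,000 x 2,000 matrix in which 5% of the entries are non-zero (1,000,000 stored values), the dense form needs 160,000,000 bytes, while the CSR form needs 16,080,008 bytes on a 64-bit target, roughly a tenth of the dense size.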

Example: Type Conversion for Normalization

A common use case is converting data types before normalization:

use single_rust::memory::utils::convert_to_float_if_non_float_type;
use single_rust::memory::processing::normalize_expression;
use single_rust::shared::{Precision, Direction};
use anndata_memory::IMArrayElement;

fn prepare_and_normalize(matrix: &IMArrayElement, target: u32) -> anyhow::Result<()> {
    // Ensure the matrix is in single-precision float format
    convert_to_float_if_non_float_type(matrix, Some(Precision::Single))?;

    // Now perform normalization
    normalize_expression(matrix, target, &Direction::ROW, None)?;

    Ok(())
}

This ensures that the matrix is in the right format before applying numerical operations.
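
To see why the conversion matters, consider scaling one row of counts to a target sum. This standalone sketch assumes that normalization means "rescale each row so it sums to the target"; the two functions are hypothetical and are not the implementation of normalize_expression.

```rust
/// Scales one row of counts so that it sums to `target`, working in f32.
fn normalize_row(counts: &[u32], target: f32) -> Vec<f32> {
    let total: u32 = counts.iter().sum();
    if total == 0 {
        // An all-zero row stays all-zero instead of dividing by zero.
        return vec![0.0; counts.len()];
    }
    let scale = target / total as f32;
    counts.iter().map(|&c| c as f32 * scale).collect()
}

/// The same scaling in integer arithmetic, where each division truncates.
fn normalize_row_integer(counts: &[u32], target: u32) -> Vec<u32> {
    let total: u32 = counts.iter().sum();
    if total == 0 {
        return vec![0; counts.len()];
    }
    counts.iter().map(|&c| c * target / total).collect()
}
```

With counts `[1, 2, 1]` and a target of 10, the float version yields `[2.5, 5.0, 2.5]`, which sums to 10. The integer version yields `[2, 5, 2]`, which sums to 9, because the fractional parts are lost.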

Advanced Usage: Custom DataFrame Creation

You can also collect custom per-category statistics in a map and turn them into a DataFrame:

use std::collections::HashMap;
use polars::prelude::DataFrame;
use single_rust::memory::utils::create_dataframe_from_map;

fn compute_custom_statistics(data: &[f64], categories: &[String]) -> anyhow::Result<DataFrame> {
    let mut stats_map: HashMap<String, Vec<f64>> = HashMap::new();

    // Initialize map with empty vectors
    stats_map.insert("mean".to_string(), Vec::new());
    stats_map.insert("median".to_string(), Vec::new());
    stats_map.insert("std".to_string(), Vec::new());

    // Group data by categories and compute statistics
    // ... (calculation logic) ...

    // Create a DataFrame from the computed statistics
    create_dataframe_from_map(&stats_map)
}

This approach is useful for creating structured statistical results that can be added to the AnnData object.
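
One possible way to fill such a map is sketched below using only the standard library. This is an illustration, not the calculation the example above elides: `grouped_statistics` is a hypothetical function, it reports the population standard deviation, and it emits one row per category in sorted category order.

```rust
use std::collections::{BTreeMap, HashMap};

/// Groups `data` by `categories` and returns the "mean", "median", and "std"
/// columns, with one entry per category in sorted category order.
/// If the slices differ in length, the extra elements are ignored.
fn grouped_statistics(data: &[f64], categories: &[String]) -> HashMap<String, Vec<f64>> {
    // BTreeMap keeps categories sorted, so the row order is deterministic.
    let mut groups: BTreeMap<&str, Vec<f64>> = BTreeMap::new();
    for (value, category) in data.iter().zip(categories) {
        groups.entry(category.as_str()).or_default().push(*value);
    }

    let (mut means, mut medians, mut stds) = (Vec::new(), Vec::new(), Vec::new());
    for values in groups.values_mut() {
        // Every group holds at least one value, so n is never zero.
        values.sort_by(|a, b| a.total_cmp(b));
        let n = values.len();
        let mean = values.iter().sum::<f64>() / n as f64;
        let median = if n % 2 == 1 {
            values[n / 2]
        } else {
            (values[n / 2 - 1] + values[n / 2]) / 2.0
        };
        // Population standard deviation (divides by n, not n - 1).
        let variance = values.iter().map(|v| (v - mean).powi(2)).sum::<f64>() / n as f64;
        means.push(mean);
        medians.push(median);
        stds.push(variance.sqrt());
    }

    HashMap::from([
        ("mean".to_string(), means),
        ("median".to_string(), medians),
        ("std".to_string(), stds),
    ])
}
```

All three vectors have the same length, so the resulting map satisfies the equal-length requirement of a DataFrame and could be passed to create_dataframe_from_map.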