singlet.dataset.dimensionality

class singlet.dataset.dimensionality.DimensionalityReduction(dataset)[source]

Bases: singlet.dataset.plugins.Plugin

Reduce dimensionality of gene expression and phenotype

pca(n_dims=2, transform='log10', robust=True, random_state=None)[source]

Principal component analysis

Parameters:
  • n_dims (int) – Number of dimensions (2+).
  • transform (string or None) – Preprocessing applied to the data before PCA (default 'log10'). None skips preprocessing.
  • robust (bool) – Whether to use Principal Component Pursuit to exclude outliers.
Returns:

dict of the left eigenvectors (vs) and right eigenvectors (us) of the singular value decomposition, the eigenvalues (lambdas), the transform, and the whiten function (for plotting).
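
As a rough illustration of the decomposition described above, the sketch below runs a log10-transformed, centered PCA via singular value decomposition in plain NumPy. It is not singlet's implementation: the function name `pca_sketch`, the pseudocount of 0.1, the features-by-samples layout, and the dictionary keys are all assumptions made for this example, and the robust (Principal Component Pursuit) step and the whiten function are omitted.

```python
import numpy as np


def pca_sketch(counts, n_dims=2, transform='log10'):
    """PCA via SVD on a features x samples array (hypothetical helper)."""
    X = np.asarray(counts, dtype=float)

    if transform == 'log10':
        # Assumption: a pseudocount of 0.1 keeps zero counts finite.
        X = np.log10(X + 0.1)

    # Center each feature (row) across samples.
    Xc = X - X.mean(axis=1, keepdims=True)

    # Thin SVD: Xc = u @ diag(s) @ vt, singular values sorted descending.
    u, s, vt = np.linalg.svd(Xc, full_matrices=False)

    # Eigenvalues of the sample covariance matrix.
    lambdas = s ** 2 / (X.shape[1] - 1)

    return {
        'feature_vectors': u[:, :n_dims],    # one column per component
        'sample_vectors': vt[:n_dims].T,     # one row per sample
        'lambdas': lambdas[:n_dims],
    }


# Usage with synthetic counts: 6 features, 10 samples.
rng = np.random.default_rng(0)
result = pca_sketch(rng.poisson(5, size=(6, 10)), n_dims=2)
```

Plotting the two columns of `sample_vectors` against each other gives the usual two-dimensional PCA scatter of samples.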

tsne(n_dims=2, perplexity=30, theta=0.5, rand_seed=0, **kwargs)[source]

t-distributed stochastic neighbor embedding (t-SNE).

Parameters:
  • n_dims (int) – Number of dimensions to use.
  • perplexity (float) – Perplexity of the algorithm.
  • theta (float) – A number between 0 and 1. Higher is faster but less accurate (via the Barnes-Hut approximation).
  • rand_seed (int) – Random seed. -1 randomizes each run.
  • **kwargs – Named arguments passed to the t-SNE algorithm.
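
To give some intuition for the perplexity parameter: in the standard t-SNE formulation it is 2 raised to the Shannon entropy of a point's conditional neighbor distribution, so it can be read as an effective number of neighbors. The sketch below computes it for a single point with a fixed Gaussian bandwidth. The function name and the fixed `sigma` are assumptions for illustration only; actual t-SNE implementations search for a per-point bandwidth that matches the requested perplexity, and this is not singlet's code.

```python
import numpy as np


def neighbor_perplexity(distances_sq, sigma):
    """Perplexity of one point's Gaussian neighbor distribution (hypothetical helper)."""
    d = np.asarray(distances_sq, dtype=float)

    # Gaussian affinities to every other point, normalized to a distribution.
    p = np.exp(-d / (2.0 * sigma ** 2))
    p = p / p.sum()

    # Drop exact zeros so that log2 stays finite.
    p = p[p > 0]

    entropy = -np.sum(p * np.log2(p))
    return 2.0 ** entropy


# 30 equidistant neighbors behave like exactly 30 effective neighbors.
uniform = neighbor_perplexity(np.ones(30), sigma=1.0)

# One very close neighbor dominates, so the effective count drops towards 1.
peaked = neighbor_perplexity([0.0, 100.0, 100.0, 100.0], sigma=1.0)
```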

Returns:

umap(n_dims=2, rand_seed=0, **kwargs)[source]

Uniform Manifold Approximation and Projection.

Parameters:
  • n_dims (int) – Number of dimensions to use.
  • rand_seed (int) – Random seed. -1 randomizes each run.
  • **kwargs – Named arguments passed to umap.UMAP.
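
The rand_seed convention shared by tsne and umap (a fixed integer for reproducible runs, -1 for a different result each run) can be sketched with the standard library. The helper name `make_rng` is hypothetical, and how singlet or the underlying libraries handle the seed internally is not shown here.

```python
import random


def make_rng(rand_seed=0):
    """Return a random generator following the -1 convention (hypothetical helper)."""
    # -1 maps to None, which seeds from a fresh OS entropy source on every call;
    # any other integer gives a reproducible sequence.
    seed = None if rand_seed == -1 else rand_seed
    return random.Random(seed)


# Same seed, same draws.
first = make_rng(0).random()
second = make_rng(0).random()

# -1 requests a freshly seeded generator each time.
unseeded = make_rng(-1).random()
```

Fixing the seed is useful when an embedding has to be regenerated identically, for example for a figure.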

Returns: