build_tools.syllable_analysis.dimensionality.mapping

Coordinate mapping utilities for dimensionality reduction results.

This module provides functions for creating and saving mappings between syllables and their reduced-dimension coordinates (e.g., from t-SNE, PCA, UMAP).

Functions

create_tsne_mapping(records, tsne_coords)

Create syllable→features→coordinates mapping.

save_tsne_mapping(mapping, output_path[, indent])

Save t-SNE mapping to JSON file.

Module Contents

build_tools.syllable_analysis.dimensionality.mapping.create_tsne_mapping(records, tsne_coords)[source]

Create syllable→features→coordinates mapping.

Combines annotated syllable records with their t-SNE coordinates to create a comprehensive mapping structure. This is useful for:

  • Post-hoc cluster analysis

  • Cross-referencing visualizations

  • Interactive exploration

  • Sharing visualizations with collaborators

Parameters:
  • records (List[Dict]) – Original annotated syllable records from load_annotated_syllables(). Each record should have:

      - syllable (str): The syllable text

      - frequency (int): Occurrence count

      - features (dict): Boolean feature flags

  • tsne_coords (numpy.ndarray) – t-SNE coordinate array (n_syllables × n_dimensions). Typically 2D for visualization, but can be 3D or higher.

Returns:

List of mapping records with structure:

[
  {
    "syllable": "kran",
    "frequency": 7,
    "tsne_x": -2.34,
    "tsne_y": 5.67,
    "features": {…}
  }
]

Return type:

List[Dict]

Raises:

ValueError – If records and tsne_coords have mismatched lengths

Example

>>> import numpy as np
>>> records = [
...     {"syllable": "ka", "frequency": 187, "features": {...}},
...     {"syllable": "ran", "frequency": 42, "features": {...}}
... ]
>>> coords = np.array([[-2.1, 3.4], [1.5, -0.8]])
>>> mapping = create_tsne_mapping(records, coords)
>>> mapping[0]["tsne_x"]
-2.1
>>> mapping[0]["syllable"]
'ka'

Notes

  • Output order matches the input order: mapping[i] corresponds to records[i] and tsne_coords[i]

  • Coordinates are converted from numpy float64 to Python float for JSON compatibility

  • All original record fields are preserved in the mapping

  • For 2D t-SNE: creates tsne_x and tsne_y fields

  • For 3D+ t-SNE: creates tsne_x, tsne_y, tsne_z, … fields
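The behaviour described in these notes can be sketched in a few lines. This is an illustrative approximation, not the module's actual source: sketch_create_mapping and AXIS_NAMES are hypothetical names, the empty features dicts are placeholders, and because the field names beyond tsne_z are not spelled out here, the sketch rejects more than three dimensions rather than guessing them.

```python
# Hypothetical axis suffixes; names past "z" are not specified in these docs.
AXIS_NAMES = ("x", "y", "z")


def sketch_create_mapping(records, coords):
    """Illustrative stand-in for create_tsne_mapping (not the real source).

    `coords` may be any sequence of rows, including a numpy.ndarray.
    """
    if len(records) != len(coords):
        raise ValueError(
            f"records ({len(records)}) and coords ({len(coords)}) differ in length"
        )

    mapping = []
    for record, row in zip(records, coords):
        if len(row) > len(AXIS_NAMES):
            # Limitation of this sketch only, see the lead-in.
            raise ValueError("this sketch names at most three axes")
        entry = dict(record)  # shallow copy keeps every original field
        for axis, value in zip(AXIS_NAMES, row):
            # float() turns numpy float64 into a plain Python float
            entry[f"tsne_{axis}"] = float(value)
        mapping.append(entry)
    return mapping


records = [
    {"syllable": "ka", "frequency": 187, "features": {}},   # placeholder features
    {"syllable": "ran", "frequency": 42, "features": {}},
]
coords = [[-2.1, 3.4], [1.5, -0.8]]
mapping = sketch_create_mapping(records, coords)
```

Copying each record before adding coordinate fields leaves the caller's list unmodified, and checking lengths up front avoids zip silently truncating a mismatched pair of inputs.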

build_tools.syllable_analysis.dimensionality.mapping.save_tsne_mapping(mapping, output_path, indent=2)[source]

Save t-SNE mapping to JSON file.

Writes the syllable→coordinates mapping as formatted JSON for human readability and programmatic access.

Parameters:
  • mapping (List[Dict]) – Mapping data from create_tsne_mapping()

  • output_path (pathlib.Path) – Output file path (should end in .json)

  • indent (int) – JSON indentation for readability (default: 2)

Example

>>> from pathlib import Path
>>> mapping = [{"syllable": "ka", "tsne_x": -2.1, "tsne_y": 3.4, "features": {...}}]
>>> save_tsne_mapping(mapping, Path("output.json"))

Notes

  • Output is formatted with indentation for human readability

  • Uses ensure_ascii=False to preserve Unicode characters

  • UTF-8 encoding ensures international character support

  • Parent directories are created if they don’t exist
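As a rough illustration of the notes above, the following sketch writes a mapping with the standard library only. It is an assumption about how such a function could look, not the module's real implementation; sketch_save_mapping and the sample file path are hypothetical.

```python
import json
import tempfile
from pathlib import Path


def sketch_save_mapping(mapping, output_path, indent=2):
    """Illustrative stand-in for save_tsne_mapping (not the real source)."""
    output_path = Path(output_path)
    # Create missing parent directories before writing.
    output_path.parent.mkdir(parents=True, exist_ok=True)
    with output_path.open("w", encoding="utf-8") as handle:
        # ensure_ascii=False writes non-ASCII characters as-is instead of \uXXXX escapes.
        json.dump(mapping, handle, indent=indent, ensure_ascii=False)


# Usage: write into a nested directory that does not exist yet, then read back.
mapping = [{"syllable": "né", "frequency": 3, "tsne_x": -2.1, "tsne_y": 3.4}]
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "nested" / "dir" / "mapping.json"
    sketch_save_mapping(mapping, target)
    raw_text = target.read_text(encoding="utf-8")
    loaded = json.loads(raw_text)
```

Reading the file back with json.loads yields the same list of dicts, which makes the saved file a convenient hand-off format for later analysis steps.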