build_tools.syllable_analysis.dimensionality.mapping
Coordinate mapping utilities for dimensionality reduction results.
This module provides functions for creating and saving mappings between syllables and their reduced-dimension coordinates (e.g., from t-SNE, PCA, UMAP).
Functions
create_tsne_mapping(records, tsne_coords) | Create syllable→features→coordinates mapping.
save_tsne_mapping(mapping, output_path, indent=2) | Save t-SNE mapping to JSON file.
Module Contents
- build_tools.syllable_analysis.dimensionality.mapping.create_tsne_mapping(records, tsne_coords)
Create syllable→features→coordinates mapping.
Combines annotated syllable records with their t-SNE coordinates to create a comprehensive mapping structure. This is useful for:
- Post-hoc cluster analysis
- Cross-referencing visualizations
- Interactive exploration
- Sharing visualizations with collaborators
- Parameters:
records (list[dict]) – Original annotated syllable records from load_annotated_syllables(). Each record should have:
- syllable (str): The syllable text
- frequency (int): Occurrence count
- features (dict): Boolean feature flags
tsne_coords (numpy.ndarray) – t-SNE coordinate array (n_syllables × n_dimensions). Typically 2D for visualization, but can be 3D or higher.
- Returns:
List of mapping records with structure:
[
  {
    "syllable": "kran", "frequency": 7, "tsne_x": -2.34, "tsne_y": 5.67, "features": {…}
  }
]
- Return type:
list[dict]
- Raises:
ValueError – If records and tsne_coords have mismatched lengths
Example
>>> import numpy as np
>>> records = [
...     {"syllable": "ka", "frequency": 187, "features": {...}},
...     {"syllable": "ran", "frequency": 42, "features": {...}}
... ]
>>> coords = np.array([[-2.1, 3.4], [1.5, -0.8]])
>>> mapping = create_tsne_mapping(records, coords)
>>> mapping[0]["tsne_x"]
-2.1
>>> mapping[0]["syllable"]
'ka'
Notes
Array indices preserve order from input records
Coordinates are converted from numpy float64 to Python float for JSON compatibility
All original record fields are preserved in the mapping
For 2D t-SNE: creates tsne_x and tsne_y fields
For 3D+ t-SNE: creates tsne_x, tsne_y, tsne_z, … fields
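The behaviour described in these notes can be illustrated with a minimal sketch. It is not the module's actual source: `sketch_create_mapping`, `AXIS_NAMES`, and the `starts_with_vowel` feature flag are hypothetical names chosen for illustration. The sketch stops at three dimensions because the field names beyond `tsne_z` are not specified here.

```python
import numpy as np

# Hypothetical axis suffixes; the documentation only names tsne_x, tsne_y, tsne_z.
AXIS_NAMES = ("x", "y", "z")


def sketch_create_mapping(records, tsne_coords):
    """Illustrative stand-in for create_tsne_mapping (not the real implementation)."""
    coords = np.asarray(tsne_coords)
    if coords.ndim != 2:
        raise ValueError("tsne_coords must be a 2D array (n_syllables x n_dimensions)")
    if len(records) != coords.shape[0]:
        raise ValueError(
            f"Mismatched lengths: {len(records)} records vs {coords.shape[0]} coordinate rows"
        )
    if coords.shape[1] > len(AXIS_NAMES):
        # Limitation of this sketch only, not of the documented function.
        raise NotImplementedError("this sketch only names up to three axes")

    mapping = []
    for record, row in zip(records, coords):  # zip keeps the input order
        entry = dict(record)  # shallow copy, so all original fields are preserved
        for axis, value in zip(AXIS_NAMES, row):
            entry[f"tsne_{axis}"] = float(value)  # numpy float64 -> Python float
        mapping.append(entry)
    return mapping


# Placeholder data; "starts_with_vowel" is an invented feature flag.
records = [
    {"syllable": "ka", "frequency": 187, "features": {"starts_with_vowel": False}},
    {"syllable": "ran", "frequency": 42, "features": {"starts_with_vowel": False}},
]
coords = np.array([[-2.1, 3.4], [1.5, -0.8]])
mapping = sketch_create_mapping(records, coords)
```

Converting each value with `float()` matters because the standard `json` module handles plain Python floats reliably, which keeps the mapping ready for `save_tsne_mapping()`.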
- build_tools.syllable_analysis.dimensionality.mapping.save_tsne_mapping(mapping, output_path, indent=2)
Save t-SNE mapping to JSON file.
Writes the syllable→coordinates mapping as formatted JSON for human readability and programmatic access.
- Parameters:
mapping (list[dict]) – Mapping data from create_tsne_mapping()
output_path (pathlib.Path) – Output file path (should end in .json)
indent (int) – JSON indentation for readability (default: 2)
Example
>>> from pathlib import Path
>>> mapping = [{"syllable": "ka", "tsne_x": -2.1, "tsne_y": 3.4, "features": {...}}]
>>> save_tsne_mapping(mapping, Path("output.json"))
Notes
Output is formatted with indentation for human readability
Uses ensure_ascii=False to preserve Unicode characters
UTF-8 encoding ensures international character support
Parent directories are created if they don’t exist
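The notes above map closely onto the standard library's `json` and `pathlib` modules. The following is a minimal sketch of that behaviour, assuming nothing beyond what the notes state; `sketch_save_mapping` and the sample data (including the non-ASCII syllable) are hypothetical, and the real function may differ in detail.

```python
import json
import tempfile
from pathlib import Path


def sketch_save_mapping(mapping, output_path, indent=2):
    """Illustrative stand-in for save_tsne_mapping (not the real implementation)."""
    output_path = Path(output_path)
    # Create missing parent directories before writing.
    output_path.parent.mkdir(parents=True, exist_ok=True)
    with output_path.open("w", encoding="utf-8") as handle:
        # ensure_ascii=False writes non-ASCII characters as-is instead of \uXXXX escapes.
        json.dump(mapping, handle, indent=indent, ensure_ascii=False)


# Placeholder mapping with a non-ASCII syllable to exercise the Unicode handling.
demo_mapping = [{"syllable": "ñu", "tsne_x": -2.1, "tsne_y": 3.4, "features": {}}]

with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "nested" / "output.json"  # "nested" does not exist yet
    sketch_save_mapping(demo_mapping, target)
    raw_text = target.read_text(encoding="utf-8")

loaded = json.loads(raw_text)
```

Reading the file back with `json.loads` gives a list equal to the one that was written, so the saved mapping can be reloaded for later analysis without any numpy dependency.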