Syllable Walker

Overview

Syllable Walker - Phonetic Feature Space Exploration

The syllable walker is a phonetic exploration tool that generates sequences of syllables by “walking” through phonetic feature space using cost-based random selection. It enables corpus analysis, pattern discovery, and exploration of phonetic relationships. This is a build-time analysis tool only - not used during runtime name generation.

The walker explores syllable datasets by moving probabilistically from one syllable to phonetically similar syllables. Each step considers:

  • Phonetic distance - How many features change (Hamming distance)

  • Frequency bias - Preference for common vs rare syllables

  • Temperature - Amount of randomness in selection

  • Inertia - Tendency to stay at current syllable

Key Features:

  • Four pre-configured profiles (clerical, dialect, goblin, ritual)

  • Custom parameter control for fine-tuned exploration

  • Deterministic walks (same seed = same walk, reproducible)

  • Interactive web interface for browser-based exploration

  • Batch processing to generate thousands of walks for analysis

  • Fast operation (<10ms per walk after initialization)

  • Large corpus support (efficiently handles 500k+ syllables)

Main Components:

  • SyllableWalker: Core walking algorithm with efficient neighbor graph

  • WalkProfile: Configuration preset for different walking behaviors

  • WALK_PROFILES: Predefined profiles (clerical, dialect, goblin, ritual)

Usage:
>>> from build_tools.syllable_walk import SyllableWalker
>>>
>>> # Load annotated syllables
>>> walker = SyllableWalker("data/annotated/syllables_annotated.json")
>>>
>>> # Walk using a profile
>>> walk = walker.walk_from_profile(
...     start="ka",
...     profile="dialect",
...     steps=5,
...     seed=42
... )
>>>
>>> # Display walk sequence
>>> print(" → ".join(s["syllable"] for s in walk))
ka → ki → ti → ta → da → de

CLI Usage:

# Walk with a profile
python -m build_tools.syllable_walk data.json --start ka --profile dialect --steps 5

# Launch interactive web interface
python -m build_tools.syllable_walk data.json --web --port 8080

# Batch walks for analysis
python -m build_tools.syllable_walk data.json --batch 100 --profile ritual

Core Concepts

Phonetic Distance

Each syllable has 12 binary phonetic features (from syllable_feature_annotator). The distance between two syllables is the number of features that differ (Hamming distance). The max_flips parameter limits how many features can change in a single step.
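
As a minimal sketch of the distance calculation, the two feature vectors below are copied from the sample output in the Output Format section. The helper name hamming_distance is illustrative only and is not part of build_tools.syllable_walk.

```python
def hamming_distance(a, b):
    """Count the positions at which two equal-length feature vectors differ."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return sum(x != y for x, y in zip(a, b))


# 12-feature vectors for "ka" and "pai" from the Output Format section.
ka = [0, 0, 0, 1, 0, 0, 0, 1, 0, 1, 0, 0]
pai = [0, 0, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0]

distance = hamming_distance(ka, pai)
print(distance)  # 2: a step from "ka" to "pai" needs max_flips >= 2
```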

Neighbor Graph

During initialization, the walker pre-computes which syllables are “neighbors” (within the specified Hamming distance). This enables fast walk generation:

  • Distance 1: ~30 sec initialization, conservative walks

  • Distance 2: ~1 min initialization, moderate walks

  • Distance 3: ~3 min initialization, maximum flexibility

For 500k+ syllable datasets, distance 3 is recommended.
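
The pre-computation can be pictured with the brute-force sketch below. It is an illustration under stated assumptions, not the walker's implementation: the function name build_neighbor_graph and the toy 3-feature vectors are hypothetical, a syllable is assumed not to be its own neighbor, and plain Python is used for brevity where the real tool is documented as using NumPy.

```python
from itertools import combinations


def build_neighbor_graph(feature_vectors, max_distance):
    """Map each index to the indices whose vectors differ in at most max_distance positions."""
    graph = {i: [] for i in range(len(feature_vectors))}
    for i, j in combinations(range(len(feature_vectors)), 2):
        distance = sum(a != b for a, b in zip(feature_vectors[i], feature_vectors[j]))
        if distance <= max_distance:
            # Hamming distance is symmetric, so record the edge in both directions.
            graph[i].append(j)
            graph[j].append(i)
    return graph


# Toy data: four 3-feature vectors forming a chain of single flips.
toy_features = [[0, 0, 0], [0, 0, 1], [0, 1, 1], [1, 1, 1]]
conservative = build_neighbor_graph(toy_features, max_distance=1)
moderate = build_neighbor_graph(toy_features, max_distance=2)
print(conservative)
print(moderate)
```

Raising the distance limit adds edges to every node, which is consistent with the longer initialization times listed above.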

Determinism

The same seed always produces the same walk. This is essential for reproducible experiments, testing, and debugging. Each walk uses an isolated RNG instance to avoid global state contamination.
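
The isolation idea can be sketched with the standard library's random.Random class. The function sample_path is a hypothetical stand-in for a walk and makes no claim about how the walker structures its own code.

```python
import random


def sample_path(options, length, seed=None):
    """Draw a sequence using a private generator so global random state is untouched."""
    rng = random.Random(seed)  # seed=None falls back to system randomness
    return [rng.choice(options) for _ in range(length)]


syllables = ["ka", "ki", "ti", "ta"]
first = sample_path(syllables, 5, seed=42)
second = sample_path(syllables, 5, seed=42)
print(first == second)  # True: same seed, same sequence
```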

Walk Profiles

The walker includes four pre-configured profiles:

Profile   Description                   Steps  Max Flips  Temperature  Freq Weight  Use Case
clerical  Conservative, minimal change  5      1          0.3          1.0          Formal names
dialect   Balanced exploration          5      2          0.7          0.0          General use
goblin    Chaotic, high variation       5      2          1.5          -0.5         Exotic names
ritual    Maximum exploration           5      3          2.5          -1.0         Extreme variation

Frequency Weight controls syllable selection:

  • Positive values (e.g. 1.0) favor common syllables

  • Zero (0.0) is neutral

  • Negative values (e.g. -1.0) favor rare syllables

Temperature controls randomness:

  • Low (0.3) = more deterministic, prefer lowest-cost moves

  • High (2.5) = more random, explore high-cost moves
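
To make the relationship between presets and custom parameters concrete, the sketch below mirrors the table values in a small dataclass and applies overrides on top of a preset. ProfileSketch, PROFILES, and resolve_parameters are hypothetical names used for illustration; the package's own WalkProfile class is described in the API Reference.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ProfileSketch:
    """Illustrative stand-in holding the three tunable values of a profile."""
    name: str
    max_flips: int
    temperature: float
    frequency_weight: float


# Values copied from the profile table above.
PROFILES = {
    "clerical": ProfileSketch("clerical", 1, 0.3, 1.0),
    "dialect": ProfileSketch("dialect", 2, 0.7, 0.0),
    "goblin": ProfileSketch("goblin", 2, 1.5, -0.5),
    "ritual": ProfileSketch("ritual", 3, 2.5, -1.0),
}


def resolve_parameters(profile_name, **overrides):
    """Start from a preset and apply only the overrides that were actually given."""
    given = {key: value for key, value in overrides.items() if value is not None}
    return replace(PROFILES[profile_name], **given)


custom = resolve_parameters("dialect", temperature=1.5, frequency_weight=-0.8)
print(custom)
```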

Command-Line Interface

Explore syllable feature space via cost-based random walks

usage: python -m build_tools.syllable_walk [-h] [--start SYLLABLE]
                                           [--profile NAME] [--steps N]
                                           [--seed SEED] [--max-flips N]
                                           [--temperature T]
                                           [--frequency-weight W]
                                           [--compare-profiles] [--batch N]
                                           [--search QUERY] [--web]
                                           [--output FILE] [--quiet]
                                           [--verbose]
                                           [--max-neighbor-distance N]
                                           [--port PORT]
                                           data_file

Positional Arguments

data_file

Path to syllables_annotated.json file (output of syllable_feature_annotator). This file contains syllables with phonetic features and frequency information. Example: data/annotated/syllables_annotated.json

walk parameters

Parameters controlling syllable walk behavior. These work with any mode except --search.

--start

Starting syllable for the walk. If not specified, a random syllable will be chosen. Must be a syllable present in the data file. Use --search to find valid syllables. Examples: ‘ka’, ‘bak’, ‘the’. Default: random syllable

--profile

Possible choices: clerical, dialect, goblin, ritual

Walk profile preset defining behavior characteristics. Available profiles: clerical (conservative, favors common syllables), dialect (balanced exploration, neutral frequency), goblin (chaotic, favors rare syllables), ritual (maximum exploration, very rare syllables). Each profile has predefined max_flips, temperature, and frequency_weight values. Can be overridden with custom parameters. Default: dialect

--steps

Number of steps to take in the walk. Each step visits one syllable. Output length will be steps + 1 (includes starting syllable). Valid range: 0-1000. Examples: 5 (quick walk), 20 (longer exploration). Default: 5

--seed

Random seed for reproducible walks. Same seed with same parameters always produces identical walks. This is useful for testing, debugging, or generating consistent examples. If not specified, uses system randomness (non-reproducible). Examples: 42, 12345. Default: None (random)

custom parameters

Advanced parameters that override profile settings. Use these to fine-tune walk behavior beyond predefined profiles.

--max-flips

Possible choices: 1, 2, 3

Maximum number of phonetic features that can change per step. This controls the Hamming distance constraint between consecutive syllables. Higher values allow more dramatic phonetic changes. Valid values: 1 (very conservative), 2 (moderate), 3 (maximum). Must be <= max-neighbor-distance. Overrides profile setting. Examples: 1 for minimal change, 3 for maximum variation. Default: determined by profile

--temperature

Exploration temperature controlling randomness (0.1-5.0). Higher values increase randomness and exploration, making the walk more likely to choose high-cost transitions. Lower values make walks more deterministic, strongly preferring low-cost moves. Overrides profile setting. Typical values: 0.3 (conservative), 0.7 (balanced), 1.5 (exploratory), 2.5 (chaotic). Default: determined by profile

--frequency-weight

Frequency bias weight (-2.0 to 2.0). Controls whether the walk favors common or rare syllables. Positive values: Favor common syllables (e.g., 1.0 strongly favors common). Zero: Neutral, no frequency bias. Negative values: Favor rare syllables (e.g., -1.0 strongly favors rare). Overrides profile setting. Examples: 1.0 (prefer common), 0.0 (neutral), -1.0 (prefer rare). Default: determined by profile

operation modes

Different modes of operation. These modes are mutually exclusive. If no mode is specified, performs a single walk.

--compare-profiles

Compare all four walk profiles from the same starting syllable. Generates one walk for each profile (clerical, dialect, goblin, ritual) using the same seed (if specified), allowing direct comparison of different behaviors. The --profile argument is ignored in this mode. Output shows walks side-by-side with profile descriptions. Useful for understanding profile differences.

Default: False

--batch

Generate N walks in batch mode. Each walk starts from a random syllable (unless --start is specified, then all walks start from the same syllable). Useful for statistical analysis, corpus exploration, or generating large datasets. Combine with --output to save results to JSON file. Progress is shown during generation. Examples: --batch 100 for analysis, --batch 1000 for corpus stats. Valid range: 1-10000

--search

Search for syllables matching the query string. Performs case-insensitive substring match against all syllables in the dataset. Shows up to 20 matches with frequency information. Useful for finding valid starting syllables or exploring corpus contents. Does not perform walk generation. Examples: --search ‘th’ finds ‘the’, ‘thi’, ‘tha’, etc. --search ‘ka’ finds ‘ka’, ‘kan’, ‘kaf’, etc.

--web

Start interactive web interface instead of command-line mode. Launches a web server with a browser-based interface for exploring walks interactively. All walk parameters can be adjusted in the browser. The server runs until stopped with Ctrl+C. Use --port to specify custom port (default: 5000). Other CLI arguments are ignored in web mode. Access at http://localhost:5000 after starting.

Default: False

output options

Control output format, destination, and verbosity.

--output

Save results to JSON file instead of printing to console. Parent directories will be created if they don’t exist. Output format depends on mode: single walk saves walk details with profile and seed info; batch mode saves array of walks with metadata. File can be used for further analysis or visualization. Examples: --output results/walks.json, --output batch_data.json

--quiet

Suppress progress messages and verbose output. Only prints final results or errors. Useful for scripting, piping output, or when running in automated environments. Cannot be combined with --verbose. Progress bars and initialization messages are hidden in quiet mode.

Default: False

--verbose

Enable verbose output showing initialization progress, neighbor graph construction details, and detailed walk information. Shows memory usage, processing time, and intermediate steps. Useful for understanding performance, debugging, or learning how the walker works. Cannot be combined with --quiet. Significantly increases output volume.

Default: False

walker configuration

Advanced configuration for the walker engine. These settings affect initialization time and memory usage.

--max-neighbor-distance

Possible choices: 1, 2, 3

Maximum Hamming distance for pre-computing neighbor graph (1-3). During initialization, the walker computes which syllables are ‘neighbors’ (similar in phonetic features). Higher values allow larger --max-flips but significantly increase initialization time and memory usage. Should be >= largest --max-flips you plan to use. Initialization time (500k syllables): ~30 sec (1), ~1 min (2), ~3 min (3). Memory impact: ~50MB (1), ~150MB (2), ~300MB (3). Default: 3 (recommended for maximum flexibility)

--port

Port number for web server when using --web mode. Only applies when --web flag is specified, otherwise ignored. Choose a port that is not already in use by another service. Common alternatives: 8000, 8080, 3000. If the port is in use, the server will fail to start with an error message. Valid range: 1024-65535 (ports below 1024 require root/admin). Default: 5000

# Generate a single walk with default profile (dialect)
python -m build_tools.syllable_walk data.json --start ka

# Use specific profile
python -m build_tools.syllable_walk data.json --start bak --profile goblin --steps 10

# Compare all profiles from same starting point
python -m build_tools.syllable_walk data.json --start ka --compare-profiles

# Generate batch of 50 walks and save to JSON
python -m build_tools.syllable_walk data.json --batch 50 --profile ritual --output walks.json

# Search for syllables containing "th"
python -m build_tools.syllable_walk data.json --search "th"

# Custom walk parameters (overrides profile)
python -m build_tools.syllable_walk data.json --start ka --steps 10 \
    --max-flips 2 --temperature 1.5 --frequency-weight -0.8 --seed 42

# Start interactive web interface (opens on http://localhost:5000)
python -m build_tools.syllable_walk data.json --web

# Start web interface on custom port
python -m build_tools.syllable_walk data.json --web --port 8000

For detailed documentation, see: claude/build_tools/syllable_walk.md

Output Format

Single Walk

{
  "walk": [
    {
      "syllable": "ka",
      "frequency": 20,
      "features": [0, 0, 0, 1, 0, 0, 0, 1, 0, 1, 0, 0]
    },
    {
      "syllable": "pai",
      "frequency": 9,
      "features": [0, 0, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0]
    }
  ],
  "profile": "dialect",
  "start": "ka",
  "seed": 42
}

Batch Output

{
  "walks": [
    {
      "walk": [
        {"syllable": "ka", "frequency": 20, "features": [0, 0, 0, 1, 0, 0, 0, 1, 0, 1, 0, 0]},
        {"syllable": "ki", "frequency": 15, "features": [0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0]}
      ],
      "start": "ka",
      "seed": 42
    },
    {
      "walk": [
        {"syllable": "bak", "frequency": 8, "features": [0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1]},
        {"syllable": "pak", "frequency": 5, "features": [0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1]}
      ],
      "start": "bak",
      "seed": 43
    }
  ],
  "profile": "dialect",
  "parameters": {
    "steps": 5,
    "max_flips": 2,
    "temperature": 0.7,
    "frequency_weight": 0.0
  }
}
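
One way to sanity-check a batch file is to confirm that no step exceeds the recorded max_flips. The sketch below assumes the batch layout shown above and parses an abbreviated inline copy of it; with a real file written by --output, json.load on the opened file would take the place of json.loads. The helper names are illustrative.

```python
import json


def flips_between(a, b):
    """Number of feature positions that differ between two vectors."""
    return sum(x != y for x, y in zip(a, b))


def max_step_distance(walk):
    """Largest number of feature flips between consecutive syllables in one walk."""
    pairs = zip(walk, walk[1:])
    return max((flips_between(p["features"], c["features"]) for p, c in pairs), default=0)


# Abbreviated copy of the batch sample above (first walk only).
SAMPLE = """
{
  "walks": [
    {"walk": [
       {"syllable": "ka", "frequency": 20, "features": [0,0,0,1,0,0,0,1,0,1,0,0]},
       {"syllable": "ki", "frequency": 15, "features": [0,0,0,1,0,0,0,1,0,0,0,0]}
     ], "start": "ka", "seed": 42}
  ],
  "profile": "dialect",
  "parameters": {"steps": 5, "max_flips": 2, "temperature": 0.7, "frequency_weight": 0.0}
}
"""

batch = json.loads(SAMPLE)
limit = batch["parameters"]["max_flips"]
report = {entry["start"]: max_step_distance(entry["walk"]) for entry in batch["walks"]}
within_limit = all(distance <= limit for distance in report.values())
print(report, within_limit)
```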

Integration Guide

The syllable walker uses output from the feature annotator:

# Step 1: Extract syllables from dictionary
python -m build_tools.pyphen_syllable_extractor --file wordlist.txt --auto

# Step 2: Normalize syllables
python -m build_tools.pyphen_syllable_normaliser \
  --source data/corpus/ \
  --output data/normalized/

# Step 3: Annotate with phonetic features
python -m build_tools.syllable_feature_annotator \
  --syllables data/normalized/syllables_unique.txt \
  --frequencies data/normalized/syllables_frequencies.json \
  --output data/annotated/syllables_annotated.json

# Step 4: Explore with syllable walker
python -m build_tools.syllable_walk data/annotated/syllables_annotated.json --web

When to use this tool:

  • To explore phonetic connectivity in your syllable corpus

  • To test if desired phonetic transitions exist before creating patterns

  • To discover interesting phonetic progressions for name generation

  • To analyze corpus structure and syllable relationships

  • To generate datasets for statistical analysis of phonetic patterns

Common Use Cases:

Understanding Corpus Structure:

Generate many walks to see how syllables connect:

# Generate 100 walks for corpus analysis
python -m build_tools.syllable_walk data.json --batch 100 --output corpus_walks.json

Analyze the JSON output to understand syllable connectivity, central hubs, and phonetic pathways.

Testing Pattern Viability:

Explore if desired phonetic transitions exist before creating new patterns:

# Test phonetic transitions with ritual profile
python -m build_tools.syllable_walk data.json --start the --profile ritual

Finding Interesting Sequences:

Discover unusual but valid phonetic progressions:

# Explore unusual sequences with goblin profile
python -m build_tools.syllable_walk data.json --profile goblin --steps 10

Statistical Analysis:

Generate large datasets for analysis:

# Generate 1000 walks with dialect profile
python -m build_tools.syllable_walk data.json --batch 1000 \
  --profile dialect --output dialect_walks.json

# Generate 1000 walks with goblin profile
python -m build_tools.syllable_walk data.json --batch 1000 \
  --profile goblin --output goblin_walks.json

Then analyze frequency distributions, transition patterns, etc.
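
A possible starting point for that analysis is sketched below. It assumes the batch JSON layout from the Output Format section; load_walks and summarize are hypothetical helper names, and the demo runs on a small in-memory list rather than a real output file.

```python
import json
from collections import Counter


def load_walks(path):
    """Read a batch file and return its walks as lists of syllable dictionaries."""
    with open(path, encoding="utf-8") as handle:
        return [entry["walk"] for entry in json.load(handle)["walks"]]


def summarize(walks):
    """Count how often each syllable is visited and each transition is taken."""
    visits, transitions = Counter(), Counter()
    for walk in walks:
        names = [step["syllable"] for step in walk]
        visits.update(names)
        transitions.update(zip(names, names[1:]))
    return visits, transitions


# Toy stand-in for load_walks("dialect_walks.json").
toy_walks = [
    [{"syllable": "ka"}, {"syllable": "ki"}, {"syllable": "ti"}],
    [{"syllable": "ka"}, {"syllable": "ki"}, {"syllable": "ka"}],
]
visits, transitions = summarize(toy_walks)
print(visits.most_common(3))
print(transitions.most_common(3))
```

Running the same summary over two profile outputs and comparing the counters is one way to see how the profiles differ.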

Advanced Topics

Web Interface

The web interface provides an intuitive way to explore syllable walks without command-line complexity.

Starting the Server:

# Default port (5000)
python -m build_tools.syllable_walk data/annotated/syllables_annotated.json --web

# Custom port
python -m build_tools.syllable_walk data/annotated/syllables_annotated.json \
  --web --port 8000

# Quiet mode (suppress initialization messages)
python -m build_tools.syllable_walk data/annotated/syllables_annotated.json \
  --web --quiet

Features:

  • Profile selection - Choose from four profiles or use custom parameters

  • Starting syllable - Specify start or use random

  • Real-time generation - Instant walk generation with visual feedback

  • Walk display - See full path and syllable details with frequencies

  • Statistics tracking - Total syllables and walks generated

  • Reproducible - Optional seed for deterministic walks

The web server uses Python’s standard library http.server (no Flask dependency).

Algorithm Details

Cost Function:

Each potential step has a cost based on:

  1. Hamming distance - Number of features that change

  2. Feature-specific costs - Some features cost more to change

  3. Frequency weight - Bias toward common or rare syllables

  4. Inertia - Tendency to stay at current syllable

The walker uses softmax selection with temperature to probabilistically choose the next syllable:

For each neighbor n:
  hamming_cost = sum(feature_costs[i] for each feature i that differs)
  freq_cost = -frequency_weight × log(frequency[n])
  total_cost = hamming_cost + freq_cost

Staying at the current syllable is one extra candidate with total_cost = inertia_cost.

Probability of selecting candidate n:
  P(n) = exp(-cost(n) / temperature) / sum over all candidates k of exp(-cost(k) / temperature)

Higher temperature = more random selection (flattens probability distribution)

Lower temperature = more deterministic (strongly favors lowest cost)
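
The effect of temperature on the selection probabilities can be checked numerically with the sketch below. The function name selection_probabilities and the three example costs are assumptions made for illustration; only the exp(-cost / temperature) weighting comes from the description above.

```python
import math


def selection_probabilities(costs, temperature):
    """Softmax over negated costs: lower cost gives higher probability."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [-cost / temperature for cost in costs]
    peak = max(scaled)  # subtracting the maximum avoids overflow in exp()
    weights = [math.exp(value - peak) for value in scaled]
    total = sum(weights)
    return [weight / total for weight in weights]


example_costs = [0.5, 1.0, 2.0]  # hypothetical costs for three candidates
cold = selection_probabilities(example_costs, temperature=0.3)
hot = selection_probabilities(example_costs, temperature=2.5)
print([round(p, 3) for p in cold])
print([round(p, 3) for p in hot])
```

At the low temperature most of the probability mass sits on the cheapest candidate, while at the high temperature the three probabilities move closer together.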

Performance

Initialization Time (500k syllables):

Max Neighbor Distance  Time         Memory   Max Flips Supported
1                      ~30 seconds  ~50 MB   1
2                      ~1 minute    ~150 MB  1-2
3                      ~3 minutes   ~300 MB  1-3

Walk Generation:

  • After initialization: <10ms per walk (instant)

  • Deterministic: Same seed always produces same walk

  • Scalable: Speed independent of corpus size

Notes

Dependencies:

  • Requires NumPy for efficient feature matrix operations (build-time dependency)

  • Uses standard library http.server for web interface (no Flask)

Performance Characteristics:

  • Initialization is one-time cost (30 sec - 3 min depending on distance)

  • Walk generation is instant after initialization (<10ms per walk)

  • Designed for large corpus analysis (500k+ syllables)

  • Determinism guaranteed via isolated RNG instances

Troubleshooting:

Initialization Takes Too Long:

  • Reduce --max-neighbor-distance (default 3 → try 2)

  • Use smaller corpus for testing

  • Initialization is one-time cost, walk generation is instant

Getting Stuck at One Syllable:

  • Increase --max-flips (allow bigger phonetic jumps)

  • Increase --temperature (more randomness)

  • Check if starting syllable is isolated with --search

Walks Too Random:

  • Decrease --temperature (less randomness)

  • Raise --frequency-weight (try 0.5 or 1.0 to favor common syllables)

  • Try different profiles (clerical for conservative)

Port Already in Use (Web Mode):

# Use a different port if 5000 is occupied
python -m build_tools.syllable_walk data.json --web --port 8000

Build-time tool:

This is a build-time analysis tool only - not used during runtime name generation.

Related Documentation:

For detailed usage guide, see: claude/build_tools/syllable_walk.md

API Reference

class build_tools.syllable_walk.SyllableWalker(data_path, max_neighbor_distance=3, feature_costs=None, inertia_cost=0.5, verbose=False)

Bases: object

Navigate syllable feature space via cost-based random walks.

This class efficiently handles large syllable datasets (500k+) by pre-computing neighbor relationships and using vectorized operations where possible.

The walker performs a one-time expensive computation during initialization to build a neighbor graph, mapping each syllable to nearby syllables within a maximum Hamming distance. After initialization, walk generation is extremely fast (<10ms per walk).

syllables

List of all syllable strings

frequencies

NumPy array of syllable frequencies (uint32)

feature_matrix

NumPy array of binary feature vectors (N x 12, uint8)

syllable_to_idx

Dict mapping syllable text to index

neighbor_graph

Dict mapping syllable index to list of neighbor indices

max_neighbor_distance

Maximum Hamming distance for neighbors

feature_costs

Dict of costs for each feature flip

inertia_cost

Cost of staying at current syllable

Example

>>> walker = SyllableWalker("syllables_annotated.json", verbose=True)
>>> walk = walker.walk_from_profile(
...     start="ka", profile="dialect", steps=5, seed=42
... )
>>> print(walker.format_walk(walk))
ka → ki → ti → ta → da → de

Notes

  • Initialization time: ~2-3 minutes for 500k syllables

  • Walk generation: <10ms per walk after initialization

  • Memory usage: ~200-300 MB for 500k syllables

  • Thread safety: Not thread-safe (use separate instances)

__init__(data_path, max_neighbor_distance=3, feature_costs=None, inertia_cost=0.5, verbose=False)

Initialize the syllable walker with pre-computed neighbor graph.

Parameters:
  • data_path (Path | str) – Path to syllables_annotated.json file (output of syllable_feature_annotator)

  • max_neighbor_distance (int) – Maximum Hamming distance for pre-computing neighbors (1-3). Higher values = more neighbors = slower initialization + more memory, but allows larger feature flips per step. Default: 3 (recommended)

  • feature_costs (Optional[Dict[str, float]]) – Custom feature cost dictionary. If None, uses DEFAULT_FEATURE_COSTS. Keys must match FEATURE_KEYS.

  • inertia_cost (float) – Cost of staying at current syllable. Higher values discourage staying put. Default: 0.5

  • verbose (bool) – If True, print progress during initialization (neighbor graph construction can take 2-3 minutes for 500k syllables)

Raises:

Notes

  • Initialization performs expensive one-time computation

  • Use verbose=True for long-running initializations

  • Consider caching the neighbor graph (future optimization)

format_walk(walk, arrow=' → ')

Format a walk as a string with arrows.

Parameters:
  • walk (List[Dict]) – Walk result from walk() or walk_from_profile()

  • arrow (str) – Separator between syllables (default: “ → ”)

Return type:

str

Returns:

Formatted walk string

Example

>>> walk = walker.walk_from_profile("ka", "dialect", steps=5, seed=42)
>>> walker.format_walk(walk)
'ka → ki → ti → ta → da → de'
>>> walker.format_walk(walk, arrow=" -> ")
'ka -> ki -> ti -> ta -> da -> de'

get_available_profiles()

Get all available walk profiles.

Return type:

Dict[str, WalkProfile]

Returns:

Dictionary mapping profile names to WalkProfile objects

Example

>>> profiles = walker.get_available_profiles()
>>> for name in profiles:
...     print(name)
clerical
dialect
goblin
ritual

get_random_syllable(seed=None)

Get a random syllable from the dataset.

Parameters:

seed (Optional[int]) – Random seed for reproducibility (default: None)

Return type:

str

Returns:

Random syllable text

Example

>>> walker.get_random_syllable(seed=42)
'ka'
>>> walker.get_random_syllable(seed=42)
'ka'  # Same seed = same result

get_syllable_info(syllable)

Get information about a specific syllable.

Parameters:

syllable (str) – Syllable text to look up

Returns:

Syllable dictionary with keys syllable, frequency, features, or None if the syllable is not found

Return type:

Optional[Dict]

Example

>>> info = walker.get_syllable_info("ka")
>>> if info:
...     print(f"Frequency: {info['frequency']}")
Frequency: 1234

walk(start, steps, max_flips, temperature=1.0, frequency_weight=0.0, seed=None)

Execute a syllable walk through feature space.

Starting from a syllable, takes the requested number of steps through feature space, choosing each next syllable probabilistically based on:

  • Feature flip cost (weighted Hamming distance)

  • Frequency cost (rarity penalty/bonus)

  • Temperature (exploration vs exploitation)

  • Inertia (tendency to stay put)

The walk uses softmax selection over candidate neighbors:

  1. Find all neighbors within max_flips distance

  2. Compute cost for each neighbor (flip cost + rarity cost)

  3. Add inertia option (staying at current syllable)

  4. Apply softmax with temperature: weight_i = exp(-cost_i / T)

  5. Sample next syllable proportional to weights

Parameters:
  • start (int | str) – Starting syllable (syllable text or index)

  • steps (int) – Number of steps to take (each step visits one syllable)

  • max_flips (int) – Maximum feature flips allowed per step (1-3). Must be <= max_neighbor_distance from __init__.

  • temperature (float) – Exploration temperature (0.1-5.0). Higher values increase randomness. Typical values: 0.3 (conservative, minimal exploration), 0.7 (balanced), 1.5 (high exploration), 2.5 (maximum randomness)

  • frequency_weight (float) – Frequency bias (-2.0 to 2.0). Positive values favor common syllables, zero is neutral, negative values favor rare syllables. Typical values: -1.0, 0.0, 1.0

  • seed (Optional[int]) – Random seed for reproducibility. Same seed = same walk. If None, uses system randomness (non-reproducible).

Returns:

List of syllable dictionaries with keys:

  • “syllable”: Syllable text (str)

  • “frequency”: Corpus frequency (int)

  • “features”: Binary feature vector (list of 12 ints)

Length = steps + 1 (includes starting syllable)

Return type:

List[Dict]

Raises:

Example

>>> walker = SyllableWalker("data.json")
>>> walk = walker.walk(
...     start="ka",
...     steps=5,
...     max_flips=2,
...     temperature=0.7,
...     frequency_weight=0.0,
...     seed=42
... )
>>> len(walk)
6  # start + 5 steps
>>> walk[0]["syllable"]
'ka'

Notes

  • Deterministic: Same seed always produces same walk

  • Uses local Random instance (doesn’t affect global random state)

  • Inertia option allows walk to stay at current syllable
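
The selection loop described for walk() can be sketched on a toy graph as follows. This is an illustration under stated assumptions rather than the method's source: toy_walk, GRAPH, and flat_cost are hypothetical, every move is given the same cost, and only the default inertia_cost of 0.5 and the exp(-cost / T) weighting are taken from this page.

```python
import math
import random


def toy_walk(graph, move_cost, start, steps, temperature, inertia_cost=0.5, seed=None):
    """Walk a neighbor graph, sampling each next node with softmax weights."""
    rng = random.Random(seed)  # local generator keeps the walk reproducible
    path = [start]
    current = start
    for _ in range(steps):
        neighbors = list(graph[current])
        # Staying put is always a candidate, priced at inertia_cost.
        candidates = neighbors + [current]
        costs = [move_cost(current, n) for n in neighbors] + [inertia_cost]
        weights = [math.exp(-cost / temperature) for cost in costs]
        current = rng.choices(candidates, weights=weights, k=1)[0]
        path.append(current)
    return path


# Hypothetical four-syllable graph in which each edge is a single feature flip.
GRAPH = {
    "ka": ["ki", "ta"],
    "ki": ["ka", "ti"],
    "ta": ["ka", "ti"],
    "ti": ["ki", "ta"],
}


def flat_cost(current, candidate):
    """Assumed uniform move cost, used only to keep the example small."""
    return 1.0


path = toy_walk(GRAPH, flat_cost, "ka", steps=5, temperature=0.7, seed=42)
print(" → ".join(path))
```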

walk_from_profile(start, profile, steps=5, seed=None)

Execute a walk using a named profile.

Convenience method that uses predefined WalkProfile parameters. See WALK_PROFILES for available profiles.

Parameters:
  • start (int | str) – Starting syllable (text or index)

  • profile (str | WalkProfile) – Profile name (“clerical”, “dialect”, “goblin”, “ritual”) or WalkProfile object

  • steps (int) – Number of steps to take (default: 5)

  • seed (Optional[int]) – Random seed for reproducibility (default: None)

Return type:

List[Dict]

Returns:

List of syllable dictionaries (same as walk())

Raises:

ValueError – If profile name not found

Example

>>> walker = SyllableWalker("data.json")
>>> walk = walker.walk_from_profile("ka", "goblin", steps=10, seed=42)
>>> print(walker.format_walk(walk))
ka → kha → gha → ghe → ge → gwe → ...

class build_tools.syllable_walk.WalkProfile(name, description, max_flips, temperature, frequency_weight)

Bases: object

Configuration profile for a syllable walk.

A profile encapsulates all parameters needed for a walk, providing named presets for different behaviors.

name

Human-readable profile name (e.g., “Dialect Walk”)

description

Brief description of profile behavior

max_flips

Maximum feature flips allowed per step (1-3)

temperature

Exploration temperature (0.1-5.0)

frequency_weight

Frequency bias (-2.0 to 2.0)

Example

>>> profile = WalkProfile(
...     name="Custom Walk",
...     description="High temperature, neutral frequency",
...     max_flips=2,
...     temperature=2.0,
...     frequency_weight=0.0
... )
>>> print(profile)
Custom Walk: High temperature, neutral frequency

__str__()

String representation showing name and description.

Return type:

str

description: str
frequency_weight: float
max_flips: int
name: str
temperature: float

build_tools.syllable_walk.get_profile(name)

Get a walk profile by name.

Parameters:

name (str) – Profile name (case-insensitive)

Return type:

WalkProfile

Returns:

WalkProfile object

Raises:

ValueError – If profile name not found

Example

>>> profile = get_profile("goblin")
>>> profile.temperature
1.5
>>> profile = get_profile("GOBLIN")  # Case-insensitive
>>> profile.temperature
1.5

build_tools.syllable_walk.list_profiles()

Get all available walk profiles.

Return type:

Dict[str, WalkProfile]

Returns:

Dictionary mapping profile names to WalkProfile objects (copy)

Example

>>> profiles = list_profiles()
>>> for name, profile in profiles.items():
...     print(f"{name}: {profile.description}")
clerical: Conservative, favors common syllables, minimal phonetic change
dialect: Moderate exploration, neutral frequency bias
goblin: Chaotic, favors rare syllables, high phonetic variation
ritual: Maximum exploration, strongly favors rare syllables