build_tools.name_combiner
Name Combiner - Structural Name Candidate Generation
Generates N-syllable name candidates from an annotated syllable corpus by combining syllables and aggregating their features to the name level. This is a build-time tool only - not used during runtime name generation.
This module is the first stage of the Selection Policy Layer. It performs structural combination without policy evaluation - that responsibility belongs to the name_selector module.
- Architectural Boundary:
Candidate generation is a structural step, not a decision-making step. All governance, admissibility, and rejection logic remains exclusively within the name_selector module.
Features: - Deterministic combination with seed control - Frequency-weighted syllable sampling - Feature aggregation to name level (majority rule for nucleus) - Output to extraction run’s candidates/ directory
Aggregation Rules: - Onset features (starts_with_*): First syllable only - Coda features (ends_with_*): Final syllable only - Internal features (contains_*): Boolean OR across all syllables - Nucleus features (short_vowel, long_vowel): Majority rule (>50%)
- Usage:
>>> from build_tools.name_combiner import combine_syllables, aggregate_features >>> candidates = combine_syllables(annotated_data, syllable_count=2, count=100) >>> for candidate in candidates: ... print(f"{candidate['name']}: {candidate['features']}")
CLI:
python -m build_tools.name_combiner \
--run-dir _working/output/20260110_115453_pyphen/ \
--syllables 2 \
--count 10000 \
--seed 42