build_tools.name_selector.selector
Main selector orchestration logic.
This module provides the high-level selection function that coordinates loading candidates, evaluating them against a policy, and producing ranked output.
The selector is the central orchestrator of the Selection Policy Layer. It ties together:
- Candidate loading (from name_combiner output)
- Policy evaluation (from policy.py)
- Result ranking and filtering
Usage
>>> import json
>>> from build_tools.name_selector import select_names, load_name_classes
>>>
>>> # Load policies and candidates
>>> policies = load_name_classes("data/name_classes.yml")
>>> with open("candidates/pyphen_candidates_2syl.json") as f:
...     candidates_data = json.load(f)
>>>
>>> # Select names
>>> selected = select_names(
...     candidates=candidates_data["candidates"],
...     policy=policies["first_name"],
...     count=100,
...     mode="hard",
... )
>>>
>>> for name in selected[:5]:
...     print(f"{name['name']}: score={name['score']}, rank={name['rank']}")
Functions
| select_names | Select and rank name candidates against a policy. |
| compute_selection_statistics | Compute statistics about a selection operation. |
Module Contents
- build_tools.name_selector.selector.select_names(candidates, policy, count=100, mode='hard', order='alphabetical', seed=None)
Select and rank name candidates against a policy.
Evaluates all candidates, filters out rejected ones, ranks the rest by score, and returns at most count of the highest-ranked candidates.
Parameters
- candidates : Sequence[dict]
List of candidate dictionaries from name_combiner output. Each must have “name”, “syllables”, and “features” keys.
- policy : NameClassPolicy
The policy to evaluate against.
- count : int, optional
Maximum number of names to return. Default: 100.
- mode : {“hard”, “soft”}, optional
Evaluation mode. “hard” rejects on discouraged features. “soft” applies penalties. Default: “hard”.
- order : {“alphabetical”, “random”}, optional
Ordering for names with equal scores. “alphabetical” sorts by name for deterministic output. “random” shuffles within score groups using the provided seed. Default: “alphabetical”.
- seed : int, optional
RNG seed for random ordering. Only used when order="random". Required for deterministic random ordering. Default: None.
Returns
- list[dict]
List of selected candidates, sorted by score (descending). Each candidate is augmented with “score”, “rank”, and “evaluation”.
Examples
>>> selected = select_names(candidates, policy, count=50)
>>> selected[0]["rank"]
1
>>> selected[0]["score"]  # Highest score
4
>>> len(selected)
50
Notes
The returned candidates are augmented with:
- score: int - The policy score
- rank: int - 1-based rank (1 = best)
- evaluation: dict - Detailed evaluation breakdown
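The evaluate, filter, rank, and truncate flow described for select_names can be illustrated with a small standalone sketch. Everything here is an assumption for illustration: toy_evaluate, toy_select, the dictionary-shaped policy, and the feature names are hypothetical stand-ins, not the module's actual implementation (the real function takes a NameClassPolicy and also attaches an “evaluation” breakdown, which this sketch omits).

```python
import random
from itertools import groupby


def toy_evaluate(candidate, policy, mode="hard"):
    """Hypothetical scorer: +1 per preferred feature, -1 per discouraged one.

    Returns None when the candidate is rejected.
    """
    features = candidate["features"]
    discouraged_hits = [f for f in policy["discouraged"] if features.get(f)]
    if mode == "hard" and discouraged_hits:
        return None  # hard mode rejects on any discouraged feature
    score = sum(1 for f in policy["preferred"] if features.get(f))
    return score - len(discouraged_hits)  # soft mode applies a penalty


def toy_select(candidates, policy, count=100, mode="hard",
               order="alphabetical", seed=None):
    """Sketch of: evaluate all, drop rejected, rank by score, keep top count."""
    scored = []
    for cand in candidates:
        score = toy_evaluate(cand, policy, mode)
        if score is not None:
            scored.append({**cand, "score": score})  # copy, then augment

    # Score descending, then name ascending for deterministic ties.
    scored.sort(key=lambda c: (-c["score"], c["name"]))

    if order == "random":
        # Shuffle only within groups of equal score, using a seeded RNG.
        rng = random.Random(seed)
        shuffled = []
        for _, group in groupby(scored, key=lambda c: c["score"]):
            members = list(group)
            rng.shuffle(members)
            shuffled.extend(members)
        scored = shuffled

    top = scored[:count]
    for rank, cand in enumerate(top, start=1):
        cand["rank"] = rank  # 1-based rank, 1 = best
    return top


# Hypothetical policy and candidates, shaped like the documented input keys.
policy = {"preferred": ["starts_with_vowel"], "discouraged": ["ends_with_stop"]}
candidates = [
    {"name": "orla", "syllables": ["or", "la"],
     "features": {"starts_with_vowel": True, "ends_with_stop": False}},
    {"name": "kerat", "syllables": ["ke", "rat"],
     "features": {"starts_with_vowel": False, "ends_with_stop": True}},
    {"name": "anel", "syllables": ["a", "nel"],
     "features": {"starts_with_vowel": True, "ends_with_stop": False}},
    {"name": "mira", "syllables": ["mi", "ra"],
     "features": {"starts_with_vowel": False, "ends_with_stop": False}},
]

hard = toy_select(candidates, policy, count=3)
soft = toy_select(candidates, policy, count=4, mode="soft")
print([(c["rank"], c["name"], c["score"]) for c in hard])
print([(c["rank"], c["name"], c["score"]) for c in soft])
```

In this toy setup, hard mode drops “kerat” entirely, while soft mode keeps it at the bottom of the ranking with a penalized score. Passing order="random" with a fixed seed reorders only candidates that share a score, so the score-descending order is preserved.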
- build_tools.name_selector.selector.compute_selection_statistics(candidates, policy, mode='hard')
Compute statistics about a selection operation.
Evaluates all candidates and returns aggregate statistics without building the full result list.
Parameters
- candidates : Sequence[dict]
List of candidate dictionaries.
- policy : NameClassPolicy
The policy to evaluate against.
- mode : {“hard”, “soft”}, optional
Evaluation mode. Default: “hard”.
Returns
- dict
Statistics dictionary containing:
- total_evaluated: int
- admitted: int
- rejected: int
- rejection_reasons: dict[str, int]
- score_distribution: dict[int, int] (score -> count)
Examples
>>> stats = compute_selection_statistics(candidates, policy)
>>> stats["admitted"]
2341
>>> stats["rejection_reasons"]["ends_with_stop"]
1234
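One way to picture how a statistics dictionary with these keys could be assembled is the sketch below. It is illustrative only: toy_statistics and the (admitted, score, reason) tuples are hypothetical, and counting only admitted candidates in score_distribution is an assumption, since the documentation does not say whether rejected candidates are included.

```python
from collections import Counter


def toy_statistics(evaluations):
    """Aggregate hypothetical (admitted, score, reason) tuples.

    Produces the documented keys without storing per-candidate results.
    """
    total = admitted = rejected = 0
    rejection_reasons = Counter()
    score_distribution = Counter()

    for is_admitted, score, reason in evaluations:
        total += 1
        if is_admitted:
            admitted += 1
            # Assumption: only admitted candidates contribute to the histogram.
            score_distribution[score] += 1
        else:
            rejected += 1
            rejection_reasons[reason] += 1

    return {
        "total_evaluated": total,
        "admitted": admitted,
        "rejected": rejected,
        "rejection_reasons": dict(rejection_reasons),
        "score_distribution": dict(score_distribution),
    }


# Hypothetical evaluation outcomes for four candidates.
evaluations = [
    (True, 2, None),
    (True, 2, None),
    (True, 1, None),
    (False, 0, "ends_with_stop"),
]
stats = toy_statistics(evaluations)
print(stats)
```

Because the loop keeps only counters, memory use stays proportional to the number of distinct scores and rejection reasons rather than to the number of candidates.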