build_tools.name_selector.selector

Main selector orchestration logic.

This module provides the high-level selection function that coordinates loading candidates, evaluating them against a policy, and producing ranked output.

The selector is the central orchestrator of the Selection Policy Layer. It ties together:

- Candidate loading (from name_combiner output)
- Policy evaluation (from policy.py)
- Result ranking and filtering

Usage

>>> import json
>>> from build_tools.name_selector import select_names, load_name_classes
>>>
>>> # Load policies and candidates
>>> policies = load_name_classes("data/name_classes.yml")
>>> with open("candidates/pyphen_candidates_2syl.json") as f:
...     candidates_data = json.load(f)
>>>
>>> # Select names
>>> selected = select_names(
...     candidates=candidates_data["candidates"],
...     policy=policies["first_name"],
...     count=100,
...     mode="hard",
... )
>>>
>>> for name in selected[:5]:
...     print(f"{name['name']}: score={name['score']}, rank={name['rank']}")

Functions

`select_names`(candidates, policy[, count, mode, order, ...])	Select and rank name candidates against a policy.
`compute_selection_statistics`(candidates, policy[, mode])	Compute statistics about a selection operation.

Module Contents

build_tools.name_selector.selector.select_names(candidates, policy, count=100, mode='hard', order='alphabetical', seed=None)

Select and rank name candidates against a policy.

Evaluates every candidate against the policy, filters out the rejected ones, ranks the rest by score (descending), and returns at most count of them.

Parameters

candidates (Sequence[dict]): List of candidate dictionaries from name_combiner output. Each must have “name”, “syllables”, and “features” keys.
policy (NameClassPolicy): The policy to evaluate against.
count (int, optional): Maximum number of names to return. Default: 100.
mode ({“hard”, “soft”}, optional): Evaluation mode. “hard” rejects on discouraged features. “soft” applies penalties. Default: “hard”.
order ({“alphabetical”, “random”}, optional): Ordering for names with equal scores. “alphabetical” sorts by name for deterministic output. “random” shuffles within score groups using the provided seed. Default: “alphabetical”.
seed (int, optional): RNG seed for random ordering. Only used when order=“random”. Required for deterministic random ordering. Default: None.
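
The tie ordering described for order and seed can be illustrated with a minimal sketch. This is not the module's implementation: order_by_score is a hypothetical helper, and it assumes each item already carries a "name" and a "score". It only shows how a shuffle can be confined to groups of equal score so that the score ranking itself is never disturbed.

```python
import random
from itertools import groupby


def order_by_score(scored, order="alphabetical", seed=None):
    """Sort by score (descending) and order ties by name or by seeded shuffle.

    Hypothetical helper for illustration; `scored` is a list of dicts
    that each have "name" and "score" keys.
    """
    # Sorting by (-score, name) yields the alphabetical ordering directly
    # and gives the random mode a reproducible starting point.
    base = sorted(scored, key=lambda c: (-c["score"], c["name"]))
    if order == "alphabetical":
        return base
    if order != "random":
        raise ValueError(f"unknown order: {order!r}")

    rng = random.Random(seed)  # seed=None gives a different order per run
    result = []
    for _, group in groupby(base, key=lambda c: c["score"]):
        members = list(group)
        rng.shuffle(members)  # shuffle only within one score group
        result.extend(members)
    return result
```

With the same seed, two calls return the same order; without a seed, only the position of names inside a score group varies between runs.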

Returns

list[dict]: List of selected candidates, sorted by score (descending). Each candidate is augmented with “score”, “rank”, and “evaluation”.

Examples

>>> selected = select_names(candidates, policy, count=50)
>>> selected[0]["rank"]
1
>>> selected[0]["score"]  # Highest score
4
>>> len(selected)
50

Notes

The returned candidates are augmented with:

- score: int - The policy score
- rank: int - 1-based rank (1 = best)
- evaluation: dict - Detailed evaluation breakdown
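
To make the evaluate, filter, rank, and augment steps concrete, here is a small self-contained sketch. Everything in it is an assumption made for illustration: the policy is modelled as a plain dict mapping feature names to "preferred" or "discouraged" (the real NameClassPolicy is not shown here), the scoring rule of +1 and -1 is invented, the "starts_with_vowel" feature is hypothetical, and toy_evaluate and toy_select are not part of the module.

```python
def toy_evaluate(candidate, policy, mode="hard"):
    """Toy evaluator (assumed scoring rule): returns (admitted, score, reason)."""
    score = 0
    for feature, present in candidate["features"].items():
        if not present:
            continue
        stance = policy.get(feature, "tolerated")
        if stance == "preferred":
            score += 1
        elif stance == "discouraged":
            if mode == "hard":
                return False, 0, feature  # hard mode: reject outright
            score -= 1  # soft mode: apply a penalty instead
    return True, score, None


def toy_select(candidates, policy, count=100, mode="hard"):
    """Evaluate, drop rejected candidates, rank by score, keep the top `count`."""
    admitted = []
    for cand in candidates:
        ok, score, _reason = toy_evaluate(cand, policy, mode)
        if ok:
            # Copy the candidate so the input list is left unmodified.
            admitted.append({**cand, "score": score,
                             "evaluation": {"admitted": True, "mode": mode}})
    admitted.sort(key=lambda c: (-c["score"], c["name"]))
    top = admitted[:count]
    for rank, cand in enumerate(top, start=1):
        cand["rank"] = rank  # 1-based rank, 1 = best
    return top


# Hypothetical policy and candidates, for illustration only.
policy = {"starts_with_vowel": "preferred", "ends_with_stop": "discouraged"}
candidates = [
    {"name": "alen", "syllables": ["a", "len"],
     "features": {"starts_with_vowel": True, "ends_with_stop": False}},
    {"name": "brak", "syllables": ["brak"],
     "features": {"starts_with_vowel": False, "ends_with_stop": True}},
    {"name": "orit", "syllables": ["o", "rit"],
     "features": {"starts_with_vowel": True, "ends_with_stop": True}},
]

hard = toy_select(candidates, policy, mode="hard")
soft = toy_select(candidates, policy, mode="soft")
```

In this toy setup, hard mode admits only "alen", while soft mode keeps all three names and ranks them by their penalised scores.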

build_tools.name_selector.selector.compute_selection_statistics(candidates, policy, mode='hard')

Compute statistics about a selection operation.

Evaluates all candidates and returns aggregate statistics without building the full result list.

Parameters

candidates (Sequence[dict]): List of candidate dictionaries.
policy (NameClassPolicy): The policy to evaluate against.
mode ({“hard”, “soft”}, optional): Evaluation mode. Default: “hard”.

Returns

dict: Statistics dictionary containing:

- total_evaluated: int
- admitted: int
- rejected: int
- rejection_reasons: dict[str, int]
- score_distribution: dict[int, int] (score -> count)

Examples

>>> stats = compute_selection_statistics(candidates, policy)
>>> stats["admitted"]
2341
>>> stats["rejection_reasons"]["ends_with_stop"]
1234
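
The aggregation behind such a statistics dictionary can be sketched with collections.Counter. This is a hedged illustration rather than the module's code: toy_statistics is a hypothetical helper, it takes already computed (admitted, score, reason) tuples instead of candidates and a policy, and it assumes that score_distribution counts admitted candidates only, which the documentation above does not state.

```python
from collections import Counter


def toy_statistics(evaluations):
    """Aggregate (admitted, score, reason) tuples into summary counts.

    Hypothetical input shape; only running counters are kept,
    not a list of results.
    """
    total = 0
    admitted = 0
    reasons = Counter()
    scores = Counter()
    for ok, score, reason in evaluations:
        total += 1
        if ok:
            admitted += 1
            scores[score] += 1  # assumption: admitted candidates only
        else:
            reasons[reason] += 1
    return {
        "total_evaluated": total,
        "admitted": admitted,
        "rejected": total - admitted,
        "rejection_reasons": dict(reasons),
        "score_distribution": dict(scores),
    }


stats = toy_statistics([
    (True, 2, None),
    (True, 2, None),
    (False, 0, "ends_with_stop"),
    (True, 1, None),
])
```

Because the function consumes any iterable, it can be fed from a generator, so memory use stays proportional to the number of distinct scores and rejection reasons rather than to the number of candidates.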