build_tools.name_selector.selector
==================================

.. py:module:: build_tools.name_selector.selector

.. autoapi-nested-parse::

   Main selector orchestration logic.

   This module provides the high-level selection function that coordinates
   loading candidates, evaluating them against a policy, and producing
   ranked output.

   The selector is the central orchestrator of the Selection Policy Layer.
   It ties together:

   - Candidate loading (from name_combiner output)
   - Policy evaluation (from policy.py)
   - Result ranking and filtering

   Usage
   -----
   >>> import json
   >>> from build_tools.name_selector import select_names, load_name_classes
   >>>
   >>> # Load policies and candidates
   >>> policies = load_name_classes("data/name_classes.yml")
   >>> with open("candidates/pyphen_candidates_2syl.json") as f:
   ...     candidates_data = json.load(f)
   >>>
   >>> # Select names
   >>> selected = select_names(
   ...     candidates=candidates_data["candidates"],
   ...     policy=policies["first_name"],
   ...     count=100,
   ...     mode="hard",
   ... )
   >>>
   >>> for name in selected[:5]:
   ...     print(f"{name['name']}: score={name['score']}, rank={name['rank']}")


Functions
---------

.. autoapisummary::

   build_tools.name_selector.selector.select_names
   build_tools.name_selector.selector.compute_selection_statistics


Module Contents
---------------

.. py:function:: select_names(candidates, policy, count = 100, mode = 'hard', order = 'alphabetical', seed = None)

   Select and rank name candidates against a policy.

   Evaluates all candidates, filters out rejected ones, ranks the rest by
   score, and returns the top ``count`` results.

   Parameters
   ----------
   candidates : Sequence[dict]
       List of candidate dictionaries from name_combiner output.
       Each must have "name", "syllables", and "features" keys.
   policy : NameClassPolicy
       The policy to evaluate against.
   count : int, optional
       Maximum number of names to return. Default: 100.
   mode : {"hard", "soft"}, optional
       Evaluation mode. "hard" rejects candidates that have discouraged
       features. "soft" applies score penalties instead. Default: "hard".
   order : {"alphabetical", "random"}, optional
       Ordering for names with equal scores. "alphabetical" sorts by name
       for deterministic output. "random" shuffles within score groups
       using the provided seed. Default: "alphabetical".
   seed : int, optional
       RNG seed for random ordering. Only used when order="random".
       Required for deterministic random ordering. Default: None.

   Returns
   -------
   list[dict]
       List of selected candidates, sorted by score (descending).
       Each candidate is augmented with "score", "rank", and "evaluation".

   Examples
   --------
   >>> selected = select_names(candidates, policy, count=50)
   >>> selected[0]["rank"]
   1
   >>> selected[0]["score"]  # Highest score
   4
   >>> len(selected)
   50

   Notes
   -----
   The returned candidates are augmented with:

   - score: int - The policy score
   - rank: int - 1-based rank (1 = best)
   - evaluation: dict - Detailed evaluation breakdown


.. py:function:: compute_selection_statistics(candidates, policy, mode = 'hard')

   Compute statistics about a selection operation.

   Evaluates all candidates and returns aggregate statistics without
   building the full result list.

   Parameters
   ----------
   candidates : Sequence[dict]
       List of candidate dictionaries.
   policy : NameClassPolicy
       The policy to evaluate against.
   mode : {"hard", "soft"}, optional
       Evaluation mode. Default: "hard".

   Returns
   -------
   dict
       Statistics dictionary containing:

       - total_evaluated: int
       - admitted: int
       - rejected: int
       - rejection_reasons: dict[str, int]
       - score_distribution: dict[int, int] (score -> count)

   Examples
   --------
   >>> stats = compute_selection_statistics(candidates, policy)
   >>> stats["admitted"]
   2341
   >>> stats["rejection_reasons"]["ends_with_stop"]
   1234
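

The tie handling described for ``order`` and ``seed`` can be sketched as follows. This is a minimal illustration under stated assumptions, not the module's actual implementation: ``rank_with_ties`` is a hypothetical helper, and it assumes every candidate already carries a ``score`` key, whereas the real ``select_names`` computes scores from the policy.

```python
import random
from itertools import groupby


def rank_with_ties(scored, count=100, order="alphabetical", seed=None):
    """Hypothetical sketch: rank pre-scored candidates and keep the top `count`.

    `scored` is assumed to be a list of dicts with "name" and "score" keys.
    """
    if order not in ("alphabetical", "random"):
        raise ValueError(f"unknown order: {order!r}")

    # Highest score first; equal scores fall back to alphabetical by name.
    ordered = sorted(scored, key=lambda c: (-c["score"], c["name"]))

    if order == "random":
        # Shuffle only within each group of equal scores, so the
        # descending score order is preserved across groups.
        rng = random.Random(seed)
        shuffled = []
        for _, group in groupby(ordered, key=lambda c: c["score"]):
            members = list(group)
            rng.shuffle(members)
            shuffled.extend(members)
        ordered = shuffled

    # Copy each candidate and attach a 1-based rank (1 = best).
    return [{**c, "rank": i} for i, c in enumerate(ordered[:count], start=1)]


# Made-up sample data for illustration only.
scored = [
    {"name": "kira", "score": 3},
    {"name": "alon", "score": 4},
    {"name": "bela", "score": 3},
    {"name": "dova", "score": 4},
]

ranked = rank_with_ties(scored, count=3)
shuffled_a = rank_with_ties(scored, order="random", seed=7)
shuffled_b = rank_with_ties(scored, order="random", seed=7)
```

In this sketch, starting from the alphabetical order before shuffling means the result for a given ``seed`` does not depend on the order of the input list. With ``seed=None``, ``random.Random`` seeds itself from system entropy, so repeated runs may order tied names differently.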