hebrew_flash_cards/scripts
Sochen 6d2d446ed5 feat: pseudo-frequency for confusables using English word frequency
In 264 confusable groups, all entries shared the same Hebrew frequency.
These entries now have differentiated pseudo_frequency values based on
English word commonality (hermitdave en_50k.txt). The most common meaning
keeps the base rank; each less common meaning gets a +100 offset per position.

Examples:
- אב: "father" (en:194) → 2491, "bud" (en:2963) → 2591
- אח: "brother" (en:300) → 911, "fireplace" (en:9389) → 1011
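
A minimal sketch of the offset rule described above, not the actual contents of assign_pseudo_frequency.py. The function name, its arguments, and the treatment of meanings missing from the English list are assumptions made for illustration; the en:N values are assumed to be ranks where lower means more common.

```python
import math


def assign_pseudo_frequency(base_rank, english_ranks, step=100):
    """Return {meaning: pseudo_frequency} for one confusable group.

    base_rank     -- the Hebrew frequency rank shared by the whole group
    english_ranks -- {meaning: English rank}, lower rank = more common
                     (hypothetical input shape; None marks a missing word)
    step          -- offset added per position after the most common meaning
    """
    def rank_of(meaning):
        rank = english_ranks[meaning]
        # Assumption: a meaning absent from the English list sorts last.
        return math.inf if rank is None else rank

    ordered = sorted(english_ranks, key=rank_of)
    return {meaning: base_rank + step * i for i, meaning in enumerate(ordered)}


# Reproduces the two examples from the commit message.
av = assign_pseudo_frequency(2491, {"bud": 2963, "father": 194})
ach = assign_pseudo_frequency(911, {"brother": 300, "fireplace": 9389})
print(av)   # {'father': 2491, 'bud': 2591}
print(ach)  # {'brother': 911, 'fireplace': 1011}
```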

The builder uses pseudo_frequency for sort order when it is available.
Confusable card definitions are now sorted most-common-first.
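
One way the builder's fallback could look, as a sketch only. The entry layout and the "frequency" and "meaning" field names are assumptions; only pseudo_frequency is named in this commit.

```python
def sort_key(entry):
    """Prefer pseudo_frequency; fall back to the plain frequency rank."""
    pseudo = entry.get("pseudo_frequency")
    # "frequency" is a hypothetical field name for the Hebrew rank.
    return pseudo if pseudo is not None else entry["frequency"]


# Hypothetical entries: two meanings of one confusable plus an unrelated word.
entries = [
    {"meaning": "fireplace", "frequency": 911, "pseudo_frequency": 1011},
    {"meaning": "other word", "frequency": 950},
    {"meaning": "brother", "frequency": 911, "pseudo_frequency": 911},
]

ordered = [e["meaning"] for e in sorted(entries, key=sort_key)]
print(ordered)  # ['brother', 'other word', 'fireplace']
```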

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-03 05:28:30 +00:00
assign_frequency.py         feat: YAP-cleaned frequency corpus + two-tier assignment pipeline      2026-03-10 06:22:55 +00:00
assign_pseudo_frequency.py  feat: pseudo-frequency for confusables using English word frequency    2026-04-03 05:28:30 +00:00
check_guid_coverage.py      Sprint 11: unified JSON architecture + consolidated scraping pipeline  2026-03-08 10:54:58 +00:00
clean_frequency_corpus.py   feat: YAP-cleaned frequency corpus + two-tier assignment pipeline      2026-03-10 06:22:55 +00:00
extract_verb_list.py        Sprint 9: cloze cards, plurals deck, project reorg, lint tooling       2026-03-07 08:09:39 +00:00
validate_data.py            Sprint 17: homograph example dedup + plural audio + prep extraction    2026-03-14 21:51:35 +00:00