hebrew_flash_cards/scripts
Sochen af186e2030 Sprint 17: homograph example dedup + plural audio + prep extraction
- Homograph collision fix: _deduplicate_confusable_examples() clears
  examples shared within a confusable group from the less-common
  members (36 entries fixed), keeping them only on the
  highest-frequency meaning.
- Plural deck audio: wired up PluralAudio field in apkg_builder.py,
  downloaded 613 plural audio files from pealim.com for all deck entries.
- Prep extraction upstream: moved Hebrew preposition parsing from build
  time into list/detail scrapers (SCHEMA.yaml prep field added).
- Validation: new no_shared_confusable_examples check in validate_data.py.
- Tests: 9 new unit tests for confusable deduplication (98 total).
- Release: v0.19

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-14 21:51:35 +00:00
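
The homograph fix and its validation check described in the commit message above can be sketched roughly as follows. This is an illustrative sketch only, not the repository's actual code: the entry shape (dicts with `confusable_group`, `frequency`, and `examples` keys), the function names, and the assumption that a larger `frequency` value means a more common meaning are all hypothetical.

```python
from collections import defaultdict


def deduplicate_confusable_examples(entries):
    """Clear examples shared inside a confusable group from every member
    except the most frequent one that carries them.

    Mutates `entries` in place and returns how many entries were changed.
    Field names are assumptions, not the project's real schema.
    """
    groups = defaultdict(list)
    for entry in entries:
        group = entry.get("confusable_group")  # hypothetical field name
        if group is not None:
            groups[group].append(entry)

    fixed = 0
    for members in groups.values():
        # Most common meaning first (assumes higher value = more common).
        members.sort(key=lambda e: e.get("frequency", 0), reverse=True)
        claimed = set(members[0].get("examples", []))
        for entry in members[1:]:
            examples = entry.get("examples", [])
            kept = [ex for ex in examples if ex not in claimed]
            if len(kept) != len(examples):
                entry["examples"] = kept
                fixed += 1
            # Examples kept here are now owned by this member.
            claimed.update(kept)
    return fixed


def find_shared_confusable_examples(entries):
    """Return sorted (group, example) pairs found on more than one member.

    An empty list means the data would pass a check in the spirit of
    no_shared_confusable_examples.
    """
    counts = defaultdict(int)
    for entry in entries:
        group = entry.get("confusable_group")
        if group is None:
            continue
        for ex in set(entry.get("examples", [])):
            counts[(group, ex)] += 1
    return sorted(key for key, n in counts.items() if n > 1)
```

Running the finder before and after the deduplication step gives a quick sanity check: it should report the colliding pairs first and an empty list afterwards. Entries without a confusable group are left untouched.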
assign_frequency.py feat: YAP-cleaned frequency corpus + two-tier assignment pipeline 2026-03-10 06:22:55 +00:00
check_guid_coverage.py Sprint 11: unified JSON architecture + consolidated scraping pipeline 2026-03-08 10:54:58 +00:00
clean_frequency_corpus.py feat: YAP-cleaned frequency corpus + two-tier assignment pipeline 2026-03-10 06:22:55 +00:00
extract_verb_list.py Sprint 9: cloze cards, plurals deck, project reorg, lint tooling 2026-03-07 08:09:39 +00:00
validate_data.py Sprint 17: homograph example dedup + plural audio + prep extraction 2026-03-14 21:51:35 +00:00