- Regenerated all example sentences from scratch (deleted legacy + stale entries)
- Added .txt file support to epub_examples.py for Ben Yehuda corpus
- 7 Ben Yehuda nikkud'd children's texts + 3 new Time Tunnel EPUBs
- Maqaf-stripped construct form indexing (+68% inflected matches)
- Total: 3,598 words with examples, 3,289 with cloze (was ~2,900)
- Cloze prefix preservation (_cloze_prefix_len)
- Hebrew spoiler stripping from English meanings
- Gender field (זָכָר/נְקֵבָה) on vocab cards
- sec-table CSS layout for aligned key:value pairs
- Mishkal uses mishkal_hebrew on plural cards
- Improved mishkal extraction from pealim detail pages
- 21 new pytest tests (cloze, PoS, Hebrew stripping, gender, mishkal)
- 2 new validate_data.py tests + mishkal stats
- Colliding forms tracking (local-only)
- Release tag v0.17

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
61 lines
1.1 KiB
Text
archive
nikkud.csv
practice.py
cardinal_one_to_ten.*
*.swp
bin**
lib**
include**
lib64**
pyvenv.cfg
venv/
__pycache__/
*.pyc
.pytest_cache/

# Large generated cache files (rebuild locally)
data/benyehuda_index.json
data/colliding_forms.json

# Audio directories (large; rebuild locally)
data/audio/
data/audio_conj/

# Image cache and downloads (rebuild with image_fetch.py)
data/image_cache.json
data/images/

# Output .apkg files (generated by pipeline)
output/

# Internal / private files — not for public repo
ANKIWEB_DESCRIPTION.md
PROJECT_NOTES.md
PROJECTS.md
SPRINT_LOG.md
CLAUDE.md
RECOMMENDATIONS.md

# Intermediate scrape progress files
data/ktiv_male_forms.json.partial
data/ktiv_male_forms_partial.json
data/ktiv_scrape_progress.json
data/noun_slug_map_progress.json
data/top_verbs_to_scrape.json

# EPUB source files (large; user-specific)
data/epubs/

# Stray deck files
Everything__*.apkg
*.apkg

# Legacy CSV files (replaced by data/words.json)
*.csv
data/*.csv

# Dead whitelist files
vulture_whitelist.py

# Release artifacts — distributed via Forgejo releases, not committed to tree
releases/