4 commits

Author SHA1 Message Date
802c369365 v0.14: rescrape vocab, formatting fixes for all decks
- Full pealim.com rescrape: 9,120 words (15 new), all with audio URLs
- Plurals deck: 2:1 regular:irregular ratio (649 notes), RTL arrows, 1.6x hint text
- Conjugation deck: blue infinitive on front, plain meaning on back, nikkud labels
- Confusables deck: larger prompt text (32px), audio only when all words have it
- Validator: non-audio variants no longer false-fail on audio check
- 14 new audio files downloaded

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-07 09:26:41 +00:00
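
The Confusables rule above ("audio only when all words have it") can be sketched roughly as follows. This is an illustration only, not the project's code: the function name `audio_tags` and the `audio` dictionary key are assumptions. The `[sound:...]` form is Anki's standard media reference syntax.

```python
def audio_tags(words):
    """Return Anki sound tags only if every word on the card has audio.

    `words` is a list of dicts; the "audio" key (hypothetical) holds a
    media filename, or is missing/empty when no recording exists.
    """
    # All-or-nothing: a card with partial audio gets no audio at all.
    if not words or not all(w.get("audio") for w in words):
        return []
    return [f"[sound:{w['audio']}]" for w in words]
```

With this shape, a card where one word lacks a recording yields an empty list, so no single word on the card is singled out by being the only one that plays.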
17f7458d19 Sprint 9: cloze cards, plurals deck, project reorg, lint tooling
- Cloze card pipeline: 924 cards from 2,296 AI-vetted Hebrew book sentences
- Plurals deck: 375 notes (144 irregular + 231 regular from 86 mishkal patterns)
- Ktiv male forms expanded to 20,711 entries for sentence matching
- Project reorg: helpers.py (deduped strip_nikkud from 10 files), scripts/ for
  one-off tools, tests/ with smoke tests, deleted 3 dead files
- Lint tooling: pyproject.toml with ruff/vulture/bandit/pytest config, .editorconfig,
  fixed all 129 ruff errors (B023 closure fix, SIM103, unused vars)
- validate_apkg.py: card count range check for optional cloze template
- Data caches committed: vetted_sentences, ktiv_male_forms, noun_plurals,
  noun_slug_map, vocab_sentence_matches, epub_sentence_index

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-07 08:09:39 +00:00
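
The commit above mentions a shared `strip_nikkud` helper in helpers.py but does not show it. One plausible shape, assuming the goal is to drop Hebrew vowel points and cantillation marks so pointed dictionary forms can be compared with unpointed text, is sketched here. The real implementation may differ.

```python
import unicodedata


def strip_nikkud(text: str) -> str:
    """Remove Hebrew points and cantillation marks, keep everything else.

    Assumption: only non-spacing marks (Unicode category Mn) in the range
    U+0591..U+05C7 are dropped, so punctuation such as maqaf (U+05BE)
    survives.
    """
    return "".join(
        ch
        for ch in text
        if not ("\u0591" <= ch <= "\u05c7" and unicodedata.category(ch) == "Mn")
    )


# Pointed "shalom" (shin + shin dot + qamats, lamed, vav + holam, final mem)
pointed = "\u05e9\u05c1\u05b8\u05dc\u05d5\u05b9\u05dd"
plain = strip_nikkud(pointed)
```

Filtering by category rather than by a hard-coded character list keeps the helper short, at the cost of depending on the Unicode database shipped with the interpreter.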
64a1b18951 Sprint 7: emoji/prep extraction, conjugation reduction, project rename
- Item 1/2: Extract emoji and Hebrew parentheticals (prepositions) from
  Meaning field; display emoji with 3.5em font, prep inline after Hebrew
  word. Add Emoji and Prep fields to Hebrew Flash Cards model.
- Item 3: Seeded RNG per verb reduces conjugation cards by ~630 (4 present
  forms → 1 pronoun each; past_3p → 1 gender). 1st-person forms gain gender
  label (זכר/נקבה). Total: 1,834 conj cards (was ~2,464).
- Item 4: hebrew_extract.py uses BeautifulSoup to capture data-audio URLs
  from pealim.com list pages during scraping. step_audio() reads audio_url
  column from CSV (no longer needs audio_extract.py).
- Item 5: Rename to 'Hebrew Flash Cards'. New filenames: hebrew_dict.csv,
  hebrew_extract.py, hebrew_vocabulary.apkg, hebrew_conjugations.apkg.
  Deck/model names updated throughout. Forgejo repo rename pending (sochen
  lacks admin rights — Nevo must do via UI).
- Fix: Deduplicate entries with same Hebrew word before adding notes
  (eliminates GUID collisions from duplicate source CSV rows).
- Bump RELEASE_TAG to v0.11.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-05 05:49:51 +00:00
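
Item 3 above relies on a seeded RNG per verb so that the single pronoun kept for each present form stays the same across rebuilds. A minimal sketch of that idea follows. The form keys, the pronoun table, and the function names are assumptions for illustration, and the seed derivation is one reasonable choice rather than the project's actual one.

```python
import hashlib
import random

# Hypothetical table: candidate pronouns for each of the 4 present forms.
PRESENT_PRONOUNS = {
    "present_ms": ["אני", "אתה", "הוא"],
    "present_fs": ["אני", "את", "היא"],
    "present_mp": ["אנחנו", "אתם", "הם"],
    "present_fp": ["אנחנו", "אתן", "הן"],
}


def rng_for(verb: str) -> random.Random:
    """Build an RNG whose seed depends only on the verb's text."""
    # hashlib is used instead of hash(), which is salted per process for str.
    digest = hashlib.sha256(verb.encode("utf-8")).digest()
    return random.Random(int.from_bytes(digest[:8], "big"))


def pick_present_pronouns(verb: str) -> dict:
    """Choose one pronoun per present form, reproducibly for a given verb."""
    rng = rng_for(verb)
    return {form: rng.choice(options) for form, options in PRESENT_PRONOUNS.items()}
```

Because the seed comes from the verb itself, adding or removing other verbs does not reshuffle existing cards, which would not hold for one global seed consumed in iteration order.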
4fcc5cff60 Sprint 6: release tagging, conjugation front swap, validate_apkg.py
- Add RELEASE_TAG="v0.10" constant; tag all notes (vocab + conj) so users
  can identify which release their cards came from via Anki Browse
- Swap conjugation card front: Pronoun now above Infinitive for easier recall
- Add validate_apkg.py: comprehensive .apkg integrity checker covering ZIP
  structure, media manifest, audio format, DB schema, card counts, sound refs,
  and field content; runs on both decks
- Configure Forgejo v0.10 release with conjugation .apkg as downloadable asset
- Update releases/pealim_conjugations.apkg with tagged notes

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-05 05:09:45 +00:00
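
Two of the checks listed for validate_apkg.py, ZIP structure and media manifest, can be sketched as below. This assumes the legacy .apkg layout (a `collection.anki2` or `collection.anki21` SQLite file plus a JSON `media` manifest mapping numeric member names to filenames). The function names are hypothetical, and the real validator covers much more.

```python
import io
import json
import zipfile


def check_apkg(source) -> list:
    """Return a list of problems found in an .apkg (path or file-like)."""
    problems = []
    with zipfile.ZipFile(source) as zf:
        names = set(zf.namelist())
        if not names & {"collection.anki2", "collection.anki21"}:
            problems.append("missing collection database")
        if "media" not in names:
            problems.append("missing media manifest")
            return problems
        manifest = json.loads(zf.read("media").decode("utf-8"))
        # Every manifest key ("0", "1", ...) must exist as an archive member.
        for key, filename in manifest.items():
            if key not in names:
                problems.append(f"manifest entry {key} ({filename}) has no file")
    return problems


def build_sample(with_media_file: bool) -> io.BytesIO:
    """Build a tiny in-memory archive for trying out check_apkg."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("collection.anki2", b"")  # placeholder, not a real DB
        zf.writestr("media", json.dumps({"0": "word.mp3"}))
        if with_media_file:
            zf.writestr("0", b"fake audio bytes")
    return buf
```

Returning a list of problems instead of raising on the first one lets a single run report everything wrong with a deck.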