Previously --skip-conjugations returned None, causing build_all_variants() to produce near-empty conjugation decks (0.3MB font-only files). Now loads from conjugations.json cache so all 6 release variants build correctly. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> |
||
|---|---|---|
| data | ||
| .gitignore | ||
| apkg_builder.py | ||
| benyehuda.py | ||
| conjugation_extract.py | ||
| extract_verb_list.py | ||
| flashcard.png | ||
| frequency_lookup.py | ||
| hebrew_extract.py | ||
| image_fetch.py | ||
| pealim.apkg | ||
| pealim_dict.csv | ||
| pealim_dict_for_anki.csv | ||
| pealim_extract.py | ||
| README.md | ||
| requirements.txt | ||
| run.py | ||
| test_scrape.py | ||
| validate_apkg.py | ||
| validate_verb_list.py | ||
| verbs_input.txt | ||
Hebrew Flash Cards — Hebrew Vocabulary & Verb Flashcards for Anki
For Hebrew learners
This project generates two Anki decks for learning Modern Hebrew:
- Vocabulary deck — ~9,100 words from pealim.com, with nikkud (vowel marks), roots, parts of speech, related words, and example sentences from classic Hebrew literature.
- Conjugation deck — 70 paradigm verbs from Coffin & Bolozky's A Reference Grammar of Modern Hebrew (2005), fully conjugated in all tenses and persons, across all seven binyanim.
All card data comes from open or academic sources:
- Word data: pealim.com — a free Modern Hebrew dictionary
- Example sentences: Project Ben-Yehuda — public-domain Hebrew literature corpus
- Word frequency: hermitdave/FrequencyWords — Hebrew frequency list
- Verb paradigm list: Coffin, Edna Amir and Shmuel Bolozky. A Reference Grammar of Modern Hebrew. Cambridge University Press, 2005.
Just give me the flashcards
- Download the
.apkgfiles from Releases - Double-click to import into Anki (free, cross-platform)
- Start studying
Both decks can be imported independently. If you already have one, re-importing the same file updates your deck without losing study progress.
What's in the vocabulary deck
Each card has two sides:
Hebrew → English: See the Hebrew word (with nikkud) + hear audio → recall the meaning.
English → Hebrew: See the English meaning → recall the Hebrew word, its root, and how to write it.
Fields on each card:
| Field | Example |
|---|---|
| Hebrew word (nikkud) | שָׁמַר |
| Meaning | kept, watched over |
| Root | שמ״ר |
| Part of speech | פועל (verb) |
| Without nikkud | שמר |
| Related words | שׁוֹמֵר, שְׁמִירָה |
| Example sentence | from Ben-Yehuda corpus |
| Audio | pronunciation from pealim.com |
| Frequency rank | #412 |
Cards are presented in frequency order — Anki will show you the most common words first. Frequency rank is displayed on every card so you can see how common each word is. Words not in the top 50,000 show a "50k+" badge.
What's in the conjugation deck
70 paradigm verbs from Coffin & Bolozky's A Reference Grammar of Modern Hebrew (Appendix 1), covering all seven binyanim:
- פָּעַל (Pa'al), נִפְעַל (Nif'al), פִּעֵל (Pi'el), פֻּעַל (Pu'al)
- הִתְפַּעֵל (Hitpa'el), הִפְעִיל (Hif'il), הֻפְעַל (Huf'al)
Each verb is drilled in: present, past, future, and imperative — all persons and genders. The infinitive is shown on the card front as context but is not quizzed.
Present tense expansion: Each present form generates 3 cards (one per pronoun that uses it), so you learn אֲנִי, אַתָּה, and הוּא all separately with the same masculine singular form.
Modern Hebrew 2fp/3fp: Classical feminine plural future forms (e.g., תִּשְׁמֹרְנָה) are shown in parentheses; the card's primary answer is the modern masculine plural form used in everyday speech.
Passive label: Pu'al and Huf'al cards show the active partner's infinitive on the front (e.g., לְבַטֵּל) followed by (סָבִיל) in smaller text, so you know you're drilling the passive conjugation. Active verbs show no label.
Card order: New cards are introduced in random order.
Citation: Coffin, Edna Amir and Shmuel Bolozky. A Reference Grammar of Modern Hebrew. Cambridge University Press, 2005.
Suggested study strategy
Start with the vocabulary deck. Anki will present the most frequent words first. Don't try to study to many cards every single day-- Anki suggests 20 per day.
The conjugation cards reinforce verb forms you've already seen in vocabulary.
Use the Hebrew → English direction to build reading comprehension. Use the English → Hebrew direction to build writing and speaking recall.
About the data sources
pealim.com — A comprehensive free Modern Hebrew dictionary with nikkud, roots, conjugations, and audio. This project scrapes the public dictionary and conjugation tables.
Project Ben-Yehuda — A public-domain digital library of Hebrew literature. Example sentences come from the nikkud corpus (classic texts with full vowel marks).
FrequencyWords — An open Hebrew word frequency list derived from subtitle data. Used to sort vocabulary cards from most to least common.
Coffin & Bolozky — The verb paradigm list for the conjugation deck comes from Appendix 1 of A Reference Grammar of Modern Hebrew (Cambridge University Press, 2005), which provides a comprehensive reference for Modern Hebrew verbal morphology.
Fixing errors
If you notice a wrong translation, missing audio, or incorrect conjugation:
- For vocabulary errors: the source is pealim.com — you can suggest corrections there. But if you think morfix has a correct translation and pealim.com does not, we may be able to encode an override.
For any other issue, whether you know to code or not: Email me at pealim [at] nevo [dot] engineer
For developers
Installation
pip install -r requirements.txt
Quick test (20 words, no network)
python run.py --skip-scrape --skip-audio --skip-examples --test 20
Full pipeline
# Use cached dictionary (recommended after first run)
python run.py --skip-scrape
# Full rebuild including verb list extraction from PDF
python extract_verb_list.py
python run.py --skip-scrape --refresh-examples
CLI options
python run.py [options]
--skip-scrape Use cached data/hebrew_dict.csv (no pealim.com scraping)
--skip-audio Skip audio .mp3 downloads
--skip-examples Skip Ben Yehuda example fetching
--only {vocab,conjugations} Run only one deck (skips all unrelated steps)
--skip-conjugations Skip verb conjugation extraction (deprecated: use --only vocab)
--skip-images Skip image fetching for concrete nouns
--refresh-examples Force rebuild of Ben Yehuda index (nikkud corpus)
--test N Process only first N words
Output files
| File | Description |
|---|---|
data/hebrew_dict.csv |
Raw dictionary |
data/hebrew_dict_for_anki.csv |
Enriched Anki CSV |
data/conjugations.json |
Verb conjugation data |
data/audio/ |
Vocabulary audio (.mp3) |
data/audio_conj/ |
Conjugation audio (.mp3) |
data/fonts/ |
Heebo font files (bundled in .apkg) |
data/images/ |
Noun images from Wikipedia/Commons |
data/image_cache.json |
Image fetch cache |
output/hebrew_vocabulary.apkg |
Vocabulary Anki deck |
output/hebrew_conjugations.apkg |
Conjugation Anki deck |
Pipeline overview
hebrew_extract.py— scrapes pealim.com dictionaryfrequency_lookup.py— downloads/loads Hebrew frequency databenyehuda.py— builds sentence index from Ben-Yehuda corpusextract_verb_list.py— extracts verb list from Coffin & Bolozky PDFconjugation_extract.py— fetches conjugation tables from pealim.comimage_fetch.py— fetches Wikipedia/Commons images for concrete nounsvalidate_verb_list.py— validates verb list against pealim.comapkg_builder.py— assembles both.apkgfilesrun.py— orchestrates all steps
AnkiWeb
The decks will be published as shared decks on AnkiWeb (TBD).
