diff --git a/.gitignore b/.gitignore
index 81debe4..9a79832 100644
--- a/.gitignore
+++ b/.gitignore
@@ -15,8 +15,13 @@ __pycache__/
 # Large generated cache files (rebuild locally)
 data/benyehuda_index.json

-# Audio directory (large; rebuild with --skip-scrape)
+# Audio directories (large; rebuild locally)
 data/audio/
+data/audio_conj/
+
+# Image cache and downloads (rebuild with image_fetch.py)
+data/image_cache.json
+data/images/

 # Output .apkg files (generated by pipeline)
 output/
diff --git a/README.md b/README.md
index 36789b6..f98d5ca 100644
--- a/README.md
+++ b/README.md
@@ -8,8 +8,8 @@

 This project generates two Anki decks for learning Modern Hebrew:

-- **Vocabulary deck** — ~14,400 words from [pealim.com](https://www.pealim.com/dict/), with nikkud (vowel marks), roots, parts of speech, related words, example sentences from classic Hebrew literature, and audio pronunciation.
-- **Conjugation deck** — 71 paradigm verbs from Coffin & Bolozky's *A Reference Grammar of Modern Hebrew* (2005), fully conjugated in all tenses and persons, across all seven binyanim.
+- **Vocabulary deck** — ~9,100 words from [pealim.com](https://www.pealim.com/dict/), with nikkud (vowel marks), roots, parts of speech, related words, example sentences from classic Hebrew literature, and audio pronunciation.
+- **Conjugation deck** — 69 paradigm verbs from Coffin & Bolozky's *A Reference Grammar of Modern Hebrew* (2005), fully conjugated in all tenses and persons, across all seven binyanim.

 All card data comes from open or academic sources:
 - Word data: [pealim.com](https://www.pealim.com) — a free Modern Hebrew dictionary
@@ -56,7 +56,7 @@ Cards are presented in **random order** within Anki's spaced-repetition system,

 ## What's in the conjugation deck

-71 paradigm verbs from Coffin & Bolozky's *A Reference Grammar of Modern Hebrew* (Appendix 1), covering all seven binyanim:
+69 paradigm verbs from Coffin & Bolozky's *A Reference Grammar of Modern Hebrew* (Appendix 1), covering all seven binyanim:

 - פָּעַל (Pa'al), נִפְעַל (Nif'al), פִּעֵל (Pi'el), פֻּעַל (Pu'al)
 - הִתְפַּעֵל (Hitpa'el), הִפְעִיל (Hif'il), הֻפְעַל (Huf'al)
@@ -76,9 +76,9 @@ Each verb is drilled in: present, past, future, imperative, infinitive — all p

 ## Suggested study strategy

-Start with the vocabulary deck. Anki will present the most frequent words first. Aim for 10–20 new cards per day.
+Start with the vocabulary deck. Anki will present the most frequent words first. Don't try to study too many cards each day; Anki suggests 20 per day.

-Once you have ~300–500 vocabulary words, add the conjugation deck. The conjugation cards reinforce verb forms you've already seen in vocabulary.
+The conjugation cards reinforce verb forms you've already seen in vocabulary.

 Use the Hebrew → English direction to build reading comprehension. Use the English → Hebrew direction to build writing and speaking recall.

@@ -86,7 +86,7 @@ Use the Hebrew → English direction to build reading comprehension. Use the Eng

 ## About the data sources

-**pealim.com** — A comprehensive free Modern Hebrew dictionary with nikkud, roots, conjugations, and audio. This project scrapes the public dictionary listing (not conjugation tables, which are covered separately).
+**pealim.com** — A comprehensive free Modern Hebrew dictionary with nikkud, roots, conjugations, and audio. This project scrapes the public dictionary and conjugation tables.

 **Project Ben-Yehuda** — A public-domain digital library of Hebrew literature. Example sentences come from the nikkud corpus (classic texts with full vowel marks).

@@ -100,9 +100,9 @@ Use the Hebrew → English direction to build reading comprehension. Use the Eng

 If you notice a wrong translation, missing audio, or incorrect conjugation:

-- For vocabulary errors: the source is pealim.com — you can suggest corrections there.
-- For conjugation errors: open an issue in this repository with the verb and the correct form.
-- For example sentence issues: open an issue with the word and sentence.
+- For vocabulary errors: the source is pealim.com — you can suggest corrections there. If you think Morfix has the correct translation and pealim.com does not, we may be able to encode an override.
+
+For any other issue, whether or not you know how to code, email me at pealim [at] nevo [dot] engineer

 ---

@@ -177,4 +177,4 @@ python run.py [options]

 ## AnkiWeb

-The generated decks will be published on AnkiWeb. See `ANKIWEB_DESCRIPTION.md` for the submission content.
+The decks will be published as shared decks on AnkiWeb (TBD).