For more than a century, scholars have pored over the Cairo Geniza—the largest collection of medieval Jewish manuscripts in the world—knowing that its 400,000 fragments hold secrets of Jewish life, learning, and law from a millennium ago. But until now, most of these documents have remained locked away, not by secrecy or censorship, but by the sheer impossibility of reading them all. That’s about to change.
Israeli researchers have launched MiDRASH, a groundbreaking artificial intelligence project that will transcribe the entire Geniza collection, making a thousand years of Jewish history searchable, accessible, and alive for the first time. The National Library of Israel announced this week that the project—funded by a €10 million grant from the European Research Council—has already begun digitally transcribing hundreds of thousands of manuscript fragments, with plans to process 10 million additional images of Hebrew manuscripts in the coming months.
Jewish law forbids discarding fragments of writing from holy literature. Instead, worn-out or damaged writings are stored in a designated area called a genizah (meaning ‘to hide’ or ‘to put away’), in a synagogue or a cemetery, until they can be given a proper burial.
The Ben Ezra Synagogue in Old Cairo served as more than a house of prayer. For a thousand years, its genizah room accumulated everything from sifrei kodesh (holy books) and teshuvos (Rabbinic correspondence) to personal letters and business contracts. Jewish law forbids discarding any document bearing God’s name, so the community stored them all. Egypt’s dry climate preserved what would have rotted anywhere else, creating an accidental time capsule of Jewish civilization during the Middle Ages, when ninety percent of world Jewry lived under Muslim rule.
The scale of what survived is staggering. Among the fragments are writings in Maimonides’ own hand, including previously unknown commentaries on the Jerusalem Talmud and portions of his Mishneh Torah. There are piyutim, halachic rulings, merchant correspondence, and private notes passed from rebbi to talmid. One transcribed document captures a 16th-century widow from Yerushalayim writing in Yiddish to her son in Egypt, with his reply scrawled in the margins describing his struggle to survive a plague sweeping through Cairo.
“We got to know lots of new texts, lots of new versions of texts we already knew, and we learned a huge number of things,” said Daniel Stökl Ben Ezra, professor at the École Pratique des Hautes Études in Paris and one of the project’s lead researchers.
But discovering the Geniza’s treasures is one thing; reading them is another. The fragments are written in Hebrew, Aramaic, Judeo-Arabic, and Yiddish, in countless handwritten scripts that vary wildly in legibility. Although the entire collection was photographed and digitized starting in 2006, fewer than 15 percent of the documents have been transcribed. The rest have been sitting in digital archives, visible but largely unreadable.
MiDRASH aims to change that. Using an open-source platform called eScriptorium, researchers have trained artificial intelligence to recognize and transcribe Hebrew script in all its medieval variations. The AI learns by studying existing transcriptions prepared by scholars, then applies that knowledge to decode new manuscripts. When the system encounters difficult texts, human researchers review its work, and their corrections are fed back into the training. The result is a feedback loop that makes the AI increasingly accurate while greatly speeding up the pace of discovery.
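For readers curious what such a review loop might look like in practice, here is a minimal sketch in Python. It is not eScriptorium's API and not the project's actual pipeline: the function names, the use of character error rate as a quality measure, and the sample strings are all assumptions made for illustration. The idea is simply to measure how far the machine's reading of a fragment is from a scholar's corrected reading, and to keep the corrected fragments as new training examples.

```python
from typing import List, Tuple


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def character_error_rate(machine: str, corrected: str) -> float:
    """Share of characters the machine got wrong, relative to the correction."""
    if not corrected:
        return 0.0 if not machine else 1.0
    return edit_distance(machine, corrected) / len(corrected)


def collect_training_pairs(
    reviewed: List[Tuple[str, str, str]],
) -> List[Tuple[str, str]]:
    """Keep (fragment_id, corrected_text) wherever the reviewer changed something.

    `reviewed` holds hypothetical (fragment_id, machine_text, corrected_text)
    triples; the fragment identifiers are invented for this example.
    """
    return [
        (fragment_id, corrected)
        for fragment_id, machine, corrected in reviewed
        if character_error_rate(machine, corrected) > 0.0
    ]


if __name__ == "__main__":
    reviewed = [
        ("fragment-001", "abc", "abc"),    # machine reading accepted as is
        ("fragment-002", "abcd", "abce"),  # reviewer fixed one character
    ]
    for fragment_id, machine, corrected in reviewed:
        print(fragment_id, character_error_rate(machine, corrected))
    print(collect_training_pairs(reviewed))
```

In a real system the corrected pairs would be used to retrain the recognition model, and the error rate on freshly reviewed fragments would show whether each round of training is actually helping.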
The Sages teach us: “Turn it over and turn it over, for everything is in it” (Pirkei Avot 5:22). The rabbis who filled the Geniza never imagined their administrative records, legal debates, and personal correspondence would one day illuminate how Torah spread across continents and cultures. Yet that is exactly what MiDRASH promises to reveal. By making millions of words searchable and comparable, researchers can now trace how midrashim, hanhagos, and methods of learning shifted as Jewish communities moved between Baghdad, Damascus, Cairo, and beyond. The history of Torah thought can finally be mapped across centuries.
This effort stands in a direct line from David Ben-Gurion’s 1950 decision to establish the Institute of Microfilmed Hebrew Manuscripts. Recognizing that countless seforim and manuscripts scattered across the world could never physically return to Yerushalayim, Ben-Gurion ordered them photographed for posterity. Over 1,500 collections were eventually microfilmed. The Cairo Geniza was digitized beginning in 2006. MiDRASH represents the next stage: not just preserving images, but unlocking the words themselves.
“The possibility to reconstruct, to make a kind of Facebook of the Middle Ages, is just before our eyes,” Stökl Ben Ezra said. The comparison is apt. Just as social networks map relationships between people today, MiDRASH will map intellectual relationships across medieval Jewish scholarship—showing who quoted whom, who borrowed from whom, and how ideas traveled from scholar to scholar across generations.
The National Library has launched a “Transcribe-a-thon,” inviting trained volunteers to help refine the AI’s work so that others can benefit from accurate texts. The researchers acknowledge the software will make mistakes, but human oversight ensures the final product meets scholarly standards. The full collection is expected to go online within a year, accessible to researchers and laypeople alike.
Cairo was the greatest city of the medieval Middle East, a center of global trade, learning, and science that surpassed both Damascus and Baghdad. Its Jewish community flourished there, later swelled by refugees fleeing newly Christian Spain. Maimonides himself, physician to the family of Saladin, worshipped at Ben Ezra. While dynasties rose and fell, the community went about its daily life, its religious authorities filling the genizah with rabbinical arguments, civic records, and the detritus of intellectual and administrative business.
The discovery of this treasure in the late 19th century revolutionized Jewish scholarship. Now, artificial intelligence promises a second revolution. Lost midrashim, forgotten versions of well-known texts, correspondence between gedolim, and previously unknown pieces of Torah literature all await discovery. What took scholars a century to partially uncover may soon be available to anyone with an internet connection.