Persian institute kickstarts digital revolution

A leading Persian studies center is partnering with technology experts in an ambitious new digital humanities project.

by Aysha Khan, Dec. 15, 2015

The University of Maryland's Roshan Institute for Persian Studies is taking full advantage of the thriving field of digital humanities, which brings computer science together with literature, art, history and language.

"Digital projects to do with the Middle East are only just beginning to happen," said Iranian scholar Fatemeh Keshavarz, who directs both the Roshan Institute and the School of Languages, Literatures and Cultures (SLLC).

A sprawling new initiative from the Roshan Institute, housed a few hallways from the SLLC offices in the University of Maryland's Jimenez Hall, aims to change that paradigm.

The Roshan Initiative in Persian Digital Humanities, nicknamed PersDig@UMD, emerged from collaborations between Persian scholars around the world and a number of local technology-oriented organizations, said assistant director Matthew Miller.

"The Maryland Institute for Technology in the Humanities, down in Hornbake Library, is one of the strongest digital humanities centers in the world," Miller said. "And the Roshan Institute is one of the best Persian studies centers in the world. So there's this perfect synergy that makes the University of Maryland the perfect place for this initiative to begin."

ROSHAN INSTITUTE, located in Jimenez Hall, aims to combine corpuses of
poetry with translations and interpretations. Photos by Aysha Khan

The initiative is working on six major projects, from archiving millions of tweets related to important Persian diasporic issues, like the Iran nuclear deal, for analysis, to producing original open-source academic e-books in Persian.

With the help of experts on campus and in affiliated organizations like the Library of Congress and the Hill Museum & Manuscript Library, the Roshan Initiative is acting as a hub for a diverse set of projects in the field, Keshavarz said.

The initiative is starting with the Persian Digital Library, an open-access corpus of more than 60,000 Persian poems. The team estimates that the library, which will range from the texts themselves to dictionaries and animations of classical Persian tales by artist Rashin Kheiriyeh, will be complete by next fall.

"There's no corpus of Persian language like there is in European languages," Miller said. "Even Arabic has some resources, but Persian doesn't even have that. These poems are out there, floating around the internet and libraries, but they're not all compiled and aligned with translations and dictionary lookups that already exist."

When the Roshan Institute first considered a digitization project, its fundamental aim was to help make Persian material available online, Keshavarz said.

But new possibilities to bring the plain text to life quickly emerged: it could be searchable, it could have a built-in dictionary, it could have morphological and linguistic analysis tools, it could be connected to various commentaries on the text.

Within a week of completing their crowdfunding venture, Miller's team cleaned the existing texts, which he called "very dirty corpuses" with duplicate files and inconsistent formatting, and structured the poetry corpus for international scholarly use. Now, file structure will be no barrier to producing metadata or conducting computational analyses of the texts, he said.
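For readers curious what that kind of cleanup can involve, here is a minimal sketch in Python. It is not the team's actual pipeline, which the article does not describe. It assumes the poems are already loaded as plain-text strings keyed by file name, and that "inconsistent formatting" includes two issues commonly seen in Persian text: Arabic code points used in place of their Persian counterparts, and stray whitespace. The names `normalize` and `deduplicate` are hypothetical.

```python
import hashlib
import unicodedata

# Assumption: Arabic yeh and kaf stand in for the Persian letters in some files.
CHAR_MAP = str.maketrans({
    "\u064a": "\u06cc",  # ARABIC LETTER YEH -> ARABIC LETTER FARSI YEH
    "\u0643": "\u06a9",  # ARABIC LETTER KAF -> ARABIC LETTER KEHEH
})


def normalize(text: str) -> str:
    """Apply Unicode NFC, unify letter variants, and tidy whitespace."""
    text = unicodedata.normalize("NFC", text).translate(CHAR_MAP)
    # Collapse runs of whitespace inside each line and drop empty lines.
    lines = (" ".join(line.split()) for line in text.splitlines())
    return "\n".join(line for line in lines if line)


def deduplicate(poems: dict[str, str]) -> dict[str, str]:
    """Keep the first file for each distinct normalized text."""
    seen = set()
    unique = {}
    for name, text in poems.items():
        clean = normalize(text)
        digest = hashlib.sha256(clean.encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique[name] = clean
    return unique


if __name__ == "__main__":
    # Two hypothetical files holding the same words with different encodings.
    poems = {
        "a.txt": "  \u0643\u062a\u0627\u0628   \u064a\n\n",
        "b.txt": "\u06a9\u062a\u0627\u0628 \u06cc",
    }
    print(list(deduplicate(poems)))
```

Normalizing before hashing is the key design choice here: two files that differ only in encoding quirks or spacing hash to the same value, so only one copy is kept.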

His team is also well on its way to incorporating morphological analysis into different reading environments and apps, so scholars can read through a text on these platforms and look words up in a dictionary just by clicking.

"We're serving a very broad range of audience members, from scholars and students of Persian literature to individual Iranians who want to read a clean, easily accessible poem of Hafez," Miller said.

The Persian diaspora tends to be fairly tech-savvy, Keshavarz pointed out. A major 2008 study estimated there were about 60,000 regularly updated blogs in the Persian language, from secular reformist blogs to Persian poetry and literature blogs. In the first decade of the 2000s, this rich online discussion space became what the Washington Post called a "Blogistan."

Besides some grants from the university, more than 85 donors contributed $11,400 through the initiative's crowdfunding page, a full 113 percent of its original goal.

"The field of Persian digital humanities doesn't really exist yet," Miller said. "We're kind of pioneering it. And we've really just started five months ago. Persian digital humanities is still so young that we're just trying to see the vast array of possibilities on the horizon. The best ideas are yet to even come."

© Copyright 2015. Featured image courtesy Roshan Institute