Decoding with Diamond

The UK’s national synchrotron science facility, Diamond Light Source, in Oxfordshire, is being used to ‘virtually unwrap’ a set of 2,000-year-old scrolls that were buried and damaged by the eruption of Vesuvius in AD 79. Tom Austin-Morgan finds out how.

Researchers led by ancient artefacts decoder Professor Brent Seales will use Diamond’s powerful light source to virtually unwrap two complete scrolls and four fragments from the Herculaneum scrolls.

Prof Seales is director of the Digital Restoration Initiative at the University of Kentucky, a research programme dedicated to developing software to enable the recovery of fragile, unreadable texts. He says: “Diamond Light Source is an absolutely crucial element in our long-term plan to reveal the writing from damaged materials, as it offers unparalleled brightness and control for the images we can create, plus access to a brains trust of scientists who understand our challenges.”

Over the past two decades, Prof Seales and his team have worked to digitally restore and read the vast amount of material in the “invisible library” of irreparably damaged manuscripts. In 2015 they succeeded in visualising the writing trapped inside five complete wraps of an ancient Hebrew scroll from En Gedi, Israel. For the first time ever, a complete text from an object so severely damaged that it could never be opened physically was digitally retrieved and recreated.

According to Seth Parker, technical project manager on Prof Seales’ team, the Israel Antiquities Authority excavated a synagogue at En Gedi in the 1970s: “They found a burnt script, about the length of your pinky finger,” he explains. “They kept it in a vault until 2014 or 2015, when they decided to CT scan it. But when they got the data back, they couldn’t see anything inside of it.”

The Israel Antiquities Authority contacted Prof Seales and gave him the scroll data. “This was right around the time that our software was really coming online for doing this virtual unwrapping process,” Parker continues. “I sat down with the data and in a day had something that looked like vague smudges, it kind of looked like text.

“We then got the high-resolution data set from them and re-did that process. When we sent it back, they said ‘you’ll never believe what you discovered; this is a copy of the book of Leviticus, and given the time period that we think the scroll is we think it’s one of the oldest copies’.

“We proceeded to process the entire data set and produced an entire virtual unwrapping of that scroll. We had two columns of text, the first showing Leviticus Chapter One, and the second showing Chapter Two. The resolution of the scan allowed us to bring the images out at a resolution around 1500 dpi, much larger than even archival quality photography.”

The software, which the team calls ‘Volume Cartography’, breaks the deciphering of these damaged and un-openable scrolls into three stages. The first is segmentation, where it identifies the individual layers of the scroll and the legends on them. The second is sampling, where it takes the structure and flattens it into a 2D image. The third is texturing, which looks at the CT data and figures out how it would look in three dimensions.

“All three have to be in place otherwise you can’t read anything inside of these scrolls,” says Parker.

Prof Seales’ long-term goal has been to reveal the contents of the Herculaneum scrolls, the most iconic items in the invisible library. Buried and carbonised by the eruption of Mount Vesuvius, the scrolls are too fragile to be opened and represent the perfect storm of important content, massive damage, extreme fragility, and difficult-to-detect ink, as it too is composed of carbon.

These papyri were discovered in 1752 in an ancient Roman villa near the Bay of Naples believed to belong to the family of Julius Caesar. The majority of the 1,800 scrolls reside at the Biblioteca Nazionale di Napoli, although a few were offered as gifts to dignitaries by the King of Naples and ended up at the Bodleian Library at Oxford University, the British Library, and the Institut de France.

The two scrolls and four fragments that will be scanned at Diamond come from the Institut de France. Because the four fragments contain many layers and feature visible, exposed writing on the top, they will provide the key data needed to develop the next iteration of the team’s virtual unwrapping software pipeline, a machine learning algorithm that will enable the visualisation of the carbon ink.

“It’s ironic, and somewhat poetic, that the scrolls sacrificed during the past era of disastrous physical methods will serve as the key to retrieving the text from those that survive but are unreadable,” says Prof Seales. “And by digitally restoring and reading these texts, which are arguably the most challenging and prestigious to decipher, we will forge a pathway for revealing any type of ink on any type of substrate in any type of damaged cultural artefact.”

The use of carbon ink is the main reason these scrolls have evaded deciphering. Unlike metal-based inks, such as the iron gall used to write medieval documents, carbon ink has a density similar to the carbonised papyrus upon which it sits, meaning it appears invisible in X-ray scans.

“We do not expect to immediately see the text from the upcoming scans,” says Prof Seales. “But they will provide the crucial building blocks for enabling that visualisation.

“First, we will immediately see the internal structure of the scrolls in more definition than has ever been possible, and we need that level of detail to ferret out the compressed layers on which the text sits. In addition, we believe strongly – and contrary to conventional wisdom – that tomography does indeed capture subtle, non-density-based evidence of ink, even when it is invisible to the naked eye in the scan data. The machine learning tool we are developing will amplify that ink signal by training a computer algorithm to recognise it – pixel by pixel – from photographs of opened fragments that show exactly where the ink is – voxel by voxel – in the corresponding tomographic data of the fragments. The tool can then be deployed on data from the still-rolled scrolls, identify the hidden ink, and make it more prominently visible to any reader.”
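The train-on-fragments, apply-to-scrolls idea can be sketched in a few lines of Python. This is a minimal illustration under loud assumptions, not the team's tool: the two per-voxel features are invented, the data is synthetic, and plain logistic regression stands in for whatever model the researchers actually use.

```python
import numpy as np

def make_voxels(rng, n, ink_fraction=0.3):
    """Synthetic stand-in for tomographic data: two made-up features per voxel."""
    is_ink = rng.random(n) < ink_fraction
    feats = rng.normal(0.0, 1.0, size=(n, 2))
    # Assumption: ink shifts the features a little; the real signal is far subtler.
    feats[is_ink] += np.array([1.5, -1.0])
    return feats, is_ink.astype(float)

def add_bias(feats):
    """Append a constant column so the model can learn an offset."""
    return np.hstack([feats, np.ones((len(feats), 1))])

def train(feats, labels, lr=0.1, epochs=500):
    """Fit logistic regression by plain gradient descent."""
    X = add_bias(feats)
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-X @ w))
        w -= lr * X.T @ (p - labels) / len(labels)
    return w

def predict_ink(feats, w):
    """Label a voxel as ink when the predicted probability exceeds 0.5."""
    return 1.0 / (1.0 + np.exp(-add_bias(feats) @ w)) > 0.5

rng = np.random.default_rng(0)
# "Opened fragment": voxels whose ink labels are known from a photograph.
frag_feats, frag_labels = make_voxels(rng, 2000)
weights = train(frag_feats, frag_labels)
# "Rolled scroll": unseen voxels; the truth is kept only to score the toy.
scroll_feats, scroll_truth = make_voxels(rng, 1000)
ink_mask = predict_ink(scroll_feats, weights)
accuracy = np.mean(ink_mask == scroll_truth.astype(bool))
print(f"toy accuracy on unseen voxels: {accuracy:.2f}")
```

The design mirrors the quote: labels exist only where the writing is exposed, and the learned weights are then carried over to data where nothing can be seen directly.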

Parker suggests thinking of the surface of the papyrus as a waffle and the ink as syrup: “When you pour syrup into those holes they fill up, that’s basically what the ink is doing on the surface of the papyrus. You’re taking this thing that has a lot of highly irregular structure, lots of deep holes, and you’re filling in those gaps, making it slightly smoother. The CT scan does pick this up if the resolution is high enough, then you can start to differentiate where the papyrus is light because the surface is very rough, and where the papyrus has ink on it, because it’s not as rough.”
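Parker's waffle-and-syrup picture amounts to measuring local roughness. The Python sketch below shows one simple way to do that on a made-up surface profile; the window size, the threshold and the synthetic heights are all assumptions, and real scan data would be three-dimensional and much noisier.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def local_roughness(profile, window=15):
    """Standard deviation of surface height inside a sliding window."""
    half = window // 2
    padded = np.pad(profile, half, mode="edge")  # keep output the same length
    return sliding_window_view(padded, window).std(axis=1)

rng = np.random.default_rng(1)
heights = rng.normal(0.0, 1.0, size=300)  # bare papyrus: rough surface
heights[100:200] *= 0.2                   # "syrup": ink smooths this stretch

roughness = local_roughness(heights)
ink_guess = roughness < 0.5               # hypothetical threshold for "smooth"
```

In this toy, the smoothed stretch is flagged as likely ink while the rough stretches mostly are not. A fixed threshold like this would be too crude for real papyrus, which is part of why the team turns to machine learning.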

Because of the highly irregular construction of the Herculaneum papyrus – not only due to the construction of the papyrus itself, but also because of the damage caused by the pyroclastic flow from the volcanic eruption – machine learning will be used to speed up the process by filtering out as much irregularity as possible. It will focus on the parts of the CT scans that represent the ink, and work out whether any gaps are simply the spaces between the layers of the scroll or damaged areas on the same layer.

“The segmentation (layer identification) is the only manual part of the process,” Parker explains. “If you can do that automatically, then you can basically take a CT scan, throw it on the server, let it run as long as it needs and then out pops your scroll.”

The scanning of these delicate items at Diamond will be a mammoth undertaking for all involved. Because of the scrolls’ extreme fragility, custom-fit cases have been 3D printed so that they need as little handling as possible before being inserted into the I12 beamline at Diamond. The I12 or JEEP (Joint Engineering, Environmental, and Processing) beamline is a high-energy X-ray beamline for imaging, diffraction and scattering, which operates at photon energies of 53-150 keV.

So far, around 15 terabytes of data has been collected from the various scans at Diamond. Where the En Gedi scroll took a matter of months to decipher, the Herculaneum scrolls may take years, though Parker is confident that the text will be deciphered faster.

However, he says: “The history of these scrolls is littered with stories of destruction. When they were first excavated, the first scrolls were thrown on a fire because people thought they were logs... We don’t want to be another part of that story of destruction. We want to be safe, but we also want to read them, and we want to reveal that text to everyone in the world.”