Kevin Sacca

I am a scientist who utilizes multispectral imaging to recover and preserve information from old documents. AMA!

My short bio: My name is Kevin Sacca and I am a senior undergraduate student researcher at the Rochester Institute of Technology (R.I.T) working in the Chester F. Carlson Center for Imaging Science. I work with Dr. Roger Easton Jr. and his multidisciplinary team of scientists and scholars who share a common passion for preserving the information and cultural heritage that is inherent in antique manuscripts, paintings, palimpsests, scrolls, etc.

Personally, I have been working on the Archimedes Palimpsest, the Martellus Map of 1491, and various unknown documents from the St. Catherine’s Monastery in Mount Sinai. My job is to use the raw multispectral imagery, perform statistical image processing routines, and generate imagery showing much more clearly the text, diagrams, or figures that have been almost lost due to fading or “palimpsesting”.

My Proof: It’s hard to show my proof because I can’t disclose 99.9% of my work, and an image of me holding a card with my reddit username won’t help much… So here’s a press release of a scholarship I was awarded this year for my contributions to the field of Imaging Science and Photonics from SPIE. Press Release

[Edit]: I should mention that the reason for this AMA is due to a number of people requesting it after I posted on this front-page TIL thread: http://www.reddit.com/r/todayilearned/comments/39xhr8/til_of_a_monk_who_had_taken_an_old_book_written/

What was the (presumably) oldest document you ever recovered?

I know my colleagues have worked on the Dead Sea Scrolls, which I’m pretty sure are as old as they come. But personally, the Archimedes Palimpsest is the oldest document I’ve worked on.

I’m new to this whole world and gave both The Dead Sea Scrolls and Archimedes Palimpsest a Google. Your job is the coolest ever and so is this AMA. Can you ELI5 the findings on both of those?

Wow, thank you! While I’m not sure of the significance of the findings for the Dead Sea Scrolls or the Archimedes (in terms of history, culture, or the translations), I do know that so far, there have been many successful recoveries on both documents. I am not literate in the languages of the text I try to reveal, so unfortunately I can’t read as I go; I pass along my data to the experts who can.

Getting successful recoveries is the best outcome because:

1) The information is now preserved digitally and won’t fade over time, so it can be studied for an indefinite period of time.

2) We are able to reveal text that hasn’t been looked at in over 2000 years… We hope that the text is significant in terms of its content, and if it is, it can literally change what we know about our world’s history.

So you just focus in on the actual process of uncovering the info, not the analysis?

Yeah, in other words, I really just try to provide useful images to those who can read and make sense of the information.

Was there ever a document you were working on that completely infatuated your interest outside of the technical details of uncovering hidden text? Like, a piece of history that really interested you in all ways?

Yes, absolutely. The Martellus Map really grabbed my attention, and I was pretty desperate to start working on it. I bugged my advisor for a very long time before I started working with him. And while I know there are millions of people who know more about history and maps than I do, I think it’s so interesting to learn first-hand what the educated people in the 15th century thought about the world.

Scattered all around the map are these legends and cartouches which contain text that very briefly describe the land there. They can be pretty funny to us now, but it’s really interesting to see how our knowledge of the world progressed to what it is now.

Digital information doesn’t fade, but formats/filesystems change so rapidly that you can quickly find yourself without any data that is readable. https://en.wikipedia.org/wiki/Digital_dark_age

On top of storing our images in multiple file formats, uncompressed and compressed, we store copies of them in multiple places! I don’t think that we are concerned about not being able to read them.

Are there any dangers involved during the processing of ancient manuscripts? Could a misstep lead to a portion of a text being lost, and has this ever happened?

Yes, there is significant risk in removing some of the documents from their casing and exposing them to intense light. However, my group is very conscientious about how we expose the target. Previous imaging teams used broad-spectrum light, which exposed the targets to way more light than they should have, and that resulted in heat transfer and the loss of ink/pigment.

We use LED light sources which emit a very narrow bandpass, which dramatically reduces the light energy incident upon the target to the point where there is almost no risk.

I can’t imagine a job where I could get fired for turning the lights on too bright… What’s the worst thing you’ll admit to screwing up?

Well, I’ve honestly been quite good about not screwing things up. The worst I’ve done so far is forget which hard drive (of like 80 4TB hard drives that surround me) I stored my data on. So, I had to start over, which is fine, but all that data took some time to process again.

Wow. Move over Hilary Duff. I’ll be stealing the spotlight today. Thank you so much! To have so many people ask me questions about the stuff I spend so much time doing is actually really nice!

Can you walk us through the general practices of finding the overwritten text?

Well, we capture high-resolution, narrow-band multispectral imagery using tunable LED light sources and color filter wheels in combination with our multispectral camera.

We then take our massive data products into ENVI, a software package from Exelis Inc., which is very useful for image processing. Usually used for Remote Sensing purposes, ENVI also contains statistical processing routines that are invaluable for our research task too!

Within ENVI, we simply use combinations of statistical processing routines, like Principal Component Analysis, Independent Component Analysis, and spectral math to generate imagery whose contents are visually enhanced.
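To give a feel for what a Principal Component Analysis does to a stack of band images, here is a minimal sketch in Python with NumPy. It is not ENVI's routine and not the team's actual workflow; the function name `pca_bands`, the cube layout (rows, cols, bands), and the random test data are all assumptions made for illustration.

```python
import numpy as np

def pca_bands(cube):
    """Project a (rows, cols, bands) image cube onto its principal components.

    Hypothetical helper for illustration only; returns the component images
    (same shape as the input) and the variance explained by each component.
    """
    rows, cols, bands = cube.shape
    pixels = cube.reshape(-1, bands).astype(np.float64)
    pixels -= pixels.mean(axis=0)              # center each band on zero
    cov = np.cov(pixels, rowvar=False)         # bands x bands covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)     # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]          # reorder: largest variance first
    components = pixels @ eigvecs[:, order]    # rotate pixels into the PC basis
    return components.reshape(rows, cols, bands), eigvals[order]

# Synthetic stand-in for a 12-band capture (real data would be loaded from disk).
rng = np.random.default_rng(0)
cube = rng.random((64, 64, 12))
pcs, variances = pca_bands(cube)
```

The first few component images carry most of the variance shared across bands; in practice, faint undertext often shows up in the later components, so it is worth looking through all of them rather than only the first.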

Can you talk a little more about these instruments? Makes and models. Just curious to see the equipment.

I’m not positive of the model (maybe the EV?), but check out MegaVision! MegaVision — Cultural Heritage Division

We most often use a MegaVision multispectral camera for our imaging, mainly because we have a colleague who works for them, but also because their equipment is top!

I actually worked on a competitor’s multispectral & hyperspectral image processing tools for remote sensing about a decade ago. I’d like to second a request for more info on the hardware and frequencies used since I only really worked with the software end.

We capture at wavelengths from 365 to 940 nanometers in steps of ~50 nm. Then we perform fluorescent ultraviolet imaging, which uses ultraviolet light, but not at specific wavelengths.

In your estimation, is the image science research community working at a sufficient rate to preserve our great libraries before the books are inevitably lost? How much longer do the Martellus Map of 1491 and other artifacts like it stand to survive in their present cozy, climate-controlled environments?

Well, considering the amount of text that was lost on the Martellus in the 500 years since it was painted, and also considering the fact that the documents that many people want imaged are much, much older than 500 years old, we need to act as soon as we can to try to preserve as much as possible.

In my opinion, I think not. We need a larger community working on this type of research. The imaging science community is surprisingly small considering its applications in almost every other field in the world. While that kind of job security should make me happy, it actually worries me that we aren’t putting enough resources into preserving our history and culture when it’s sitting in museums doing nothing but deteriorating.

Has Google been of any help with delicate works? I know their main effort has been more of a feed-into-a-machine-and-go, but they could do wonders with their ability to charge forward on a project and make the information so widely accessible.

I’m not aware of the efforts Google is making in this field, but I’m sure they have their chip in the dip (Is that a saying? I don’t know. Maybe I just made something up.) as they so often do.

That being said, Google translate has been super helpful to me working in the lab, to help me figure out if the words I recover mean anything!

I’m fascinated with the Martellus Map. Do you have the source for a high resolution version of it? I tried to google it and the biggest image is only 2800px wide and I can’t read anything.

A huge reason why you can’t read anything is because most of the text has faded! Chet van Duzer is leading the Martellus project with Yale, so please check out his article(s) on the map, and consider buying his new book.

I have a copy of the book, which includes sections of the map after processing, and you would be amazed at how much text you can see clearly afterwards.

What has been your most interesting find so far?

Cartouches are treasure troves of hidden text. On the Martellus map, I’ve found text in small sections of the map where there was thought to be no text (or no recoverable text). Written in these cartouches are some pretty interesting passages by Martellus himself.

Why can’t you disclose your work?

Well… funding for our research is typically from private donors, museums, and, in the case of remote sensing, the government or government contractors. While maybe it’s okay to show you guys stuff, I wouldn’t take a chance disclosing any of my work that could result in some legal shitstorm that could affect my career or the careers of my friends who I work with.

Has there ever been anything surprising you’ve come across while imaging documents?

Yes, the craziest thing is when you find an entire layer of text below text you can see clearly. It makes you wonder all sorts of things about the limitations of our human visual system, and that there is much more than meets the eye.

How was the original writing in these old documents erased? Was it painted over or some such?

Many times the ink was washed off with water or chemicals, other times the ink was scraped off. Then, the cleaned paper is reused and written/painted over. Otherwise, the text is faint due to natural aging and fading of pigments.

How accurate was National Treasure I and II, and why were they very accurate?

Not accurate. They had a very whimsical interpretation of document imaging techniques. Great movies though.

Would you consider National Treasure the Jurassic Park or Indiana Jones of your career? Also on a related and more serious note, do you consider yourself an archaeologist? You are in essence recovering ancient artifacts for the study of cultures past, even if your methods of excavating are far from traditional.

I’d love to haphazardly travel across the world uncovering the dark secrets of history, so I’d totally try National Treasure out. I’m afraid of dinosaurs, loved the new movie, but I’ll have to say no to dinosaurs. My boss is Indiana Jones, so I can’t.

I don’t consider myself an archaeologist, since I don’t actually uncover the documents. I am proud to be part of a team that leads an effort to preserve cultural heritage! But I think the more scholarly colleagues of mine are more suited to the title of archaeologist.

What file format do you store the images in? Something standard (TIF, PDF) or something proprietary?

We store the images in multiple file formats, tiffs because they are lossless, jpegs because they can be convenient for viewing, PDFs for use in documents and articles, and finally, .img files for working in ENVI.

So, when you are imaging a document, are you capturing reflective photons, transmissive, or both? Are you looking for fluorescence in any of the target materials?

All of the above. We capture transmissive, reflective, and fluorescence images from both sides of the parchment. We have images of both sides because sometimes, the ink can stain through the layers and not fade as much as the surface layer.

What guides/books/websites would you recommend where I can learn about the basics of image processing? I am about to take a course on image processing and eager to get started!

Excellent! I recommend you check out the software ENVI if you can, as it is an environment where you can learn about a lot of different image processing routines.

In terms of books… I’d check out Gonzalez and Woods book, Digital Image Processing. Any edition will do. I see their books everywhere in my building at school.

I do a lot of image processing in Python, so if you’re new to programming, I recommend Codecademy.com to learn the basics of programming, and then I’d check out any documentation for image manipulation in Python using the scipy or spectral libraries!

Is it possible to check work that didn’t use this technology to see if we’ve missed anything? I realize that certain pieces are almost untouchable but I think it’s amazing knowing that we could be sitting on new information and never know it.

Yes it is entirely possible, and highly recommended that we do so soon! The longer we wait, the more time that aging and fading has to destroy the evidence!

There is a huge call to museums and universities to get their important artifacts imaged using multispectral/hyperspectral techniques!

Some pieces are much better candidates to put the time and resources into imaging them, but I know that if we had all the time and money, my team would image every document in the world.

I work in these materials daily (Dead Sea Scrolls, Codex Ephraemi Syri Rescriptus, etc.). I presumed St. Catherine’s had been thoroughly raked through by now. Is there anything of historical religious significance (say before 4th century) that you expect will soon see the light of day?

Yes, we have been planning more imaging this summer at St. Catherine’s. There are very old copies of the Bible (presumably), and other materials that may contain undertext of a completely different nature.

What inspired you to want to study imaging science? How did you end up where you are?

Long story: I changed my major a handful of times (even before arriving at R.I.T), from International Business to Physics, then Film & Animation, and finally Imaging Science.

I knew I wanted to study math and physics, but I wanted a specialized curriculum, so I looked into Imaging Science because I had the connection with the Film & Animation program.

I attended a presentation given by a PhD student who worked on the palimpsest research project, and it was so interesting that I decided to switch. And I switched my first week of freshman year and have loved it since.

So you switched from business into physics and eventually imaging science? You give me hope that wall street is yet not able to steal away all of the best minds.

Yeah, I’ve thought a lot about what it meant to switch my major so drastically, and the career paths that open up (and close) by doing so, and to summarize simply:

I ended up here because I’m studying something I genuinely think is interesting, something that I can do for a career that can help others, the people around me every day are amazing, and I’m happy where I am in life. I think as long as those remain true, I can accomplish what I want to.

If you could inspect any document that exists in the world today what would it be?

Fun question! Okay, well I don’t know about specific documents, but I’m an aspiring treasure hunter, and I’d absolutely love to image old treasure maps! That would be SO COOL. Very National Treasure-like.

If you could inspect a document that is no longer known to exist, or can’t be proved to have existed but is rumored to have, what would it be?

I think the Lost Library of Alexandria exists, and I hope we are able to uncover some text that proves its existence, or best-case scenario, where it is! You never know!

In terms of your hours and free time, what is a general work day for you?

Well during the academic year, I work mainly on weekends. This summer, however, I work full-time on processing data.

My typical work day starts with checking my email, looking for requests from my scholar colleagues for processing specific images that have grabbed their attention.

I have a long long list of documents to process, and so I take the scholar’s requests as priority, slowly checking things off the list.

I usually spend around four hours on one image, as that’s about the time it takes for me to finish all of the processing routines in my arsenal. Then, I take the best results and create a pseudocolor rendering of them and send them off to the scholars for interpretation. Many times they are able to read entire words that they could not see or read before!
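For anyone curious what a pseudocolor rendering can look like in code, here is a rough sketch in Python with NumPy: three processed grayscale results are contrast-stretched and assigned to the red, green, and blue channels. The helper names `stretch` and `pseudocolor`, the 2nd/98th percentile limits, and the random stand-in images are assumptions for illustration, not the routine actually used in the lab.

```python
import numpy as np

def stretch(band, low=2, high=98):
    """Linearly rescale a band to 0..1 between two percentiles (assumed limits)."""
    lo, hi = np.percentile(band, [low, high])
    if hi <= lo:                                # flat image: nothing to stretch
        return np.zeros_like(band, dtype=np.float64)
    return np.clip((band - lo) / (hi - lo), 0.0, 1.0)

def pseudocolor(red_src, green_src, blue_src):
    """Stack three stretched grayscale images into one (rows, cols, 3) RGB image."""
    return np.dstack([stretch(red_src), stretch(green_src), stretch(blue_src)])

# Stand-ins for three processed results, e.g. component images or band ratios.
rng = np.random.default_rng(1)
a, b, c = (rng.normal(size=(32, 32)) for _ in range(3))
rgb = pseudocolor(a, b, c)
```

Features that differ between the three inputs show up as color differences, which is why text that is only visible in one processed image tends to stand out in the composite.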

In terms of free time, there’s honestly so much work to be done that there is never really any free time. However, while the computer is chugging through the processing, I have time to look into new statistical processing methods, and try them out on samples to see if they work or not! All in the effort to improve the results!

I have plans to go abroad to Germany this summer to image documents from the bombing in Dresden, so that will be about a week or so.

Links to any papers that explain the results you’ve obtained? Any data you’ve worked with that we can play with?

My first co-authorship paper will be finished and hopefully presented at the IEEE conference in Rome this year, so I don’t have a paper explaining my personal results. But please check out Roger Easton’s website which I have linked in the original post. He has plenty of stuff to look at.

I can’t give out any of my data right now, because I don’t want to somehow get into any trouble. But! I encourage you to Google for RIT Archimedes data or RIT Martellus data or anything of the like, because I know there is some posted online that you can download yourself! Because if you find and download it yourself, then that’s A-okay!

My uncle has a journal from a Civil War soldier that he rescued in the 1960s from a refuse pile (a small local museum that was clearing out stuff they didn’t find important) that he wishes to preserve. What’s something I could tell him to do to preserve it? At least until the point he passes it on to his daughter (a curator of old documents). It’s still readable, if that tells you anything.

I am not too knowledgeable on this matter, but I do know this: Keep it in a dust/dirt/water free container out of the light. Light is a big player in fading the text.

If you’re a fellow RIT student, you might know about the Cary Collection at the Wallace Library. They would be EXCELLENT people to go talk to about just this. Please, please talk to them.

The team you work with – have you lot ever encountered a document which unveiled information that broke new ground and changed the perception of what we know of the history of that period? If so, what was it and how did you (and the team) feel?

While I’ve been a part of the team, no, nothing ground-shattering.

However, in the recent past, the whole foundation for this type of research was, in all seriousness, accidentally stumbled upon. The main goal I believe was simply to enhance the text that we could already see a little, but they ended up finding a palimpsest, so an entirely new layer of text under the existing text.

I know the efforts at St. Catherine’s Monastery have been quite fruitful as well. There’s been talk about possibly discovering Shakespeare’s 13th signature or more Archimedes works. It’s all so amazing!

What are some texts or papers you would recommend as “required reading” for multispectral imaging? I’m hoping to begin an engineering project at work that will involve multispectral imaging of tissue for medical purposes.

Well for starters, the best book for any kind of spectral imaging work is Schott’s book, Remote Sensing: An Imaging Chain Approach. Because honestly, we are always doing remote sensing, but on a smaller scale.

For a multispectral medical imaging focused book, check out Dr. M. Ali Roula’s book. I know I’ve seen it around before, and it might hit the nail on the head for you.

Also find this book: Introduction to Medical Imaging by Nadine Barrie Smith. My classmates used this book for a medical imaging physics class and said it was amazing.

Do you think we’ll ever get consumer/commodity hyperspectral imagers? Something akin to the FLIR One? Are there any hacks that allow cheap hyperspectral imagery? What spectral ranges are useful? Is polarization also a useful signal at all?

Yes! I think that is very possible! In the near future too! This technology is relatively new, but it’s advancing so fast, so I foresee this technology being rapidly available!

Spectral imagery is achieved by simply collecting photons within a narrow band of wavelengths with a sensitive detector. So, if you had a light source that emitted any light at a given wavelength, you could place a wavelength-specific filter over your detector and, if the exposure is right, you have yourself a spectral imager. Multi- and hyperspectral are terms used to describe the spectral resolution of your system. I don’t foresee hyperspectral technology being readily available just yet, but definitely multispectral. (Multispectral: ~tens of bands; hyperspectral: ~hundreds of bands.)
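The filter-over-a-detector idea can be sketched numerically. The toy model below, in Python with NumPy, treats the recorded signal as the sum over wavelength of source output times target reflectance times filter transmission. Everything in it is an assumption for illustration: the Gaussian filter shape, the 20 nm bandwidth, the flat light source, and the made-up ink that is dark below 700 nm and bright above it.

```python
import numpy as np

wavelengths = np.arange(350.0, 1001.0, 1.0)    # nm, sampled every 1 nm

def gaussian_filter_curve(center_nm, fwhm_nm):
    """Transmission of an idealized bandpass filter (assumed Gaussian shape)."""
    sigma = fwhm_nm / (2.0 * np.sqrt(2.0 * np.log(2.0)))
    return np.exp(-0.5 * ((wavelengths - center_nm) / sigma) ** 2)

def band_signal(source, reflectance, filter_curve):
    """Relative detector signal: sum of source x reflectance x filter (1 nm steps)."""
    return float(np.sum(source * reflectance * filter_curve))

source = np.ones_like(wavelengths)                      # flat broadband source
ink = np.where(wavelengths < 700.0, 0.05, 0.60)         # hypothetical ink reflectance

signal_visible = band_signal(source, ink, gaussian_filter_curve(500.0, 20.0))
signal_infrared = band_signal(source, ink, gaussian_filter_curve(850.0, 20.0))
```

In this toy setup the same ink returns about twelve times more signal through the 850 nm filter than through the 500 nm one, which is the kind of band-to-band difference a spectral imager is built to pick up.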

Also curious about polarization detection. It is a relatively simple technique which has seen many applications in biophysical fields and can easily be combined with hyperspectral imaging.

Yes! Polarization is an incredibly useful method to isolate the important signal. While I’m not aware of the measures we take to polarize our light for document imaging, as an Imaging Science student, we study polarization a lot, and there is some incredible research into polarization being done by a few graduate students I know!

Were there scientific discoveries in those documents that we didn’t think people had made back then? Or were there even discoveries we hadn’t made ourselves, but learned about through the documents?

Well I know as a kid we were taught that people in the 14th and 15th century thought the world was flat, and that Columbus was going to sail off the edge of the Earth.

This is a myth, and it has been disproven over and over again. Educated people back then knew the Earth was round, and evidence from the Martellus backs it up.

I’m a big fan of old documents myself. What’s the best way for someone to get into this field of research?

This research was what made me change my major into Imaging Science. After I switched, I bugged my advisor for over two years before he hired me and I got to start working on this stuff.

The best thing to do is actively contact people in the field. Let them know you are interested, and study up on the subject so if an opportunity presents itself, you can demonstrate that you are a useful person to have helping!

Is there a way to tell if a document is “recycled” or not?

If the document shows signs of undertext when illuminated by infrared or ultraviolet fluorescent light, then the document may have been recycled and is a good candidate for multispectral imaging.

How would a school or researcher go about getting your team to preserve some documents?

They should contact Dr. Roger Easton at easton@cis.rit.edu

Another good resource is R.I.T’s Cary Collection. They could offer insight into the best way to preserve documents, and they would easily be able to contact our team for imaging.

Do you currently have any technology or techniques that could see text that has been sharpied over? Like redacted information from CIA reports? Perhaps the 29 pages about the Saudis and 9/11.

Honestly, probably. I imagine that back in the day, when spectral reflectance was less understood in industry, sharpie markers didn’t absorb all wavelengths of light, so perhaps at some wavelengths it would look like the sharpie wasn’t even there.

I’m sure it’s possible to recover at least some classified text like that. If you never hear from me again, you can be assured that it’s true.

How are you able to keep from falling in love with every woman scientist you see in the lab? It’s been all over the news that girls are bad for science because everyone apparently falls in love and no work gets done.

Easy. I’m the only person working in my lab this summer. I get my work done with (relatively) few distractions!