
approaches for storing an archival copy of the mbox files

0 votes

I have been using ePADD at my institution to process and provide access to the email records of important university officials. In the ePADD processing module, I chose not to transfer emails containing certain kinds of sensitive data, such as student PII. I would like to keep an archival copy of the email records; our normal process would be to ingest the files into Archivematica and then store the resulting AIP in Amazon S3.

Unfortunately, when I export an mbox file out of ePADD, that file includes the emails I purposefully weeded out in the processing module. I am concerned about putting a file containing sensitive data in Amazon S3 (I have a strong feeling that I should NOT do that), but I also feel that I need to store a copy of the emails we want to keep in a way that provides redundancy, fixity checks, etc.

Keeping a copy of the ePADD "archive" only on our backed-up networked drive does not feel like the best solution, but maybe it is. One could export an mbox file for each transferred email from within the delivery module and then put these files in Archivematica and Amazon S3, but that is not scalable.

Is anyone else in this situation, and if so, how are you handling it?

asked Oct 26, 2017 by acobourn (210 points)
reshown Feb 15, 2018 by admin

1 Answer

0 votes

Hi Alston,

Thanks for sharing your experience with the software, and sorry for the difficulty.

The MBOX export feature was developed based on specifications provided by an ePADD user who wanted the ability to import mail via IMAP and export the unprocessed email archive for preservation. We are updating the user guide to better clarify the current MBOX export functionality.

To build further support for a broad range of preservation workflows during this grant period, we have discussed a variety of options, including the possibility of enabling the export of processed mail via MBOX and EML (with separate attachments), as well as providing checksums for all files.
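Until such a feature exists, checksums can be generated outside ePADD. The following is a minimal sketch, assuming the exported files sit in a single local directory; the function names and the manifest layout are hypothetical and are not part of ePADD or Archivematica.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(export_dir: Path) -> dict:
    """Map each file's path (relative to export_dir) to its SHA-256 digest."""
    manifest = {}
    for path in sorted(export_dir.rglob("*")):
        if path.is_file():
            manifest[path.relative_to(export_dir).as_posix()] = sha256_of(path)
    return manifest


def write_manifest(export_dir: Path, manifest_path: Path) -> None:
    """Write one "<digest>  <relative path>" line per file.

    Keep manifest_path outside export_dir, otherwise a later run
    would also checksum the manifest itself.
    """
    lines = [f"{digest}  {name}" for name, digest in build_manifest(export_dir).items()]
    manifest_path.write_text("\n".join(lines) + "\n", encoding="utf-8")
```

Re-running `build_manifest` later and comparing the result with the stored manifest gives a basic fixity check on the exported copy.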

We welcome your feedback, and the feedback of anybody else who is incorporating ePADD into their preservation workflows, as we are now gathering requirements for this functionality. Feel free to reply to the list with this information, or if you would prefer to share it privately, please write to us at epadd_project@stanford.edu. We have also discussed forming a working group around this topic -- please write to that address as well if you wish to join.

I do want to clarify that ePADD does support the bulk export of messages to MBOX within the Delivery module. But we recommend that you wait until we have developed more dedicated support for preservation -- the preservation sprint will take place during the current grant period (which ends October 2018).

Thanks again for your feedback.

Best,

Josh Schneider
ePADD Community Manager

answered Oct 26, 2017 by Josh_Schneider (5,040 points)
ePADD is a software package developed by Stanford University's Special Collections & University Archives that supports archival processes around the appraisal, ingest, processing, discovery, and delivery of email archives.
...