
0 size attachments in mbox import

0 votes

Hi,

I'm working with an email account which was exported as mbox files from Outlook using the drag and drop method on a Mac. Files at first appeared to import into ePADD 7 fine, but it appears that all attachments are 0 length files. When I look at the mbox files in a text editor, I can see the attachment content in base64, with a header like:

--B36372211791183097107
Content-type: image/png; name="Melbourne roundtable - April 2019.png";
x-mac-creator="4F50494D";
x-mac-type="504E4766"
Content-ID:
Content-disposition: inline;
filename="Melbourne roundtable - April 2019.png"
Content-transfer-encoding: base64

In the reports, I can see some error messages like:

  1. Dirty message part, has conflicting message part headers. C:\Users\lglanville\new_acq\data\Pacific.mbox Message# 51
  2. Ignoring attachment for C:\Users\lglanville\new_acq\data\Pacific.mbox Message #51: java.lang.RuntimeException
       at edu.stanford.muse.util.Util.ASSERT(Util.java:97)
       at edu.stanford.muse.datacache.BlobStore.remove(BlobStore.java:453)
       at edu.stanford.muse.datacache.BlobStore.add(BlobStore.java:359)
       at edu.stanford.muse.email.EmailFetcherThread.handleAttachments(EmailFetcherThread.java:794)
       at edu.stanford.muse.email.EmailFetcherThread.processMessagePart(EmailFetcherThread.java:644)
       at edu.stanford.muse.email.EmailFetcherThread.processMessagePart(EmailFetcherThread.java:627)
       at edu.stanford.muse.email.EmailFetcherThread.fetchAndIndexMessages(EmailFetcherThread.java:1204)
       at edu.stanford.muse.email.EmailFetcherThread.run(EmailFetcherThread.java:1420)
       at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
       at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
       at java.lang.Thread.run(Unknown Source)

Any idea what might be happening here?
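
For anyone hitting the same symptom, one way to check whether the base64 parts in the mbox file itself decode to real data, independently of ePADD, is to walk the file with Python's standard `mailbox` module. This is only a rough sketch: the function name `attachment_sizes` is made up for illustration, Python's parser is not the one ePADD uses, and the message index it prints may not line up with the "Message #" in the ePADD report.

```python
import mailbox


def attachment_sizes(mbox_path):
    """Yield (message_index, filename, decoded_size) for every part with a filename.

    message_index is simply the 1-based position in the mbox file as Python
    reads it (an assumption: ePADD may number messages differently).
    """
    box = mailbox.mbox(mbox_path)
    try:
        for index, message in enumerate(box, start=1):
            for part in message.walk():
                filename = part.get_filename()
                if filename is None:
                    continue  # body text or multipart container, not an attachment
                # decode=True undoes base64 / quoted-printable transfer encoding
                payload = part.get_payload(decode=True)
                yield index, filename, len(payload or b"")
    finally:
        box.close()


if __name__ == "__main__":
    import sys

    # Usage: python check_attachments.py path/to/file.mbox
    for index, filename, size in attachment_sizes(sys.argv[1]):
        flag = "  <-- zero length" if size == 0 else ""
        print(f"message {index}: {filename} ({size} bytes){flag}")
```

If the sizes reported here are non-zero, the attachment data is present in the mbox file and the problem is more likely in how the part headers are being parsed on import than in the export itself.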

asked Apr 5 by lglanville (240 points)

1 Answer

0 votes

I suggest you try other means to migrate Outlook messages to mbox format.

answered Apr 6 by Peter_Chan (2,770 points)

Hi Peter,
I ended up migrating the affected files using Aid4mail. Attachments are now displaying correctly. However, a substantial portion of emails are displaying without any contents. The header data has been harvested correctly, but the message itself is blank. There are no related errors in the reports. I have imported the affected emails into Thunderbird and they display correctly, but when they are exported from Thunderbird and re-ingested into ePADD they have the same issue. Have you come across anything similar?
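
To narrow down which messages come through blank, it may help to list the messages in the mbox file that have no non-empty text part, along with the content types they do carry. The sketch below uses Python's standard `mailbox` module; `messages_without_text` is a hypothetical helper name, and since it does not use ePADD's parser it can only show what is in the file, not why ePADD displays a message as blank.

```python
import mailbox


def messages_without_text(mbox_path):
    """Return [(index, subject, content_types)] for messages lacking a non-empty text/* body part."""
    results = []
    box = mailbox.mbox(mbox_path)
    try:
        for index, message in enumerate(box, start=1):
            has_text = False
            content_types = []
            for part in message.walk():
                if part.is_multipart():
                    continue  # containers hold no content of their own
                content_types.append(part.get_content_type())
                # Count only text parts that are not named attachments
                if part.get_content_maintype() == "text" and part.get_filename() is None:
                    payload = part.get_payload(decode=True) or b""
                    if payload.strip():
                        has_text = True
            if not has_text:
                results.append((index, str(message.get("Subject", "")), content_types))
    finally:
        box.close()
    return results


if __name__ == "__main__":
    import sys

    # Usage: python check_bodies.py path/to/file.mbox
    for index, subject, content_types in messages_without_text(sys.argv[1]):
        print(f"message {index}: {subject!r} parts={content_types}")
```

If this reports nothing for messages that ePADD shows as blank, the body text is present in the file, and comparing the content types and encodings of affected and unaffected messages would be a reasonable next step.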

Have you tried Emailchemy? https://weirdkid.com/emailchemy/
Parsing mbox files is an issue in some instances.
Since our project ended in Feb. 2019, we don't have a programmer to look into the code. We hope to get new funding to further develop ePADD soon.

ePADD is a software package developed by Stanford University's Special Collections & University Archives that supports archival processes around the appraisal, ingest, processing, discovery, and delivery of email archives.
...