mbox file contents
I wanted to extract all unsaved images from my inbox. After some research I found that Thunderbird stores email in the old *nix mbox format, with an associated index (.msf) file in an awful format. I wrote a Python script to scan an mbox file (Inbox) and extract every image found in it. I was surprised at the number of images extracted, because they included images from "deleted" email. I then went into Thunderbird, deleted some more email containing images, and ran my script again; it found the images from the emails I had just deleted. What is going on? My script found some 2,000 more emails in the Inbox file (assuming it counts them correctly) than are shown in the Total at the bottom of the Thunderbird window, even after doing the "Compact" operation. What does this mean for my email files? Are they going to grow all the time and never shrink when email is deleted or moved to other "folders"? I have looked at decoding the .msf files to see what they can tell me, but from the documentation I could find and from looking at Inbox.msf, this is a daunting task. Some help, please.
JimC
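For readers who want to reproduce the scan described above, here is a minimal sketch using Python's standard `mailbox` module. It is not the poster's actual script; the function name `extract_images` and the output file naming are made up for illustration. Work on a copy of the Inbox file, with Thunderbird closed.

```python
import mailbox
from pathlib import Path


def extract_images(mbox_path, out_dir):
    """Save every image/* MIME part found in an mbox file; return the saved paths.

    Hypothetical helper for illustration, not the original poster's script.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    saved = []
    # create=False: fail instead of silently creating an empty mailbox.
    box = mailbox.mbox(mbox_path, create=False)
    try:
        for msg_index, msg in enumerate(box):
            # walk() visits the message itself and every nested MIME part.
            for part_index, part in enumerate(msg.walk()):
                if part.get_content_maintype() != "image":
                    continue
                data = part.get_payload(decode=True)  # undo base64 etc.
                if not data:
                    continue
                ext = part.get_content_subtype()  # e.g. "png", "jpeg"
                target = out / f"msg{msg_index:05d}_part{part_index:02d}.{ext}"
                target.write_bytes(data)
                saved.append(target)
    finally:
        box.close()
    return saved
```

A script like this reads whatever is physically in the mbox file, so it will also return images from messages that Thunderbird no longer lists but has not yet purged from the file.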
All Replies (2)
Chosen Solution
Compacting the Inbox folder should purge deleted and moved messages from the Inbox mbox file. It does on my end, so I have no idea why it isn't doing so on yours. Run Compact at the account level too, so that every folder gets trimmed and not just the Inbox. Give it time to finish; it can take a while if the mbox files are massive. Also, compacting won't work if you don't have enough free disk space.
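To check whether a compact actually purged anything, you can count the messages in the mbox file before and after. The sketch below is only an illustration and rests on one assumption: that Thunderbird marks a deleted message by setting the 0x0008 ("expunged") bit in its `X-Mozilla-Status` header and leaves the message in the file until the folder is compacted. The function name `count_messages` is made up. Run it on a copy of the file with Thunderbird closed.

```python
import mailbox

# Assumption: bit 0x0008 of X-Mozilla-Status means "deleted, awaiting compact".
EXPUNGED = 0x0008


def count_messages(mbox_path):
    """Return (total, expunged) message counts for an mbox file."""
    total = 0
    expunged = 0
    box = mailbox.mbox(mbox_path, create=False)
    try:
        for msg in box:
            total += 1
            raw = str(msg.get("X-Mozilla-Status", "0"))
            try:
                flags = int(raw.strip(), 16)  # header value is hexadecimal
            except ValueError:
                flags = 0  # unreadable header: treat as not deleted
            if flags & EXPUNGED:
                expunged += 1
    finally:
        box.close()
    return total, expunged
```

If the assumption holds, `total - expunged` should be close to the Total shown in Thunderbird, and `expunged` should drop to zero after a successful compact.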
Thank you. I tried compacting again and the Inbox file did indeed shrink. I am not sure why it didn't before, but I may have had the file open in some other program when I tried to compact it.