I like to keep an electronic record of certain online purchases by using "Save Page As" on the File drop-down menu. I store these records in a specific directory, and for each page saved I get an HTML file plus a corresponding "files" directory. Many of the files within each "files" directory appear to be duplicates: not within a single directory, but between directories, i.e., the folder named "ABC.files" will have some of the same files as "DEF.files". Is there a better way to do this that doesn't use so much disk space for duplicate files? For example, I am wondering about something like a common files directory that could eliminate many of the duplicates; since I store all of these pages in the same location, this would seem to make sense. Any suggestions appreciated. Thank you.
A lot of hand-editing would be required to point each saved page at a shared folder, and there may be security restrictions on crossing directory boundaries. I'm not sure that you would save enough disk space to justify the complexity.
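If you want to check how much space the duplicates actually take before deciding, a small script can estimate it. The following is only a rough sketch in Python, not something Firefox provides: the function names `file_digest` and `duplicate_report` are made up for illustration, and it assumes all of your saved pages and their "files" folders sit under one directory. It compares files by content hash and only reports; it does not delete or move anything.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def duplicate_report(root):
    """Group files under `root` by content and estimate wasted bytes.

    Returns (wasted_bytes, duplicates), where `duplicates` maps a digest
    to the list of paths that share that exact content.
    """
    by_digest = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks and anything that is not a regular file.
            if os.path.isfile(path) and not os.path.islink(path):
                by_digest[file_digest(path)].append(path)

    duplicates = {d: p for d, p in by_digest.items() if len(p) > 1}
    # Every copy beyond the first counts as wasted space.
    wasted = sum(
        os.path.getsize(paths[0]) * (len(paths) - 1)
        for paths in duplicates.values()
    )
    return wasted, duplicates
```

Calling `duplicate_report` with the path to your archive directory gives a byte count you can weigh against the effort of editing every saved page by hand.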
Personally I prefer to print pages to PDF to archive them if I do not need to actually edit them. You can use a free or non-free PDF printer driver, or the extension Print pages to Pdf.
Thank you. I didn't think there was the kind of solution I was hoping for. However, print to PDF is certainly a possibility; if I could somehow also keep the HTML links on the page, I would almost certainly do it. Thanks again.
Just installed the add-on and tried a test, which worked fine, including preserving links! This is an awesome add-on. Thank you very much for pointing me in the right direction.
While Print pages to Pdf seems to work well, I wanted to find out more about its features but simply cannot connect to the help website. I've been trying since I installed it a week ago and repeatedly get a "The connection was reset" error message. Any suggestions or ideas? Thank you.
I'm able to get in right now, but obviously that doesn't help you...
If all else fails, you could try the cached pages captured by Google during its last crawl: https://www.google.com/search?q=site%3Aprintpages2
Thanks for trying, but I'm not getting anywhere with any of those links either. The pages try to load, but eventually I get the same error message. It must be something with my anti-virus software, although a week ago I had problems getting some sites to load properly (on one the busy indicator just spun forever, and another would let me in but then not display information), so I reset Firefox, which fixed that. I'll try to figure it out; it's just annoying. Maybe someone could use Print pages to Pdf to make the help pages available somewhere. Thanks again.
Were you able to find the links to Google's cached pages? They may be hidden above the "preview" image if you aren't using any add-ons to restore them to their original position next to the green URL. Usually in Google's cache you can view at least the text of the target page, even if the images aren't available.
Can you view the site in IE9?
Found Google's cached pages, and someone helped me get to the text version, so I can at least see that. No, I can't access the Print pages to Pdf site in IE9 either. Thanks for your help.