From: Jan Harkes <jaharkes_at_cs.cmu.edu>

> Clearly the average file size is considerably larger, and we are far
> more likely to see reasonable numbers for the number of cached files.
> If we have a 1TB cache we may see something on the order of 200K
> digital photos, 3000 whole-CD FLACs, 1000 TV recordings, or a couple
> of hundred VM images.

Actually, I *just now* checked my 20GB homedir:

    % du -sk . ; find . -type f -print | wc -l
    20088072    .
    379371

    20088072 / 379371 = 53

So my average file size is 53KB (a little less, actually, since du's total also counts the blocks used by directories). I had no idea it was so small, since down in that tree I have a couple of CD images and even a complete VMware virtual filesystem for a WinXP image sitting in a pair of files, plus some music and probably a few video clips.

Note that my email is stored in Babyl format, not maildir, so it's one file per mailbox, not one file per message. My mail base is about 1.7GB in 1200 files, going back to 1979. Clearly my average file size would drop a lot (by changing both the denominator and the numerator) if I flipped this over to maildir format. (If that 1.7GB were split into, say, 400K maildir messages, my file count would roughly double and the average would fall to about 25KB.)

So the moral is that Coda, for me in 2007, needs to handle caches with

- tens of GB to a small number of TB, and
- about a million files.

I.e., the small-average-file-size case still needs to be handled even when the file stores get much larger. And that's for a *single* user; for a ten-thousand-user "campus", add four zeroes.

Over on the server where I keep my music, I have 320GB in 41K files, for an average file size of 7.8MB.

    -Olin
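P.S. For anyone who wants to repeat this measurement on their own tree, here is a minimal one-pass sketch (it assumes GNU find's -printf; it sums apparent file sizes rather than allocated blocks, so its numbers can differ a little from du's):

    % find . -type f -printf '%s\n' | \
        awk '{ n++; bytes += $1 }    # count files, total their sizes in bytes
             END { if (n) printf "%d files, avg %.0f KB\n", n, bytes / n / 1024 }'

The file count should match the find | wc -l above, but the average need not match the du-based 53KB exactly, since apparent size ignores directory blocks, block rounding, and sparse files.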