on exe compression by hcs at 10:59 PM EDT on June 17, 2010
mudlord and I were looking at disfilter. I built the code on Linux and tried it out with Firefox 3.6.3's xul.dll, and while it made it more compressible the decoded file didn't match the original, and wouldn't load. I may have screwed up copying and translating the PE headers.
I noticed that the largest single portion of time (37% by gprof) was spent in MoveToFront, which I thought I had a more efficient method for (from some earlier thinking on the Burrows-Wheeler Transform). But that method relied on having a small alphabet, it would require far too much memory here where the symbols are 32-bit addresses, at least not without modification. The idea was:
With an alphabet of N symbols, keep a "timer" t, initially set to N. Prepare a table T of N entries, set them to 0 through N-1 (it doesn't matter the initial order, just as long as the decoder starts with the same one). T[k] is the last "time" we saw the symbol k in the alphabet. When encoding symbol k, the output is t-T[k]-1 (cycles-1 since we last saw this char), and T[k] is then set to t and t is incremented. I thought it was clever, though I haven't tested it to see if I haven't overlooked something.
Any kind of associative lookup could be used, doesn't have to be 1-to-1 like the array I mentioned, so we could use a hash with a reasonably sized array in disfilter.
The other issue is that disfilter wants to only use this with the last 255 entries; we could simply ignore any references past that length, but we'll end up with a lot of stuff in the hash that we don't need unless we do garbage collection.
The method used in disfilter is the straightforward interpretation where everything in the array is moved down (for both encoding and decoding). It's fast enough that it really doesn't matter, oh well.
edited 11:03 PM EDT June 17, 2010
I owe an apology, the code works fine on xul.dll, I just screwed up in the conversion. The filtered file gzips about 20% better than the original dll, pretty nice!
Great to see you found the problem and fixed it. :)
I have some suggestions:
1) We could use 7-Zip's Deflate implementation in conjunction with disfilter. Apparently 7-Zip has rock solid Deflate compression, and compresses better than most compressors. 2) D3DX_**.DLL has exports which deal with zlib decompression. You can leverage those in your PE unpacker stub to unpack the contents with. I seen it done before. Granted they are specific to each Direct3D extenstion library DLL, but it sounds interesting to me. 3) Use LZMA. I'm pondering this, but the SDK is utter trash. Needs a decent encoder lib, and I'm questioning myself, why not since I am in the mood to code nothing better. 4) You can merge PE sections together when you pack, leaving the resources seperate, since disfilter hates those. My packer currently just packs every single section individually, and appends a unpacker section. Lazy, doesn't do point 4, but works *okay*. Granted, it compresses shit.
On testing, my idea doesn't work at all. The number of symbols since we last saw this symbol has nothing at all to do with the distance this symbol will have been pushed back in the MTF stack (besides an upper bound). If we see the same symbol over and over, the ones behind it don't get pushed back. And if we alternate between two, the third and on go no further.
Personally, what matters to me is the *decompression* speed in this case, since I would be dealing with performance-needing scenarios, like the loading of some small app. Compression wise, I think however it works pretty well. I'll be sure running some tests with differing algorithms to see which comes out on top.
In that case, why fix something that ain't broke unless you work out a different method to filter your code?
Builds for Linux with 2 executables, one with the original mechanism (disfilter.orig) and one with linked lists for the MTF (disfilter), which is also slightly more exhaustive. disfilter is slightly slower but produces slightly better output (the output formats are not compatible). If you un-#ifdef the code in MTFFuncTable::AddLazy it should give identical output, and the runtime is, at least on xul.dll, indistinguishable.
That is the result of a tuned zlib/deflate encoder, tweaked by 7-Zip. Would be interesting to see the results from the filter usage in conjunction with the compressor.
1) The encryptor uses RC4/TEA/Blowfish/AES/another symmetric cipher 2) The user is given a custom EXE which is custom encrypted. Each distribution uses a different key. Keyfiles are used to decrypt the executable's PE section or sections of code. Decryption/encryption code is compiled into the application itself. Only the key is in a keyfile. 3) The user puts a key into the application directory. 4) The key decrypts the code segments and reencrypts them on the fly. 5) If a user does not have a key, it is impossible to decrypt the PE sections. OR if a key is not added, a bogus key is used. Thus decrypting the section wrong and causing BSODs or access violations.
got back to work on this. got working code running. Working in the sense that...
* it doesnt support TLS callbacks * it doesnt react well with named resources * it is somewhat picky with resource compression. So if it is not in a means that the exe likes, the compressed app crashes.
uses aplib compression though, since the LZMA depacker sucks balls and is completely horrid to work with (doesn't respect the stack, etc)...
mudpack.exe (132kb to 64kb) VisualBoyAdvance-M.exe (3386KB to 1275KB) zsnesw.exe (3689KB to 571KB) snes9x.exe (3196KB to 781KB) bsnes.exe (1256KB to 385KB)
thats not including disfilter. LZMA tends though to pad out when target sizes are small though, might make it change algorithm depending on target size, etc...