Self-extracting archive

From Wikipedia, the free encyclopedia

A self-extracting archive created using 7-Zip

A self-extracting archive (SFX or SEA) is a computer executable program which contains compressed data in an archive file combined with machine-executable program instructions to extract this information on a compatible operating system and without the necessity for a suitable extractor to be already installed on the target computer. The executable part of the file is known as a decompressor stub.

Self-extracting files are used to share compressed files with a party that may not have the software needed to decompress them. Users can also use self-extracting archives to distribute their own software. For example, the WinRAR installation program is made using the graphical RAR self-extracting module Default.sfx.

Overview

A self-extracting archive incorporates an executable module that extracts the files from the compressed data. Such a file therefore needs no external program to decompress its contents; it performs the extraction itself. However, file archivers like WinRAR can still treat self-extracting files like any other compressed file. Users who are unwilling to run a self-extracting file they have received (for example, because it may contain a virus) can use a file archiver to view or decompress its contents without running executable code.

On executing a self-extracting archive under an operating system which supports it, the archive contents are extracted and stored as files on the disk. Often, the embedded self-extractor supports a number of command line arguments to control its behaviour, e.g. to specify the target location or to select only specific files to be extracted.
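The combination of a stub and an embedded archive, together with command line arguments for the target location and file selection, can be sketched in Python. This is only an illustration, not how any particular archiver builds its SFX modules: the generated file is a Python script rather than a native executable, the payload is a base64-encoded ZIP archive, and the names build_sfx, STUB, --dest and members are all hypothetical.

```python
import base64
import io
import zipfile

# Hypothetical "decompressor stub": the source code placed in front of the payload.
# The marker @PAYLOAD@ is replaced by the base64-encoded ZIP archive.
STUB = '''\
import argparse
import base64
import io
import zipfile

PAYLOAD = "@PAYLOAD@"


def main(argv=None):
    parser = argparse.ArgumentParser(description="Self-extracting archive (sketch)")
    parser.add_argument("--dest", default=".", help="target directory")
    parser.add_argument("members", nargs="*", help="extract only these files")
    args = parser.parse_args(argv)
    with zipfile.ZipFile(io.BytesIO(base64.b64decode(PAYLOAD))) as archive:
        # An empty member list means "extract everything".
        archive.extractall(args.dest, members=args.members or None)


if __name__ == "__main__":
    main()
'''


def build_sfx(files):
    """Return Python source that recreates `files` (name -> bytes) when run."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for name, data in files.items():
            archive.writestr(name, data)
    payload = base64.b64encode(buffer.getvalue()).decode("ascii")
    return STUB.replace("@PAYLOAD@", payload)
```

Saving the returned source as, say, somefiles_sfx.py and running it with "--dest out a.txt" would extract only a.txt into the directory out; a Python interpreter is still required on the target machine, so the sketch only mirrors the structure of a real self-extracting executable.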

Non-self-extracting archives contain the archived files only and therefore need to be extracted with a compatible program. Self-extracting archives cannot self-extract under a different operating system but most often can still be opened with a suitable extractor as this tool will disregard the executable part of the file and instead extract only the archive resource. In some cases this requires the self-extracting executable to be renamed to hold a file extension associated with the corresponding packer. Self-extracting files usually have an .exe extension like other executable files.

For example, an archive may be called somefiles.zip - it can be opened under any operating system by a suitable archive manager which supports both the file format and compression algorithm used. It could alternatively be converted into somefiles.exe which will self-extract on a machine running Microsoft Windows without the need for that suitable archive manager. It will not self-extract under Linux, but can be opened with a suitable Linux archive manager.
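Why an archive manager can ignore the executable part can be illustrated with Python's zipfile module, which locates a ZIP archive from the end of the file and therefore tolerates data placed in front of it. The sketch below assumes a ZIP-based self-extractor; the 64-byte "stub" is placeholder data rather than real machine code, and the function names are hypothetical.

```python
import io
import zipfile


def make_fake_sfx(files, stub=b"MZ" + b"\x00" * 62):
    """Prepend placeholder stub bytes to a ZIP archive built from `files`."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name, data in files.items():
            archive.writestr(name, data)
    return stub + buffer.getvalue()


def read_without_running(sfx_bytes):
    """Read the archive contents; the leading stub is skipped, never executed."""
    with zipfile.ZipFile(io.BytesIO(sfx_bytes)) as archive:
        return {name: archive.read(name) for name in archive.namelist()}
```

Here read_without_running(make_fake_sfx({"readme.txt": b"hello"})) returns the stored file even though the data begins with the bytes of the stub rather than a ZIP header.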

There are several functionally equivalent but incompatible archive file formats, including ZIP, RAR, 7z and many others. Some programs can manage (create, extract, or modify) only one type of archive, whilst many others can handle multiple formats. There is additionally a distinction between the file format and the compression algorithm used. A single file format, such as 7z, can support multiple different compression algorithms, including LZMA, LZMA2, PPMd and BZip2. For a decompression utility to correctly expand an archive of either the self-extracting or the standard variety, it must be able to operate on both the file format and the algorithm used. The exact executable code placed at the beginning of a self-extracting archive may therefore need to be varied depending on which options were used to create the archive. The decompression routines will be different for an LZMA 7z archive when compared with an LZMA2 7z archive, for example.
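The distinction between file format and compression algorithm can be shown with a short Python sketch. Because the Python standard library has no 7z support, the ZIP format is used here as a stand-in: one container holds three members, each compressed with a different algorithm. The function names are hypothetical, and the sketch assumes the interpreter was built with the zlib, bz2 and lzma modules.

```python
import io
import zipfile

# One container format (ZIP), three different compression algorithms.
METHODS = {
    "deflate": zipfile.ZIP_DEFLATED,
    "bzip2": zipfile.ZIP_BZIP2,
    "lzma": zipfile.ZIP_LZMA,
}


def pack_with_each_method(data):
    """Store the same data three times, once per compression algorithm."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for label, method in METHODS.items():
            archive.writestr(label + ".txt", data, compress_type=method)
    return buffer.getvalue()


def methods_used(zip_bytes):
    """Map each member name to the compression method recorded for it."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        return {info.filename: info.compress_type for info in archive.infolist()}
```

An extractor that understands the ZIP container but lacks, for instance, the LZMA routine could list all three members yet expand only two of them, which is why the stub of a self-extracting archive has to match the options used when the archive was created.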

Several programs can create self-extracting archives. For Windows there are WinZip, WinRAR, 7-Zip, KGB Archiver, Make SFX, the built-in IExpress wizard and many others, some experimental. For Macintosh there are StuffIt, The Unarchiver, and 7zX. There are also programs that create self-extracting archives on Unix as shell scripts which utilize programs like tar and gzip (which must be present on the destination system). Others (like 7-Zip or RAR) can create self-extracting archives as regular executables in ELF format. An early example of a self-extracting archive was the Unix shar archive, in which one or more text files were combined into a shell script that, when executed, recreated the original files.
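The shar idea can be sketched in a few lines of Python that generate such a shell script. This is a simplified illustration, not the output format of any particular shar implementation: make_shar is a hypothetical name, file names are assumed to contain no single quotes, and every recreated file ends with a newline. Each line of content is prefixed with "X" and stripped again by sed, so no line of a file can be mistaken for the end-of-file marker.

```python
def make_shar(files):
    """Build a shell script that recreates `files` (name -> text) when run with sh."""
    lines = [
        "#!/bin/sh",
        "# Shell archive (sketch): run with sh to recreate the files below.",
    ]
    for name, text in files.items():
        lines.append(f"echo x - {name}")
        # sed removes the leading X that protects each line of content.
        lines.append(f"sed 's/^X//' > '{name}' << 'SHAR_EOF'")
        lines.extend("X" + line for line in text.splitlines())
        lines.append("SHAR_EOF")
    lines.append("exit 0")
    return "\n".join(lines) + "\n"
```

The result is plain text, so it could be sent through channels that only carry text, and the receiving system needs nothing beyond sh and sed to unpack it.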

Self-extracting archives can be used to archive any number of data files as well as executable files. They must be distinguished from executable compression, where the executable file contains a single executable only and running the file does not result in the uncompressed file being stored on disk, but in its code being executed in memory after decompression.

Advantages

Archiving files rather than sending them separately allows several related files to be combined into a single resource. It also has the benefit of reducing the size of files that are not already efficiently compressed. Many compression algorithms cannot make already compressed data any smaller, so compression will usually reduce the size of a plain text document but hardly affect a JPEG picture or a word processor document, because most modern word processor file formats already involve a certain level of compression. Self-extracting archives also extend the advantages of compressed archives to users who do not have the necessary programs installed on their computer to otherwise extract their contents, but are running a compatible operating system. However, for users who do have archive managing software, a self-extracting archive may still be slightly more convenient.
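The effect of compressing data that is already compressed can be checked with a small Python experiment. The sketch uses zlib (DEFLATE) and synthetic text made of words drawn pseudo-randomly from a small vocabulary; the function names and the vocabulary are illustrative assumptions, and the exact sizes depend on the zlib version.

```python
import random
import zlib


def sample_text(words=5000, seed=0):
    """Generate repetitive, text-like test data (deterministic for a given seed)."""
    rng = random.Random(seed)
    vocabulary = [
        "archive", "extract", "file", "stub", "compress", "data",
        "format", "program", "the", "a", "of", "and",
    ]
    return " ".join(rng.choice(vocabulary) for _ in range(words)).encode("ascii")


def compressed_sizes(data):
    """Return the sizes of the data, its compressed form, and that form compressed again."""
    once = zlib.compress(data, 9)
    twice = zlib.compress(once, 9)
    return len(data), len(once), len(twice)
```

Calling compressed_sizes(sample_text()) should show a large reduction in the first pass and essentially none in the second, which mirrors the difference between a plain text document and a JPEG picture.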

Self-extracting archives also allow for their contents to be encrypted for security, provided the chosen underlying compression algorithm and format allow for it. In many cases though the file and directory names are not part of the encryption and can be seen by anyone, even without the key or password. Additionally, some encryption algorithms rely on there being no known partial plaintexts available so if an attacker is able to guess part of the contents of the files from their names or context alone they may be able to break the encryption on the entire archive with only a reasonable amount of computing power and time. Care therefore needs to be taken or a more suitable encryption algorithm used.

Disadvantages

A disadvantage of self-extracting archives is that running executables of unverified reliability, for example when sent as an email attachment or downloaded from the Internet, may be a security risk. An executable file described as a self-extracting archive may actually be a malicious program. One protection against this is to open it with an archive manager instead of executing it (losing the minor advantage of self-extraction); the archive manager will either report the file as not an archive or will show the underlying metadata of the executable file - a strong indication that the file is not actually a self-extracting archive.

Additionally, some systems for distributing files do not accept executable files, in order to prevent the transmission of malicious programs. These systems disallow self-extracting archive files unless they are cumbersomely renamed by the sender to, say, somefiles.exx, and later renamed back again by the recipient. This technique is gradually becoming less effective, however, as an increasing number of security suites and antivirus software packages scan file headers for the underlying format rather than relying on a correct file extension. These security systems will not be fooled by an incorrect file extension and are particularly prevalent in the analysis of email attachments.
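Header-based detection of this kind can be sketched as a lookup of well-known leading byte sequences ("magic numbers"). The signature table below is a small, non-exhaustive selection and sniff_format is a hypothetical name; real security products inspect far more than the first few bytes.

```python
# Leading bytes of some common formats, checked in order.
SIGNATURES = [
    (b"MZ", "DOS/Windows executable"),
    (b"\x7fELF", "ELF executable"),
    (b"PK\x03\x04", "ZIP archive"),
    (b"7z\xbc\xaf\x27\x1c", "7z archive"),
    (b"Rar!\x1a\x07", "RAR archive"),
]


def sniff_format(data):
    """Guess the format from the first bytes of the data, ignoring any file name."""
    for magic, description in SIGNATURES:
        if data.startswith(magic):
            return description
    return "unknown"
```

A Windows self-extracting archive renamed to somefiles.exx still begins with the bytes "MZ", so a check like this would still classify it as an executable.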

Self-extracting archives will only run under the operating system family and platform with which they are compatible, making it more difficult to extract their contents under other systems. Self-extracting archives that can themselves be run on multiple targets (such as DOS and CP/M), as opposed to archives whose contents are merely usable under multiple systems, are very rare, because they require the embedded decompressor stub to be a fat binary.[1][2][3][4]
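How a single stub can serve two processors is illustrated by the header bytes quoted in reference 4 (0EBh, 018h, '-pms-'). The Python sketch below works through the arithmetic under the usual assumption that a .COM file is loaded at address 0100h under both DOS and CP/M: an 8086 decodes EB 18 as a two-byte short jump, whereas a Z80 decodes EB as the one-byte instruction EX DE,HL and then 18 2D as a relative jump whose displacement is the character '-'. The function names are hypothetical.

```python
LOAD_ADDRESS = 0x100                      # .COM files start here under DOS and CP/M
HEADER = bytes([0xEB, 0x18]) + b"-pms-"   # bytes quoted in reference 4


def signed8(value):
    """Interpret a byte as a signed 8-bit displacement."""
    return value - 0x100 if value >= 0x80 else value


def target_8086(code, base=LOAD_ADDRESS):
    """8086: EB rel8 is JMP SHORT, relative to the end of the 2-byte instruction."""
    if code[0] != 0xEB:
        raise ValueError("expected JMP SHORT (EB)")
    return base + 2 + signed8(code[1])


def target_z80(code, base=LOAD_ADDRESS):
    """Z80: EB is EX DE,HL (1 byte), then 18 d is JR d, relative to the end of JR."""
    if code[0] != 0xEB or code[1] != 0x18:
        raise ValueError("expected EX DE,HL followed by JR")
    return base + 1 + 2 + signed8(code[2])
```

With these assumptions target_8086(HEADER) evaluates to 011Ah and target_z80(HEADER) to 0130h, the two entry points named in the reference, so each processor ends up in its own copy of the extraction code.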

Also, since self-extracting archives must include executable code to handle the extraction of the contained archive file, they are a little larger than the original archive.

References

  1. ^ Elliott, John C. (1997-01-18) [1997-01-11]. "PMSFX 2". Newsgroup: comp.os.cpm. Archived from the original on 2021-12-13. Retrieved 2021-12-13. […] I've written a version of PMSFX that produces .COM files unpackable under DOS and CP/M (the first three bytes are both legal Z80 code, legal 8086 code and legal PMA header). You can find it […] as a self-extracting archive. […]
  2. ^ Wilkinson, William "Bill" Albert; Seligman, Cory; Drushel, Richard F.; Harston, Jonathan Graham; Elliott, John C. (1999-02-17). "MS-DOS & CP/M-Compatible Binaries". Newsgroup: comp.os.cpm. Archived from the original on 2021-12-13. Retrieved 2021-12-13.
  3. ^ Elliott, John C. (2012-06-20) [2005-01-05]. "Generic CP/M". Seasip.info. Archived from the original on 2021-11-17. Retrieved 2021-12-12. […] Self-extracting archives are .COM files containing a number of smaller files. When you run one, it will create its smaller files […] The self-extract archive programs will run under DOS (2 or later) or CP/M, with identical effects. To extract them under Unix, you can use ZXCC […] PMSFX21X.COM […] PMSFX is the program that was used to generate these self-unpacking archives. This version (2.11) can generate archives which unpack themselves under CP/M or DOS. You will need PMARC to use PMSFX. […]
  4. ^ Elliott, John C. (2009-10-27). "CP/M info program". Newsgroup: comp.os.cpm. Archived from the original on 2021-12-13. Retrieved 2021-12-13. […] More fun can be had with self-extract PMArc archives. Start one with […] defb 0EBh, 018h, '-pms-' […] and it's treated as a valid archive by the PMA utilities, sends 8086 processors to 011Ah, and Z80 processors to 0130h. […]
