LZMA SDK 9.20
-------------

LZMA SDK provides the documentation, samples, header files, libraries,
and tools you need to develop applications that use LZMA compression.

LZMA is the default and general compression method of the 7z format
in the 7-Zip compression program (www.7-zip.org). LZMA provides a high
compression ratio and very fast decompression.

LZMA is an improved version of the famous LZ77 compression algorithm.
It was improved to maximize the compression ratio while keeping
high decompression speed and low memory requirements for
decompressing.



LICENSE
-------

LZMA SDK is written and placed in the public domain by Igor Pavlov.

Some code in LZMA SDK is based on public domain code from other developers:
  1) PPMd var.H (2001): Dmitry Shkarin
  2) SHA-256: Wei Dai (Crypto++ library)


LZMA SDK Contents
-----------------

LZMA SDK includes:

  - ANSI-C/C++/C#/Java source code for LZMA compressing and decompressing
  - Compiled file->file LZMA compressing/decompressing program for Windows


UNIX/Linux version
------------------
To compile the C++ version of the file->file LZMA encoder, go to the directory
CPP/7zip/Bundles/LzmaCon
and call make to recompile it:
  make -f makefile.gcc clean all

On some UNIX/Linux systems you must compile LZMA with static libraries.
To compile with static libraries, you can use this setting in the makefile:
LIB = -lm -static


Files
-----
lzma.txt     - LZMA SDK description (this file)
7zFormat.txt - 7z Format description
7zC.txt      - 7z ANSI-C Decoder description
methods.txt  - Compression method IDs for .7z
lzma.exe     - Compiled file->file LZMA encoder/decoder for Windows
7zr.exe      - 7-Zip with 7z/lzma/xz support.
history.txt  - history of the LZMA SDK


Source code structure
---------------------

C/  - C files
        7zCrc*.*   - CRC code
        Alloc.*    - Memory allocation functions
        Bra*.*     - Filters for x86, IA-64, ARM, ARM-Thumb, PowerPC and SPARC code
        LzFind.*   - Match finder for LZ (LZMA) encoders
        LzFindMt.* - Match finder for LZ (LZMA) encoders for multithreading encoding
        LzHash.h   - Additional file for LZ match finder
        LzmaDec.*  - LZMA decoding
        LzmaEnc.*  - LZMA encoding
        LzmaLib.*  - LZMA Library for DLL calling
        Types.h    - Basic types for other .c files
        Threads.*  - The code for multithreading.

    LzmaLib  - LZMA Library (.DLL for Windows)

    LzmaUtil - LZMA Utility (file->file LZMA encoder/decoder).

    Archive - files related to archiving
      7z     - 7z ANSI-C Decoder

CPP/ - C++ files

  Common  - common files for C++ projects
  Windows - common files for Windows related code

  7zip    - files related to 7-Zip Project

    Common   - common files for 7-Zip

    Compress - files related to compression/decompression

    Archive - files related to archiving

      Common   - common files for archive handling
      7z       - 7z C++ Encoder/Decoder

    Bundles    - Modules that are bundles of other modules

      Alone7z           - 7zr.exe: Standalone version of 7z.exe that supports only 7z/LZMA/BCJ/BCJ2
      LzmaCon           - lzma.exe: LZMA compression/decompression
      Format7zR         - 7zr.dll: Reduced version of 7za.dll: extracting/compressing to 7z/LZMA/BCJ/BCJ2
      Format7zExtractR  - 7zxr.dll: Reduced version of 7zxa.dll: extracting from 7z/LZMA/BCJ/BCJ2.

    UI        - User Interface files

      Client7z - Test application for 7za.dll, 7zr.dll, 7zxr.dll
      Common   - Common UI files
      Console  - Code for console archiver



CS/ - C# files
  7zip
    Common   - some common files for 7-Zip
    Compress - files related to compression/decompression
      LZ     - files related to LZ (Lempel-Ziv) compression algorithm
      LZMA   - LZMA compression/decompression
    LzmaAlone  - file->file LZMA compression/decompression
    RangeCoder - Range Coder (special code of compression/decompression)

Java/  - Java files
  SevenZip
    Compression    - files related to compression/decompression
      LZ           - files related to LZ (Lempel-Ziv) compression algorithm
      LZMA         - LZMA compression/decompression
      RangeCoder   - Range Coder (special code of compression/decompression)


C/C++ source code of LZMA SDK is part of the 7-Zip project.
7-Zip source code can be downloaded from 7-Zip's SourceForge page:

  http://sourceforge.net/projects/sevenzip/



LZMA features
-------------
  - Variable dictionary size (up to 1 GB)
  - Estimated compressing speed: about 2 MB/s on 2 GHz CPU
  - Estimated decompressing speed:
      - 20-30 MB/s on 2 GHz Core 2 or AMD Athlon 64
      - 1-2 MB/s on 200 MHz ARM, MIPS, PowerPC or other simple RISC
  - Small memory requirements for decompressing (16 KB + DictionarySize)
  - Small code size for decompressing: 5-8 KB

LZMA decoder uses only integer operations and can be
implemented on any modern 32-bit CPU (or on a 16-bit CPU with some conditions).

Some critical operations that affect the speed of LZMA decompression:
  1) 32*16 bit integer multiply
  2) Mispredicted branches (penalty mostly depends on pipeline length)
  3) 32-bit shift and arithmetic operations

The speed of LZMA decompression mostly depends on CPU speed.
Memory speed matters little. But if your CPU has a small data cache,
the overall weight of memory speed will slightly increase.


How To Use
----------

Using LZMA encoder/decoder executable
-------------------------------------

Usage:  LZMA <e|d> inputFile outputFile [<switches>...]

  e: encode file

  d: decode file

  b: Benchmark. There are two tests: compressing and decompressing
     with LZMA method. Benchmark shows rating in MIPS (million
     instructions per second). Rating value is calculated from
     measured speed and it is normalized with Intel's Core 2 results.
     Benchmark also checks for possible hardware errors (RAM
     errors in most cases). Benchmark uses these settings:
     (-a1, -d21, -fb32, -mfbt4). You can change only the -d parameter.
     You can also change the number of iterations. Example for 30 iterations:
       LZMA b 30
     Default number of iterations is 10.

<Switches>


  -a{N}:  set compression mode: 0 = fast, 1 = normal
          default: 1 (normal)

  -d{N}:  set dictionary size - [0, 30], default: 23 (8MB)
          The maximum value for dictionary size is 1 GB = 2^30 bytes.
          Dictionary size is calculated as DictionarySize = 2^N bytes.
          To decompress a file compressed by LZMA method with dictionary
          size D = 2^N you need about D bytes of memory (RAM).

  -fb{N}: set number of fast bytes - [5, 273], default: 128
          Usually a bigger number gives a slightly better compression ratio
          and a slower compression process.

  -lc{N}: set number of literal context bits - [0, 8], default: 3
          Sometimes lc=4 gives a gain for big files.

  -lp{N}: set number of literal pos bits - [0, 4], default: 0
          The lp switch is intended for periodical data when the period is
          equal to 2^N. For example, for 32-bit (4 bytes)
          periodical data you can use lp=2. Often it's better to set lc=0
          if you change the lp switch.

  -pb{N}: set number of pos bits - [0, 4], default: 2
          The pb switch is intended for periodical data
          when the period is equal to 2^N.

  -mf{MF_ID}: set Match Finder. Default: bt4.
              Algorithms from the hc* group don't provide a good compression
              ratio, but they often work pretty fast in combination with
              fast mode (-a0).

              Memory requirements depend on dictionary size
              (parameter "d" in table below).

               MF_ID     Memory                   Description

                bt2    d *  9.5 + 4MB  Binary Tree with 2 bytes hashing.
                bt3    d * 11.5 + 4MB  Binary Tree with 3 bytes hashing.
                bt4    d * 11.5 + 4MB  Binary Tree with 4 bytes hashing.
                hc4    d *  7.5 + 4MB  Hash Chain with 4 bytes hashing.

  -eos:   write End Of Stream marker. By default LZMA doesn't write
          eos marker, since LZMA decoder knows uncompressed size
          stored in .lzma file header.

  -si:    Read data from stdin (it will write End Of Stream marker).
  -so:    Write data to stdout
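
The dictionary size rule for -d{N} and the match finder memory table can be
turned into a quick estimate. The following is a minimal sketch and is not part
of the SDK: the function names dict_size_bytes and mf_memory_bytes are
hypothetical, and "4MB" is assumed to mean 4 * 2^20 bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Dictionary size in bytes for the -d{N} switch: 2^N, with N in [0, 30]. */
uint64_t dict_size_bytes(unsigned n)
{
    assert(n <= 30);
    return (uint64_t)1 << n;
}

/*
 * Estimated match finder memory in bytes for a given MF_ID and -d{N},
 * using the factors from the table. The factors are stored doubled
 * (19 = 2 * 9.5) so that the arithmetic stays in integers.
 * Returns 0 for an MF_ID that is not in the table.
 */
uint64_t mf_memory_bytes(const char *mf_id, unsigned n)
{
    static const struct { const char *id; unsigned factor_x2; } table[] = {
        { "bt2", 19 }, { "bt3", 23 }, { "bt4", 23 }, { "hc4", 15 }
    };
    size_t i;

    for (i = 0; i < sizeof(table) / sizeof(table[0]); i++) {
        if (strcmp(mf_id, table[i].id) == 0)
            return dict_size_bytes(n) * table[i].factor_x2 / 2
                   + ((uint64_t)4 << 20);
    }
    return 0;
}
```

For example, with the default settings (-d23, bt4) the estimate is
8 MB * 11.5 + 4 MB = 96 MB.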


Examples:

1) LZMA e file.bin file.lzma -d16 -lc0

compresses file.bin to file.lzma with a 64 KB dictionary (2^16=64K)
and 0 literal context bits. -lc0 reduces the memory requirements
for decompression.


2) LZMA e file.bin file.lzma -lc0 -lp2

compresses file.bin to file.lzma with settings suitable
for 32-bit periodical data (for example, ARM or MIPS code).

3) LZMA d file.lzma file.bin

decompresses file.lzma to file.bin.


Compression ratio hints
-----------------------

Recommendations
---------------

To increase the compression ratio for LZMA compressing, it's desirable
to have aligned data (if possible), and also to arrange the
data so that code is grouped in one place and data is
grouped in another place (this is better than mixing: code, data, code,
data, ...).


Filters
-------
You can increase the compression ratio for some data types by using
special filters before compressing. For example, it's possible to
increase the compression ratio by 5-10% for code for these CPU ISAs:
x86, IA-64, ARM, ARM-Thumb, PowerPC, SPARC.

You can find C source code of such filters in C/Bra*.* files

You can check the compression ratio gain of these filters with these
7-Zip commands (example for ARM code):
No filter:
  7z a a1.7z a.bin -m0=lzma

With filter for little-endian ARM code:
  7z a a2.7z a.bin -m0=arm -m1=lzma

It works in this manner:
Compressing   = Filter_encoding + LZMA_encoding
Decompressing = LZMA_decoding + Filter_decoding

Compressing and decompressing speed of such filters is very high,
so it will not increase decompressing time too much.
Moreover, it reduces decompression time for LZMA_decoding,
since the compression ratio with filtering is higher.

These filters convert CALL (calling procedure) instructions
from relative offsets to absolute addresses, so such data becomes more
compressible.

For some ISAs (for example, for MIPS) it's impossible to get a gain from such a filter.
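
To illustrate the relative-to-absolute conversion, here is a deliberately
simplified sketch of a CALL filter for x86-style code. It is not the algorithm
from the C/Bra*.* files, which handles more cases. The names rd32le, wr32le and
toy_call_filter are hypothetical. The sketch assumes the buffer starts at
address 0 and treats every 0xE8 byte followed by four bytes as a CALL with a
32-bit relative operand.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a 32-bit little-endian value. */
uint32_t rd32le(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Write a 32-bit little-endian value. */
void wr32le(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)v;
    p[1] = (uint8_t)(v >> 8);
    p[2] = (uint8_t)(v >> 16);
    p[3] = (uint8_t)(v >> 24);
}

/*
 * Toy call filter (illustration only).
 * encoding != 0: relative operand -> absolute address (before compression)
 * encoding == 0: absolute address -> relative operand (after decompression)
 * The 4 operand bytes are skipped in both directions, so the decoder visits
 * exactly the same positions as the encoder and the round trip is lossless.
 */
void toy_call_filter(uint8_t *buf, size_t size, int encoding)
{
    size_t i = 0;

    assert(buf != NULL || size == 0);
    while (size >= 5 && i <= size - 5) {
        if (buf[i] == 0xE8) {
            uint32_t next_ip = (uint32_t)(i + 5); /* address after the CALL */
            uint32_t v = rd32le(buf + i + 1);
            wr32le(buf + i + 1, encoding ? v + next_ip : v - next_ip);
            i += 5;
        } else {
            i++;
        }
    }
}
```

Two calls to the same target from different positions have different relative
operands, but after encoding they carry the same absolute address, which gives
LZMA a repeated byte pattern to match.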


LZMA compressed file format
---------------------------
Offset Size Description
  0     1   Special LZMA properties (lc, lp, pb in encoded form)
  1     4   Dictionary size (little endian)
  5     8   Uncompressed size (little endian). -1 means unknown size
 13         Compressed data
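
A minimal sketch of reading this 13-byte header follows. The struct and
function names are hypothetical. The table does not spell out how lc, lp and pb
are packed into the first byte; the sketch assumes the packing
props = (pb * 5 + lp) * 9 + lc, and treats an all-ones size field as "unknown".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LZMA_HEADER_SIZE 13 /* 1 + 4 + 8 bytes, per the table */

/* Hypothetical container for the decoded header fields. */
typedef struct {
    unsigned lc, lp, pb;
    uint32_t dict_size;
    uint64_t uncompressed_size;
    int size_known; /* 0 when the size field is -1 (all bits set) */
} LzmaHeader;

/* Returns 0 on success, -1 if the properties byte is out of range. */
int parse_lzma_header(const uint8_t *h, LzmaHeader *out)
{
    unsigned d;
    int i;

    assert(h != NULL && out != NULL);
    d = h[0];
    if (d >= 9 * 5 * 5)
        return -1;
    out->lc = d % 9;
    d /= 9;
    out->lp = d % 5;
    out->pb = d / 5;

    /* Offset 1: dictionary size, 4 bytes, little endian. */
    out->dict_size = 0;
    for (i = 0; i < 4; i++)
        out->dict_size |= (uint32_t)h[1 + i] << (8 * i);

    /* Offset 5: uncompressed size, 8 bytes, little endian. */
    out->uncompressed_size = 0;
    for (i = 0; i < 8; i++)
        out->uncompressed_size |= (uint64_t)h[5 + i] << (8 * i);
    out->size_known = (out->uncompressed_size != UINT64_MAX);
    return 0;
}
```

For instance, a first byte of 0x5D decodes to lc=3, lp=0, pb=2, which are the
default values of the -lc, -lp and -pb switches.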


ANSI-C LZMA Decoder
~~~~~~~~~~~~~~~~~~~

Please note that interfaces for ANSI-C code were changed in LZMA SDK 4.58.
If you want to use the old interfaces, you can download a previous version of LZMA SDK
from the sourceforge.net site.

To use ANSI-C LZMA Decoder you need the following files:
1) LzmaDec.h + LzmaDec.c + Types.h
LzmaUtil/LzmaUtil.c is an example application that uses these files.


Memory requirements for LZMA decoding
-------------------------------------

Stack usage of the LZMA decoding function for local variables is not
larger than 200-400 bytes.

LZMA Decoder uses a dictionary buffer and an internal state structure.
The internal state structure consumes
  state_size = (4 + (1.5 << (lc + lp))) KB
With the default settings (lc=3, lp=0), state_size = 16 KB.
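
The state_size formula can be evaluated in integer arithmetic. This is a small
sketch with hypothetical function names; it reads "1.5 << n" as 1.5 * 2^n and
assumes 1 KB = 1024 bytes, so 1.5 KB becomes 1536 bytes.

```c
#include <assert.h>
#include <stddef.h>

/* state_size in bytes: (4 + 1.5 * 2^(lc+lp)) KB = 4096 + 1536 * 2^(lc+lp). */
size_t lzma_state_size_bytes(unsigned lc, unsigned lp)
{
    assert(lc <= 8 && lp <= 4); /* ranges of the -lc and -lp switches */
    return (size_t)4096 + ((size_t)1536 << (lc + lp));
}

/* Approximate decoder memory: internal state plus the dictionary buffer. */
size_t lzma_decoder_memory_bytes(unsigned lc, unsigned lp, size_t dict_size)
{
    return lzma_state_size_bytes(lc, lp) + dict_size;
}
```

With the defaults this gives 4096 + 1536 * 8 = 16384 bytes, matching the 16 KB
stated above. With lc=0 and a 64 KB dictionary the total is 5632 + 65536 bytes.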


How To decompress data
----------------------

LZMA Decoder (ANSI-C version) now supports 2 interfaces:
1) Single-call Decompressing
2) Multi-call State Decompressing (zlib-like interface)

You must use an external allocator:
Example:
void *SzAlloc(void *p, size_t size) { p = p; return malloc(size); }
void SzFree(void *p, void *address) { p = p; free(address); }
ISzAlloc alloc = { SzAlloc, SzFree };

You can use the p = p; statement to disable compiler warnings.
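
The p argument can also carry allocator state instead of being ignored. The
sketch below is an assumption-laden illustration, not SDK code: ExampleAlloc is
a local stand-in with the same two-function shape as the ISzAlloc initializer
above, CountingAlloc is a hypothetical wrapper, and the sketch assumes the
caller passes the address of the function table as p.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Local stand-in with the same shape as the { SzAlloc, SzFree } table. */
typedef struct {
    void *(*Alloc)(void *p, size_t size);
    void (*Free)(void *p, void *address);
} ExampleAlloc;

/*
 * Hypothetical allocator that counts live blocks. The function table is the
 * first member, so a pointer to the table is also a pointer to the wrapper.
 */
typedef struct {
    ExampleAlloc vt;
    size_t live_blocks;
} CountingAlloc;

void *CountingAlloc_Alloc(void *p, size_t size)
{
    CountingAlloc *self = (CountingAlloc *)p;
    void *mem = malloc(size);
    if (mem != NULL)
        self->live_blocks++;
    return mem;
}

void CountingAlloc_Free(void *p, void *address)
{
    CountingAlloc *self = (CountingAlloc *)p;
    if (address != NULL) {
        assert(self->live_blocks > 0);
        self->live_blocks--;
        free(address);
    }
}

void CountingAlloc_Init(CountingAlloc *a)
{
    a->vt.Alloc = CountingAlloc_Alloc;
    a->vt.Free = CountingAlloc_Free;
    a->live_blocks = 0;
}
```

A counter like this is one way to check that every block requested through the
allocator has been released once decoding is finished.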


Single-call Decompressing
-------------------------
When to use: RAM->RAM decompressing
Compile files: LzmaDec.h + LzmaDec.c + Types.h
Compile defines: no defines
Memory Requirements:
  - Input buffer: compressed size
  - Output buffer: uncompressed size
  - LZMA Internal Structures: state_size (16 KB for default settings)

Interface:
  int LzmaDecode(Byte *dest, SizeT *destLen, const Byte *src, SizeT *srcLen,
      const Byte *propData, unsigned propSize, ELzmaFinishMode finishMode,
      ELzmaStatus *status, ISzAlloc *alloc);
  In:
    dest     - output data
    destLen  - output data size
    src      - input data
    srcLen   - input data size
    propData - LZMA properties (5 bytes)
    propSize - size of propData buffer (5 bytes)
    finishMode - It has meaning only if the decoding reaches output limit (*destLen).
         LZMA_FINISH_ANY - Decode just destLen bytes.
         LZMA_FINISH_END - Stream must be finished after (*destLen).
                           You can use LZMA_FINISH_END when you know that
                           the current output buffer covers the last bytes of the stream.
    alloc    - Memory allocator.

  Out:
    destLen  - processed output size
    srcLen   - processed input size

  Return value:
    SZ_OK
      status:
        LZMA_STATUS_FINISHED_WITH_MARK
        LZMA_STATUS_NOT_FINISHED
        LZMA_STATUS_MAYBE_FINISHED_WITHOUT_MARK
    SZ_ERROR_DATA - Data error
    SZ_ERROR_MEM  - Memory allocation error
    SZ_ERROR_UNSUPPORTED - Unsupported properties
    SZ_ERROR_INPUT_EOF - It needs more bytes in input buffer (src).

  If the LZMA decoder sees end_marker before reaching the output limit, it returns an OK result,
  and the output value of destLen will be less than the output buffer size limit.

  You can use multiple checks to test data integrity after full decompression:
    1) Check Result and "status" variable.
    2) Check that output(destLen) = uncompressedSize, if you know real uncompressedSize.
    3) Check that output(srcLen) = compressedSize, if you know real compressedSize.
       You must use the correct finish mode in that case.


Multi-call State Decompressing (zlib-like interface)
----------------------------------------------------

When to use: file->file decompressing
Compile files: LzmaDec.h + LzmaDec.c + Types.h

Memory Requirements:
 - Buffer for input stream: any size (for example, 16 KB)
 - Buffer for output stream: any size (for example, 16 KB)
 - LZMA Internal Structures: state_size (16 KB for default settings)
 - LZMA dictionary (dictionary size is encoded in LZMA properties header)

1) Read LZMA properties (5 bytes) and uncompressed size (8 bytes, little-endian) to header:
   unsigned char header[LZMA_PROPS_SIZE + 8];
   ReadFile(inFile, header, sizeof(header));

2) Allocate CLzmaDec structures (state + dictionary) using LZMA properties

  CLzmaDec state;
  LzmaDec_Construct(&state);
  res = LzmaDec_Allocate(&state, header, LZMA_PROPS_SIZE, &g_Alloc);
  if (res != SZ_OK)
    return res;

3) Init LzmaDec structure before any new LZMA stream, and call LzmaDec_DecodeToBuf in a loop

  LzmaDec_Init(&state);
  for (;;)
  {
    ...
    int res = LzmaDec_DecodeToBuf(CLzmaDec *p, Byte *dest, SizeT *destLen,
        const Byte *src, SizeT *srcLen, ELzmaFinishMode finishMode);
    ...
  }


4) Free all allocated structures
  LzmaDec_Free(&state, &g_Alloc);

For a full code example, look at the C/LzmaUtil/LzmaUtil.c code.


How To compress data
--------------------

Compile files: LzmaEnc.h + LzmaEnc.c + Types.h +
LzFind.c + LzFind.h + LzFindMt.c + LzFindMt.h + LzHash.h

Memory Requirements:
  - (dictSize * 11.5 + 6 MB) + state_size
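
As a rough worked example of this formula, the sketch below computes the
estimate in bytes. The function name is hypothetical, "6 MB" is assumed to mean
6 * 2^20 bytes, and state_size is assumed to be the same quantity as in the
decoder section, (4 + (1.5 << (lc + lp))) KB.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Estimated encoder memory in bytes:
 *   (dictSize * 11.5 + 6 MB) + state_size
 * The factor 11.5 is applied as * 23 / 2 to stay in integer arithmetic.
 */
uint64_t lzma_encoder_memory_bytes(uint64_t dict_size, unsigned lc, unsigned lp)
{
    uint64_t state_size;

    assert(lc <= 8 && lp <= 4); /* ranges of the -lc and -lp switches */
    state_size = 4096 + ((uint64_t)1536 << (lc + lp));
    return dict_size * 23 / 2 + ((uint64_t)6 << 20) + state_size;
}
```

For an 8 MB dictionary with lc=3 and lp=0 this comes to 92 MB + 6 MB + 16 KB,
a little over 98 MB.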

Lzma Encoder can use two memory allocators:
1) alloc - for small arrays.
2) allocBig - for big arrays.

For example, you can use Large RAM Pages (2 MB) in allocBig allocator for
better compression speed. Note that Windows has a bad implementation of
Large RAM Pages.
It's OK to use the same allocator for alloc and allocBig.


Single-call Compression with callbacks
--------------------------------------

Check C/LzmaUtil/LzmaUtil.c as an example.

When to use: file->file compressing

1) You must implement callback structures for interfaces:
ISeqInStream
ISeqOutStream
ICompressProgress
ISzAlloc

static void *SzAlloc(void *p, size_t size) { p = p; return MyAlloc(size); }
static void SzFree(void *p, void *address) { p = p; MyFree(address); }
static ISzAlloc g_Alloc = { SzAlloc, SzFree };

  CFileSeqInStream inStream;
  CFileSeqOutStream outStream;

  inStream.funcTable.Read = MyRead;
  inStream.file = inFile;
  outStream.funcTable.Write = MyWrite;
  outStream.file = outFile;


2) Create CLzmaEncHandle object;

  CLzmaEncHandle enc;

  enc = LzmaEnc_Create(&g_Alloc);
  if (enc == 0)
    return SZ_ERROR_MEM;


3) Initialize CLzmaEncProps properties;

  CLzmaEncProps props;
  LzmaEncProps_Init(&props);

  Then you can change some properties in that structure.

4) Send LZMA properties to LZMA Encoder

  res = LzmaEnc_SetProps(enc, &props);

5) Write encoded properties to header

    Byte header[LZMA_PROPS_SIZE + 8];
    size_t headerSize = LZMA_PROPS_SIZE;
    UInt64 fileSize;
    int i;

    res = LzmaEnc_WriteProperties(enc, header, &headerSize);
    fileSize = MyGetFileLength(inFile);
    for (i = 0; i < 8; i++)
      header[headerSize++] = (Byte)(fileSize >> (8 * i));
    MyWriteFileAndCheck(outFile, header, headerSize);

6) Call encoding function:
      res = LzmaEnc_Encode(enc, &outStream.funcTable, &inStream.funcTable,
        NULL, &g_Alloc, &g_Alloc);

7) Destroy LZMA Encoder Object
  LzmaEnc_Destroy(enc, &g_Alloc, &g_Alloc);


If a callback function returns some error code, LzmaEnc_Encode also returns that code,
or it can return a code like SZ_ERROR_READ, SZ_ERROR_WRITE or SZ_ERROR_PROGRESS.


Single-call RAM->RAM Compression
--------------------------------

Single-call RAM->RAM Compression is similar to Compression with callbacks,
but you provide pointers to buffers instead of pointers to stream callbacks:

SRes LzmaEncode(Byte *dest, SizeT *destLen, const Byte *src, SizeT srcLen,
  CLzmaEncProps *props, Byte *propsEncoded, SizeT *propsSize, int writeEndMark,
  ICompressProgress *progress, ISzAlloc *alloc, ISzAlloc *allocBig);

Return code:
  SZ_OK               - OK
  SZ_ERROR_MEM        - Memory allocation error
  SZ_ERROR_PARAM      - Incorrect parameter
  SZ_ERROR_OUTPUT_EOF - output buffer overflow
  SZ_ERROR_THREAD     - errors in multithreading functions (only for Mt version)



Defines
-------

_LZMA_SIZE_OPT - Enable some optimizations in LZMA Decoder to get smaller executable code.

_LZMA_PROB32   - It can increase the speed on some 32-bit CPUs, but memory usage for
                 some structures will be doubled in that case.

_LZMA_UINT32_IS_ULONG  - Define it if int is 16-bit on your compiler and long is 32-bit.

_LZMA_NO_SYSTEM_SIZE_T  - Define it if you don't want to use size_t type.


_7ZIP_PPMD_SUPPPORT - Define it if you don't want to support PPMD method in ANSI-C .7z decoder.


C++ LZMA Encoder/Decoder
~~~~~~~~~~~~~~~~~~~~~~~~
C++ LZMA code uses COM-like interfaces. So if you want to use it,
you can study the basics of COM/OLE.
C++ LZMA code is just a wrapper over ANSI-C code.


C++ Notes
~~~~~~~~~
If you use some C++ code folders in 7-Zip (for example, C++ code for .7z handling),
you must check that you work correctly with the "new" operator.
7-Zip can be compiled with MSVC 6.0, which doesn't throw an exception from the "new" operator.
So 7-Zip uses "CPP\Common\NewHandler.cpp", which redefines the "new" operator:
void *operator new(size_t size)
{
  void *p = ::malloc(size);
  if (p == 0)
    throw CNewException();
  return p;
}
If you use a version of MSVC that throws an exception from the "new" operator, you can compile without
"NewHandler.cpp", so the standard exception will be used. Actually some code of
7-Zip catches any exception in internal code and converts it to an HRESULT code.
So you don't need to catch CNewException if you call COM interfaces of 7-Zip.

---

http://www.7-zip.org
http://www.7-zip.org/sdk.html
http://www.7-zip.org/support.html