$Id: FAQ.txt,v 1.16 2006/09/06 13:46:30 roca Exp $ ------------------------------------------ LDPC Codec Frequently Asked Questions (FAQ) vincent.roca@inrialpes.fr ------------------------------------------ 1- General 1.0- Terminology 1.1- What is the LDGM/LDPC codec useful for ? 1.2- Why is it called ``large block'' ? 1.3- What about small objects/blocks ? 1.4- I have my own content transfer tool, how can I use the codec ? 1.5- How can I best reduce memory requirements ? 1.6- How to use the pkt_canvas[] tables of BuildParitySymbol/ DecodingStepWithSymbol during encoding/decoding ? 1.7- Is this codec patent free ? 2- Performances 2.1- What encoding/decoding speed can I achieve ? 2.2- What is the decoding inefficiency ratio I can expect ? 2.3- What are the maximum memory requirements ? 2.4- What are the benefits of the external memory management callbacks 3- Troubleshooting -------------------------------------------------------------------------------- SECTION 1: General ------------------ 1.0- Terminology ---------------- See draft-ietf-rmt-fec-bb-revised-03.txt and draft-ietf-rmt-bb-fec-ldpc-03.txt: Object: An ordered sequence of bytes to be transferred by the transport protocol. For example, a file or stream. Symbol: A unit of data processed by the Forward Error Correction code. A symbol is always considered as a unit i.e. it is either completely received or completely lost. Source symbol: A symbol containing information from the original object. Repair symbol: A symbol containing information generated by the FEC code which can be used to recover lost source symbols. Encoding symbol: A source symbol or a repair symbol. Encoder: The FEC scheme specific functions required to transform a object into FEC encoded data. i.e. the functions that produce repair symbols using source symbols. Decoder: The FEC scheme specific functions required to transform received FEC encoded data into a copy of the original object Receiver: A system supporting the receiving functions of a CDP and FEC scheme according to this specification Sender: A system supporting the sending functions of a CDP and FEC scheme according to this specification Source Block: A part of the object formed from a subset of the object's source symbols CDP: Content Delivery Protocol FEC: Forward Error Correction Encoding Symbol Group: a group of encoding symbols that are sent together, within the same packet, and whose relationships to the source object can be derived from a single Encoding Symbol ID Source Packet: a data packet containing only source symbols Repair Packet: a data packet containing only repair symbols 1.1- What is the LDGM/LDPC codec useful for ? --------------------------------------------- This codec is useful for all applications that transfer large amounts of data. Yet our codec only operates on so-called ``Packet Erasure Channels'' (or PEC), i.e. channels that operate on packets (rather than bit streams), and where a packet arriving at the decoder is guaranteed to be error-free. Yet some of these packets may get lost (erased) by the channel. Simply speaking, transmissions over the Internet and (W)LANs are examples of such PEC. Over the Internet packets may be dropped because of congestion problems. And on wireless LANs, packets may get corrupted by interferences and dropped after CRC check at the receiver. But the content of a UDP/IP packet given to the LDGM/LDPC decoder is guaranteed to be error-free. Distributed storage and distributed content distribution (e.g. within FEC enabled peer-to-peer systems) are other classes of applications that may take advantage of our LDGM/LDPC codec. 1.2- Why is it called ``large block'' ? --------------------------------------- This is the key feature of these codes. They can efficiently operate on source blocks that are much larger than can more traditional Reed-Solomon or Cauchy erasure codes. For instance a typical Reed-Solomon codec is limited to blocks of size less than 255 packets, whereas our LDGM/LDPC codec easily operate on blocks of several tens or hundreds of thousands of packets. The only limit is the amount of physical memory available at the encoder or decoder. And with 1 kilobyte packets, it means blocks of several tens or hundreds of megabytes ! All objects (e.g. files) of size inferior to this threshold will be encoded optimally, within a single block. 1.3- What about small objects/blocks ? -------------------------------------- If performance is higher with large blocks, these codes can also be used with smaller blocks (up to a certain limit), using the LDPCFecScheme class. This class defines the notion of packet, i.e. the possibility to group several symbols in the same packet. The block size is artificially increased, which dramatically improves the codec efficiency with small objects. With large objects, there will be no change, a packet being composed of a single symbol. See the draft-ietf-rmt-bb-fec-ldpc-01.txt document for more information. Using this class makes the symbol(s) <=> packet mapping almost transparent to the user, the optimal number of symbols per packet being automatically defined. Example: 1) without any optimization, for an object composed of 100 source symbols (and packets), each 1024 byte long, and the same number of parity symbols: $ ./perf_tool2 -k100 -r100 -ss1024 -ps1024 random seed=745530 init_start=1141831988.745629 init_end=1141831988.745874 init_time=0.000245 LDPC/LDGM large block FEC codec - Version 2.0, March 8th, 2006 Copyright (c) 2002-2006 INRIA - All rights reserved Authors: C. Neumann, V. Roca, L. Fazio, J. Laboure This codec contains code from R. Neal: Copyright (c) 1995-2003 by Radford M. Neal See the associated LICENCE.TXT file for licence information LDPC-Staircase codec LDPC/LDGM extended performance tool data_symbols=100 fec_symbols=100 symbol_size=1024 nb_symbol_per_pkt=1 total_nb_pkts=200 pkt_size=1024 object_size=102400 left_degree=3 build_fec_start=1141831988.746402 build_fec_end=1141831988.746716 build_fec_time=0.000314 random seed=747217 decoding_start=1141831988.747266 decoding_end=1141831988.747961 decoding_time=0.000695 decoding_steps=122 inefficiency_ratio=1.220 Result: 22.0% additional symbols are needed for decoding to complete (i.e. inefficiency_ratio=1.220). 2) with symbol grouping, for the same 102400 byte object, same FEC ratio (as many source and parity symbols), and same packet size: [roca@iseran linux]$ ./perf_tool2 -os102400 -fec2.0 -ps1024 random seed=742879 init_start=1141832022.742982 init_end=1141832022.745334 init_time=0.002352 LDPC/LDGM large block FEC codec - Version 2.0, March 8th, 2006 Copyright (c) 2002-2006 INRIA - All rights reserved Authors: C. Neumann, V. Roca, L. Fazio, J. Laboure This codec contains code from R. Neal: Copyright (c) 1995-2003 by Radford M. Neal See the associated LICENCE.TXT file for licence information LDPC-Staircase codec LDPC/LDGM extended performance tool data_symbols=1600 fec_symbols=1600 symbol_size=64 nb_symbol_per_pkt=16 total_nb_pkts=200 pkt_size=1024 object_size=102400 left_degree=3 build_fec_start=1141832022.746314 build_fec_end=1141832022.747025 build_fec_time=0.000711 random seed=747806 decoding_start=1141832022.747846 decoding_end=1141832022.750386 decoding_time=0.002540 decoding_steps=110 inefficiency_ratio=1.100 Result: only 10.0% additional symbols are needed for decoding to complete (i.e. inefficiency_ratio=1.100). Here the number of source symbols is artificially increased to 1600, as well as parity symbols. That's why... The downside is that encoding/decoding is longer (i.e. decoding_time=0.002540 seconds instead of 0.000695), and memory requirements too (not visible). 1.4- I have my own content transfer tool, how can I use the codec ? ------------------------------------------------------------------- The LDGM/LDPC codec has limitations in terms of block length (i.e. the number of symbols in the block) and even if it called a large block FEC codec, you'll probably have to split files that exceed a few 100,000s*symbol_size bytes into several blocks... It makes the solution more complex but you have no choice here. Then each entity (sender and receiver) must agree on: - the blocking algorithm: several solutions are possible here, the FLUTE document identifies what is probably the most efficient one since it enables all blocks to have the same size (+ or - 1 byte). See RFC3926, section 5.1.2.3, "Algorithm for Computing Source Block Structure" - several FEC codec parameters, for the object: these parameters are defined in document: draft-ietf-rmt-bb-fec-ldpc-01.txt. Then the EXT_FTI header extension of ALC that contains all the information required for a given object has the following format: 0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | HET = 64 | HEL (=4 or 5) | | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Transfer-Length (L) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Encoding Symbol Length (E) | G | B (MSB) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | B (LSB) | Max Nb of Enc. Symbols (max_n) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ . Optional PRNG seed . +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ With this info plus the blocking algorithm, the receiver can deduce how many symbols are in each block. Then each symbol sent by the source must also include the ID of the associated block and the ID of the symbol in this block. This is the FEC Payload ID info (FPI): 0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Source Block Number | Encoding Symbol ID (20 bits) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ We use a linear ID sequence space for all symbols generated for a block, {0..k-1} are source packets, {k..n-1} are parity symbols. So, even if this is not a trivial task, it's really worth the pain... And read draft-ietf-rmt-bb-fec-ldpc-01.txt first, it will give you the general guidelines, and have a look at our FLUTE/ALC implementation in MCLv3. 1.5- How can I best reduce memory requirements ? ------------------------------------------------ At the encoder, if the application does not need to store FEC packets, but can use an FEC packet as soon as it has been encoded, then, with LDPC Staircase you only need to keep a copy of the current FEC packet (of index i) and the previously produced FEC packet (of index i-1). Then, once packet i has been sent, free packet i-1 which will no longer be required. Packet i is required for producing packet i+1, and then can be freed, and so on. Warning, if you use another codec than LDPC Staircase, then you cannot apply this optimization! But not all applications can do that. So if your application needs to remember all FEC packets produced, you have no choice, but this is required by your application, not by the LDGM/LDPC codec! At the decoder, FEC packets received from the network can be freed immediately after they have been passed to the LDPCFecSession::DecodingStepWithSymbol() function. They won't be used any more, neither by the codec, nor by your application (IMHO). Read 1.6- and 2.3- now... 1.6- How to use the pkt_canvas[] tables of BuildParitySymbol/ DecodingStepWithSymbol during encoding/decoding ? ------------------------------------------------------------- Read 1.5- first. At the encoder, with the two kinds of applications mentioned in 1.5, the void* pkt_canvas[]; argument of LDPCFecSession::BuildParitySymbol() function, must have "n" entries, to store all source plus FEC packets. But it does not mean that you absolutely need to keep all FEC packets in memory, and with the optimized encoding application above, the application should set the pkt_canvas[] entries for no longer required FEC packets to NULL. At the decoder, the void* pkt_canvas[]; argument of LDPCFecSession::DecodingStepWithSymbol() function, must have "k" entries to store only source packets. FEC packets previously received or decoded are not required at all, and the codec remembers by other means they have been already processed. WARNING: this is a change compared to codecs up to version 1.3 included! The pkt_canvas[] table had to be of size "n", not "k". Yet giving a larger table will not create any problem to the codec, it's just a waste of room. 1.7- Is this codec patent free ? -------------------------------- To the best of our knowledge, the basic encoding/decoding schemes do not infringe any IPR/patent. Yet there is always a risk that a company shows up and claims they hold IPRs. Only a tribunal can then decide, a posteriori, if this argument can be retained or not. So, as the LGPL License says: This library is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License for more details. If ever you think our codec infringes a patent, please let us know so that we can see how to address the problem! SECTION 2: Performances ----------------------- 2.1- What encoding/decoding speed can I achieve ? ------------------------------------------------- The perf_tool/perf_tool2 tools can be used for performance evaluation. With the LDPC_v1.8 codec version, on a Pentium IV/3.06GHz/Linux, and an object of 20,000 packets, each of size 1 kilobyte, producing 10,000 parity packets (FEC ratio=1.5, or code rate=2/3), we achieved: C. Neumann, ``Large Scale Content Delivery applied to Files and Videos'', PhD Dissertation, INPG (Institut National Polytechnique de Grenoble), France, December 2005. STAIRCASE: encoding (considering FEC packets only): 734.0 Mbps encoding (considering source+FEC packets): 2,205.0 Mbps decoding: 816.5 Mbps TRIANGLE: encoding (considering FEC packets only): 443.8 Mbps encoding (considering source+FEC packets): 1,331.4 Mbps decoding: 434.9 Mbps Just for comparison, with a Reed-Solomon codec, we achieved: REED-SOLOMON, if we set n = 255 (better erasure protection): encoding (considering FEC packets only): 21.5 Mbps encoding (considering source+FEC packets): 64.4 Mbps decoding: 52.3 Mbps REED-SOLOMON, if we set k = 51 (higher speed): encoding (considering FEC packets only): 86.0 Mbps encoding (considering source+FEC packets): 258.0 Mbps decoding: 242.0 Mbps In all cases, the CPU is 100% busy. (see the INRIA Research Report RR-5225 for more details). So the LDPC codec is compatible with all kinds of underlying networks. It is also means that this codec requires less processing power than a Reed-Solomon codec, a great advantage on lightweight hosts (PDAs, smart-phones). 2.2- What is the decoding inefficiency ratio I can expect ? ----------------------------------------------------------- This is a key aspect, which depends on: - the block size, - the FEC ratio parameter, - whether LDPC-Staircase or LDPC-Triangle is used, - the order in which the source and parity symbols are sent, - the channel erasure ratio, and - the decoding algorithm. There is no predefined answer and this ratio can not be evaluated in advance. This is typically an experimental measure. You'll find some hints though in the following documents: [Roca04] Roca, V. and C. Neumann, "Design, Evaluation and Comparison of Four Large Block FEC Codecs: LDPC, LDGM, LDPC-Staircase and LDPC-Triangle, Plus a Reed-Solomon Small Block FEC Codec", INRIA Research Report RR-5225, June 2004. [Neumann05] C. Neumann, V. Roca, A. Francillon and D. Furodet, CoNext'05, October 2005. 2.3- What are the maximum memory requirements ? ----------------------------------------------- Read 1.5- first. Memory is required both to store source and FEC packets, and to store the parity check matrix (plus the check nodes at the decoder). With an object of 20,000 packets, each of size 1 kilobyte, producing 10,000 parity packets (FEC ratio=1.5, or code rate=2/3), with LDPC Staircase (LDPC_v1.4), we measured: Parity Check Matrix: 3 MB (thanks to structures dedicated to its sparse nature) Other memory requirements at the encoder: 30 MB (default) 20 MB (the application encodes and sends FEC packets on the fly, then frees the packet buffer) Other memory requirements at the decoder: 25 MB (see the INRIA Research Report RR-5225 for more details). So source and FEC packet storage largely dominate memory requirements at both the encoder and decoder. If you __only__ use LDPC-Staircase, you should activate the OPTIMIZEFORMEMORY define in the src/ldpc_matrix_sparse.h file. If you __also use LDPC-Triangle, then you should not activate the OPTIMIZEFORMEMORY define (comment it) since it will cause severe performance degradations. By default, this define is not activated, as show below: /** * This define optimizes the memory consumption of matrices * (the matrix size is decreased by 1/6). * WARNING: with LDPC Triangle, the initialization time * becomes quite important when activating this option!!! * In all other cases (LDGM, LDPC Staircase, LDPC) we * did not noticed any impact on init/encoding/decoding * times. */ //#define OPTIMIZEFORMEMORY The maximum memory requirements can also be significantly reduced by using a dedicated external memory management scheme, as explained in section 2.4. 2.4- What are the benefits of the external memory management callbacks ---------------------------------------------------------------------- In release 1.8, May 2005, a large number of callbacks have been added in addition to the already existing DecodedPkt_callback one. The full list is: DecodedPkt_callback AllocTmpBuffer_callback GetData_callback GetDataPtrOnly_callback StoreData_callback FreePkt_callback The idea is to have an external memory management scheme, that will allocate/free memory instead of the LDPC codec, e.g. for internal needs (e.g. for check node buffers and the various tables) or when a new packet is decoded. In MCLv3 (starting from release 2.99.6, May 2005), this feature is used by a virtual receive memory management scheme, that enables a receiver to decode objects whose size is an order of magnitude larger than the available physical memory (RAM) size. Therefore, the LDPC codec is no longer limited by the amount of RAM available in the host. Of course it means that disk I/Os are sometimes required, which reduces the codec speed. This is the downside of the technique.