$Id: FAQ.txt,v 1.16 2006/09/06 13:46:30 roca Exp $
	------------------------------------------
	LDPC Codec Frequently Asked Questions (FAQ)
		vincent.roca@inrialpes.fr
	------------------------------------------


1- General
   1.0- Terminology
   1.1- What is the LDGM/LDPC codec useful for ?
   1.2- Why is it called ``large block'' ?
   1.3- What about small objects/blocks ?
   1.4- I have my own content transfer tool, how can I use the codec ?
   1.5- How can I best reduce memory requirements ?
   1.6- How to use the pkt_canvas[] tables of BuildParitySymbol/
	DecodingStepWithSymbol during encoding/decoding ?
   1.7- Is this codec patent free ?

2- Performances
   2.1- What encoding/decoding speed can I achieve ?
   2.2- What is the decoding inefficiency ratio I can expect ?
   2.3- What are the maximum memory requirements ?
   2.4- What are the benefits of the external memory management callbacks

3- Troubleshooting


--------------------------------------------------------------------------------

SECTION 1: General
------------------

1.0- Terminology
----------------

See draft-ietf-rmt-fec-bb-revised-03.txt and draft-ietf-rmt-bb-fec-ldpc-03.txt:

   Object: An ordered sequence of bytes to be transferred by the
      transport protocol.  For example, a file or stream. 
   
   Symbol: A unit of data processed by the Forward Error Correction 
      code.  A symbol is always considered as a unit i.e. it is either
      completely received or completely lost.

   Source symbol: A symbol containing information from the original
      object.

   Repair symbol: A symbol containing information generated by the FEC
      code which can be used to recover lost source symbols.

   Encoding symbol: A source symbol or a repair symbol.

   Encoder: The FEC scheme specific functions required to transform a
      object into FEC encoded data. i.e. the functions that produce 
      repair symbols using source symbols.
   
   Decoder: The FEC scheme specific functions required to transform
      received FEC encoded data into a copy of the original object

   Receiver: A system supporting the receiving functions of a CDP and
      FEC scheme according to this specification

   Sender: A system supporting the sending functions of a CDP and FEC
      scheme according to this specification

   Source Block: A part of the object formed from a subset of the
      object's source symbols

   CDP: Content Delivery Protocol

   FEC: Forward Error Correction

   Encoding Symbol Group: a group of encoding symbols that are sent
      together, within the same packet, and whose relationships to the
      source object can be derived from a single Encoding Symbol ID

   Source Packet: a data packet containing only source symbols

   Repair Packet: a data packet containing only repair symbols


1.1- What is the LDGM/LDPC codec useful for ?
---------------------------------------------

This codec is useful for all applications that transfer large amounts
of data. Yet our codec only operates on so-called ``Packet Erasure Channels''
(or PEC), i.e. channels that operate on packets (rather than bit streams),
and where a packet arriving at the decoder is guaranteed to be error-free.
Yet some of these packets may get lost (erased) by the channel.

Simply speaking, transmissions over the Internet and (W)LANs are examples of
such PEC. Over the Internet packets may be dropped because of congestion
problems. And on wireless LANs, packets may get corrupted by interferences
and dropped after CRC check at the receiver. But the content of a UDP/IP
packet given to the LDGM/LDPC decoder is guaranteed to be error-free.

Distributed storage and distributed content distribution (e.g. within
FEC enabled peer-to-peer systems) are other classes of applications that
may take advantage of our LDGM/LDPC codec.


1.2- Why is it called ``large block'' ?
---------------------------------------

This is the key feature of these codes. They can efficiently operate on
source blocks that are much larger than can more traditional Reed-Solomon
or Cauchy erasure codes. For instance a typical Reed-Solomon codec is
limited to blocks of size less than 255 packets, whereas our LDGM/LDPC
codec easily operate on blocks of several tens or hundreds of thousands of
packets. The only limit is the amount of physical memory available at
the encoder or decoder. And with 1 kilobyte packets, it means blocks of
several tens or hundreds of megabytes ! All objects (e.g. files) of
size inferior to this threshold will be encoded optimally, within a single
block.


1.3- What about small objects/blocks ?
--------------------------------------

If performance is higher with large blocks, these codes can also be
used with smaller blocks (up to a certain limit), using the LDPCFecScheme
class. This class defines the notion of packet, i.e. the possibility to
group several symbols in the same packet. The block size is artificially
increased, which dramatically improves the codec efficiency with small
objects. With large objects, there will be no change, a packet being
composed of a single symbol.
See the draft-ietf-rmt-bb-fec-ldpc-01.txt document for more information.
Using this class makes the symbol(s) <=> packet mapping almost transparent
to the user, the optimal number of symbols per packet being automatically
defined.

Example:

1) without any optimization, for an object composed of 100 source symbols
(and packets), each 1024 byte long, and the same number of parity symbols:

$ ./perf_tool2 -k100 -r100 -ss1024 -ps1024
random seed=745530
init_start=1141831988.745629
init_end=1141831988.745874  init_time=0.000245
LDPC/LDGM large block FEC codec - Version 2.0, March 8th, 2006
  Copyright (c) 2002-2006 INRIA - All rights reserved
  Authors: C. Neumann, V. Roca, L. Fazio, J. Laboure
  This codec contains code from R. Neal:
  Copyright (c) 1995-2003 by Radford M. Neal
  See the associated LICENCE.TXT file for licence information
  LDPC-Staircase codec

LDPC/LDGM extended performance tool
data_symbols=100  fec_symbols=100  symbol_size=1024  nb_symbol_per_pkt=1
total_nb_pkts=200  pkt_size=1024  object_size=102400  left_degree=3

build_fec_start=1141831988.746402
build_fec_end=1141831988.746716  build_fec_time=0.000314
random seed=747217
decoding_start=1141831988.747266
decoding_end=1141831988.747961  decoding_time=0.000695  decoding_steps=122
inefficiency_ratio=1.220

Result: 22.0% additional symbols are needed for decoding to complete
(i.e. inefficiency_ratio=1.220).


2) with symbol grouping, for the same 102400 byte object, same FEC ratio
(as many source and parity symbols), and same packet size:

[roca@iseran linux]$ ./perf_tool2 -os102400 -fec2.0 -ps1024
random seed=742879
init_start=1141832022.742982
init_end=1141832022.745334  init_time=0.002352
LDPC/LDGM large block FEC codec - Version 2.0, March 8th, 2006
  Copyright (c) 2002-2006 INRIA - All rights reserved
  Authors: C. Neumann, V. Roca, L. Fazio, J. Laboure
  This codec contains code from R. Neal:
  Copyright (c) 1995-2003 by Radford M. Neal
  See the associated LICENCE.TXT file for licence information
  LDPC-Staircase codec

LDPC/LDGM extended performance tool
data_symbols=1600  fec_symbols=1600  symbol_size=64  nb_symbol_per_pkt=16
total_nb_pkts=200  pkt_size=1024  object_size=102400  left_degree=3

build_fec_start=1141832022.746314
build_fec_end=1141832022.747025  build_fec_time=0.000711
random seed=747806
decoding_start=1141832022.747846
decoding_end=1141832022.750386  decoding_time=0.002540  decoding_steps=110
inefficiency_ratio=1.100

Result: only 10.0% additional symbols are needed for decoding to complete
(i.e. inefficiency_ratio=1.100). Here the number of source symbols is
artificially increased to 1600, as well as parity symbols. That's why...
The downside is that encoding/decoding is longer (i.e. decoding_time=0.002540
seconds instead of 0.000695), and memory requirements too (not visible).


1.4- I have my own content transfer tool, how can I use the codec ?
-------------------------------------------------------------------

The LDGM/LDPC codec has limitations in terms of block length (i.e. the number
of symbols in the block) and even if it called a large block FEC codec,
you'll probably have to split files that exceed a few 100,000s*symbol_size
bytes into several blocks... It makes the solution more complex but you have
no choice here.

Then each entity (sender and receiver) must agree on:

- the blocking algorithm: several solutions are possible here, the
  FLUTE document identifies what is probably the most efficient one
  since it enables all blocks to have the same size (+ or - 1 byte).
  See RFC3926, section 5.1.2.3, "Algorithm for Computing Source Block
  Structure"

- several FEC codec parameters, for the object: these parameters are
  defined in document: draft-ietf-rmt-bb-fec-ldpc-01.txt.

  Then the EXT_FTI header extension of ALC that contains all the
  information required for a given object has the following format:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |   HET = 64    | HEL (=4 or 5) |                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+                               +
   |                      Transfer-Length (L)                      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |   Encoding Symbol Length (E)  |       G       |   B (MSB)     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |        B (LSB)        |   Max Nb of Enc. Symbols  (max_n)     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   .                       Optional PRNG seed                      .
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+


With this info plus the blocking algorithm, the receiver can deduce how
many symbols are in each block.

Then each symbol sent by the source must also include the ID of the
associated block and the ID of the symbol in this block.
This is the FEC Payload ID info (FPI):

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  Source Block Number  |      Encoding Symbol ID (20 bits)     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+


We use a linear ID sequence space for all symbols generated for a block,
{0..k-1} are source packets, {k..n-1} are parity symbols.

So, even if this is not a trivial task, it's really worth the pain...
And read draft-ietf-rmt-bb-fec-ldpc-01.txt first, it will give you
the general guidelines, and have a look at our FLUTE/ALC implementation
in MCLv3.


1.5- How can I best reduce memory requirements ?
------------------------------------------------

At the encoder, if the application does not need to store FEC packets, but
can use an FEC packet as soon as it has been encoded, then, with LDPC
Staircase you only need to keep a copy of the current FEC packet (of index i)
and the previously produced FEC packet (of index i-1). Then, once packet i
has been sent, free packet i-1 which will no longer be required. Packet i is
required for producing packet i+1, and then can be freed, and so on.
Warning, if you use another codec than LDPC Staircase, then you cannot
apply this optimization!

But not all applications can do that. So if your application needs to 
remember all FEC packets produced, you have no choice, but this is required
by your application, not by the LDGM/LDPC codec!

At the decoder, FEC packets received from the network can be freed
immediately after they have been passed to the
LDPCFecSession::DecodingStepWithSymbol() function. They won't be used any
more, neither by the codec, nor by your application (IMHO).

Read 1.6- and 2.3- now...


1.6- How to use the pkt_canvas[] tables of BuildParitySymbol/
DecodingStepWithSymbol during encoding/decoding ?
-------------------------------------------------------------

Read 1.5- first.
At the encoder, with the two kinds of applications mentioned in 1.5, the
	void*	pkt_canvas[]; 
argument of LDPCFecSession::BuildParitySymbol() function, must have "n" entries,
to store all source plus FEC packets. But it does not mean that you absolutely
need to keep all FEC packets in memory, and with the optimized encoding
application above, the application should set the pkt_canvas[] entries for
no longer required FEC packets to NULL.

At the decoder, the
	void*	pkt_canvas[]; 
argument of LDPCFecSession::DecodingStepWithSymbol() function, must have
"k" entries to store only source packets. FEC packets previously received or decoded
are not required at all, and the codec remembers by other means they have
been already processed.

WARNING: this is a change compared to codecs up to version 1.3 included!
	The pkt_canvas[] table had to be of size "n", not "k". Yet giving
	a larger table will not create any problem to the codec, it's just
	a waste of room.


1.7- Is this codec patent free ?
--------------------------------

To the best of our knowledge, the basic encoding/decoding schemes do not
infringe any IPR/patent. Yet there is always a risk that a company shows
up and claims they hold IPRs. Only a tribunal can then decide, a posteriori,
if this argument can be retained or not. So, as the LGPL License says:

    This library is distributed in the hope that it will be useful,
    but WITHOUT ANY WARRANTY; without even the implied warranty of
    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
    Lesser General Public License for more details.

If ever you think our codec infringes a patent, please let us know so
that we can see how to address the problem!


SECTION 2: Performances
-----------------------


2.1- What encoding/decoding speed can I achieve ?
-------------------------------------------------

The perf_tool/perf_tool2 tools can be used for performance evaluation.

With the LDPC_v1.8 codec version, on a Pentium IV/3.06GHz/Linux, and
an object of 20,000 packets, each of size 1 kilobyte, producing 10,000
parity packets (FEC ratio=1.5, or code rate=2/3), we achieved:

  C. Neumann, ``Large Scale Content Delivery applied to Files and Videos'',
  PhD Dissertation, INPG (Institut National Polytechnique de Grenoble),
  France, December 2005.


STAIRCASE:
encoding (considering FEC packets only):	  734.0 Mbps
encoding (considering source+FEC packets):	2,205.0 Mbps
decoding:					  816.5 Mbps

TRIANGLE:
encoding (considering FEC packets only):	  443.8 Mbps
encoding (considering source+FEC packets):	1,331.4 Mbps
decoding:					  434.9 Mbps

Just for comparison, with a Reed-Solomon codec, we achieved:

REED-SOLOMON, if we set n = 255 (better erasure protection):
encoding (considering FEC packets only):	  21.5 Mbps
encoding (considering source+FEC packets):	  64.4 Mbps
decoding:					  52.3 Mbps

REED-SOLOMON, if we set k = 51 (higher speed):
encoding (considering FEC packets only):	  86.0 Mbps
encoding (considering source+FEC packets):	 258.0 Mbps
decoding:					 242.0 Mbps

In all cases, the CPU is 100% busy.
(see the INRIA Research Report RR-5225 for more details).

So the LDPC codec is compatible with all kinds of underlying networks.
It is also means that this codec requires less processing power than a
Reed-Solomon codec, a great advantage on lightweight hosts (PDAs,
smart-phones).


2.2- What is the decoding inefficiency ratio I can expect ?
-----------------------------------------------------------

This is a key aspect, which depends on:
	- the block size,
	- the FEC ratio parameter,
	- whether LDPC-Staircase or LDPC-Triangle is used,
	- the order in which the source and parity symbols are sent,
	- the channel erasure ratio, and
	- the decoding algorithm.
There is no predefined answer and this ratio can not be evaluated
in advance. This is typically an experimental measure.
You'll find some hints though in the following documents:

   [Roca04]   Roca, V. and C. Neumann, "Design, Evaluation and
              Comparison of Four Large Block FEC Codecs: LDPC, LDGM,
              LDPC-Staircase and LDPC-Triangle, Plus a Reed-Solomon
              Small Block FEC Codec",  INRIA Research Report RR-5225,
              June 2004.

   [Neumann05] C. Neumann, V. Roca, A. Francillon and D. Furodet,
              CoNext'05, October 2005.


2.3- What are the maximum memory requirements ?
-----------------------------------------------

Read 1.5- first.

Memory is required both to store source and FEC packets, and to store
the parity check matrix (plus the check nodes at the decoder).
With an object of 20,000 packets, each of size 1 kilobyte, producing
10,000 parity packets (FEC ratio=1.5, or code rate=2/3), with LDPC
Staircase (LDPC_v1.4), we measured:

Parity Check Matrix:	3 MB	(thanks to structures dedicated to its
				 sparse nature)
Other memory requirements at the encoder:
			30 MB	(default)
			20 MB	(the application encodes and sends FEC packets
				 on the fly, then frees the packet buffer)
Other memory requirements at the decoder:
			25 MB

(see the INRIA Research Report RR-5225 for more details).

So source and FEC packet storage largely dominate memory requirements at
both the encoder and decoder.

If you __only__ use LDPC-Staircase, you should activate the
OPTIMIZEFORMEMORY define in the src/ldpc_matrix_sparse.h file.

If you __also use LDPC-Triangle, then you should not activate the
OPTIMIZEFORMEMORY define (comment it) since it will cause severe performance
degradations.

By default, this define is not activated, as show below:
/**
 *  This define optimizes the memory consumption of matrices
 *  (the matrix size is decreased by 1/6).
 *  WARNING: with LDPC Triangle, the initialization time 
 *  becomes quite important when activating this option!!!
 *  In all other cases (LDGM, LDPC Staircase, LDPC) we
 *  did not noticed any impact on init/encoding/decoding
 *  times.
 */
//#define OPTIMIZEFORMEMORY

The maximum memory requirements can also be significantly reduced
by using a dedicated external memory management scheme, as explained
in section 2.4.


2.4- What are the benefits of the external memory management callbacks
----------------------------------------------------------------------

In release 1.8, May 2005, a large number of callbacks have been added
in addition to the already existing DecodedPkt_callback one.
The full list is:
	DecodedPkt_callback
	AllocTmpBuffer_callback
	GetData_callback
	GetDataPtrOnly_callback
	StoreData_callback
	FreePkt_callback
The idea is to have an external memory management scheme, that will
allocate/free memory instead of the LDPC codec, e.g. for internal
needs (e.g. for check node buffers and the various tables) or when
a new packet is decoded.

In MCLv3 (starting from release 2.99.6, May 2005), this feature is used
by a virtual receive memory management scheme, that enables a receiver
to decode objects whose size is an order of magnitude larger than the
available physical memory (RAM) size. Therefore, the LDPC codec is no
longer limited by the amount of RAM available in the host. Of course
it means that disk I/Os are sometimes required, which reduces the codec
speed. This is the downside of the technique.