Class BHSDCodec


  • public final class BHSDCodec
    extends Codec
    A BHSD codec is a means of encoding integer values as a sequence of bytes, or vice versa, using a specified "BHSD" encoding mechanism. It uses a variable-length encoding and a modified sign representation such that small numbers are represented as a single byte, whilst larger numbers take more bytes to encode. The number may be signed or unsigned; if it is signed, it can be weighted towards positive numbers or equally distributed using a one's complement.
    The Codec also supports delta coding, where a sequence of numbers is represented as a series of first-order differences. So a delta encoding of the integers [1..10] would be represented as a sequence of ten 1s. This allows the absolute value of a coded integer to fall outside of the 'small number' range, whilst still being encoded as a single byte.
    A BHSD codec is configured with four parameters:
    B
    The maximum number of bytes that each value is encoded as. B must be a value in the range [1..5]. For a pass-through coding (where each byte is encoded as itself, aka Codec.BYTE1), B is 1 (each byte takes a maximum of 1 byte).
    H
    The radix of the integer. Values are defined as a sequence of values, where value n is multiplied by H^n. So the number 1234 may be represented as the sequence 4 3 2 1 with a radix (H) of 10. Note that other permutations are also possible; 43 2 1 will also encode 1234. The co-parameter L is defined as 256-H. This is important because only the last value in a sequence may be < L; all prior values must be >= L.
    S
    Whether the codec represents signed values (or not). This may take one of three values: 0 (unsigned), 1 (signed, one's complement) or 2 (signed, two's complement).
    D
    Whether the codec represents a delta encoding. This may be 0 (no delta) or 1 (delta encoding). A delta encoding of 1 indicates that values are cumulative; a sequence of 1 1 1 1 1 will represent the sequence 1 2 3 4 5. For this reason, the codec supports two variants of decode: one with and one without a last parameter. If the codec is a non-delta encoding, then the last value is ignored if passed. If the codec is a delta encoding, it is a run-time error to call decode without the extra parameter; the previously decoded value should be passed in. (It was designed this way to support multi-threaded access without requiring a new instance of the Codec to be cloned for each use.)
    Codecs are notated as (B,H,S,D) and either D or S,D may be omitted if zero. Thus Codec.BYTE1 is denoted (1,256,0,0) or (1,256). The toString() method prints out the condensed form of the encoding. Often, the last character in the name (Codec.BYTE1, Codec.UNSIGNED5) gives a clue as to the B value. Those that start with U (Codec.UDELTA5, Codec.UNSIGNED5) are unsigned; otherwise, in most cases, they are signed. The presence of the word Delta (Codec.DELTA5, Codec.UDELTA5) indicates a delta encoding is used.
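    The roles of B, H and L described above can be illustrated with a minimal sketch of the unsigned, non-delta case. This is not the library's implementation: the class and method names (BhsdUnsignedSketch, encodeUnsigned, decodeUnsigned, roundTrip) are hypothetical, and the sketch assumes that every byte except the last is at least L and that decoding stops after B bytes at most.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical sketch class; not part of Commons Compress.
final class BhsdUnsignedSketch {

    /** Encodes a non-negative value using at most b bytes and radix h. */
    static byte[] encodeUnsigned(long value, int b, int h) {
        if (value < 0) {
            throw new IllegalArgumentException("value must be non-negative");
        }
        final int l = 256 - h; // co-parameter L
        final ByteArrayOutputStream out = new ByteArrayOutputStream();
        long z = value;
        for (int n = 0; n < b; n++) {
            long byteN;
            if (z < l) {
                byteN = z; // small enough: this byte terminates the sequence
            } else {
                byteN = z % h; // digit in radix h ...
                while (byteN < l) {
                    byteN += h; // ... shifted into [L..255] so it reads as "more follows"
                }
            }
            out.write((int) byteN);
            z = (z - byteN) / h;
            if (byteN < l) {
                break;
            }
        }
        if (z != 0) {
            throw new IllegalArgumentException("value does not fit in " + b + " bytes");
        }
        return out.toByteArray();
    }

    /** Decodes one value: byte n contributes byte * h^n; stops at a byte < L or after b bytes. */
    static long decodeUnsigned(InputStream in, int b, int h) throws IOException {
        final int l = 256 - h;
        long z = 0;
        long weight = 1; // h^n
        for (int n = 0; n < b; n++) {
            final int x = in.read();
            if (x == -1) {
                throw new EOFException("end of stream inside an encoded value");
            }
            z += x * weight;
            weight *= h;
            if (x < l) {
                break;
            }
        }
        return z;
    }

    /** Convenience wrapper: encode, then decode the resulting bytes. */
    static long roundTrip(long value, int b, int h) {
        try {
            return decodeUnsigned(new ByteArrayInputStream(encodeUnsigned(value, b, h)), b, h);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(1234, 5, 64));
    }
}
```

    With (B,H) = (5,64), L is 192, so any value below 192 occupies a single byte, while 1234 needs two bytes (210 followed by 16, since 210 + 16 * 64 = 1234).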
    • Constructor Summary

      Constructors 
      Constructor Description
      BHSDCodec​(int b, int h)
      Constructs an unsigned, non-delta Codec with the given B and H values.
      BHSDCodec​(int b, int h, int s)
      Constructs a non-delta Codec with the given B, H and S values.
      BHSDCodec​(int b, int h, int s, int d)
      Constructs a Codec with the given B, H, S and D values.
    • Method Summary

      Modifier and Type Method Description
      long cardinality()
      Returns the cardinality of this codec; that is, the number of distinct values that it can contain.
      int decode​(java.io.InputStream in)
      Decode a sequence of bytes from the given input stream, returning the value as an int.
      int decode​(java.io.InputStream in, long last)
      Decode a sequence of bytes from the given input stream, returning the value as an int.
      int[] decodeInts​(int n, java.io.InputStream in)
      Decodes a sequence of n values from in.
      int[] decodeInts​(int n, java.io.InputStream in, int firstValue)
      Decodes a sequence of n values from in.
      byte[] encode​(int value)
      Encode a single value into a sequence of bytes.
      byte[] encode​(int value, int last)
      Encode a single value into a sequence of bytes.
      boolean encodes​(long value)
      Returns true if this encoding can code the given value.
      boolean equals​(java.lang.Object o)  
      int getB()  
      int getH()  
      int getL()  
      int getS()  
      int hashCode()  
      boolean isDelta()
      Returns true if this codec is a delta codec
      boolean isSigned()
      Returns true if this codec is a signed codec
      long largest()
      Returns the largest value that this codec can represent.
      long smallest()
      Returns the smallest value that this codec can represent.
      java.lang.String toString()
      Returns the codec in the form (1,256) or (1,64,1,1).
      • Methods inherited from class org.apache.commons.compress.harmony.pack200.Codec

        encode
      • Methods inherited from class java.lang.Object

        clone, finalize, getClass, notify, notifyAll, wait, wait, wait
    • Constructor Detail

      • BHSDCodec

        public BHSDCodec​(int b,
                         int h)
        Constructs an unsigned, non-delta Codec with the given B and H values.
        Parameters:
        b - the maximum number of bytes that a value can be encoded as [1..5]
        h - the radix of the encoding [1..256]
      • BHSDCodec

        public BHSDCodec​(int b,
                         int h,
                         int s)
        Constructs a non-delta Codec with the given B, H and S values.
        Parameters:
        b - the maximum number of bytes that a value can be encoded as [1..5]
        h - the radix of the encoding [1..256]
        s - whether the encoding represents signed numbers (s=0 is unsigned; s=1 is signed with one's complement; s=2 is signed with two's complement)
      • BHSDCodec

        public BHSDCodec​(int b,
                         int h,
                         int s,
                         int d)
        Constructs a Codec with the given B, H, S and D values.
        Parameters:
        b - the maximum number of bytes that a value can be encoded as [1..5]
        h - the radix of the encoding [1..256]
        s - whether the encoding represents signed numbers (s=0 is unsigned; s=1 is signed with one's complement; s=2 is signed with two's complement)
        d - whether this is a delta encoding (d=0 is non-delta; d=1 is delta)
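        The effect of the s parameter can be sketched as a mapping from the unsigned number read off the byte stream to a signed result. The sketch below assumes the sign convention of the Pack200 specification, which this page does not spell out; the names SignSketch and toSigned are hypothetical and are not part of this class.

```java
// Hypothetical sketch class; not part of Commons Compress.
final class SignSketch {

    /**
     * Maps an unsigned decoded number u to a signed value for sign mode s.
     * Assumption (Pack200 convention): when the low s bits of u are all ones
     * the result is negative, otherwise it is non-negative.
     */
    static long toSigned(long u, int s) {
        if (s == 0) {
            return u; // unsigned codec: no mapping
        }
        final long mask = (1L << s) - 1;
        if ((u & mask) == mask) {
            return ~(u >>> s); // negative branch
        }
        return u - (u >>> s); // non-negative branch
    }

    public static void main(String[] args) {
        for (long u = 0; u < 8; u++) {
            System.out.println(u + " -> s=1: " + toSigned(u, 1) + ", s=2: " + toSigned(u, 2));
        }
    }
}
```

        Under this assumption, s=1 alternates between non-negative and negative results (0, -1, 1, -2, ...), giving an equal distribution, while s=2 yields one negative result in every four, weighting the range towards positive numbers.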
    • Method Detail

      • cardinality

        public long cardinality()
        Returns the cardinality of this codec; that is, the number of distinct values that it can contain.
        Returns:
        the cardinality of this codec
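        One way to reason about the cardinality is to count the distinct byte sequences that the B, H and L rules allow. The sketch below does exactly that; it is a counting argument rather than the library's code, the names CardinalitySketch and countEncodings are hypothetical, and it assumes that each valid byte sequence decodes to a different value.

```java
// Hypothetical sketch class; not part of Commons Compress.
final class CardinalitySketch {

    /**
     * Counts the byte sequences permitted for a codec with parameters b and h.
     * A sequence shorter than b bytes has leading bytes in [L..255] (h choices each)
     * and a final byte in [0..L-1] (L choices). A sequence of exactly b bytes may
     * end in any byte, because decoding stops after b bytes regardless.
     */
    static long countEncodings(int b, int h) {
        final int l = 256 - h;
        long total = 0;
        long prefixes = 1; // number of valid continuation prefixes so far: h^(k-1)
        for (int k = 1; k < b; k++) {
            total += prefixes * l; // sequences of exactly k bytes
            prefixes *= h;
        }
        return total + prefixes * 256; // sequences of exactly b bytes
    }

    public static void main(String[] args) {
        System.out.println(countEncodings(1, 256));
        System.out.println(countEncodings(2, 64));
    }
}
```

        For example, (1,256) allows 256 sequences, and (2,64) allows 192 one-byte sequences plus 64 * 256 two-byte sequences, 16576 in total.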
      • decode

        public int decode​(java.io.InputStream in)
                   throws java.io.IOException,
                          Pack200Exception
        Description copied from class: Codec
        Decode a sequence of bytes from the given input stream, returning the value as an int. Note that this method can only be applied for non-delta encodings.
        Specified by:
        decode in class Codec
        Parameters:
        in - the input stream to read from
        Returns:
        the decoded value
        Throws:
        java.io.IOException - if there is a problem reading from the underlying input stream
        Pack200Exception - if the encoding is a delta encoding
      • decode

        public int decode​(java.io.InputStream in,
                          long last)
                   throws java.io.IOException,
                          Pack200Exception
        Description copied from class: Codec
        Decode a sequence of bytes from the given input stream, returning the value as an int. If this encoding is a delta encoding (d=1) then the previous value must be passed in as a parameter. If it is a non-delta encoding, then it does not matter what value is passed in, so it makes sense for the value to be passed in by default using code similar to:
         long last = 0;
         while (condition) {
             last = codec.decode(in, last);
             // do something with last
         }
         
        Specified by:
        decode in class Codec
        Parameters:
        in - the input stream to read from
        last - the previous value read, which must be supplied if the codec is a delta encoding
        Returns:
        the decoded value
        Throws:
        java.io.IOException - if there is a problem reading from the underlying input stream
        Pack200Exception - if there is a problem decoding the value or the value is invalid
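        The role of the last parameter can be illustrated without any byte-level coding: a delta codec stores first-order differences, and each decoded value is the previous value plus the stored difference. The following is a minimal sketch under that assumption; DeltaSketch, toDeltas and fromDeltas are hypothetical names, the initial previous value is taken to be 0, and any wrap-around that a real codec might apply at the edges of its range is ignored.

```java
// Hypothetical sketch class; not part of Commons Compress.
final class DeltaSketch {

    /** Replaces each value by its difference from the previous value (initially 0). */
    static int[] toDeltas(int[] values) {
        final int[] deltas = new int[values.length];
        int last = 0;
        for (int i = 0; i < values.length; i++) {
            deltas[i] = values[i] - last;
            last = values[i];
        }
        return deltas;
    }

    /** Rebuilds the original values by accumulating the differences. */
    static int[] fromDeltas(int[] deltas) {
        final int[] values = new int[deltas.length];
        int last = 0;
        for (int i = 0; i < deltas.length; i++) {
            last += deltas[i]; // "last" plays the role of the last parameter of decode
            values[i] = last;
        }
        return values;
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(fromDeltas(new int[] {1, 1, 1, 1, 1})));
    }
}
```

        Because the running value lives in the caller's loop rather than in the codec, the same codec instance can serve several independent streams, which matches the design rationale given in the class description.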
      • decodeInts

        public int[] decodeInts​(int n,
                                java.io.InputStream in)
                         throws java.io.IOException,
                                Pack200Exception
        Description copied from class: Codec
        Decodes a sequence of n values from in. This should probably be used in most cases, since some codecs (such as PopulationCodec) only work when the number of values to be read is known.
        Overrides:
        decodeInts in class Codec
        Parameters:
        n - the number of values to decode
        in - the input stream to read from
        Returns:
        an array of int values corresponding to values decoded
        Throws:
        java.io.IOException - if there is a problem reading from the underlying input stream
        Pack200Exception - if there is a problem decoding the value or the value is invalid
      • decodeInts

        public int[] decodeInts​(int n,
                                java.io.InputStream in,
                                int firstValue)
                         throws java.io.IOException,
                                Pack200Exception
        Description copied from class: Codec
        Decodes a sequence of n values from in.
        Overrides:
        decodeInts in class Codec
        Parameters:
        n - the number of values to decode
        in - the input stream to read from
        firstValue - the first value in the band if it has already been read
        Returns:
        an array of int values corresponding to values decoded, with firstValue as the first value in the array.
        Throws:
        java.io.IOException - if there is a problem reading from the underlying input stream
        Pack200Exception - if there is a problem decoding the value or the value is invalid
      • encodes

        public boolean encodes​(long value)
        Returns true if this encoding can code the given value.
        Parameters:
        value - the value to check
        Returns:
        true if the encoding can encode this value
      • encode

        public byte[] encode​(int value,
                             int last)
                      throws Pack200Exception
        Description copied from class: Codec
        Encode a single value into a sequence of bytes.
        Specified by:
        encode in class Codec
        Parameters:
        value - the value to encode
        last - the previous value encoded (for delta encodings)
        Returns:
        the encoded bytes
        Throws:
        Pack200Exception - if the value cannot be encoded by this codec
      • encode

        public byte[] encode​(int value)
                      throws Pack200Exception
        Description copied from class: Codec
        Encode a single value into a sequence of bytes. Note that this method can only be used for non-delta encodings.
        Specified by:
        encode in class Codec
        Parameters:
        value - the value to encode
        Returns:
        the encoded bytes
        Throws:
        Pack200Exception - if the value cannot be encoded by this codec
      • isDelta

        public boolean isDelta()
        Returns true if this codec is a delta codec
        Returns:
        true if this codec is a delta codec
      • isSigned

        public boolean isSigned()
        Returns true if this codec is a signed codec
        Returns:
        true if this codec is a signed codec
      • largest

        public long largest()
        Returns the largest value that this codec can represent.
        Returns:
        the largest value that this codec can represent.
      • smallest

        public long smallest()
        Returns the smallest value that this codec can represent.
        Returns:
        the smallest value that this codec can represent.
      • toString

        public java.lang.String toString()
        Returns the codec in the form (1,256) or (1,64,1,1). Note that trailing zero fields are not shown.
        Overrides:
        toString in class java.lang.Object
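        The condensed notation can be sketched as follows. This is an illustration of the rule stated in the class description (D, or both S and D, may be omitted if zero), not the library's source; NotationSketch and notation are hypothetical names.

```java
// Hypothetical sketch class; not part of Commons Compress.
final class NotationSketch {

    /** Formats (B,H,S,D), dropping D when it is zero and S when both S and D are zero. */
    static String notation(int b, int h, int s, int d) {
        final StringBuilder sb = new StringBuilder();
        sb.append('(').append(b).append(',').append(h);
        if (s != 0 || d != 0) {
            sb.append(',').append(s); // S must be shown whenever D is shown
        }
        if (d != 0) {
            sb.append(',').append(d);
        }
        return sb.append(')').toString();
    }

    public static void main(String[] args) {
        System.out.println(notation(1, 256, 0, 0));
        System.out.println(notation(1, 64, 1, 1));
    }
}
```

        Note that S cannot be dropped on its own: an unsigned delta codec still needs its zero S field written out, as in (5,64,0,1), so that D stays in the fourth position.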
      • getB

        public int getB()
        Returns:
        the B value (the maximum number of bytes per encoded value)
      • getH

        public int getH()
        Returns:
        the H value (the radix of the encoding)
      • getS

        public int getS()
        Returns:
        the S value (0 for unsigned, 1 or 2 for signed)
      • getL

        public int getL()
        Returns:
        the L value, defined as 256-H
      • equals

        public boolean equals​(java.lang.Object o)
        Overrides:
        equals in class java.lang.Object
      • hashCode

        public int hashCode()
        Overrides:
        hashCode in class java.lang.Object