DLHN

Overview

DLHN (pronounced the same as "Dullahan") is a language- and platform-neutral binary serialization format inspired by JSON, CSV, MessagePack, and Protocol Buffers. It is designed for blazing fast serialization and deserialization with the smallest possible data size, without requiring a schema file. Support for schema files is being considered for the future.

Data Structures

DLHN data consists of a header part and a body part, which can be stored or sent separately. Streams are also supported.

Header

Header only.

Header

Body

Body only. The most efficient format. However, the deserializer must know the header information in advance.

Body

Header + Body

Header followed by body. Unlike the body-only case, this format does not require the deserializer to know the header information in advance.

Header | Body

Header (Stream)

Stream version with headers only.

Header | Header | Header | Header | ..

Body (Stream)

Stream version with bodies only.

Body | Body | Body | Body | ..

Header + Body (Stream)

Stream version with one header followed by bodies. All bodies must have the type described by that leading header.

Header | Body | Body | Body | Body | ..

(Header + Body) (Stream)

Stream version with headers and bodies. Similar to Header + Body (Stream) but can store different types of data in each pair.

Header | Body | Header | Body | Header | Body | ..

Types

Unit

Represents a unit value (a value that carries no data).

Optional<T>

Represents either Some, which contains a value of type T, or None, which contains no value.

Boolean

Represents a boolean.

UInt8

Represents an 8-bit unsigned integer.

UInt16

Represents a 16-bit unsigned integer.

UInt32

Represents a 32-bit unsigned integer.

UInt64

Represents a 64-bit unsigned integer.

Int8

Represents an 8-bit signed integer.

Int16

Represents a 16-bit signed integer.

Int32

Represents a 32-bit signed integer.

Int64

Represents a 64-bit signed integer.

Float32

Represents an IEEE 754 single precision floating point number.

Float64

Represents an IEEE 754 double precision floating point number.

BigUInt

Represents an arbitrary precision unsigned integer.

BigInt

Represents an arbitrary precision signed integer.

BigDecimal

Represents an arbitrary precision signed decimal number.

String

Represents a UTF-8 string.

Binary

Represents a byte array.

Array<T>

Represents a sequence of values of the same type T.

Tuple<(T1, ..)>

Represents a collection of values of different types.

Map<T>

Represents a collection of key-value pairs with String keys and values that all share the same type T.

Enum { Field1(T1), Field..(T..) }

Represents a custom defined type which may be one of a few different variants.

Date

Represents a proleptic Gregorian calendar date.

DateTime

Represents a time-zone-independent instant with nanosecond precision, counted from 1970-01-01T00:00Z.

Body

Unit

0 byte

A value of type Unit. No bytes are written during serialization, and no bytes are read during deserialization. To make a value nullable, use the Optional type instead.

Unit

Optional<T>

1 ~ N bytes

Optional can store a single value of type T inside, and has a state of None or Some(value: T).

None
0x00

Some | value: T
0x01 | serialize_body(value)

None: Optional<Boolean>

None
0x00

Some(true): Optional<Boolean>

Some | value: Boolean
0x01 | 0x01
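
The Optional layout above can be sketched in Python as follows. This is only an illustration under stated assumptions: the names `encode_boolean` and `encode_optional` are hypothetical and do not come from any DLHN implementation, and Python's `None` stands in for the None state.

```python
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def encode_boolean(value: bool) -> bytes:
    """Boolean body: 0x01 for true, 0x00 for false."""
    return b"\x01" if value else b"\x00"


def encode_optional(value: Optional[T], encode_inner: Callable[[T], bytes]) -> bytes:
    """Optional body: 0x00 for None, or 0x01 followed by the body of the inner value."""
    if value is None:
        return b"\x00"
    return b"\x01" + encode_inner(value)


print(encode_optional(None, encode_boolean).hex())  # None: Optional<Boolean>
print(encode_optional(True, encode_boolean).hex())  # Some(true): Optional<Boolean>
```

Passing the inner encoder as a parameter mirrors the way the layout nests `serialize_body(value)` inside the Some case.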

Boolean

1 byte

Stores a true or false value.

value: Boolean
serialize_body(value: Boolean)

false

Boolean value
0x00

true

Boolean value
0x01

UInt8

1 byte

An 8-bit unsigned integer, stored as a single raw byte. (Note: PrefixVarint is not used.)

value: UInt8
serialize_body(value: UInt8)

0

value: UInt8
0x00

1

value: UInt8
0x01

255

value: UInt8
0xff

UInt16

1 ~ 3 bytes

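A 16-bit unsigned integer, encoded with PrefixVarint. As the examples below show, values 0 to 127 take 1 byte, values up to 16383 take 2 bytes, and larger values take 3 bytes.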

value: UInt16
serialize_body(value: UInt16)

0

value: UInt16
0x00

1

value: UInt16
0x01

127

value: UInt16
0x7f

128

value: UInt16
0x8002

16383

value: UInt16
0xbfff

16384

value: UInt16
0xc00040

65535

value: UInt16
0xc0ffff
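
The UInt16 examples above suggest the following bit layout, sketched here in Python. The layout is inferred from the example bytes rather than from a normative description, and the function name `encode_uint16` is hypothetical: a 1-byte form holds 7 bits, a 2-byte form holds the low 6 bits in the first byte after a `10` marker, and the 3-byte form is a `0xc0` marker followed by the raw little-endian value.

```python
def encode_uint16(value: int) -> bytes:
    """Encode a UInt16 body as inferred from the examples (1 to 3 bytes)."""
    if not 0 <= value <= 0xFFFF:
        raise ValueError("value out of range for UInt16")
    if value < 1 << 7:
        # 0xxxxxxx: the value itself.
        return bytes([value])
    if value < 1 << 14:
        # 10xxxxxx: low 6 bits in the first byte, remaining 8 bits in the second.
        return bytes([0x80 | (value & 0x3F), value >> 6])
    # 0xc0 marker followed by the full 16-bit value in little-endian order.
    return b"\xc0" + value.to_bytes(2, "little")


for sample in (0, 127, 128, 16383, 16384, 65535):
    print(sample, "0x" + encode_uint16(sample).hex())
```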

UInt32

1 ~ 5 bytes

A 32-bit unsigned integer, encoded with PrefixVarint.

value: UInt32
serialize_body(value: UInt32)

0

value: UInt32
0x00

1

value: UInt32
0x01

127

value: UInt32
0x7f

128

value: UInt32
0x8002

16383

value: UInt32
0xbfff

16384

value: UInt32
0xc00002

2097151

value: UInt32
0xdfffff

2097152

value: UInt32
0xe0000002

268435455

value: UInt32
0xefffffff

268435456

value: UInt32
0xf000000010

4294967295

value: UInt32
0xf0ffffffff
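
Reading a UInt32 back can be sketched as counting the leading one bits of the first byte to find how many bytes follow. This Python decoder is inferred from the examples above and is not taken from a reference implementation; `decode_uint32` is a hypothetical name, and it returns the value together with the number of bytes consumed.

```python
from typing import Tuple


def decode_uint32(data: bytes) -> Tuple[int, int]:
    """Decode a UInt32 body; returns (value, bytes consumed)."""
    first = data[0]
    extra = 0  # number of bytes that follow the first byte
    while extra < 4 and first & (0x80 >> extra):
        extra += 1
    if extra == 4:
        # 0xf0 marker: the next 4 bytes are the raw little-endian value.
        return int.from_bytes(data[1:5], "little"), 5
    low_bits = 7 - extra  # value bits stored in the first byte
    low = first & ((1 << low_bits) - 1)
    high = int.from_bytes(data[1:1 + extra], "little")
    return low | (high << low_bits), 1 + extra


print(decode_uint32(bytes.fromhex("e0000002")))
```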

UInt64

1 ~ 9 bytes

A 64-bit unsigned integer, encoded with PrefixVarint.

value: UInt64
serialize_body(value: UInt64)

0

value: UInt64
0x00

1

value: UInt64
0x01

127

value: UInt64
0x7f

128

value: UInt64
0x8002

16383

value: UInt64
0xbfff

16384

value: UInt64
0xc00002

2097151

value: UInt64
0xdfffff

2097152

value: UInt64
0xe0000002

268435455

value: UInt64
0xefffffff

268435456

value: UInt64
0xf000000002

34359738367

value: UInt64
0xf7ffffffff

34359738368

value: UInt64
0xf80000000002

4398046511103

value: UInt64
0xfbffffffffff

4398046511104

value: UInt64
0xfc000000000002

562949953421311

value: UInt64
0xfdffffffffffff

562949953421312

value: UInt64
0xfe00000000000002

72057594037927935

value: UInt64
0xfeffffffffffffff

72057594037927936

value: UInt64
0xff0000000000000001

18446744073709551615

value: UInt64
0xffffffffffffffffff
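
The UInt16, UInt32, and UInt64 examples appear to follow one scheme that differs only in the widest form. The Python sketch below generalizes that pattern; it is inferred from the example bytes, and `encode_prefix_varint` and its `width` parameter (the byte width of the integer type) are hypothetical names.

```python
def encode_prefix_varint(value: int, width: int) -> bytes:
    """Encode an unsigned integer whose type is `width` bytes wide (2, 4, or 8)."""
    if width not in (2, 4, 8) or not 0 <= value < 1 << (8 * width):
        raise ValueError("value does not fit the requested width")
    for size in range(1, width + 1):
        if value < 1 << (7 * size):
            low_bits = 8 - size                   # value bits kept in the first byte
            marker = (0xFF << (9 - size)) & 0xFF  # size - 1 leading one bits, then a zero
            first = marker | (value & ((1 << low_bits) - 1))
            rest = (value >> low_bits).to_bytes(size - 1, "little")
            return bytes([first]) + rest
    # Widest form: a marker byte followed by the raw little-endian value.
    return bytes([(0xFF << (8 - width)) & 0xFF]) + value.to_bytes(width, "little")


# The same number can encode differently depending on the declared type.
print(encode_prefix_varint(16384, 2).hex())  # as UInt16
print(encode_prefix_varint(16384, 4).hex())  # as UInt32
```

The widest form is the reason 16384 is `0xc00040` as a UInt16 but `0xc00002` as a UInt32 or UInt64 in the examples.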

Int8

1 byte

An 8-bit signed integer, stored as a single raw two's complement byte. (Note: ZigZag encoding and PrefixVarint are not used.)

value: Int8
serialize_body(value: Int8)

-128

value: Int8
0x80

-1

value: Int8
0xff

0

value: Int8
0x00

1

value: Int8
0x01

127

value: Int8
0x7f

Int16

1 ~ 3 bytes

A 16-bit signed integer, encoded with ZigZag encoding followed by PrefixVarint.

value: Int16
serialize_body(value: Int16)

-32768

value: Int16
0xc0ffff

-8193

value: Int16
0xc00140

-8192

value: Int16
0xbfff

-65

value: Int16
0x8102

-64

value: Int16
0x7f

-1

value: Int16
0x01

0

value: Int16
0x00

1

value: Int16
0x02

63

value: Int16
0x7e

64

value: Int16
0x8002

8191

value: Int16
0xbeff

8192

value: Int16
0xc00040

32767

value: Int16
0xc0feff
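
The Int16 examples are consistent with first mapping the signed value to an unsigned one with ZigZag (0, -1, 1, -2, ... become 0, 1, 2, 3, ...) and then applying the UInt16 PrefixVarint form. A Python sketch under that assumption, with hypothetical function names:

```python
def zigzag16(value: int) -> int:
    """Map a signed 16-bit value to an unsigned one so small magnitudes stay small."""
    if not -0x8000 <= value <= 0x7FFF:
        raise ValueError("value out of range for Int16")
    return (value << 1) ^ (value >> 15)


def encode_int16(value: int) -> bytes:
    """Encode an Int16 body: ZigZag, then the 1 to 3 byte UInt16 form."""
    z = zigzag16(value)
    if z < 1 << 7:
        return bytes([z])
    if z < 1 << 14:
        return bytes([0x80 | (z & 0x3F), z >> 6])
    return b"\xc0" + z.to_bytes(2, "little")


for sample in (-65, -64, -1, 0, 1, 63, 64):
    print(sample, "0x" + encode_int16(sample).hex())
```

ZigZag is why -64 still fits in one byte while 64 needs two: both map to neighbors of 127 and 128.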

Int32

1 ~ 5 bytes

A 32-bit signed integer, encoded with ZigZag encoding followed by PrefixVarint.

value: Int32
serialize_body(value: Int32)

-2147483648

value: Int32
0xf0ffffffff

-134217729

value: Int32
0xf001000010

-134217728

value: Int32
0xefffffff

-1048577

value: Int32
0xe1000002

-1048576

value: Int32
0xdfffff

-8193

value: Int32
0xc10002

-8192

value: Int32
0xbfff

-65

value: Int32
0x8102

-64

value: Int32
0x7f

-1

value: Int32
0x01

0

value: Int32
0x00

1

value: Int32
0x02

63

value: Int32
0x7e

64

value: Int32
0x8002

8191

value: Int32
0xbeff

8192

value: Int32
0xc00002

1048575

value: Int32
0xdeffff

1048576

value: Int32
0xe0000002

134217727

value: Int32
0xeeffffff

134217728

value: Int32
0xf000000010

2147483647

value: Int32
0xf0feffffff

Int64

1 ~ 9 bytes

A 64-bit signed integer, encoded with ZigZag encoding followed by PrefixVarint.

value: Int64
serialize_body(value: Int64)

-9223372036854775808

value: Int64
0xffffffffffffffffff

-36028797018963969

value: Int64
0xff0100000000000001

-36028797018963968

value: Int64
0xfeffffffffffffff

-281474976710657

value: Int64
0xfe01000000000002

-281474976710656

value: Int64
0xfdffffffffffff

-2199023255553

value: Int64
0xfd000000000002

-2199023255552

value: Int64
0xfbffffffffff

-17179869185

value: Int64
0xf90000000002

-17179869184

value: Int64
0xf7ffffffff

-134217729

value: Int64
0xf100000002

-134217728

value: Int64
0xefffffff

-1048577

value: Int64
0xe1000002

-1048576

value: Int64
0xdfffff

-8193

value: Int64
0xc10002

-8192

value: Int64
0xbfff

-65

value: Int64
0x8102

-64

value: Int64
0x7f

-1

value: Int64
0x01

0

value: Int64
0x00

1

value: Int64
0x02

63

value: Int64
0x7e

64

value: Int64
0x8002

8191

value: Int64
0xbeff

8192

value: Int64
0xc00002

1048575

value: Int64
0xdeffff

1048576

value: Int64
0xe0000002

134217727

value: Int64
0xeeffffff

134217728

value: Int64
0xf000000002

17179869183

value: Int64
0xf6ffffffff

17179869184

value: Int64
0xf80000000002

2199023255551

value: Int64
0xfaffffffffff

2199023255552

value: Int64
0xfc000000000002

281474976710655

value: Int64
0xfcffffffffffff

281474976710656

value: Int64
0xfe00000000000002

36028797018963967

value: Int64
0xfefeffffffffffff

36028797018963968

value: Int64
0xff0000000000000001

9223372036854775807

value: Int64
0xfffeffffffffffffff
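
Decoding an Int64 reverses the two steps: read the PrefixVarint, then undo ZigZag. The Python sketch below is inferred from the examples above; `decode_uint64` and `decode_int64` are hypothetical names, and each returns the value together with the number of bytes consumed.

```python
from typing import Tuple


def decode_uint64(data: bytes) -> Tuple[int, int]:
    """Read a 1 to 9 byte PrefixVarint; returns (value, bytes consumed)."""
    first = data[0]
    extra = 0  # number of bytes that follow the first byte
    while extra < 8 and first & (0x80 >> extra):
        extra += 1
    if extra == 8:
        # 0xff marker: the next 8 bytes are the raw little-endian value.
        return int.from_bytes(data[1:9], "little"), 9
    low_bits = 7 - extra
    low = first & ((1 << low_bits) - 1)
    high = int.from_bytes(data[1:1 + extra], "little")
    return low | (high << low_bits), 1 + extra


def decode_int64(data: bytes) -> Tuple[int, int]:
    """Read an Int64 body: PrefixVarint, then inverse ZigZag."""
    z, used = decode_uint64(data)
    return (z >> 1) ^ -(z & 1), used


print(decode_int64(bytes.fromhex("f90000000002")))
```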

Float32

4 bytes

A 32-bit float, stored as an IEEE 754 single-precision floating-point number in little-endian byte order.

value: Float32
serialize_body(value: Float32)

-Infinity

value: Float32
0x000080ff

-1.1

value: Float32
0xcdcc8cbf

0

value: Float32
0x00000000

1.1

value: Float32
0xcdcc8c3f

Infinity

value: Float32
0x0000807f

NaN

value: Float32
0x0000c07f

Float64

8 bytes

A 64-bit float, stored as an IEEE 754 double-precision floating-point number in little-endian byte order.

value: Float64
serialize_body(value: Float64)

-Infinity

value: Float64
0x000000000000f0ff

-1.1

value: Float64
0x9a9999999999f1bf

0

value: Float64
0x0000000000000000

1.1

value: Float64
0x9a9999999999f13f

Infinity

value: Float64
0x000000000000f07f

NaN

value: Float64
0x000000000000f87f
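
Both float types are fixed-width, so they can be reproduced with Python's standard `struct` module. The function names below are hypothetical. The NaN examples above show one particular quiet NaN bit pattern; the sketch makes no claim about which NaN payload a given platform produces.

```python
import struct


def encode_float32(value: float) -> bytes:
    """Float32 body: IEEE 754 single precision, little-endian, always 4 bytes."""
    return struct.pack("<f", value)


def encode_float64(value: float) -> bytes:
    """Float64 body: IEEE 754 double precision, little-endian, always 8 bytes."""
    return struct.pack("<d", value)


print(encode_float32(1.1).hex())
print(encode_float64(1.1).hex())
```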

BigUInt

1 ~ N bytes

An arbitrary-precision unsigned integer. Stores the byte length of the value ( UInt64 ), followed by the value as little-endian bytes. If the value is 0, the byte length is 0 and no value bytes are stored.

Bytes of serialized BigUInt value: UInt64 | value: BigUInt
serialize_body(Bytes of serialized BigUInt value: UInt64) | serialize_body(value: BigUInt)

0

Bytes of serialized BigUInt value: UInt64
0x00

1234567890

Bytes of serialized BigUInt value: UInt64 | value: BigUInt
0x04 | 0xd2029649
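
A minimal Python sketch of the BigUInt layout, assuming the length prefix uses the same 1 to 9 byte PrefixVarint as UInt64 and that the value is stored in the fewest bytes that hold it (which matches the 4-byte example above). The function names are hypothetical.

```python
def encode_uint64(value: int) -> bytes:
    """PrefixVarint as inferred from the UInt64 examples (1 to 9 bytes)."""
    for size in range(1, 9):
        if value < 1 << (7 * size):
            low_bits = 8 - size
            first = ((0xFF << (9 - size)) & 0xFF) | (value & ((1 << low_bits) - 1))
            return bytes([first]) + (value >> low_bits).to_bytes(size - 1, "little")
    return b"\xff" + value.to_bytes(8, "little")


def encode_biguint(value: int) -> bytes:
    """BigUInt body: byte length, then the value as little-endian bytes."""
    if value < 0:
        raise ValueError("BigUInt cannot be negative")
    length = (value.bit_length() + 7) // 8  # 0 for the value 0
    return encode_uint64(length) + value.to_bytes(length, "little")


print(encode_biguint(1234567890).hex())
```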

BigInt

1 ~ N bytes

An arbitrary-precision signed integer. Stores the byte length of the value ( UInt64 ), followed by the value as little-endian bytes; as the -1234567890 example below shows, negative values are stored in two's complement. If the value is 0, the byte length is 0 and no value bytes are stored.

Bytes of serialized BigInt value: UInt64 | value: BigInt
serialize_body(Bytes of serialized BigInt value: UInt64) | serialize_body(value: BigInt)

0

Bytes of serialized BigInt value: UInt64
0x00

1234567890

Bytes of serialized BigInt value: UInt64 | value: BigInt
0x04 | 0xd2029649

-1234567890

Bytes of serialized BigInt value: UInt64 | value: BigInt
0x04 | 0x2efd69b6

BigDecimal

1 ~ N bytes

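An arbitrary-precision signed decimal number. Stores an unscaled integer value ( BigInt ) followed by a scale ( Int64 ); the represented number is the unscaled value multiplied by 10 to the power of the negated scale, so 1.23 is stored as unscaled value 123 with scale 2.

unscaled value: BigInt | scale: Int64
serialize_body(unscaled value: BigInt) | serialize_body(scale: Int64)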

0

unscaled value: BigInt
0x00

1.23

unscaled value: BigInt | scale: Int64
0x017b | 0x04

-1.23

unscaled value: BigInt | scale: Int64
0x0185 | 0x04
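
The BigDecimal examples can be reproduced with Python's `decimal` module, as sketched below. Several points are assumptions read off the examples rather than stated rules: the BigInt part uses the shortest two's complement form, the scale is the ZigZag-encoded count of fractional digits, and a zero value is written as the single byte `0x00` with no scale (the 0 example shows only the unscaled value). All function names are hypothetical.

```python
from decimal import Decimal


def encode_uint64(value: int) -> bytes:
    """PrefixVarint as inferred from the UInt64 examples (1 to 9 bytes)."""
    for size in range(1, 9):
        if value < 1 << (7 * size):
            low_bits = 8 - size
            first = ((0xFF << (9 - size)) & 0xFF) | (value & ((1 << low_bits) - 1))
            return bytes([first]) + (value >> low_bits).to_bytes(size - 1, "little")
    return b"\xff" + value.to_bytes(8, "little")


def encode_int64(value: int) -> bytes:
    """Int64 body: ZigZag, then PrefixVarint."""
    return encode_uint64((value << 1) ^ (value >> 63))


def encode_bigint(value: int) -> bytes:
    """BigInt body: byte length, then little-endian two's complement bytes."""
    if value == 0:
        return b"\x00"
    magnitude_bits = (value if value >= 0 else ~value).bit_length()
    length = magnitude_bits // 8 + 1  # leaves room for the sign bit
    return encode_uint64(length) + value.to_bytes(length, "little", signed=True)


def encode_bigdecimal(value: Decimal) -> bytes:
    """BigDecimal body: unscaled BigInt, then the scale as Int64."""
    if not value.is_finite():
        raise ValueError("only finite decimals can be encoded")
    sign, digits, exponent = value.as_tuple()
    unscaled = int("".join(map(str, digits)))
    if sign:
        unscaled = -unscaled
    if unscaled == 0:
        return b"\x00"  # assumption: zero omits the scale, as in the 0 example
    return encode_bigint(unscaled) + encode_int64(-exponent)


print(encode_bigdecimal(Decimal("1.23")).hex())
```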

String

1 ~ N bytes

A UTF-8 string. Stores the byte length of the string ( UInt64 ), followed by the UTF-8 bytes. If the string is empty, only the length (0) is stored.

Bytes of serialized String value: UInt64 | value: String
serialize_body(Bytes of serialized String value: UInt64) | serialize_body(value: String)

""

Bytes of serialized String value: UInt64
0x00

"Test"

Bytes of serialized String value: UInt64 | value: String
0x04 | 0x54657374
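
A short Python sketch of the String layout, with hypothetical function names and the PrefixVarint length prefix inferred from the UInt64 examples. Note that the prefix counts UTF-8 bytes, not characters, so a single non-ASCII character can contribute more than one to the length.

```python
def encode_uint64(value: int) -> bytes:
    """PrefixVarint as inferred from the UInt64 examples (1 to 9 bytes)."""
    for size in range(1, 9):
        if value < 1 << (7 * size):
            low_bits = 8 - size
            first = ((0xFF << (9 - size)) & 0xFF) | (value & ((1 << low_bits) - 1))
            return bytes([first]) + (value >> low_bits).to_bytes(size - 1, "little")
    return b"\xff" + value.to_bytes(8, "little")


def encode_string(value: str) -> bytes:
    """String body: UTF-8 byte length, then the UTF-8 bytes."""
    raw = value.encode("utf-8")
    return encode_uint64(len(raw)) + raw


print(encode_string("Test").hex())
print(encode_string("é").hex())  # one character, two UTF-8 bytes
```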

Binary

1 ~ N bytes

A byte array. Stores the length of the byte sequence in bytes ( UInt64 ), followed by the byte sequence itself. If the byte sequence is empty, only the length (0) is stored.

Bytes of serialized Binary value: UInt64 | value: Binary
serialize_body(Bytes of serialized Binary value: UInt64) | serialize_body(value: Binary)

[]

Bytes of serialized Binary value: UInt64
0x00

[1, 2, 3]

Bytes of serialized Binary value: UInt64 | value: Binary
0x03 | 0x010203

Array<T>

1 ~ N bytes

An array of values of type T. Stores the number of elements ( UInt64 ), followed by the encoded elements in order.

Number of values in Array: UInt64 | Array[0] value: T | Array[..] value: T
serialize_body(Number of values in Array: UInt64) | serialize_body(Array[0] value: T) | serialize_body(Array[..] value: T) | ..

[]: Array<UInt8>

Number of values in Array: UInt64
0x00

[1, 2, 3]: Array<UInt8>

Number of values in Array: UInt64 | Array[0] value: UInt8 | Array[1] value: UInt8 | Array[2] value: UInt8
0x03 | 0x01 | 0x02 | 0x03
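
The Array layout can be sketched in Python as a count followed by each element's body. The element encoder is passed in so the same function serves any `T`; all names are hypothetical, and the count uses the PrefixVarint inferred from the UInt64 examples.

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def encode_uint64(value: int) -> bytes:
    """PrefixVarint as inferred from the UInt64 examples (1 to 9 bytes)."""
    for size in range(1, 9):
        if value < 1 << (7 * size):
            low_bits = 8 - size
            first = ((0xFF << (9 - size)) & 0xFF) | (value & ((1 << low_bits) - 1))
            return bytes([first]) + (value >> low_bits).to_bytes(size - 1, "little")
    return b"\xff" + value.to_bytes(8, "little")


def encode_uint8(value: int) -> bytes:
    """UInt8 body: one raw byte."""
    return bytes([value])


def encode_array(values: Sequence[T], encode_element: Callable[[T], bytes]) -> bytes:
    """Array body: element count, then every element body in order."""
    return encode_uint64(len(values)) + b"".join(encode_element(v) for v in values)


print(encode_array([1, 2, 3], encode_uint8).hex())
```

For `Array<UInt8>` the result happens to match the Binary example, because both store a count followed by raw bytes.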

Tuple<(T1, ..)>

N bytes

A tuple. Stores the encoded values in the order in which their types are defined. No element count is stored.

value: T1 | value: T..
serialize_body(value: T1) | serialize_body(value: T..) | ..

(123, "Test"): (UInt8, String)

value: UInt8 | value: String
0x7b | 0x0454657374

Map<T>

1 ~ N bytes

A map. Stores the number of elements ( UInt64 ), followed by each element encoded as its key and then its value. Keys must be of type String, and all values must have the same type T.

Number of elements in Map: UInt64 | key: String | value: T | key: String.. | value: T..
serialize_body(Number of elements in Map: UInt64) | serialize_body(key: String) | serialize_body(value: T) | serialize_body(key: String).. | serialize_body(value: T)..

Map { "field1": true, "field2": false }: Map<Boolean>

Number of elements in Map: UInt64 | key: String | value: Boolean | key: String | value: Boolean
0x02 | 0x066669656c6432 | 0x00 | 0x066669656c6431 | 0x01
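
A Python sketch of the Map layout with hypothetical function names. The text does not say in which order elements are written, and the example bytes above list "field2" before "field1"; the sketch simply writes entries in the iteration order of the dictionary it is given, which is an assumption and not a stated rule.

```python
from typing import Callable, Dict, TypeVar

T = TypeVar("T")


def encode_uint64(value: int) -> bytes:
    """PrefixVarint as inferred from the UInt64 examples (1 to 9 bytes)."""
    for size in range(1, 9):
        if value < 1 << (7 * size):
            low_bits = 8 - size
            first = ((0xFF << (9 - size)) & 0xFF) | (value & ((1 << low_bits) - 1))
            return bytes([first]) + (value >> low_bits).to_bytes(size - 1, "little")
    return b"\xff" + value.to_bytes(8, "little")


def encode_string(value: str) -> bytes:
    raw = value.encode("utf-8")
    return encode_uint64(len(raw)) + raw


def encode_boolean(value: bool) -> bytes:
    return b"\x01" if value else b"\x00"


def encode_map(entries: Dict[str, T], encode_value: Callable[[T], bytes]) -> bytes:
    """Map body: element count, then key body and value body for each entry."""
    out = bytearray(encode_uint64(len(entries)))
    for key, value in entries.items():
        out += encode_string(key)
        out += encode_value(value)
    return bytes(out)


# Entries are listed in the order in which they appear in the example bytes.
print(encode_map({"field2": False, "field1": True}, encode_boolean).hex())
```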

Enum { Field1(T1), Field..(T..) }

1 ~ N bytes

An enum. Stores the zero-based field number ( UInt64 ) of the selected field, followed by the encoded value or values of that field.

Field number: UInt64 | Field value: T..
serialize_body(Field number: UInt64) | serialize_body(Field value: T..)

Enum::B(123): Enum { A(Boolean), B(UInt8), C(Boolean, String) }

Field number: UInt64 | Field value: UInt8
0x01 | 0x7b
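
The Enum example can be sketched in Python with a small table of fields. The `VARIANTS` table and the function names are hypothetical, and the handling of a field with several values, such as `C(Boolean, String)`, assumes they are written one after another like a tuple; the text above only shows a single-value field.

```python
def encode_uint64(value: int) -> bytes:
    """PrefixVarint as inferred from the UInt64 examples (1 to 9 bytes)."""
    for size in range(1, 9):
        if value < 1 << (7 * size):
            low_bits = 8 - size
            first = ((0xFF << (9 - size)) & 0xFF) | (value & ((1 << low_bits) - 1))
            return bytes([first]) + (value >> low_bits).to_bytes(size - 1, "little")
    return b"\xff" + value.to_bytes(8, "little")


def encode_boolean(value: bool) -> bytes:
    return b"\x01" if value else b"\x00"


def encode_uint8(value: int) -> bytes:
    return bytes([value])


def encode_string(value: str) -> bytes:
    raw = value.encode("utf-8")
    return encode_uint64(len(raw)) + raw


# Enum { A(Boolean), B(UInt8), C(Boolean, String) }; list position is the field number.
VARIANTS = [
    ("A", [encode_boolean]),
    ("B", [encode_uint8]),
    ("C", [encode_boolean, encode_string]),
]


def encode_enum(name: str, *values) -> bytes:
    """Enum body: field number, then the body of each value in the field."""
    for index, (variant, encoders) in enumerate(VARIANTS):
        if variant == name:
            if len(values) != len(encoders):
                raise ValueError("wrong number of values for field " + name)
            payload = b"".join(enc(v) for enc, v in zip(encoders, values))
            return encode_uint64(index) + payload
    raise KeyError(name)


print(encode_enum("B", 123).hex())
```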

Date

2 ~ 8 bytes

A date. Stores a year ( Int32 ), where the year 2000 is represented as 0, followed by an ordinal day of the year ( UInt16 ), where January 1 is represented as 0.

Year value: Int32 | Ordinal value: UInt16
serialize_body(Year value: Int32) | serialize_body(Ordinal value: UInt16)

2000-01-01

Year value: Int32 | Ordinal value: UInt16
0x00 | 0x00

2020-08-04

Year value: Int32 | Ordinal value: UInt16
0x28 | 0x9803
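
The Date examples can be reproduced in Python from a standard `datetime.date`. The sketch assumes the year offset is ZigZag-encoded as an Int32 and the zero-based day of the year is a plain UInt16, which is what the example bytes imply (2020-08-04 is day 216 of a leap year counting from 0). The function names are hypothetical, and the varint helper is inferred from the integer examples.

```python
from datetime import date


def encode_prefix_varint(value: int, width: int) -> bytes:
    """PrefixVarint for an unsigned integer type that is `width` bytes wide."""
    for size in range(1, width + 1):
        if value < 1 << (7 * size):
            low_bits = 8 - size
            first = ((0xFF << (9 - size)) & 0xFF) | (value & ((1 << low_bits) - 1))
            return bytes([first]) + (value >> low_bits).to_bytes(size - 1, "little")
    return bytes([(0xFF << (8 - width)) & 0xFF]) + value.to_bytes(width, "little")


def encode_date(value: date) -> bytes:
    """Date body: (year - 2000) as Int32, then zero-based day of year as UInt16."""
    year = value.year - 2000
    ordinal = value.timetuple().tm_yday - 1   # January 1 becomes 0
    zigzag_year = (year << 1) ^ (year >> 31)  # ZigZag for a 32-bit signed value
    return encode_prefix_varint(zigzag_year, 4) + encode_prefix_varint(ordinal, 2)


print(encode_date(date(2020, 8, 4)).hex())
```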

DateTime

2 ~ 14 bytes

A date and time. Stores the whole seconds ( Int64 ) elapsed since 1970-01-01 00:00:00 UTC, followed by the nanoseconds within that second ( UInt32 ).

Seconds value: Int64 | Nanoseconds value: UInt32
serialize_body(Seconds value: Int64) | serialize_body(Nanoseconds value: UInt32)

1970-01-01 00:00:00.000000000

Seconds value: Int64 | Nanoseconds value: UInt32
0x00 | 0x00

2020-08-04 12:34:56.123456789

Seconds value: Int64 | Nanoseconds value: UInt32
0xf07c55ca17 | 0xe5d1bc75
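
A Python sketch of the DateTime layout. Python's `datetime` only carries microseconds, so the sketch takes the seconds and nanoseconds as separate integers and uses `datetime` only to obtain the seconds for the example. The function names are hypothetical, and the varint helper is inferred from the integer examples.

```python
from datetime import datetime, timezone


def encode_prefix_varint(value: int, width: int) -> bytes:
    """PrefixVarint for an unsigned integer type that is `width` bytes wide."""
    for size in range(1, width + 1):
        if value < 1 << (7 * size):
            low_bits = 8 - size
            first = ((0xFF << (9 - size)) & 0xFF) | (value & ((1 << low_bits) - 1))
            return bytes([first]) + (value >> low_bits).to_bytes(size - 1, "little")
    return bytes([(0xFF << (8 - width)) & 0xFF]) + value.to_bytes(width, "little")


def encode_datetime(seconds: int, nanoseconds: int) -> bytes:
    """DateTime body: seconds since the epoch as Int64, then nanoseconds as UInt32."""
    if not 0 <= nanoseconds < 1_000_000_000:
        raise ValueError("nanoseconds must be within one second")
    zigzag_seconds = (seconds << 1) ^ (seconds >> 63)  # ZigZag for a 64-bit signed value
    return encode_prefix_varint(zigzag_seconds, 8) + encode_prefix_varint(nanoseconds, 4)


moment = datetime(2020, 8, 4, 12, 34, 56, tzinfo=timezone.utc)
print(encode_datetime(int(moment.timestamp()), 123_456_789).hex())
```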