Rlay Ontology - Serialization Formats

The rlay_ontology crate provides multiple serialization formats for Rlay Ontology Entities:

Protobuf based format - currently the main format used in the Solidity OntologyStorage, and the one used for calculating the CIDs
Web3 / JSON based format - Used for the JSONRPC in rlay-client and the representation in the Javascript libraries
v0 CBOR format - CBOR based format for future use. Still under development and not fully specified.

Protobuf based format

Pros:

Protobuf libaries were easily available for prototyping in Rust and Solidity at time of creation
Via ordered fields in protobuf schemas, it is pretty easy to have a determenistic content-addressable format
Low size overhead over contents

Cons:

Protobuf is comparatively complex for the simple features we need of it
As the protobuf encoding doesn't contain any information about the entity kind, the entity kind has to be known for the encoding to be correctly interpreted
Per-EntityKind CID multicodecs would require a lot of codecs to be registered/coordinated
Unwieldy to use in end-user applications

Used for CID calculation
See ontology.proto for the Protobuf schema
Before calculating the CID, all the Array fields are sorted, to bring the entity into a determenistic canonicalized format
Values in non-CID bytes fields is not strictly defined but assumed to be CBOR encoded
Each EntityKind has a different 3 byte CID multicodec (which are not registered in the official multicodec list).
Uses a keccak-256 hash of the protobuf encoding of the entity for CID calculation
Example CID in hex encoding:

019580031b2088868a58d3aac6d2558a29b3b8cacf3c9788364f57a3470158283121a15dcae0
^^      ^^  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
| ^^^^^^ |^^     |
|   |    | |     |
|   |    | |     32 byte keccak-256 hash of protobuf encoded entity
|   |    | |
|   |    | 1 byte variint specifying a multihash length of 32
|   |    |
|   |    Multihash identifier for a keccak-256 hash
|   |
|   3 byte multicodec for entity kind (in this case Annotation)
|
1 byte varint for CID version

Web3 / JSON based format

Pros:

Easy to use in end user applications, where entities are read/created/modified
0x encoding for byte fields fits in with Web3 ecosystem

Cons:

As keys in JSON are often not ordered in implementations, not well suited for producing hashes for CIDs
0x encoding doesn't fit in with multiformats
Not size efficient

JSON based
The type field contains the EntityKind of the entity
Values in non-CID bytes fields is not strictly defined but assumed to be CBOR encoded

Example:

{
  "type": "Annotation",
  "property": "0x019780031b20b3179194677268c88cfd1644c6a1e100729465b42846a2bf7f0bddcd07e300a9",
  "value": "0x664b72616b656e"
}