mmosko/ccnpy

Pure Python CCNx 1.0

ccnpy is a pure python implementation of the CCNx 1.0 protocols (RFC 8609 and RFC 8569).

The implementation focuses on the client libraries used to consume or produce content and to organize it in manifests. There is no plan to create a python CCNx forwarder. Currently, the code only writes packets, in wire format, to files or reads them from files; there are no network operations.

The primary use of this code, at the moment, is to prototype the FLIC Manifest specification. Everything is still in play at the moment and this is not a final specification or implementation yet.

This project uses poetry for the python build system.

Usage

Application Interface

  • ccnpy.apps.manifest_writer: slice up a file into nameless data content objects and organize them into a manifest tree. The output packets are written to a file system directory.
  • ccnpy.apps.packet_reader: reads a packet from the file system and decodes it. Still a little messy on the display.
  • ccnpy.apps.manifest_reader: given a manifest name, assembles the application data and writes it to a file. (IN PROGRESS)

Programming Interfaces

  • ccnpy.core: This package has the main CCNx objects.
  • ccnpy.flic: The FLIC objects for manifests
  • ccnpy.flic.tree: Tree building and related classes.
  • ccnpy.flic.presharedkey: The preshared key encryptor/decryptor for manifests
  • ccnpy.crypto: Crypto algorithms for AES and RSA. Used by encryptor/decryptor and ccnpy signers and verifiers.

Three CCNx Modes

As per the FLIC specification, Sec 3.9.1, there are three main ways that CCNx can use FLIC: Hash Schema, Single Prefix Schema, and Segmented Schema. We do not repeat the text of the specification here; we only give an overview of the usage.

Hash Schema

In this mode, there is one CCNx name associated with the root manifest and a CCNx locator used to fetch the nameless objects (top manifest, internal manifests, and data objects). The manifests may use one locator and the data objects could use a second locator. For example, the nameless object manifests could be stored under ccnx:/foo and the data stored under ccnx:/bar.

manifest_writer --schema Hashed --name RN [--manifest-locator ML] [--data-locator DL] ...

Nameless objects require a locator. The default is to use RN as the locator for all nameless objects. If ML is given, then ML is used as the manifest locator instead of RN. If DL is given, then DL is used for the data locator instead of RN.

Specifying ML or DL causes manifest and data to use separate hash groups.
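
The locator defaulting rules above can be summarized in a few lines of Python. This is only a sketch of the rules as described in this section, not the code inside manifest_writer; the function name resolve_hash_locators and the returned dictionary keys are hypothetical.

```python
from typing import Optional


def resolve_hash_locators(root_name: str,
                          manifest_locator: Optional[str] = None,
                          data_locator: Optional[str] = None) -> dict:
    """Apply the Hash Schema defaults: RN is the locator unless ML or DL is given."""
    ml = manifest_locator if manifest_locator is not None else root_name
    dl = data_locator if data_locator is not None else root_name
    # Per the text above: specifying ML or DL puts manifests and data
    # into separate hash groups.
    separate = manifest_locator is not None or data_locator is not None
    return {
        "manifest_locator": ml,
        "data_locator": dl,
        "separate_hash_groups": separate,
    }


if __name__ == "__main__":
    print(resolve_hash_locators("ccnx:/foo.com/object"))
    print(resolve_hash_locators("ccnx:/foo.com/object",
                                manifest_locator="ccnx:/foo",
                                data_locator="ccnx:/bar"))
```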

Single Prefix Schema

In this mode, there is a single CCNx name used for all manifests and data. They are differentiated only by the ContentObjectHash.

If only N is given, it is used for all manifests and data. N is always used for the root manifest name. In this case, there is only one hash group, and it has a single locator of N.

If MP is given, it is the common name for all non-root manifests. There will then be two hash groups, as all data objects will use N as their locator.

If DP is given, it is the common name for all data objects.

manifest_writer --schema Prefix --name N --manifest-prefix MP --data-prefix DP ...

Segmented Schema

In this mode, one name is used for the manifest tree and another name is used for the data tree. Every name has a ChunkNumber. Each GroupData has a StartSegmentId in it to help with the numbering of chunks. The root manifest has a unique name. There are always two name spaces for Segmented Schema.

No locators are used, as all objects have their own name. (For an NDN implementation, this could be different)

The Root Manifest contains the NsDefs for the name constructors. These contain the node locators.

manifest_writer --schema Segmented --name N --manifest-prefix MP --data-prefix DP ...

The manifest prefix must be different from the data prefix. FLIC will append chunk numbers to each of the names.

The root manifest will be named simply 'N'. The internal manifest nodes (and top node) will use chunked names with prefix 'MP'. MP may be the same as N, in which case the root name is unchunked and the internal names are chunked. Likewise, 'DP' could be the same as 'N', as long as 'MP' is distinct from 'DP'.
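
The prefix constraints for the Segmented Schema can be written down as a small check. This is a sketch under the rules stated above, not the validation performed by manifest_writer; the function name check_segmented_prefixes and the result layout are hypothetical, and the wire encoding of the chunk number is deliberately left out.

```python
def check_segmented_prefixes(name: str, manifest_prefix: str, data_prefix: str) -> dict:
    """Validate Segmented Schema prefixes and report which names are chunked."""
    # The only hard constraint in the text: MP must differ from DP.
    if manifest_prefix == data_prefix:
        raise ValueError("manifest prefix must be different from data prefix")
    return {
        # The root manifest uses N without a chunk number.
        "root": {"name": name, "chunked": False},
        # Internal manifests (and the top node) use MP plus a chunk number.
        "manifests": {"name": manifest_prefix, "chunked": True},
        # Data objects use DP plus a chunk number.
        "data": {"name": data_prefix, "chunked": True},
    }


if __name__ == "__main__":
    # MP equal to N is allowed: the root is unchunked, the internal names are chunked.
    print(check_segmented_prefixes("ccnx:/n", "ccnx:/n", "ccnx:/n/data"))
```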

Encryption

A manifest tree and its data are unencrypted unless otherwise specified. The FLIC specification has an AES encrypted mode. The AES keys can either be referenced in a security context or can be encrypted under RSA-OAEP and wrapped inside the manifest.

The manifest_writer utility does not support separate manifest and data encryption. If the user specifies encryption on the command line, manifests and data are encrypted under the given AES key.

Examples

TBD: These need to be re-factored based on the 3 usages above.

In this example, we will use ccnpy.apps.manifest_writer to split a file into nameless content objects and construct a manifest tree around them. First, we look at the command line for manifest_writer. See below for background on CCNx FLIC manifests.

You may need to run poetry build and poetry install before poetry run.

ccnpy$ poetry run manifest_writer --help
usage: manifest_writer [-h] [--schema {Hashed,Prefix,Segmented}] --name NAME [--manifest-locator MANIFEST_LOCATOR] [--data-locator DATA_LOCATOR] [--manifest-prefix MANIFEST_PREFIX]
                       [--data-prefix DATA_PREFIX] [-d TREE_DEGREE] [-k KEY_FILE] [-p KEY_PASS] [--enc-key ENC_KEY] [--key-num KEY_NUM] [-s MAX_SIZE] [-o OUT_DIR] [-T]
                       [--root-expiry ROOT_EXPIRY] [--node-expiry NODE_EXPIRY] [--data-expiry DATA_EXPIRY]
                       filename

positional arguments:
  filename              The filename to split into the manifest

options:
  -h, --help            show this help message and exit
  --schema {Hashed,Prefix,Segmented}
                        Name constructor schema (default Hashed)
  --name NAME           CCNx URI for root manifest
  --manifest-locator MANIFEST_LOCATOR
                        CCNx URI for manifest locator
  --data-locator DATA_LOCATOR
                        CCNx URI for data locator
  --manifest-prefix MANIFEST_PREFIX
                        CCNx URI for manifests (Segmented only)
  --data-prefix DATA_PREFIX
                        CCNx URI for data (Segmented only)
  -d TREE_DEGREE        manifest tree degree (default is max that fits in a packet)
  -k KEY_FILE           RSA private key in PEM format to sign the root manifest
  -p KEY_PASS           RSA private key password (otherwise will prompt)
  --enc-key ENC_KEY     AES encryption key (hex string)
  --key-num KEY_NUM     Key number of pre-shared key (defaults to key hash)
  -s MAX_SIZE           maximum content object size (default 1500)
  -o OUT_DIR            output directory (default='.')
  -T                    Use TCP to 127.0.0.1:9896
  --root-expiry ROOT_EXPIRY
                        Expiry time (ISO format, .e.g 2020-12-31T23:59:59+00:00) to expire root manifest
  --node-expiry NODE_EXPIRY
                        Expiry time (ISO format) to expire node manifests
  --data-expiry DATA_EXPIRY
                        Expiry time (ISO format) to expire data nameless objects

The default behavior is to write the wire format packets to a directory. With the -T option, it will write them to the standard CCNx port.
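
The expiry options take ISO 8601 timestamps, while CCNx carries ExpiryTime as milliseconds since the epoch in UTC. The sketch below shows one way to do that conversion with the standard library. The helper name expiry_to_millis is hypothetical, and treating a timestamp without an offset as UTC is an assumption; manifest_writer may handle both differently.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def expiry_to_millis(iso_text: str) -> int:
    """Convert an ISO 8601 timestamp to integer milliseconds since the epoch (UTC)."""
    when = datetime.fromisoformat(iso_text)
    if when.tzinfo is None:
        # Assumption: a timestamp without an offset is interpreted as UTC.
        when = when.replace(tzinfo=timezone.utc)
    # Integer division of timedeltas avoids floating point rounding.
    return (when - EPOCH) // timedelta(milliseconds=1)


if __name__ == "__main__":
    print(expiry_to_millis("2020-12-31T23:59:59+00:00"))
```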

Small Packet Example

We create an RSA key that will be used to sign the root manifest, create a temporary output directory, and then run manifest_writer. We limit the tree to node degree 11 and a maximum packet size of 500 bytes. Using a 1500-byte packet would allow a tree degree of 41. Internally, ccnpy.flic.tree.TreeOptimizer calculates the best tradeoff between direct and indirect pointers per internal manifest node to minimize the waste in the tree, so you do not need to specify the exact fanout.

ccnpy$ openssl genrsa -out test_key.pem
ccnpy$ openssl rsa -pubout -in test_key.pem -out test_key.pub
ccnpy$ mkdir output
ccnpy$ poetry run manifest_writer \
   --schema Hashed \
   --name ccnx:/foo.com/object \
   --link \
   -k test_key.pem -p '' \
   --enc-key 0102030405060708090a0b0c0d0e0f10 --salt 0x01020304 --key-num 1 --aes-mode CCM \
   -s 500 \
   -o output \
   LICENSE
Namespace(schema='Hashed', name='ccnx:/foo.com/object', manifest_locator=None, data_locator=None, manifest_prefix=None, data_prefix=None, tree_degree=None, key_file='test_key.pem', key_pass='', enc_key='0102030405060708090a0b0c0d0e0f10', aes_mode='CCM', key_num=1, salt=16909060, max_size=500, out_dir='output', write_links=True, use_tcp=False, root_expiry=None, node_expiry=None, data_expiry=None, filename='LICENSE')
AeadImpl: (num: 1, salt: b'\x01\x02\x03\x04', mode: CCM, key len: 128)
Creating manifest tree
Root manifest hash: HashValue: {alg: 'SHA256', val: '72948d88ecc64b528c8d76db86e26b147db23dc485313bd09a2c08ae01a4b5e8'}

First, let us go over the command-line term by term:

  • --schema specifies the Hashed schema, so only the root manifest will have a name.
  • --name is the name of the root manifest.
  • --link is useful when writing objects to a directory. It creates a link from the name to the object hash, so manifest_reader can find the root object without typing in the hash value.
  • -k and -p open a PEM private key file used to sign the root manifest. Passing -p '' uses a blank password for the PEM file. If -p is not specified, manifest_writer will prompt for a password.
  • The AEAD parameters are --enc-key, --salt, --key-num, and --aes-mode. The first specifies the encryption key as a hex string (16 bytes or 32 bytes). The salt is an optional 4-byte value (as an int or hex string) that is combined with the nonce to create an IV. The key number identifies the key to the consumer. The AES mode can be either GCM or CCM.
  • -s limits the maximum packet size to 500 bytes. We picked a smaller value to illustrate multiple packets. 1500 or 1492 or 1480 are more common values.
  • -o is the output directory to write the wire-format objects. There is a -T option to use the network.
  • LICENSE is the filename to chunk up and wrap in a manifest.
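
To make the AEAD parameters concrete, the sketch below parses the key and salt the way the command line above presents them and builds a 12-byte IV from a 4-byte salt and an 8-byte nonce. The helper names are hypothetical, and placing the salt before the nonce is an assumption made for illustration; check the FLIC specification or ccnpy.flic.presharedkey for the actual ordering.

```python
def parse_enc_key(hex_key: str) -> bytes:
    """Decode the hex string given to --enc-key; only 16 or 32 bytes are accepted."""
    key = bytes.fromhex(hex_key)
    if len(key) not in (16, 32):
        raise ValueError("AES key must be 16 or 32 bytes")
    return key


def parse_salt(text: str) -> bytes:
    """Accept a decimal int or a 0x-prefixed hex string and return 4 big-endian bytes."""
    value = int(text, 0)
    return value.to_bytes(4, "big")


def build_iv(salt: bytes, nonce: bytes) -> bytes:
    """Combine salt and nonce into a 12-byte IV (assumed order: salt first)."""
    iv = salt + nonce
    if len(iv) != 12:
        raise ValueError("salt plus nonce must be 12 bytes")
    return iv


if __name__ == "__main__":
    key = parse_enc_key("0102030405060708090a0b0c0d0e0f10")
    salt = parse_salt("0x01020304")
    print(len(key) * 8, salt, int.from_bytes(salt, "big"))
```

With the values from the example, the key is 128 bits and the salt parses to the integer 16909060, which is the salt value printed in the Namespace line.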

The text output lines are:

  • Namespace is all the CLI arguments (the word Namespace is from the python argument parser).
  • AeadImpl is the encryption implementation parameters.
  • Root manifest hash... is the SHA256 hash of the root manifest object, which we will use shortly.

Looking at the output directory, we see that all the CCNx Packets are 500 bytes or less, which is exactly what we asked for. The ones exactly 500 bytes are the data content objects. The others are manifests, which do not exactly fit in 500 bytes. The various sizes depend on the number of pointers in each one. We will look at packet dumps below.

ccnpy$ ls -lgo output
-rw-r--r--@ 1   500 Nov 10 11:51 0c48afc336dfbc04aae31b1c20f159c53ba5d212160ae48015358bcfe1d223fd
-rw-r--r--@ 1   500 Nov 10 11:51 0f5043db4c988440d9803c71e6d4daf47867cdba56e182ccc2e830231a8178fb
-rw-r--r--@ 1   500 Nov 10 11:51 125fae41a28989145d34ab188fe2190caa4b97011e69446dfe49f5232d609b3b
-rw-r--r--@ 1   500 Nov 10 11:51 166fc57cad5de9584c3ebdac85a1db968ae41b2d59112ac4818ac3242bf2ff4a
-rw-r--r--@ 1   500 Nov 10 11:51 1da52e06097ebf55200640b24e065976943d661133bbe7376801e10f45c2d1f4
-rw-r--r--@ 1   324 Nov 10 11:51 249b13c4a21062eaba0e2a4e1170b6f7a3a003d260b6fcab3566d4c82cd5cb10
-rw-r--r--@ 1   361 Nov 10 11:51 28df0ce6953593d4f869a0a1a45682c52752303329628daf7263dcc3fa8afa4d
-rw-r--r--@ 1   500 Nov 10 11:51 2b293564ccc0ba4f8f85e8e5a4ef90bb58c429a7a0b388a441b086488a288427
-rw-r--r--@ 1   500 Nov 10 11:51 31065331e00e3eb32fee93c9f2f6339e788d041c32bd242444892c6249e08e90
-rw-r--r--@ 1   500 Nov 10 11:51 4d2f184d12c10e103898277348a756e1c5bdb592eeb6e2f12cd0dcceed905bac
-rw-r--r--@ 1   500 Nov 10 11:51 64d8aaebd9f402b833d4c3c64b0b4fed40101f3388a1fa1e0d8eedef4ae23617
-rw-r--r--@ 1   500 Nov 10 11:51 6698535f4847008068589a117bdb410c17d8d04bf6b91ba5bfcbd43ec49e5f5e
-rw-r--r--@ 1   500 Nov 10 11:51 67cbb9b8b5ddee8d98311bbcdb792c0adc14171785aca5b1777dd8b2b4a70ed8
-rw-r--r--@ 1   500 Nov 10 11:51 6d0e16c90c3d8188f7befdd8ce1e72c21d225cc0b52439d3411a4f51b09b5aed
-rw-r--r--@ 1   468 Nov 10 11:51 6db7a2edef022949ad96e58945930ed7ceb4593d1b23dffee90a018154cefd42
-rw-r--r--@ 1   343 Nov 10 11:51 72948d88ecc64b528c8d76db86e26b147db23dc485313bd09a2c08ae01a4b5e8
-rw-r--r--@ 1   500 Nov 10 11:51 83ae6c02983fc75e0eb756d8b6780f3b8ac54bfe46f2886013ea1ec8262a517f
-rw-r--r--@ 1   500 Nov 10 11:51 887335c9ad28820c8c7ea6fdc1a958161e3c853c246038a90787876843cc4f5d
-rw-r--r--@ 1   500 Nov 10 11:51 af182acb54e102a5dd1ea4e944a2b0bc04d89aaac5b7d22d860a9cc970d88185
-rw-r--r--@ 1   500 Nov 10 11:51 b2180a827443e3329fe3863656312ccf1978d212b49975e41499f908d39b9704
-rw-r--r--@ 1   500 Nov 10 11:51 d246d972b2fe993556041a27d1244a3fe3122105927aaed587448083247d9d4a
-rw-r--r--@ 1   500 Nov 10 11:51 d7bc2a27eb1c1bf08c31f1de582f7c49acccddee141058ccac5a41988f7d4a6c
-rw-r--r--@ 1   500 Nov 10 11:51 d9a71da31961aa48e32e5a6b0b3784204984cd1e5a4471226bcd6a32f42c4fe8
-rw-r--r--@ 1   500 Nov 10 11:51 dfd5474165928f5c87717674fb5f76cf39241a9ea8842ea009870827890dfc59
-rw-r--r--@ 1   500 Nov 10 11:51 e3df9814e3f6e030fa90d512b519693f9d87a1e1f893efe4e3a7c2238e966527
-rw-r--r--@ 1   500 Nov 10 11:51 e6743bcfb3fbb12daa2bc9f4bbad14e8ec620e82c6b929506167bd324ecaa9f1
-rw-r--r--@ 1   468 Nov 10 11:51 e8230daf3502a6e120300d1e9d3565769f34026a65023a9ab85b5e421ab593f3
-rw-r--r--@ 1   500 Nov 10 11:51 f68375a22c5654f1f180c12dc040e8a94cc7aae5edaebfd7ab02a3a92094a47d
-rw-r--r--@ 1   239 Nov 10 11:51 link_0000001500010007666f6f2e636f6d000100066f626a656374

One special file is link_0000001500010007666f6f2e636f6d000100066f626a656374. It was generated by the --link CLI option. In this first packet decode, we see it is a Content Object that has a payload type of LINK. The payload is a Link TLV with the name ccnx:/foo.com/object and a hash restriction of 72948d88ecc64b528c8d76db86e26b147db23dc485313bd09a2c08ae01a4b5e8. That is the same hash of the root manifest written out above by manifest_writer. We will see how it is used in just a bit below.

Note in the validation algorithm, we have an RSA SHA256 signature, which is validated by test_key.pub. The hash shown in the RsaSha256Verifier is the public key ID. You can verify this on the CLI with: openssl rsa -pubin -in test_key.pub -outform DER | openssl sha256.

ccnpy$ poetry run packet_reader \
  --pretty \
  -i output \
  -k test_key.pub \
  link_0000001500010007666f6f2e636f6d000100066f626a656374
{
   Packet: {
      FH: {
         ver: 1,
         pt: 1,
         plen: 239,
         flds: '000000',
         hlen: 8
      },
      CO: {
         NAME: [Name = b 'foo.com', Name = b 'object'],
         None,
         PLDTYP: 'LINK',
         Link(NAME: [Name = b 'foo.com', Name = b 'object'], None, HashValue: {
            alg: 'SHA256',
            val: '72948d88ecc64b528c8d76db86e26b147db23dc485313bd09a2c08ae01a4b5e8'
         }),
         None
      },
      RsaSha256: {
         keyid: HashValue: {
            alg: 'SHA256',
            val: 'c94f873e56e52e317d405dcd9c293baa0ed1f04c12b0e0b3a1ba88c08ceb1044'
         },
         pk: None,
         keylink: None,
         'SignatureTime': '2024-11-10T19:51:45.477000+00:00'
      },
      ValPld: 'cbd2478893b2019918d3eb0ba03ad4a343dc68e00bdb564a1069f3ce7515ecaedb60946bea9edf5c78ae3556700de107f016827e6e17106fee08899b1d56273e'
   }
}

Packet validation success with RsaSha256Verifier(HashValue: {alg: 'SHA256', val: 'c94f873e56e52e317d405dcd9c293baa0ed1f04c12b0e0b3a1ba88c08ceb1044'})
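
The key ID printed by the verifier can also be reproduced in Python, following the same recipe as the openssl pipeline: take the DER encoding of the public key and hash it with SHA-256. The sketch below uses only the standard library and assumes the .pub file is a PEM file with a single BEGIN/END block, so that base64-decoding its body yields the same DER bytes that openssl emits with -outform DER. The helper names are hypothetical.

```python
import base64
import hashlib


def pem_to_der(pem_text: str) -> bytes:
    """Strip the PEM armor lines and base64-decode the body."""
    lines = [line.strip() for line in pem_text.splitlines()]
    body = "".join(line for line in lines if line and not line.startswith("-----"))
    return base64.b64decode(body)


def public_key_id(pem_text: str) -> str:
    """SHA-256 over the DER-encoded public key, as a hex string."""
    return hashlib.sha256(pem_to_der(pem_text)).hexdigest()


if __name__ == "__main__":
    import sys
    # Usage: python3 keyid.py test_key.pub
    with open(sys.argv[1], "r", encoding="ascii") as handle:
        print(public_key_id(handle.read()))
```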

We can look into each of these packets. First, look at the root manifest, whose hash-based name was in the output of manifest_writer. packet_reader can use either a private key or public key to verify the signature on a CCNx packet. We show the usage with a public key, but the syntax is the same for a private key. Note that after displaying the content object, it shows "Packet validation success..." before the decrypted packet.

The CLI arguments for packet_reader are largely the same as for manifest_writer. One difference is that --pretty selects a verbose structured output instead of the more compact default format. The filename is what to read, not what to encode. In this example, we use the root manifest content object hash, as that is the filename in the output directory we want to read.

If the AES encryption parameters are not given, packet_reader will only display the ContentObject, but cannot decode the embedded manifest. We see in the Packet {...} section that the Content Object CO {...} has a payload type of "MANIFEST", and it shows what it can. In this case, the manifest is encrypted so it can only show the preshared key information, the encrypted node bytes, and the AEAD authentication tag. Note that the nonce is only 8 bytes, not 12, because we added a 4-byte salt.

ccnpy$ poetry run packet_reader \
  --pretty \
  -i output \
  --enc-key 0102030405060708090a0b0c0d0e0f10 --salt 0x01020304 --key-num 1 --aes-mode CCM \
  -k test_key.pub \
  72948d88ecc64b528c8d76db86e26b147db23dc485313bd09a2c08ae01a4b5e8
  
AeadImpl: (num: 1, salt: b'\x01\x02\x03\x04', mode: CCM, key len: 128)
{
   Packet: {
      FH: {
         ver: 1,
         pt: 1,
         plen: 343,
         flds: '000000',
         hlen: 8
      },
      CO: {
         NAME: [Name = b 'foo.com', Name = b 'object'],
         None,
         PLDTYP: 'MANIFEST',
         Manifest: {
            PSK: {
               kn: 1,
               iv: 'fb12ad1400d2c6a9',
               mode: 'AES-CCM-128'
            },
            EncNode: 'b7b44763e6b670743d4dfc03e471555141af4da4ca28254d349b2ef879a0ccc9080d6627ea7ebd87220e283ef0b826f4c1f79a6f71ea73cfe62c26bbf6bca7413aba1fe1a721ef4bd201a702264f0929d364e97d5e916e3293ccc701ee1c488cf31c81372a9346f856b7ec2c66bae782096006',
            AuthTag: '0a1ba6ab91530089c42cbbea52858fd8'
         },
         None
      },
      RsaSha256: {
         keyid: HashValue: {
            alg: 'SHA256',
            val: 'c94f873e56e52e317d405dcd9c293baa0ed1f04c12b0e0b3a1ba88c08ceb1044'
         },
         pk: None,
         keylink: None,
         'SignatureTime': '2024-11-10T19:51:45.474000+00:00'
      },
      ValPld: 'be6eb21a8c37af22bfed341aa81fdd4a02e4d6d03c4bc6b91d76ca40f247c56b879b6d5e88c56637888c66983e569fdf1e3b6f3876051a9592e842b4e8574f00'
   }
}

Packet validation success with RsaSha256Verifier(HashValue: {alg: 'SHA256', val: 'c94f873e56e52e317d405dcd9c293baa0ed1f04c12b0e0b3a1ba88c08ceb1044'})
AeadImpl: (num: 1, salt: b'\x01\x02\x03\x04', mode: CCM, key len: 128)
Manifest: {
   None,
   Node: {
      NodeData: {
         SubtreeSize: 11357,
         None,
         None,
         [NCDEF: (NCID: 1, HS: Locators: [Locator: Link(NAME: [Name =
            b 'foo.com', Name = b 'object'
         ], None, None)], None)],
         None
      },
      1,
      [HashGroup: {
         GroupData: {
            None,
            None,
            None,
            None,
            NCID: 1,
            None
         },
         Ptrs: [HashValue: {
            alg: 'SHA256',
            val: 'e8230daf3502a6e120300d1e9d3565769f34026a65023a9ab85b5e421ab593f3'
         }]
      }]
   },
   None
}

Because we provided the correct decryption key and key number on the command line, packet_reader also decrypted the manifest. This shows there is a Node with NodeData and a subtree size of 11,357 bytes (the file size of LICENSE). There is 1 HashGroup with one pointer, as is normal for the named and signed root manifest. The hash group uses NCID 1, which was defined in the NodeData.

The NodeData has one name constructor definition, with a locator of ccnx:/foo.com/object. That is the same name as the root manifest, as we only provided the --name flag. See below for an example with the --manifest-locator and --data-locator flags.

Using the root manifest pointer, the next manifest decodes as below. This is a nameless content object: it has no name and no validation, so we refer to it only by its hash. The decryption shows that the manifest has 10 hash pointers, which is fewer than the limit of 11 given to manifest_writer. Most of those are direct data pointers and the last few are indirect manifest pointers. Comparing the pointers against the file list above shows that 1da... is the last pointer whose file is exactly 500 bytes, so there are 8 direct pointers and 2 indirect pointers (indirect pointers are always last due to the post-order traversal).

ccnpy$ poetry run packet_reader \
  --pretty \
  -i output \
  --enc-key 0102030405060708090a0b0c0d0e0f10 --salt 0x01020304 --key-num 1 --aes-mode CCM \
  -k test_key.pub \
  e8230daf3502a6e120300d1e9d3565769f34026a65023a9ab85b5e421ab593f3
AeadImpl: (num: 1, salt: b'\x01\x02\x03\x04', mode: CCM, key len: 128)
{
   Packet: {
      FH: {
         ver: 1,
         pt: 1,
         plen: 468,
         flds: '000000',
         hlen: 8
      },
      CO: {
         None,
         None,
         PLDTYP: 'MANIFEST',
         Manifest: {
            PSK: {
               kn: 1,
               iv: '321d47ac0fd0284c',
               mode: 'AES-CCM-128'
            },
            EncNode: 'a05ee07e5ca3b1dc4900f85db61e33a844079e8aa4eb8ba6ee11a53e0b48cc9c5953d2832a4678bdfd844c2cca511718909774aa1fdc110cd46e645dc97dde2cfe34b3a7bc66ecfc62c5f560d49b354820367c1c8ffc10ccb8e3a7343e6f472952c256b099b1e77d07ed9e6d46cd61bfda42f34e9663d94f302388459d6388eaca2e906a8d4750d00b18f2bd5eda863c8e385ab0e6183fb54c7323531fa07e05642f2eb96b2157aa9cd739c7ee7f5386e2711c8e73a084f5c6456b08b0c05fd71609a102c5745b80a0c2e0f98f2e99a3f2e08d93566273d678d8dea4398bb85dac356badba6f5be9a5561c55ffdbd7fb832a02446ca9d21694bbaf871b83f8cf1ae9be9fbe2d79d3715079264bca49e911273bac31bcfd32af75d30c516b8429580b6ebba20e866b86ab7f44250ca63954c0faae3f035c88cbddbc2d3ef8f9a63f0c38c57f5af6bba0c7047ba6cca5d460b01c077090ee8e8a5041d2ede8982e2ee3f14fda25627bb693815cd4da125aa83ffbf8b31d1dc861cb7f24e70e96eeb94b9e8c141fa3659d',
            AuthTag: 'ee905d3a8c427763f06d3820df55662c'
         },
         None
      },
      None,
      None
   }
}

AeadImpl: (num: 1, salt: b'\x01\x02\x03\x04', mode: CCM, key len: 128)
Manifest: {
   None,
   Node: {
      NodeData: {
         SubtreeSize: 11357,
         None,
         None,
         [],
         None
      },
      10,
      [HashGroup: {
         GroupData: {
            None,
            None,
            None,
            None,
            NCID: 1,
            None
         },
         Ptrs: [HashValue: {
            alg: 'SHA256',
            val: '31065331e00e3eb32fee93c9f2f6339e788d041c32bd242444892c6249e08e90'
         }, HashValue: {
            alg: 'SHA256',
            val: 'e6743bcfb3fbb12daa2bc9f4bbad14e8ec620e82c6b929506167bd324ecaa9f1'
         }, HashValue: {
            alg: 'SHA256',
            val: 'af182acb54e102a5dd1ea4e944a2b0bc04d89aaac5b7d22d860a9cc970d88185'
         }, HashValue: {
            alg: 'SHA256',
            val: '887335c9ad28820c8c7ea6fdc1a958161e3c853c246038a90787876843cc4f5d'
         }, HashValue: {
            alg: 'SHA256',
            val: 'e3df9814e3f6e030fa90d512b519693f9d87a1e1f893efe4e3a7c2238e966527'
         }, HashValue: {
            alg: 'SHA256',
            val: '4d2f184d12c10e103898277348a756e1c5bdb592eeb6e2f12cd0dcceed905bac'
         }, HashValue: {
            alg: 'SHA256',
            val: '83ae6c02983fc75e0eb756d8b6780f3b8ac54bfe46f2886013ea1ec8262a517f'
         }, HashValue: {
            alg: 'SHA256',
            val: '1da52e06097ebf55200640b24e065976943d661133bbe7376801e10f45c2d1f4'
         }, HashValue: {
            alg: 'SHA256',
            val: '249b13c4a21062eaba0e2a4e1170b6f7a3a003d260b6fcab3566d4c82cd5cb10'
         }, HashValue: {
            alg: 'SHA256',
            val: '6db7a2edef022949ad96e58945930ed7ceb4593d1b23dffee90a018154cefd42'
         }]
      }]
   },
   None
}
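
The direct versus indirect split of this manifest can be checked mechanically with the same size heuristic used in the text: pointers whose files are exactly the maximum packet size are taken to be data objects, and the rest are taken to be manifests. The sketch below hard-codes the ten pointers (abbreviated to eight hex digits) and their file sizes from the directory listing. It is only a heuristic; a short final data object would be misclassified, and split_pointers is a hypothetical helper, not part of ccnpy.

```python
MAX_SIZE = 500

# Pointer order as decoded from the manifest, with sizes from the ls output.
POINTERS = ["31065331", "e6743bcf", "af182acb", "887335c9", "e3df9814",
            "4d2f184d", "83ae6c02", "1da52e06", "249b13c4", "6db7a2ed"]
SIZES = {"31065331": 500, "e6743bcf": 500, "af182acb": 500, "887335c9": 500,
         "e3df9814": 500, "4d2f184d": 500, "83ae6c02": 500, "1da52e06": 500,
         "249b13c4": 324, "6db7a2ed": 468}


def split_pointers(pointers, sizes, max_size):
    """Guess direct (data) and indirect (manifest) pointers from file sizes."""
    direct = [p for p in pointers if sizes[p] == max_size]
    indirect = [p for p in pointers if sizes[p] != max_size]
    return direct, indirect


if __name__ == "__main__":
    direct, indirect = split_pointers(POINTERS, SIZES, MAX_SIZE)
    print(len(direct), "direct,", len(indirect), "indirect:", indirect)
```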

The --link option is useful if you will use manifest_reader on the directory. In that case, it can quickly find the root manifest by name alone rather than have to search for it.

The link filename is always of the form link_{serialized_name}, where serialized_name is the hex encoding of the Name TLV. In the example below, this parses as:

0000 0015      ; Name TLV (type = 0, length = 21)
0001 0007      ; Name Component TLV (type = 1, length = 7)
666f6f2e636f6d ; hex for 'foo.com'
0001 0006      ; Name Component TLV (type = 1, length = 6)
6f626a656374   ; hex for 'object'
ccnpy$ poetry run packet_reader -i output --pretty -k test_key.pem -p '' link_0000001500010007666f6f2e636f6d000100066f626a656374
{
   Packet: {
      FH: {
         ver: 1,
         pt: 1,
         plen: 239,
         flds: '000000',
         hlen: 8
      },
      CO: {
         NAME: [Name = b 'foo.com', Name = b 'object'],
         None,
         PLDTYP: 'LINK',
         Link(NAME: [Name = b 'foo.com', Name = b 'object'], None, HashValue: {
            alg: 'SHA256',
            val: '72948d88ecc64b528c8d76db86e26b147db23dc485313bd09a2c08ae01a4b5e8'
         }),
         None
      },
      RsaSha256: {
         keyid: HashValue: {
            alg: 'SHA256',
            val: 'c94f873e56e52e317d405dcd9c293baa0ed1f04c12b0e0b3a1ba88c08ceb1044'
         },
         pk: None,
         keylink: None,
         'SignatureTime': '2024-11-10T19:51:45.477000+00:00'
      },
      ValPld: 'cbd2478893b2019918d3eb0ba03ad4a343dc68e00bdb564a1069f3ce7515ecaedb60946bea9edf5c78ae3556700de107f016827e6e17106fee08899b1d56273e'
   }
}

Packet validation success with RsaSha256Verifier(HashValue: {alg: 'SHA256', val: 'c94f873e56e52e317d405dcd9c293baa0ed1f04c12b0e0b3a1ba88c08ceb1044'})
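
The link filename layout described above (a link_ prefix followed by the hex encoding of the Name TLV) can be reproduced with a few lines of Python. This is a sketch that follows the byte breakdown shown earlier, with 2-byte type and 2-byte length fields, Name type 0, and name component type 1. The function names are hypothetical and are not the ccnpy API.

```python
import struct

T_NAME = 0x0000      # Name TLV type, as in the breakdown above
T_NAMESEG = 0x0001   # Name Component TLV type
PREFIX = "link_"


def encode_tlv(tlv_type: int, value: bytes) -> bytes:
    """Encode a TLV with 2-byte big-endian type and length fields."""
    return struct.pack(">HH", tlv_type, len(value)) + value


def link_filename(components) -> str:
    """Build link_{hex of Name TLV} from a list of name component strings."""
    body = b"".join(encode_tlv(T_NAMESEG, c.encode("utf-8")) for c in components)
    return PREFIX + encode_tlv(T_NAME, body).hex()


def parse_link_filename(filename: str):
    """Recover the name components from a link filename."""
    raw = bytes.fromhex(filename[len(PREFIX):])
    name_type, name_len = struct.unpack_from(">HH", raw, 0)
    if name_type != T_NAME or name_len != len(raw) - 4:
        raise ValueError("not a serialized Name TLV")
    components, offset = [], 4
    while offset < len(raw):
        _seg_type, seg_len = struct.unpack_from(">HH", raw, offset)
        offset += 4
        components.append(raw[offset:offset + seg_len].decode("utf-8"))
        offset += seg_len
    return components


if __name__ == "__main__":
    print(link_filename(["foo.com", "object"]))
```

For the name ccnx:/foo.com/object this yields the same filename that appears in the output directory.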

Using manifest_reader

The utility manifest_reader reads what manifest_writer produces. In this example, we ask it to read ccnx:/foo.com/object, which is the name we used above in manifest_writer. Because we included --link when writing, the reader uses the link to find the hash value of the root manifest and reads that in. It discovers the first NcDef and learns about NcId 1. Each NcCache has an instance (inst) identifier, because name definitions can change as we traverse a manifest. Any time there is a new NcDef in the manifest tree, the reader copies the current NcCache and adds or updates the definitions for that branch.

The bytes read back into flic.txt are exactly the same as the original file LICENSE.

ccnpy$ poetry run manifest_reader \
   -i output \
   --enc-key 0102030405060708090a0b0c0d0e0f10 --salt 0x01020304 --key-num 1 --aes-mode CCM \
   -k test_key.pub \
   --name ccnx:/foo.com/object \
   --output flic.txt
Dereferenced link link_0000001500010007666f6f2e636f6d000100066f626a656374 to load packet HashValue: {alg: 'SHA256', val: '72948d88ecc64b528c8d76db86e26b147db23dc485313bd09a2c08ae01a4b5e8'}
Packet validation success with RsaSha256Verifier(HashValue: {alg: 'SHA256', val: 'c94f873e56e52e317d405dcd9c293baa0ed1f04c12b0e0b3a1ba88c08ceb1044'})
AeadImpl: (num: 1, salt: b'\x01\x02\x03\x04', mode: CCM, key len: 128)
NcCache[inst=2][ncid=1] = HS: Locators: [Locator: Link(NAME: [Name=b'foo.com', Name=b'object'], None, None)], None

Finished traversal, 28 objects procssed

ccnpy$ ls -l flic.txt LICENSE
-rwxr-xr-x@ 1 marc  staff  11357 Oct  1 20:52 LICENSE
-rw-r--r--@ 1 marc  staff  11357 Nov 10 12:28 flic.txt
ccnpy$ diff flic.txt LICENSE; echo $?
0
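
The same round-trip check can be scripted, for example in a test that runs the writer and reader back to back. A minimal sketch using the standard library follows; same_bytes is a hypothetical helper name.

```python
import filecmp


def same_bytes(path_a: str, path_b: str) -> bool:
    """Return True when both files have identical contents (byte-for-byte)."""
    # shallow=False forces a content comparison instead of trusting
    # file size and modification time.
    return filecmp.cmp(path_a, path_b, shallow=False)


if __name__ == "__main__":
    print(same_bytes("flic.txt", "LICENSE"))
```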

An example using Segmented names

TBD

Large Degree Tree

We create a 1 MB (1,000,000-byte) file of all zeros and put it in a manifest limited to 1500-byte packets. This should create only one or two nameless data objects, then a tree with many pointers to the same zeros.

ccnpy$ dd if=/dev/zero of=zeros bs=1000 count=1000
ccnpy$ mkdir out2
ccnpy$ poetry run manifest_writer  \
                   --name ccnx:/example.com/manifest \
                   --manifest-locator ccnx:/manifest \
                   --data-locator ccnx:/data \
                   -k test_key.pem \
                   -p '' \
                   -s 1500 \
                   -o ./out2  \
                   --enc-key 0102030405060708090a0b0c0d0e0f10 \
                   --key-num 22  \
                   zeros
                   
Creating manifest tree
Root manifest hash: HashValue: {alg: 'SHA256', val: 'f74a2dd53446f597a4659d160945186b31e87f2c43f632dac54a1da033fbe147'}

The root manifest f74a2... is 490 bytes. The main data object 81e2... is exactly 1500 bytes. The other manifests are mostly 1471 bytes. The remaining small objects are the tail of the file's zeros (44b8...) and a small internal manifest that is not full of pointers.

ccnpy$ ls -l out2
total 176
-rw-r--r--+ 1 mmosko  1987151510  1363 Jul  1 22:32 01f57d6f3fc815352022e91d23b92bfd5c6e4a867a0c25c55eb3218440a2e37e
-rw-r--r--+ 1 mmosko  1987151510  1471 Jul  1 22:32 0911e9cb7115068126d2776a2d1f0b654f1a239d1df05256fc365eb4704e7d6e
-rw-r--r--+ 1 mmosko  1987151510  1471 Jul  1 22:32 27dcd884d48ecb54a6d6d04573484f94fcb521d7c15d2bef65f42c958020e896
-rw-r--r--+ 1 mmosko  1987151510  1471 Jul  1 22:32 29443f81373a91d0c750a1dcd08f5452d39a2450aa3a2ae2462a05c5787e7e49
-rw-r--r--+ 1 mmosko  1987151510   715 Jul  1 22:32 3767cd69a0e075c6bab208ff7e7d0370342028fd8ba0593c2b8c52d6539712c0
-rw-r--r--+ 1 mmosko  1987151510  1471 Jul  1 22:32 4298aab32d5f55d43f16caba47091038fb427cb29dafaa36c21ffad8c04849cf
-rw-r--r--+ 1 mmosko  1987151510  1471 Jul  1 22:32 44b7e81a2833e6a15f4b9e015a13b6eb82b44a89f2abfe4aaa120a1a28247c1e
-rw-r--r--+ 1 mmosko  1987151510   217 Jul  1 22:32 44b8f04d36f09a6295447c47c6e0501cbe83382776140e0039d7fe48d3a2c74f
-rw-r--r--+ 1 mmosko  1987151510  1471 Jul  1 22:32 56369e591865e4f7a9ba6a9fe2047344b83ba39505f62a8c6e8605ab4a5d51c3
-rw-r--r--+ 1 mmosko  1987151510  1471 Jul  1 22:32 67e1a08a586689f1e5667836a63dfa2fe8b6d8579a7ad25d085832556c7f1604
-rw-r--r--+ 1 mmosko  1987151510  1471 Jul  1 22:32 6aeaaf7ea24e6b0f70024591c2eff1bd7b173fe18bf9444c8beaacab7ed49dbd
-rw-r--r--+ 1 mmosko  1987151510  1363 Jul  1 22:32 7ee96456d5cf06b7f26a55b4db0a48ed52ab6a586d949d3e2e3aeca5c40a216d
-rw-r--r--+ 1 mmosko  1987151510  1500 Jul  1 22:32 81e24663be0c7c9a9e461c03392e30c7f0492fccbe0b59d41ee2913385dbf712
-rw-r--r--+ 1 mmosko  1987151510  1471 Jul  1 22:32 8323a0eff0a359fde41b332c19ba30f6115f687e9375bbacbe1fc5fa46a805c2
-rw-r--r--+ 1 mmosko  1987151510  1471 Jul  1 22:32 93a9225b2a23af8bbfea7585563d7d1dd6fcc0b9588bdc28628431fb577cd5fc
-rw-r--r--+ 1 mmosko  1987151510  1471 Jul  1 22:32 9c87210a9adfd49bbf985e1169e7c7797d2e753a34e95ed5260ef6fadee74954
-rw-r--r--+ 1 mmosko  1987151510  1471 Jul  1 22:32 a394b694fb5bc5ca90039cc46f45ccc850d943edca48ed7f14086eb84394a69e
-rw-r--r--+ 1 mmosko  1987151510  1471 Jul  1 22:32 b9b55f1b4ddd4bd92176b05da187cec5db37b491df1a6bac09a18b6a00a02a70
-rw-r--r--+ 1 mmosko  1987151510  1471 Jul  1 22:32 bde2ff654795cbb58757b4ef5b2ea384246904cf8463be685a2638afee0bcc33
-rw-r--r--+ 1 mmosko  1987151510  1471 Jul  1 22:32 c4911a0e6b3a11bda2eb255b44e604246ce084f3139adf43bf3d055fd3584eae
-rw-r--r--+ 1 mmosko  1987151510  1471 Jul  1 22:32 f63d4b208e24ed915a76eddcf89fb9cec86eb18f6d11b5c26da2852b1497bb02
-rw-r--r--+ 1 mmosko  1987151510   490 Jul  1 22:32 f74a2dd53446f597a4659d160945186b31e87f2c43f632dac54a1da033fbe147

Packet 44b8f04d36f09a6295447c47c6e0501cbe83382776140e0039d7fe48d3a2c74f is:

    {
       Packet: {
          FH: {
             ver: 1,
             pt: 1,
             plen: 217,
             flds: '000000',
             hlen: 8
          },
          CO: {
             None,
             None,
             PLDTYP: 'DATA',
             PAYLOAD: '00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000'
          },
          None,
          None
       }
    }

If we did not use encryption, then the output would be even more compressed. That is because most of the manifest nodes look just like the other manifest nodes, so we get data de-duplication of manifest nodes. With encryption, each manifest node is unique due to different IVs.

In this example without encryption, the entire 1 MB zeros file and manifest tree fit in just 7 objects with a total wire-format size of 7173 bytes.

ccnpy$ python3 -m ccnpy.apps.manifest_writer  \
                   -n ccnx:/example.com/manifest \
                   -k test_key.pem \
                   -p '' \
                   -s 1500 \
                   -o ./out3  \
                   zeros
                   
Creating manifest tree
Root manifest hash: HashValue: {alg: 'SHA256', val: '96dc49fa08b26d569e652e6dfe2890b901f98f7ed71b57b8084f873bafd61e80'}

ccnpy$ ls -l out3
total 56
-rw-r--r--+ 1 mmosko  1987151510  1489 Jul  1 22:38 1cf93ae7af50140435592e5fc10e07fe5c8ec0e356ceb93c4b847eb2a04373a4
-rw-r--r--+ 1 mmosko  1987151510  1489 Jul  1 22:38 1d9eb906e894ec892eff2f10d6664909668c7d19ba7300cca7c10af5a63db990
-rw-r--r--+ 1 mmosko  1987151510   553 Jul  1 22:38 29f93a68e2402dc163954e21975c6031064b6f82f81773776167a5a9f5b2262f
-rw-r--r--+ 1 mmosko  1987151510  1489 Jul  1 22:38 4037ff9614afd66ef676b6beab14ca93a3cd2ebbb79684f3bded8f93ee3e2f90
-rw-r--r--+ 1 mmosko  1987151510   217 Jul  1 22:38 44b8f04d36f09a6295447c47c6e0501cbe83382776140e0039d7fe48d3a2c74f
-rw-r--r--+ 1 mmosko  1987151510  1500 Jul  1 22:38 81e24663be0c7c9a9e461c03392e30c7f0492fccbe0b59d41ee2913385dbf712
-rw-r--r--+ 1 mmosko  1987151510   436 Jul  1 22:38 96dc49fa08b26d569e652e6dfe2890b901f98f7ed71b57b8084f873bafd61e80
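
The de-duplication described above can be reproduced outside of ccnpy. The following is a minimal sketch, not the manifest_writer code: it assumes a hypothetical 1,000,000-byte zeros input and an arbitrary 1,024-byte chunk size, and it keeps chunks in a dict keyed by SHA-256 digest, in the same spirit as the output directory naming each file by its hash.

```python
import hashlib


def store_chunks(data: bytes, chunk_size: int) -> dict:
    """Split data into chunks and keep one copy per distinct SHA-256 digest."""
    store = {}
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        # Identical chunks hash to the same key, so they overwrite each other.
        store[hashlib.sha256(chunk).hexdigest()] = chunk
    return store


zeros = bytes(1_000_000)             # hypothetical stand-in for the zeros file
chunk_size = 1024                    # arbitrary value chosen for this sketch
store = store_chunks(zeros, chunk_size)
num_chunks = -(-len(zeros) // chunk_size)   # ceiling division
print(num_chunks, "chunks,", len(store), "distinct objects")
```

Every full-size chunk of zeros is identical, so only the full-size chunk and the shorter final chunk survive as distinct objects. Encrypting each chunk under a fresh IV would make every ciphertext different and remove this saving.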

FLIC Manifests

See the IRTF draft on FLIC for a description of the CCNx objects and grammar. Below, we provide some examples to help show how manifest_writer works.

A Manifest is embedded inside a CCNx Content Object:

ManifestContentObject = TYPE LENGTH [Name] [ExpiryTime] PayloadType Payload
Name = TYPE LENGTH *OCTET ; As per RFC8569
ExpiryTime = TYPE LENGTH *OCTET ; As per RFC8569
PayloadType = TYPE LENGTH T_PYLDTYPE_MANIFEST
Payload = TYPE LENGTH *OCTET ; the serialized Manifest object
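
As a rough illustration of the grammar above, the sketch below nests a PayloadType and a Payload inside a Content Object using the 2-byte type and 2-byte length encoding of RFC 8609. It does not use the ccnpy classes. The helper names and all type codes are placeholders for this example, and the manifest bytes are a stand-in string rather than a real serialized Manifest.

```python
import struct

# Placeholder type codes, for illustration only. Real values come from
# RFC 8609 and the FLIC draft.
T_OBJECT = 0x0002
T_PAYLDTYPE = 0x0005
T_PAYLOAD = 0x0001
PAYLDTYPE_MANIFEST = 0x04   # hypothetical value


def encode_tlv(tlv_type: int, value: bytes) -> bytes:
    """Encode one TLV: 2-byte type, 2-byte length, then the value."""
    return struct.pack("!HH", tlv_type, len(value)) + value


def decode_tlvs(buffer: bytes) -> list:
    """Decode a run of back-to-back TLVs into (type, value) pairs."""
    result = []
    offset = 0
    while offset < len(buffer):
        tlv_type, length = struct.unpack_from("!HH", buffer, offset)
        offset += 4
        result.append((tlv_type, buffer[offset:offset + length]))
        offset += length
    return result


manifest_bytes = b"serialized manifest goes here"   # stand-in, not a real Manifest
body = (encode_tlv(T_PAYLDTYPE, bytes([PAYLDTYPE_MANIFEST]))
        + encode_tlv(T_PAYLOAD, manifest_bytes))
content_object = encode_tlv(T_OBJECT, body)

outer = decode_tlvs(content_object)
inner = decode_tlvs(outer[0][1])
print(inner)
```

The optional Name and ExpiryTime fields are omitted here, which corresponds to a nameless object.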

Manifest Examples

NOTE: These examples are a bit old and do not include the revision of putting the manifest inside the Payload.

Example of a full Manifest node, such as a root manifest

[FIXED_HEADER OCTET[8]]
(ContentObject/T_OBJECT
    (Name/T_NAME ...)
    (ExpiryTime/T_EXPIRY 20190630Z000000)
    (Manifest
        (Node
            (NodeData
                (SubtreeSize 5678)
                (SubtreeDigest (HashValue SHA256 a1b2...))
                (Locators (Final FALSE) (Link /example.com/repo))
            )
            (HashGroup
                (GroupData
                    (SubtreeSize 1234)
                    (SubtreeDigest (HashValue SHA256 abcd...))
                )
                (Pointers
                    (Ptr ...)
                    (Ptr ...)
                )
            )
        )
    )
)
(ValidationAlg ...)
(ValidationPayload ...)

To use an encrypted manifest, create an unencrypted manifest with the SecurityCtx and AuthTag, then do an in-place encryption with AES-GCM-256. Put the Authentication Tag in the AuthTag value. After the encryption, change the TLV type of Node to EncryptedNode. Note that the publisher should finish the encryption and the TLV type change before signing the ContentObject with the ValidationPayload.

[FIXED_HEADER OCTET[8]]
(ContentObject/T_OBJECT
    (Name/T_NAME ...)
    (ExpiryTime/T_EXPIRY 20190630Z000000)
    (Manifest
        (SecurityCtx
            (PresharedKey (KeyNum 55) (IV 8585...) (Mode AES-GCM-256))
        )
        (Node
            (NodeData
                (SubtreeSize 5678)
                (SubtreeDigest (HashValue SHA256 a1b2...))
                (Locators (Final FALSE) (Link /example.com/repo))
            )
            (HashGroup
                (GroupData
                    (SubtreeSize 1234)
                    (SubtreeDigest (HashValue SHA256 abcd...))
                )
                (Pointers
                    (Ptr ...)
                    (Ptr ...)
                )
            )
        )
        (AuthTag 0x00...)
    )
)
(ValidationAlg ...)
(ValidationPayload ...)

Example of a nameless and encrypted manifest node

[FIXED_HEADER OCTET[8]]
(ContentObject/T_OBJECT
    (ExpiryTime/T_EXPIRY 20190630Z000000)
    (Manifest
        (SecurityCtx
            (PresharedKey (KeyNum 55) (IV 8585...) (Mode AES-GCM-256))
        )
        (EncryptedNode ...)
        (AuthTag ...)
    )
)

After in-place decryption, change the type of EncryptedNode to Node, change the type of AuthTag to PAD, and overwrite the AuthTag value with zeros.

[FIXED_HEADER OCTET[8]]    
(ContentObject/T_OBJECT
    (ExpiryTime/T_EXPIRY 20190630Z000000)
    (Manifest
        (SecurityCtx
            (PresharedKey (KeyNum 55) (IV 8585...) (Mode AES-GCM-256))
        )
        (Node ...)
        (PAD ...)
    )
)

AEAD Encryption Algorithm

AeadData := KeyNum Nonce Mode
KeyNum := INTEGER
Nonce := OCTET+
Mode := AES-GCM-128 | AES-GCM-256 | AES-CCM-128 | AES-CCM-256

The KeyNum identifies a key on the receiver. The key must be the correct length for the Mode used. If the key is longer, use the leftmost bits. Many receivers may have the same key with the same KeyNum. A publisher creates a signed root manifest with a security context. A consumer must ensure that the root manifest signer is the expected publisher for use with the pre-shared key, which may be shared with many other consumers. The publisher may use either method 8.2.1 (deterministic IV) or 8.2.2 (RBG-based IV) [NIST 800-38D] for creating the Nonce. It is also recommended that the publisher and consumers share a 4-byte salt, which is not transmitted in-band.
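
The text does not say how the out-of-band salt and the in-band Nonce are combined. The sketch below shows one possible deterministic construction in the style of NIST 800-38D Section 8.2.1, assuming a 96-bit IV made of the 4-byte salt (fixed field) followed by an 8-byte counter (invocation field) that is sent as the Nonce. The class name and the layout are assumptions of this example, not ccnpy behavior.

```python
import struct


class DeterministicNonce:
    """Hypothetical helper: IV = 4-byte salt || 8-byte big-endian counter."""

    def __init__(self, salt: bytes):
        if len(salt) != 4:
            raise ValueError("salt must be exactly 4 bytes")
        self._salt = salt
        self._counter = 0

    def next(self):
        """Return (nonce, iv). Only the nonce would travel in the SecurityCtx."""
        if self._counter >= 2 ** 64:
            raise OverflowError("counter exhausted; a new key is needed")
        nonce = struct.pack("!Q", self._counter)
        self._counter += 1          # never reuse an IV under the same key
        return nonce, self._salt + nonce


generator = DeterministicNonce(b"\x01\x02\x03\x04")
nonce0, iv0 = generator.next()
nonce1, iv1 = generator.next()
print(iv0.hex(), iv1.hex())
```

A receiver holding the same salt could rebuild the IV from the received Nonce in the same way.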

Each encrypted manifest node (root manifest or internal manifest) has a full security context (KeyNum, Nonce, Mode). The AES-GCM decryption is independent for each manifest so Manifest objects can be fetched and decrypted in any order. This design also ensures that if a manifest tree points to the same subtree repeatedly, such as for deduplication, the decryptions are all idempotent.

The functions for authenticated encryption and authenticated decryption are as given in Sections 7.1 and 7.2 of NIST 800-38D: GCM-AE_K(IV, P, A) and GCM-AD_K(IV, C, A, T).

EncryptNode(SecurityCtx, Node, K, IV) -> GCM-AE_K(IV, P, A) -> (C, T)
    Node: The wire format of the Node (P)
    SecurityCtx: The wire format of the SecurityCtx as the Additional Authenticated Data (A)
    K: the pre-shared key (128 or 256 bits)
    IV: The initialization vector (usually 96 or 128 bits)
    C: The cipher text
    T: The authentication tag

The pair (C,T) is the OpaqueNode encoded as a TLV structure:

(OpaqueNode (CipherText C) (AuthTag T))

DecryptNode(SecurityCtx, C, T, K, IV) -> GCM-AD_K (IV, C, A, T) -> (Node, FailFlag)
    Node: The wire format of the decrypted Node
    FailFlag: Indicates authenticated decryption failure (true or false)

If doing in-place decryption, the cipher text C will be enclosed in an EncryptedNode TLV value. After decryption, change the TLV type to Node. The length should be the same. After decryption the AuthTag is no longer needed. The TLV type should be changed to T_PAD and the value zeroed. The SecurityCtx could be changed to T_PAD and zeroed or left as-is.
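
The retagging step after an in-place decryption can be sketched on a raw buffer as follows. This is an illustration only: it performs no cryptography (the value is treated as already decrypted), it assumes the 2-byte type and 2-byte length TLV encoding of RFC 8609, and the function name and all type codes are placeholders.

```python
import struct

# Placeholder type codes for this sketch only.
T_ENCRYPTED_NODE = 0x0101
T_NODE = 0x0102
T_AUTH_TAG = 0x0103
T_PAD = 0x0FFE


def retag(buffer: bytearray, offset: int, new_type: int, zero_value: bool = False) -> int:
    """Overwrite the type of the TLV at offset, optionally zeroing its value.

    The length field is left alone, so the packet size does not change.
    Returns the offset of the next TLV.
    """
    _, length = struct.unpack_from("!HH", buffer, offset)
    struct.pack_into("!H", buffer, offset, new_type)
    start = offset + 4
    if zero_value:
        buffer[start:start + length] = bytes(length)
    return start + length


# Stand-in fragment: an EncryptedNode followed by a 16-byte AuthTag.
fragment = bytearray(
    struct.pack("!HH", T_ENCRYPTED_NODE, 5) + b"plain"    # pretend this was decrypted in place
    + struct.pack("!HH", T_AUTH_TAG, 16) + b"\xaa" * 16
)
size_before = len(fragment)

next_offset = retag(fragment, 0, T_NODE)               # EncryptedNode -> Node
retag(fragment, next_offset, T_PAD, zero_value=True)   # AuthTag -> PAD, zeroed
print(fragment.hex())
```

Because only the type fields and the tag value are rewritten, every offset and length elsewhere in the packet stays valid.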

Implementation notes

Dependencies

The dependencies are in the pyproject.toml file for use with poetry.

graphviz is required on the system if you will use the ManifestGraph module generated by Traversal.

Serialization and Deserialization

The class methods deserialize(buffer) take a byte array (array.array("B", ...)). They are available as ccnpy.Packet.deserialize(buffer), ccnpy.FixedHeader.deserialize(buffer), ccnpy.Tlv.deserialize(buffer), and ccnpy.Link.deserialize(buffer). Other classes work at the TLV level via the class method parse(tlv).

Typically, all one needs to do is call ccnpy.Packet.deserialize(buffer) or ccnpy.Packet.load(filename) and everything else is done automatically.

The serialize() methods always return a byte array (array.array("B", ...)). Typically, all one needs to do is call ccnpy.Packet.serialize() or ccnpy.Packet.save(filename).
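
To show the kind of buffer these methods work with, here is a minimal sketch that decodes the 8-byte fixed header from an array.array("B", ...) using only the standard library. It does not call ccnpy.FixedHeader; the function name is hypothetical and the field layout follows the RFC 8609 fixed header. The sample bytes are made up to match the FH fields of the packet dump shown earlier.

```python
import array
import struct


def parse_fixed_header(buffer: array.array) -> dict:
    """Decode version, packet type, packet length, 3 type-specific bytes, header length."""
    ver, pt, plen, flds, hlen = struct.unpack("!BBH3sB", bytes(buffer[:8]))
    return {"ver": ver, "pt": pt, "plen": plen, "flds": flds.hex(), "hlen": hlen}


# Hand-built header: version 1, content object, packet length 217, header length 8.
wire = array.array("B", [0x01, 0x01, 0x00, 0xD9, 0x00, 0x00, 0x00, 0x08])
header = parse_fixed_header(wire)
print(header)
```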

Building Trees

ccnpy.flic.tree.TreeBuilder will construct a pre-order tree in a single pass going from the tail of the data to the beginning. This allows us to create all the children of a parent before the parent, which means we can populate all the hash pointers.

Pre-order traversal and the reverse pre-order traversal are shown below. In a nutshell, we create the right-most-child manifest, then its parent, then the indirect pointers of that parent, then the parent's direct pointers, then the parent of the parent (repeating). This process uses recursion, as I think it is the clearest way to show the code. A more optimized approach could do it in a true single pass.

Here is the pseudocode for preorder and reverse_preorder traversals of a tree. The pseudocode below, and the class TreeBuilder, use the reverse_preorder approach to building the manifest tree.

preorder(node)
    if (node = null)
        return
    visit(node)
    preorder(node.left)
    preorder(node.right)

reverse_preorder(node)
    if (node = null)
        return
    reverse_preorder(node.right)
    reverse_preorder(node.left)
    visit(node)

Because we're building from the bottom up, we use the term 'level' to be the distance from the right-most child up. Level 0 is the bottom-most level of the tree, such as where node 7 is:

        1
    2       3
  4  5    6  7
  preorder: 1 2 4 5 3 6 7
  reverse:  7 6 3 5 4 2 1
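
The two traversals can be checked with a short Python sketch on the same seven-node tree. The Node class and the visit callback are illustrative helpers for this example and are not part of ccnpy.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    value: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def preorder(node, visit):
    if node is None:
        return
    visit(node.value)
    preorder(node.left, visit)
    preorder(node.right, visit)


def reverse_preorder(node, visit):
    if node is None:
        return
    reverse_preorder(node.right, visit)
    reverse_preorder(node.left, visit)
    visit(node.value)   # a parent is visited only after all of its children


tree = Node(1, Node(2, Node(4), Node(5)), Node(3, Node(6), Node(7)))

forward, backward = [], []
preorder(tree, forward.append)
reverse_preorder(tree, backward.append)
print(forward)    # [1, 2, 4, 5, 3, 6, 7]
print(backward)   # [7, 6, 3, 5, 4, 2, 1]
```

Visiting the parent last is what lets a builder compute the hash of every child before it writes the parent's pointers.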

Here is the pseudo-code for what TreeBuilder does:

build_tree(data[0..n-1], n, k, m)
    # data is the application data
    # n is the number of data items
    # k is the number of direct pointers per internal node
    # m is the number of indirect pointers per internal node

    # segment is a mutable [head, tail) window over data (a namedtuple would be immutable)
    segment = Segment(head=0, tail=n)
    level = 0

    # This bootstraps the process by creating the right most child manifest
    # A leaf manifest has no indirect pointers, so k+m are direct pointers
    root = leaf_manifest(data, segment, k + m)

    # Keep building subtrees until we're out of direct pointers
    while not segment.empty():
        level += 1
        root = bottom_up_preorder(data, segment, level, k, m, root)

    return root

bottom_up_preorder(data, segment, level, k, m, right_most_child=None)
    manifest = None
    if level == 0:
        assert right_most_child is None
        # build a leaf manifest with only direct pointers
        manifest = leaf_manifest(data, segment, k + m)
    else:
        # If the number of remaining direct pointers will fit in a leaf node, make one of those.
        # Otherwise, we need to be an interior node
        if right_most_child is None and segment.length() <= k + m:
            manifest = leaf_manifest(data, segment, k+m)
        else:
            manifest = interior_manifest(data, segment, level, k, m, right_most_child)
    return manifest

leaf_manifest(data, segment, count)
    # At most count items, but never go before the head
    start = max(segment.head, segment.tail - count)
    manifest = Manifest(data[start:segment.tail])
    segment.tail = start
    return manifest

interior_manifest(data, segment, level, k, m, right_most_child)
    children = []
    if right_most_child is not None:
        children.append(right_most_child)

    interior_indirect(data, segment, level, k, m, children)
    interior_direct(data, segment, level, k, m, children)

    manifest = Manifest(children)
    return manifest

interior_indirect(data, segment, level, k, m, children)
    # Reserve space at the head of the segment for this node's direct pointers before
    # descending to children.  We want the top of the tree packed.
    reserve_count = min(k, segment.tail - segment.head)
    segment.head += reserve_count

    while len(children) < m and not segment.head == segment.tail:
        child = bottom_up_preorder(data, segment, level - 1, k, m)
        # prepend
        children.insert(0, child)

    # Pull back our reservation and put those pointers in our direct children
    segment.head -= reserve_count

interior_direct(data, segment, level, k, m, children)
    while len(children) < k+m and not segment.head == segment.tail:
        pointer = data[segment.tail - 1]
        children.insert(0, pointer)
        segment.tail -= 1
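
For experimentation, the pseudo-code can be turned into a small runnable Python sketch. This is not the ccnpy TreeBuilder: a manifest is modeled as a plain list subclass holding data items and child manifests instead of hash pointers, Segment and flatten are helpers invented for the example, and the reservation uses k (the direct pointer count), which is our reading of the comment in interior_indirect.

```python
from dataclasses import dataclass


class Manifest(list):
    """Stand-in manifest: direct data items first, then child manifests."""


@dataclass
class Segment:
    head: int
    tail: int

    def length(self):
        return self.tail - self.head

    def empty(self):
        return self.head >= self.tail


def leaf_manifest(data, segment, count):
    # At most count items, but never go before the head.
    start = max(segment.head, segment.tail - count)
    manifest = Manifest(data[start:segment.tail])
    segment.tail = start
    return manifest


def interior_manifest(data, segment, level, k, m, right_most_child):
    children = Manifest()
    if right_most_child is not None:
        children.append(right_most_child)
    # Reserve room at the head for this node's direct pointers, then build children.
    reserve_count = min(k, segment.length())
    segment.head += reserve_count
    while len(children) < m and not segment.empty():
        children.insert(0, bottom_up_preorder(data, segment, level - 1, k, m))
    segment.head -= reserve_count
    # Fill the remaining slots with direct pointers, taken from the tail.
    while len(children) < k + m and not segment.empty():
        children.insert(0, data[segment.tail - 1])
        segment.tail -= 1
    return children


def bottom_up_preorder(data, segment, level, k, m, right_most_child=None):
    if level == 0:
        assert right_most_child is None
        return leaf_manifest(data, segment, k + m)
    if right_most_child is None and segment.length() <= k + m:
        return leaf_manifest(data, segment, k + m)
    return interior_manifest(data, segment, level, k, m, right_most_child)


def build_tree(data, k, m):
    segment = Segment(0, len(data))
    level = 0
    root = leaf_manifest(data, segment, k + m)
    while not segment.empty():
        level += 1
        root = bottom_up_preorder(data, segment, level, k, m, root)
    return root


def flatten(manifest):
    """Pre-order walk that returns the data items in the order a reader would see them."""
    items = []
    for entry in manifest:
        items.extend(flatten(entry) if isinstance(entry, Manifest) else [entry])
    return items


root = build_tree(list(range(10)), k=2, m=2)
print(root)            # [0, 1, [2, 3, 4, 5], [6, 7, 8, 9]]
print(flatten(root))   # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```

A pre-order walk of the result returns the data in its original order, which is the property the tail-to-head construction is meant to preserve.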
