I decided to continue work on a modular network backend for the Dash Core software. The idea is to provide a modular framework that allows Dash nodes to communicate using different network protocols. New protocols can be added as modules for this framework. The initial goal is to move the current TCP-based protocol into a module and then write a module for I2P, an anonymous overlay network.
I've prepared a pre-proposal for the Dash governance system in which I describe the rationale for this project in more detail. But to proceed further, some possibly breaking changes to the Dash protocol are needed to support address formats other than the IPv6 address and port currently used. For example, an I2P address is a SHA2-256 hash.
So, I would like to start a discussion about what exactly these changes are and whether the Dash Core team is willing to accept them into the Dash Core software.
I've identified the following places in the Dash protocol/software where a TCP address/port is serialised/deserialised. Please let me know if I missed something.
"version" message.
It contains two addresses:
- the address of the message receiver as seen from the message sender; this is used to adjust scores of externally visible node addresses to decide which ones to advertise;
- the address of the sender; this is no longer used because it can leak private IP addresses; the zero address (::) is always sent.
"addr" message.
This message is used to advertise addresses of other known nodes.
It contains the number of addresses (in compact size format) followed by the address records.
Each address record contains:
- timestamp;
- bitmask of flags describing services provided by the node;
- 128-bit IPv6 address;
- 16-bit TCP port number.
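To make the record layout concrete, here is a minimal Python sketch that decodes one such record. The field widths and byte order of the timestamp and service flags are not stated above; the sketch assumes the Bitcoin-style layout (32-bit timestamp and 64-bit flags in little-endian, port in network byte order). The function name `parse_addr_record` is made up for illustration.

```python
import ipaddress
import struct

# Assumed layout: uint32 timestamp and uint64 service flags (little-endian),
# 16-octet IPv6 address, uint16 TCP port in network (big-endian) byte order.
ADDR_RECORD_SIZE = 4 + 8 + 16 + 2  # 30 octets


def parse_addr_record(data: bytes) -> dict:
    """Decode one fixed-size address record (hypothetical helper)."""
    if len(data) != ADDR_RECORD_SIZE:
        raise ValueError("address record must be exactly 30 octets")
    timestamp, services = struct.unpack_from("<IQ", data, 0)
    address = ipaddress.IPv6Address(data[12:28])
    (port,) = struct.unpack_from(">H", data, 28)
    return {
        "timestamp": timestamp,
        "services": services,
        "address": address,
        "port": port,
    }
```

The fixed 16 + 2 octets at the end of the record are exactly what leaves no room for any other kind of address.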
I see two possibilities here:
- just change the address format to the universal one starting with some protocol version, and read addresses in the old format from nodes that advertise an older protocol version;
- add a new message type for advertising extended addresses.
Masternode addresses.
Currently there is a requirement for a masternode to have a public routable IPv4 address. There are multiple places in the protocol where this address is sent (but still in IPv6-mapped format). I don't expect this requirement to be lifted right now, but it would be beneficial to be prepared for further extension. Masternodes that have multiple addresses in different networks will significantly improve the resilience of the Dash network.
It makes no sense to change the address format in the legacy code that implements pre-deterministic masternode functionality, because it will be removed soon. Hence, I'd like to concentrate on the functionality that implements deterministic masternodes. And there is one big place where the masternode address is used: the TRANSACTION_PROVIDER_REGISTER transaction type.
Currently a TRANSACTION_PROVIDER_REGISTER transaction contains a single IPv6(-mapped) address and TCP port. This can be extended to contain a vector of addresses for further extension. For now, a check can be added enforcing that this vector contains a single IPv6-mapped IPv4 address.
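The interim check on the address vector could look roughly like the Python sketch below. It assumes each vector entry is the 18-octet TCP form (16-octet IPv6 address followed by a 2-octet port); the function name is hypothetical, and the sketch does not test whether the address is publicly routable.

```python
import ipaddress
from typing import Sequence


def is_single_mapped_ipv4(addresses: Sequence[bytes]) -> bool:
    """Return True if the vector holds exactly one IPv4-mapped IPv6 entry.

    Assumption: each entry is 18 octets, a 16-octet IPv6 address followed
    by a 2-octet TCP port.
    """
    if len(addresses) != 1:
        return False
    entry = addresses[0]
    if len(entry) != 18:
        return False
    # ipv4_mapped is None unless the address is of the form ::ffff:a.b.c.d
    return ipaddress.IPv6Address(entry[:16]).ipv4_mapped is not None
```

Relaxing the requirement later would then only mean loosening this one predicate, not changing the transaction format again.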
I see the following possibilities for introducing an extended address vector into the TRANSACTION_PROVIDER_REGISTER transaction:
- switch the format with a bump of the transaction version;
- introduce another transaction type.
File formats.
There are several data files that contain addresses. The format of these files has to be extended in an incompatible way. Files with serialised data have a version in their header, so a new version has to be introduced. After a software upgrade, a file of an old version will still be read in a compatible way, but it will be saved using the new version and extended addresses.
The files are:
- banlist.dat: banned addresses and subnets;
- peers.dat: known nodes;
- mncache.dat: known masternodes;
- netfulfilled.dat: fulfilled synchronisation requests; this file contains no version, so it can probably just be removed on upgrade.
Universal address serialisation format.
The whole idea of a modular network backend is to be able to have multiple backends and to add new backends without bumping the protocol version number. Hence, it's possible that some nodes will announce addresses that other nodes don't understand. In this case, nodes should just ignore addresses they don't know how to handle (but they may still relay these addresses in "addr" messages).
I propose that the universal address serialisation be in TLV (type, length, value) format.
- Type is a numerical label that is assigned to a network backend by a central registry (the Dash Core team). The size of this label is up for discussion, but I think that 32 bits should be enough. There should also be a range allocated for experimental (unstable) extensions.
- Length is necessary because a network backend label (and hence its address size) can be unknown to a node, but the node should still be able to handle the record. Again, the size of this field is up for discussion. Probably 16 bits will be enough. Or, even better, the compact size format can be used: it occupies 1 octet for sizes up to 252 octets (2016 bits).
- Value is raw backend-specific data. For the TCP-based backend it will be 18 octets containing a 128-bit IPv6 address and a 16-bit TCP port. For the I2P backend it will be 32 octets containing a SHA2-256 hash.
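As a rough illustration of the TLV idea, here is a Python sketch using a 32-bit type and a compact size length. The type values `TYPE_TCP` and `TYPE_I2P`, the little-endian encoding of the type, and all function names are assumptions made for the example; nothing here is an agreed format.

```python
import struct

# Hypothetical labels; real values would be assigned by the registry.
TYPE_TCP = 1
TYPE_I2P = 2
KNOWN_SIZES = {TYPE_TCP: 18, TYPE_I2P: 32}


def write_compact_size(n: int) -> bytes:
    """Encode n in compact size format (1, 3, 5 or 9 octets)."""
    if n < 253:
        return struct.pack("<B", n)
    if n <= 0xFFFF:
        return b"\xfd" + struct.pack("<H", n)
    if n <= 0xFFFFFFFF:
        return b"\xfe" + struct.pack("<I", n)
    return b"\xff" + struct.pack("<Q", n)


def read_compact_size(data: bytes, pos: int):
    """Return (value, position after the encoded size)."""
    first = data[pos]
    if first < 253:
        return first, pos + 1
    fmt = {253: "<H", 254: "<I", 255: "<Q"}[first]
    (value,) = struct.unpack_from(fmt, data, pos + 1)
    return value, pos + 1 + struct.calcsize(fmt)


def encode_address(addr_type: int, value: bytes) -> bytes:
    """Serialise one address as type (assumed uint32 LE), length, value."""
    return struct.pack("<I", addr_type) + write_compact_size(len(value)) + value


def decode_addresses(data: bytes):
    """Split concatenated TLV records into (known, unknown) lists.

    Unknown types are kept as opaque bytes so they can still be relayed.
    """
    known, unknown = [], []
    pos = 0
    while pos < len(data):
        (addr_type,) = struct.unpack_from("<I", data, pos)
        length, pos = read_compact_size(data, pos + 4)
        value = data[pos:pos + length]
        if len(value) != length:
            raise ValueError("truncated address value")
        pos += length
        if addr_type not in KNOWN_SIZES:
            unknown.append((addr_type, value))
        elif KNOWN_SIZES[addr_type] != length:
            raise ValueError("wrong length for known address type")
        else:
            known.append((addr_type, value))
    return known, unknown
```

Because the length is always present, a node can step over a record of a type it has never seen and carry on with the next one, which is what makes adding backends without a protocol version bump possible.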
Conclusion.
As you see, the changes required for a modular network backend are not so dramatic, and they can be introduced in a compatible way so as not to break old nodes.
I would like to hear from the Dash Core team whether there are any objections to introducing these protocol changes and to accepting pull requests implementing them.
You can watch the current state of my work here:
https://github.com/OlegGirko/dash/commits/modular_net_backend
But be careful if you check out this branch: I'm going to rebase it a lot before submitting pull requests.
Update 1. Corrected a stupid mistake about the estimated size of the length field of the universal address serialisation format.