
PoSe banned Evo node "is not a Regular"?

Theia

New member
I got an Evo node PoSe banned by a hard restart of the VPS. This node had received payouts before, so I know the setup, ProRegTx IDs, BLS keys etc. were correct. When it relaunched, some database was corrupt (can't this be prevented?):

2024-05-14T08:51:16Z cl-schdlr thread start
2024-05-14T08:51:16Z Fatal LevelDB error: Corruption: checksum mismatch: /home/theia/.dashcore/llmq/isdb/000406.log
2024-05-14T08:51:16Z You can use -debug=leveldb to get more complete diagnostic messages
2024-05-14T08:51:16Z Fatal LevelDB error: Corruption: checksum mismatch: /home/theia/.dashcore/llmq/isdb/000406.log
2024-05-14T08:51:16Z : Error opening block database.
Please restart with -reindex or -reindex-chainstate to recover.

I had set up the node according to this guide: https://www.dash.org/forum/index.ph...e-setup-with-systemd-auto-re-start-rfc.39460/ so instead of the regular sudo systemctl start dashd I su'ed into the dash user and ran dashd manually:
/opt/dash/bin/dashd -reindex
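
For reference, a minimal sketch of the same one-off reindex without switching shells, assuming the unit name dashd and service user dash from that guide (adjust both to your setup):

Code:
# Stop the managed instance first so only one dashd touches the datadir
sudo systemctl stop dashd
# Run the one-off reindex as the service user so file ownership stays correct
sudo -u dash /opt/dash/bin/dashd -reindex
# After the reindex finishes and dashd has been shut down cleanly,
# hand control back to systemd
sudo systemctl start dashd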

When it was done I rebooted the server again, this time from the command line, and let the systemd setup relaunch it automatically. It appears the node is fully synced:

dash-cli mnsync status
{
"AssetID": 999,
"AssetName": "MASTERNODE_SYNC_FINISHED",
"AssetStartTime": 1715691591,
"Attempt": 0,
"IsBlockchainSynced": true,
"IsSynced": true
}
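
Besides mnsync, the node's own view of its registration can be checked as well; on a PoSe-banned node I would expect the state field to reflect the ban (a quick check, assuming the standard masternode status RPC):

Code:
# Shows the local masternode's ProTx hash, service address and state;
# a PoSe-banned node should report that in the state/status fields
dash-cli masternode status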

The log shows tens of thousands of lines like:
2024-05-14T14:13:54Z ThreadSocketHandler -- removing node: peer=42303 nRefCount=1 fInbound=1 m_masternode_connection=1 m_masternode_iqr_connection=0
I wasn't sure if this is part of a clean-up after the reindex, but the peer=x number just keeps counting up seemingly endlessly. Is this normal?

Anyway, I tried to unban the node and constructed the protx update_service command in Dash-QT (the desktop wallet) according to https://docs.dash.org/en/stable/docs/user/masternodes/maintenance.html#proupservtx. I double- and triple-checked that all parameters are correct, including the empty one ("") for operatorPayoutAddress. The response is:

masternode with proTxHash [hash] is not a Regular (code -1)

What does that mean? "not a regular" as in "it's an Evo node"? Does that require a different protx update_service? What can I do?
 
Yes! For Evo nodes there is a different update transaction.

Code:
protx update_service_evo "proTxHash" "ipAndPort" "operatorKey" "platformNodeID" platformP2PPort platformHTTPPort ( "operatorPayoutAddress" "feeSourceAddress" )

Creates and sends a ProUpServTx to the network. This will update the IP address and the Platform fields
of an EvoNode.
If this is done for an EvoNode that got PoSe-banned, the ProUpServTx will also revive this EvoNode.

Requires wallet passphrase to be set with walletpassphrase call if wallet is encrypted.

Arguments:
1. proTxHash                (string, required) The hash of the initial ProRegTx.
2. ipAndPort                (string, required) IP and port in the form "IP:PORT". Must be unique on the network.
3. operatorKey              (string, required) The operator BLS private key associated with the
                            registered operator public key.
4. platformNodeID           (string, required) Platform P2P node ID, derived from P2P public key.
5. platformP2PPort          (numeric, required) TCP port of Dash Platform peer-to-peer communication between nodes (network byte order).
6. platformHTTPPort         (numeric, required) TCP port of Platform HTTP/API interface (network byte order).
7. operatorPayoutAddress    (string, optional, default=) The address used for operator reward payments.
                            Only allowed when the ProRegTx had a non-zero operatorReward value.
                            If set to an empty string, the currently active payout address is reused.
8. feeSourceAddress         (string, optional, default=) If specified wallet will only use coins from this address to fund ProTx.
                            If not specified, payoutAddress is the one that is going to be used.
                            The private key belonging to this address must be known in your wallet.
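
To make that concrete, here is a hedged example of what the call could look like; every value is a placeholder, and the Platform ports shown (26656 for P2P, 443 for HTTP) are only the commonly used mainnet defaults, so substitute whatever your node actually uses. In the Dash-QT debug console, drop the leading dash-cli and the line continuations:

Code:
# All values are placeholders -- use your own ProRegTx hash, node address,
# operator BLS secret key and Platform node ID.
# "" re-uses the currently active operator payout address
# (only relevant if the ProRegTx had a non-zero operatorReward).
dash-cli protx update_service_evo \
  "<proTxHash>" \
  "203.0.113.5:9999" \
  "<operatorBLSPrivateKey>" \
  "<platformNodeID>" \
  26656 443 \
  ""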
 
The log shows tens of thousands of lines like:
2024-05-14T14:13:54Z ThreadSocketHandler -- removing node: peer=42303 nRefCount=1 fInbound=1 m_masternode_connection=1 m_masternode_iqr_connection=0
I wasn't sure if this is part of a clean-up after the reindex, but the peer=x number just keeps counting up seemingly endlessly. Is this normal?
I think this was more of an issue on some previous release versions. Recent releases don't seem to do this as much (at least not counting into the thousands like you're showing). When it does count up into the many thousands, it can start affecting dashd's resource usage (which risks PoSe scores). These peer removals are not saved permanently and are in memory only for the currently running instance. A simple restart every once in a while keeps the memory usage low. Also, make sure you've updated to the latest version of Dash.

Another thing: after finishing a full reindex/resync you should stop and restart dashd to reclaim some extra memory. The goal is for it to have as many resources as possible available when it is randomly selected for quorum duties.
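
If you want to automate that occasional restart, one possible sketch (it assumes the systemd unit from the guide above is called dashd.service; the file name and schedule are arbitrary):

Code:
# /etc/cron.d/dashd-restart (hypothetical file)
# Restart dashd at 04:00 on the 1st and 15th of each month
0 4 1,15 * * root /usr/bin/systemctl restart dashd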
 
@Theia

Currently Evonodes have just L1 access and basically operate as normal masternodes. Once Platform activates, Evonodes will need L2 access (Platform access), at which point you could find that it is possibly not supported through the 'System wide Masternode Setup with Systemd auto (re)start RFC' method.

I hope you have thought of this and made plans to use either Dashmate (which uses Docker) or Masternode Zeus (which does not use Docker) once Platform activates on Mainnet and the new payment scheme (masternode rewards & platform credit rewards) takes effect for your Evonode.

Or be really, really sure that this 'System wide Masternode Setup with Systemd auto (re)start RFC' method will actually fully support Evonodes with both L1 & L2 access.
 
Yes! For Evo nodes there is a different update transaction.

Code:
protx update_service_evo "proTxHash" "ipAndPort" "operatorKey" "platformNodeID" platformP2PPort platformHTTPPort ( "operatorPayoutAddress" "feeSourceAddress" )

Thanks! The documentation doesn't explain this:
[Screenshot: the Dash documentation page does not list protx update_service_evo]


Is the documentation a community effort? Can I help update it somehow?
 
I think this was more of an issue on some previous release versions. Recent releases don't seem to do this as much (at least not counting into the thousands like you're showing). When it does count up into the many thousands, it can start affecting dashd's resource usage (which risks PoSe scores). These peer removals are not saved permanently and are in memory only for the currently running instance. A simple restart every once in a while keeps the memory usage low. Also, make sure you've updated to the latest version of Dash.
It's at over 90,000 now:

2024-05-15T09:24:43Z ThreadSocketHandler -- removing node: peer=90118 nRefCount=1 fInbound=1 m_masternode_connection=1 m_masternode_iqr_connection=0

My version is 20.1.0. I see there is a 20.1.1 point release where the release notes say:

Work Queue RPC Fix / Deadlock Fix
A deadlock caused nodes to become non-responsive and RPC to report "Work depth queue exceeded". Thanks to Konstantin Akimov (knst) who discovered the cause. This previously caused masternodes to become PoSe banned.


Is that this issue?

I've had a regular MN crash every 1-2 weeks in the past if not manually rebooted in time. This happened across several major releases. Maybe it's related? Never noticed this wall of ThreadSocketHandler log messages before though.

The simple restart is OK, as long as it doesn't end up corrupting the database and forcing me to reindex, which takes long enough to get me banned. That's why I was asking at the beginning whether this can be prevented. I thought Dash would be resilient against reboots.
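
(If it helps, the running daemon's version can be confirmed straight over RPC; getnetworkinfo is a standard call, the grep is only there for readability:)

Code:
# Confirm which version the node is actually running before/after upgrading
dash-cli getnetworkinfo | grep -E '"(sub)?version"'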
 
It's at over 90,000 now:

2024-05-15T09:24:43Z ThreadSocketHandler -- removing node: peer=90118 nRefCount=1 fInbound=1 m_masternode_connection=1 m_masternode_iqr_connection=0

My version is 20.1.0. I see there is a 20.1.1 point release where the release notes say:

Work Queue RPC Fix / Deadlock Fix
A deadlock caused nodes to become non-responsive and RPC to report "Work depth queue exceeded". Thanks to Konstantin Akimov (knst) who discovered the cause. This previously caused masternodes to become PoSe banned.


Is that this issue?

I've had a regular MN crash every 1-2 weeks in the past if not manually rebooted in time. This happened across several major releases. Maybe it's related? Never noticed this wall of ThreadSocketHandler log messages before though.


This is a complete red herring. The message you are seeing is completely fine.

 
It's at over 90,000 now:

2024-05-15T09:24:43Z ThreadSocketHandler -- removing node: peer=90118 nRefCount=1 fInbound=1 m_masternode_connection=1 m_masternode_iqr_connection=0

My version is 20.1.0. I see there is a 20.1.1 point release where the release notes say:

Work Queue RPC Fix / Deadlock Fix
A deadlock caused nodes to become non-responsive and RPC to report "Work depth queue exceeded". Thanks to Konstantin Akimov (knst) who discovered the cause. This previously caused masternodes to become PoSe banned.


Is that this issue?

I've had a regular MN crash every 1-2 weeks in the past if not manually rebooted in time. This happened across several major releases. Maybe it's related? Never noticed this wall of ThreadSocketHandler log messages before though.

The simple restart is OK, as long as it doesn't end up corrupting the database and forcing me to reindex, which takes long enough to get me banned. That's why I was asking at the beginning whether this can be prevented. I thought Dash would be resilient against reboots.
I have noticed greater stability (fewer PoSe scores for no reason) with 20.1.1 compared to 20.1.0, so you should definitely upgrade. The correlation with a high removed-peer count could just be coincidental, but from my observation a randomly PoSe-banned node would also have a high count and high memory usage compared to a "healthy" node on the same system specs.

With regard to corruption, I think overall resiliency is good. In my experience the worst that happens is that you might have to delete a corrupt sporks.dat or settings.json before the wallet will start again after a system crash or after running out of disk space.
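
The simplest protection against that kind of corruption is to shut dashd down cleanly before any planned reboot so LevelDB can flush its data. A minimal sketch, assuming the dashd systemd unit from the guide above:

Code:
# Let dashd flush and close its databases before the machine goes down
sudo systemctl stop dashd      # or, without systemd: dash-cli stop
sudo reboot
# On boot, the systemd auto-(re)start brings dashd back up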
 
Update: It still happens with 20.1.1. Dashd seems to eventually eat up 3x its "normal" memory usage. Only fix I've found is to periodically restart. Once a month or every few weeks should be sufficient.
 