Evolution - Dashdrive Discussion

oaxaca

Well-known Member
Foundation Member
Jul 8, 2014
573
832
263
Each member of the second tier will be required to have a specific amount of storage space in
order to power the DashDrive filesystem. By sharding the storage via the collateral transaction
hash, we can define 1024 different shared storage devices on the network. We use 1024
because, we can identify shards by using the first 10 bits of a unique hash per storage object.
For example, with a 40GB allocation requirement, the network can enjoy about 40960GB of
storage space. When users interact with the network they will transmit information to be stored
on DashDrive via the decentralized API.
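
A minimal sketch of the shard lookup described above, to make the "first 10 bits" rule and the 40960GB figure concrete. The use of SHA-256 and the helper names (`shard_id`, `object_shard`, `total_capacity_gb`) are assumptions for illustration; the paper does not say which hash function is used.

```python
import hashlib

SHARD_BITS = 10
SHARD_COUNT = 1 << SHARD_BITS  # 2**10 = 1024 shards


def shard_id(digest: bytes) -> int:
    """Return the shard index encoded in the first 10 bits of a digest."""
    if len(digest) < 2:
        raise ValueError("digest must be at least 2 bytes long")
    first_16_bits = int.from_bytes(digest[:2], "big")
    # Drop the low 6 bits so only the leading 10 bits remain.
    return first_16_bits >> (16 - SHARD_BITS)


def object_shard(data: bytes) -> int:
    """Hash a storage object (SHA-256 is an assumption) and map it to a shard."""
    return shard_id(hashlib.sha256(data).digest())


def total_capacity_gb(per_node_gb: int) -> int:
    """Unique capacity if every shard holds per_node_gb of distinct data."""
    return per_node_gb * SHARD_COUNT
```

With a 40GB allocation, `total_capacity_gb(40)` gives the 40960GB quoted above.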

For redundancy, each shard will be stored multiple times on the network. For example if the
network has 5000 Masternodes, we will store each item ((5000/1024)+seed_count) times.
DashDrive supports a few advanced features such as transactional commits, where users can
require multiple files get written to different destinations on the network. If any write fails, the
entire commit for all files will be reverted.
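
As a rough illustration of the redundancy formula, the helper below evaluates ((masternodes/1024)+seed_count). The paper does not define seed_count or say whether the division is rounded, so the function name, the parameter, and the unrounded result are all assumptions.

```python
def copies_per_item(masternodes: int, seed_count: int, shards: int = 1024) -> float:
    """Evaluate (masternodes / shards) + seed_count without rounding."""
    if shards <= 0:
        raise ValueError("shards must be positive")
    return masternodes / shards + seed_count
```

For 5000 masternodes the first term is 5000/1024, a little under 5, so each item would be stored roughly five times plus whatever seed_count adds.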

In addition, reading or writing files is only possible when a user has access to a given file, such
as their own profile page. When trying to read files a user does not have access to, they will be
denied access.

Writing files can be done only by having enough quorum signatures, and can be used to do
maintenance or allow users to update information on the network.
 

oaxaca

Well-known Member
Foundation Member
Jul 8, 2014
573
832
263
I get it. It's terrifying and beautiful.

Gone are the days of downloading 3500 copies of the blockchain, one onto the drive of each node. Now, the hard drive of each node holds 1/3500 of the blockchain. Instead of querying your local copy with "dash-cli getinfo", you query the high-performance shared computing resource called DashDrive. This is disk drive striping on steroids.
 

Solarminer

Well-known Member
Apr 4, 2015
762
922
163
Is DashDrive only intended for blockchain and transactions?

This will be the most expensive secure storage available. It would be best to keep any outside services off of DashDrive. Bitcoin is getting sidechains that pollute its blockchain. Eventually, DashDrive will be huge, and you don't want to be stuck because third-party storage was allowed early on.

A single transaction lookup would be slower if you have to figure out where it is stored, then request it and download it. Maybe you don't. If you store your transactions locally, they don't take up much space, you have all of your data to search around if you want. The cool part is that there is no waiting for a blockchain to download to make a transaction. That brings this to merchant friendly territory.
 

stan.distortion

Well-known Member
Oct 30, 2014
902
525
163
Is it planned to move the blockchain to distributed mirrored storage using DashDrive? I hadn't come across that bit yet. Is it in the Evolution docs? Certainly a step that justifies the "Evolution" tag, getting close to evolution squared at this stage :)
 

pille

Active Member
Feb 18, 2015
273
302
123
((5000/1024)+seed_count) - can somebody explain what seed_count is, please?
 

Lebubar

Active Member
Mar 15, 2014
249
213
103
I think we have to take real care with this: how redundancy will work.
I don't think we can afford to lose any data on DashDrive.
If, for example, the various copies of one shard can be identified and located (on which MN they are), this can be a big weakness and open up possibilities for hackers to DDoS the shard or even cause data loss.
 

oaxaca

Well-known Member
Foundation Member
Jul 8, 2014
573
832
263
Lebubar said:
"I think we have to take real care with this: how redundancy will work.
I don't think we can afford to lose any data on DashDrive.
If, for example, the various copies of one shard can be identified and located (on which MN they are), this can be a big weakness and open up possibilities for hackers to DDoS the shard or even cause data loss."

As I understand it, the shards are "striped" across various masternodes as well as having redundant sets. The optimum number of copies can be calculated using "uptime" of each node.
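
One way to read "the optimum number of copies can be calculated using uptime" is sketched below. It assumes every node is up independently with the same probability, which is a simplification and not something the paper states; `copies_needed` and its parameters are hypothetical names.

```python
def copies_needed(uptime: float, max_unavailability: float) -> int:
    """Smallest number of copies k such that the chance of all k nodes
    being offline at once, (1 - uptime) ** k, is at most max_unavailability.
    Assumes nodes fail independently of each other."""
    if not 0.0 < uptime < 1.0:
        raise ValueError("uptime must be strictly between 0 and 1")
    if not 0.0 < max_unavailability < 1.0:
        raise ValueError("max_unavailability must be strictly between 0 and 1")
    downtime = 1.0 - uptime
    copies = 1
    while downtime ** copies > max_unavailability:
        copies += 1
    return copies
```

For example, with 99% uptime per node and a target of at most 1 in 100,000 chance that a shard is unreachable, `copies_needed(0.99, 1e-5)` returns 3.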

Should be pretty slick.

Unless the kill switch is pulled on the internet of course.
 

greensheep

New Member
Jan 3, 2016
19
19
3
The Evolution paper does not talk about storing the blockchain in DashDrive in a sharded way (as the primary source of truth). If this is really the plan, it would definitely deserve a section of its own. There are major challenges, e.g.: Who validates the transactions in a block? How do we agree which is the longest chain? What happens if the longest chain changes?

I think the current sharding system has problems with reliability and consistency and therefore should not be used to store any information that is critical for the system to process transactions or to prevent fraud.

Quoting the Dash Evolution Paper:
"By sharding the storage via the collateral transaction hash, we can define 1024 different shared storage devices on the network. We use 1024 because, we can identify shards by using the first 10 bits of a unique hash per storage object.
...
For redundancy, each shard will be stored multiple times on the network. For example if the network has 5000 Masternodes, we will store each item ((5000/1024)+seed_count) times. "

I don't know what seed_count is supposed to be. Let's do some math. Assume the collateral transaction hashes are randomly distributed and we have 5000 masternodes. What is the chance that a shard has no masternode assigned to it? For a given shard x, the probability that all 5000 hashes fall outside x is (1023/1024)^5000 = 0.00755762539, or about 0.76%. This seems low. However, this is just for a single shard x, and we have 1024 of them. What is the probability that all of them have at least one masternode assigned? Treating the shards as independent (an approximation), it is (1 - 0.00755762539)^1024 = 0.00042288898, or about 0.042%. So with high probability there is always some shard that has not a single masternode assigned to it! Do not rely on randomness to produce uniform distributions.
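
The arithmetic above can be checked with a few lines of Python. This is a sketch under the same assumptions as the post: hashes are uniformly random and shards are treated as independent, which is an approximation rather than an exact result. The function names are made up for illustration.

```python
def p_shard_empty(masternodes: int, shards: int = 1024) -> float:
    """Probability that one given shard receives none of the masternodes."""
    return (1.0 - 1.0 / shards) ** masternodes


def p_all_shards_covered(masternodes: int, shards: int = 1024) -> float:
    """Approximate probability that every shard has at least one masternode,
    treating the shards as independent."""
    return (1.0 - p_shard_empty(masternodes, shards)) ** shards


def expected_empty_shards(masternodes: int, shards: int = 1024) -> float:
    """Expected number of shards with no masternode (exact by linearity)."""
    return shards * p_shard_empty(masternodes, shards)
```

For 5000 masternodes, `p_shard_empty` is about 0.0076, `p_all_shards_covered` is about 0.0004, and `expected_empty_shards` is between 7 and 8, so several empty shards are the normal case rather than the exception.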

I think this could be solved by assigning shards to masternodes in a deterministic way in the Quorum Chain: whenever a new masternode is added to the chain, it is assigned to the shard that has the fewest masternodes assigned to it so far. This will provide a uniform assignment as long as there is no long series of removals without new masternodes being added.
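
A minimal sketch of this least-loaded assignment, assuming masternodes are identified by a string ID and ties are broken by the lowest shard index. The `ShardAssigner` class is hypothetical and not part of any Dash code.

```python
from typing import Dict, List


class ShardAssigner:
    """Assign each new masternode to the shard with the fewest members."""

    def __init__(self, shards: int = 1024) -> None:
        self.load: List[int] = [0] * shards
        self.assignment: Dict[str, int] = {}

    def add(self, masternode_id: str) -> int:
        if masternode_id in self.assignment:
            raise ValueError("masternode already assigned")
        # min() returns the first minimal index, so ties go to the lowest shard.
        shard = min(range(len(self.load)), key=self.load.__getitem__)
        self.load[shard] += 1
        self.assignment[masternode_id] = shard
        return shard

    def remove(self, masternode_id: str) -> None:
        shard = self.assignment.pop(masternode_id)
        self.load[shard] -= 1
```

Adding 5000 masternodes this way leaves every one of the 1024 shards with either 4 or 5 members, so no shard is empty. Removals can still unbalance the loads until new masternodes join.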

Another problem with the sharding is DoS attacks: an attacker can DoS all masternodes of a specific shard. This will make the shard inaccessible. Wallets stored in this shard can no longer be accessed.

Even worse, an attacker can replace all masternodes of a specific shard with his own while running just a few masternodes: the shard of his own masternodes can be chosen (if sharding is done using the collateral transaction hash, the transaction can be modified until the hash points to the desired shard; with the Quorum Chain, the masternodes can be added at the correct time). According to the paper, Dash Evolution double-spend prevention relies on DashDrive:
"Double spending is not possible on the Dash Network due to the “Commit or Rollback” (COR) feature of DashDrive. After sending a transaction to the network, it will be written to DashDrive and the inputs will be reversed via usage of the filesystem."
As far as I understand, the prevention works by writing to a DashDrive file which is given by the input address. If the write succeeds, it is assumed that there is no double spend. A double spend attacker knows into which shard this write will go. By controlling all masternodes of this shard, the attacker can accept both writes making the network fail to detect the double spend of his input. Therefore, I do not think DashDrive is suitable for Double-Spend detection with its current design.
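
To show how cheap it is to steer a hash into a chosen shard, here is a toy search over a nonce. It assumes SHA-256 and a made-up payload layout; real collateral transactions are structured differently, so this only illustrates the order of magnitude (about 1024 attempts on average for a 10-bit prefix).

```python
import hashlib
from typing import Tuple


def grind_for_shard(payload: bytes, target_shard: int,
                    max_tries: int = 1_000_000) -> Tuple[int, bytes]:
    """Vary a nonce until the first 10 bits of the hash equal target_shard."""
    if not 0 <= target_shard < 1024:
        raise ValueError("target_shard must be in 0..1023")
    for nonce in range(max_tries):
        digest = hashlib.sha256(payload + nonce.to_bytes(8, "big")).digest()
        # Leading 10 bits of the digest select the shard.
        if int.from_bytes(digest[:2], "big") >> 6 == target_shard:
            return nonce, digest
    raise RuntimeError("no matching hash found within max_tries")
```

Each attempt succeeds with probability 1/1024, so picking the shard costs almost nothing compared with the collateral itself.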

Even in the absence of malicious attackers, I do not see how the Dash network guarantees that the content of the DashDrive of all masternodes in a shard is consistent: writes to some masternodes can succeed while writes to others fail, masternodes can crash, there can be network partitioning, and all the other bad things that can happen in a distributed storage system. If the shards are not consistent, the Dash quorum lookups will have different results, which will cause transactions to fail.

Thoughts?






 

TanteStefana

Grizzled Member
Foundation Member
Mar 9, 2014
2,863
1,854
1,283
I don't think that's the plan. I might be wrong, but I think the sharded storage is for customer information, which is going to be a strain on the system, but not if sharded.

Eventually, I think there will be a very secure way to trim the blockchain, but that's not currently a priority :)

Anyway, your other observations are deeper than I dare go, but I'll PM Evan to look at what you wrote in case you found an issue :) Thanks!!!
 

InTheWoods

Well-known Member
Foundation Member
Oct 12, 2014
721
941
263
Do we know the exact Dash Evolution node hosting requirements? Have they been posted anywhere?
 

coingun

Active Member
Masternode Owner/Operator
Jul 8, 2014
489
402
133
masternode.io
TanteStefana said:
"I don't think that's the plan. I might be wrong, but I think the sharded storage is for customer information."

This was my understanding: a decentralized Mega where you pay the masternode network for your storage. I don't believe there are any plans to use this storage to host the blockchain.
 

TanteStefana

Grizzled Member
Foundation Member
Mar 9, 2014
2,863
1,854
1,283
Oh no, not yet. I am pretty sure it won't be much more than what most of us have at the moment. I mean, my hosting comes with 65 GB of storage. And it's inexpensive.

coingun, just one thing, I don't think most people will have to pay for anything, only if they go over a basic limit or something :) Otherwise, yah :)
 

greensheep

New Member
Jan 3, 2016
19
19
3
InTheWoods said:
"Do we know the exact Dash Evolution node hosting requirements? Have they been posted anywhere?"

The "Scalability and performance" section of the Evolution paper has some numbers on expected storage based on the rate of network transactions. I think the idea is to not have fixed requirements but to require the MNs to scale as the network grows, which is supposed to happen in lockstep with an increase in the Dash price and therefore MN revenue.
 

TanteStefana

Grizzled Member
Foundation Member
Mar 9, 2014
2,863
1,854
1,283
Just wanted to say that Evan confirmed that the Dash Drive is user data and Masternode data, no transactional data :)
 

nightowl

Member
Dec 30, 2015
67
105
73
Hey Guys

Can I please ask that we keep VPS prices in mind when the main developers decide on hard drive space requirements? I am only here at Dash because I wanted to help the Bitcoin XT crowd, and then found out how much a VPS with a 60GB/80GB hard disk would cost. I realised I would run at a massive loss if I ran a node. Not all ISP connections in the world are great, and not all people can run their computers 24h a day.

For me a masternode is never about making a profit. But I love that it pays for itself. That's the beauty.
 

greensheep

New Member
Jan 3, 2016
19
19
3
TanteStefana said:
"Just wanted to say that Evan confirmed that the Dash Drive is user data and Masternode data, no transactional data :)"

Does this also mean Dash will not use DashDrive for preventing double spends? Can somebody remove this from the paper then? Can you confirm transaction locking will still happen the same way as described in the InstantX paper, eduffield?
 

TanteStefana

Grizzled Member
Foundation Member
Mar 9, 2014
2,863
1,854
1,283
No, where is that written in the paper? No, the DashDrive is simply a sharded storage system. It should start out with small requirements, and will grow over time.

The way transactions will be vetted is the same way InstantX transactions work, except that instead of a single masternode quorum being selected, virtually all the masternodes will be randomly grouped into quorums of, I think it's decided now, 10 masternodes. They check and lock all the transactions that come their way, and those are the transactions the miners have to put into the blockchain. In this way, instead of traditional mining, which can process 4-7 transactions per second, we will be able to process a conservative 4 x 350 = 1400 transactions per second with the infrastructure we currently have. To increase the number, we simply have to lower the collateral needed to own a masternode; halving it will pretty much double capacity.

So, with Evolution, miners will have no say as to what goes into the blockchain; rather, masternodes will put a lock on all transactions and send them on to the miners.
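
The 4 x 350 figure can be reproduced with a small calculation. This sketch assumes that 350 is the number of quorums (about 3500 masternodes in groups of 10) and that 4 is the per-quorum rate in transactions per second; both readings are guesses at what the post means, and `network_tps` is a hypothetical helper.

```python
def network_tps(masternodes: int, quorum_size: int = 10,
                tps_per_quorum: int = 4) -> int:
    """Throughput if every full quorum processes tps_per_quorum in parallel."""
    if quorum_size <= 0:
        raise ValueError("quorum_size must be positive")
    quorums = masternodes // quorum_size
    return quorums * tps_per_quorum
```

With 3500 masternodes this gives 1400 transactions per second. If halving the collateral doubled the masternode count to 7000, the same formula gives 2800.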
 

greensheep

New Member
Jan 3, 2016
19
19
3
TanteStefana said:
"No, where is that written in the paper? No, the DashDrive is simply a sharded storage system. It should start out with small requirements, and will grow over time.

The way transactions will be vetted is the same way InstantX transactions work, except that instead of a single masternode quorum being selected, virtually all the masternodes will be randomly grouped into quorums of, I think it's decided now, 10 masternodes. They check and lock all the transactions that come their way, and those are the transactions the miners have to put into the blockchain. In this way, instead of traditional mining, which can process 4-7 transactions per second, we will be able to process a conservative 4 x 350 = 1400 transactions per second with the infrastructure we currently have. To increase the number, we simply have to lower the collateral needed to own a masternode; halving it will pretty much double capacity.

So, with Evolution, miners will have no say as to what goes into the blockchain; rather, masternodes will put a lock on all transactions and send them on to the miners."

See section 7.4 of the Dash Evolution paper (quoted below). This made me assume that DashDrive is used instead of MN (quorum) locks.
The miners would not have to keep the blockchain and verify that the inputs are unspent. They would instead have to verify the transaction locks, which will use some resources as well.

7.4 Double-Spend Prevention - Commit or Rollback
Double spending is not possible on the Dash Network due to the “Commit or Rollback” (COR) feature of DashDrive. After sending a transaction to the network, it will be written to DashDrive and the inputs will be reversed via usage of the filesystem.


CTransaction()
(
    Input(1) => /dashdrive/inputs/hash1,
    Input(2) => /dashdrive/inputs/hash2,
    Input(3) => /dashdrive/inputs/hash3,
)


If at any point a write to DashDrive fails, the earlier writes will be reverted back to the state before the commit started. This allows two users (or one attacker) can attempt to write a transaction inputs, being that inputs are unique file locators on the network, only one will successfully be able to write the transaction commit. In addition to reserving resources on the network, DashDrive stores the whole transaction history across the shared filesystem, while awaiting archival in the permanent blockchain.
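
The commit-or-rollback behaviour quoted above can be sketched with an in-memory stand-in for DashDrive. This is only a single-process illustration: the `InMemoryDrive` class is hypothetical, and it ignores the distributed consistency problems raised earlier in the thread.

```python
from typing import Dict, Iterable, List


class InMemoryDrive:
    """Toy write-once store: each input path can be claimed by one transaction."""

    def __init__(self) -> None:
        self.files: Dict[str, str] = {}

    def commit(self, txid: str, input_paths: Iterable[str]) -> bool:
        """Write txid to every input path, or undo all writes if one fails."""
        written: List[str] = []
        for path in input_paths:
            if path in self.files:
                # Input already claimed: roll back what this commit wrote.
                for done in written:
                    del self.files[done]
                return False
            self.files[path] = txid
            written.append(path)
        return True
```

If "tx1" commits /dashdrive/inputs/hash1 through hash3, a later "tx2" that reuses hash2 fails and leaves no partial writes behind, which is the property the paper relies on for double-spend prevention.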
 

GermanRed+

Active Member
Aug 28, 2014
299
109
113
May I suggest that our developers use the Kinetic Open Storage API for the dashdrive? It will save a lot of maintenance for the MN operators. And, it can be expanded easily. That's something really good. Trust me, spend some time to read it before you think it was just a drive connected to the network.


 

GermanRed+

Active Member
Aug 28, 2014
299
109
113
nightowl said:
"Hey Guys

Can I please ask that we keep VPS prices in mind when the main developers decide on hard drive space requirements? I am only here at Dash because I wanted to help the Bitcoin XT crowd, and then found out how much a VPS with a 60GB/80GB hard disk would cost. I realised I would run at a massive loss if I ran a node. Not all ISP connections in the world are great, and not all people can run their computers 24h a day.

For me a masternode is never about making a profit. But I love that it pays for itself. That's the beauty."

It is possible to buy a Xeon-D machine and run it with a 1Gbps fiber at home to host 16+ MNs with 6~8 HDDs. It is really low cost. If someone is willing to pay me some DASH, I could write a guide on how to do this. Give me an offer and I will write it up if the offer is right.
 

greensheep

New Member
Jan 3, 2016
19
19
3
I have been thinking some more about DashDrive.

As I have explained before, implementing a sharded consistent storage on top of trustless (master)nodes is hard. But we already have a well-known technology for this: The blockchain! Therefore, I think each masternode shard should have its own blockchain with transactions storing the contents of that shard of the DashDrive. This makes sure that writes to DashDrive are consistent, persistent and durable. Whenever a masternode is chosen to collect the masternode fee of the main chain, it broadcasts a transaction that roots the head of the blockchain of its shard into the main chain. This is a requirement to collect the masternode fee.

The masternodes need an incentive to actually add transactions to the side chain. Therefore, as on the main chain, storing data (a transaction) in the DashDrive blockchain requires a fee. Mining on the sharded blockchains is done in pure PoS using the masternode's stake. The only block rewards on the side chains are the storage fees, which get transferred to the main chain with the rooting transaction. An unfortunate side effect of this is that main chain miners will have to verify the side-chain transactions to make sure they are valid (or some selected masternode quorum could do that).

There is still the problem that someone could take over a majority of the masternodes in a shard. But because the side chains are auditable and rooted in the main chain, this is not so much of a problem:
- Even if a majority of the masternodes denies service, the other masternodes in the shard will still keep serving the shard's content and root new transactions. Also non-masternodes can keep track of side chains increasing the storage redundancy.
- Because side chains are rooted in the main chain it is not possible to alter the history of the side chain beyond the last rooted block even when owning a majority of the side chain masternodes.
So, the worst thing that could happen is that someone owning all masternodes of a side chain could refuse to accept new transactions. This per se does not have any benefit to the attacker unless he manages to force higher storage fees for this shard. This can be counteracted by making it easy to move one's data from one shard to another.
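
A rough sketch of why rooting protects side-chain history, assuming each side-chain block hash covers the previous hash plus the block payload (SHA-256 here) and that the main chain records a (height, hash) pair. The function names and block format are invented for illustration.

```python
import hashlib
from typing import List, Sequence

GENESIS = b"\x00" * 32


def build_chain(payloads: Sequence[bytes]) -> List[bytes]:
    """Return the block hash at every height of a simple hash chain."""
    hashes: List[bytes] = []
    prev = GENESIS
    for payload in payloads:
        prev = hashlib.sha256(prev + payload).digest()
        hashes.append(prev)
    return hashes


def matches_root(payloads: Sequence[bytes], rooted_height: int,
                 rooted_hash: bytes) -> bool:
    """Check that a candidate side chain agrees with the rooted block."""
    if rooted_height < 0:
        raise ValueError("rooted_height must be non-negative")
    if len(payloads) <= rooted_height:
        return False
    return build_chain(payloads[: rooted_height + 1])[-1] == rooted_hash
```

Any change to a block at or below the rooted height changes the hash at that height, so a rewritten history no longer matches what the main chain recorded, while blocks appended after the root are unaffected.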

WDYT?
 

AndyDark

Well-known Member
Sep 10, 2014
353
705
163
greensheep, I think that is Evan's current strategy, i.e. a second blockchain for DashDrive data, and just for the important data like user account data and communication / relationships between accounts, basically anything Evolution needs to function internally. DashDrive will be a new core feature, though not part of the Evolution side that the other web team members and I are working on.