DarkCoin FPGA Mining Co-op?

Discussion in 'Projects' started by glamorgoblin, May 24, 2014.

  1. Sbatto

    Sbatto New Member

    Joined:
    Jun 2, 2014
    Messages:
    11
    Likes Received:
    0
    Trophy Points:
    1
    Does anyone know what the counters t0 and t1 are in the Blake algo (https://131002.net/blake/blake.pdf)? Glamorgoblin, have you got any of the algos working? Maybe we can work on different algos and combine them?
     
  2. glamorgoblin

    glamorgoblin New Member

    Joined:
    May 24, 2014
    Messages:
    20
    Likes Received:
    2
    Trophy Points:
    3
    I would imagine the counters are for keeping track of round iterations, but I haven't looked at Blake much yet. I've looked a little bit at the Skein hash and it has counters to keep track of rounds and to know when to mux in feedback data or new data. I've been caught up in end-of-school-year stuff lately and haven't done much other than peeking at Skein. There's a bit of a disconnect there because the university code for Skein (as well as the 10 other hashes, I expect) does a good job of demonstrating the 256-bit implementation, but there aren't a lot of clues about how to extrapolate to 512 for X11. There are also some "optional" implementation details for Skein, and it is unclear whether they exist in X11 or not. This is going to take a while.
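For reference, the BLAKE specification linked above describes t as a count of the message bits hashed so far (padding excluded), split into a low word t0 and a high word t1, rather than a round counter. Below is a minimal Python sketch of the values the compression function would see for each 1024-bit block of BLAKE-512. The function name is made up for illustration, and the input is assumed to be a whole number of bytes.

```python
MASK64 = (1 << 64) - 1
BLOCK_BYTES = 128          # BLAKE-512 block size: 1024 bits
MAX_TAIL_BYTES = 111       # largest tail that still leaves room for padding + length


def blake512_counters(msg_len_bytes):
    """Return the (t0, t1) pair used for each compressed block.

    Illustrative helper only: it models the counter rule from the spec,
    not the hash itself.
    """
    total_bits = msg_len_bytes * 8
    full_blocks, tail = divmod(msg_len_bytes, BLOCK_BYTES)

    # Every full message block sees the running bit count.
    counters = [1024 * (i + 1) for i in range(full_blocks)]

    if tail == 0:
        # The final block holds only padding, so its counter is zero.
        counters.append(0)
    elif tail <= MAX_TAIL_BYTES:
        # Message tail and padding share one final block.
        counters.append(total_bits)
    else:
        # Tail block carries the last message bits, then a padding-only block.
        counters.append(total_bits)
        counters.append(0)

    # Split each 128-bit counter into low (t0) and high (t1) 64-bit words.
    return [(t & MASK64, t >> 64) for t in counters]


print(blake512_counters(80))
```

For an 80-byte block header this yields a single block with t0 = 640 and t1 = 0, so a fixed-length mining core could treat the counter as a constant.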
     
  3. fusecavator

    fusecavator Member

    Joined:
    Jun 4, 2014
    Messages:
    40
    Likes Received:
    38
    Trophy Points:
    58
    The darkcoin code directly uses sphlib (http://www.saphir2.com/sphlib/) for its hashes, so the documentation can likely clear up those issues. (The page isn't loading at the time of writing this, but it was working for me not too long ago, so it's probably just temporary downtime.) There actually is a warning about that on that page:
    I'm guessing darkcoin is using the updated version 3, but I'll compare the source later (I don't have sphlib-3 on this computer and can't download it while the site is down, but I've got it stored elsewhere).
     
  4. atavacron

    atavacron Member

    Joined:
    Apr 27, 2014
    Messages:
    45
    Likes Received:
    16
    Trophy Points:
    48
  5. Sbatto

    Sbatto New Member

    Joined:
    Jun 2, 2014
    Messages:
    11
    Likes Received:
    0
    Trophy Points:
    1
    I haven't looked enough into Skein; however, Blake just widens all words from 32-bit to 64-bit when going from 256 to 512.
     
  6. glamorgoblin

    glamorgoblin New Member

    Joined:
    May 24, 2014
    Messages:
    20
    Likes Received:
    2
    Trophy Points:
    3
    It looks like Skein's 512 (as well as all other Skein implementations) is based on repeated 64-bit adder entities. Going to 512 from 256 just doubles the number of adders. The only missing piece then is how to tie the tweak calc (which remains the same size for all widths) into a wider round width. I'm getting there.
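On the tweak question: as I read the Threefish key schedule in the Skein specification, the tweak is injected the same way at every block width. The 128-bit tweak (t0, t1) is extended with t2 = t0 XOR t1, and each subkey adds two tweak words to its third-to-last and second-to-last words and the subkey index to its last word. Below is a minimal Python sketch under that reading. The function names are illustrative and not taken from any Skein codebase, and the result should be checked against the official test vectors.

```python
MASK64 = (1 << 64) - 1
C240 = 0x1BD11BDAA9FC1A22  # key-schedule parity constant (Skein 1.3)


def extend_key(key_words):
    """Append the parity word: C240 XOR all key words."""
    parity = C240
    for word in key_words:
        parity ^= word
    return list(key_words) + [parity]


def extend_tweak(t0, t1):
    """The tweak stays at two 64-bit words plus their XOR for every width."""
    return [t0, t1, t0 ^ t1]


def subkey(key_words, tweak, s):
    """Subkey number s for a state of len(key_words) 64-bit words (4, 8 or 16)."""
    nw = len(key_words)
    k = extend_key(key_words)
    t = extend_tweak(*tweak)

    # Rotate through the extended key, one position per subkey index.
    out = [k[(s + i) % (nw + 1)] for i in range(nw)]

    # Only the last three words get extra terms, whatever the width.
    out[nw - 3] = (out[nw - 3] + t[s % 3]) & MASK64
    out[nw - 2] = (out[nw - 2] + t[(s + 1) % 3]) & MASK64
    out[nw - 1] = (out[nw - 1] + s) & MASK64
    return out


print(subkey([0] * 8, (5, 9), 0))
```

The same function covers a 4-word (256-bit) and an 8-word (512-bit) state; only the list length changes, which matches the idea that the wider variant mostly repeats the same 64-bit adders.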
     
  7. glamorgoblin

    glamorgoblin New Member

    Joined:
    May 24, 2014
    Messages:
    20
    Likes Received:
    2
    Trophy Points:
    3
    atavacron, great find for the hash functions on GitHub. I think the Java implementations will translate much more easily to Verilog than C. I spent some time digging through there. Why don't you, Sbatto, and fusecavator put a crypto coin address in your forum signatures so we can give you more than just likes for gems like this?
     
  8. crowning

    crowning Well-known Member

    Joined:
    May 29, 2014
    Messages:
    1,430
    Likes Received:
    2,009
    Trophy Points:
    183
    typedef unsigned long sph_u64;
    #define SPH_C64(x) ((sph_u64)(x ## UL))
    T0 = SPH_C64(0xFFFFFFFFFFFFFC00)
    T1 = 0xFFFFFFFFFFFFFFFF

    So they are just constants for our purpose here.
    Is that what you wanted to know?
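One possible reading of those two values, assuming the compression routine adds 1024 to the 128-bit counter before each block (an assumption worth confirming against the sphlib source): 0xFFFFFFFFFFFFFC00 is 2^64 - 1024, so the pair T1:T0 behaves like -1024 and wraps to zero on the next increment. A small Python sketch of that arithmetic, with a hypothetical helper name:

```python
MASK64 = (1 << 64) - 1
T0 = 0xFFFFFFFFFFFFFC00   # equals 2**64 - 1024
T1 = 0xFFFFFFFFFFFFFFFF


def add_block_bits(t0, t1, block_bits=1024):
    """Add block_bits to the 128-bit counter held as two 64-bit words."""
    new_t0 = (t0 + block_bits) & MASK64
    carry = 1 if new_t0 < block_bits else 0   # low word wrapped around
    new_t1 = (t1 + carry) & MASK64
    return new_t0, new_t1


# Starting from the quoted constants, one increment lands on zero.
print(add_block_bits(T0, T1))

# Pre-loading T0 with an extra 640 bits (an 80-byte header) lands on 640.
print(add_block_bits((T0 + 640) & MASK64, T1))
```

Under that assumption the constants are a way of pre-biasing the counter so that the value seen by the final block comes out right, which would fit with treating them as fixed values for a fixed-length input.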
     
    #38 crowning, Jun 7, 2014
    Last edited by a moderator: Jun 7, 2014
  9. crowning

    crowning Well-known Member

    Joined:
    May 29, 2014
    Messages:
    1,430
    Likes Received:
    2,009
    Trophy Points:
    183
    There's already a generic getwork->stratum proxy out there which could easily be used.
     
  10. glamorgoblin

    glamorgoblin New Member

    Joined:
    May 24, 2014
    Messages:
    20
    Likes Received:
    2
    Trophy Points:
    3
    So, I've poked through the X11 hashes enough now to get the feeling that it will take a LARGE FPGA to fit it all in. Even with a large FPGA it will probably take a fair amount of rolling or folding to squeeze everything down. That got me thinking about a "practical" FPGA board architecture for X11. If anyone is developing a custom board for X11 FPGA work consider this approach:

    One FPGA sized to fit just two instances of the largest hash machine in the X11 hash chain. Use one of the more recent FPGAs that support dynamic reconfiguration. Attach wide and fast DDR3 or equivalent memory externally. Connect a small microcontroller to the configuration port of the FPGA. Partition the FPGA into two dynamically reconfigurable hash spaces (slots A and B). First, the micro programs the A slot with the Blake hash machine and loads the initial header. The Blake machine runs a sequence of nonces, storing the intermediate hashes in the external RAM. It should be able to store 2K hashes in the external RAM. While the A-Blake machine is running, the processor programs the BMW machine into slot B. Once RAM is full, the B slot starts processing the hashes in memory, overwriting the Blake hashes in external memory with BMW hashes. While BMW is running, the processor reconfigures the A slot with the Groestl machine (where Blake used to be). After BMW is finished, the A slot overwrites the BMW hashes with Groestl hashes and B gets reconfigured with Skein. This continues until all the hashes have executed.
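The ping-pong sequence described above can be sketched as a simple schedule. This is a Python illustration only: the slot names follow the post, the chain order is the usual X11 order as I understand it, and the function name is hypothetical.

```python
# Assumed X11 chain order; verify against the darkcoin source before relying on it.
X11_CHAIN = [
    "blake", "bmw", "groestl", "skein", "jh", "keccak",
    "luffa", "cubehash", "shavite", "simd", "echo",
]


def ping_pong_schedule(chain):
    """Return one tuple per pass over the external RAM:
    (running_slot, running_algo, reprogrammed_slot, next_algo).
    """
    steps = []
    for i, algo in enumerate(chain):
        running = "A" if i % 2 == 0 else "B"
        idle = "B" if running == "A" else "A"
        # While one slot hashes, the micro loads the next algo into the other.
        next_algo = chain[i + 1] if i + 1 < len(chain) else None
        steps.append((running, algo, idle, next_algo))
    return steps


for running, algo, idle, next_algo in ping_pong_schedule(X11_CHAIN):
    print(f"slot {running} runs {algo:9s} | slot {idle} loads {next_algo}")
```

Each step is one full sweep of the buffered hashes, so the time available to reprogram the idle slot is the time that sweep takes.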

    This approach requires approximately 1/6 of the FPGA gates of a full implementation. It would run at about 1/10 the speed of a full single-device implementation, but with the exponential price curve of FPGAs it could wind up at 1/20 or 1/50 of the cost. You could make multiple instances of this and still come out dollars ahead for an equivalent hash rate.
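Taking those rough ratios at face value (they are estimates from the post, not measurements), the hash rate per dollar relative to a full single-device implementation can be worked out in a few lines of Python:

```python
from fractions import Fraction


def relative_hash_per_dollar(speed_ratio, cost_ratio):
    """Hash rate per dollar compared with the full implementation (1 = break even)."""
    return speed_ratio / cost_ratio


speed = Fraction(1, 10)                       # about 1/10 the speed
for cost in (Fraction(1, 20), Fraction(1, 50)):  # 1/20 or 1/50 of the cost
    gain = relative_hash_per_dollar(speed, cost)
    print(f"cost ratio {cost}: {gain}x the hash rate per dollar")
```

With these numbers the reconfigurable board would come out between 2x and 5x ahead per dollar, before counting the extra micro, flash and DRAM.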

    I'm going to target whatever X11 solution I get to my existing HW, but if anyone is developing custom X11 FPGA HW, please let me know. I'd be very interested in seeing how it goes.
     
  11. ray

    ray New Member

    Joined:
    Jun 11, 2014
    Messages:
    1
    Likes Received:
    0
    Trophy Points:
    1
    Hi there, saw you guys made some progress here! In the meantime, could anyone point me in some direction on implementing the Keccak SHA-3 algo on my Virtex-5? I'm totally new to FPGAs and just can't find where to start. Thanks a lot for any help!
     
  12. atavacron

    atavacron Member

    Joined:
    Apr 27, 2014
    Messages:
    45
    Likes Received:
    16
    Trophy Points:
    48
    Hi Ray,

    I'm in the same boat, trying to learn to program FPGAs. I ordered a Virtex-5 dev kit that should be here soon. If you find out how to do it please share. I'll do the same of course.
     
  13. Sbatto

    Sbatto New Member

    Joined:
    Jun 2, 2014
    Messages:
    11
    Likes Received:
    0
    Trophy Points:
    1
    That's perfect, thanks!
    I think you're right, glamorgoblin; my fully parallel Blake512 implementation took up the majority of my Cyclone IV. I don't quite understand your idea. Would the micro be there just to reprogram the FPGA between each algo?
     
  14. glamorgoblin

    glamorgoblin New Member

    Joined:
    May 24, 2014
    Messages:
    20
    Likes Received:
    2
    Trophy Points:
    3
    Right, the micro would have its own flash memory with a bunch of partial FPGA images in it. It would have to supervise the routine. It would sequence through the hashes and then parse through the resulting 2K hashes to look for hits to submit. Some new FPGAs can support partial, on-the-fly reprogramming. With those devices the micro could ping-pong images into the FPGA. While one is executing, the other is programming ... then swap. The intermediate hash states are just stored in the external RAM and run through the new hash algo after reprogramming. I don't know of a simple eval board that could support this, though. It requires a processor, flash, FPGA, and dedicated FPGA external DRAM. Not expensive, but also not something you find lying around.

    Does your Blake512 implementation result in the same hash as that provided by fusecavator when given the same input? Are you using Verilog or VHDL? University code or your own? I'm almost done with Skein myself.
     
  15. glamorgoblin

    glamorgoblin New Member

    Joined:
    May 24, 2014
    Messages:
    20
    Likes Received:
    2
    Trophy Points:
    3
    Ray and Atavacron,

    There is a VHDL implementation of Keccak linked from the page at http://keccak.noekeon.org/. Look for the link on the right called "Hardware implementation in VHDL". This likely isn't the exact variant used in X11, but should be a great starting point for tweaking.
     
  16. atavacron

    atavacron Member

    Joined:
    Apr 27, 2014
    Messages:
    45
    Likes Received:
    16
    Trophy Points:
    48
    Thanks glamorgoblin. I recall skimming across that URL but missed it.
     
  17. Sbatto

    Sbatto New Member

    Joined:
    Jun 2, 2014
    Messages:
    11
    Likes Received:
    0
    Trophy Points:
    1
    How many cycles does it take to reprogram an FPGA?

    I haven't got the padding going, but the hash is correct for a 1024-bit message. I'm going to try and get Groestl going so I can at least mine something in the meantime.

    EDIT: I forgot! It's my own code, in Verilog. How about you?
     
    #47 Sbatto, Jun 11, 2014
    Last edited by a moderator: Jun 11, 2014
  18. glamorgoblin

    glamorgoblin New Member

    Joined:
    May 24, 2014
    Messages:
    20
    Likes Received:
    2
    Trophy Points:
    3
    It depends on the FPGA, but if there's an external DDR device you'll have quite a bit of time to work with. I did the math wrong by the way. 1Gb worth of external RAM can hold 2M hashes, not 2K. The micro would have as much time as it takes to fully address all of the external RAM to reprogram the offside slot. Hashes would finish in 2M blocks rather than at regular intervals, but the overall hash rate would average out at the pool.
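The corrected figure checks out if each stored hash is 512 bits (64 bytes) and "1Gb" means 2^30 bits. Here is a quick Python check, with a purely hypothetical slot throughput added to show how the reprogramming window would follow from it:

```python
HASH_BITS = 512  # one X11 intermediate hash


def hashes_in_ram(ram_bits, hash_bits=HASH_BITS):
    """Number of intermediate hashes that fit in the external RAM."""
    return ram_bits // hash_bits


def reprogram_window_seconds(ram_bits, hashes_per_second):
    """Time one slot needs to sweep the whole buffer, which is also the
    time the micro has to reprogram the other slot."""
    return hashes_in_ram(ram_bits) / hashes_per_second


ONE_GBIT = 2**30
print(hashes_in_ram(ONE_GBIT))                     # 2**21 hashes, i.e. "2M"

# 100 MH/s is an invented figure (one hash per cycle at 100 MHz), not a measurement.
print(reprogram_window_seconds(ONE_GBIT, 100e6))
```

A slower hash core lengthens the window proportionally, so the real constraint is whether the partial bitstream can be loaded within one sweep of the buffer at the achieved hash rate.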
    What can you mine with just groestl? I'm mining LTC with my FPGA rig in the meantime, but that's only marginally profitable. If there's something better I'll consider it too.
    Yes, Verilog. I hate VHDL, but that seems to be what all the universities like. Yuk. Such an inefficient language for digital logic. Sigh.
     
  19. flipme

    flipme New Member

    Joined:
    Apr 27, 2014
    Messages:
    17
    Likes Received:
    3
    Trophy Points:
    3
    #49 flipme, Jun 11, 2014
    Last edited by a moderator: Jun 11, 2014
  20. hyphenx

    hyphenx New Member

    Joined:
    Jun 12, 2014
    Messages:
    1
    Likes Received:
    0
    Trophy Points:
    1
    It's been a while since I've coded, but I've got a Xilinx Kintex-7 FPGA KC705 Evaluation Kit that I could make available for testing.
     
  21. Sbatto

    Sbatto New Member

    Joined:
    Jun 2, 2014
    Messages:
    11
    Likes Received:
    0
    Trophy Points:
    1
    I think you probably could fit it on that board, full parallel. Your only limitation would be propagation errors. Let us know what you get for the quote, I've got a feeling it's gonna be $8000. Thanks for the link!

    I get it now: you would run the same number of hashes (plus DDR interfacing) that it would take to program the offline side. That's wicked, I'll have to give it a crack.
    There's Diamond coin and groestlcoin. They each have dismal net hashrates and volume; however, it's probably the most profitable move for us ATM. I've got my last exam on Monday, so I'll smash out the Verilog for the hash on Tuesday. I imagine I will hit a wall at interfacing with the PC and, in turn, the network, especially running at 1H/cycle. Are you able to help out with that at all? I was thinking I would just use the bitcoin FPGA miner to do it.
     
  22. glamorgoblin

    glamorgoblin New Member

    Joined:
    May 24, 2014
    Messages:
    20
    Likes Received:
    2
    Trophy Points:
    3
    Sure, I can help. I've tinkered with the PC side scripts for BTC, LTC, and DOGE. I'm assuming though that you have a USB Blaster connected FPGA rig? That's what I'm most familiar with.
    It looks like the Groestl mining pools support the older getwork protocols which is a plus too, since the scripts will port more easily. Once you get it all working cleanly with getwork, you can either call it a day or try to get one of the stratum proxies to work with it. I'm a little leery of the mining proxies. Seems like a perfect opportunity for someone to write a .exe that gives you 95% of your shares and just happens to attribute the other 5% to the author's account without telling you. Why mine when you can write a proxy script and skim off of hundreds of other miners? Of course there's much much worse that an .exe downloaded from God knows where could do as well.

    PM me with details of GroestlCoin like header size/format. Also, if you're using the USB Blaster you'll need to insert Groestl specific probes/sources as virtual wires. Let me know the format of those in your Verilog and I'll see what I have that might match.
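If GroestlCoin keeps the Bitcoin-style 80-byte block header (an assumption here, to be confirmed against the coin's source), the layout and the nonce position would look like this. A Python sketch with illustrative names:

```python
import struct

# Assumed Bitcoin-style header fields and sizes in bytes.
HEADER_FIELDS = [
    ("version", 4),
    ("prev_block_hash", 32),
    ("merkle_root", 32),
    ("time", 4),
    ("bits", 4),
    ("nonce", 4),
]
HEADER_SIZE = sum(size for _, size in HEADER_FIELDS)
NONCE_OFFSET = HEADER_SIZE - 4   # nonce is the last field


def with_nonce(header, nonce):
    """Return a copy of the header with the 32-bit little-endian nonce replaced."""
    if len(header) != HEADER_SIZE:
        raise ValueError(f"expected {HEADER_SIZE} bytes, got {len(header)}")
    return header[:NONCE_OFFSET] + struct.pack("<I", nonce)


blank = bytes(HEADER_SIZE)
print(HEADER_SIZE, NONCE_OFFSET)
print(with_nonce(blank, 1)[NONCE_OFFSET:].hex())
```

In an FPGA core, the first 76 bytes would be the part loaded once per work unit through the virtual wires, with only the final 4 bytes incremented in hardware.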
     
  23. flipme

    flipme New Member

    Joined:
    Apr 27, 2014
    Messages:
    17
    Likes Received:
    3
    Trophy Points:
    3
    Thanks, they just replied without a quote and offered to talk about what's really needed.
    How much memory would be required for each core?
    As it has 13 FPGAs, would X13 fit on it also, if the main dispatcher runs an algo task aside?

    I'd like to run a calculation for a complete mining machine, based on that board.
    Another idea would be a combi-box: A miner with a masternode included. Plug and play.
     
  24. Sbatto

    Sbatto New Member

    Joined:
    Jun 2, 2014
    Messages:
    11
    Likes Received:
    0
    Trophy Points:
    1
    Not sure if you mean external or internal memory; I'll assume external. It depends on how you implement it. If it's fully combinatorial, you wouldn't need much memory at all (if any) apart from the controller chip. The main thing is that you would need a lot of logic elements per algo to do this; Blake took 80,000 for me and it isn't the largest algo.
    If you did each algo pipelined, say computing x-amount of hashes at a time, then you would need enough memory to store x-amount of hashes.

    If you can fit x11 on that board, I'd say the additional 2 algos would also fit.

    That would be wicked if you could have a controller that just programs the FPGAs once they're powered up.
     
  25. esuncloud

    esuncloud New Member

    Joined:
    May 31, 2014
    Messages:
    7
    Likes Received:
    1
    Trophy Points:
    3
    How about designing an ASIC for Darkcoin at the same time? Are any experienced ASIC designers here interested?
     
  26. Sbatto

    Sbatto New Member

    Joined:
    Jun 2, 2014
    Messages:
    11
    Likes Received:
    0
    Trophy Points:
    1
    Don't you need a fair bit of capital to develop ASICs?
     
  27. glamorgoblin

    glamorgoblin New Member

    Joined:
    May 24, 2014
    Messages:
    20
    Likes Received:
    2
    Trophy Points:
    3
    Depending on the technology used you might be able to get the costs down under $250K USD.
     
  28. alnoor1231

    alnoor1231 New Member

    Joined:
    Jun 16, 2014
    Messages:
    8
    Likes Received:
    0
    Trophy Points:
    1
    I've got an old Spartan-6 FPGA that was used to mine bitcoins. Is it possible to reprogram it to mine X11?
     
  29. esuncloud

    esuncloud New Member

    Joined:
    May 31, 2014
    Messages:
    7
    Likes Received:
    1
    Trophy Points:
    3
    We could do this with 0.11 um or even 0.18 um technology, and finish the design first.
    The MPW fee could be affordable with a small IPO and pre-orders for a full mask in the future.
    However, the risk is still high, because the Darkcoin team may change the mining algorithm at any time.
     
    #59 esuncloud, Jun 16, 2014
    Last edited by a moderator: Jun 16, 2014
  30. esuncloud

    esuncloud New Member

    Joined:
    May 31, 2014
    Messages:
    7
    Likes Received:
    1
    Trophy Points:
    3
    Any update on the GroestlCoin FPGA miner? It looks like a good starting point for X11. However, it should be noted that GroestlCoin will switch to PoS after 150000 in a month.
    I am still working on upgrading the following Groestl Verilog code to 512-bit:
    https://www.rcis.aist.go.jp/files/special/SASEBO/SHA3-ja/Grostl.zip
    Have you gotten workable Groestl512 Verilog code integrated into the FPGA miner? Meanwhile, we may need another dump program for GroestlCoin, which uses double Groestl512.
    Maybe fusecavator would be a better person to do this for us?
     
