ntang (ntang) wrote,

File sharing

Tried out BearShare, in part because it uses gnutella and I like the idea behind gnutella. The problem, though, is that the current gnutella design sucks. The network traffic grows roughly with the square of the number of hosts (every host's searches have to reach every other host), and it has no concept (last I checked) of caching or anything else.

The idea was for a true p2p (peer-to-peer) network, but they failed to remember that in a true peer-to-peer network with no central index, every single bloody peer has to talk to every other peer in order to achieve any useful results. So if you want to do a search on X, you've got to send that search request to every bloody machine on the network. (gnutella does it in a web-like fashion, where you connect to a certain number of hosts, they each connect to a few more, etc. I mean web as in spider web, not as in www.) This means that if 100 hosts connect up, and each does a query, you have 10,000 query messages (well, not quite, but you get the idea) ripping through the network at once. Not only that, but machines are constantly pinging back and forth to determine latency and check on availability, connecting and reconnecting to other machines... it gets ugly.
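To put rough numbers on that, here's a minimal sketch of a flooded search. It is not the real gnutella protocol: the ring-shaped overlay, the `build_overlay` and `flood` names, and the fixed four connections per host are all assumptions made up for illustration. It only counts how many messages one query generates when every host forwards it to all of its neighbours except the one it came from.

```python
from collections import deque


def build_overlay(n, degree=4):
    """Hypothetical overlay: hosts sit on a ring, each linked to its
    nearest degree/2 neighbours on either side (requires n > degree)."""
    half = degree // 2
    graph = {h: set() for h in range(n)}
    for h in range(n):
        for step in range(1, half + 1):
            peer = (h + step) % n
            graph[h].add(peer)
            graph[peer].add(h)
    return graph


def flood(graph, origin, ttl):
    """Flood one query from `origin`; return (hosts_reached, messages_sent)."""
    seen = {origin}
    queue = deque([(origin, None, ttl)])
    messages = 0
    while queue:
        host, came_from, hops_left = queue.popleft()
        if hops_left == 0:
            continue  # time-to-live used up, stop forwarding
        for peer in graph[host]:
            if peer == came_from:
                continue
            messages += 1  # sent even if the peer has already seen the query
            if peer not in seen:
                seen.add(peer)
                queue.append((peer, host, hops_left - 1))
    return len(seen) - 1, messages


def total_messages(n, ttl=100):
    """Messages generated when every one of n hosts runs a single search."""
    graph = build_overlay(n)
    return sum(flood(graph, origin, ttl)[1] for origin in graph)
```

In this toy overlay a single search across 100 hosts costs 301 messages, so 100 hosts searching once each costs 30,100. Doubling the hosts to 200 roughly quadruples the total, which is the growth pattern I'm complaining about.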

It _should_ have been built in a more hierarchical fashion - you have, say, 4-5 tiers of servers, each fanning out to several below it and caching the results from the servers under it. There should be a more intelligent routing mechanism to determine the optimal path to a given host, and to determine the best host to download from (preferably the one with the best combination of fewest current downloads, closest path in the network, highest available bandwidth, etc.). I'm not saying I could design the protocol myself - I couldn't - but they really could have done better than they did.
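Here's a rough sketch of what I mean by tiers that cache what's underneath them. Everything in it is hypothetical (the `Server` class, `publish`, `search`, and the idea of indexing by a single keyword); it's only meant to show that a search climbs the tree until some tier already knows the answer, instead of flooding every host.

```python
class Server:
    """One node in a hypothetical tiered index; leaves are ordinary hosts."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.index = {}  # keyword -> names of hosts at or below this server

    def publish(self, keyword):
        """Announce that this host shares `keyword`; every tier above caches it.
        Returns the number of messages sent up the tree."""
        node, messages = self, 0
        while node is not None:
            node.index.setdefault(keyword, set()).add(self.name)
            if node.parent is not None:
                messages += 1
            node = node.parent
        return messages

    def search(self, keyword):
        """Climb until some tier's cache has a hit (other than ourselves).
        Returns (matching_hosts, messages_sent)."""
        node, messages = self, 0
        while True:
            hits = node.index.get(keyword, set()) - {self.name}
            if hits or node.parent is None:
                return hits, messages
            node = node.parent
            messages += 1


# Tiny three-tier example: root -> two mid-level servers -> leaf hosts.
root = Server("root")
mid1, mid2 = Server("mid1", root), Server("mid2", root)
a, b, c = Server("a", mid1), Server("b", mid1), Server("c", mid2)
c.publish("song")
b.publish("tune")
```

With this layout a search costs at most one message per tier, no matter how many hosts hang off the bottom. The obvious catch is that the upper tiers have to hold and refresh a lot of index data, which this sketch ignores completely.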

I know they're constantly working on it and trying to improve it, but I wish they'd hurry. :P

There are now peers called "reflectors" out there that basically perform a similar function to what I discussed - they sit out there and "front" for several peers in the network, routing all requests through themselves and answering requests on those peers' behalf. This helps cut down on that whole runaway growth bit. Still, it should be a concept that was designed in from the start, and it should be executed a little more gracefully. A random person should not be ABLE to connect to another random peer in the main network; there should be a form of authentication. I like the way Advogato's certification system works; a similar idea, but for hosts, would work nicely I think - i.e. you have "root" servers (excuse me for borrowing from DNS terminology) and they each branch out to 2nd level servers and down to 3rd level, etc., and each server can certify other servers as being at certain levels. You can connect at your certification level or lower - and you can only connect to peers which are one level above you or lower - which means random people just joining are guaranteed to be on the bottom of the totem pole, and they have to 'earn' their way up.

I like that.
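The connection rule is simple enough to write down. This is just one reading of it, with made-up names (`BOTTOM_LEVEL`, `allowed_levels`, `can_connect`), and it assumes levels are numbered from 1 for the root servers downward, so a bigger number means lower on the totem pole.

```python
# Assumption: 1 = root tier, larger numbers = lower tiers.
# BOTTOM_LEVEL is a hypothetical depth chosen for this example;
# uncertified newcomers start there.
BOTTOM_LEVEL = 4


def allowed_levels(cert_level):
    """Levels a host may join at: its certified level or anything lower."""
    return range(cert_level, BOTTOM_LEVEL + 1)


def can_connect(my_level, peer_level):
    """A host may reach peers at most one level above it, or any level below."""
    return peer_level >= my_level - 1


newcomer = BOTTOM_LEVEL
print(can_connect(newcomer, 3))  # one level up: allowed
print(can_connect(newcomer, 1))  # straight to a root server: refused
```

How a server decides to certify another one is the hard part, and I haven't sketched it here.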


The more I think about it, the more ideas I get about ways to improve efficiency along the network and such. Kinda nifty. I just wish the original protocol designers had done this sort of brainstorming. Sigh.
