-------- p2pMail --------

Technical Details

Basics

You probably know file-sharing networks such as BitTorrent or eMule, which the music industry hates because they are so hard to control. These systems have no essential central server that could be shut down to collapse the network. Why isn't such a server needed? The clients establish connections directly among themselves, forming a peer-to-peer network.

p2pMail basically works the same way. If you send an email to A, the email is not sent to a server to be stored; it is sent to some nodes in the network close to A, or even directly to A if A is online. If A was offline and comes online, A can ask the nodes close to her whether they have stored mails for her. If the mail is sent to enough nodes close to A (about 20), the probability is very high that at least one of those nodes is still online when A returns, and thus A will receive her email. Combined with a bit of encryption, this should be a pretty secure and independent way to send emails.
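
To get a feeling for "enough nodes", here is a rough back-of-the-envelope sketch. It assumes that every storing node is online independently with the same probability when A returns; that assumption, the class name DeliveryOdds and the sample probabilities are illustrative and not part of the p2pMail design.

```java
final class DeliveryOdds {
    /**
     * Probability that at least one of `replicas` storing nodes is online,
     * assuming each is online independently with probability `onlineProbability`.
     */
    static double atLeastOneOnline(double onlineProbability, int replicas) {
        // Complement of "all replicas are offline at the same time".
        return 1.0 - Math.pow(1.0 - onlineProbability, replicas);
    }

    public static void main(String[] args) {
        // Sample availabilities are made up for illustration.
        for (double p : new double[] {0.1, 0.2, 0.5}) {
            System.out.printf("p=%.1f, 20 replicas -> %.4f%n", p, atLeastOneOnline(p, 20));
        }
    }
}
```

Under this model, even if each node is online only 20% of the time, 20 copies give a delivery probability of roughly 0.988. Real nodes are probably not independent (many go offline at night, for example), so treat this as an optimistic estimate.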

Identification of Users

Each user needs a unique ID that identifies the user in the network and allows others to send the user emails. The probability of two users having the same ID needs to be very low. There are two ways to ensure this:

1. The ID is generated as a 256-bit random number. Even with 2⁶⁴ users (which is a great many users) the probability of a collision will still be very close to 0.
   Can someone please verify that calculation and even put a number on the probability? My calculations returned something close to 0 for 256 bits, but a pretty high probability for 128 bits, and thus I'm not quite sure.
   Nonetheless, I do not think that the birthday paradox has a large enough effect to matter here.
2. The user chooses an ID herself. The ID is then checked to see whether it is already in use.
   But this solution is not failsafe either, because nodes can be offline. Thus the user already using the ID might not be found, and the ID might incorrectly be declared available.
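
The collision estimate for random IDs can be checked with the usual birthday-bound approximation p ≈ 1 - exp(-n² / (2N)), where n is the number of users and N the number of possible IDs. The following is a small sketch of that approximation, not code from the p2pMail client; the class and method names are made up for illustration.

```java
final class BirthdayBound {
    /**
     * Approximate probability that at least two of 2^usersLog2 uniformly random
     * IDs of idBits bits collide: p ~ 1 - exp(-n^2 / (2N)).
     */
    static double collisionProbability(int usersLog2, int idBits) {
        // n^2 / (2N) written as a power of two: 2^(2*usersLog2 - 1 - idBits)
        double x = Math.pow(2.0, 2 * usersLog2 - 1 - idBits);
        // -expm1(-x) equals 1 - e^(-x) but stays accurate for very small x
        return -Math.expm1(-x);
    }

    public static void main(String[] args) {
        System.out.println("128-bit IDs: " + collisionProbability(64, 128));
        System.out.println("256-bit IDs: " + collisionProbability(64, 256));
    }
}
```

With 2⁶⁴ users this gives about 0.39 for 128-bit IDs and about 2⁻¹²⁹ (on the order of 10⁻³⁹) for 256-bit IDs, which agrees with the observation that 128 bits look risky at that scale while 256 bits are safe.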

Both methods are vulnerable to attacks in which a user claims an ID in order to pose as another person and grab that person's emails. This (and other aspects) has to be countered with encryption. Because a 256-bit number can easily be computed as the hash value of a public key used for encryption, option #1 was chosen.

Encryption

The current design leaves open which public/private key encryption is used. If a special type of encryption is used, it must be guaranteed that the participating clients support it. Currently RSA keys are generated. Signing is done with RSA and SHA-1, but it should be no problem to change this without breaking compatibility with older versions, because the algorithm used is named in the transferred packets.
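
As a sketch of what signing with RSA and SHA-1 can look like with the standard Java security API: the class name PacketSigning, the 2048-bit key size and the idea of signing the raw packet bytes are assumptions made for this example, not details taken from the p2pMail source.

```java
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.Signature;

final class PacketSigning {
    // JCA name for "RSA signature over a SHA-1 digest"; a packet would carry this name.
    static final String ALGORITHM = "SHA1withRSA";

    static KeyPair newKeyPair() {
        try {
            KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
            generator.initialize(2048); // key size is an assumption
            return generator.generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    static byte[] sign(PrivateKey key, byte[] payload) {
        try {
            Signature signer = Signature.getInstance(ALGORITHM);
            signer.initSign(key);
            signer.update(payload);
            return signer.sign();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    static boolean verify(PublicKey key, byte[] payload, byte[] signature) {
        try {
            Signature verifier = Signature.getInstance(ALGORITHM);
            verifier.initVerify(key);
            verifier.update(payload);
            return verifier.verify(signature);
        } catch (GeneralSecurityException e) {
            return false; // malformed signature or unusable key counts as "not valid"
        }
    }
}
```

Because the algorithm name travels with the data, a receiver could pass that name to Signature.getInstance instead of the constant, which is what would make a later switch away from SHA-1 possible.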

The ID is generated as the SHA-256 hash of the public key. This way it should be practically impossible to fake an ID: an attacker would have to find a different key pair whose public key hashes to the same value.
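
A minimal sketch of deriving such an ID in Java follows. It assumes that the hash is taken over the key's standard encoding as returned by getEncoded(); which bytes the real client hashes is not stated here. The names NodeId, fromPublicKey and toHex are made up for the example.

```java
import java.math.BigInteger;
import java.security.KeyPairGenerator;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.security.PublicKey;

final class NodeId {
    /** 32-byte (256-bit) ID computed from a public key. */
    static byte[] fromPublicKey(PublicKey key) {
        try {
            // Assumption: hash the X.509 encoding of the key.
            return MessageDigest.getInstance("SHA-256").digest(key.getEncoded());
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 not available", e);
        }
    }

    /** The ID as 64 lowercase hex characters, including leading zeros. */
    static String toHex(byte[] id) {
        return String.format("%064x", new BigInteger(1, id));
    }

    /** Helper for experiments: a fresh RSA public key (2048 bits is an assumption). */
    static PublicKey newRsaPublicKey() {
        try {
            KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
            generator.initialize(2048);
            return generator.generateKeyPair().getPublic();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("RSA not available", e);
        }
    }
}
```

Anyone who receives a public key can recompute the hash and compare it with the ID they were given, so a key that does not belong to the ID is detected immediately.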

Finding Users

Emails need to arrive at their recipient, so the recipient has to be found in the p2p network. How can this be accomplished? First of all, the p2p network is a bunch of nodes with IDs that are somehow connected to each other.

somehow?

No. We can define how these nodes connect to each other. It would be way cool if we had some kind of metric, so that we could do some kind of binary search. We actually can define a metric if we look closely at the IDs of the users and use the Hamming distance, that is, the number of bit positions in which two IDs differ. Because we can now measure distance, we have also defined what the closest nodes to a given ID are.

Each user stores the (IP) addresses of the nodes closest to her ID as her neighbors. Because the greatest Hamming distance between two nodes is 256 (the bits are exactly complementary), a search that gets at least one bit closer with every hop needs at most 256 hops, so the diameter of the network is at most 256 (counting the number of intermediate nodes). So if A tries to find the user with ID B, A simply looks for her neighbor C that is closest to B, asks C for C's neighbor closest to B, and so on, until there are no closer nodes.
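
The search described above can be sketched as follows. This is a simplified in-memory model: the neighbor tables of all nodes are given as one map, whereas a real client would ask each node over the network. The names GreedyLookup, lookup and id are hypothetical.

```java
import java.math.BigInteger;
import java.util.Collections;
import java.util.List;
import java.util.Map;

final class GreedyLookup {
    /** Number of bit positions in which two (non-negative) IDs differ. */
    static int hammingDistance(BigInteger a, BigInteger b) {
        return a.xor(b).bitCount();
    }

    /**
     * Walks from `start` towards `target`, always moving to the neighbor that is
     * strictly closer to the target, and returns the node where no neighbor is closer.
     */
    static BigInteger lookup(Map<BigInteger, List<BigInteger>> neighbors,
                             BigInteger start, BigInteger target) {
        BigInteger current = start;
        while (true) {
            BigInteger best = current;
            for (BigInteger candidate : neighbors.getOrDefault(current, Collections.emptyList())) {
                if (hammingDistance(candidate, target) < hammingDistance(best, target)) {
                    best = candidate;
                }
            }
            if (best.equals(current)) {
                return current; // no closer neighbor: target found, or search is stuck
            }
            current = best;
        }
    }

    /** Convenience for small experiments: parses an ID written in binary. */
    static BigInteger id(String bits) {
        return new BigInteger(bits, 2);
    }
}
```

Every step strictly reduces the distance, so the loop always terminates. The returned node is the target only if the walk does not end earlier at a node whose neighbors are all farther away.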

This works fine if you have the ID of the person you want to send the email to. That ID is 64 hexadecimal characters long, phew. Luckily, the infrastructure of the p2p network allows a DHT (distributed hash table) to be implemented. This DHT maps user-specified names to IDs. Because these names are totally unrelated to the IDs of the users, it is much easier to fake a name than an ID. Because each name-to-ID mapping is stored several times, it will still be hard, though. I bet the current design leaves a lot of holes nonetheless.

Why doesn't A get stuck in a local minimum?

This needs some experimentation. TODO: Think about this. If it doesn't happen, write down why. If it does happen, do something to fix it.

Sending an Email

[Diagram: how an email is sent via the p2p network]

Let's assume Alice wants to send an email to Bob. If Bob is online, this is no big deal. Alice just searches for Bob in the network as explained above, finds him and sends him her mail directly. If Bob is not online, this is actually no big deal either. Alice does what anyone would do in real life: she finds some of Bob's neighbors (friends) and asks them to be so kind as to deliver the email to Bob when Bob asks for it. If she sends the mail to enough of Bob's neighbors, she can be pretty sure that Bob will receive the email.

She will of course have to encrypt her email, because otherwise she cannot guarantee that her mail doesn't get read by others. Fortunately, she used Bob's ID, which is the hash of his public key, to find him in the first place, so she can also be sure that she is really using his public key. Alice can now go offline.

When Bob comes online, he asks his neighbors whether they have mails for him. If Alice sent the email to enough neighbors, it is very likely that Bob receives her email. Bob may also write a short reply email, so Alice can be sure he really got the mail.
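
The deposit-and-collect steps can be modelled in a few lines. This is a toy in-memory simulation under simplifying assumptions: Alice already knows all candidate nodes, mails are plain strings standing in for encrypted messages, and identical copies are merged by content. The class StoreAndForward and its methods are invented for the example.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class StoreAndForward {
    /** storing node ID -> recipient ID -> mails held for that recipient */
    private final Map<BigInteger, Map<BigInteger, List<String>>> stored = new HashMap<>();

    /** The k nodes with the smallest Hamming distance to the target ID. */
    static List<BigInteger> closest(Collection<BigInteger> nodes, BigInteger target, int k) {
        List<BigInteger> sorted = new ArrayList<>(nodes);
        sorted.sort(Comparator.comparingInt((BigInteger n) -> n.xor(target).bitCount()));
        return sorted.subList(0, Math.min(k, sorted.size()));
    }

    /** Alice's side: leave a copy of the mail with the k nodes closest to the recipient. */
    void deposit(Collection<BigInteger> nodes, BigInteger recipient, String encryptedMail, int k) {
        for (BigInteger node : closest(nodes, recipient, k)) {
            stored.computeIfAbsent(node, n -> new HashMap<>())
                  .computeIfAbsent(recipient, r -> new ArrayList<>())
                  .add(encryptedMail);
        }
    }

    /** Bob's side: ask the neighbors that are online now; duplicate copies are merged. */
    Set<String> collect(Collection<BigInteger> onlineNodes, BigInteger recipient) {
        Set<String> mails = new LinkedHashSet<>();
        for (BigInteger node : onlineNodes) {
            mails.addAll(stored.getOrDefault(node, Collections.emptyMap())
                               .getOrDefault(recipient, Collections.emptyList()));
        }
        return mails;
    }
}
```

Bob gets the mail as long as at least one of the nodes that stored it is among the nodes he can reach when he comes back, which is why sending to more neighbors raises the chance of delivery.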

Communication, Packets

Currently, communication is done via packets containing XML markup. A DTD describing the XML can be found here. I am not yet very experienced with XML, so the DTD may contain some syntax errors (I haven't tested it yet).

Packets are transmitted via UDP. For bigger packets (if you want to send big files) TCP might be the cleverer choice, but I chose UDP because it can also work with very restrictive firewalls. Imagine you cannot (or don't want to) open ports to the WAN for listening to packets. Then others will have a very hard time contacting you, because you are actually not allowing them to. But a p2p network depends on users being contacted. So how can this dilemma be solved?

It can be solved using a method called UDP hole punching. In a p2p network this is simple, because if A wants to contact B, A has learned of B's existence from C, who has a connection to B. If A sends a packet directly to B, B's firewall rejects the packet, because B doesn't know A and has never sent A a packet to which B could expect a reply. To circumvent this, A tells C to tell B to send packets to A. Because A has already sent packets to B, B's packets are recognized by A's firewall as a response and are thus forwarded to A's p2pMail. Because B has now sent packets to A, A's packets to B are let through as well.
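
The following toy model shows why the order of packets matters. It is not network code: each Host stands for a client behind a stateful firewall that only admits packets from addresses the client has already sent to, which is a simplification of how real firewalls and NATs track UDP flows. All names are invented for the example.

```java
import java.util.HashSet;
import java.util.Set;

final class HolePunchModel {
    /** A client behind a firewall that only lets in packets that look like replies. */
    static final class Host {
        final String address;
        private final Set<String> contacted = new HashSet<>();

        Host(String address) {
            this.address = address;
        }

        /** Sends one packet to `peer`; returns true if the peer's firewall lets it through. */
        boolean sendTo(Host peer) {
            contacted.add(peer.address); // the outgoing packet opens a hole for replies
            return peer.contacted.contains(address);
        }
    }

    public static void main(String[] args) {
        Host a = new Host("A");
        Host b = new Host("B");
        System.out.println(a.sendTo(b)); // false: B never sent anything to A
        // ... A asks C to tell B to send packets to A (relay not modelled) ...
        System.out.println(b.sendTo(a)); // true: A's firewall sees a reply from B
        System.out.println(a.sendTo(b)); // true: B's firewall now expects A
    }
}
```

The first packet from A is lost, but it is not wasted: it is what makes A's firewall accept B's packet a moment later.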

Nonetheless, some users who can be contacted directly are needed, because a node that is bootstrapping does not yet know any intermediary and therefore cannot use UDP hole punching.

Test-cases about communication and mail-sending can be found here.

Programming Language

This is the least important part, because any kind of client should be allowed. But some client is needed to test this thing, so there is one ;) It is written in Java 5.0 and needs the JCE (for good encryption algorithms) or alternatively Java 6.0 (where the JCE is included). Some people might also need to enable the Java Cryptography Extension (JCE) Unlimited Strength Jurisdiction Policy Files 5.0 (see bottom of the page).

Javadoc can be found here.
