A world without certificate authorities

When networks began to expand and people saw the need for secure communication, they designed complex systems based on public key cryptography that worked, more or less. The problem: how do you trust that the key a server sent you is the right one? How can you make sure that it is not somebody else trying to impersonate that website?

Multiple solutions were proposed, and the most promising was a public directory of domain names and associated public keys, maintained by a peer-to-peer network named KeyCoin. It looked better than the so-called Web of Trust solutions, because everybody could agree on what was the correct key for a given domain. As long as nobody held 51% of the network, no change could happen without being validated by a lot of different peers. The network was maintained by 10,000 enthusiast system administrators who took their task very seriously (after all, the security of the whole system depended on their honesty), and nobody had enough computing power to take over the network.

After a while, people began using the system, since it was directly integrated in their browsers, but they did not want to run a node on the network themselves. It was too bothersome, and they could trust the administrators. Also, they had to ask one of them every time they wanted to make a change. The whole process was a bit artisanal.

In the meantime, some people demonstrated the 51% attack on networks of reduced size, and that worried users. They wanted a safe system, not one relying only on those sysadmins, who could do anything. Who were they anyway? Still, running that system was too complex for non-technical people to do themselves, so they did not worry enough. But some governments found that rewriting the truth of name/key matching was interesting. Maybe to catch pedophiles, terrorists, criminals. Or maybe to censor websites, I do not know, they told me it was for my own good.

Some smart person found a good solution: if controlling the whole system required owning 51% of it, the easiest way was to have a lot of machines, enough to counteract the sysadmins. That did not seem risky when people designed the system. Nobody could have enough computing power to take over the whole network, and there would be even more nodes every day.

Yet, that person got enough funding to install tens of thousands of machines and make them join the network. They even provided a nice enough interface for people and businesses to input their domain name and public key, as long as they paid some fees. The sysadmins welcomed them at first, since money coming into the system validated their ideas. After a while, they started worrying, since none of them could keep up with the computing power, but that company assured them it would never attain 51% of the network.

Other companies jumped on the bandwagon and started to profit from that new business opportunity. Governments started their own server farms to participate too. Problem: now that everybody (except the sysadmins) had a lot of computing power, nobody had enough to control the network entirely.

So they started making alliances. If a few major players worked as a team, they could do whatever they wanted on the network. If one of them decided to try to replace a key on the ledger, the others could help it along. Of course, once they began doing that, others wanted to participate. So they created a few rules for joining their club. First, you needed to have enough machines. That was a good rule, because it made a big barrier to entry. You could not start as a small player. The other rules? You had to submit to an audit, performed by the other players. Yet another barrier to entry. And once they deemed you acceptable, you had to follow the requests of governments, which were arbitrarily refusing candidates.

Even with the big barriers to entry, a few hundred players came up, often backed by governments. Of course, they all ended up on the same team, doing whatever they wanted, as long as nobody was complaining, because any time one of them had something shady to do, all the others followed automatically.

Since building those big companies required money, they made their clients pay more and more, and to make that easier to accept, they provided “premium” options to show that they trusted you more, since they took the time to phone your company and ask a few questions.

Some found that big system too centralized, too obedient to states, and decided to fork it. There are now separate public ledgers, but they do not come directly embedded in browsers; you need to integrate them yourself, and that’s bothersome. Also, most of those networks have a few hundred nodes at best.

From a nice, decentralized, home made system, we ended up with a centralized system controlled by corporations and governments.

Now let me tell you about that system I designed. It is based on a concept named a certificate, a cryptographically signed file that links a public key to a domain name. Now here’s the catch: a certificate represents a key, and is signed by another key, which is represented by another certificate, and so on and so forth, until a certificate that signs itself. That system is good, because you just have to embed the root certificate that your friend gives you, and you’ll be able to verify the keys of their websites, even if those keys change. And all this without even asking the public ledger, so it is a truly decentralized and more anonymous system! Nothing could go wrong with that, right?


Programming VS Mathematics, and other pointless debates

I do not know who started this argument a few days ago. It feels like something coming from HN. Do you need to know mathematics to be a good programmer?

There are a lot of differing opinions. Maybe programming is a subbranch of mathematics, or programming is using mathematics. Or learning programming is closer to learning a new language. For me, saying that programming is about languages is like saying that literature is about languages. Sure, you need words to express concepts, some languages are better suited than others for that, and some concepts are better expressed in other languages. It is more like a hierarchy to me: philosophy formalizes concepts used by authors to write in common languages. Mathematics formalizes concepts used by programmers to create code in common languages.
But this is beside the point.

This debate sparks outrage, since it touches a central point of our education, and one that is often not taught very well. “Look, I do not use geometry while writing a loop, so maths are pointless for me”. A lot of developers will never learn basic algebra or logic and will never need it in their day job. And that’s okay.
Programming is not a single profession anymore. Each and every one of us has a different definition of it. A mechanical engineer working on bridges, another on metallic parts for cars and another on plastic toys all have different needs and different techniques for their jobs, although the fundamental basis (evaluating breaking strength, time of assembly, production costs) is the same. That does not make one of these jobs worth more than the others.

The real problem is that we are still fighting among ourselves to define what our job is. The other pointless debate, about software being engineering, science or craft, is evidence of that. And it will stay hard to define for a long time.
We are in a unique position. Usually, when a new field emerges, either tinkerers launch it and, later, good practices are studied to turn it into engineering, or scientists create it, then the means of production become cheaper and crafters take over.
Computers were started by scientists, but ease of access gave crafters a good opportunity to take over. But that does not mean research stopped when people started coding at home. So now, in a relatively new field (less than a century old), while we are still exploring, we have a very large spectrum of jobs and approaches, from the most scientific to the most artistic kind. And that is okay. More world views will help us get better at our respective jobs.

So, while you are arguing that the other side is misguided, unrealistic or unrigorous, take time to consider where they come from. They do not have the same job as you, and that job can seem pointless to you, but they can be good at it, so there is probably something good you can learn from their approach. The only thing you should not forgive from the other side is a lack of curiosity.

The network is the computer, the cluster is the RAM

There is a very weird part of web applications, where all the nice abstractions and syntactic reasoning go wrong: the interface between the code and a database. At best, there is a leaky abstraction of the database with an ORM, and you have to think about which methods to apply to get the underlying SQL query you need; at worst, you write queries and deserialize manually.

This happens because at one point, applications needed to manipulate more data than their host’s memory could handle. This required a good abstraction over storage, efficient data walking algorithms and fine tuned caching. This also required thousands of hours of engineering, to get a database that is at least bearable to use. Since so much work was put in those databases, you might as well implement as many features as possible, to reuse all this fine engineering.

To work efficiently with these data warehouses and offload part of the selection work from the application, query languages inspired by logic programming were invented. Basically, they make it easy to work with relations: entity/attribute/value triplets like RDF, or tabular data. Those query languages are deliberately not Turing complete: they do not include loops, negation or unbounded recursion. This helps a lot in optimizing the queries.

Unfortunately, this query language is the barrier between an application and its data. Instead of reasoning about what is in memory, the code must be transformed to load data from the database through a query, deserialize it, compute, reserialize the data and put it back in the database. Even worse, for efficiency’s sake, some developers push more and more logic to the database, with ever more complex queries, views and stored procedures.
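To make that round trip concrete, here is a minimal sketch with Python’s built-in sqlite3 module (the schema is made up for illustration):

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER)")
    conn.execute("INSERT INTO accounts VALUES (42, 1000)")

    # The round trip: query, deserialize, compute, reserialize, write back.
    account_id, balance = conn.execute(
        "SELECT id, balance FROM accounts WHERE id = ?", (42,)).fetchone()
    balance += 100  # the only line that is actual application logic
    conn.execute("UPDATE accounts SET balance = ? WHERE id = ?", (balance, account_id))
    conn.commit()

Everything around the middle line is serialization plumbing imposed by the query barrier.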

What if we could reason directly about a cluster of data as if it were already in memory? I do not want to create a structure from a deserialized row, change a value, then put that row back “where id = $myId”. I want to access a structure that is already there in memory, and change the value directly (or clone it and change my index, but that talk is for another blog post).

“No, you cannot directly access data that is not already in your memory.” Sure I can. We already have powerful tools for that. L1 and L2 caches use that principle to load data from RAM and make it available faster to the CPU. Memory-mapped files can be lazily loaded page by page into virtual memory. Imagine loading data lazily from the network, into your address space… Nowadays, we can index data on 64 bits, enough to address the whole world!
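Locally, memory mapping already behaves this way. A minimal sketch with Python’s standard mmap module (the file name is made up); nothing is read until a page is touched:

    import mmap

    with open("huge_dataset.bin", "rb") as f:
        # Map the whole file without reading it: pages are pulled in lazily,
        # on first access, by the virtual memory subsystem.
        data = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)

        # This slice triggers a page fault that loads only the relevant page.
        chunk = data[1_000_000:1_000_016]

The thought experiment is to let those page faults resolve over the network instead of the local disk.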

“But this totally breaks your security model.” No, it does not. First, most database clusters already assume they’re running on a trusted network. Second, since I see the cluster as a part of my hardware, I think adding an MMU to the lot would work quite well.

“It does not work, because of concurrent accesses.” This already happens in databases, and this is where their powerful query languages get things wrong: if you have a powerful way to access multiple rows at the same time, you have to lock huge parts of the database at once in a transaction to run your mutating query. For virtual memory, we have a lot of interesting tools. Memory pages can be read-write or read-only. Locking through mutexes or Software Transactional Memory could also be implemented at a cluster’s scale. But concurrency is a hard problem that is often better solved through good data architecture. Immutable data, colocating related data, append-only data structures: all of these work as well in memory as on a networked cluster.

This is of course a very big gap to jump, from our traditional databases to a total abstraction in memory, but I think it is an interesting alternative to consider.

There is another model to consider here, one that is currently adopted by large distributed databases: since an application cannot do all the work by just loading data in its memory space, let’s push code onto the data, and run a Turing complete query language on the cluster. This is actually the same kind of model, with worker threads running on your data while you wait for a “work done” message, but you still need to interface your app with the query language. Maybe someday I’ll be able to send a compiled function to run on a cluster.

All in all, the big tools that people built to fight the inefficiencies of yesterday’s technology have to be questioned today. By removing the complex abstractions and their obsolete limitations, we could obtain a powerful and simple model for writing our future code.

Revisiting Zooko’s triangle

In 2001, Zooko Wilcox-O’Hearn conjectured that a key-value system, in the way its keys address those values, must make a compromise between three properties. Those properties are:

  • distributed: there is no central authority in the system (and the other nodes of the system are potentially untrusted)
  • secure: a name lookup cannot return an incorrect value, even in the presence of an attacker
  • human usable keys: keys that a human will be able to remember and write without errors

The conjecture stated that you may have at most two out of these three properties in the system.

Examples of those properties at work:

  • the DNS is secure (one possible record for a given name) and human usable (domain names can be learned), but not distributed (the server chain is centralized)
  • the PGP mapping between email and public keys is distributed (no central authority defining which email has which key) and human usable (public keys are indexed by emails) but not secure (depending on your view of the web of trust, identities could be falsified)
  • Tor .onion addresses are secure (the address is the hash of the public key) and distributed (nobody can decide that an address redirects to the wrong server) but not human meaningful (have you seen those addresses?)

Some systems, like NameCoin, have emerged lately and exhibit those three properties at the same time. So is the conjecture disproved? Not really.

When considering that triangle, people often make a mistake: assuming that you have to choose two of these properties and will never approach the third. These properties must not be seen as absolute.

Human usability is easily addressed with petname systems, where a software assistant can help you attach human-meaningful names to secure and distributed keys (at the risk of collision, but not system-wide).
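A minimal sketch of the idea in Python (the names and the .onion address are made up; a real assistant would persist the mapping and guide the user through collisions):

    # A petname system: a purely local mapping from human-chosen names
    # to secure but unmemorable identifiers.
    petnames = {}

    def add_petname(name: str, secure_id: str) -> None:
        # Collisions are local to this user, never system-wide.
        if name in petnames and petnames[name] != secure_id:
            raise ValueError(f"{name!r} already names another key")
        petnames[name] = secure_id

    add_petname("alice-blog", "exampleexample22.onion")
    print(petnames["alice-blog"])  # the secure identifier, resolved locally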

You can add some distributed features to a centralized system like SSL certificate authorities by accepting multiple authorities.

NameCoin is distributed and human meaningful, and creates a secure view of the system through the blockchain.

And so on.

The three properties should be redefined with what we know ten years later.

Human usable names

We now understand that a human-meaningful name is a complex concept. We see people typing “facebook” in a search engine because they would not remember “facebook.com”. Phone numbers are difficult to remember. Passwords, even when chosen through easy-to-remember methods, will be repeated at some point. And we could also talk about typosquatting…

Here is a proposition: a human-usable key is an identifier that a human can remember easily in large numbers and distinguish just as easily from other identifiers. I say identifier because it could be a string, a picture, anything that stands out enough to be remembered and whose differences can be easily spotted.

That definition is useful because it is quantifiable. We can measure how many 10-digit phone numbers a human can remember. We can measure how many differences a human can spot between two strings or pictures.

Decentralized and/or secure names

The original text describes distributed as “there is no central authority which can control the namespace, which is the same as saying that the namespace spans trust boundaries”. The trust boundary is defined as the space in which nodes could be vulnerable to each other (because they trust each other). In a fully distributed system, a node cannot force any other node to have a different view of the system.

Secure means that there is only one correct answer for a name lookup, whoever does the lookup. This is the same as saying that there is a consensus (at least in a distributed system).

We can see that the decentralized and secure properties are at odds with the consensus problem in a distributed system. This is where systems like NameCoin make their compromise. They exchange an absolutely secure view of the system for eventual consistency, and fully distributed trust boundaries for Byzantine problems. They guarantee that, at some point, and unless there are too many traitors in the system, there will be a universal and unique mapping for a human-readable name.

So, we could replace the “secure” property with this one: the whole system recognizes that there is only one good answer for a record at some point in the future. Again, this is measurable, and it also addresses the oddity in DNS that records have to propagate (so views can differ for a while). Systems where identifiers do not give unique responses will not satisfy that definition.

The last property, “decentralized”, can be formulated like this: a node’s view of the names cannot be influenced by a group of unauthorized nodes. In a centralized system, where there is essentially one node, this definition cannot apply. In a hierarchy of nodes, like DNS, we can easily see that a node’s view can be abused, since there is only one node you need to compromise to influence it: its first nameserver (this is used, for good and bad reasons, in a lot of companies). In a distributed system, this definition becomes quantifiable and joins the Byzantine problem: how many traitor nodes do you need to modify a node’s view of the system?

Here, we have three new definitions that are more nuanced, that allow some compromise and modulation of the needs, and whose limits can be measured or calculated. These quantified limits can be useful to describe and compare future naming system proposals.

How to choose your secure messaging app

Since WhatsApp’s acquisition was announced, a lot of people have started switching to alternatives, trying to escape from Facebook. Some of them then discovered my article about Telegram, and a common answer was “hey, at least it is better than WhatsApp, because it is open source, faster and it has encryption”.

This is a very bad way to decide what application you should use. If you choose a secure messaging app, it must be because you need it, not just because you want to avoid Facebook.

Those are not good enough requirements:

  • independent from Facebook
  • fast
  • multi platforms
  • open source

Yes, even open source, because it does not magically make software safe.

So, what are good requirements? Well, I already have a list of requirements a secure messaging app should meet to be considered. If an app does not follow those requirements, it may not be a good idea to use it.

But it still does not mean the app will fit your use case. So you must define your use case:

  • Why do you need it?
  • With whom will you communicate?
  • Who is the adversary?
  • What will happen if some of your information is revealed to the adversary?
  • Does it need to be always available?
  • For how long will it be used?

This is part of what I mean when I insist on having a threat model: you cannot choose correctly if you do not know the risks.

Here are a few examples that you could consider.

The activist in a protest

The activist must be able to communicate quickly in the crowd. Identifying info might not be the most important concern, because she can use burner phones (phones that will be abandoned after the protest). The most important feature is that it should always be available. Phone networks have often been disrupted to hinder activist communication, so a way to send messages through WiFi or Bluetooth might be useful. The messages can be sent to a lot of different people, so being able to identify them might be important. But if the group is large enough to be infiltrated easily, then having no way to identify people is crucial.

Being able to send photos is important, because they might be the only proof of what happened in the protest. Here, I have in mind the excellent ObscuraCam app, which is able to quickly hide the faces of people in photos before sending them.

The application should not keep logs, or provide a way to quickly delete them, or encrypt them by default, because once someone is caught, the police will look through the phone.

The crypto algorithms and protocols should be safe and proven for that use case, because the adversaries will have the resources to exploit any flaw.

No need for a good update system if the devices will be destroyed after use.

The employee of a company with confidential projects

The adversaries here are other companies, or even other countries. The most important practice here is “need to know”: reduce the number of people knowing the confidential information. That means the set of people communicating with each other is small, and you can expect them to have a means of exchanging information securely (for example, to verify a public key).

Identifying who talks with whom is not really dangerous, because it is easy to track the different groups in a company. You may be confident enough that the reduced group will not be infiltrated by the adversary. The messages should be stored, and ideally be searchable. File exchange should be present.

There could be some kind of escrow system, to reveal information if you have a certain access level. Authentication is a crucial point.

The crypto may be funnier in that case, because the flexibility needed can be provided by some systems, like identity-based encryption. Enterprise policies might be able to force regular updates of the system, so that everybody has the same protocol version at the same time, and any eventual flaw will be patched quickly.

The common user

It is you, me, anyone wanting to exchange private messages with friends or family. Here, trying to protect against the NSA is futile, because most of the contacts might not have the training needed. Trying to hide the contact list from Facebook is futile too: even if someone protects the information, one of the contacts may not. The adversaries you should consider here: crooks, pirates, anyone that could exploit the private messages for criminal ends (stealing bank info, blackmailing, sending malware, etc.).

An application fitting this use case should encrypt messages, preferably end to end, to limit problems when the exchange server is compromised. The service might not provide any expectation of anonymity. Messages should be stored, but encrypting them is a good option, in case the device is lost or stolen.

The crypto does not need to be very advanced, but it should use common, well known designs.

There should be a good update system, a way to negotiate protocol versions (and forbid some unsafe versions), because you will never be sure that everybody has performed all the needed updates.

Your use case here

Those were some common situations, for which some solutions exist, but there are a lot more possible use cases. If you are not sure about yours and need help defining your threat model, do not hesitate to ask for help, and do not jump on a solution because the marketing material says it is safe.

A good security solution will not only tell you what is protected, and how, but also what is not protected, and the security margins you have. It will also teach you the discipline you need to apply to get the most out of it.

The problem with meritocracy

Every time people discuss the hacker community and its diversity, I see someone waving the “meritocracy” argument: “It is not our fault those minorities are not well represented; if they knew more stuff or did more stuff, they would have a better status.”

It is easy to see how that argument would be flawed, as meritocracy is a power structure, and whenever a power structure is created, after some time it tends to reinforce its own community. But that is not my point right now.

I realized that the idea of meritocracy is so deeply ingrained in the hacker mindset that we lost sight of what was important. I can see how that idea is appealing. Once you prove you know stuff, people will recognize you, and that will be enough to motivate you to learn. Except it is not. The meritocracy is just another way to exclude people. Once you consider someone’s status by how much you perceive they know, things go downhill.

Some are good at faking knowledge. Some know their craft, but do not talk that well. Some are not experts, but have good ideas. Some would like to learn without being judged. Every time you dismiss someone’s opinion because of their apparent (lack of) knowledge, every time you favor someone’s opinion because of their apparent knowledge, you are being unscientific and unwelcoming. You are not a hacker, you are just a jerk.

Somewhere along the way, people got too hung up on meritocracy, and forgot that you hack for knowledge and for fun, not for status. It is all about testing stuff, learning, sharing what you learned, discussing ideas and helping others do the same, whatever their skills or their experience. Status and power structures should have nothing to do with that.

Guess what? I pointed out that bad behaviour, but I am guilty of it too. I have to constantly keep myself in check, to avoid judging people instead of judging ideas. That’s alright. Doing the right thing always requires some effort.

Criteria for a crypto app

Following the previous article, people have asked me what I would consider a good secure system, and others asked me to review their apps, so I think it will be interesting to expose my process when studying those projects.

Threat modeling

The most important point I look for in a project is the threat model. This is the document that will explain for whom the project was created, who the adversaries are, what they are trying to obtain, and which of these threats you are addressing.

Without that document, I cannot know if you considered all the possible actors, and I must infer it from the protocol, which is relatively easy, but my view of the threat model might not correspond to what you expected.

With a good threat model, I can know right away what your target market is (e.g. sexting for teens, or secure reporting for journalists in war environments), see if your users will understand the implications, if it will need training, and more importantly, if your system can be safe for that context.

You cannot create a project and say that it will solve all of the privacy problems with some magical crypto algorithm, against all adversaries, even state actors. I would prefer a useful tool for a niche with real and well defined needs.

Prior art

As you have probably seen, the secure messaging space is already very crowded. If you come up with a new solution to an already solved problem, you need to justify it. Why didn’t you improve an existing project? Couldn’t you adapt someone else’s code, add a better UI?

The NIH syndrome is at the heart of innovation, so I am not against it. But in the case of crypto applications, it might be a good idea to employ already existing (and already audited) code, instead of writing a whole new protocol or algorithm from scratch.

Otherwise, if you are working on an unsolved problem, or improving on current solutions, be prepared to justify it, at length, if you employ unusual systems. I am not telling you to avoid funny stuff like Paillier’s cryptosystem, PIR or pairing-based cryptography. Just be aware that people will ask you about these.

Publications

That part is fundamental: if you are providing a new protocol or algorithm, you should publish it and ask for review before you start coding and getting users. I am not advising you to start up LaTeX and write a paper in ACM format. Just explaining your system on a webpage is fine. The crypto community is full of nice people who will be able to point out if there is any problem (and if you use the academic way of publishing, you might even profit from other people’s funding to get reviews :p).

Some said that the crypto community is full of bitter people eager to attack any new project, following the whole Telegram debacle. That tends to happen when you make a big announcement to get users, claiming that it will solve any security problem, and dismiss the opinions of experts, without having asked for review previously.

Note that some of those experts have worked for years on a project before even thinking of communicating about it. As examples, check out Briar, Pond or Cryptosphere: those are quiet but interesting projects. They are not trying to get a lot of users quickly or profit from the post-Snowden panic. They have been at it for a long time.

So, publish, ask for review, fix flaws, publish again, fix stuff, and repeat again and again. That is the smartest way to spend your time and money on your project. Once everything is developed and deployed, you will have a hard time trying to plug the holes.

Protocol design

Once we get into the technical stuff, the protocol design is interesting for getting a high-level view of what you want to achieve. I’ll ask questions like:

  • Is it server centric or P2P? (note: a network of servers introduces routing, but is not P2P)
  • Does it include authentication?
  • Is it encrypted end to end?
  • How do you protect against DoS?
  • Is it versioned? Do you allow for protocol version negotiation? Are the algorithms negotiated?
  • Can you revoke keys or identities?

Often, the protocol shows what you want to achieve with your system, and it often addresses more threats than the crypto algorithms themselves. A good way to present your protocol is to use diagrams and present the message contents.

Do not insist on algorithms at this point: use general words to describe the primitives you need, like authenticated cipher, public key, key derivation function, MAC. You might change the algorithms later, so stating the properties you need will help reviewers understand what you want to achieve.

A specific note on server VS peer-to-peer: it is a very understandable feeling for geeks that P2P architectures look better, because they decentralize everything, etc. But they can introduce other problems (like hole punching or Sybil attacks), and in some cases, you will not be able to avoid servers (for message routing and retries, for mobile systems, etc.). Both types of systems are fine, just be aware of their shortcomings.

Cryptographic constructs

Cryptographic algorithms are not enough; you need to apply them correctly. I will have no pity if you say you use “military grade AES 256 encryption” but do not know what a block cipher mode or Encrypt-then-MAC is. A lot of ugly details can hide here, so do not try to be clever, use battle-tested constructions:

  • add a separate authentication layer to Diffie-Hellman key exchanges
  • use an authenticated encryption mode (see the sketch after this list)
  • use RSA-OAEP instead of PKCS1 padding
  • know well if you need a nonce, an unpredictable number or a time based ID
  • etc.
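As an illustration of the authenticated encryption point, here is a minimal sketch with AES-GCM from the Python cryptography package (the primitive choice and names are mine, for illustration only):

    import os
    from cryptography.hazmat.primitives.ciphers.aead import AESGCM

    key = AESGCM.generate_key(bit_length=256)  # from a CSPRNG, never derived from the message
    aesgcm = AESGCM(key)

    nonce = os.urandom(12)          # 96-bit nonce, unique per message under a given key
    plaintext = b"attack at dawn"
    associated_data = b"header-v1"  # authenticated but not encrypted

    ciphertext = aesgcm.encrypt(nonce, plaintext, associated_data)
    # decrypt() raises InvalidTag if the ciphertext or associated data was tampered with
    assert aesgcm.decrypt(nonce, ciphertext, associated_data) == plaintext

With a mode like this, integrity comes built in, instead of being bolted on with a plain hash.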

This is one of the parts where crypto experts will ask annoying questions, because a lot of bugs come from there. They can also propose better solutions (safer, more performant, etc), so listen to them.

If you are employing an unusual scheme here, be prepared to justify it. It might be ok for you, but if the design looks weird to cryptographers, that will raise alarms. Your scheme could be safe, but if it has never been proven right, you are taking a risk, and your users will take that risk too. Is it worth it? Hint: your weird design should provide a unique property that no other algorithm has.

Choice of algorithms

Yes, I do not worry about algorithms until I am already deep into the system. It is not that hard to make correct choices there. Just keep up with the recent attacks (i.e. avoid RC4), choose large enough keys, and choose correct elliptic curves.

Every algorithm has parameters that you need to get right, so be sure to do your research on your algorithm choices:

  • AES-CBC needs an initialization vector, but AES-CTR uses an incremented nonce
  • RSA needs a good exponent
  • Some elliptic curves work better for some operations

Even if you choose dubious algorithms, if your protocol was well designed, you will be able to move to better algorithms. Be careful with algorithm negotiation, though; a lot of smart people were bitten before.

The implementation

This is probably the part that I will skip, because I have neither the time nor the funding to audit thoroughly the code of every new project. I will often grep a bit through the code and look for some important points, but this is not something that should be done quickly. This is where the protocol review shows its limits.

Even with a good design, a lot of vulnerabilities can be present in a flawed implementation. Crypto projects should undergo a careful audit like the one Least Authority performed recently on Cryptocat. And that is why you should not communicate about your project before it has been reviewed.

There are things you should always look for in your software projects:

  • encrypting data at rest: if you worry about stolen data, know that a mobile phone or laptop can be stolen
  • random number generation: you should use a CSPRNG, with a good source, and probably some user or device specific data (see the sketch after this list)
  • data backup: is it possible? is it safe?
  • software updates: are they downloaded from a secure source? Are the updates verified?
  • Do you use public key pinning?
  • How long are the private keys stored as plaintext in memory?
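On the random number generation point, the right reflex in Python looks like this minimal sketch (the sizes are arbitrary):

    import secrets

    # secrets (or os.urandom) draws from the OS CSPRNG; the random module does not.
    session_key = secrets.token_bytes(32)    # 256 bits of key material
    reset_token = secrets.token_urlsafe(16)  # URL-safe token, e.g. for password resets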

The implementation details are as important as the whole protocol. You can have a good protocol, but a small error in the code could greatly affect your users. Nevertheless, specifying your protocol is useful, because people can provide better implementations, or make it interoperate with other software. Having other implementations is a good thing: you will not control those versions, but they will be able to build cool stuff around your system, and do a part of your PR.

User interface

This part is more and more important, because we have been able to create safe systems for years, but often at the price of usability. The user experience of crypto apps needs a lot of innovation, and I’ll follow closely any interesting idea in that space: onboarding experience, useful alerts, user decision making, etc. People should be able to understand when there is a security problem.

I’ll state it once more: if you create new crypto software, you HAVE to make it easy to use and understand. Some complexity is acceptable, but it must be compensated by documentation (with screenshots, etc.) or training.

Other criteria

There are two others that I could think of, but they do not matter that much.

The first is the team. I have been accused of making fun of Telegram for waving around their team of PhDs, but the truth is that I was hopeful: a team full of smart people can come up with interesting design and solve complex problems. If they do not deliver on that, I could be less indulgent. That does not mean I will think less of people without big diplomas. I know too many smart people that dropped out of school to make that mistake. Ultimately, the important thing to judge is the design.

The last parameter is attitude. It is normal to be defensive when someone else reviews your work, but that does not justify denial and dishonesty. People are often taking time off of their job to study your system, so they will be quick and get to the point. If you do not answer or refuse to explain your decisions, it will smell fishy. Even more if you did not ask for a review before communicating about your project. But it does not matter that much. If you are humble and quick to answer, people may help you out of good will, but if you anger cryptographers, you may just have won a free thorough audit :D

 

Telegram, AKA “Stand back, we have Math PhDs!”

Here is the second entry in our series about weird encryption apps: Telegram, which got some press recently.

According to their website, Telegram is “cloud based and heavily encrypted”. How secure is it?

Very secure. We are based on a new protocol, MTProto, built by our own specialists, employing time-tested security algorithms. At this moment, the biggest security threat to your Telegram messages is your mother reading over your shoulder. We took care of the rest.

(from their FAQ)

Yup. Very secure, they said it.

So, let’s take a look around.

Available technical information

Their website details the protocol. They could have added some diagrams, instead of text-only, but that’s still readable. There is also an open source Java implementation of their protocol. That’s a good point.

About the team (yes, I know, I said I would not do ad hominem attacks, but they insist on that point):

The team behind Telegram, led by Nikolai Durov, consists of six ACM champions, half of them Ph.Ds in math. It took them about two years to roll out the current version of MTProto. Names and degrees may indeed not mean as much in some fields as they do in others, but this protocol is the result of thoughtful and prolonged work of professionals

(Seen on Hacker News)

They are not cryptographers, but they have some background in maths. Great!

So, what is the system’s architecture? Basically, a few servers spread around the world, routing messages between clients. Authentication is only done between the client and the server, not between clients communicating with each other. Encryption happens between the client and the server, but not using TLS (some homemade protocol instead). Encryption can happen end to end between clients, but there is no authentication, so the server can perform a MITM attack.

Basically, their threat model is a simple “trust the server”. What goes over the network may be safely encrypted, although we don’t know anything about their server-to-server communication, nor about their data storage system. But whatever goes through the server is available in the clear. By today’s standards, that’s boring, unsafe and careless. For equivalent systems, see Lavabit or iMessage. They will not protect your messages against law enforcement eavesdropping or server compromise. Worse: you cannot detect a MITM between you and your peers.

I could stop there, but that would not be fun. The juicy bits are in the crypto design. The ideas are not wrong per se, but the algorithm choices are weird and unsafe, and they take the most complicated route for everything.

Network protocol

The protocol has two phases: the key exchange and the communication.

The key exchange registers a device to the server. They wrote a custom protocol for that, because TLS was too slow and complicated. That’s true: TLS needs two round trips between the client and the server to exchange a key. It also needs X.509 certificates, and a combination of a public key algorithm like RSA or DSA, and possibly a key exchange algorithm like Diffie-Hellman.

Telegram greatly simplified the exchange by requiring three round trips, using RSA, AES-IGE (a weird mode that nobody uses), and Diffie-Hellman, along with a proof of work (the client has to factor a number, probably as DoS protection). Also, they employ a homemade function to generate the AES key and IV from nonces generated by the server and the client (server_nonce appears in plaintext during the communication):

  • key = SHA1(new_nonce + server_nonce) + substr (SHA1(server_nonce + new_nonce), 0, 12);
  • IV = substr (SHA1(server_nonce + new_nonce), 12, 8) + SHA1(new_nonce + new_nonce) + substr (new_nonce, 0, 4);

Note that AES-IGE is not an authenticated encryption mode. So they verify the integrity. By using plain SHA1 (nope, not a real MAC) on the plaintext. And encrypting the hash along with the plaintext (yup, pseudoMAC-Then-Encrypt).

The final DH exchange creates the authorization key that will be stored (probably in plaintext) on the client and the server.

I really don’t understand why they needed such a complicated protocol. They could have done something like this: the client generates a key pair, encrypts the public key with the server’s public key and sends it to the server with a nonce, then the server sends back the nonce encrypted with the client’s public key. Simple and easy. And this would have provided public keys for the clients, for end-to-end authentication.
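Here is a minimal sketch of that simpler exchange, using PyNaCl’s SealedBox as a stand-in for the asymmetric encryption (the primitive choice and names are illustrative, not a vetted design):

    import os
    from nacl.public import PrivateKey, PublicKey, SealedBox

    # Server’s long-term key pair; the client is assumed to already know server_pub.
    server_priv = PrivateKey.generate()
    server_pub = server_priv.public_key

    # Client: generate a key pair, encrypt the public key and a nonce to the server.
    client_priv = PrivateKey.generate()
    nonce = os.urandom(24)
    registration = SealedBox(server_pub).encrypt(bytes(client_priv.public_key) + nonce)

    # Server: decrypt, keep the client’s public key, echo the nonce back
    # encrypted to that key, proving it holds the server private key.
    payload = SealedBox(server_priv).decrypt(registration)
    client_pub, received_nonce = PublicKey(payload[:32]), payload[32:]
    reply = SealedBox(client_pub).encrypt(received_nonce)

    # Client: a matching nonce completes the exchange.
    assert SealedBox(client_priv).decrypt(reply) == nonce

This is only the skeleton of the idea: a real protocol would still need transcript binding, replay protection and key confirmation.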

About the communication phase: they use some combination of server salt, message id and message sequence number to prevent replay attacks. Interestingly, they have a message key, made of the 128 lower-order bits of the SHA1 of the message. That message key transits in plaintext, so if you know the message headers, there is probably some nice info leak there.

The AES key (still in IGE mode) used for message encryption is generated like this:

The algorithm for computing aes_key and aes_iv from auth_key and msg_key is as follows:

  • sha1_a = SHA1 (msg_key + substr (auth_key, x, 32));
  • sha1_b = SHA1 (substr (auth_key, 32+x, 16) + msg_key + substr (auth_key, 48+x, 16));
  • sha1_c = SHA1 (substr (auth_key, 64+x, 32) + msg_key);
  • sha1_d = SHA1 (msg_key + substr (auth_key, 96+x, 32));
  • aes_key = substr (sha1_a, 0, 8) + substr (sha1_b, 8, 12) + substr (sha1_c, 4, 12);
  • aes_iv = substr (sha1_a, 8, 12) + substr (sha1_b, 0, 8) + substr (sha1_c, 16, 4) + substr (sha1_d, 0, 8);

where x = 0 for messages from client to server and x = 8 for those from server to client.
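To see how deterministic this construction is, here is a direct Python transcription of those formulas (assuming “+” means byte concatenation and substr(s, off, len) takes len bytes starting at offset off):

    import hashlib

    def sha1(data: bytes) -> bytes:
        return hashlib.sha1(data).digest()

    def substr(data: bytes, offset: int, length: int) -> bytes:
        return data[offset:offset + length]

    def mtproto_key_iv(auth_key: bytes, msg_key: bytes, client_to_server: bool = True):
        x = 0 if client_to_server else 8
        sha1_a = sha1(msg_key + substr(auth_key, x, 32))
        sha1_b = sha1(substr(auth_key, 32 + x, 16) + msg_key + substr(auth_key, 48 + x, 16))
        sha1_c = sha1(substr(auth_key, 64 + x, 32) + msg_key)
        sha1_d = sha1(msg_key + substr(auth_key, 96 + x, 32))
        aes_key = substr(sha1_a, 0, 8) + substr(sha1_b, 8, 12) + substr(sha1_c, 4, 12)
        aes_iv = (substr(sha1_a, 8, 12) + substr(sha1_b, 0, 8)
                  + substr(sha1_c, 16, 4) + substr(sha1_d, 0, 8))
        # Both outputs are entirely determined by auth_key and msg_key,
        # i.e. by a hash of the message itself. No randomness involved.
        return aes_key, aes_iv  # 32-byte key, 32-byte IGE IV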

Since the auth_key is permanent, and the message key only depends on the server salt (living 24h), the session (probably permanent, can be forgotten by the server) and the beginning of the message, the message key may be the same for a potentially large number of messages. Yes, a lot of messages will probably share the same AES key and IV.

Edit: Following Telegram’s comment, the AES key and IV will be different for every message. Still, they depend on the content of the message, and that is a very bad design. Keys and initialization vectors should always be generated from a CSPRNG, independently of the encrypted content.

Edit 2: the new protocol diagram makes it clear that the key is generated by a weak KDF from the auth key and some data transmitted as plaintext. There should be some nice statistical analysis to do there.

Edit 3: Well, if you send the same message twice (in a day, since the server salt lives 24h), the key and IV will be the same, and the ciphertext will be the same too. This is a real flaw, that is usually fixed by changing IVs regularly (even broken protocols like WEP do it) and changing keys regularly (cf Forward Secrecy in TLS or OTR). The unencrypted message contains a (time-dependent) message ID and sequence number that are incremented, and the client won’t accept replayed messages, or too old message IDs.

Edit 4: Someone found a flaw in the end to end secret chat. The key generated from the Diffie-Hellman exchange was combined with a server-provided nonce: key = (pow(g_a, b) mod dh_prime) xor nonce. With that, the server can perform a MITM on the connection and generate the same key for both peers by manipulating the nonce, thus defeating the key verification. Telegram has updated their protocol description and will fix the flaw. (That nonce was introduced to fix RNG issues on mobile devices).

Seriously, I have never seen anyone use the MAC to generate the encryption key. Even if I wanted to put a backdoor in a protocol, I would not make it so evident…

To sum it up: avoid at all costs. There are no new ideas, and they add their flawed homegrown mix of RSA, AES-IGE, plain SHA1 integrity verification, MAC-then-Encrypt and a custom KDF. Instead of Telegram, you should use well-known and audited protocols, like OTR (usable in IRC or Jabber) or the Axolotl key ratchet of TextSecure.

Handling IO failure

Let’s talk a bit about IO programming. Filesystem, network, GUI… You cannot write useful code without doing IO these days. So why is it so damn hard to do in “safe” languages like Haskell?

Well, in Haskell, you isolate the unsafe parts to be able to reason safely about the rest of the code. What does “unsafe” mean in that context? Mostly, unsafe code is unpredictable, non-deterministic, and will fail for a number of reasons independent of your program’s code.

Surely, you might think “come on, it cannot be that hard, I’m doing HTTP requests everywhere in my code, and there is no problem”. Well, let’s see how a simple HTTPS request could fail:

  • you are disconnected from the network (you have no IP)
  • you are connected to the network, but it is not connected to anything
  • your network is connected to Internet, but routers are dropping packets
  • your network is connected to Internet, but very slow
  • your DNS server is unreachable
  • your DNS server drops your packets
  • your DNS server cannot parse your request
  • your DNS server cannot contact other servers to get your answer
  • your DNS server sends back an invalid response
  • your DNS server sends back an outdated response
  • you cannot reach the web server’s IP from your network
  • the web server drops your packets silently before connecting
  • the web server connects, then drops the connection silently
  • the web server rejects your connection
  • the web server cannot parse your packets, and so rejects them
  • the connection times out
  • the server’s certificate is expired
  • the server’s certificate is not for the right subject name
  • the server’s certification chain has parts missing
  • the server’s certification chain has an unknown root
  • the server’s certificate was revoked
  • the packet’s signatures are invalid
  • your user agent and the server do not support the same versions of TLS
  • your user agent and the server do not have common cipher suites
  • the web server closes the connection without warning
  • the web server times out
  • the web server crashes
  • the web server cannot parse your HTTP request and rejects it
  • your request is too large
  • the web server parses your HTTP request correctly, but your cookie or OAuth token is invalid
  • the data you requested does not exist
  • the data you requested is elsewhere
  • your user agent does not support the mime type of the data
  • the data requested is too large for a simple response
  • the server only sends a part of the data, then drops the connection
  • your user agent cannot parse the response
  • your user agent can parse the data, but one way or another, it is invalid

If you have worked for some time with networks, all of those have probably happened to you at some point (and the list is not nearly exhaustive). What did you do in your code? Did you handle all these exceptions? Did you catch all the exceptions (see what I did there)? Do you check for all the error codes? Do you retry the requests where you need to?

Let’s face it: most of the network handling code out there is made of big chunks of procedural code, without much error handling, in blocking mode. In most cases, it is ok. But that is sloppy programming.

Safe languages do not allow you to write sloppy code like that. So, we are stuck between correct but overly complex code, and simple but failing code. Choose your weapons.

Personally, I prefer isolating unsafe code in asynchronous systems like futures or actors. I know failure will happen, I know threads will crash, I know I will make errors in my code. That is ok, it happens. So, let’s write robust code to handle failure.

For network errors, I just want to know if the server is unreachable. That is OK, I will try later. If my request’s authentication is rejected, I want to know, and I must handle that failure. Some errors should be handled seriously; others must be put in the “ok, it failed, whatever” bin.
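A small sketch of that triage with the Python requests library (the policy itself is just an illustration):

    import requests

    def fetch(url: str):
        try:
            response = requests.get(url, timeout=5)
            response.raise_for_status()
            return response.content
        except (requests.ConnectionError, requests.Timeout):
            # The “ok, it failed, whatever” bin: unreachable or slow server,
            # return nothing and let the caller retry later.
            return None
        except requests.HTTPError as err:
            if err.response.status_code in (401, 403):
                # Authentication rejected: this one must surface to the caller.
                raise
            return None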

Even if languages like Haskell make it harder to perform IO safely, they are still good tools, because they let you isolate unsafe parts, to let you reason on safe, deterministic parts of the program.

P.S.: OK, the network case was maybe a bit much. Surely, filesystem usage will be easier? Just for fun, let’s list some possible failures when you want to open a file for reading and writing:

  • invalid path
  • correct path, but you do not have the permission
  • correct path, you have the permission, but the file does not exist
  • you do not have the permission to create the file
  • you check that the file does not exist, then you try to create it, but someone already created it in the meantime (a fun security bug, that one; see the sketch after this list)
  • the file exists, but someone is already writing to it, and no concurrent access is allowed
  • you have the handle you want on the file, but someone just deleted it
  • not enough file descriptors available (oh, please, no)
  • someone is writing to the file at the same time
  • there are so many page faults that your program is slowed down
  • the disk is slow, blocking on a large operation
  • the disk is full
  • you checked that you have enough room, but someone is filling the disk at the same time
  • the file is on a networked file system, and it is slow
  • the file is on a remote disk, and the network just failed
  • hardware failure in the disk
  • hardware failure in the RAID array (and for some reason, redundancy was not enough, you lost the data)
  • the file is on a USB card that someone just unplugged
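About that check-then-create race (a time-of-check to time-of-use bug): the fix is to make the check and the creation a single atomic operation. A minimal sketch in Python:

    import os

    # Racy: the file may appear between the exists() check and the open().
    #   if not os.path.exists(path): f = open(path, "w")

    # Atomic: O_CREAT | O_EXCL asks the OS to create the file only if it
    # does not already exist, failing with FileExistsError otherwise.
    def create_exclusively(path: str):
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
        return os.fdopen(fd, "w")

    try:
        f = create_exclusively("/tmp/lockfile")
    except FileExistsError:
        print("someone else got there first")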

Basically, IO is a nightmare. Please wake me up now.

SafeChat, P2P encrypted messages?

For the first article in the new post series about “let’s pick apart the new kickstarted secure decentralized software of the week”, I chose SafeChat, which started just two days ago. Yes, I like to hunt young prey :p

A note, before we begin: this analysis is based on publicly available information at the time of writing. If the authors of the project give more information, I can update the article to match it. The goal is to assess, with what little we know about the project, whether it is a good idea to give money to it. I will only concentrate on the technical parts, not on the team itself (even if, for some of those projects, I think they’re idiots running with scissors in hand).

What is SafeChat?

Open source encryption based instant messaging software

SafeChat is a brilliantly simple deeply secure instant messaging system for mobile phones and computers

SafeChat is an instant messaging software designed by Commercial Free. There is no real indication about who really works there, or where the company is based, except for David Crawford, who created the Kickstarter project and is based in Montreal, Canada.

Note that SafeChat is only a small part of the services they want to provide. Commercial Free will also offer plans including an email encryption service (no info about that one) and cloud storage.

Available technical information

There is not much to see. They say they are almost done with the core code, but the only thing they present is some videos of what interaction with the app could look like.

Apparently, it is an instant messaging application with Android and iOS applications and some server components. Session keys are generated for the communication between users. They will manage the server component, and the service will be available with a yearly subscription.

It seems they don’t want to release much information about the cryptographic components they use. They talk about “peer to peer encryption” (lol), which is open source and standard. If anyone understands what algorithm or protocol they refer to, please enlighten me. They also say they will mix in some proprietary code (so much for open source).

I especially like the part about NIST. They mock NIST, saying that they have thrown “all standard encryption commonly used today out the window”. I am still wondering what “open source and standard peer to peer encryption” means.

Network protocol

The iOS and Android applications will apparently provide direct communication between users. I guess that from their emphasis on P2P, but also from the price they announce: $10 per user per year would be a bit low to cover server costs if they had to route all the messages.

P2P communication between phones is technically feasible. They would probably need to implement some TCP hole punching in their solution, but it is doable.

Looking at the video, it seems there is a key agreement before communication. I do not really like the interaction they chose to represent key agreement (with the colors and the smileys). There are too many different states, while people only need to know “are we safe now?”

I am not sure if there is a presence protocol. The video does not really show it. If there is no presence system, are messages stored until the person is online? Stored on the server or on the client? Does the server notify the client when the person becomes available?

Cryptography

By bringing together existing theories of cryptography and some proprietary code to bind them together, we are making a deeply encrypted private chatting system that continues to evolve as the field of cryptography does.

Yup, I really feel safe now.

Joke aside, here is what we can guess:

  • session keys for the communication between users. I don’t know if it is a Diffie-Hellman based protocol
  • no rekeying, i.e. no perfect forward secrecy
  • no info on message authentication or integrity verification
  • I am not sure if the app generates some asymmetric keys for authentication, if there is trust on first use, or whatever else
  • the server might not be very safe, because they really, really want to rely on German laws to protect it. If the crypto were fully managed client side, they would not care about servers being taken down; they could just pop up another one somewhere.

There could be a PKI managed by Commercial Free. That would be consistent with the subscription model (short-lived certificates are an easy way of limiting the usage of a service).

Threat model

Now, we can draw the rough threat model they are using:

What we want to do is make it impractical for an organization to snoop your communications as it would become very hard to find them and then harder still to decrypt them.

Pro tip: a system with a central server does not make it hard to find communications.

Attacker types:

  • phone thief: I don’t think they use client-side encryption for credentials and logs. Phone thieves and forensics engineers won’t have a real problem there
  • network operator: they can disrupt the communication, but will probably not be able to decrypt or do MITM (I really think the server is managing the authentication part, along with setting up the communication)
  • law enforcement: they want to rely on German laws to protect their system. At the same time, they do not say they will move to Germany to operate the system. If they stay in Canada, that changes the legal part. If they use a certificate authority, protecting the server will be useless, because they can just ask the company for the key.
  • server attacker: the server will probably be Windows based (see the core developer’s skills). Since the design is really server centric, taking down the server might take down the whole service. And attacking it would reveal lots of interesting metadata, and probably offer MITM capabilities
  • nation state: please, stop joking…

So…

Really, nothing interesting here. I do not see any reason to give money to this project: there is nothing new, and it does not solve big problems like anonymous messaging, or staying reliable if one server is down. Worse, it is probably possible to perform a MITM attack if you control the server. Nowadays, if you create a cryptographic protocol with client-side encryption, you must make sure that your security is based on the client, not the server.

Alternatives to this service:

  • Apple iMessage: closed source, only for iOS, encrypted messages, MITM is permitted for Apple by the protocol, but “we have not architected the server for this”. Already available.
  • TextSecure by Open Whisper Systems: open source, available for iOS and Android, uses SMS as a transport protocol, uses OTR (the Off-the-Record protocol) to protect the communication, no server component. Choose TextSecure! It is really easy to use, and OTR is well integrated in the interface.

The rules of security by obscurity

The first rule of security by obscurity is: DON’T

There, I said it. Now you can stop reading. Or you can continue. But watch where you step.

Security by obscurity is generally frowned upon because people have relied on it as their only layer of defense. When you think that nobody will be able to reverse engineer your clever code, or find that specific file holding all the secrets, you have already lost. People have become quite efficient at reverse engineering and finding secrets.

That’s why it is recommended to rely on safer algorithms and techniques to protect your system. They have been tested, and were created especially for that purpose. That can be encryption algorithms, authorization systems, etc.

The second rule of security by obscurity is: you should not need it

Defense is hard because of information asymmetry: the attacker often has more information than you do when approaching the system. That could be 0-day vulnerabilities, or the knowledge that you misconfigured one of your servers. The attacker basically has more time and money to spend than you can spend on security.

The tools you have at hand? Patching the code, controlling authorizations, verifying your logs… You get a system that should not be easy to attack even if the attacker is familiar with the underlying software.

Except that the attacker might be well informed on the problems of deploying that particular CMS. Or there’s a specific vulnerability you don’t know about that is exploited automatically by botnets…

The third rule of security by obscurity is: DON’T, but…

The goal of your defense layers is to drive up the cost of an attack. A really motivated attacker will always find a way (with enough resources, they could break into your office to steal data directly). So, what you want is to make an attack costly, to drive off attackers with fewer resources.

And what will security through obscurity provide you? Time! It will buy you time to defend yourself, and waste the attacker’s time. It is in no way enough to protect you, but it can give you a lot of benefits:

  • confuse automated tools: moving some specific files or pages of your CMS out of their default location will prevent most automated attacks. That will not stop human attackers, but they might need to modify their scripts, and we all know that’s annoying :p
  • slow down information gathering: removing the server headers is a pretty standard practice (see the configuration sketch after this list). Some more sophisticated tools might be able to guess the server version and/or framework type in other ways, but basic tools will not.
  • lie to the attackers to send them through a honeypot. Then, you can observe them, learn about their process and prepare for other attacks.
  • detect suspect behaviour, like bruteforce queries, and send bogus data instead of rejecting them. That means more data to analyze manually for the adversary.
  • Are you trying to send encrypted data? Do you really need a public handshake protocol, displaying the whole algorithm negotiation? Sometimes, communicating directly with a pre shared key and pre negotiated algorithms will work just fine, and only show garbage to the attacker.
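
To make the header removal point concrete, here is a minimal configuration sketch. ServerTokens, ServerSignature and server_tokens are the standard Apache and nginx directives for this; note that they trim the Server header rather than remove it completely:

# Apache: report only "Server: Apache", without version or module list
ServerTokens Prod
ServerSignature Off

# nginx: hide the version number from the Server header and error pages
server_tokens off;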

See the pattern there? We already know resourceful attackers will get past those false defenses. But the point is to make them waste time, in basically three ways:

  • block automated attacks, forcing them to resort to manual work
  • make them work to get useful information
  • mess with their heads

Security by obscurity is basically the fun side of defense, because you’re always looking for ways to annoy the adversary ;)

The fourth rule of security by obscurity is: you are not the attacker

While those defense techniques can be useful, be aware that they can significantly hamper your day to day work. They should not prevent you from managing your system correctly. Every “WTF” moment for an attacker might be a “WTF” moment for one of your developers, system administrators, or even worse, one of your users.

Worse, sometimes, it might annoy you, but will be bypassed easily by the adversary (do not underestimate their ability to write clever automated tools).

So, be careful with security by obscurity, do not rely only on it, and have fun annoying the attackers :)

Theoretical definitions for crypto wannabes

Every week, I hear about a new secure software designed to protect your privacy, thwart the NSA/GCHQ and save kittens. Most of the time, though, they’re started by people that are very enthusiastic yet unskilled.

They tend to concentrate directly on choosing algorithms and writing code, instead of stepping back and thinking a bit about what they want to develop.

Sure, they probably spent some time saying things like:

  • that piece of data should absolutely be encrypted
  • users will all have key pairs to authenticate themselves
  • we should use AES, that’s the safest choice (what are these “modes” you’re talking about?)

That is not how you design a protocol. That is not how you design software using encryption. And that is not how you will design the next secure distributed social network.

To design your system, you need three things:

  • a good threat model
  • theoretical tools addressing the threats
  • algorithms implementing these theoretical tools

As you can see, most of the projects only have the third item, and that’s insufficient to design a correct system. If you don’t have a good threat model, you don’t have a good mental model of your users and attackers, their means and their objectives. If you don’t have the theoretical tools, you will try to shoehorn your favorite algorithm onto the problem without knowing if it really fits (example: using plain hash algorithms to store passwords :p).
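
To make that last example concrete, here is a minimal Python sketch (the iteration count is an arbitrary example) contrasting the shoehorned solution with a construction actually designed for password storage, a salted and deliberately slow KDF:

import hashlib
import os

password = b"correct horse battery staple"

# Wrong tool: a plain hash is designed to be fast, which is exactly
# what you do not want for stored passwords (fast brute force)
weak = hashlib.sha256(password).hexdigest()

# Better tool: a salted, deliberately slow key derivation function
salt = os.urandom(16)
strong = hashlib.pbkdf2_hmac("sha256", password, salt, 100000)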

So, in this post, I’ll provide those (simplified) theoretical definitions. You will probably recognize some of them.

High level view

First, you need to forget vague notions like “privacy”, and instead use these terms to describe the properties you want to achieve:

  • authentication: you can recognize which entity you are communicating with
  • authorization: the entity cannot get access to data it has no permission on (note that it is different from authentication)
  • confidentiality: the data should not be readable by an entity that has no permission on it (it can be protected by crypto, but also by policies in the code)
  • integrity: unauthorized modification of the data can be detected and marked as invalid
  • non repudiation: an entity cannot deny it has executed an action
  • deniability: an entity _can_ deny it has executed an action

Ok, now that we have some basic properties, let’s apply them: think for a long time about the actors of the system (users, malicious users, admins, sysadmins, random attacker on the network, etc), what authorizations they have, what they should not get access to, what data moves on the network and between whom.

You should now have a very basic threat model and a rough overview of your system or protocol: you know what part of the network communications should be confidential, you know where you would need to authenticate.

You will now need some ideas about the types of attacks that could happen to your system, because you probably did not think of everything. Separate your system into logical parts (like “client” and “server”), observe them, and observe how they communicate.

Common attacks

Security properties

Here are some security properties that will be useful when you will try to choose algorithms later:

  • Random oracle: a system that answers every question deterministically with a random answer from its answer space. You cannot predict what it will answer, but if you send the same question twice, it will give the same answer both times.
  • Perfect secrecy: for any two plaintext messages of the same size, an attacker cannot distinguish which plaintext maps to which ciphertext. Basically, the adversary learns nothing from the ciphertext alone. Example: the one time pad (see the sketch after this list).
  • Semantic security: same thing as perfect secrecy, except that what the adversary can learn about the plaintext is negligible instead of zero. This is the property practical ciphers aim for.
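
To make perfect secrecy tangible, here is a tiny Python sketch of the one time pad: with a truly random pad as long as the message, used only once, a given ciphertext could correspond to any plaintext of the same length:

import os

def otp_encrypt(message: bytes):
    pad = os.urandom(len(message))  # truly random, as long as the message
    ciphertext = bytes(m ^ p for m, p in zip(message, pad))
    return pad, ciphertext

pad, ciphertext = otp_encrypt(b"attack at dawn")
# XORing with the pad again recovers the plaintext; without the pad,
# every 14-byte message is an equally likely decryption
plaintext = bytes(c ^ p for c, p in zip(ciphertext, pad))
assert plaintext == b"attack at dawn"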

Security tests

Those properties can be tested by creating a “game”, where the attacker tries to guess information about the data:

  • IND-CPA (indistinguishability under chosen plaintext attack): the adversary can obtain the encryption of as many chosen plaintexts as he wants. Then, he chooses two messages m0 and m1 of the same length, one of them is picked at random and encrypted, and the ciphertext is sent back to him. The adversary should not be able to guess which message was used to generate that ciphertext (note that this is just one way of testing for CPA, there are many other schemes, some with stronger properties). A toy version of this game is sketched after this list.
  • IND-CCA (indistinguishability under chosen ciphertext attack): the attacker can get the decryption of arbitrary ciphertexts, but should not, from this, be able to decrypt any other ciphertext.
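
Here is a toy Python sketch of that first game (all names are mine), showing how a deterministic cipher loses it: since the adversary can obtain encryptions of chosen plaintexts, he just encrypts m0 himself and compares:

import os
import random

KEY = os.urandom(16)

def toy_encrypt(message):
    # deterministic toy cipher (think ECB): same input, same output
    return bytes(m ^ k for m, k in zip(message, KEY))

def ind_cpa_game(adversary):
    m0, m1 = adversary.choose_messages()
    b = random.randrange(2)                 # secret coin flip
    challenge = toy_encrypt([m0, m1][b])
    return adversary.guess(challenge) == b  # True if the adversary wins

class Adversary:
    def choose_messages(self):
        self.m0 = b"attack at dawn!!"
        self.m1 = b"retreat at once!"
        return self.m0, self.m1

    def guess(self, challenge):
        # chosen plaintext access: encrypt m0 ourselves and compare
        return 0 if challenge == toy_encrypt(self.m0) else 1

wins = sum(ind_cpa_game(Adversary()) for _ in range(1000))
print("adversary wins %d games out of 1000" % wins)  # always 1000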

Attack patterns

Here are some common attack types that can be applied to crypto protocols. The list is not exhaustive, and covers only crypto attacks: there are many more ways to attack a system.

  • Replay attacks: the attacker has observed some valid (encrypted or not) data going over the wire, and tries to send it again. Obviously, it should not be accepted (see the sketch after this list)
  • MITM (Man In The Middle): the attacker can observe and modify live data running between two actors of the system. At best, the attacker should not be able to forge valid data, decrypt data, or impersonate one of the users. At worst, the attack should at least be detectable.
  • Oracle attack: when an algorithm or protocol has a part that can act as an oracle (it can be asked something and give an answer, like a server), an attacker could exploit flaws in the algorithm to get useful information on the data (the oracle is then not a random oracle). Timing attacks are part of this type of attack. See also padding oracle attacks, or the recent BREACH attack on TLS.
  • Offline attack: the attacker got access to some encrypted data (on the wire, or by accessing a disk somewhere), stored it, and can spend an arbitrary amount of time trying to decrypt it
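
As a sketch of the classic countermeasure against replay (all names are mine): tag each message with a single-use nonce and a MAC covering both the nonce and the payload, and refuse any nonce already seen. A real system would also bound the set of stored nonces, for example with timestamps or counters:

import hashlib
import hmac
import os

KEY = os.urandom(32)
seen_nonces = set()

def send(payload: bytes):
    nonce = os.urandom(16)
    tag = hmac.new(KEY, nonce + payload, hashlib.sha256).digest()
    return nonce, payload, tag

def accept(nonce: bytes, payload: bytes, tag: bytes) -> bool:
    expected = hmac.new(KEY, nonce + payload, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        return False       # forged or tampered message
    if nonce in seen_nonces:
        return False       # replayed message
    seen_nonces.add(nonce)
    return True

message = send(b"pay 10 euros to bob")
print(accept(*message))    # True: first delivery is accepted
print(accept(*message))    # False: the replay is rejected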

You should now have a better view of the system: what are the parts of the system that need protection, what attacks they must resist, and what properties they should have.

That means we can go to the next part: choosing the tools to implement the solution.

General cryptographic functions

No, we will not choose algorithms right now. That would be too easy :D

We will choose from a list of cryptographic constructions that implement some of the security properties of the system, and combine them to meet all the needed properties:

  • Secure Pseudo Random Function: a function from the spaces K (keys) and X (messages) to Y (another message space). The basic definition is that if you choose one of these functions at random (like choosing a k from K at random), its output will appear totally random (testable with IND-CPA).
  • Pseudo Random Permutation: this is a PRF where X and Y are the same space. It is bijective (every y from Y maps to exactly one x from X, and every y is an output of the function), and there exists an efficient inversion function for it (from Y to X). Example: AES (in ECB mode, which is unsafe for common use).
  • Message Authentication Code: defines a pair of algorithms. One of them takes a key k and a message m and outputs a code c. The other algorithm takes k, m, c and outputs True or False. An attacker should not be able to forge a valid c without knowing k. A PRF could be used to construct a MAC system. A hash function too. Example: HMAC.
  • Authenticated encryption: an encryption function (with semantic security under CPA) where the attacker cannot forge new ciphertexts that decrypt correctly. Example: AES-GCM (see the sketch after this list).
  • Hash function: collision resistant (you cannot find two different messages m1, m2 such that H(m1) == H(m2), with different collision resistance levels). Not easily invertible. Usually fast. Example: SHA2.
  • Trapdoor function: a function that is easy to compute, but for which finding the inverse is hard, unless you have some specific information. Example: RSA.
  • Zero knowledge proof: a way to prove something to the other party in a communication, without giving them any info, except the proof.
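
As an illustration of authenticated encryption, here is a short AES-GCM sketch using the third-party pyca/cryptography package (treat it as a sketch of the idea, not a reference implementation); the one hard rule is that a nonce must never be reused with the same key:

import os

from cryptography.exceptions import InvalidTag
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

key = AESGCM.generate_key(bit_length=128)
aesgcm = AESGCM(key)
nonce = os.urandom(12)

ciphertext = aesgcm.encrypt(nonce, b"secret message", None)
assert aesgcm.decrypt(nonce, ciphertext, None) == b"secret message"

# flipping a single bit makes decryption fail with an exception,
# instead of silently returning garbage
tampered = ciphertext[:-1] + bytes([ciphertext[-1] ^ 1])
try:
    aesgcm.decrypt(nonce, tampered, None)
except InvalidTag:
    print("tampering detected")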

There are a lot of other constructions, depending on your needs, from low level algorithms like the Diffie-Hellman key exchange to higher level protocols like OTR. Again, the constructions you pick will depend on the security properties you need.

Choosing the algorithms

Can we do it now? YES! But there are rules. You must not choose an algorithm because it’s hyped or because someone said so in an old book. You choose algorithms that implement the properties you need (like authenticated encryption), and you choose the parameters of the algorithm (key size, exponent, elliptic curve) depending on the strength you need. Basically, a key size defines how long encrypted data should remain impossible to decrypt. Those parameters also define the performance of the algorithm. Don’t choose them without consulting experts, or you will face problems similar to those encountered by the projects that used low RSA exponents (it looks good from a performance standpoint, but it introduces very bad security).

Am I done now?

Nope. We have only defined some very high level parts. Creating a protocol implies a lot of thought about:

  • how you establish the communications
  • protocol negotiation (version, algorithms, etc)
  • key exchange
  • authentication
  • nonce usage
  • storing sessions
  • handling lost connections
  • renegotiation
  • closing the connection
  • etc.

As you can see, designing a protocol involves a lot more than choosing a few algorithms. Note that this was only a very rough overview of what you would need to create a safe system. And we did not even start coding!

So, if you want to build the next privacy protecting system, please talk to experts. They don’t necessarily want to make you feel bad. They just have a lot of formal tools and the experience needed to see what will not work.

Group messaging crypto and the CAP theorem

I often play with group messaging ideas, and recently, an interesting perspective came to me, about the relation between these messaging systems and the constraints of the CAP theorem.

What is the CAP theorem?

Small theoretical background here, feel free to skip if you already know what it is about

Otherwise known as Brewer’s theorem, the CAP theorem states that three properties are important in distributed systems:

  • Consistency (all nodes see the same data)
  • Availability (every request receives a response)
  • Partition tolerance (the system still works through network splits or node failure)

The CAP theorem tells us that a distributed system cannot have all three properties at the same time. It is not really a “two out of three” choice like most people tend to say, but more of a compromise you have to make. Some examples:

  • in traditional database systems (with a master-slave model), consistency and availability are high (all nodes can answer with the same data), but partition tolerance is weak because the master is a SPOF.
  • in a fully distributed database (no master model), availability is high (all the nodes can answer), partition tolerance is high (a subset of the cluster could act as the whole database), but consistency is weak (data must be replicated to all the relevant nodes, and that can take time, so the nodes may not see the same data at the same time)
  • in Bitcoin, consistency is good (all nodes must agree on the same block chain) and partition tolerance is good (for downloading the block chain), but availability is weak for writing, because a new transaction must first propagate to a large enough set of nodes, then a block must be calculated for the transaction to be stored

As you can see, you can shape those properties depending on how you want your system to behave. Maybe you want fast reads, or fast writes, or very strong replication, etc.

Group messaging constraints

This has always been a challenge. Traditionally, chat systems follow a client-server model, where the server redistributes all the messages. As we saw previously, that is bad against partition problems. The usual solution is to have multiple servers talking with each other, as we can see in IRC or XMPP.

For a fully distributed messaging system (if we forget for a moment all the nasty NAT problems), communication becomes a routing problem: making sure the nodes get all the messages fast enough and in the right order. If you’re building a distributed Twitter, it’s not really a problem, but for an interactive chat system, this becomes really hard. You can try to send your messages directly to all the users in the chat room, but as more people join, sending and receiving all the messages takes more and more time, and so, you sacrifice availability and a bit of consistency (you will not necessarily receive all the messages).

Group messaging crypto

The group messaging problems seemed hairy? Let’s add crypto in the mix, just for fun! What security properties would we want to add to a messaging system?

  • Authentication: every user knows with whom he is talking
  • Confidentiality: an external observer cannot see the content of a message
  • Tampering proof: users can detect when a message has been modified, and reject it
  • Perfect forward secrecy: compromising one key does not compromise the whole conversation or past conversations

There are others that we could want, but let’s just concentrate on these ones for now. We can separate the messaging in two phases: the discovery, and the transport. The discovery is when nodes begin talking to each other, authenticating each other, establishing session keys, managing key rotation, etc. Note that discovery can happen at any time, as a new node can appear after a long time. The transport is about sending messages safely and verifying messages.

I think we can assume that, once all the nodes have authenticated themselves and agreed on session keys, the transport is the easy part. Whatever the underlying transport system (broadcast, XMPP, SMTP, etc), once you can send, receive and verify messages, everything is easy.

The interesting part is the discovery. That is where the similarity with a distributed system is evident. You want a lot of nodes to start communicating with each other, you want to propagate information to all the nodes (like session keys), you must handle nodes that go up or down, and clusters splitting.

That’s where the CAP theorem is useful to understand the constraints of the system.

Group authentication is an availability and partition problem: if you cannot start the communication until all the nodes have authenticated each other, you are vulnerable to slow or failing nodes.

Tampering proof is a consistency problem: all the nodes must know the same signature keys to agree on whether a message is correct or not.

Confidentiality is a consistency and partition problem: if you must reach consensus on a group-wide session key, you have to wait for all the nodes to get the new key (consistency). That happens a lot if you want perfect forward secrecy, where you must change keys regularly. Moreover, in case of partition, the different groups will agree to different keys, and a conflict will appear once the split is over.

As you can see, a security model can help you choose which tools (cryptographic or not) you will use to ensure the safety of the system, but with distributed systems theory, you will be able to predict and recognize the behaviour of the system.

Some examples

Let’s see how some more or less known systems handle this:

GPG and email

GPG+email is, at its heart, a secure group messaging system. If we analyze its properties, we can see that it is very good against partitions: SMTP handles splits quite well, it can resend messages if they did not get through, or store them for a few days until the next server is up.

For availability, it doesn’t fare very well at high loads if you want to encrypt messages, because you need to encrypt the message for every recipient. You cannot take advantage of SMTP’s architecture to reduce the load.

It also has very bad consistency. If you want every node to agree on the keys of each user, you have to hold one-to-one offline meetings between all the participants. In practice, users make a tradeoff here, so the security assumption is not completely true.

Multi party off the record messaging

There is a paper you can read about MP OTR, which builds on the previous OTR algorithm to provide secure multi party communication. That protocol relies on a heavy setup phase where all the nodes authenticate each other and generate a group encryption key.

This model will fare quite well once consensus has been reached: all the nodes know the ephemeral signature keys and the group encryption key.

Unfortunately, when a user joins or leaves the chatroom, the whole shutdown and setup process must happen again, making the chatroom unavailable in the meantime.

What about yours?

I see a lot of new projects appearing lately with the intention to “fix” chat systems or social networks by building a “secure” distributed system. Unfortunately, most of them do not have a serious background in security or distributed systems. The subject of this article, the CAP theorem, is a very small part of the way we think about distributed systems: people have been researching the subject for years. Similarly, cryptography and protocols have improved a lot lately.

So, if you are building one of these, take your time, forget the hype, read up a bit about the theory, and think about your model before you make your technological choices. Please don’t repeat the worst mistakes.

My ideal job posting

This post is a translation of something I wrote in French for Human Coders. They asked me what would be the ideal job post from a developer’s standpoint:

How would you write a job announcement that attracts good developers? Most recruiters complain that finding the right candidates is an ordeal.

If you ask me, it is due to very old recruitment practices: writing job posts for paper ads (where you pay by the letter), spamming them to as many people as possible, mandating fishy head hunters… This has worked in the past, but things changed. A lot more companies are competing to recruit developers, and many more developers are available, so separating the wheat from the chaff is harder.

We can do better! Let’s start from scratch. For most candidates, a job posting is the first contact they’ll have with your company. It must be attractive, exciting! When I read “the company X is the leader on the market of Y”, I don’t think that they’re awesome, I just wonder what they really do.

A job posting is a marketing document. Its purpose is not to filter candidates, but to attract them! And how do you write a marketing document?

YOU. TALK. ABOUT. THE. CLIENT. You talk about their problems, their aspirations, and only then do you tell them how you will make things better for them. For a job posting, you must do the same. You talk to THE candidate. Not to multiple candidates, not to the head hunter or the HR department, but to the candidate. Talk about the candidate, and only the candidate. When a developer looks for a job, she doesn’t want to “work on your backend application” or “maintain the web server”. That is what she will do for you. This is what she wants:

  • getting paid to write code
  • work on interesting technologies
  • a nice workplace atmosphere
  • learn
  • etc.

A good job posting should answer the candidate’s aspirations, and talk about the career path. Does this job lead to project management? Do you offer in-house training? Is there a career path for expertise in your company?

Do you share values with your candidate? I do not mean the values written by your sales team and displayed on the “our values” page of your website. I am talking about the values of the team where the candidate will end up. Do they work in pair programming? Do they apply test driven development? There is no best way to work, the job can be done in many ways, so you must make sure that the candidate will fit right in.

What problem are you trying to solve with your company? Do you create websites that can be managed by anyone? Do you provide secure hosting? Whatever your goal is, talk about it instead of talking about the product. I do not want to read “our company develops a mobile server monitoring tool”, because that is uninteresting. If I read “we spent a lot of time on call for diverse companies, so we understood that mobility is crucial for some system administrators, and we created a tool tailored for system administrators on the move”, I see a real problem, a motivation to work, a culture that I could identify with.

By talking that way to the candidate, you will filter candidates on motivation and culture instead of filtering on skills. That can be done later, once you meet the candidate. You did not really think that a resume was a good way to select skilled people, did you?

Here is a good example of a fictitious job posting, from a company aggregating news for developers, looking for a Rails developer:

“You are a passionate Ruby on Rails developer, you are proud of your unit tests, and you enjoy the availability of Heroku’s services? Same for us!

At Company X, we love developers: all of our services are meant for their fulfillment. We offer a news website about various technologies, highly interesting training sessions and a job board for startups.

Our employees benefit fully from these services, and give talks at conferences all around France. By talking directly with other developers, they quickly gain extensive knowledge of current technologies.

Our news website is growing fast, so we need help to scale it. The web app uses Heroku and MongoDB, with a CoffeeScript front end. Are you well versed in Rails optimization? If yes, we would love to talk with you!”

Note that I did not talk about years of experience, or a city. I want to hire a Rails developer, not necessarily a French developer. I want someone with experience in optimization, not someone over 27.

With such a job posting, you will receive a lot more interesting employment applications. Now, are you afraid that it will incur a lot more work? The solution in a future post: how to target candidates efficiently? Get away from job boards!


Filter Rails JSON input with route constraints

Following the recent YAML parsing vulnerabilities in Rails, I decided to act on an idea I had a few months ago: using route constraints to define strict API contracts in Rails.

Sadly, it does not protect against the YAML parsing problem (and the similar vulnerabilities we will probably see in the following months), because the request is interpreted before going through the route constraints. But it can protect from the mass assignment vulnerability, and probably from some SQL injections.

Here is the idea: Rails 3 introduced route constraints, a way to execute functions on the request before it is passed to the controllers. By combining them with the json-schema gem, we can filter the JSON input quite easily.

For the following example data:

{"echo" : "blah", "nb" : 1, "data": [1, 2, 3]}

We can define the following schema:

{
    "type": "object",
    "$schema": "http://json-schema.org/draft-03/schema",
    "id": "#",
    "required": false,
    "additionalProperties": false,
    "properties": {
        "data": {
            "type": "array",
            "id": "data",
            "required": false,
            "items": {
                "type": "number",
                "id": "0",
                "required": false
            }
        },
        "echo": {
            "type": "string",
            "id": "echo",
            "required": false
        },
        "nb": {
            "type": "number",
            "id": "nb",
            "required": false
        }
    }
}

(Use the JSON schema generator to create your own)

Save this schema to “data.schema” and add “json-schema” to your Gemfile. You will then be able to filter inputs with code like the following “config/routes.rb”:

require "json-schema"
class LolJSONConstraint
  def matches?(request)
    if(request.headers["CONTENT_TYPE"] == "application/json")
      JSON::Validator.validate("data.schema", request.headers["action_dispatch.request.request_parameters"])
    end
  end
end

Yamlvuln::Application.routes.draw do
  resources :posts, :constraints => LolJSONConstraint.new
end

The constraint loads the schema, applies it to the incoming data, and makes Rails return a 404 error if the JSON is invalid. Setting “additionalProperties” to false in the schema is required to refuse the properties you didn’t define, and protects the application from mass assignment.

If I tried, for example, to send the following JSON to the application, there would be an error:

{"echo" : "blah", "nb" : "UNION ALL SELECT LOAD_FILE(CHAR(34,47,101,116,99,47,112,97,115,115,119,100,34))", "data": [1, 2, 3]}

As I said before, it is not safe against the YAML parsing vulnerability. Also, I did not really test the performance of this approach. But it is still a nice and easy solution for API filtering.

Looking for big architectures and adventurous sysadmins

Last week, I wrote a post about SSL optimization that showed the big interest people have in getting the absolute best performance from their web servers.

That post was just a small part of the ebook on SSL tuning I am currently writing. This ebook will cover various subjects:

  • algorithms comparison
  • handshake tuning
  • HSTS
  • session tickets
  • load balancers

I test a lot of different architectures, to provide you with tips directly adaptable to your system (like I did previously with Apache and Nginx). But I don’t have access to every system under the sun…

So, if you feel adventurous enough to try SSL optimization on your servers, please contact me, I would be happy to help you!

I am especially interested in large architectures (servers in multiple datacenters around the world, large load balancers, CDNs) and mobile application backends.

And don’t forget to check out the ebook, to be notified about updates!

5 easy tips to accelerate SSL


Update: following popular demand, the article now includes nginx commands :)

Update 2: thanks to jackalope from Hacker News, I added a missing Apache directive for the cipher suites.

Update 3: recent attacks on RC4 have definitely made it a bad choice, and ECDHE cipher suites got improvements.

SSL is slow. These cryptographic algorithms eat the CPU, there is too much traffic, it is too hard to deploy correctly. SSL is slow. Isn’t it?

HELL NO!

SSL looks slow because you did not even try to optimize it! For that matter, I could say that HTTP is too verbose, that XML web services are verbose too, and that all this traffic makes websites slow. But SSL can be optimized, like everything else!

Slow cryptographic algorithms

The cryptographic algorithms used in SSL are not all created equal: some provide better security, some are faster. So, you should choose carefully which algorithm suite you will use.

The default one for Apache 2's SSLCipherSuite directive is: ALL:!ADH:RC4+RSA:+HIGH:+MEDIUM:+LOW:+SSLv2:+EXP

You can translate that to a readable list of algorithms with this command: openssl ciphers -v 'ALL:!ADH:RC4+RSA:+HIGH:+MEDIUM:+LOW:+SSLv2:+EXP'

Here is the result:

DHE-RSA-AES256-SHA      SSLv3 Kx=DH       Au=RSA  Enc=AES(256)  Mac=SHA1
DHE-DSS-AES256-SHA      SSLv3 Kx=DH       Au=DSS  Enc=AES(256)  Mac=SHA1
AES256-SHA              SSLv3 Kx=RSA      Au=RSA  Enc=AES(256)  Mac=SHA1
DHE-RSA-AES128-SHA      SSLv3 Kx=DH       Au=RSA  Enc=AES(128)  Mac=SHA1
DHE-DSS-AES128-SHA      SSLv3 Kx=DH       Au=DSS  Enc=AES(128)  Mac=SHA1
AES128-SHA              SSLv3 Kx=RSA      Au=RSA  Enc=AES(128)  Mac=SHA1
EDH-RSA-DES-CBC3-SHA    SSLv3 Kx=DH       Au=RSA  Enc=3DES(168) Mac=SHA1
EDH-DSS-DES-CBC3-SHA    SSLv3 Kx=DH       Au=DSS  Enc=3DES(168) Mac=SHA1
DES-CBC3-SHA            SSLv3 Kx=RSA      Au=RSA  Enc=3DES(168) Mac=SHA1
DHE-RSA-SEED-SHA        SSLv3 Kx=DH       Au=RSA  Enc=SEED(128) Mac=SHA1
DHE-DSS-SEED-SHA        SSLv3 Kx=DH       Au=DSS  Enc=SEED(128) Mac=SHA1
SEED-SHA                SSLv3 Kx=RSA      Au=RSA  Enc=SEED(128) Mac=SHA1
RC4-SHA                 SSLv3 Kx=RSA      Au=RSA  Enc=RC4(128)  Mac=SHA1
RC4-MD5                 SSLv3 Kx=RSA      Au=RSA  Enc=RC4(128)  Mac=MD5 
EDH-RSA-DES-CBC-SHA     SSLv3 Kx=DH       Au=RSA  Enc=DES(56)   Mac=SHA1
EDH-DSS-DES-CBC-SHA     SSLv3 Kx=DH       Au=DSS  Enc=DES(56)   Mac=SHA1
DES-CBC-SHA             SSLv3 Kx=RSA      Au=RSA  Enc=DES(56)   Mac=SHA1
DES-CBC3-MD5            SSLv2 Kx=RSA      Au=RSA  Enc=3DES(168) Mac=MD5 
RC2-CBC-MD5             SSLv2 Kx=RSA      Au=RSA  Enc=RC2(128)  Mac=MD5 
RC4-MD5                 SSLv2 Kx=RSA      Au=RSA  Enc=RC4(128)  Mac=MD5 
DES-CBC-MD5             SSLv2 Kx=RSA      Au=RSA  Enc=DES(56)   Mac=MD5 
EXP-EDH-RSA-DES-CBC-SHA SSLv3 Kx=DH(512)  Au=RSA  Enc=DES(40)   Mac=SHA1 export
EXP-EDH-DSS-DES-CBC-SHA SSLv3 Kx=DH(512)  Au=DSS  Enc=DES(40)   Mac=SHA1 export
EXP-DES-CBC-SHA         SSLv3 Kx=RSA(512) Au=RSA  Enc=DES(40)   Mac=SHA1 export
EXP-RC2-CBC-MD5         SSLv3 Kx=RSA(512) Au=RSA  Enc=RC2(40)   Mac=MD5  export
EXP-RC4-MD5             SSLv3 Kx=RSA(512) Au=RSA  Enc=RC4(40)   Mac=MD5  export
EXP-RC2-CBC-MD5         SSLv2 Kx=RSA(512) Au=RSA  Enc=RC2(40)   Mac=MD5  export
EXP-RC4-MD5             SSLv2 Kx=RSA(512) Au=RSA  Enc=RC4(40)   Mac=MD5  export

28 cipher suites, that's a lot! Let's see if we can remove the unsafe ones first! You can see at the end of the list seven entries marked as "export". That means that they comply with the old US cryptographic algorithm exportation policy. Those algorithms are utterly unsafe, and the US abandoned this restriction years ago, so let's remove them:
'ALL:!ADH:!EXP:RC4+RSA:+HIGH:+MEDIUM:+LOW:+SSLv2'.

Now, let's remove the algorithms using plain DES (not 3DES) and RC2: 'ALL:!ADH:!EXP:!LOW:!RC2:RC4+RSA:+HIGH:+MEDIUM'. That leaves us with 16 algorithms.

It is time to remove the slow algorithms! To decide, let's use the openssl speed command. Run it on your server, because depending on your hardware, you might get different results. Here is the benchmark on my computer:

OpenSSL 0.9.8r 8 Feb 2011
built on: Jun 22 2012
options:bn(64,64) md2(int) rc4(ptr,char) des(idx,cisc,16,int) aes(partial) blowfish(ptr2) 
compiler: -arch x86_64 -fmessage-length=0 -pipe -Wno-trigraphs -fpascal-strings -fasm-blocks
  -O3 -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -DL_ENDIAN -DMD32_REG_T=int -DOPENSSL_NO_IDEA
  -DOPENSSL_PIC -DOPENSSL_THREADS -DZLIB -mmacosx-version-min=10.6
available timing options: TIMEB USE_TOD HZ=100 [sysconf value]
timing function used: getrusage
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
md2               2385.73k     4960.60k     6784.54k     7479.39k     7709.04k
mdc2              8978.56k    10020.07k    10327.11k    10363.30k    10382.92k
md4              32786.07k   106466.60k   284815.49k   485957.41k   614100.76k
md5              26936.00k    84091.54k   210543.56k   337615.92k   411102.49k
hmac(md5)        30481.77k    90920.53k   220409.04k   343875.41k   412797.88k
sha1             26321.00k    78241.24k   183521.48k   274885.43k   322359.86k
rmd160           23556.35k    66067.36k   143513.89k   203517.79k   231921.09k
rc4             253076.74k   278841.16k   286491.29k   287414.31k   288675.67k
des cbc          48198.17k    49862.61k    50248.52k    50521.69k    50241.28k
des ede3         18895.61k    19383.95k    19472.94k    19470.03k    19414.27k
idea cbc             0.00         0.00         0.00         0.00         0.00 
seed cbc         45698.00k    46178.57k    46041.10k    47332.45k    50548.99k
rc2 cbc          22812.67k    24010.85k    24559.82k    21768.43k    23347.22k
rc5-32/12 cbc   116089.40k   138989.89k   134793.49k   136996.33k   133077.51k
blowfish cbc     65057.64k    68305.24k    72978.75k    70045.37k    71121.64k
cast cbc         48152.49k    51153.19k    51271.61k    51292.70k    47460.88k
aes-128 cbc      99379.58k   103025.53k   103889.18k   104316.39k    97687.94k
aes-192 cbc      82578.60k    85445.04k    85346.23k    84017.31k    87399.06k
aes-256 cbc      70284.17k    72738.06k    73792.20k    74727.31k    75279.22k
camellia-128 cbc        0.00         0.00         0.00         0.00         0.00 
camellia-192 cbc        0.00         0.00         0.00         0.00         0.00 
camellia-256 cbc        0.00         0.00         0.00         0.00         0.00 
sha256           17666.16k    42231.88k    76349.86k    96032.53k   103676.18k
sha512           13047.28k    51985.74k    91311.50k   135024.42k   158613.53k
aes-128 ige      93058.08k    98123.91k    96833.55k    99210.74k   100863.22k
aes-192 ige      76895.61k    84041.67k    78274.36k    79460.06k    77789.76k
aes-256 ige      68410.22k    71244.81k    69274.51k    67296.59k    68206.06k
                  sign    verify    sign/s verify/s
rsa  512 bits 0.000480s 0.000040s   2081.2  24877.7
rsa 1024 bits 0.002322s 0.000111s    430.6   9013.4
rsa 2048 bits 0.014092s 0.000372s     71.0   2686.6
rsa 4096 bits 0.089189s 0.001297s     11.2    771.2
                  sign    verify    sign/s verify/s
dsa  512 bits 0.000432s 0.000458s   2314.5   2181.2
dsa 1024 bits 0.001153s 0.001390s    867.6    719.4
dsa 2048 bits 0.003700s 0.004568s    270.3    218.9

We can remove the SEED and 3DES suites because they are slower than the others. DES was meant to be fast in hardware implementations, but slow in software, so 3DES (which runs DES three times) is even slower. On the contrary, AES can be very fast in software implementations, and even more so if your CPU provides specific instructions for AES. You can see that with a bigger key (and so, better theoretical security), AES gets slower. Depending on the level of security you need, you may choose different key sizes. According to the key length comparison, 128 bits might be enough for now.

RC4 is a lot faster than the other algorithms. AES is considered safer, but the SSL implementations take into account the known attacks on RC4, so we used to propose RC4 in priority. Following recent research, it now appears that RC4 is not safe enough anymore, and ECDHE got a performance boost with recent versions of OpenSSL. So, let's forbid RC4 right now!

So, here is the new cipher suite: 'ALL:!ADH:!EXP:!LOW:!RC2:!3DES:!SEED:!RC4:+HIGH:+MEDIUM'

And the list of ciphers we will use:

DHE-RSA-AES256-SHA      SSLv3 Kx=DH       Au=RSA  Enc=AES(256)  Mac=SHA1
DHE-DSS-AES256-SHA      SSLv3 Kx=DH       Au=DSS  Enc=AES(256)  Mac=SHA1
AES256-SHA              SSLv3 Kx=RSA      Au=RSA  Enc=AES(256)  Mac=SHA1
DHE-RSA-AES128-SHA      SSLv3 Kx=DH       Au=RSA  Enc=AES(128)  Mac=SHA1
DHE-DSS-AES128-SHA      SSLv3 Kx=DH       Au=DSS  Enc=AES(128)  Mac=SHA1
AES128-SHA              SSLv3 Kx=RSA      Au=RSA  Enc=AES(128)  Mac=SHA1

Six ciphers, that's much more manageable. We could reduce the list further, but it is already in good shape for security and speed. Configure it in Apache with these directives:

SSLHonorCipherOrder On
SSLCipherSuite ALL:!ADH:!EXP:!LOW:!RC2:!3DES:!SEED:!RC4:+HIGH:+MEDIUM

Configure it in Nginx with this directive:

ssl_ciphers ALL:!ADH:!EXP:!LOW:!RC2:!3DES:!SEED:!RC4:+HIGH:+MEDIUM;

You can also see that the performance of RSA gets worse as the key size grows. With the current security requirements (as of January 2013, if you are reading this from the future), you should choose an RSA key of 2048 bits for your certificate, because 1024 bits is not enough anymore, and 4096 is a bit overkill.

Remember, the benchmark depends on the version of OpenSSL, the compilation options and your CPU, so don't forget to test on your server before implementing my recommendations.

Take care of the handshake

The SSL protocol is in fact two protocols (well, three, but the first is not interesting for us): the handshake protocol, where the client and the server will verify each other’s identity, and the record protocol where data is exchanged.

Here is a representation of the handshake protocol, taken from the TLS 1.0 RFC:

      Client                                               Server

      ClientHello                  -------->
                                                      ServerHello
                                                     Certificate*
                                               ServerKeyExchange*
                                              CertificateRequest*
                                   <--------      ServerHelloDone
      Certificate*
      ClientKeyExchange
      CertificateVerify*
      [ChangeCipherSpec]
      Finished                     -------->
                                               [ChangeCipherSpec]
                                   <--------             Finished
      Application Data             <------->     Application Data

You can see that there are 4 messages exchanged before any real data is sent. If a TCP packet takes 100ms to travel between the browser and your server, the handshake is eating 400ms before the server has sent any data!

And what happens if you make multiple connections to the same server? You do the handshake every time. So, you should activate Keep-Alive. The benefits are even bigger than for plain unencrypted HTTP.

Use this Apache directive to activate Keep-Alive:

KeepAlive On

Use this nginx directive to activate keep-alive:

keepalive_timeout 100;

Present all the intermediate certification authorities in the handshake

During the handshake, the client will verify that the web server's certificate is signed by a trusted certification authority. Most of the time, there are one or more intermediate certification authorities between the web server's certificate and the trusted CA. If the browser doesn't know an intermediate CA, it must look for it and download it. The download URL for the intermediate CA is usually stored in the "Authority Information Access" extension of the certificate, so the browser will find it even if the web server doesn't present the intermediate CA.

This means that if the server doesn’t present the intermediate CA certificates, the browser will block the handshake until it has downloaded them and verified that they are valid.

So, if you have intermediate CAs for your server’s certificate, configure your webserver to present the full certification chain. With Apache, you just need to concatenate the CA certificates, and indicate them in the configuration with this directive:

SSLCertificateChainFile /path/to/certification/chain.pem

For nginx, concatenate the CA certificate to the web server certificate and use this directive:

ssl_certificate /path/to/certification/chain.pem;
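
The concatenation itself is a single command (the file names here are examples); for nginx the order matters: the server certificate must come first, followed by the intermediates:

cat www.example.com.crt intermediate-ca.pem > /path/to/certification/chain.pem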

Activate caching for static assets

By default, browsers will not cache content served over SSL, for security reasons. That means that your static assets (Javascript, CSS, pictures) will be reloaded on every call. That is a big performance hit!

The fix for that: set the HTTP header "Cache-Control: public" for the static assets. That way, the browser will cache them. But don't activate it for sensitive content, because it should not be cached on the disk by the browser.

You can use this directive to enable Cache-Control:

<filesMatch "\.(js|css|png|jpeg|jpg|gif|ico|swf|flv|pdf|zip)$">
Header set Cache-Control "max-age=31536000, public"
</filesMatch>

The files will be cached for a year with the max-age option.

For nginx, use this:

location ~ \.(js|css|png|jpeg|jpg|gif|ico|swf|flv|pdf|zip)$ {
    expires 24h;
    add_header Cache-Control public;
}

Update: it looks like Firefox ignores the Cache-Control and caches everything from SSL connections, unless you use the “no-store” option.

Beware of CDN with multiple domains

If you have followed the usual performance tips, you have already offloaded your static assets (Javascript, CSS, pictures) to a content delivery network. That is a good idea for an SSL deployment too, BUT, there are caveats:

  • your CDN must have servers accessible over SSL, otherwise you will see the “mixed content” warning
  • it must have “Keep-Alive” and “Cache-control: public” activated
  • it should serve all your assets from only one domain!

Why the last one? Well, even if multiple domains point to the same IP, the browser will do a new handshake for every domain. So, here, we must go against the common wisdom of spreading your assets over multiple domains to profit from parallelized requests in the browser. If all the assets are served from the same domain, there will only be one handshake. This could be fixed to allow multiple domains, but that is beyond the scope of this article.

More?

I could talk for hours about how to tweak your web server's performance with SSL. There is a lot more to it than these easy tips, but I hope they will be useful for you!

If you want to know more, I am currently writing an ebook about SSL tuning, and I would love to hear your comments about it!

If you need help with your SSL configuration, I am available for consulting, and always happy to work on interesting architectures.

By the way, if you want to have a good laugh with SSL, read “How to get a certificate signed by multiple certification authorities” :)

PilotSSH: manage your server in a few touches

I just released Pilot SSH, a server administration application for iPhone. So, why another SSH application? Aren’t there dozens of these already?

I tried a lot of those shell applications, and they felt clunky on a phone. They are fine on a tablet (even more so if you use a Bluetooth keyboard), but typing commands on a phone's keyboard is not really intuitive, and the small screen of a phone is not really suited to displaying command results.

But I reaaaaally wanted to manage my servers from my phone. Because I am not always in front of my computer. Because I am too lazy to get the laptop from the bag, open it, plug in the 3G key or find WiFi, connect over SSH and type a command to restart a crashed web server. Because it would be awesome to be in a bar and say "hold on, I have to update my server", open my phone and do it in 3 touches.

So here it is: Pilot SSH, in all its glory. Out of the box, it can:

  • display running processes, the memory they use, and kill them
  • show which websites are enabled or disabled in Apache
  • display the uptime, halt or reboot the server
  • show Apache logs
Pilot SSH process list

But there is more! The application is completely extensible, because it uses scripts stored on the server side, in your home directory, in ~/.pilotssh. You can replace the current scripts entirely, download more from the Github repository, and make your own! The scripts can be written in any language, as long as they return a JSON string conforming to the API.

I already got a contribution soon after the launch, with a script to flush the caches. And I have a lot of ideas for new scripts:

  • upgrade a WordPress website
  • display the status of processes managed by Monit
  • create/remove users
  • support nginx too
  • display more logs

This is just the beginning, and I expect a lot of impressive ideas from the users of Pilot SSH. I can’t wait to see them!

Do you want to try Pilot SSH? Everything you need is on its website!

Harden WordPress using database permissions

Here is a small idea that I would like to throw out into the world: most web applications use only one database user for all operations (installation, administration, common usage). Couldn't we harness the database permission system to protect your data a bit?

How to

This is how you could do it:

  • Create one user (called 'user') with full privileges on the database
  • Create another user with no privileges (let's call it 'read')
  • Create a copy of wp-config.php that you will name wp-config-admin.php
  • Write the 'read' credentials in wp-config.php and the normal credentials in wp-config-admin.php (don't forget to use different auth, secure auth, logged in and nonce keys); see the sketch after this list
  • Create a copy of wp-load.php that you will name wp-load-admin.php
  • Replace in wp-load-admin.php the reference to wp-config.php by wp-config-admin.php
  • Replace in wp-login.php and wp-admin/* the references to wp-load.php by wp-load-admin.php
  • Now, you can use the admin interface, create posts, etc.
  • Grant some permissions to the 'read' database user: GRANT SELECT ON `db`.* TO 'read'; GRANT INSERT, UPDATE ON `db`.`wp_comments` TO 'read';
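
For reference, here is a minimal sketch of how the two configuration files would differ. DB_NAME, DB_USER and DB_PASSWORD are the standard WordPress constants; the names and passwords are placeholders:

// wp-config.php: restricted credentials for the front end
define('DB_NAME', 'db');
define('DB_USER', 'read');
define('DB_PASSWORD', 'read_user_password');

// wp-config-admin.php: full credentials, loaded only by the admin pages
define('DB_NAME', 'db');
define('DB_USER', 'user');
define('DB_PASSWORD', 'admin_user_password');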

That was a bit of work, but not that hard! So, what did we do here? We created a user for the admin interface with full privileges on the database (create/update posts, change the taxonomy, approve the comments, etc) and another one for the front end interface, with only read privileges on all tables (that bothers me too, but read on).

This means that SQL injections, either in plugins or in WordPress code (outside of the admin panel), will be much harder to exploit with this setup. Beware of the custom tables used by some plugins, though. Those will require specific permissions. Depending on the plugin, some could be read only for common usage.

Going further

That's nice, but not enough in my opinion. As I said, the full SELECT permission for the 'read' user bothers me. Couldn't we restrict the permissions on wp_users a bit? Some of the columns are needed, but do we need to access the user_pass column? Also, "ALL PRIVILEGES" for 'user' is a bit too much. Do we really use the "FILE" privilege (outside of SQL injections :D)?

Without further ado, here are the SQL commands you should use:

GRANT SELECT, INSERT, UPDATE ON `db`.`wp_comments` TO 'read';
GRANT SELECT ON `db`.`wp_commentmeta` TO 'read';
GRANT SELECT ON `db`.`wp_links` TO 'read';
GRANT SELECT ON `db`.`wp_options` TO 'read';
GRANT SELECT ON `db`.`wp_term_taxonomy` TO 'read';
GRANT SELECT ON `db`.`wp_usermeta` TO 'read';
GRANT SELECT ON `db`.`wp_terms` TO 'read';
GRANT SELECT ON `db`.`wp_term_relationships` TO 'read';
GRANT SELECT ON `db`.`wp_postmeta` TO 'read';
GRANT SELECT ON `db`.`wp_posts` TO 'read';
GRANT SELECT (user_activation_key, id, user_login, user_nicename, user_status, user_url, display_name, user_email, user_registered) ON `db`.`wp_users` TO 'read';
REVOKE ALL PRIVILEGES ON `db`.* FROM 'user';
GRANT SELECT, INSERT, DELETE, UPDATE, CREATE, DROP, ALTER, INDEX ON `db`.* TO 'user';

With these commands, 'user' can only manipulate tables. If you're an evil DBA, you can even revoke the CREATE, DROP and ALTER permissions after install, and reactivate them only for upgrades or plugin installations. The 'read' user has the same permissions as before on wp_comments, and has SELECT on all tables except wp_users. For wp_users, we grant SELECT on all columns except user_pass.

Thanks to this configuration, even a SQL injection in a plugin will not reach the password hashes! We also removed dangerous permissions like "FILE". I'd like to prevent timing attacks like "SELECT BENCHMARK(5000000,ENCODE('MSG','by 5 seconds'));" but I did not figure out the right syntax for this (I tried variations around "REVOKE EXECUTE ON FUNCTION benchmark FROM read", without result).

Thankfully, WordPress mostly works with this configuration, and I think that a lot of other applications could be protected like this. Imagine: you could grant insert but not select on the credit card table in an e-commerce application, and process transactions with a background task with the right permissions.

Database privileges are indeed a powerful tool to protect your code from SQL injections. They might require some architectural changes, but the profits can be huge for your security.

VLC For Win8: building the new compatibility layer

As you can see, we are doing a Kickstarter for the Windows 8 (WinRT) port of VLC media player. The goal is to take our existing code, which already works in Windows 8's "desktop mode", and make it run on WinRT, the "Metro" interface.

Porting code to WinRT poses significant challenges, mainly caused by the changes in the APIs. A lot of functions we were using, like LoadLibrary, are not available anymore, and are replaced by slightly different ones (like LoadPackagedLibrary). Those might not be too hard to integrate into our current code base.

Other APIs, like sockets, were replaced by their COM interface counterparts (for instance Windows.Networking.Sockets). They are used to provide asynchronous interfaces for code running under WinRT. They take inspiration from mobile applications and the "always responsive" goal: a WinRT application should not use blocking code, and should go to sleep or wake up quickly if needed. With COM interfaces, the code polling the socket is executed in another process, and the data is provided through a callback. This changes the usual networking code a bit (connect -> select -> read -> select -> ...), and we need to write a large layer of compatibility code.

After all of that, we still have to solve the issue of packaging. We already experimented a bit with side loading, which offers an easy way to distribute applications, but the real goal is to push VLC media player to the Windows Store.

So, why are we doing this? Windows 8 is happening: it is now preinstalled on most new computers.

Why am I excited to work on this port? This Kickstarter will give us the opportunity to work full time on WinRT for a few months, and solve all these challenges, not only for us, but also for other open source projects. We already know how to create compatibility layers for different operating systems, so we will be able to build one for WinRT. We could also come up with guidelines on using free software toolchains to build WinRT applications.

Basically, we're paving the way to WinRT for open source libraries and applications, with the nice side effect of running VLC on Windows 8.

Help us make this project a reality! Please contribute to our Kickstarter for Windows 8!

Testing Android push without a server

Adding push support to an Android app is quite easy, but it can be cumbersome to test it if the server part is not ready yet.

For this, you only need your API key, and the registration ID for your device (you can get it from a call to GCMRegistrar.getRegistrationId). Also, you should have already called GCMRegistrar.register from your app, with your sender id.

Then, to send a push message to your application, use this code:

import java.io.IOException;

import com.google.android.gcm.server.*;

public class Main {

    public static void main(String[] args) {
        // API key from the Google APIs console, and the registration ID
        // obtained on the device through GCMRegistrar.getRegistrationId
        String apiKey = "...";
        String deviceId = "...";

        Sender sender = new Sender(apiKey);
        Message message = new Message.Builder().addData("data1", "hello").build();

        try {
            // the last argument is the number of retries
            Result result = sender.send(message, deviceId, 2);
            System.out.println("got result: " + result.toString());
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}

Warning your users about a vulnerability

Somebody just told you about a vulnerability in your code. Moreover, they published a paper about it. Even worse, people have been very vocal about it.
What can you do now? The usual (and natural) reaction is to downplay the vulnerability, and try to keep it confidential. After all, publishing a vuln will make your users unsafe, and it is bad publicity for your project, right?

WRONG!

The best course of action is to communicate a lot. Here’s why:

Communicate with the security researcher

Yes, I know, we security people tend to be harsh, quickly flagging a vulnerable project as “broken”, “dangerous for your customers” or even “EPIC FAIL“. That is, unfortunately, part of the folklore. But behind that, security researchers are often happy to help you fix the problem, so take advantage of that!

If you try to silence the researcher, you will not get any more bug reports, but your software will still be vulnerable. Worse, someone else will undoubtedly find the vulnerability, and may sell it to governments and/or criminals.

Vulnerabilities are inevitable, like bugs. Even if you’re very careful, people will still find ways to break your software. Be humble and accept them, they’re one of the costs of writing software. The only thing that should be on your mind when you receive a security report is protecting your users. The rest is just ego and fear of failure.

So, be open, talk to security researchers, and make it easy for them to contact you (tips: create a page dedicated to security on your website, and provide a specific email for that).

Damage control of a public vulnerability

What do you do when you discover the weakness in the news, on twitter, or wherever else?

Communicate. Now.

Contact the researchers, contact the journalists writing about it, the users whining about the issue, write on your blog, on twitter, on facebook, on IRC, and tell them this: “we know about the issue, we’re looking into it, we’re doing everything we can to fix it, and we’ll update you as soon as it’s fixed”.

This will buy you a few hours or days to fix the issue. You can’t say anything else yet, because you probably don’t know enough about the problem to make accurate statements. What you should not say:

  • “we’ve seen no report of this exploited in the wild”, yet
  • “as far as we know, the bug is not exploitable”, but soon it will be
  • “the issue happens on a test server, not in production” = “we assume that the researcher didn’t also test on prod servers”
  • “no worries, there’s a warning on our website saying that it’s beta software”. Beta means people are using it.

Anything more you say during the few days you just bought might harm your project. Take it as an opportunity to cool your head, work on the vulnerability, test for regressions, find other mitigations, and plan the new release.

Once it is done, publish the security report:

Warning your users

So, now you have fixed the vulnerability, and you have to tell your users about it. As time passes, more and more people will learn about the issue, so they might as well hear it from you.

You should have a specific webpage for security issues. You may fear that criminals could use it to attack your software or website. Don’t worry: they don’t need it, they have other ways to learn about software weaknesses. This webpage is for your users, and it has multiple purposes:

  • showing that you handle security issues
  • telling which versions someone should not use
  • explaining how to fix some vulnerabilities if people cannot update

For each vulnerability, here is what you should display (a hypothetical example follows the list):

  • the title (probably chosen by the researcher)
  • the security researcher’s name
  • the CVE identifier
  • the chain of events:
    • day of report
    • dates of back and forth emails with the researcher
    • day of fix
    • day of release (for software)
    • day of deploy (for web apps)
  • the affected versions (for software, not web apps)
  • the affected components (maybe the issue only concerns specific parts of a library)
  • how it can be exploited. You don’t need to get into the details of the exploit.
  • The target of the exploit. Be very exhaustive about the consequences. Does the attacker get access to my contacts list? Can he run code on my computer? etc.
  • The mitigations. Telling people to update to the latest version is not enough. Some users will not update right away, because they have their own constraints. Maybe they have a huge user base and can’t update quickly. Maybe they rely on an old feature removed in the latest version. Maybe their users refuse to update because it could introduce regressions. What you should tell:
    • which version is safe (new versions with the fix, but also old versions if the issue was introduced recently)
    • if it’s open source, which commit introduced the fix. That way, people managing their own internal versions of software can fix it quickly
    • how to fix the issue without updating (example for a recent Skype bug). You can provide a quick fix that will buy time for sysadmins and downstream developers to test, fix and deploy the new version.
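
To make this concrete, here is what a short advisory could look like. Every name, date, version and identifier below is made up for illustration:

    SQL injection in the search form (reported by Jane Doe)
    CVE: CVE-2012-XXXX
    Timeline: reported June 1st, fix committed June 4th (commit abc123), version 1.4.3 released June 8th
    Affected versions: 1.2.0 to 1.4.2 (the search component only); 1.1.x and 1.4.3 are safe
    Impact: an unauthenticated attacker can read arbitrary rows from the database, including password hashes
    Mitigations: update to 1.4.3, backport commit abc123, or disable the search module until you can update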

Now that you have a good security report, contact the security researchers, journalists and users again, provide them with the report, and emphasize what users risked and how they can fix it. You can now comment publicly about the issue, write a blog post, and send emails to your users to tell them how much you worked to protect them. And don’t forget the time-honored formula: “we take security very seriously and have taken measures to prevent this issue from happening ever again”.

Do you need help fixing your vulnerabilities? Feel free to contact me!

Software security is like medicine

People don’t get software security. Between the legends around hackers, the scaremongering by salesmen and the technical level needed to practice it, most people won’t even try to understand it. It is hard to get a high level view of how it works, and of how you can protect yourself. But there’s a nice little metaphor to understand security: it’s like medicine!

Medicine is about taking care of your body, fighting off illness, healing injuries. The body is a big machine that can fail in a lot of ways, depending on a lot of parameters. It is the same for software security: software can fail at any time, and you want to prevent and repair those failures. Security, like medicine, is a very technical domain, and takes a lot of time to learn. Even after learning, you still don’t know everything, and you must keep up to date with recent research.

But basic body maintenance is easy. And protecting your software against basic attacks is easy too. You don’t need to study for 11 years to prevent basic SQL injections. Most of the time, the diseases and bugs that will affect you are common, and you don’t need Dr House to prevent them.
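For instance, the usual fix for basic SQL injection is a one-line habit: use parameterized queries instead of concatenating user input into the query string. A minimal Java sketch, where the table and column names are made up:

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class SafeQuery {

    // Checks whether a login exists, without risking SQL injection.
    // The users table and login column are hypothetical.
    public static boolean userExists(Connection conn, String login) throws SQLException {
        // Vulnerable version: "SELECT 1 FROM users WHERE login = '" + login + "'"
        // With a parameterized query, the driver treats the input as plain data,
        // so a crafted login like "' OR '1'='1" cannot change the query.
        try (PreparedStatement stmt =
                conn.prepareStatement("SELECT 1 FROM users WHERE login = ?")) {
            stmt.setString(1, login);
            try (ResultSet rs = stmt.executeQuery()) {
                return rs.next();
            }
        }
    }
}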

Bodies and software are problematic because they both decay through time, and are very sensitive to their environment. Change the environment, and you could catch something new. The world is full of germs and criminals. Is that a good reason to always stay at home? Obviously not. You must accept that you can get sick at any time. So, take steps to protect yourself. Wash your hands and sanitize your inputs. And beware of snake oil.

Feel free to extend the metaphor in the comments, and use it to explain software security around you!

By the way, if you need someone to auscultate your applications, you can contact me at geoffroycouprie.com

Hiring gurus

I see a lot of posts on job boards asking for “gurus”, “ninjas” or whatever. That is ridiculous at best, dangerous at worst.

Ridiculous, because you won’t find gurus on job boards. The real stars will not look for a job on these websites. They stay in companies that know their value and pay them well. Or they are directly contacted with offers tailored to them. They don’t need job boards.

That is also dangerous, because you will gather good or average developers with a very high opinion of themselves. They think of themselves as awesome developers, so why would they learn anything? For your teams, you need team players: humble people, able to recognize their errors and willing to learn. You don’t need a lonesome cowboy.

Even worse, you will refuse future gurus because they’re not good enough right now. Some developers have a lot of potential, and can bloom in the right environment. Think of it: a developer cheaper than the others who will deliver more than the others in a year! You will also refuse useful people: a bad developer with a knack for debugging, a bad developer with a lot of knowledge about low level problems, a bad developer who will boost your team’s productivity just because he’s good with people…

These job posts reinforce the cargo cult of the awesome geek wrangling tons of awesome code without sleep and with a lot of coffee. Enough with that! We geeks are not produced on an assembly line with A, B, or C quality grades. We are real people, with unique qualities and flaws, and you will miss out on awesome people while you wait for your Prince Charming.

Your data is precious

Following LinkedIn’s large password leak, I have seen a dangerous thought spread among friends and colleagues:
“so what if my LinkedIn password has leaked? What can they do? Look for a job for me?”

That is based on wrong assumptions about what an attacker wants and can do. And it confuses the low value you get from a service with the value of your data. Your data is PRECIOUS. Maybe not to you. But everything can be sold, and you’ll always find someone interested in buying it. Let’s look at a few creative uses of your LinkedIn account:

Analyze your data

You might think that what you share is of no use to anyone except potential recruiters, but by mixing your resume, shared links, private messages and all the other data you put on the website, I could build a nice profile and sell it to advertisers. Did you put your address and phone number somewhere in your profile? Awesome! I have a lot of targeted advertisements for you!

Obtain access to your other accounts (email, Facebook, Twitter, Viadeo…)

With your email address and your password, I could probably guess the password for other services. Almost nobody has strong and different passwords for every service. Would you like to see your Facebook or Twitter account compromised? I don’t think so.

Oh, and remember to use a strong password, or even two factor authentication, for your email. A lot of password recovery systems use email, so if your mailbox is compromised, your accounts will be compromised.

Spam/SEO

Nothing can be done with your account? Oh, but you have contacts. And maybe a well referenced profile. I’d be able to send spam links to all your contacts through the news feed, and put them in your profile, to improve the ranking of my websites. Sure, there’s no harm to you, if you don’t care about losing credibility or annoying your contacts.

Using the contact list

Oh, yes, I could sell your contact list, that’s easy money!

While I’m at it, I could have fun with your friends and colleagues:

  • ask them for money, nude pictures, confidential information, etc.
  • tell them that your email account has been compromised, and that they must send their emails to another address, controlled by me
  • obtain access to their accounts with social engineering

You may be insignificant, but that’s not necessarily true of your contacts. In social networks, your network has a value, and you must protect it. It is your responsibility to make sure your friends and colleagues don’t get compromised through your account.

It reminds me of the 90s, when I often had this dialogue:

Me: “You should put an antivirus and a firewall on your computer.”
You: “Why should I? There’s nothing interesting on my computer, why would anyone want to infect it?”
Me: “Because I receive 10 emails a day from you, and all of them contain a virus.”