I often play with group messaging ideas, and recently an interesting perspective came to me about the relationship between these messaging systems and the constraints of the CAP theorem.
What is the CAP theorem?
(Small theoretical background here; feel free to skip if you already know what it is about.)
The CAP theorem, also known as Brewer's theorem, states that three properties are important in distributed systems:
- Consistency (all nodes see the same data)
- Availability (every request receives a response)
- Partition tolerance (the system still works through network splits or node failure)
The CAP theorem tells us that a distributed system cannot have all three properties at the same time. It is not really the "two out of three" choice most people describe, but more of a compromise you have to make. Some examples:
- in traditional database systems (with a master-slave model), consistency and availability are high (all nodes can answer with the same data), but partition tolerance is weak because the master is a single point of failure
- in a fully distributed database (no master), availability is high (all the nodes can answer) and partition tolerance is high (a subset of the cluster can act as the whole database), but consistency is weak (data must be replicated to all the relevant nodes, which takes time, so the nodes may not all see the same data at the same time)
- in Bitcoin, consistency is good (all nodes must agree on the same blockchain) and partition tolerance is good (for downloading the blockchain), but availability is weak for writing, because a new transaction must first propagate to a large enough set of nodes, and then a block must be calculated for the transaction to be stored
As you can see, you can shape those properties depending on how you want your system to behave. Maybe you want fast reads, or fast writes, or very strong replication, etc.
Group messaging constraints
This has always been a challenge. Traditionally, chat systems follow a client-server model, where the server redistributes all the messages. As we saw previously, that model handles partitions poorly. The usual solution is to have multiple servers talking to each other, as in IRC or XMPP.
For a fully distributed messaging system (if we forget for a moment all the nasty NAT problems), communication becomes a routing problem: making sure the nodes get all the messages fast enough and in the right order. If you're building a distributed Twitter, it's not really a problem, but for an interactive chat system, this becomes really hard. You can try to send your messages directly to all the users in the chat room, but as more people join, sending and receiving all the messages takes more and more time, and so, you sacrifice availability and a bit of consistency (you will not necessarily receive all the messages).
Group messaging crypto
Did the group messaging problems seem hairy? Let's add crypto to the mix, just for fun! What security properties would we want to add to a messaging system?
- Authentication: every user knows with whom he is talking
- Confidentiality: an external observer cannot see the content of a message
- Tampering proof: users can detect when a message has been modified and reject it
- Perfect forward secrecy: compromising one key does not compromise the whole conversation or past conversations
There are others we could want, but let's concentrate on these for now. We can separate messaging into two phases: discovery and transport. Discovery is when nodes begin talking to each other, authenticating each other, establishing session keys, managing key rotation, etc. Note that discovery can happen at any time, since a new node can appear long after the conversation started. Transport is about sending messages safely and verifying them.
I think we can assume that, once all the nodes have authenticated each other and agreed on session keys, the transport is the easy part. Whatever the underlying transport system (broadcast, XMPP, SMTP, etc.), once you can send, receive and verify messages, everything is easy.
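To make the transport phase concrete, here is a minimal Ruby sketch (standard OpenSSL bindings only; it assumes a 256-bit group session key was already agreed upon during discovery) of authenticated encryption: messages are unreadable to an outside observer, and any modification makes decryption fail.

```ruby
require 'openssl'

# Hypothetical pre-shared group session key, output of the discovery phase
session_key = OpenSSL::Random.random_bytes(32)

# Encrypt and authenticate a message with AES-256-GCM
def seal(key, plaintext)
  cipher = OpenSSL::Cipher.new('aes-256-gcm').encrypt
  cipher.key = key
  iv = cipher.random_iv
  ciphertext = cipher.update(plaintext) + cipher.final
  [iv, cipher.auth_tag, ciphertext]
end

# Decrypt and verify; raises OpenSSL::Cipher::CipherError if the
# ciphertext or the authentication tag was tampered with
def open_sealed(key, iv, tag, ciphertext)
  cipher = OpenSSL::Cipher.new('aes-256-gcm').decrypt
  cipher.key = key
  cipher.iv = iv
  cipher.auth_tag = tag
  cipher.update(ciphertext) + cipher.final
end

iv, tag, ct = seal(session_key, 'hello, room')
puts open_sealed(session_key, iv, tag, ct) # prints "hello, room"
```

Once every member holds the same key, any transport (broadcast, XMPP, SMTP) can carry the opaque iv/tag/ciphertext triplet.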
The interesting part is the discovery. That is where the similarity with a distributed system is evident. You want a lot of nodes to start communicating with each other, you want to propagate information to all the nodes (like session keys), you must handle nodes that go up or down, and clusters splitting.
That's where the CAP theorem is useful to understand the constraints of the system.
Group authentication is an availability and partition problem: if you cannot start the communication until all the nodes have authenticated each other, you are vulnerable to slow or failing nodes.
Tampering proof is a consistency problem: all the nodes must know the same signature keys to agree on whether a message is correct or not.
Confidentiality is a consistency and partition problem: if you must reach consensus on a group-wide session key, you have to wait for all the nodes to get the new key (consistency). That happens a lot if you want perfect forward secrecy, where you must change keys regularly. Moreover, in case of partition, the different groups will agree on different keys, and a conflict will appear once the split is over.
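As a toy illustration of the key rotation part, here is a hypothetical Ruby sketch of a one-way key "ratchet": each epoch's key is hashed from the previous one, so stealing the current key does not reveal past keys. Real protocols use proper key derivation functions and fresh Diffie-Hellman exchanges rather than a bare hash, and the group must still agree on which epoch is current, which is exactly the consistency problem described above.

```ruby
require 'digest'

# Derive the next epoch key from the current one. SHA-256 is one-way,
# so an attacker holding the epoch-N key cannot walk back to epoch N-1.
def next_key(key)
  Digest::SHA256.digest('group-ratchet' + key)
end

k0 = 'initial group secret' # hypothetical output of the discovery phase
k1 = next_key(k0)
k2 = next_key(k1)
# Any member holding k0 derives the same k1 and k2 locally, but after a
# partition heals, members may disagree on the current epoch number.
```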
As you can see, a security model can help you choose which tools (cryptographic or not) will ensure the safety of the system, while distributed systems theory lets you predict and recognize the behaviour of the system.
Some examples
Let's see how some more or less well-known systems handle this:
GPG and email
GPG+email is, at its heart, a secure group messaging system. If we analyze its properties, we can see that it is very good against partition: SMTP handles splits quite well, as it can resend messages that did not get through, or store them for a few days until the next server is up.
For availability, it doesn't fare very well at high loads: you need to encrypt the message for every recipient, so you cannot take advantage of SMTP's architecture to reduce the load.
It also has very bad consistency. If you want every node to agree on the keys of each user, you have to hold one-to-one offline meetings between all the participants. In practice, users make a tradeoff here, so the security assumption is not completely true.
Multi party off the record messaging
There is a paper you can read about MP OTR, which builds on the previous OTR algorithm to provide secure multi party communication. That protocol relies on a heavy setup phase where all the nodes authenticate each other and generate a group encryption key.
This model will fare quite well once consensus has been reached: all the nodes know the ephemeral signature keys and the group encryption key.
Unfortunately, whenever a user joins or leaves the chatroom, the whole shutdown and setup process must happen again, making the chatroom unavailable in the meantime.
What about yours?
I see a lot of new projects appearing lately with the intention of "fixing" chat systems or social networks by building a "secure" distributed system. Unfortunately, most of them do not have a serious background in security or distributed systems. The subject of this article, the CAP theorem, is only a very small part of how we think about distributed systems: people have been researching the subject for years. Similarly, cryptography and protocols have improved a lot lately.
So, if you are building one of these, take your time, forget the hype, read up a bit about the theory, and think about your model before you make your technological choices. Please don't repeat the worst mistakes.
This post is a translation of something I wrote in French for Human Coders. They asked me what would be the ideal job post from a developer's standpoint:
How would you write a job announcement that attracts good developers? Most recruiters complain that finding the right candidates is an ordeal.
If you ask me, it is due to very old recruitment practices: writing job posts as if for paper ads (where you pay by the letter), spamming them to as many people as possible, mandating fishy head hunters... This worked in the past, but things have changed. A lot more companies are competing to recruit developers, and many more developers are available, so separating the wheat from the chaff is harder.
We can do better! Let's start from scratch. For most candidates, a job posting is the first contact they'll have with your company. It must be attractive, exciting! When I read "the company X is the leader on the market of Y", I don't think that they're awesome, I just wonder what they really do.
A job posting is a marketing document. Its purpose is not to filter candidates, but to attract them! And how do you write a marketing document?
YOU. TALK. ABOUT. THE. CLIENT. You talk about his problems, his aspirations, and only then do you tell him how you will make things better for him. A job posting must do the same. You talk to THE candidate. Not to multiple candidates, not to the head hunter or the HR department, but to the candidate. Talk about the candidate, and only the candidate. When a developer looks for a job, she doesn't want to "work on your backend application" or "maintain the web server". That is what she will do for you. This is what she wants:
- getting paid to write code
- working on interesting technologies
- a nice workplace atmosphere
- learning
- etc.
A good job posting should answer the candidate's aspirations and talk about the career path. Does this job lead to project management? Do you offer in-house training? Is there a career path for expertise in your company?
Do you share values with your candidate? I do not mean the values written by your sales team and displayed on the "our values" page of your website. I am talking about the values of the team where the candidate will end up. Do they work in pair programming? Do they apply test driven development? There is no best way to work, the job can be done in many ways, so you must make sure that the candidate will fit right in.
What problem are you trying to solve with your company? Do you create websites that can be managed by anyone? Do you provide secure hosting? Whatever your goal is, talk about it instead of talking about the product. I do not want to read "our company develops a mobile server monitoring tool", because that is uninteresting. If I read "we spent a lot of time on call for various companies, and we understood that mobility is crucial for some system administrators, so we created a tool tailored for system administrators on the move", I see a real problem, a motivation to work, a culture I could identify with.
By talking that way to the candidate, you will filter candidates on motivation and culture instead of filtering on skills. That can be done later, once you meet the candidate. You did not really think that a resume was a good way to select skilled people, did you?
Here is a good example of a fictitious job posting, from a company aggregating news for developers and looking for a Rails developer:
"You are a passionate Ruby on Rails developer, you are proud of your unit tests, and you enjoy the availability of Heroku's services? So do we!
At Company X, we love developers: all of our services are meant for their fulfillment. We offer a news website about various technologies, highly interesting trainings and a job board for startups.
Our employees benefit fully from these services, and give talks at conferences all around France. By talking directly with other developers, they quickly build an extensive knowledge of current technologies.
Our news website is growing fast, so we need help to scale it. The web app uses Heroku and MongoDB, with a CoffeeScript front end. Are you well versed in Rails optimization? If yes, we would love to talk with you!"
Note that I did not talk about years of experience, or a city. I want to hire a Rails developer, not necessarily a French developer. I want someone with experience in optimization, not someone over 27.
With such a job posting, you will receive a lot more interesting applications. Now, are you afraid that this will mean a lot more work? The solution comes in a future post: how to target candidates efficiently, and get away from job boards!
Following the recent YAML parsing vulnerabilities in Rails, I decided to act on an idea I had a few months ago: using route constraints to define strict API contracts in Rails.
Sadly, it does not protect against the YAML parsing problem (or the similar vulnerabilities we will probably see in the following months), because the request is interpreted before going through the route constraints. But it can protect from the mass assignment vulnerability, and probably from some SQL injections.
Here is the idea: Rails 3 introduced route constraints, a way to run functions on the request before it is passed to the controllers. By combining them with the json-schema gem, we can filter the JSON input quite easily.
For the following example data:
[sourcecode language="javascript"]
{"echo" : "blah", "nb" : 1, "data": [1, 2, 3]}
[/sourcecode]
We can define the following schema:
[sourcecode language="javascript"]
{
  "type": "object",
  "$schema": "http://json-schema.org/draft-03/schema",
  "id": "#",
  "required": false,
  "additionalProperties": false,
  "properties": {
    "data": {
      "type": "array",
      "id": "data",
      "required": false,
      "items": {
        "type": "number",
        "id": "0",
        "required": false
      }
    },
    "echo": {
      "type": "string",
      "id": "echo",
      "required": false
    },
    "nb": {
      "type": "number",
      "id": "nb",
      "required": false
    }
  }
}
[/sourcecode]
(Use the JSON schema generator to create your own)
Save this schema as "data.schema" and add "json-schema" to your Gemfile. You will then be able to filter inputs with code like the following in "config/routes.rb":
[sourcecode language="ruby"]
require "json-schema"

# Route constraint: only lets the request through if its JSON body
# validates against the schema stored in "data.schema"
class LolJSONConstraint
  def matches?(request)
    if request.headers["CONTENT_TYPE"] == "application/json"
      JSON::Validator.validate("data.schema",
        request.headers["action_dispatch.request.request_parameters"])
    end
  end
end

Yamlvuln::Application.routes.draw do
  resources :posts, :constraints => LolJSONConstraint.new
end
[/sourcecode]
The constraint loads the schema, applies it to the incoming data, and returns a 404 error if the JSON is invalid. Setting "additionalProperties" to false in the schema is required to refuse any properties you didn't define and protect the application from mass assignment.
If I tried, for example, to send the following JSON to the application, there would be an error:
[sourcecode language="javascript"]
{"echo" : "blah", "nb" : "UNION ALL SELECT LOAD_FILE(CHAR(34,47,101,116,99,47,112,97,115,115,119,100,34))", "data": [1, 2, 3]}
[/sourcecode]
As I said before, it is not safe against the YAML parsing vulnerability, and I did not really test its performance. But it is still a nice and easy solution for API filtering.
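To see in isolation what the "additionalProperties": false line buys us, here is a stdlib-only Ruby sketch (the ALLOWED table below is a hypothetical hand-written contract, not part of the json-schema gem) that rejects any JSON body carrying an undeclared key, which is the essence of the mass assignment protection:

```ruby
require 'json'

# Declared contract: the only keys we accept, and their expected types
ALLOWED = { 'echo' => String, 'nb' => Numeric, 'data' => Array }.freeze

# Returns true only if the body is a JSON object whose keys all belong
# to the contract and carry values of the declared type
def contract_ok?(raw_body)
  params = JSON.parse(raw_body)
  return false unless params.is_a?(Hash)
  params.all? { |key, value| ALLOWED.key?(key) && value.is_a?(ALLOWED[key]) }
rescue JSON::ParserError
  false
end

puts contract_ok?('{"echo":"blah","nb":1,"data":[1,2,3]}') # prints "true"
puts contract_ok?('{"echo":"blah","admin":true}')          # prints "false"
```

The json-schema gem performs the same check recursively, along with the type validations declared in the schema.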
Last week, I wrote a post about SSL optimization that showed the big interest people have in getting the absolute best performance from their web servers.
That post was just a small part of the ebook on SSL tuning I am currently writing. This ebook will cover various subjects:
- algorithms comparison
- handshake tuning
- HSTS
- session tickets
- load balancers
- ...
I test a lot of different architectures, to provide you with tips directly adaptable to your system (like I did previously with Apache and Nginx). But I don't have access to every system under the sun...
So, if you feel adventurous enough to try SSL optimization on your servers, please contact me, I would be happy to help you!
I am especially interested in large architectures (servers in multiple datacenters around the world, large load balancers, CDNs) and mobile application backends.
And don't forget to check out the ebook, to be notified about updates!
Update: following popular demand, the article now includes nginx commands :)
Update 2: thanks to jackalope from Hacker News, I added a missing Apache directive for the cipher suites.
Update 3: recent attacks on RC4 have definitely made it a bad choice, and ECDHE cipher suites got improvements.
SSL is slow. These cryptographic algorithms eat the CPU, there is too much traffic, it is too hard to deploy correctly. SSL is slow. Isn't it?
HELL NO!
SSL looks slow because you did not even try to optimize it! For that matter, I could say that HTTP is too verbose, that XML web services are verbose too, and that all this traffic makes the website slow. But SSL can be optimized, like everything else!
Slow cryptographic algorithms
The cryptographic algorithms used in SSL are not all created equal: some provide better security, some are faster. So, you should choose carefully which algorithm suite you will use.
The default one for Apache 2's SSLCipherSuite directive is: ALL:!ADH:RC4+RSA:+HIGH:+MEDIUM:+LOW:+SSLv2:+EXP
You can translate that to a readable list of algorithms with this command: openssl ciphers -v 'ALL:!ADH:RC4+RSA:+HIGH:+MEDIUM:+LOW:+SSLv2:+EXP'
Here is the result:
DHE-RSA-AES256-SHA SSLv3 Kx=DH Au=RSA Enc=AES(256) Mac=SHA1
DHE-DSS-AES256-SHA SSLv3 Kx=DH Au=DSS Enc=AES(256) Mac=SHA1
AES256-SHA SSLv3 Kx=RSA Au=RSA Enc=AES(256) Mac=SHA1
DHE-RSA-AES128-SHA SSLv3 Kx=DH Au=RSA Enc=AES(128) Mac=SHA1
DHE-DSS-AES128-SHA SSLv3 Kx=DH Au=DSS Enc=AES(128) Mac=SHA1
AES128-SHA SSLv3 Kx=RSA Au=RSA Enc=AES(128) Mac=SHA1
EDH-RSA-DES-CBC3-SHA SSLv3 Kx=DH Au=RSA Enc=3DES(168) Mac=SHA1
EDH-DSS-DES-CBC3-SHA SSLv3 Kx=DH Au=DSS Enc=3DES(168) Mac=SHA1
DES-CBC3-SHA SSLv3 Kx=RSA Au=RSA Enc=3DES(168) Mac=SHA1
DHE-RSA-SEED-SHA SSLv3 Kx=DH Au=RSA Enc=SEED(128) Mac=SHA1
DHE-DSS-SEED-SHA SSLv3 Kx=DH Au=DSS Enc=SEED(128) Mac=SHA1
SEED-SHA SSLv3 Kx=RSA Au=RSA Enc=SEED(128) Mac=SHA1
RC4-SHA SSLv3 Kx=RSA Au=RSA Enc=RC4(128) Mac=SHA1
RC4-MD5 SSLv3 Kx=RSA Au=RSA Enc=RC4(128) Mac=MD5
EDH-RSA-DES-CBC-SHA SSLv3 Kx=DH Au=RSA Enc=DES(56) Mac=SHA1
EDH-DSS-DES-CBC-SHA SSLv3 Kx=DH Au=DSS Enc=DES(56) Mac=SHA1
DES-CBC-SHA SSLv3 Kx=RSA Au=RSA Enc=DES(56) Mac=SHA1
DES-CBC3-MD5 SSLv2 Kx=RSA Au=RSA Enc=3DES(168) Mac=MD5
RC2-CBC-MD5 SSLv2 Kx=RSA Au=RSA Enc=RC2(128) Mac=MD5
RC4-MD5 SSLv2 Kx=RSA Au=RSA Enc=RC4(128) Mac=MD5
DES-CBC-MD5 SSLv2 Kx=RSA Au=RSA Enc=DES(56) Mac=MD5
EXP-EDH-RSA-DES-CBC-SHA SSLv3 Kx=DH(512) Au=RSA Enc=DES(40) Mac=SHA1 export
EXP-EDH-DSS-DES-CBC-SHA SSLv3 Kx=DH(512) Au=DSS Enc=DES(40) Mac=SHA1 export
EXP-DES-CBC-SHA SSLv3 Kx=RSA(512) Au=RSA Enc=DES(40) Mac=SHA1 export
EXP-RC2-CBC-MD5 SSLv3 Kx=RSA(512) Au=RSA Enc=RC2(40) Mac=MD5 export
EXP-RC4-MD5 SSLv3 Kx=RSA(512) Au=RSA Enc=RC4(40) Mac=MD5 export
EXP-RC2-CBC-MD5 SSLv2 Kx=RSA(512) Au=RSA Enc=RC2(40) Mac=MD5 export
EXP-RC4-MD5 SSLv2 Kx=RSA(512) Au=RSA Enc=RC4(40) Mac=MD5 export
28 cipher suites, that's a lot! Let's remove the unsafe ones first. At the end of the list, you can see seven suites marked as "export": they comply with the old US policy on cryptographic exports. Those algorithms are utterly unsafe, and the US abandoned this restriction years ago, so let's remove them:
'ALL:!ADH:!EXP:RC4+RSA:+HIGH:+MEDIUM:+LOW:+SSLv2'.
Now, let's remove the algorithms using plain DES (not 3DES) and RC2: 'ALL:!ADH:!EXP:!LOW:!RC2:RC4+RSA:+HIGH:+MEDIUM'. That leaves us with 16 algorithms.
It is time to remove the slow algorithms! To decide, let's use the openssl speed command. Run it on your own server: depending on your hardware, you might get different results. Here is the benchmark on my computer:
OpenSSL 0.9.8r 8 Feb 2011
built on: Jun 22 2012
options:bn(64,64) md2(int) rc4(ptr,char) des(idx,cisc,16,int) aes(partial) blowfish(ptr2)
compiler: -arch x86_64 -fmessage-length=0 -pipe -Wno-trigraphs -fpascal-strings -fasm-blocks
-O3 -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -DL_ENDIAN -DMD32_REG_T=int -DOPENSSL_NO_IDEA
-DOPENSSL_PIC -DOPENSSL_THREADS -DZLIB -mmacosx-version-min=10.6
available timing options: TIMEB USE_TOD HZ=100 [sysconf value]
timing function used: getrusage
The 'numbers' are in 1000s of bytes per second processed.
type 16 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes
md2 2385.73k 4960.60k 6784.54k 7479.39k 7709.04k
mdc2 8978.56k 10020.07k 10327.11k 10363.30k 10382.92k
md4 32786.07k 106466.60k 284815.49k 485957.41k 614100.76k
md5 26936.00k 84091.54k 210543.56k 337615.92k 411102.49k
hmac(md5) 30481.77k 90920.53k 220409.04k 343875.41k 412797.88k
sha1 26321.00k 78241.24k 183521.48k 274885.43k 322359.86k
rmd160 23556.35k 66067.36k 143513.89k 203517.79k 231921.09k
rc4 253076.74k 278841.16k 286491.29k 287414.31k 288675.67k
des cbc 48198.17k 49862.61k 50248.52k 50521.69k 50241.28k
des ede3 18895.61k 19383.95k 19472.94k 19470.03k 19414.27k
idea cbc 0.00 0.00 0.00 0.00 0.00
seed cbc 45698.00k 46178.57k 46041.10k 47332.45k 50548.99k
rc2 cbc 22812.67k 24010.85k 24559.82k 21768.43k 23347.22k
rc5-32/12 cbc 116089.40k 138989.89k 134793.49k 136996.33k 133077.51k
blowfish cbc 65057.64k 68305.24k 72978.75k 70045.37k 71121.64k
cast cbc 48152.49k 51153.19k 51271.61k 51292.70k 47460.88k
aes-128 cbc 99379.58k 103025.53k 103889.18k 104316.39k 97687.94k
aes-192 cbc 82578.60k 85445.04k 85346.23k 84017.31k 87399.06k
aes-256 cbc 70284.17k 72738.06k 73792.20k 74727.31k 75279.22k
camellia-128 cbc 0.00 0.00 0.00 0.00 0.00
camellia-192 cbc 0.00 0.00 0.00 0.00 0.00
camellia-256 cbc 0.00 0.00 0.00 0.00 0.00
sha256 17666.16k 42231.88k 76349.86k 96032.53k 103676.18k
sha512 13047.28k 51985.74k 91311.50k 135024.42k 158613.53k
aes-128 ige 93058.08k 98123.91k 96833.55k 99210.74k 100863.22k
aes-192 ige 76895.61k 84041.67k 78274.36k 79460.06k 77789.76k
aes-256 ige 68410.22k 71244.81k 69274.51k 67296.59k 68206.06k
sign verify sign/s verify/s
rsa 512 bits 0.000480s 0.000040s 2081.2 24877.7
rsa 1024 bits 0.002322s 0.000111s 430.6 9013.4
rsa 2048 bits 0.014092s 0.000372s 71.0 2686.6
rsa 4096 bits 0.089189s 0.001297s 11.2 771.2
sign verify sign/s verify/s
dsa 512 bits 0.000432s 0.000458s 2314.5 2181.2
dsa 1024 bits 0.001153s 0.001390s 867.6 719.4
dsa 2048 bits 0.003700s 0.004568s 270.3 218.9
We can remove the SEED and 3DES suites because they are slower than the others. DES was meant to be fast in hardware implementations but slow in software, so 3DES (which runs DES three times) is even slower. By contrast, AES can be very fast in software, and faster still if your CPU provides dedicated AES instructions. You can also see that with a bigger key (and thus better theoretical security), AES gets slower; depending on the level of security you need, you may choose different key sizes. According to the key length comparison, 128 bits might be enough for now. RC4 is a lot faster than the other algorithms, and while AES is considered safer, SSL implementations take the known attacks on RC4 into account, so we originally proposed it in priority. Following recent research, however, RC4 is not safe enough anymore, and ECDHE cipher suites got a performance boost in recent versions of OpenSSL. So, let's forbid RC4 right now!
So, here is the new cipher suite: 'ALL:!ADH:!EXP:!LOW:!RC2:!3DES:!SEED:!RC4:+HIGH:+MEDIUM'
And the list of ciphers we will use:
DHE-RSA-AES256-SHA SSLv3 Kx=DH Au=RSA Enc=AES(256) Mac=SHA1
DHE-DSS-AES256-SHA SSLv3 Kx=DH Au=DSS Enc=AES(256) Mac=SHA1
AES256-SHA SSLv3 Kx=RSA Au=RSA Enc=AES(256) Mac=SHA1
DHE-RSA-AES128-SHA SSLv3 Kx=DH Au=RSA Enc=AES(128) Mac=SHA1
DHE-DSS-AES128-SHA SSLv3 Kx=DH Au=DSS Enc=AES(128) Mac=SHA1
AES128-SHA SSLv3 Kx=RSA Au=RSA Enc=AES(128) Mac=SHA1
6 cipher suites, that's much more manageable. We could reduce the list further, but it is already in good shape for security and speed. Configure it in Apache with these directives:
SSLHonorCipherOrder On
SSLCipherSuite ALL:!ADH:!EXP:!LOW:!RC2:!3DES:!SEED:!RC4:+HIGH:+MEDIUM
Configure it in Nginx with this directive:
ssl_ciphers ALL:!ADH:!EXP:!LOW:!RC2:!3DES:!SEED:!RC4:+HIGH:+MEDIUM;
You can also see that RSA performance gets worse with key size. With current security requirements (as of January 2013, if you are reading this from the future), you should choose an RSA key of 2048 bits for your certificate: 1024 is not enough anymore, but 4096 is a bit overkill.
Remember, the benchmark depends on the version of OpenSSL, the compilation options and your CPU, so don't forget to test on your own server before implementing my recommendations.
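If you would rather script that comparison than parse the openssl speed output, here is a rough Ruby sketch (standard OpenSSL bindings only; the buffer size and round count are arbitrary) that times AES-128 against AES-256 in CBC mode on your own machine:

```ruby
require 'openssl'
require 'benchmark'

# 16 KB of random data, encrypted over and over to get a measurable time
data = OpenSSL::Random.random_bytes(16 * 1024)

# Returns the wall-clock seconds spent encrypting `rounds` buffers
def bench(cipher_name, data, rounds = 2000)
  Benchmark.realtime do
    rounds.times do
      cipher = OpenSSL::Cipher.new(cipher_name).encrypt
      cipher.key = OpenSSL::Random.random_bytes(cipher.key_len)
      cipher.iv = OpenSSL::Random.random_bytes(cipher.iv_len)
      cipher.update(data)
      cipher.final
    end
  end
end

puts format('aes-128-cbc: %.3fs', bench('aes-128-cbc', data))
puts format('aes-256-cbc: %.3fs', bench('aes-256-cbc', data))
```

On CPUs with AES instructions the two may be surprisingly close; without them, the extra rounds of AES-256 show up clearly.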
Take care of the handshake
The SSL protocol is in fact two protocols (well, three, but the first is not interesting for us): the handshake protocol, where the client and the server will verify each other's identity, and the record protocol where data is exchanged.
Here is a representation of the handshake protocol, taken from the TLS 1.0 RFC:
Client Server
ClientHello -------->
ServerHello
Certificate*
ServerKeyExchange*
CertificateRequest*
<-------- ServerHelloDone
Certificate*
ClientKeyExchange
CertificateVerify*
[ChangeCipherSpec]
Finished -------->
[ChangeCipherSpec]
<-------- Finished
Application Data <-------> Application Data
You can see that four message flights (two round trips) happen before any real data is sent. If a TCP packet takes 100ms to travel between the browser and your server, the handshake eats 400ms before the server has sent any data!
And what happens if you make multiple connections to the same server? You do the handshake every time. So, you should activate Keep-Alive. The benefits are even bigger than for plain unencrypted HTTP.
Use this Apache directive to activate Keep-Alive:
KeepAlive On
Use this nginx directive to activate keep-alive:
keepalive_timeout 100;
Present all the intermediate certification authorities in the handshake
During the handshake, the client verifies that the web server's certificate is signed by a trusted certification authority. Most of the time, there are one or more intermediate certification authorities between the web server's certificate and the trusted root CA. If the browser doesn't know an intermediate CA, it must look it up and download it. The download URL for the intermediate CA is usually stored in the "Authority Information Access" extension of the certificate, so the browser can find it even if the web server doesn't present the intermediate CA.
This means that if the server doesn't present the intermediate CA certificates, the browser will block the handshake until it has downloaded them and verified that they are valid.
So, if you have intermediate CAs for your server's certificate, configure your webserver to present the full certification chain. With Apache, you just need to concatenate the CA certificates, and indicate them in the configuration with this directive:
SSLCertificateChainFile /path/to/certification/chain.pem
For nginx, concatenate the CA certificate to the web server certificate and use this directive:
ssl_certificate /path/to/certification/chain.pem;
Activate caching for static assets
By default, browsers will not cache content served over SSL, for security reasons. That means that your static assets (Javascript, CSS, pictures) will be reloaded on every call. That is a big performance hit!
The fix: set the HTTP header "Cache-Control: public" on the static assets so the browser caches them. But don't activate it for sensitive content, because it should not be cached on disk by the browser.
You can use this directive to enable Cache-Control:
<FilesMatch "\.(js|css|png|jpeg|jpg|gif|ico|swf|flv|pdf|zip)$">
Header set Cache-Control "max-age=31536000, public"
</FilesMatch>
The files will be cached for a year with the max-age option.
For nginx, use this:
location ~ \.(js|css|png|jpeg|jpg|gif|ico|swf|flv|pdf|zip)$ {
expires 24h;
add_header Cache-Control public;
}
Update: it looks like Firefox ignores the Cache-Control and caches everything from SSL connections, unless you use the "no-store" option.
Beware of CDN with multiple domains
If you have followed the usual performance tips, you have already offloaded your static assets (Javascript, CSS, pictures) to a content delivery network. That is a good idea for an SSL deployment too, BUT there are caveats:
- your CDN must have servers accessible over SSL, otherwise you will see the "mixed content" warning
- it must have "Keep-Alive" and "Cache-control: public" activated
- it should serve all your assets from only one domain!
Why the last one? Well, even if multiple domains point to the same IP, the browser will do a new handshake for every domain. Here, we must go against the common wisdom of spreading your assets over multiple domains to benefit from the browser's parallelized requests. If all the assets are served from the same domain, there will be only one handshake. This could be fixed to allow multiple domains, but that is beyond the scope of this article.
More?
I could talk for hours about tweaking your web server's SSL performance. There is a lot more to it than these easy tips, but I hope they will be useful to you!
If you want to know more, I am currently writing an ebook about SSL tuning, and I would love to hear your comments about it!
If you need help with your SSL configuration, I am available for consulting, and always happy to work on interesting architectures.
By the way, if you want to have a good laugh with SSL, read "How to get a certificate signed by multiple certification authorities" :)