Yet another authentication scheme

Recently, I was asked to design a new authentication protocol for a web service. I know that I shouldn't reinvent the wheel, so I immediately proposed OAuth. It turns out that it can't be used in this situation. Here are the constraints:

-calls to the web service must be authenticated: I can keep the tokens and signatures from OAuth here. The problem is: how do I get that token?

-calls are made from devices or applications without access to a web browser (embedded devices, phones, etc.). The redirection dance of OAuth is not acceptable here.

-communications are done over an untrusted network, without SSL.

-I can't use application keys and secrets to encrypt and sign the authentication process: clients include open source software and smartphone applications. You can't hide a secret key in these.

-the protocol has to be simple to implement, in a lot of languages

-the server must not store the password in cleartext (I shouldn't have to specify this...), and the client must not store the password

Summing it up: no preshared keys, no browser, no SSL, untrusted networks, no passwords stored, and an OAuth-like environment once the client is authenticated (tokens, authorizations, revocation, etc.)

Apparently, I should just give up. But I like to play, so I'll try!

First, I must say that I am not an expert in security or cryptography. But I'm really enthusiastic about these subjects, and my day job is at a company providing strong authentication solutions (no, this protocol is not related to my day job). So, I know a bit about the subject, and I know that I should ask for reviews, hence this post.

Rough ideas

We need safe communication over an untrusted network. TLS immediately comes to mind, but the targeted applications might not have access to a TLS implementation. I'd like to use SRP, but I don't think I'm able to implement it correctly (and it has to be SIMPLE). Using Diffie-Hellman to establish a shared key is another idea, but it is not safe against MITM.

Here's my idea: we don't need to generate a shared secret, we already have it. It's the password!

But how can I use the password if the server doesn't store it in cleartext?

The trick: key derivation functions

Developers are finally understanding that they should not use MD5 or SHA1 to store their passwords, even with a salt, because computing power is so cheap these days that anyone could easily crack a lot of passwords.

It is now recommended to use other functions to store passwords. Key derivation functions are a class of functions that create a key from a password. Basically, they do it by iterating a lot of times. That makes them very slow, which is an interesting property if you want to store passwords: it is too expensive to "crack" the password. PBKDF2, bcrypt and scrypt are well-known key derivation functions. They're simple to use and available in a lot of languages.

With these functions, I can safely store the passwords, and generate a key shared with the client.

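In short: if I store kdf(password, N), with N the number of iterations, I can send any M > N to the client and ask it to compute the key, without compromising what I store. This works as long as the function can be resumed from its own output, so that kdf(kdf(password, N), M-N) = kdf(password, M): the server runs the M-N missing iterations on the stored value, while the client runs all M iterations on the password.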
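To make that property concrete, here is a minimal sketch in Python. It assumes a toy kdf that is nothing more than SHA-256 applied a given number of times; the function name, the password and the iteration counts are all made up for the illustration. Be careful: standard PBKDF2 and bcrypt mix the password into every round, so they cannot be resumed from a stored value like this. The idea needs a plain iterated construction such as the one below.

```python
import hashlib

def kdf(data: bytes, iterations: int) -> bytes:
    """Toy key derivation (assumption): SHA-256 applied `iterations` times."""
    out = data
    for _ in range(iterations):
        out = hashlib.sha256(out).digest()
    return out

password = b"correct horse"   # illustrative password
N = 1000                      # iterations used for the stored value
M = 1500                      # any M > N chosen by the server

stored = kdf(password, N)                # what the server keeps
client_key = kdf(password, M)            # client: M iterations from the password
server_key = kdf(stored, M - N)          # server: M-N more iterations on the stored value

keys_match = client_key == server_key
```

Both sides end up with the same value, and the server never needed the cleartext password to get there.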

Designing the protocol

Now that we have a way to use a shared key, we can look at what will go over the wire to establish it. If I use kdf(pass, M) directly, anybody getting access to the client storage will be able to obtain the key for any L > M. So, the key establishment has to use a nonce. That way, the client will only use the password once and forget it, and store the derived key.

I would rather use a truly random key that has no relation to the password. It could be given to the client, encrypted with the derived key. The derived key could then be thrown away. But I still do not know if it is really necessary.

The server still needs to authenticate the client. The client will make a second call to the web service, signing it with HMAC and the key.

That's it! It is really simple, so if there are flaws I did not see, you will surely catch them.

TL;DR

The protocol is based on key derivation functions, like PBKDF2 or bcrypt.

  • The server stores login and H = kdf(pass, N), with N integer
  • The client wants to authenticate and makes a call to the server with the login as argument
  • The server replies with M > N and a nonce i
  • The client calculates k1 = kdf(kdf(pass, M)+i, 1)
  • The server calculates k2 = kdf(kdf(H, M-N)+i, 1)
  • The client calls the server with args "user=login&sign=".HMAC("user=login", k1)
  • The server checks the signature with k2. If k1 = k2, the signature matches and the client is authenticated.
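The whole exchange can be sketched in a few lines of Python. This is an illustration under assumptions, not a reviewed implementation: `kdf` is again a toy SHA-256 chain (PBKDF2 and bcrypt cannot be resumed from H), the "+" in the steps above is taken to mean byte concatenation, HMAC uses SHA-256, and the names `client_sign` and `server_verify` are made up for this example.

```python
import hashlib
import hmac
import os

def kdf(data: bytes, iterations: int) -> bytes:
    """Toy key derivation (assumption): SHA-256 applied `iterations` times."""
    out = data
    for _ in range(iterations):
        out = hashlib.sha256(out).digest()
    return out

def client_sign(password: bytes, m: int, nonce: bytes, message: bytes) -> str:
    # k1 = kdf(kdf(pass, M) + i, 1), then sign the request with it.
    k1 = kdf(kdf(password, m) + nonce, 1)
    # Python's hmac.new takes the key first, then the message.
    return hmac.new(k1, message, hashlib.sha256).hexdigest()

def server_verify(stored_h: bytes, n: int, m: int, nonce: bytes,
                  message: bytes, signature: str) -> bool:
    # k2 = kdf(kdf(H, M-N) + i, 1), computed without knowing the password.
    k2 = kdf(kdf(stored_h, m - n) + nonce, 1)
    expected = hmac.new(k2, message, hashlib.sha256).hexdigest()
    # Constant-time comparison of the two signatures.
    return hmac.compare_digest(signature, expected)

# Enrollment: the server stores login and H = kdf(pass, N).
N = 1000
H = kdf(b"hunter2", N)

# Challenge: the server replies with M > N and a nonce i.
M = N + 500
nonce = os.urandom(16)

# The client signs its call; the server checks the signature.
message = b"user=login"
signature = client_sign(b"hunter2", M, nonce, message)
request = message.decode() + "&sign=" + signature
authenticated = server_verify(H, N, M, nonce, message, signature)
```

A client that types the wrong password derives a different k1, so its signature will not match the one the server computes with k2.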

The Geal test: extending the Joel Test

The Joel test was written by Joel Spolsky to provide a few very simple questions for developers to ask in an interview. Here they are:

The Joel Test

  1. Do you use source control?
  2. Can you make a build in one step?
  3. Do you make daily builds?
  4. Do you have a bug database?
  5. Do you fix bugs before writing new code?
  6. Do you have an up-to-date schedule?
  7. Do you have a spec?
  8. Do programmers have quiet working conditions?
  9. Do you use the best tools money can buy?
  10. Do you have testers?
  11. Do new candidates write code during their interview?
  12. Do you do hallway usability testing?

They seem basic, and that's the point: a company with a poor score doesn't give a nice environment to its developers.

While this test is still applicable, it was written in 2000, and software development has seen a lot of changes and innovation. So, I thought of a few other questions that you can ask your current or future employer:

The Geal Test

  1. Do you use agile development methods?
  2. Do you have unit tests?
  3. Do you perform code reviews?
  4. Do you use known technologies and frameworks (open source or not)?
  5. Do developers train and learn during office hours, or in their spare time?
  6. Do developers communicate with system administrators (deployment requests and bug reports don't count)?
  7. Do developers communicate with the client?
  8. Do developers retain copyright on the work done in their spare time?

That's it: 8 more questions, 1 point per positive answer. Joel said that 11 or 12 on his test is OK. I'm nicer, so I'll say that 6 on my test is good enough.

1. Do you use agile development methods?

Agile methods have been around for a few years now, and they have proven useful for a lot of projects, especially when you have changing requirements or a very short time to market. Don't let your developers fight every day against specifications written 5 years ago, let them adapt along the way.

2. Do you have unit tests?

This should be standard. There are a lot of libraries to write tests in every language, for specific functions, for APIs, for user interfaces, so this approach is well supported. Moreover, if you answered "yes" to Joel's question about daily builds, you can add tests to the loop, and run them right after the daily build. If you're not convinced about the usefulness of unit tests, or fear that they will take too much time: unit tests give you assurance that you won't break the code, they can validate compliance with the specifications, and automated unit tests will save development time. You don't want to pay a developer to manually test the same code over and over, but you can buy a machine to do that.

3. Do you perform code reviews?

I know that this one is hard to implement in a team, but once the developers are past the "I'm too shy to show you my code" phase, this will help them spot mistakes, learn from the better developers and find new ways to improve the code.

4. Do you use known technologies and frameworks (open source or not)?

A lot of companies have custom frameworks that they develop and use for their products, that nobody else uses. Although it can be comforting to have your own technology, that you control and maintain, it has a few problems:

  • it reeks of NIH syndrome
  • it is a cost not directly linked to what you're selling
  • you have to train developers to use it
  • the expertise they build will be useless in future jobs

If you use known (and hopefully, recent) technologies, you don't have to maintain them (although you may need to pay for them), you profit from bugfixes made for other clients, and you are more likely to attract and hire skilled developers. Seriously, I don't want to waste years maintaining your dead framework.

5. Do developers train and learn during office hours, or in their spare time?

Software development moves very fast and a developer needs to catch up often. If you don't allocate time in his schedule to read, try and learn, you can assume that he will still train himself in his spare time. But you take the risk that your Java developer becomes a Ruby expert, because he will learn what he wants, not what you need. If you want your developers to become experts, help them.

6. Do developers communicate with system administrators (deployment requests and bug reports don't count)?

Too often, the only communication between developers and tech ops is through deployment requests and bug reports. The consequence: they don't know each other, they don't trust each other, and when there's a problem, they don't work together. Obviously, this is not a good work environment. As a developer, I'm interested in how my code behaves in real conditions, and I would like helpful bug reports instead of "it doesn't work, I rollback". Knowing and working with the sysadmins can provide that.

7. Do developers communicate with the client?

OK, this one will horrify some project managers and a few developers. I know that you want to control all the interactions with the client, and that developers sometimes have poor communication skills. But if you put a few layers between the developer and the client, this is what the developer will see: specifications that don't make sense, and useless bug reports. Developers are problem analysis machines, so they can understand the needs of your client, and see right away what they imply for architecture, technology and performance. Use their insight, and they will feel useful, and produce better software. Also, if you can, send a developer to watch how the client works with the software for a bit. In 5 minutes, they will spot more bugs and usability problems than in 5 weeks of bug reports.

8. Do developers retain copyright on the work done in their spare time?

When you're passionate about development, you often have ideas, itches to scratch, and you may not be able to develop them at work. But a lot of contracts have non-compete clauses and other clauses giving copyright of ALL your work to the employer. As an employer, it's a way to protect the company, but as a developer, it is scary: you can't use or sell the code written in your spare time. Let your developers work on what they want when they're not in the office, and you will profit from the experience they gained developing their side projects (but state clearly that they work for you during office hours).

Bonus

I also have 4 bonus questions. They're optional, but they don't hurt.

  • Do the developers participate in <LANGUAGE> user groups or developer meetups? In these meetups, they will learn a lot, and if they're experts and enjoy working for you, they will attract other developers.
  • Will I have a technical manager? It is reassuring for a developer to know that his manager has a clue about development, knows the difference between a good and a bad developer, can understand his problems and stand up for his team.
  • Do you use recent version control software? Joel already asked this, but it needs to be more precise. Version control systems have improved a lot since his test so, if you can, avoid old stuff like CVS or (worse) SourceSafe. Subversion is fine for most setups now (even on Windows), and if you use Git or Mercurial, I will be reaaaally happy.
  • Do you accept remote working? There are a lot of tools to communicate online: mail, IM, VoIP, web project managers and bug trackers, and developers often know very well how to use them. It's comforting for the employer to know that the developer is always at arm's length, but this does not mean they're more productive. If you remove distractions from the work environment (phone, colleagues), the developer can be more productive (yes, there are actually FEWER distractions at home). Also, they won't waste time commuting.

Now, if you're a developer, rate your own company or future job. If you're a manager, rate your team, and please, please, on behalf of all the developers out there, try to get the perfect score!

How much did you get?

Smalltalk for engineers

For more than a year, I have been playing with Smalltalk, and more specifically the Pharo project, and I had a lot of fun! Now, I'd like to share this experience. I saw a lot of introductions to Smalltalk, but they were all about its amazing features from a CS point of view. I'm a software engineer, so I'll give you a more pragmatic look, with a few useful tips.

When you hear about Smalltalk, you imagine old bearded guys, clinging to their outdated language. In reality, this is what I saw: a small yet growing community full of nice and motivated people, enjoying development and innovating every day. Even if the language is old, they're keeping it up to date with today's standards: JIT compiling, web development, iPhone port... I strongly encourage you to take a look and, maybe, participate!

First look: the interface

The first impression is the most shocking: you don't understand what you can/should do with that empty window. It is not the code editor you would expect. It is an entire world, full of living objects. The behaviour of these objects is described by code, but they're not programs with a beginning and an end. For a good impression of that, try closing the environment (don't forget to save) and starting it again: your windows are still open, in exactly the same place! Even selected text is still there! Your objects' lives don't end when you close the environment: they're serialized in the image file.

The environment is composed of a virtual machine executing the code, an image file containing the objects, and a source file, storing a part of the source code. And that's all. No other files.

UPDATE: the GNU Smalltalk environment uses files to store source code.

In the previous picture, you will see the code browser, used for everyday development. It doesn't display files, but (from left to right) categories, classes, method categories, methods, and under that, the actual code for the method. The code editor is organized around the actual structure of the code, not some arbitrary folder tree. It can be confusing at first, but it's actually quite elegant. There's a drawback though: you can't use your favorite code editor to write Smalltalk code. Another nice side effect of the image: I store my environment on a USB key, and can use it to work seamlessly on Windows, Mac and Linux (using the one-click Pharo image).

Second look: the language

The language itself is another surprise: what are those ifTrue and whileTrue? You might not expect it, but Smalltalk has no dedicated syntax for control flow. In Smalltalk, everything is an object. And the primary way of interacting with an object is sending it a message. The whole syntax of Smalltalk revolves around messages:

  • "1+2/3" is not equal to 5/3, but to 1, because you send to 1 the message "+" with argument 2, which gives you 3, and you send this result the message "/" with argument 3.
  • ifTrue is a message sent to a boolean, with a "block" as argument (a block is a piece of code). The block will be executed if the boolean is an instance of True.
  • you can't directly access the members of an object: you need to create messages to read and modify these members.
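If you come from another language, a rough Python imitation of the first two points may help. This is only a sketch: `eval_binary_messages`, `STTrue` and `STFalse` are names I made up for the illustration, the evaluator only handles non-negative integers with + - * /, and real Smalltalk obviously does not work by parsing strings.

```python
from fractions import Fraction
import operator
import re

# Binary "messages" understood by our toy numbers.
OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "/": operator.truediv}

def eval_binary_messages(expr: str) -> Fraction:
    """Evaluate strictly left to right: each result receives the next message."""
    tokens = re.findall(r"\d+|[+\-*/]", expr)
    receiver = Fraction(tokens[0])
    for op, arg in zip(tokens[1::2], tokens[2::2]):
        receiver = OPS[op](receiver, Fraction(arg))
    return receiver

smalltalk_style = eval_binary_messages("1+2/3")               # (1 + 2) / 3
usual_precedence = Fraction(1) + Fraction(2) / Fraction(3)    # 1 + (2 / 3)

class STTrue:
    def if_true(self, block):
        # The boolean itself decides to run the block.
        return block()

class STFalse:
    def if_true(self, block):
        # The block is ignored; None stands in for Smalltalk's nil.
        return None

ran = STTrue().if_true(lambda: "block executed")
skipped = STFalse().if_true(lambda: "block executed")
```

Here `smalltalk_style` is 1 while `usual_precedence` is 5/3, and there is no `if` statement anywhere in the two boolean classes: the choice is made by which class receives the message.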

The methods are separated between the class side and the instance side. The classes are objects, so they have their own methods (think of it as static methods). They're used for a lot of things, like generating common instances (String crlf, Color blue, etc), or starting servers.

If you take a good look at a class like String, you will spot apparently redundant methods like displayOn:, displayOn:at:, displayAt:, displayOn:at:textColor:. They're not redundant: displayOn: calls displayOn:at:, which calls displayOn:at:textColor:. This is actually very elegant, because it keeps methods small and readable.

Keep that in mind when you're developing in Smalltalk: readability is more important than speed, because the time you gain now will be wasted the next time you try to read your code.
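The same delegation pattern can be sketched in Python. The class `Text`, the list used as a canvas and the default values (position (0, 0), color "black") are assumptions for the example; I did not check what the real Smalltalk methods use as defaults.

```python
class Text:
    def __init__(self, content: str):
        self.content = content

    def display_on(self, canvas: list) -> list:
        # Shortest form: fills in a default position and delegates.
        return self.display_on_at(canvas, (0, 0))

    def display_on_at(self, canvas: list, position: tuple) -> list:
        # Fills in a default color and delegates again.
        return self.display_on_at_text_color(canvas, position, "black")

    def display_on_at_text_color(self, canvas: list, position: tuple,
                                 color: str) -> list:
        # Only the most specific method does the actual work.
        canvas.append((self.content, position, color))
        return canvas

canvas = []
Text("hello").display_on(canvas)
Text("world").display_on_at_text_color(canvas, (10, 20), "blue")
```

Each method is one line long, and the drawing logic lives in exactly one place.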

Next: the tools

You saw the code browser, but there are other nice tools designed to help you every day.

Monticello is a distributed versioning system integrated in the environment. Nothing special here: it tracks your changes, creates revisions, and supports local (folders) and remote (HTTP, FTP) repositories.

There is a test runner that displays all the tests loaded in your environment. You will see that there is a very good coverage, but it is not enough! Contribute a test or two if you have time.

Last but not least: the refactoring browser. It is an amazing piece of code which analyzes your classes, points out design mistakes, and in some cases, can correct them for you.

OK, now, what can I develop?

You can do about anything in Smalltalk, like other languages: desktop applications, web applications, use databases, network protocols, REST APIs... It is particularly suited for big applications with complex object models.

For desktop applications, you will easily have cross platform code and UI, but you won't be able to use native windows (at least, not with Squeak or Pharo). For web applications, you can choose between these frameworks: Seaside, Iliad and Aida. Each one has a different philosophy, and different strengths, so try them all out!

Developing in Smalltalk has been an amazing experience: I learned a lot, and the concepts and habits I picked up are easily applied to other languages. Now, I just need a way to work in Smalltalk for my day job :)

Gorgeous spammer wants to add you as a friend

Yesterday, the new Facebook Messages interface was introduced. Huzzah! You get an @facebook.com email address, unification of IM and email, conversation history, etc. That sounds cool! And what is that new feature called "social inbox"? That's nice, messages from your friends will be prioritized and appear directly in your inbox, and other emails will go to the "others" box. Wait, what?

This feature is meant to help you waste time, sorry, connect efficiently with your network. I won't go into the analysis of how your email contacts are not always friends, even the important and regular ones, how Google will react, or how we will send emails with no subject line. I'm sure someone will talk about that at length. Instead, let's talk about these nice people always interested in becoming our friends, selling us cheap software and enlarging our pe... bank accounts: spammers.

In the old world of regular email (yes, old, we're in Web 2.0, Gmail is soooo last week), we had spam filters. A lot of methods were developed to protect us: blacklisting, whitelisting, greylisting, Bayesian filters, SMTP verification, CAPTCHAs, etc. They're not all efficient, but services like Gmail are really good at catching unwanted email. Spamming is an activity with a very low conversion rate: you have to send thousands of emails just to get one gullible person to click and buy. Thankfully, emails are cheap to send. But we could improve that conversion rate. Facebook just did it.

With new Messages, your Inbox will only contain messages from your friends and their friends. All other messages will go into an Other folder where you can look at them separately.

Put yourself in this situation. A gorgeous woman/man/dog wants to be your friend on Facebook. Will you accept her/him? Let's say she has more or less the same tastes as you (it's surprisingly easy to get the list of a band's fans, same for books, political views, etc.). Not yet? Let's say you have friends more gullible than you. The hot woman is a friend of another friend and sends you a message in these terms: "Hi! We met at <gullible friend>'s party a few months ago, I had a really good time talking to you". You just accepted the friend request, admit it. And a few days/weeks later, she will begin sending you messages about great opportunities like Ponzi schemes or Nigerian scams. And YOU WILL CLICK! Because it will appear directly in your inbox. Because it comes from one of your friends, someone you more or less trust.

Facebook just gave spammers direct access to your inbox, and offered them targeted advertising, thanks to all the groups, likes, music and book fan groups. Spammers are considered dumb, because they automate a lot. But thanks to Facebook's social features, they will learn to customize their mails, just for you. They will pay cheap workers to talk to you through the fake accounts, they will get you, your friends and your family, and will be a part of your great friends network.

Thanks to Facebook.