Handling IO failure · Unhandled Expression

Let's talk a bit about IO programming. Filesystem, network, GUI... You cannot write useful code without doing IO these days. So why is it so damn hard to do in "safe" languages like Haskell?

Well, in Haskell, you isolate the unsafe parts to be able to reason safely about the rest of the code. What does "unsafe" mean in that context? Mostly, unsafe code is unpredictable, non-deterministic and will fail for a number of reasons independent from your program's code.

Surely, you might think "come on, it cannot be that hard, I'm doing HTTP requests everywhere in my code, and there is no problem". Well, let's see how a simple HTTPS request could fail:

you are disconnected from the network (you have no IP)
you are connected to the network, but it is not connected to anything
your network is connected to the Internet, but routers are dropping packets
your network is connected to the Internet, but very slow
your DNS server is unreachable
your DNS server drops your packets
your DNS server cannot parse your request
your DNS server cannot contact other servers to get your answer
your DNS server sends back an invalid response
your DNS server sends back an outdated response
you cannot reach the web server's IP from your network
the web server drops your packets silently before connecting
the web server connects, then drops the connection silently
the web server rejects your connection
the web server cannot parse your packets, and so, rejects them
the web server times out
the server's certificate is expired
the server's certificate is not for the right subject name
the server's certificate chain has parts missing
the server's certificate chain has an unknown root
the server's certificate was revoked
the packet's signatures are invalid
your user agent and the server do not support the same versions of TLS
your user agent and the server do not have common cipher suites
the web server closes the connection without warning
the web server times out
the web server crashes
the web server cannot parse your HTTP request and rejects it
your request is too large
the web server parses your HTTP request correctly, but your cookie or OAuth token is invalid
the data you requested does not exist
the data you requested is elsewhere
your user agent does not support the MIME type of the data
the data requested is too large for a simple response
the server only sends a part of the data, then drops the connection
your user agent cannot parse the response
your user agent can parse the data, but some way or another, it is invalid

If you have worked for some time with networks, all of those have probably happened to you at some point (and the list is nowhere near exhaustive). What did you do in your code? Did you handle all these failures? Did you catch all the exceptions (see what I did there)? Did you check all the error codes? Did you retry the requests where you needed to?

Let's face it: most of the network handling code out there is made of big chunks of procedural code, without much error handling, in blocking mode. In most cases, it is ok. But that is sloppy programming.

Safe languages do not allow you to write sloppy code like that. So, we are stuck between correct but overly complex code, and simple but failing code. Choose your weapons.

Personally, I prefer isolating unsafe code in asynchronous systems like futures or actors. I know failure will happen, I know threads will crash, I know I will make errors in my code. That is ok, it happens. So, let's write robust code to handle failure.
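To make the idea of isolating unsafe code behind a future a bit more concrete, here is a minimal sketch in Python using the standard `concurrent.futures` module. The names `flaky_io` and `run_isolated` are made up for the illustration, and `flaky_io` is only a stand-in for a real IO call. The point is that the failure happens on a worker thread and reaches the rest of the program as an ordinary value.

```python
from concurrent.futures import ThreadPoolExecutor


def flaky_io(should_fail):
    # Hypothetical stand-in for a real IO operation: fails on demand.
    if should_fail:
        raise ConnectionError("server unreachable")
    return "payload"


def run_isolated(task, *args):
    """Run task on a worker thread and turn its outcome into a plain value."""
    with ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(task, *args)
        # exception() blocks until the task is done, then returns the
        # exception it raised, or None if it completed normally.
        error = future.exception()
        if error is not None:
            return ("failed", error)
        return ("ok", future.result())
```

The caller never sees an exception from the IO code itself. It gets either `("ok", value)` or `("failed", error)` and decides what to do with it.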

For network errors, I just want to know if the server is unreachable. It is ok, I will try later. If my request's authentication is rejected, I want to know, and must handle that failure. Some errors should be handled seriously, others must be put in the "ok, it failed, whatever" bin.
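Sorting errors into those two bins could look something like the following Python sketch. It assumes transient network problems surface as the built-in `ConnectionError` or `TimeoutError`, and it uses a hypothetical `AuthRejected` exception for the authentication case. A real HTTP library will have its own exception types, so treat the names here as placeholders.

```python
import time


class AuthRejected(Exception):
    """Hypothetical error: the server refused our credentials."""


def call_with_retry(request, attempts=3, delay=0.0):
    """Retry transient network failures; let everything else propagate."""
    for attempt in range(1, attempts + 1):
        try:
            return request()
        except (ConnectionError, TimeoutError):
            # The "ok, it failed, whatever" bin: try again later,
            # unless this was the last allowed attempt.
            if attempt == attempts:
                raise
            time.sleep(delay)
```

`AuthRejected` is deliberately not caught: it goes straight up to the caller on the first attempt, because retrying with the same bad credentials will not help.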

Even if languages like Haskell make it harder to perform IO safely, they are still good tools, because they let you isolate the unsafe parts and reason about the safe, deterministic parts of the program.

P.S.: ok, the network case was maybe a bit too much. Surely, filesystem usage will be easier? Just for fun, let's list some possible failures when you want to open a file for reading and writing:

invalid path
correct path, but you do not have the permission
correct path, you have the permission, but the file does not exist
you do not have the permission to create the file
you check that the file does not exist, then you try to create it, but someone already created it in the meantime (fun security bug, that one)
the file exists, but someone is already writing to it, and concurrent access is not allowed
you have the handle you want on the file, but someone just deleted it
not enough file descriptors available (oh, please, no)
someone is writing to the file at the same time
there are so many page faults that your program is slowed down
the disk is slow, blocking on a large operation
the disk is full
you checked that you have enough room, but someone is filling the disk at the same time
the file is on a networked file system, and it is slow
the file is on a remote disk, and the network just failed
hardware failure in the disk
hardware failure in the RAID array (and for some reason, redundancy was not enough, you lost the data)
the file is on a USB card that someone just unplugged
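The "check that the file does not exist, then create it" race from the list above is one of the few items that has a fairly standard answer: skip the check and ask the OS to create the file exclusively, in one step. Here is a small Python sketch of that idea. The function name `create_exclusively` is made up for the example, and it only covers that one failure; the others on the list still apply.

```python
def create_exclusively(path, data):
    """Create path and write data only if the file does not already exist."""
    try:
        # Mode "x" creates the file and fails if it already exists, as a
        # single operation, so there is no gap between check and create.
        with open(path, "x") as handle:
            handle.write(data)
        return True
    except FileExistsError:
        # Someone else got there first; the existing file is left untouched.
        return False
```

The first caller gets `True` and owns the new file. Anyone arriving later gets `False` instead of silently overwriting it.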

Basically, IO is a nightmare. Please wake me up now.
