My first involvement with HTTP and the web came in 1992. Challenged to create a MUSH as a means of delivering online education, I got caught up in the information-and-Internet zeitgeist of the time and built a browser and web server. I had never seen or heard of the web before; the closest things I had seen were Veronica, Gopher, and of course Archie. Archie was accessed via telnet, and was kind of far from graphical.
The HTTP 0.9 protocol was not yet known by that name, and was exceptionally simple. You would telnet to port 80 on some host, type 'GET /path', and it would return the file as-is. If you knew what to do with the result, you were good. Initially it was thought that only text would be used (no fonts, no CSS, no images), so this was fine.
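For the curious, that exchange can be sketched in a few lines of Python. This is a minimal sketch, not anything from 1992: the function names are made up for illustration, and it assumes the server simply closes the connection once the document has been sent, which is how an HTTP 0.9 response ends.

```python
import socket


def build_http09_request(path: str) -> bytes:
    """An HTTP 0.9 request is one line: no version, no headers."""
    return f"GET {path}\r\n".encode("ascii")


def fetch_http09(sock: socket.socket, path: str) -> bytes:
    """Send the request on an already-connected socket and read until EOF.

    There is no status line and no Content-Length; the end of the
    document is signalled by the server closing the connection.
    """
    sock.sendall(build_http09_request(path))
    chunks = []
    while True:
        chunk = sock.recv(4096)
        if not chunk:  # empty read means the peer closed the connection
            break
        chunks.append(chunk)
    return b"".join(chunks)
```

To try it against a real host you would pass in a socket from socket.create_connection((host, 80)), though few servers today still answer a bare HTTP 0.9 request.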
In the system I built (CALVIN, Computer-Aided Learning Vision Information Network), a C++-based fork+serve web server managed the file serving. All files were treated equally: the path you gave was the path it served. An X-Windows + Motif-based client with a simple HTML widget was the other end of this, running on a DECstation 3100. While implementing this I had an idea. Why not guess the type based on the file extension? This way I could handle an image and invent some sort of image tag for HTML. The img tag had not yet been invented (and the specs, such as they were, were nowhere easy to find). I think I chose <image path> rather than the <img src=path> which was later standardised.
So I forged ahead. I did some sort of strtok() on the file name, looked at the string after the dot, and if it was jpg or gif or pnm, would render appropriately. Life was simple then. Got the project done, did the presentation, got the grade, got out. The X interface leaked memory like a sieve so the demo was short 🙂
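The extension trick looks roughly like this, sketched in Python rather than the original C++. The table, the function name, and the MIME-type strings are assumptions for illustration, not a reconstruction of the 1992 code.

```python
import os

# Hypothetical lookup table covering the three extensions mentioned above.
EXTENSION_TYPES = {
    "jpg": "image/jpeg",
    "gif": "image/gif",
    "pnm": "image/x-portable-anymap",
}


def guess_type_from_extension(path: str, default: str = "text/html") -> str:
    """Guess a content type purely from whatever follows the last dot."""
    _, ext = os.path.splitext(path)
    # splitext keeps the leading dot (".gif"), so strip it before the lookup.
    return EXTENSION_TYPES.get(ext.lstrip(".").lower(), default)
```

Anything without a recognised extension falls through to the default, which is exactly the weakness: the file name is trusted and the bytes are never inspected.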
Fast forward to 2020. The standards have evolved somewhat, and a header called Content-Type now exists for this purpose. The server is responsible for telling the client how to interpret content, and a well-behaved client should never guess what to do based on the extension (sorry, 1992 me). You see, since 1992, the web has become a less simple, less safe space. Malicious actors discovered they could send active content to be evaluated by Internet Explorer's aggressive MIME-type-guessing algorithm, and thus gain control of the desktop.
History suggests that, for each new security hole in HTTP, a new header is created, and this flaw was no exception. Enter the X-Content-Type-Options header. In proper use, one adds:
X-Content-Type-Options: nosniff
to the HTTP response. The browser, on receipt, trusts the Content-Type the server declared and does not apply its own sniffing algorithm. Security achieved!
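In practice, the part of nosniff that bites is the check on scripts and stylesheets: with the header present, a browser refuses to run a script whose declared type is not a JavaScript type, and refuses a stylesheet that is not text/css. Below is a simplified sketch of that decision. The function names and the plain-dict header lookup are assumptions for illustration; real browsers follow the Fetch standard, match header names case-insensitively, and recognise a longer list of JavaScript types.

```python
# A small subset of the MIME types browsers accept as JavaScript.
JAVASCRIPT_TYPES = {
    "text/javascript",
    "application/javascript",
    "application/x-javascript",
    "text/ecmascript",
    "application/ecmascript",
}


def mime_essence(content_type: str) -> str:
    """Drop parameters such as '; charset=utf-8' and normalise case."""
    return content_type.split(";", 1)[0].strip().lower()


def blocked_by_nosniff(destination: str, headers: dict) -> bool:
    """Return True if a nosniff-aware client would refuse the response.

    `destination` is how the page intends to use the resource:
    "script", "style", or anything else.
    """
    if headers.get("X-Content-Type-Options", "").strip().lower() != "nosniff":
        return False  # header absent: the client is free to sniff
    mime = mime_essence(headers.get("Content-Type", ""))
    if destination == "script":
        return mime not in JAVASCRIPT_TYPES
    if destination == "style":
        return mime != "text/css"
    return False
```

A script served as text/html with nosniff set is refused; the same script without the header, or with a JavaScript type, is not.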
Fast forward to today. As an experiment in magic proxy forwarding zero-trust mumbo jumbo logic, I exposed my printer to the Internet (only for authenticated users with valid roles, stop accusing me of helping the Mirai botnets out). And, to my chagrin, it didn't really work: all pages were blank. On digging into it I found that, to my simple-minded printer, all MIME types are text/html; see below.
curl -v http://printer/sws/util/cookie.js
* Trying 172.16.0.222:80...
* TCP_NODELAY set
* Connected to printer (172.16.0.222) port 80 (#0)
> GET /sws/util/cookie.js HTTP/1.1
> Host: printer
> User-Agent: curl/7.68.0
> Accept: */*
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
< Connection: close
< Content-Type: text/html
< Content-Length: 621
< Cache-Control: max-age=0, no-store, no-cache
<
function CreateCookie(name,value,days) {
var expires = "";
if (days) {
var date = new Date();
date.setTime(date.getTime()+(days*24*60*60*1000));
expires = "; expires="+date.toGMTString();
}
document.cookie = name+"="+value+expires+"; path=/";
}
function ReadCookie(name) {
var nameEQ = name + "=";
var ca = document.cookie.split(';');
for(var i=0;i < ca.length;i++) {
var c = ca[i];
while (c.charAt(0)==' ') c = c.substring(1,c.length);
if (c.indexOf(nameEQ) == 0) return c.substring(nameEQ.length,c.length);
}
return null;
}
function EraseCookie(name) {
CreateCookie(name,"",-7);
}
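And the web application firewall exposes it with the security headers set properly, nosniff included. So the browser does exactly what it was told: it takes the printer's text/html at face value, declines to run the scripts, and the pages come up blank. So now we get into how to do this securely.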
Option 1, we delete the X-Content-Type-Options.
Option 2, we remap the individual files to their mime-types.
Option 3, we do the same mime-type-from-extension trick I did in 1992.
Option 4, uh, don’t put the printer on the Internet.
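Options 2 and 3 can live in the same place: a hook in the proxy that rewrites Content-Type before the response reaches the browser. The following is a hedged sketch of that idea, not any particular firewall's API. The function name and the override table are hypothetical; the one path in the table is taken from the curl output above, and the fallback uses Python's standard mimetypes module.

```python
import mimetypes
from urllib.parse import urlsplit

# Option 2: explicit per-file overrides (hypothetical table; the path is
# the one from the curl output above).
OVERRIDES = {
    "/sws/util/cookie.js": "text/javascript",
}


def corrected_content_type(url: str, upstream_type: str) -> str:
    """Pick the Content-Type to send downstream for a proxied response."""
    path = urlsplit(url).path  # ignore any query string or fragment
    if path in OVERRIDES:
        return OVERRIDES[path]
    # Option 3: fall back to guessing from the extension, 1992 style.
    guessed, _encoding = mimetypes.guess_type(path)
    # If the extension is unknown, keep whatever the printer said.
    return guessed or upstream_type
```

The explicit table wins over the guess so that known files get a fixed answer; the guess only covers files nobody has listed. Either way, the guessing now happens once, in a place you control, instead of in every browser.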