Monday, January 23, 2012

Call Me Rip Van Webble

Before I Fell Asleep

As best as I can tell, I fell asleep in the winter of 1998. The dot com craze was full of irrational exuberance and why shouldn't it have been? In a few short years, the entire world went from being mainly disconnected with a few islands of lameness, to a global Internet where everybody was rushing to put up an online presence even if they weren't entirely sure why that was a good thing. Clearly there was some there there though: through the magic of HTML forms you could click one button and get pizza, pet food and tickets to ride. Sure they looked a little janky and the web certainly wasn't as friendly as a native app, but none of that really mattered. The choice wasn't between native apps and the web, it was the choice between the web and nothing at all. HTML at first was a pretty simple beast. You could pick up the gist of how to do things in a matter of hours. But things were changing: Microsoft -- surprising for a big company -- sensed an existential threat and came out with Internet Explorer as its answer to Netscape. So ensued the browser wars, and with them the quest to one-up each other. Back in the early days, CSS was barely being talked about and the web world wasn't quite sure whether java plugins were the way to go, or something else. There was this funky thing called javascript, but it wasn't what respectable people used because it was riddled with incompatibilities, was slower than molasses, and was generally useless. No, you did what nature intended: you built your pages on the web server. Duh. And you went to great pains to deal with browser incompatibilities due to the immaturity of the web and, of course, the politics of world domination. You also tried as hard as you could to make sure what got sent over the wire wasn't too piggish: dial up modems were the norm even if some crazies were talking about using the forlorn dregs of the cable spectrum for something called broadband. Again javascript, like flash, was a target for disdain as useless bloat. Life was simple in those days because expectations were so low: the amazing thing wasn't that the pig sang well, it was that the pig sang at all. The last thing I remember before nodding off to sleep was something about "push" being the Next Big Thing. It was that bit of hype, I'm convinced, that caused my eyes to glaze and then shut...

After I Awoke

When I awoke from my long sleep the web world had changed quite a lot. Many things were as familiar as ever, but many were different. My first introduction was through my project Phresheez. Phresheez is my app and site which does all kinds of fancy things to visualize your skiing/boarding days in new and innovative ways. Phresheez started as a trip to Kirkwood to ski in April of '08 with a Garmin handheld GPS in my pocket. When I got home, I uploaded all of those points and hacked together a crude bread crumb map using the Google Maps API. At the time I was rather confused as to exactly who should be creating the necessary javascript code: should it be generated by the server-side PHP, or what? In retrospect that's completely obvious, but it shows where I was before I went to sleep -- the server side drove everything. In any case, after seeing the Google map of my day I was hooked. This was seriously cool, and it could be done on any old browser! Wow! So off I went playing around with Phresheez, mining the data sets from my last few days of skiing for all kinds of interesting goodies. At first, Phresheez was nothing more than a web site that required a rather onerous way to upload points via GPX files. But it was a start. Sites require all kinds of supporting logic for user management, admin, etc, etc, so I went about designing those. While I had been asleep, lots of things had been designed to make that kind of drudgery slightly less of a drudge. But I didn't know about them, and when I asked around, the general consensus, much to my surprise, was that it's best to roll your own. That turns out to be not *entirely* correct, but I didn't know any better. So to nobody's surprise my first pages were ultra-static, server-produced blobs that were as unwieldy as they were unbeautiful. If you took my pages to the Antiques Roadshow, they would have correctly dated them circa 1995. There was a paradox on the site though: the map visualization page was quite alive in comparison, and I was constantly improving it: be able to see your avatar move around the resort? Sure! Be able to see your friends at the same time? No prob! Be able to add geotagged pics that pop up when your animated avatar gets there in time? Wow, that's cool! So it was that the seeds were sown for a new way to think about this. And it was all done because of, not in spite of, javascript. While I was asleep, on the other hand, I had recurring nightmares of the horrors of browser incompatibility, buggy javascript, and above all else that javascript was dog slow at doing just about anything. But now that I was awake and actually playing around, it was remarkable to me how compatible everything actually was. Well, there was IE6 but who cares about an ancient piece of M$ crap anyway. So even with Firefox 1.5 things actually worked pretty well, and when I upgraded to Firefox 2.x javascript performance was noticeably faster. Very cool.

Smart Phones and Hybrid Apps

The other thing that happened around 2008 is that iPhone apps started really hitting the shelves, and it was obvious that this wasn't a passing fad. Then Google announced Android, which had very similar characteristics to the iPhone, and it became pretty obvious to me at least that since M$ and RIM both had their heads up their butts, the players in the inevitable mobile death match were likely set. For Phresheez, iPhone & Android were a godsend because they both had integrated GPSes: no more clunky uploading of GPX files; just send the points directly from the phone to the server like magic and enjoy. So off Phresheez went. At first the mobile apps were nothing more than glorified point uploaders with a bare-bones set of things they would show you: server-side created charts and static maps with your tracks and a few other things. It became clear after using the app over the winter of '08/'09 that it had the potential for much more. The first foray into this was a friend finder. At first I thought about writing it using the native UI, but with both iPhone and Android to support, that seemed like a fool's errand. So I learned about embedded web browser objects. After some fiddling around it turned out that not only were the embedded web browsers acceptable, they were pretty state of the art on top of it. So it was that the first bits of my hybrid web/native app were born: completely out of necessity, but a virtue, as it turns out, as well.

Just Using Javascript

At the same time, the features kept rolling in: being able to see your animations on the phone, being able to friend people and see what they were doing, etc, etc. Since I stole quite a few of the ideas, if not the code, from the main site's javascript-heavy Google Maps based pages, it became completely natural to just write all of the layout in javascript: why did the server side need to worry about that anyway? In fact, the server side was really sort of in the way: all of the data was demand loaded using AJAX, so its piece in the display layout was pretty limited by necessity. The other thing that drove this is that since it was being built as the in-app user interface, you don't have or want any of the normal browser chrome, so the web portion of the app needed to be both self-contained and long lived. At first I was very cautious about this because I had no idea just how much javascript would cause the app to roll over and die. As it turns out, it must be quite a lot because I don't think I've ever run into that limitation. So it was that the server side on most user facing pages was becoming nothing more than a boot loader for the javascript, CSS, etc. And this was a Good Thing, though not any sort of virtue in my mind. Another curious thing was also happening: as I became more familiar with javascript, I found myself moving more and more layout from the server side into the javascript side generally. It was nice to add interactivity to the dowdy static pages. As familiarity grew, it was easier to envision just coding the content layer up in javascript rather than server-side. I'll say that even to this day I still feel more comfortable starting out a new page as a server generated page. Why? Two reasons: 1) old patterns are hard to get past and 2) I still don't have a framework where I can cut and paste a javascript/AJAX method template more easily than print'ing '<html></html>' server side. Both of these are my own failings though, not anything inherent.
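To make that concrete, here is a minimal sketch of the pattern -- not Phresheez's actual code. The server's only job is to answer an AJAX call with a json blob; all of the layout gets built client side. The /ajax/friends.php endpoint, the field names, and the friends list element are made up for illustration.

    // Hypothetical demand-loaded friend list: the server returns raw json,
    // the client turns it into markup.
    function loadFriends() {
      var xhr = new XMLHttpRequest();
      xhr.open('GET', '/ajax/friends.php', true);
      xhr.onreadystatechange = function () {
        if (xhr.readyState !== 4 || xhr.status !== 200) return;
        var friends = JSON.parse(xhr.responseText); // raw content, no markup
        var list = document.getElementById('friends');
        list.innerHTML = '';                        // layout happens here, in the client
        for (var i = 0; i < friends.length; i++) {
          var li = document.createElement('li');
          li.appendChild(document.createTextNode(
              friends[i].name + ' @ ' + friends[i].resort));
          list.appendChild(li);
        }
      };
      xhr.send();
    }

The PHP behind a hypothetical /ajax/friends.php shrinks down to a database query and a json_encode() of the result, which is exactly the "server as raw content pump" role described above.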

HTML 5 Widgets

The main problem with a web app within a native app, however, is that it unsurprisingly needs to be loaded off the net if you want it demand loaded as opposed to statically included in the app's resources. We wanted the dynamic load aspect because of our complete and justified aversion to dealing with the Apple App submission process and its twisty mazes of Fail. But start up times are seriously unhappy, especially with crappy data service like you find in many ski resorts and most of San Francisco. Back then, we also had to contend with the iPhone 3G, which didn't have the ability to background apps, so each time you fired up the app it would have to load its UI off the net. Fail. Something needed to be done about this. So on we went to the next season, and I got wind of a thing that was floating around the HTML5 world: widgets. I really knew nothing about HTML5, but the widget idea was a simple one: package up all of the parts of a web page into a zip ball, write a manifest for it, and provide an interface to fetch it. Here was the solution to the nettlesome problem of not wanting to reload the UI fresh from the server each time (realize that browser caching on phones sucks for the most part). The other important part of the widget approach is that it buys a huge amount of agility. Going through the app approval process with Apple is nothing less than a Gothic horror. And even with Android, you still have to contend with the fact that if you put a new version out, a sizable percentage of the user population won't bother to upgrade. So the widget approach neatly solves these problems in a webby-like way, but without a lot of the webby downsides. The other thing that was happening around this time is that we discovered that what we were really building wasn't a web app, per se. It was a replacement for the UI front end of a native app with a bidirectional RPC mechanism to effect change in both the web and native parts of the app. At first this was very tentative, but it has evolved to the point that almost the entire UI is dealt with in the webview, which reaches out to native functionality as needed. It's not a perfect RPC interface by any stretch of the imagination, so if you have something like a game or other needs that require real-ish time components, I really doubt that this webview architecture is appropriate. For now at least. But there are a huge number of perfectly interesting applications whose requirements are actually quite modest, for which this architecture works just fine.
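For the curious, here is a rough sketch of what the javascript half of that webview/native RPC can look like. None of this is the actual Phresheez interface: the phresheez:// URL scheme, the window.PhresheezNative object, and the callback bookkeeping are stand-ins. The native side has to supply the matching hooks (shouldStartLoadWithRequest on iOS, addJavascriptInterface plus loadUrl("javascript:...") on Android).

    // Hypothetical web -> native -> web round trip from inside the webview.
    // The scheme and object names below are stand-ins, not a real API.
    var callbacks = {};
    var nextId = 1;

    // Web -> native: ask the native layer to do something (eg, read the GPS).
    function callNative(method, args, cb) {
      var id = nextId++;
      callbacks[id] = cb;
      if (window.PhresheezNative) {
        // Android: a Java object injected via addJavascriptInterface
        window.PhresheezNative.invoke(method, JSON.stringify(args), id);
      } else {
        // iOS: encode the call in a URL that the native delegate intercepts
        window.location = 'phresheez://' + method + '?id=' + id +
                          '&args=' + encodeURIComponent(JSON.stringify(args));
      }
    }

    // Native -> web: the native side evaluates a call to this function in the
    // webview when the result is ready.
    function nativeReply(id, resultJson) {
      var cb = callbacks[id];
      delete callbacks[id];
      if (cb) cb(JSON.parse(resultJson));
    }

    // Usage: callNative('getLocation', {}, function (pos) { /* plot pos */ });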

Where the World Actually Was

So I really didn't have much clue about where the web world actually was. I had heard about things like Ruby on Rails but really didn't understand how it would benefit me: it always seemed like a mental disconnect. Part of the reason I never investigated my disconnect was some good advice from Dan Scheinman at Cisco: Ruby programmers expensive, PHP programmers cheap. Yes, I knew that Rails-style frameworks exist for Python, PHP and probably most other languages these days, but it didn't occur to me to wonder whether I should care. So I didn't. As it turns out, maybe I should have cared, but not in an embrace-it kind of "care". The reason it gave me such a disconnect is that I wasn't approaching the Phresheez problem even remotely the same way as the server-centric content generation approach. Ruby and all of the other things like it grew up when javascript sucked and was something to use mostly at your peril, and then mostly to do flashy things that, if they broke, left the site still operable. To some people, the latter was and probably still is a big driver, but I didn't care: get a semi-recent browser or get lost. With my mobile web app it was even less of a consideration because I knew the capabilities and deficiencies of the embedded browsers I had to deal with: webkit, I <3 you. So where most of the world was dealing with server-side content layout generation, I was at this point doing nearly all of my content generation client side. Disconnect detected, at least.

The World Has Changed

When I started thinking about this it finally became obvious why the world was the way it was: it was largely still operating on the assumptions from 7 or 8 years ago, when javascript was an untrustworthy lout to be avoided if possible. This is hardly surprising, as it takes a long time for new technology to percolate through and for people to grok that a shift has happened. Because I was asleep, I didn't realize that it had happened; I merrily went ahead using the web world as I found it, expecting change even if it took me a while to get it. But now I think I do get it. The old server-side layout pattern is a dead man walking. The speed of javascript has been increasing at an amazing clip, and the new goodies of HTML5, including the canvas tag's vector graphics, mean that you can do a tremendous amount of work using somebody else's CPU cycles: namely, the client device's. The server side in the new world is for two main purposes:
  1. Create skeletons of a page's static programs and data. This is like the web equivalent of a boot loader (sketched below).
  2. Serve up dynamic raw content in the form of json blobs from AJAX callbacks.
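In other words, a user-facing page in the new world can be served up as skeletally as the following; the file names and the App.boot() entry point are hypothetical:

    <!-- Purpose 1: the "boot loader" skeleton (file names and App.boot() are
         hypothetical).  No content here at all. -->
    <html>
      <head>
        <link rel="stylesheet" href="/css/app.css">
        <script src="/js/app.js"></script>
      </head>
      <body onload="App.boot()">
        <!-- empty mount point: everything the user sees gets built by
             javascript out of json blobs fetched via AJAX (purpose 2) -->
        <div id="content"></div>
      </body>
    </html>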
The net of this is that the world is coming back to something that is familiar and foreign at the same time: using javascript as the base programming language isn't entirely different from using, oh say, Python with GTK. But it's not really the same because there are all kinds of web and mobility considerations. In particular there is the nettlesome issue of startup performance, which is especially acute for mobile apps. What occurs to me is that the widget approach is an instantiation of a particular kind of web-based boot loader. We did this originally because it was pretty straightforward: zipping up a page with wget and unzipping it in an app is not terribly hard. It does require a native app context to store the elements of the page, so the widget approach is practically limited to apps, typically of the mobile kind. That's nothing to sneeze at, but HTML 5 widgets aren't the only approach.

A New Approach to Local JS Modules

One of the nifty new features of HTML 5 which happens to be implemented on both the iPhone and Android (and most likely any other webkit-based engine) is localStorage and openDatabase. So it seems quite possible that instead of creating a monolithic "webapp" zip ball that requires native app support to store and load, we can decompose the individual components of the web page and store them locally in the context of the browser itself. HTML 5 localStorage is a very simple key/value interface, so any sort of complexity will require managing its namespace and limitations. It could be used to store javascript, CSS, templates, etc, etc, in the context of the app/site. The problem is that there's metadata that needs to be stored with each file, so the client stub of the web boot loader would need something along the lines of the old Apple resource and data forks to store both the resources (eg, javascript, CSS, etc) and the metadata associated with them (eg, current revision). On the other hand, the openDatabase interface -- which is nothing more than a very thin veneer over SQLite as far as I can tell -- happily stores all of the metadata in the same row as the file blob. Normally I'd be cautious about such a thing, but SQLite has been around for a long, long time now, so hacking up the javascript shim should have been a pretty straightforward exercise for the browser guys, without huge exposure to weird API bugs, incompatibilities, and other random fails. In the New Approach, the server-side boot loader serves up the skeleton of the page not with direct references to the files to load (eg, <script> tags), but with an abstract notion of what the files are, the metadata including versioning information, and the javascript client stub of the loader. The client loader then just looks through its database to see if it needs to upgrade the particular files referenced in the manifest, or whether the local versions are fresh. This was quite a straightforward exercise. The beauty of this is that the client boot loader stub is very simple, and the server side should not need to change very much from its normal job of serving up versioned, packed and zipped page components. Image data -- which is a benefit of the widget zip ball -- is problematic though: instantiating an image from local data is kind of possible using the data: URI scheme with the Image DOM object, but using it would not be nearly as natural because you'd be forced to use the javascript/DOM mechanism to generate content rather than the HTML <img> tag. Or so it seems at this point. So maybe there really is a case for both the HTML 5 widget mechanism and the New Approach/SQL: they each have their strengths and weaknesses.
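Here is a boiled-down sketch of the client loader stub, assuming the openDatabase flavor. The table layout, the manifest entries, and the /js/ URL scheme are all made up for illustration, and error handling is omitted:

    // Hypothetical client boot loader stub using Web SQL (openDatabase).
    // The 'modules' table, the MANIFEST entries and the /js/ URLs are made up.
    var db = openDatabase('bootcache', '1.0', 'cached page components', 2 * 1024 * 1024);

    db.transaction(function (tx) {
      tx.executeSql('CREATE TABLE IF NOT EXISTS modules ' +
                    '(name TEXT PRIMARY KEY, version TEXT, body TEXT)');
    });

    // The server-side boot loader emits this manifest into the skeleton page
    // instead of <script src=...> tags.
    var MANIFEST = [
      { name: 'jquery.js',       version: '1.7.1' },
      { name: 'phresheez-ui.js', version: '42' }
    ];

    function injectScript(body) {
      var s = document.createElement('script');
      s.text = body;                                  // run the cached source
      document.getElementsByTagName('head')[0].appendChild(s);
    }

    function fetchAndCache(mod, done) {
      var xhr = new XMLHttpRequest();
      xhr.open('GET', '/js/' + mod.name + '?v=' + mod.version, true);
      xhr.onreadystatechange = function () {
        if (xhr.readyState !== 4) return;
        var body = xhr.responseText;
        db.transaction(function (tx) {                // refresh the cache row
          tx.executeSql('INSERT OR REPLACE INTO modules (name, version, body) VALUES (?, ?, ?)',
                        [mod.name, mod.version, body]);
        });
        injectScript(body);
        done();
      };
      xhr.send();
    }

    // Load manifest entries in order: use the cached copy when the version
    // matches, otherwise go back to the server.
    function loadModules(i) {
      if (i >= MANIFEST.length) return;               // all modules loaded
      var mod = MANIFEST[i];
      db.readTransaction(function (tx) {
        tx.executeSql('SELECT version, body FROM modules WHERE name = ?', [mod.name],
          function (tx, result) {
            if (result.rows.length && result.rows.item(0).version === mod.version) {
              injectScript(result.rows.item(0).body); // fresh enough, use local copy
              loadModules(i + 1);
            } else {
              fetchAndCache(mod, function () { loadModules(i + 1); });
            }
          });
      });
    }

    loadModules(0);

The localStorage variant is the same idea, except the version metadata has to be crammed into the key names or parallel keys, which is exactly the resource-fork awkwardness described above.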

Why it Matters

When I woke up, I had expected that there would be standard web sdk's for drawing all of the standard widgets that one normally needs to do... just about anything. Slider bars, dialogs, windows and all of that sort of thing -- a la Motif or something like that from before I went to sleep. There were not. Or more correctly, there wasn't anything even remotely dominant. That really surprised me, but it started to make sense: downloading a boatload of javascript to form that sdk is expensive: no one wants to download 1MB of js bloat for a 5k nugget of web goodness. So everybody rolls their own. Given the boot loader I outlined above, however, we can start thinking about the way that operating systems approach this problem with massive graphical/windowing libraries: you share them and demand page them into memory as needed. In the web world we probably aren't going to see the moral equivalent of a demand paged VM anytime soon, but demand paging is a solution to a specific problem: how to efficiently use RAM. The browser equivalent is not wanting to transfer and execute huge amounts of javascript. We can approximate that by hand-coding a manifest of the files needed by the page and letting the client loader deal with them if they're not currently loaded. This gives the ability to contemplate rather large and standardized libraries in web pages served by a site. Woopie! For uncomplicated libraries like, oh say, jquery, a hand-coded manifest is pretty straightforward. However, modules in real libraries have all kinds of dependencies, and they may not be obvious. Operating systems deal with these sorts of dependencies as one of their functions -- look at running ldd on a binary on Linux. I think we're going to need an equivalent automated dependency graph generator to form a "real" javascript library loader. Here our memory "pages" are javascript modules (really they're more like the old idea of segmented VM). So somehow -- through some in-file declarations, or actual compilation and generation -- we want to suss out the dependencies and send them to the web page in an automated fashion. One of the implications here for engineers is Trees Good, Cyclic References Bad. That is, we really want the path to, say, the slider bar widget to be a linear path of modules from root to leaf rather than a cascade of dependencies that pulls in half the massive library. I expect this is going to require some clever thought about how to factor (and refactor) modules. It may also necessitate module duplication to keep the tree property. This may sound like heresy, but it need not be: we're talking about the tree structure as it appears to the client, not whether a module is strategically symlinked as needed. That said, this needs some real life validation. How will a really large library fare? What are the practical limits? How serious are cyclic references? Do we need a module GC to round up dead references in the boot loader client database (think of the effect of refactoring)? These libraries are shared, but only within the confines of a site. Should they be shared more widely? Are there going to be site-wide collisions for large, heterogeneous sites? How would that be dealt with?
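To make the tree idea concrete, here is a toy, hand-written dependency table and the walk that turns a single widget into the ordered list of modules the boot loader should fetch. In a real library the table would be generated automatically rather than written by hand, and all of the module names are invented:

    // Hypothetical dependency table for a widget library.
    var DEPS = {
      'core.js':   [],
      'dom.js':    ['core.js'],
      'widget.js': ['dom.js'],
      'slider.js': ['widget.js'],   // a linear root-to-leaf path
      'dialog.js': ['widget.js']
    };

    // Expand a module name into the ordered list of modules it needs,
    // depth-first, each appearing once.  A cycle wouldn't loop forever here,
    // but it would silently break the load order -- hence Trees Good,
    // Cyclic References Bad.
    function expand(name, seen, out) {
      if (seen[name]) return out;
      seen[name] = true;
      var deps = DEPS[name] || [];
      for (var i = 0; i < deps.length; i++) expand(deps[i], seen, out);
      out.push(name);
      return out;
    }

    // expand('slider.js', {}, []) -> ['core.js', 'dom.js', 'widget.js', 'slider.js']

Hand that list to the client loader sketched earlier and only those four files ever get transferred, not the whole library.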

What’s Left for the New Web World

The larger observation here is that the world is shifting from server-side layout generation to client-side layout, and that is going to shake the web world up quite a bit. This paper has mostly focused on what the new world looks like at a low level, and the basic problem of booting up a web app that works well in a mobile environment. Much, much is left unknown here. For example:
  • What is the solution to templating in the new world?
  • The boot loader solution is a win with hybrid mobile web apps, but is it a viable approach generally?
  • Could you get better performance still if browsers allowed javascript to be byte-compiled and stored locally instead of the javascript source?
  • Should this entire idea of boot loading become standardized in browsers, so that the browser itself handles it as a browser “OS”-level primitive and can deal with multi-site sharing (ie, not have a jquery instance loaded into every site in the universe)?
  • There is a wealth of server-side content authoring software whose equivalent basically doesn't exist for a client-side approach. What is actually needed? This is particularly acute with HTML 5 canvas vector graphics, but it's really a general deficiency too.
  • What about SEO? I’ve read Google’s guidelines, but they are clearly written with 1998 in mind. You can fake some of this, but it’s still “fake” and they might notice that. Ultimately this is Google’s problem too.
I'm sure there's plenty more, but this missive isn't intended to be exhaustive.

Conclusion

I'm sure that a lot of people will take exception to probably each and every thing I've written here: that's cool, the world is a messy place, and I'm sure there are a lot of very clueful people who get this and wonder why I think any of this is new, or who think I'm just plain wrong. Nor is the point of writing this to chronicle general history: it's to chronicle my own naivete and evolution. While I was asleep there was essentially a complete generation of web engineering methodology that I didn't see. Unsurprisingly, the engineering tools that were just starting to come about speciated and matured in the 10 years I slept. Things like Rails, ColdFusion, and probably lots of tools that I'm still completely clueless about rule the day. But some forward-thinking people also decided to take javascript and CSS at their word that they were both real and relatively incompatibility-free, and go for it. They were, undoubtedly, very brave. I, on the other hand, only witnessed their handiwork, which made it manifestly obvious that it was not only possible, but an easier way to think about things. Why bother with the awkward situation of server content generation alongside AJAX callbacks and their co-necessary manipulation of the DOM? Why not just move the content layout generation to the client altogether? So that's what I did, and I haven't suffered any lightning bolts from an angry Old Testament God for my audacity.