A lot of people have asked me for advice on learning to code. What should I learn first? How long will it take? How can I get a job as a programmer? The answer to most of these questions is “it depends.” But there is one question that I can answer definitively. Should I learn to code? The answer is yes.

No matter what your day job is, how old (or young) you are, or what your level of commitment might be, any exposure to programming will have a notable impact on your life. Learning to code is like learning to speak a new language. With a relatively small amount of effort — just learning a few phrases really — you’ll find that a whole new world has opened up to you.

In the next few paragraphs I’ll try to keep my advice relevant to as many people as possible.

My first recommendation for learning to program is to learn HTML. HTML is the language of the web. It’s relatively simple, and very forgiving of mistakes. It’s a great way to ease yourself into a very fundamental aspect of programming: syntax. Before you can make any headway, you have to learn to read and write syntactically correct code. htmldog.com has some great introductory tutorials on the subject that walk you through creating a simple web page. After a few hours you will have an understanding of a core piece of technology that billions of people use every day and which has completely transformed the world in the last two decades.

Once you have a bit of HTML under your belt, you can move on to a simple procedural programming language like Python. Python is a great programming language to start with because it lets you focus on the more conceptual side of programming without having to worry about the nitty-gritty details of how computers really work on the inside. What makes Python especially great is that it’s not just for beginners. Some of the biggest sites on the internet were programmed in Python. The most popular example is YouTube. The Beginner’s Guide on the Python website has a very thorough list of resources for learning Python. Ten years after reading it, I still recommend the book I used to learn Python, which is available online for free: How to Think Like a Computer Scientist, co-authored by my first computer science teacher, Jeff Elkner.

If you’ve managed to navigate your way through HTML and Python, then you are well on your way to accomplishing any goal you are willing to pursue.

Now a word about motivation. Programming can be hard. Programming can be frustrating. At some point you will find yourself staring at your computer screen for an hour, not being able to figure out why your code isn’t working the way you expect. This is par for the course and a normal part of the learning process. I’ll leave you with a few tricks to getting through these rough spots.

  1. Try not to program alone. Two heads are better than one, and nearly all professional software is written collaboratively. Working together makes problem solving a lot easier and you’ll learn more in the process. There is an entire programming methodology that is premised on two people working together. It’s called Pair Programming and you should do it.
  2. Have a project. It’s incredibly boring to spend all your time doing tutorials and reading books. It doesn’t matter how big or small the project is, it’s just there to act as a test bed for all the new things you are learning. A great first project is making a personal web page about something you are interested in, or even just coding up your resumé in HTML.
  3. Get involved in an open source project. There are tons of projects out on the internet that other people have started and which desperately need your help to fix bugs, build new features, or even just write documentation. Working on an open source project is a great way to improve your coding skills because you can learn from the code that’s already been written. If you are reading this and already have an open source project that you are working on, add a link to it in the comments.

At the request of many people, I’ve moved the code from my Easy Facebook Scripting in Python gist into a proper Github repository and made a python package for it that has been published to pypi. The api has changed a bit so be sure to check out the documentation in the README file.

In the next release of fbconsole, I plan on adding these additional features:

  • Handling of expired access tokens
  • Logout function
  • Video upload support

If there are any specific features you’d like to see added to fbconsole, file an issue on Github.

UPDATED: fbconsole Pypi Package and Github Repository

Sometimes you just want to write a little script using Facebook’s api that updates your status, or downloads all your photos, or deletes all those empty albums you accidentally created. In order to streamline my writing of one-off facebook scripts, I created a micro api client that implements the client-side authentication flow and has a few utility functions for accessing the graph api and fql.

To use this mini api client, all you have to do is put 4 lines of code at the top of your python script:

from urllib import urlretrieve
import imp
urlretrieve('https://raw.github.com/gist/1194123/fbconsole.py', '.fbconsole.py')
fb = imp.load_source('fb', '.fbconsole.py')

Now you can specify the permissions you’ll need for your script (from the list of available api permissions) and authenticate yourself:

fb.AUTH_SCOPE = ['publish_stream']
fb.authenticate()

By default, the api client makes requests as the “fbconsole” app. You can use your own app by setting fb.APP_ID. When you authenticate, a browser window will open asking for whatever permissions were requested by your script. After you go through the permission dialog, the script will continue running. The access token used is stored in a local file when you authenticate so the next time around you won’t be presented with a dialog in your browser.
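
The token caching described above can be sketched roughly as follows (Python 3). This is not fbconsole’s actual code: load_cached_token, save_token, get_token and the authenticate callable are hypothetical names used only to illustrate the idea of skipping the browser dialog when a token is already on disk.

```python
import os

def load_cached_token(path):
    """Return the token stored at path, or None if nothing usable is cached."""
    if not os.path.exists(path):
        return None
    with open(path) as f:
        token = f.read().strip()
    return token or None

def save_token(path, token):
    """Write the token so the next run can skip the browser dialog."""
    with open(path, "w") as f:
        f.write(token)

def get_token(path, authenticate):
    """Reuse a cached token, falling back to the interactive flow."""
    token = load_cached_token(path)
    if token is None:
        token = authenticate()  # hypothetical callable that runs the browser flow
        save_token(path, token)
    return token
```

A real client would also have to cope with a cached token that has expired, which this sketch ignores.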

Once authenticated, you can make whatever calls to the graph api or fql that you want. For example:

Post a status update

status = fb.graph_post("/me/feed", {"message":"Hello from my awesome script"})

Fetch likes on a status update

likes = fb.graph("/"+status["id"]+"/likes")

Delete a status update

fb.graph_delete("/"+status["id"])

Upload a photo (why does python make this so hard?)

fb.graph_post("/me/photos", {"message":"My photo", "source":open("my-photo.jpg", "rb")})

Query FQL tables

friends = fb.fql("SELECT name FROM user WHERE uid IN "
                 "(SELECT uid2 FROM friend WHERE uid1 = me())")

If you download https://raw.github.com/gist/1194123/fbconsole.py and run it, you’ll be dropped into a python shell so you can just play around with api calls in an interactive environment. An IPython shell will be used if you have IPython installed.

The code is just in a gist on github at https://gist.github.com/1194123. Feel free to comment on this post or on the gist if you have questions.

Every now and then I think of some interesting programming project that would be fun to do. A lot of times I actually start working on these side projects. Unfortunately, I typically only stick with the project until I get to the finishing touches where it’s 90% done. Of course, it’s the last 10% of the project that always takes 90% of the time and by then I’ve stopped learning cool new stuff and all that’s left are annoying bugs. So before I start on my next project (after having lost interest in the last one), I’m going to write out my idea so that maybe someone else with more spare time than me can build it.

Every few weeks for the last year, someone around me mentions node.js as this cool new thing that everyone has to try out. In case you haven’t heard, node.js takes the V8 javascript engine and lets you run server side javascript. The especially exciting part is that all IO, including such common things as reading a file from disk and talking to a database, is asynchronous. It uses an event loop instead of a threading model to achieve really good concurrency, which apparently is important for things like really scalable network applications.
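
To make the event loop idea concrete, here is a small sketch using Python 3’s asyncio rather than node.js itself. It is only an analogy: fake_io is a hypothetical stand-in for a disk read or database query, and the delays are made up. The point is that three waits overlap on a single thread.

```python
import asyncio
import time

async def fake_io(name, delay):
    """Stand-in for a disk read or database query; sleeping yields to the loop."""
    await asyncio.sleep(delay)
    return name

async def main():
    # Three "requests" wait concurrently on one thread, no extra threads needed.
    return await asyncio.gather(
        fake_io("file", 0.1),
        fake_io("database", 0.1),
        fake_io("network", 0.1),
    )

start = time.perf_counter()
results = asyncio.run(main())
elapsed = time.perf_counter() - start
print(results, round(elapsed, 2))
```

The total time should be close to the longest single wait rather than the sum of all three, which is the property that makes this model attractive for network servers.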

I’d love to have an excuse to play around with node.js but haven’t thought of anything that compelling until now. I usually like building web applications but unfortunately building traditional web applications in node would be missing the point (or painfully difficult). All the database drivers for traditional dbs like mysql and postgresql are blocking so you don’t get any of the benefits of node’s asynchronous IO model. If you are not doing anything cool with concurrency, you might as well stick with what you know and use python or ruby.

Ok, so what are cool concurrency related problems to solve? You could always write a chat server, but everyone and their cousin is writing a chat server in node these days. You could write a static file server or a web proxy, but that’s kind of boring and everyone already uses nginx. But what about taking static resource servers to the next level?

Imagine a static file server and/or web proxy that was geared towards serving resources like javascript and css used in larger web applications. Some features you don’t typically expect from a static file server but would be totally awesome and useful:

  • automatic javascript minification – http://github.com/mishoo/UglifyJS looks promising
  • automatic css minification – you might have to write this yourself, not too hard
  • automatic image optimization – use pngcrush or something similar
  • proxy to another server instead of the file system to get the unminified files
  • rewrite static resource urls in html files on the fly to point to the minified sources
  • concatenate multiple css and js files into single files based on usage as detected in html
  • hooks for adding your own customizations – it’s just javascript… no need to dust off those c books
  • track usage statistics and performance metrics in some key-value db
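
As a rough idea of how small the css minification piece could be, here is a naive sketch in Python 3 (the server itself would of course be javascript). minify_css is a hypothetical helper. It assumes the stylesheet has no comment markers or braces inside string literals, so treat it as an illustration rather than something safe for arbitrary css.

```python
import re

def minify_css(css):
    """Naive CSS minifier: strips comments and collapses whitespace."""
    css = re.sub(r"/\*.*?\*/", "", css, flags=re.DOTALL)  # drop /* ... */ comments
    css = re.sub(r"\s+", " ", css)                        # collapse runs of whitespace
    css = re.sub(r"\s*([{};,])\s*", r"\1", css)           # trim space around punctuation
    css = re.sub(r":\s+", ":", css)                       # "color: red" -> "color:red"
    css = css.replace(";}", "}")                          # last semicolon in a block is optional
    return css.strip()

print(minify_css("body {\n  color: red; /* brand */\n  margin: 0 auto;\n}\n"))
# prints: body{color:red;margin:0 auto}
```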

This wouldn’t replace nginx or CDNs or anything for really big scale stuff, but I think it could be a useful development tool. You would run this static resource server alongside your single-threaded application server during development. You could even run it in production as the primary source backing a CDN. It would be easy to use with any sort of application server written in any other language. Being able to customize the server with javascript would make it easy to integrate any static resource packaging system you might devise for your application server. The best part though is that you get to write it in node!

Ever since Facebook launched graph.facebook.com, I’ve been wanting to check it out and see just how superior it is to Facebook Connect. It turns out it was pretty easy to authenticate myself from python using OAuth 2. So I wrote a little script that spits out my feed on the command line.

If you’ve ever felt frustrated implementing an OAuth client before, rest assured that OAuth 2 is one million times easier to work with than OAuth 1. You don’t have to keep track of all these different tokens or worry about generating signatures in just the right way. The one gotcha here is that you need a web server for the user to be redirected to after they authorize your application with Facebook. Fortunately, python makes it really easy to start up a mini webserver for the purposes of OAuth.

Here is the script in 75 lines of python. If you want to try it out yourself, you’ll have to register an application with Facebook at http://developers.facebook.com/setup/.

#!/usr/bin/python2.6
import os.path
import sys
import json
import urllib2
import urllib
import urlparse
import BaseHTTPServer
import webbrowser

APP_ID = 'your-app-id-here'
APP_SECRET = 'your-app-secret-here'
ENDPOINT = 'graph.facebook.com'
REDIRECT_URI = 'http://127.0.0.1:8080/'
ACCESS_TOKEN = None
LOCAL_FILE = '.fb_access_token'
STATUS_TEMPLATE = u"{name}\033[0m: {message}"

def get_url(path, args=None):
    args = args or {}
    if ACCESS_TOKEN:
        args['access_token'] = ACCESS_TOKEN
    if 'access_token' in args or 'client_secret' in args:
        endpoint = "https://"+ENDPOINT
    else:
        endpoint = "http://"+ENDPOINT
    return endpoint+path+'?'+urllib.urlencode(args)

def get(path, args=None):
    return urllib2.urlopen(get_url(path, args=args)).read()

class RequestHandler(BaseHTTPServer.BaseHTTPRequestHandler):

    def do_GET(self):
        global ACCESS_TOKEN
        self.send_response(200)
        self.send_header("Content-type", "text/html")
        self.end_headers()

        code = urlparse.parse_qs(urlparse.urlparse(self.path).query).get('code')
        code = code[0] if code else None
        if code is None:
            self.wfile.write("Sorry, authentication failed.")
            sys.exit(1)
        response = get('/oauth/access_token', {'client_id':APP_ID,
                                               'redirect_uri':REDIRECT_URI,
                                               'client_secret':APP_SECRET,
                                               'code':code})
        ACCESS_TOKEN = urlparse.parse_qs(response)['access_token'][0]
        open(LOCAL_FILE,'w').write(ACCESS_TOKEN)
        self.wfile.write("You have successfully logged in to facebook. "
                         "You can close this window now.")

def print_status(item, color=u'\033[1;35m'):
    print color+STATUS_TEMPLATE.format(name=item['from']['name'],
                                       message=item['message'].strip())

if __name__ == '__main__':
    if not os.path.exists(LOCAL_FILE):
        print "Logging you in to facebook..."
        webbrowser.open(get_url('/oauth/authorize',
                                {'client_id':APP_ID,
                                 'redirect_uri':REDIRECT_URI,
                                 'scope':'read_stream'}))

        httpd = BaseHTTPServer.HTTPServer(('127.0.0.1', 8080), RequestHandler)
        while ACCESS_TOKEN is None:
            httpd.handle_request()
    else:
        ACCESS_TOKEN = open(LOCAL_FILE).read()
    for item in json.loads(get('/me/feed'))['data']:
        if item['type'] == 'status':
            print_status(item)
            if 'comments' in item:
                for comment in item['comments']['data']:
                    print_status(comment, color=u'\033[1;33m')
            print '---'
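
The script above targets Python 2.6. For anyone on Python 3, the two URL-handling pieces might be sketched like this. The names mirror the script, except extract_code, which is a hypothetical helper. One deliberate difference: this version always uses https, which is my assumption and not what the original script does.

```python
from urllib.parse import parse_qs, urlencode, urlparse

ENDPOINT = "graph.facebook.com"

def get_url(path, args=None, access_token=None):
    """Build a Graph API URL (always https in this sketch)."""
    args = dict(args or {})  # copy so the caller's dict is not mutated
    if access_token:
        args["access_token"] = access_token
    return "https://" + ENDPOINT + path + "?" + urlencode(args)

def extract_code(request_path):
    """Pull the OAuth 'code' parameter out of the redirect request path."""
    values = parse_qs(urlparse(request_path).query).get("code")
    return values[0] if values else None

print(get_url("/me/feed", {"limit": 5}, access_token="t"))
print(extract_code("/?code=abc123"))
```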

Last Thursday, we decided to flip the switch on the brand new version of Divvyshot, which I’ve spent a good chunk of the last 6 months working on. With the help of Michael Yuan’s brilliantly simple design, and Sam Odio’s unwavering tenacity, I feel confident in saying that Divvyshot is the easiest and most beautiful way to share photos on the web today.

So let me take a moment to tell you exactly how Divvyshot is breaking new ground for online photo sharing by going through some of our user interface decisions. We spent a lot of time avoiding some existing user interface paradigms, which though commonplace on the web, are clunky and unpleasant. Here’s a sample.

Just Say No to Paging

Most websites that display lots of information use paging to split up the information into distinct chunks. When you search on Google today, if the result you are looking for is not in the top 10, you have to click through to the second, third, or fourth (sometimes more) pages. Paging is used to improve page load times as less data is being sent over the internet, but it has the undesired effect of breaking up the natural flow of your eye from top to bottom. On Divvyshot, we display lots of data, sometimes thousands of photos on a single page, but instead of forcing you to click through 10 or 20 pages of 100 photos each, we automatically load more photos as you scroll down the page. With the load-more-on-scroll model, the user gets the best of both worlds: a page that loads quickly, and an uninterrupted browsing experience. It’s fast, simple and intuitive. Even your mom can do it.
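
On the server side, load more on scroll usually comes down to handing out one batch at a time plus a marker for where the next batch starts. Here is a minimal Python 3 sketch of that idea. photo_batch is a hypothetical function and this is not Divvyshot’s actual implementation.

```python
def photo_batch(photos, offset=0, limit=100):
    """Return one batch plus the offset for the next request (None when done)."""
    batch = photos[offset:offset + limit]
    next_offset = offset + limit if offset + limit < len(photos) else None
    return batch, next_offset

photos = list(range(250))            # stand-in for 250 photo records
batch, next_offset = photo_batch(photos)
print(len(batch), next_offset)       # prints: 100 100
```

The browser would request the next batch whenever the user scrolls near the bottom of the page, and stop once next_offset comes back as None.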

Edit Forms? Who needs them?

Many websites force users onto a separate webpage for editing content. If you are making big sweeping changes to the content or entering in a lot of data (think website billing information) having a separate page can clean things up. But if you just want to change one little piece of information (like correcting a typo), you have to go through an unnecessarily lengthy process of clicking on an edit button, waiting for the edit form to load, typing in the new data, clicking the save button, and waiting to be taken back to the first page you were on. That’s a lot of clicking around and waiting for such a minor change. On Divvyshot, editing little bits of data is completely integrated into the user interface. Want to rename a photo? Just highlight the text, type in a new one, and hit enter. The new name is saved immediately, without the need for any page reloads or excessive clicking.

Page Refreshing is Old School

Divvyshot is by no means a static website. Things are constantly changing as new events get created, new photos are uploaded, comments are being written, etc. Rather than having to hit the refresh button every 30 seconds to see if your friend has uploaded the photos from the party last night, Divvyshot automatically updates the page when there is something new to look at. You can literally see new photos stream onto the page as they are uploaded by someone on the other side of the Earth. Even though I implemented this feature, it still catches me by surprise every time something new pops up onto my Divvyshot page. I can’t help but think, “that’s cool.”

Uploading Files in the Year 2010

Uploading files to the web has been a painful process for a long time – especially if you need to upload multiple files. Often the user is provided with a few file input widgets which they can diligently click on to select files, then hit the send button and wait patiently as the data is uploaded to the web synchronously – remember not to hit the refresh button or you might submit your files twice! Uploading files with AJAX in the background is also a painful experience requiring the awkward use of iframes, and still requires those pesky file input widgets.

Java applet uploaders were at one point the answer to multifile upload, even allowing you to drag and drop files like a normal desktop application. But java applets take forever to load and sometimes just don’t work for reasons that are difficult to debug. When’s the last time you updated the version of java running on your machine?

Then there is flash, which now provides a hook for selecting multiple files. When combined with a JavaScript bridge, you can whip up a pretty good multifile upload experience using something like SWFUpload. Unfortunately, flash is not quite a panacea, bringing its own problems to the table. Anyone here use linux?

Despite a long history of less than stellar file upload patterns, there is a light at the end of the tunnel. That light is HTML5. With the latest version of Firefox, it is now possible to build a truly beautiful and intuitive upload experience. Firefox 3.6 supports dragging and dropping of files onto any drop target, the selection of multiple files from a file input widget, and the reading of that file data in JavaScript for asynchronous uploading. Other modern browsers like Chrome and Safari should follow suit with their own implementations in the near future. Divvyshot has embraced this and other cutting edge browser technology to take user experience to the next level. Check out this quick screen cast showing just how drag and drop uploading works:

We’re very excited about Divvyshot. If you haven’t checked it out already, it’s definitely worth a look. We also love to get feedback. Email human@divvyshot.com or check out our page on uservoice to submit a feature request or vote on others.

django-piston is a mini framework for creating REST APIs. One of the big reasons to use django-piston rather than rolling your own framework is that it supports OAuth authentication “out of the box.” Unfortunately, documentation on how to enable OAuth authentication is largely lacking. Here are the steps I went through to fully hook up OAuth support in my django-piston powered REST API.

  1. Add piston to installed apps and run syncdb
    This step isn’t necessary when using piston without OAuth support. However, if you want OAuth support, piston defines a few models that are needed to keep track of the OAuth tokens and consumer keys. Adding piston to INSTALLED_APPS will register these models.
  2. Add an oauth challenge template at oauth/challenge.html
    For some reason, this template does not come with django-piston. When an unauthenticated request is made to an API protected by OAuth, the contents of this template will be returned. In theory, the template should describe what OAuth is and how a client programmer can get access to the API using OAuth. In practice it doesn’t really matter what you put there, you just have to have the file to avoid TemplateNotFound exceptions.
  3. Hook up OAuth token urls
    OAuth requires the web service to provide three urls through which OAuth can acquire the proper tokens and authorize a user. Piston provides views for these functions which will create the proper tokens and integrate with Django’s auth framework. Here is how my url conf looks:
    from django.conf.urls.defaults import patterns, url  # Django 1.x-era import path

    urlpatterns = patterns(
        'piston.authentication',
        url(r'^oauth/request_token/$','oauth_request_token'),
        url(r'^oauth/authorize/$','oauth_user_auth'),
        url(r'^oauth/access_token/$','oauth_access_token'),
    )
          
  4. Add an authorization template at piston/authorize_token.html
    This template is for the page the user will be directed to when they authorize the client application. It should be a simple form asking the user to confirm that they want to give the requesting application access to their data. Piston actually comes with this template but it unfortunately lacks a submit button! You also want to write your own template so that it can be better integrated with the look and feel of your site. Just to get things going, here is a sample that works:

    <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
    <html>
      <head>
        <title>Authorize Token</title>
      </head>
      <body>
        <h1>Authorize Token</h1>
    
        <form action="{% url piston.authentication.oauth_user_auth %}" method="POST">
          {{ form.as_table }}
          <button type="submit">Confirm</button>
        </form>
    
      </body>
    </html>
          

    If you want to further customize how this particular page works, you can specify an OAUTH_AUTH_VIEW setting to tell piston which view to use. For example:

    OAUTH_AUTH_VIEW = 'api.views.oauth.authorize_oauth'
    

    You’ll want to take a look at piston.authentication.oauth_auth_view as a reference when writing your own view.
  5. Add your first OAuth consumer!
    You can either do this programmatically in the shell, or using django’s admin interface. If using the admin interface, you’ll have to register the piston.models.Consumer model with the admin.
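
For step 1, the settings change might look something like the fragment below. The other entries are placeholders for whatever your project already lists. The only line that matters here is "piston".

```python
# settings.py (fragment) -- entries other than "piston" are placeholders
INSTALLED_APPS = (
    "django.contrib.auth",
    "django.contrib.contenttypes",
    "django.contrib.sessions",
    "piston",  # registers piston's OAuth models (tokens and consumers)
)
```

After editing the setting, running syncdb creates the tables for those models.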

I’ve had OAuth up and running now for a few days and am very glad that users will no longer have to give API clients their username and password to access their private data. It was a bit of a pain to set up but I think it’s worth it.

Last week Google submitted their own entry into the JavaScript framework race by open sourcing Closure, the library that powers google docs, maps, mail and more.

I couldn’t resist spending some time playing with the closure tools, especially the dojo-like module dependency resolver (a.k.a goog.provide and goog.require) and the closure compiler. It was a bit tricky at first because the example “applications” from the tutorial are pretty contrived and there is no information on using closure with existing JavaScript libraries.

Anyway, I thought I would share how I hooked up Closure to an existing JavaScript library (the one that powers http://gvr.carduner.net).

Step 1 – get the Closure Library.

Closure is actually a set of three tools: a JavaScript Library, a JavaScript compiler, and a templating language. The dependency resolution tools are part of the library, so you should check that out first. At the moment, the closure library can be checked out from the project subversion repository:

svn checkout http://closure-library.googlecode.com/svn/trunk closure

If you are just interested in using the dependency resolution tools and not the entire framework, you just need two files: base.js and calcdeps.py. The full checkout can take a while as they’ve included all the generated api documentation, which you can also read online.

Step 2 – Instrument your JS library code

If you already have your JavaScript code split up into “modules” that live in different files, adding the dependency code is pretty easy. In my case, I wanted to use Closure with gvr.carduner.net, which already has a set of interdependent JavaScript files like so:

gvr.core.js		gvr.renderer.js		gvr.web.client.js	jquery.history.js
gvr.js			gvr.robot.js		gvr.web.tests.js	launcher.js
gvr.lang.js		gvr.runner.js		gvr.world.js
gvr.lang.parser.js	gvr.tests.js		gvr.world.parser.js

Open up each JavaScript file, and add declarations to the top about what each file provides and requires. For example, gvr.runner.js depends on gvr.robot.js and gvr.core.js. In turn it provides the gvr.runner namespace. So at the top of gvr.runner.js I added the following:

// gvr.runner.js
goog.require("gvr.core");
goog.require("gvr.robot");
goog.provide("gvr.runner");

Then of course I had to add the provide statements to gvr.core.js and gvr.robot.js, as in:

// gvr.core.js
goog.provide("gvr.core");

// gvr.robot.js
goog.require("gvr.core");
goog.provide("gvr.robot");

The goog.provide statements will actually create the object namespace you pass in, so the following would be valid:

goog.provide("foo.bar.baz");
foo.bar.baz.blah = "blah";
foo.somethingElse = "somethingElse";

In fact, if you try to create the namespaces later, the compiler will throw warnings/errors at you. So if you have any code that looks like the following, you should remove it.

var foo = foo || {};  // DELETE THESE LINES
foo.bar = foo.bar || {};  // AS THEY CONFLICT WITH goog.provide("foo.bar.baz");
foo.bar.baz = foo.bar.baz || {};

Once you have added all the right goog.require and goog.provide statements to your code, you’ll need to generate a dependency graph using the calcdeps.py script.

Step 3 – Build the dependency graph with calcdeps.py

The dependency resolution system uses a pre-generated dependency graph to link namespaces like foo.bar.baz to their corresponding JavaScript files and the files they require. This is all stored in a file called deps.js. The one for the closure library can be found in trunk/closure/goog/deps.js. Here are the first few lines to give you an idea of what this looks like:

// This file has been auto-generated by GenJsDeps, please do not edit.
goog.addDependency('array/array.js', ['goog.array'], []);
goog.addDependency('asserts/asserts.js', ['goog.asserts'], []);
goog.addDependency('async/conditionaldelay.js', ['goog.async.ConditionalDelay'], ['goog.Disposable', 'goog.async.Delay']);
goog.addDependency('async/delay.js', ['goog.Delay', 'goog.async.Delay'], ['goog.Disposable', 'goog.Timer']);
// ... this goes on for quite a while ...

In order for closure’s dependency mechanism to know about your libraries, you need to create a deps.js file of your own. This can be done with the calcdeps.py script you should have downloaded by now. If you checked out the entire closure library source, the calcdeps.py script can be found in trunk/closure/bin/calcdeps.py.

The calcdeps.py script must be run from the directory where your files will be served, as it stores the relative file paths in the generated deps.js file, which is in turn used to build urls to all your JavaScript files. For example, my application’s directory structure looks like this:

gvr-online/
  closure/ # this is the trunk checkout of closure
    closure/
      bin/
        calcdeps.py # I'll use this script to generate my own deps.js
      goog/ # this is the closure library source, including deps.js and base.js
  app/
    src/
      ui/  # This directory is exposed to the web as http://localhost:8080/ui/
        lib/ # this is where my javascript library lives
        closure/
          goog/ # this is a symlink to the closure/closure/goog/ directory at the top level

The gvr-online/app/src/ui/ directory is what gets exposed through the web, so the calcdeps.py script should be run from that directory. Here is the command I used to run it:

cd app/src/ui/ && python ../../../closure/closure/bin/calcdeps.py -p lib -o deps > deps.js

The -p lib option tells calcdeps.py to search in app/src/ui/lib/ for js files with goog.provide and goog.require statements. The -o deps option tells calcdeps.py to generate a dependency graph file, which gets saved to app/src/ui/deps.js. If you are using the rest of the closure library, and not just the dependency stuff, you will need to add an extra -p closure argument.
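
To get a feel for what a tool like calcdeps.py has to do, here is a heavily simplified Python 3 sketch: scan each file for goog.provide and goog.require calls, then order the files so that dependencies come first. This is not the real calcdeps.py code. scan and load_order are hypothetical names, and the file contents are inlined as strings to keep the example self-contained.

```python
import re

PROVIDE_RE = re.compile(r"""goog\.provide\(\s*['"]([^'"]+)['"]\s*\)""")
REQUIRE_RE = re.compile(r"""goog\.require\(\s*['"]([^'"]+)['"]\s*\)""")

def scan(sources):
    """Map each file name to (namespaces it provides, namespaces it requires)."""
    return {
        name: (PROVIDE_RE.findall(text), REQUIRE_RE.findall(text))
        for name, text in sources.items()
    }

def load_order(sources):
    """Order files so every file comes after the files it depends on."""
    deps = scan(sources)
    provider = {ns: name for name, (provides, _) in deps.items() for ns in provides}
    ordered, visiting = [], set()

    def visit(name):
        if name in ordered:
            return
        if name in visiting:
            raise ValueError("circular dependency involving " + name)
        visiting.add(name)
        for ns in deps[name][1]:
            visit(provider[ns])  # a KeyError here means nothing provides ns
        visiting.discard(name)
        ordered.append(name)

    for name in sorted(deps):
        visit(name)
    return ordered

sources = {
    "gvr.runner.js": 'goog.require("gvr.core");\ngoog.require("gvr.robot");\ngoog.provide("gvr.runner");',
    "gvr.core.js": 'goog.provide("gvr.core");',
    "gvr.robot.js": 'goog.require("gvr.core");\ngoog.provide("gvr.robot");',
}
print(load_order(sources))
# prints: ['gvr.core.js', 'gvr.robot.js', 'gvr.runner.js']
```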

With that done, we can try this out in a browser.

Step 4 – Instrument your HTML

Next you’ll need to add the closure hook to your html files. In my project, there is just one html file, index.html. If you just include the base.js file like the closure tutorial suggests, you will not be able to goog.require your own library modules. You have to tell base.js where to find your library, and where to find the deps.js file. I added the following to the <head> section of index.html:

    <script type="text/javascript">
      var CLOSURE_NO_DEPS = true;
      var CLOSURE_BASE_PATH = "/ui/";  //this is the directory where I ran calcdeps.py
    </script>
    <script type="text/javascript" src="/ui/closure/goog/base.js"></script>
    <script type="text/javascript" src="/ui/deps.js"></script>

The CLOSURE_NO_DEPS option tells base.js that it shouldn't load closure's deps.js file and that we will handle the dependency graph loading ourselves. The CLOSURE_BASE_PATH setting is a prefix that should be added to the paths specified in the deps.js file. Next we load base.js which defines the goog.require and goog.provide functions. And finally, we load the deps.js file that was generated in the last step.

With these files loaded, you can now goog.require any of your modules. For example, at the bottom of my index.html file, I can have this:

<script type="text/javascript">
goog.require("gvr.web.client");
</script>
<script type="text/javascript">
client = gvr.web.client.newClient();
client.getUser(function(user){ alert("Hi "+user.nickname); });
</script>

Closure - pun intended

So far, closure's dependency resolution tools are my favorite part. They are relatively simple (you only need two files really) and don't require you to structure your code in any particular directory hierarchy (unlike Dojo last I checked). My only wish at this point is for calcdeps.py and base.js to have a mechanism for registering third party libraries like jQuery without adding goog.provide() to the top of their files. You could add other libraries to the end of the generated deps.js, but that isn't very maintainable and won't work with the closure compiler (I think?). I haven't yet gotten to using the closure compiler with my code, so more on my experience with that later.

There has been some talk about using “class based views” in Django to make view code more reusable. Apparently, there was even a presentation given about it. At Divvyshot, our code base is growing quickly and we are starting to reuse view code a lot. We’ve been refactoring all of our view code into classes, which makes them much easier to customize and mash together. Today I worked on some pretty exciting stuff that makes harnessing class-based views a snap.

Here’s a scenario we run into a lot.

  1. We have a view that displays information about a person with a url like /people/{id}/ where the id is the person object’s id field
  2. We have another view that displays information about an event with a url like /event/{slug}/ where the slug is some small number of alphanumeric characters uniquely identifying the event.
  3. We have a third view that shows information about an event relating to a person with a url like /event/{event_slug}/person/{person_id}/

The third piece to the above combination is where class-based views really pay off. We already have a bunch of code for working with a person’s data and a bunch of code for working with an event’s data. Wouldn’t it be great if we could just magically combine those two pieces of code and get all the data about both an event and a person and their relationship spit out onto a page? Well, we can and here is a simplified example of how it would look in our code base.


First there is the code for displaying a page about a person. I’ll explain in detail what’s going on.

class PersonDetail(Handler):
    template = "myapp/person/detail.html"
    person = fromurl("id").model(Person)
    def update(self):
        # do a bunch of stuff with self.person, for example
        if self.request.user.get_person() == self.person:
            self.context['page_title'] = "This is you"
        else:
            self.context['page_title'] = "%s %s" % (self.person.first_name, self.person.last_name)

In Detail

First you’ll notice that PersonDetail is a class and not a function. Django does not require views to be functions, just to be callable. PersonDetail subclasses Handler, which provides the __call__ method that’s necessary to make an instance of PersonDetail callable. In case you are jumping to conclusions, we do not use an instance of PersonDetail directly as a callable view for thread safety reasons that I will explain later.

The next thing you’ll see is that the template is specified as a class attribute with the path used by a template loader. The actual template rendering with a proper request context and all that jazz is abstracted away for us by a render method defined in the Handler class.

The next cool thing is the line person = fromurl("id").model(Person) which declaratively spells out the mapping from a url parameter to a Person model object. In particular, this says to pull out the id from the keyword arguments passed to the view function (based on the regex in the url conf) and use it to look up a Person object. By default, a 404 response is returned if no such object is found. This is sort of a replacement for person = get_object_or_404(Person, id=some_id) that works better with class-based views.
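The helper itself isn’t shown in this post, so here is only a rough sketch of how a declarative lookup like this could be built with a python descriptor. Everything in it is an assumption made for illustration: the NotFound exception stands in for Django’s 404 handling, the Person class is an in-memory stand-in with a made-up find() method (a real version would presumably query the ORM instead), and the view simply stores the url keyword arguments on self.kwargs.

```python
class NotFound(Exception):
    """Stand-in for django.http.Http404 in this sketch."""


class fromurl(object):
    """Descriptor that maps a url keyword argument to an object lookup."""

    def __init__(self, param, model_cls=None, field="id"):
        self.param = param          # regex group name from the url conf
        self.model_cls = model_cls  # class used to perform the lookup
        self.field = field          # field matched against the url value

    def model(self, model_cls, field="id"):
        # Return a new descriptor rather than mutating this one.
        return fromurl(self.param, model_cls, field)

    def __get__(self, instance, owner):
        if instance is None:
            return self  # accessed on the class: hand back the descriptor
        value = instance.kwargs[self.param]
        obj = self.model_cls.find(self.field, value)
        if obj is None:
            raise NotFound("%s with %s=%r"
                           % (self.model_cls.__name__, self.field, value))
        return obj


class Person(object):
    """In-memory stand-in for a Django model (hypothetical find() method)."""
    _rows = []

    def __init__(self, id, first_name):
        self.id = id
        self.first_name = first_name
        Person._rows.append(self)

    @classmethod
    def find(cls, field, value):
        # url values arrive as strings, so compare string forms
        for row in cls._rows:
            if str(getattr(row, field)) == str(value):
                return row
        return None


class PersonDetail(object):
    person = fromurl("id").model(Person)

    def __init__(self, **kwargs):
        self.kwargs = kwargs  # keyword arguments captured by the url regex
```

In this sketch the lookup runs every time self.person is read; a real implementation would probably cache the result on the instance.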

Next we have an update method, which gets called before the template is rendered. The purpose of the update method is just to prepare the view, and not to render a template to a response. That means adding stuff to the template context, adding additional attributes to the view instance, creating and processing forms, handling post data, etc. By putting all this logic in a standalone method, it is easy to modify the view’s behavior without having to worry about how the HttpResponse is created.

In this example, we put variables that should be made available to the template into self.context, which is just a dictionary. Alternatively, we could set attributes on the view instance itself, which is made available to the template. For example, having {{view.person.name}} in the template would yield the desired result. The request is also made available as the self.request instance attribute. By setting attributes in the view instance, it becomes much easier to share data between multiple helper methods of the view instance. For example, you might have a method that processes a GET request and a separate one for POST requests. Subclasses of your view can then selectively override just one of the methods and all the while you don’t have to worry about passing around any required data, like the request object itself.
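To make the flow concrete, here is a minimal sketch of what a base class along these lines might look like. It is not the actual Handler from our code base: the template is a plain format string instead of a template path, render() returns a string instead of an HttpResponse, and the Greeting classes are invented for the example.

```python
class Handler(object):
    """Hypothetical base class: prepare the view first, render second."""

    template = ""  # stand-in: a format string instead of a template path

    def __call__(self, request, **kwargs):
        self.request = request
        self.kwargs = kwargs
        self.context = {}
        self.update()          # subclasses prepare their state here
        return self.render()   # response creation stays in one place

    def update(self):
        pass

    def render(self):
        # Expose the view itself as "view", mirroring {{ view.person }}.
        context = dict(self.context, view=self)
        return self.template.format(**context)


class Greeting(Handler):
    template = "{page_title}: hello {view.name}"

    def update(self):
        self.name = self.kwargs["name"]          # attribute on the view
        self.context["page_title"] = "Welcome"   # entry in the context


class LoudGreeting(Greeting):
    def update(self):
        # Customize the behavior without touching how the response is built.
        Greeting.update(self)
        self.context["page_title"] = self.context["page_title"].upper()
```

Calling Greeting()(None, name="Ada") runs update() and then render(); the subclass only overrides the preparation step.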


Next we have the code for displaying stuff about an event. This is a lot like the PersonDetail class. The only thing to note is that the event attribute has an additional piece of metadata which says that the "slug" url parameter corresponds to the "url_slug" field of the Event model.

class EventDetail(Handler):
    template = "myapp/event/detail.html"
    event = fromurl("slug").model(Event, "url_slug")
    def update(self):
        # do a bunch of stuff with self.event
        self.context['page_title'] = self.event.name


As the final section of the scenario I outlined above, we will combine these two classes using python’s multiple inheritance support. Strictly speaking, it’s not necessary to use multiple inheritance to combine the functionality of the previous two classes, and frankly I haven’t decided yet whether it is a good idea. But as long as you are careful and know what’s going on in the base classes, it should be OK. This is python after all and we don’t do hand holding.

class EventForPerson(EventDetail, PersonDetail):
    template = "myapp/event/person.html"
    def update(self):
        # do a bunch more stuff with self.event and self.person
        EventDetail.update(self)
        PersonDetail.update(self)
        self.context['page_title'] = "%s and %s" % (self.person.first_name,
                                                     self.event.name)

This example is a bit contrived because the only thing any of the update methods does is set the same variable in the template context to something different. But the idea you should take home from this is that the views could have arbitrarily complex business logic that can be easily extended and customized through subclassing, just as can be done with Model objects, admin views, HttpResponse objects, or anything else that is object oriented. With the multiple inheritance setup we have, our template, myapp/event/person.html, can access the person object, the event object, and anything else provided by the update methods from EventDetail and PersonDetail. We could even {% include %} the other two templates in myapp/event/person.html and they would just work. In creating the EventForPerson class, we didn’t even have to worry about how the Event and Person objects get looked up from the parameterized url. If we refactor the object lookup later (for example, switching from person ids to person slugs), we’ll only have to change the code in one place.
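If calling each base class by name feels fragile, one alternative (not what the code above does) is to let every update method call super() and rely on python’s method resolution order. The sketch below uses bare stand-in classes and hard-coded titles, and records the call order in a made-up calls list so you can see what happens.

```python
class Base(object):
    def __init__(self):
        self.context = {}
        self.calls = []  # records the order in which update methods ran

    def update(self):
        pass


class EventDetail(Base):
    def update(self):
        super(EventDetail, self).update()
        self.calls.append("event")
        self.context["page_title"] = "Launch party"


class PersonDetail(Base):
    def update(self):
        super(PersonDetail, self).update()
        self.calls.append("person")
        self.context["page_title"] = "Ada Lovelace"


class EventForPerson(EventDetail, PersonDetail):
    def update(self):
        # One call walks the whole chain: EventDetail, PersonDetail, Base.
        super(EventForPerson, self).update()
        self.context["page_title"] = "Ada and Launch party"


view = EventForPerson()
view.update()
```

Note that because each method calls super() before doing its own work, PersonDetail’s body finishes before EventDetail’s, which is the reverse of the explicit calls in the example above. That is the kind of thing you need to know about your base classes before mixing them.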

Url confs

Now for a quick note about how these get hooked up in a url conf file. You might be tempted to do something like this:

urlpatterns += patterns('',
    url(r'^event/(?P<slug>[\d\w\-]+)/person/(?P<id>\d+)/', EventForPerson()),
)

where the EventForPerson class is instantiated so as to provide the url conf with a callable object. But this means a single instance of EventForPerson would be shared by every request that gets processed. Besides not being thread safe, that’s just plain confusing, because the update methods might “dirty up” the instance while processing one request, and that might affect the next request that gets processed. To avoid that, our urlconf looks like this:

urlpatterns += patterns('',
    url(r'^event/(?P<slug>[\d\w\-]+)/person/(?P<id>\d+)/', EventForPerson.view),
)

where EventForPerson.view is just a class method that instantiates and calls a brand new instance of EventForPerson for each request, passing in whatever parameters it receives and returning whatever result it gets. Unfortunately, due to a limitation of Django, you cannot use the handy string notation url(r'^some-regex', "myapp.views.EventForPerson.view") to achieve the same result. So you have to import the view classes into the url conf.
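Here is a stripped-down sketch of the difference, with no Django involved. The Handler below is a guess at the general shape rather than our actual class, and Counter is a made-up view whose update method deliberately mutates instance state.

```python
class Handler(object):
    def __init__(self):
        self.context = {}

    @classmethod
    def view(cls, request, *args, **kwargs):
        # A fresh instance per request, so nothing leaks between requests.
        return cls()(request, *args, **kwargs)

    def __call__(self, request, *args, **kwargs):
        self.request = request
        self.update()
        return dict(self.context)  # stand-in for a rendered response

    def update(self):
        pass


class Counter(Handler):
    def update(self):
        # Mutates instance state: the "dirtying up" described above.
        self.context["hits"] = self.context.get("hits", 0) + 1


shared = Counter()           # what putting Counter() in the url conf does
first = shared("request 1")
second = shared("request 2")  # sees state left behind by request 1

fresh_first = Counter.view("request 1")
fresh_second = Counter.view("request 2")  # starts from a clean instance
```

With the shared instance the second request sees a count of 2; going through Counter.view, every request starts at 1.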

Dealing with conflicting regex groups in a urlconf

The last feature I want to briefly mention is how we deal with conflicting groups in a urlconf. Suppose that both our base classes, PersonDetail and EventDetail, looked up objects based on a regex group named “id”. If we wanted to combine these two view classes into one, the url regex pattern would have to use different group names. The pattern might look like ^event/(?P<event_id>\d+)/person/(?P<person_id>\d+)/. Even though the base classes are looking for the “id” group, we can override their behavior in a subclass. It would look like this:

class EventForPerson(EventDetail, PersonDetail):
    template = "myapp/event/person.html"
    event = EventDetail.event.fromurl("event_id")
    person = PersonDetail.person.fromurl("person_id")
    def update(self):
        # do a bunch more stuff with self.event and self.person
        pass

Without having to know which models are used to look up person and event, I can still reconfigure which parts of the url get used to look them up.
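One way this rebinding could work is for the lookup object to hand back a copy of itself with a different group name, leaving the base class untouched. The sketch below is an illustration under that assumption, not our real implementation: UrlLookup is an invented class name, and plain dictionaries (EVENTS, PEOPLE) stand in for the models.

```python
import copy

EVENTS = {"1": "Launch party"}   # stand-in for the Event model
PEOPLE = {"2": "Ada"}            # stand-in for the Person model


class UrlLookup(object):
    """Descriptor that looks up a url keyword argument in a table."""

    def __init__(self, param, table=None):
        self.param = param
        self.table = table if table is not None else {}

    def model(self, table):
        clone = copy.copy(self)
        clone.table = table
        return clone

    def fromurl(self, param):
        # Same lookup, different regex group; the original is not modified.
        clone = copy.copy(self)
        clone.param = param
        return clone

    def __get__(self, instance, owner):
        if instance is None:
            return self  # class access returns the descriptor itself
        return self.table[instance.kwargs[self.param]]


def fromurl(param):
    return UrlLookup(param)


class View(object):
    def __init__(self, **kwargs):
        self.kwargs = kwargs


class EventDetail(View):
    event = fromurl("id").model(EVENTS)


class PersonDetail(View):
    person = fromurl("id").model(PEOPLE)


class EventForPerson(EventDetail, PersonDetail):
    # Only the url group changes; the table each lookup uses is inherited.
    event = EventDetail.event.fromurl("event_id")
    person = PersonDetail.person.fromurl("person_id")
```

Because EventDetail.event is read from the class, the descriptor returns itself, which is what makes the .fromurl("event_id") call on it possible.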

Conclusion

If you don’t need to reuse your view code, you shouldn’t bother writing them as classes. If you do need to reuse view code, writing them as classes is the only sane way to do it. The utility classes we use at Divvyshot for all our class-based views are still baked into the code base but I hope to open source the useful bits soon. If you are interested in using a similar class-based view implementation, let me know and I’ll move the open sourcing of these utilities higher up on my to-do list.

If you are familiar with writing Django applications, you have probably run across the problem of extending the builtin User authentication model. Django does not yet have the hooks necessary for modifying the User object in a nice way, so you more or less have to resort to monkey patching.

Here is the basic monkey patching pattern I have seen:

def user_get_name(self):
    # do something with the user object which is self
    return "%s, %s" % (self.last_name, self.first_name)

User.get_name = user_get_name

Or if it is really just a one liner you can use a lambda, which avoids dirtying up the local namespace of wherever you are performing the monkey patching:

User.get_name = lambda self: "%s, %s" % (self.last_name, self.first_name)

The first monkey patching pattern makes reading the code incredibly painful (at least to me) and the lambda pattern isn’t much better either.

Decorator Pattern

You can perform the same operations in a more readable manner using decorators. Here is what it would look like:

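def monkeypatch(cls):
    def decorator(f):
        # attach f to cls under its own name
        setattr(cls, f.__name__, f)
        # return f so the decorated name is not rebound to None
        return f
    return decorator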

Now to monkey patch the get_name method of a User object, you would do this:

@monkeypatch(User)
def get_name(self):
    return "%s, %s" % (self.last_name, self.first_name)

I personally think this is a bit more readable. The real advantage to using a monkeypatch decorator though, is that you call out the fact that you are monkey patching. While reading the above code, it is very clear that monkey patching business is going on.
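If you want to try this outside of a Django project, here is a self-contained version. The User class is a bare stand-in for django.contrib.auth.models.User, and this variant of the decorator returns the function so that the module-level name get_name still refers to it afterwards.

```python
def monkeypatch(cls):
    def decorator(f):
        setattr(cls, f.__name__, f)  # attach f to the class being patched
        return f                     # keep the module-level name usable
    return decorator


class User(object):
    """Stand-in for Django's User model, just enough for the example."""

    def __init__(self, first_name, last_name):
        self.first_name = first_name
        self.last_name = last_name


@monkeypatch(User)
def get_name(self):
    return "%s, %s" % (self.last_name, self.first_name)


user = User("Ada", "Lovelace")
full_name = user.get_name()
```

Instances created before the patch pick up the new method too, since it is attached to the class rather than to any one object.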

Monkey patching is almost never the best way to accomplish what you’re trying to do, but it will often get the job done fast. To remind yourself that you should revisit any monkey patching code later and think of a better way to do it, consider renaming the decorator to XXXmonkeypatch.

Class decorators with python 2.6

If you are using python 2.6, you can also use monkey patching decorators on entire classes. Here is an example of such a decorator:

def monkeypatch(cls_to_patch):
    def decorator(cls):
        cls_to_patch.__bases__ += (cls,)
        return cls
    return decorator

You would use this decorator like so:

@monkeypatch(User)
class MyUser:
    def get_name(self):
        return "%s, %s" % (self.last_name, self.first_name)

    def get_initials(self):
        return self.first_name[0]+self.last_name[0]

The main caveat with this method is that MyUser actually becomes a base class of User, so if User ever gets a new method with the same name as one of your monkey patch methods, the User version will take precedence. This might be a feature, depending on what exactly it is you are doing.
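Here is a small demonstration of that precedence rule using stand-in classes. Model and User are placeholders for the Django classes; the stand-in User inherits from Model rather than directly from object because, as far as I know, CPython refuses to reassign __bases__ on a class whose only base is object.

```python
def monkeypatch(cls_to_patch):
    def decorator(cls):
        cls_to_patch.__bases__ += (cls,)  # append cls as an extra base class
        return cls
    return decorator


class Model(object):
    """Stand-in for django.db.models.Model."""


class User(Model):
    """Stand-in for Django's User, with its own get_name method."""

    def __init__(self, first_name, last_name):
        self.first_name = first_name
        self.last_name = last_name

    def get_name(self):
        return self.first_name


@monkeypatch(User)
class MyUser(object):
    def get_name(self):
        # Never reached through User: User.get_name comes first in the MRO.
        return "%s, %s" % (self.last_name, self.first_name)

    def get_initials(self):
        return self.first_name[0] + self.last_name[0]


user = User("Ada", "Lovelace")
```

Here user.get_initials() comes from MyUser, while user.get_name() still resolves to the method defined on User itself.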
