Google AppEngine and Facebook Applications – 10 Things I wish I had known

For the past six weeks, I’ve spent some of my spare time learning about Python, Google AppEngine and how to create Facebook applications with them. In a few weeks, I’ve learned a thing or two the hard way and thought I would share some lessons learned to save other developers from beginner’s frustration.

1. Never exceed 1000

When working with AppEngine, it’s good practice never to exceed 1000 in anything you’re doing. You name it, this rule applies. For example:

  • Your application can’t have more than 1000 files.
  • Each file can’t exceed 1000K (this includes third party libraries).
  • Each page needs to render in under 10000ms.
  • Database queries might not return more than 1000 results.
  • Each data structure in memory shouldn’t exceed 1000K.
  • Each object stored in memcache can’t exceed 1000K
  • and so on….

Before you choose to build an app with AppEngine, make sure you can accomplish what you want to do within these limitations. It might make sense to only use AppEngine for part of the whole project (e.g. AppEngine for processing and Amazon S3 for storage).

(**UPDATE** Google recently upped some of these AppEngine limits, but not for everything)
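One way to stay under the size limits above is to estimate how big an object is before you try to store it. The sketch below is only an illustration: the helper names are made up, the 1000K figure is copied from the list above (check the current quotas), and it assumes the pickled size is a reasonable estimate of what the service will count.

```python
import pickle

# Assumption: the 1000K figure comes from the list above, not from current docs.
MAX_VALUE_BYTES = 1000 * 1024


def serialized_size(obj):
    """Return the number of bytes obj occupies once pickled."""
    return len(pickle.dumps(obj, pickle.HIGHEST_PROTOCOL))


def fits_limit(obj, limit=MAX_VALUE_BYTES):
    """Return True if the pickled object is no larger than limit bytes."""
    return serialized_size(obj) <= limit
```

If `fits_limit` returns False, you could split the object into smaller pieces or store it somewhere other than memcache.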

2. AppEngine forces your code to scale out, not up

When I first heard about cloud computing and scalable infrastructures, I thought it meant giant supercomputing clusters which can handle massive amounts of processing and calculations.

AppEngine isn’t like this at all. It’s designed from the ground up to be scalable, but it achieves this by doing hundreds of thousands of small tasks instead of tens of really big tasks. And your source code needs to embrace this philosophy. If your script needs to spend time processing thousands of records, you should re-think why it has to be one script instead of ten smaller ones.
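As a rough sketch of what "ten smaller scripts" can look like, the snippet below splits a large list of records into batches that could each be handled by a separate short request. The names `chunked` and `process_batch` are hypothetical, and the batch size of 1000 is just borrowed from tip 1.

```python
def chunked(items, size):
    """Yield successive lists of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def process_batch(batch):
    """Hypothetical per-request work: here, just total the batch."""
    return sum(batch)


records = list(range(2500))
# In a real app, each batch would be handled by its own short request.
partial_totals = [process_batch(batch) for batch in chunked(records, 1000)]
```

Each call to `process_batch` is small enough to finish quickly, and the partial results can be combined afterwards.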

Switching my brain to architect for AppEngine was the hardest part of AppEngine development, but once I got into the groove, it made total sense. I’ve really enjoyed building my applications from the ground up with scalability in mind. I might not be as open-minded if I had to port an existing application to AppEngine, but luckily, I haven’t had to do that yet. ;)

3. Use DynDNS to develop AppEngine/Facebook apps locally

AppEngine imposes a daily quota of 250 deployments to their server. This limit seems reasonable, but often you’ll need to test your Facebook applications in Facebook itself. And if you’re tweaking CSS or troubleshooting bugs, then you can use up your quota quickly if you have to deploy a new app each time you want to test a change in Facebook. If you use up all your deployments for the day, then you can’t upload anymore and have to stop development until the quota resets in 24 hours.

This has happened to me twice now, and after the second time, I found a great thread in the Developing for Facebook + Google App Engine group describing a solution for using DynDNS or similar service to give a domain name to your local PC, then pointing your Facebook app at your local computer. That way you can test the application on Facebook.com using your local AppEngine devserver. Trust me, this is worth the setup time.

4. Be prepared to dig in, tweak and modify Python libraries

There are a lot of great Python libraries out there, but many of them don’t work with AppEngine as-is because of AppEngine’s unique webapp framework. You can get most libraries to work with AppEngine by adding a line or two of custom code, but you have to be willing to dig into the code and fix it.

For instance, I’m using the Google YouTube API, and in order for it to work with AppEngine, you need to override the http_request_handler like this:

import gdata.service
import gdata.urlfetch
gdata.service.http_request_handler = gdata.urlfetch

Another example is custom template tags. You need to register your custom tags with AppEngine’s framework:

register = webapp.template.create_template_register()

And then in each of your individual scripts you need to register the library. So for a library named ‘customfilters’ it would be:

webapp.template.register_template_library('customfilters')

w00kie has a good blog entry talking about this in more detail, but don’t expect a lot of existing Python libraries to be completely plug-n-play with AppEngine.

5. There is never too much error detection

When a user visits your URL on Facebook, Facebook will call the URL on AppEngine, AppEngine will use its framework to get data from the Internet, from its DataStore, and from Memcache, then return the result to Facebook which processes the FBML and displays the content to the user.

Unfortunately, just about anything can go wrong. I’ve had Facebook authentication fail even though I was logged in, I’ve had Facebook give up on waiting for AppEngine to render its page, I’ve had AppEngine throw errors when doing a simple urlfetch, and I’ve had third party APIs suddenly stop responding. These errors are rare and normally not reproducible, but you still don’t want your users trying to figure out what an "ApplicationError 5" means. So write your code to handle lots of exceptions.
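Here is a minimal sketch of that kind of defensive wrapper. The function name and the `fetch` parameter are made up for illustration; `fetch` stands in for whatever call might fail (a urlfetch, a third party API client, and so on). It catches every exception for brevity, and in real code you would probably catch narrower exception types.

```python
import logging


def fetch_with_fallback(fetch, url, fallback=None, attempts=2):
    """Call fetch(url), retrying on failure; return fallback if every attempt fails."""
    for attempt in range(1, attempts + 1):
        try:
            return fetch(url)
        except Exception as exc:
            # Log the details for yourself instead of showing them to the user.
            logging.warning("fetch %s failed (attempt %d of %d): %s",
                            url, attempt, attempts, exc)
    return fallback
```

The caller can then render a friendly placeholder whenever it gets the fallback value back, instead of letting a raw error reach the page.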

6. FBJS is your friend and is key to achieving scalability in Facebook apps on AppEngine

The home page of my Facebook app is a beast. The content you see on the home page comes from more than 30 URLs on 10 different domains and third party APIs. Waiting for AppEngine to download and render this content takes forever, but I was able to pull it off by breaking up the page into five separate pieces. There’s a shell page, and then within that shell page there are four modules which each use FBJS to make a separate AJAX call to AppEngine to retrieve and display their own content.

I’ve learned the hard way that putting all your code in one page can take forever to render and consume lots of CPU, and FBJS helps spread the page load out across multiple scripts.

7. Debugging FBJS is a real pain

While FBJS helps you scale out, debugging FBJS is a real pain. It only warns you of syntax errors, so if you have a logical error your script fails without warning. Facebook doesn’t report errors to the browser or allow you to use alerts, so the only solution I’ve found so far is to comment out your JS code one line at a time until you find the trouble spots. I would only advise doing this if you’re developing locally, otherwise you’ll quickly run into your quota limit for daily uploads to AppEngine.

8. If you’re retrieving external content, memcache is your best friend

As I mentioned earlier, the home page of my Facebook app gets most of its content from external URLs. For each URL, you fetch its content, process it into a native Python object (list or dict), and then render the content out via a template. This can eat up your CPU hours, slow down your response time, and make users give up on you.

Using memcache fixes all this. Memcache can store native Python objects, so once you’ve parsed a URL’s content into a native format, store the native object directly in memcache and retrieve it next time a user needs content from that URL.
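The pattern looks roughly like the sketch below. To keep it runnable anywhere, it uses a small in-memory `DictCache` class as a stand-in for memcache; that class, the `get_parsed` helper and the `fetch` parameter are all hypothetical, and it assumes the external content is JSON. On AppEngine you would call `memcache.get` and `memcache.set` from `google.appengine.api` instead.

```python
import json


class DictCache(object):
    """In-memory stand-in for memcache so the sketch runs anywhere."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value):
        self._data[key] = value


def get_parsed(url, cache, fetch):
    """Return the parsed content for url, fetching and parsing only on a cache miss."""
    parsed = cache.get(url)
    if parsed is None:
        parsed = json.loads(fetch(url))  # parse once into a dict or list
        cache.set(url, parsed)           # store the native object, not the raw text
    return parsed
```

The second request for the same URL skips both the fetch and the parsing, which is where the CPU and response-time savings come from.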

9. Use cron jobs to keep memcache current

Using memcache speeds up response time for all users except your first user. Since you shouldn’t be treating your first user any differently than the others, it’s worth setting up a script that keeps memcache refreshed with external content. This way all users will benefit from the speedup of memcache.

In my Facebook application, I’m retrieving content from a pool of around 3,000 different URLs, so I have set up a script that randomly picks 3-5 of these URLs, retrieves their content, and stores the result in memcache. I’ve also set up a cron job to call this caching script every minute or so and it’s sped up the average response time of the page, because the server never has to go out and retrieve 30 URLs of content at once. Also, if you are using third party APIs that put a limit on your usage, this is a great way to ensure you stay under those limits.
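A caching script along those lines could be sketched like this. It is not my actual script: `refresh_some` and `fetch_and_parse` are hypothetical names, and a plain dict stands in for memcache so the example runs on its own.

```python
import random


def refresh_some(urls, cache, fetch_and_parse, rng=random):
    """Refresh 3 to 5 randomly picked URLs in cache; return the URLs picked."""
    count = min(len(urls), rng.randint(3, 5))
    picked = rng.sample(urls, count)
    for url in picked:
        try:
            cache[url] = fetch_and_parse(url)
        except Exception:
            # Keep whatever is already cached rather than failing the whole run.
            continue
    return picked
```

Each cron run calls `refresh_some` once, so only a handful of external requests happen per minute no matter how many users are on the page.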

Right now, I’m executing the cron job from my own webserver, but AppEngine has said that cron support is on their roadmap, so hopefully in the future you’ll be able to support this entirely from within your AppEngine setup.

10. Once you’ve built an app or two with AppEngine, you’ll either love it or hate it.

I’ve really enjoyed developing apps with AppEngine, but I will admit it’s not for everyone. Anyone needing to do a lot of heavy data processing, or handle incredibly large data sets will experience nothing but frustration with AppEngine. However, for the majority of online projects, it’s a great way to build something scalable quickly, making it ideal for Facebook applications.

My first major Facebook application should be ready for public beta in the next week or two, so I’ll keep you posted about its progress.

12 thoughts on “Google AppEngine and Facebook Applications – 10 Things I wish I had known”

  1. Pingback: RepresentedBy Facebook app launches public beta :: wubbahed.com

  2. Vivek M. Chawla

    Thanks for putting this together! I’ve been doing my own work with Facebook Apps on GAE, and it was highly instructive to read about your experiences.

    One question: You talked about using DynDNS so that Facebook hits your local AppEngine dev server. I was wondering if there was a reason you preferred to do things this way instead of just setting the Canvas Callback URL of your Facebook app to http://127.0.0.1:8080 (or whatever port you have the dev server running on)?

    Thanks again for the post!

    Vivek

  3. Thomas Lopes

    Good tips and article! Thanks!

    But if you don’t mind updating the tips sometime, give us some tips on indexes. I worked on a project that demanded some extra indexes, and I saw this was a bit of a pain for our developers. I guess this is a common issue for AppEngine applications.

    Best regards,

  4. Scott S. McCoy

    I keep a virtual dedicated system around (I lease from linode). I’ve found that rather than using DynDNS, reverse proxies over SSH work quite well. I point my facebook application canvas to the IP of the public server, and I use ssh to open the reverse tunnel, using ssh -fNR 8080:0:8080 . This lets my test facebook application run on my local computer no matter where it happens to be (a cafe, a corporate network, my own private network with no DMZ).

