For the past six weeks, I’ve spent some of my spare time learning about Python, Google AppEngine and how to create Facebook applications with them. In a few weeks, I’ve learned a thing or two the hard way and thought I would share some lessons learned to save other developers from beginner’s frustration.
1. Never exceed 1000
When working with AppEngine, it’s good practice never to exceed 1000 in anything you’re doing. You name it, this rule applies. For example:
- Your application can’t have more than 1000 files.
- Each file can’t exceed 1000K (this includes third party libraries).
- Each page needs to render in under 10000ms.
- Database queries might not return more than 1000 results.
- Each data structure in memory shouldn’t exceed 1000K.
- Each object stored in memcache can’t exceed 1000K.
- and so on….
Before you choose to build an app with AppEngine, make sure you can accomplish what you want to do within these limitations. It might make sense to only use AppEngine for part of the whole project (e.g. AppEngine for processing and Amazon S3 for storage).
(**UPDATE** Google recently upped some of these AppEngine limits, but not for everything)
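One way to stay on the right side of a size limit is to measure an object before trying to store it. The sketch below is only an illustration: the helper names and the LIMIT_BYTES constant are made up for this example, and it assumes the pickled size is a reasonable stand-in for the size the cache actually stores.

```python
import pickle

# The "1000K" rule of thumb from the list above. This constant is an
# assumption for the sketch, not a value read from AppEngine.
LIMIT_BYTES = 1000 * 1024


def pickled_size(obj):
    """Return the number of bytes obj occupies once pickled."""
    return len(pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL))


def fits_limit(obj, limit=LIMIT_BYTES):
    """True if the pickled object is no larger than limit bytes."""
    return pickled_size(obj) <= limit


small = {"title": "hello", "items": [1, 2, 3]}
large = "x" * (2 * 1024 * 1024)  # about 2 MB of text

print(fits_limit(small))
print(fits_limit(large))
```

If an object turns out to be too big, splitting it into several smaller entries is usually easier than fighting the limit.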
2. AppEngine forces your code to scale out, not up
When I first heard about cloud computing and scalable infrastructures, I thought it meant giant supercomputing clusters which can handle massive amounts of processing and calculations.
AppEngine isn’t like this at all. It’s designed from the ground up to be scalable, but it achieves this by doing hundreds of thousands of small tasks instead of tens of really big tasks. And your source code needs to embrace this philosophy. If your script needs to spend time processing thousands of records, you should re-think why it has to be one script instead of ten smaller ones.
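As a rough illustration of the "ten smaller scripts" idea, here is a minimal sketch that cuts a big list of records into batches. The names chunks and process_batch are hypothetical, and the plain loop stands in for what would be separate requests on AppEngine.

```python
def chunks(items, size):
    """Yield consecutive slices of items, each at most size long."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def process_batch(batch):
    """Placeholder for the real per-batch work; here it just sums."""
    return sum(batch)


records = list(range(2500))

# On AppEngine each batch would be handled by its own small request;
# this loop only demonstrates the splitting.
results = [process_batch(batch) for batch in chunks(records, 1000)]
print(len(results))
```

Each batch stays under the 1000-record mark, so no single request has to do all the work.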
Switching my brain to architect for AppEngine was the hardest part of AppEngine development, but once I got into the groove, it made total sense. I’ve really enjoyed building my applications from the ground up with scalability in mind. I might not be as open-minded if I had to port an existing application to AppEngine, but luckily, I haven’t had to do that yet.
3. Use DynDNS to develop AppEngine/Facebook apps locally
AppEngine imposes a daily quota of 250 deployments to their server. This limit seems reasonable, but often you’ll need to test your Facebook applications in Facebook itself. And if you’re tweaking CSS or troubleshooting bugs, then you can use up your quota quickly if you have to deploy a new app each time you want to test a change in Facebook. If you use up all your deployments for the day, then you can’t upload anymore and have to stop development until the quota resets in 24 hours.
This has happened to me twice now, and after the second time, I found a great thread in the Developing for Facebook + Google App Engine group describing a solution for using DynDNS or similar service to give a domain name to your local PC, then pointing your Facebook app at your local computer. That way you can test the application on Facebook.com using your local AppEngine devserver. Trust me, this is worth the setup time.
4. Be prepared to dig in, tweak and modify Python libraries
There are a lot of great Python libraries out there, but many of them don’t work with AppEngine because of AppEngine’s unique webapp framework. You can get most libraries to work with AppEngine by adding a line or two of custom code, but you have to be willing to dig into the code and fix it.
For instance, I’m using the Google YouTube API, and in order for it to work with AppEngine, you need to override the http_request_handler like this:
import gdata.service
import gdata.urlfetch
gdata.service.http_request_handler = gdata.urlfetch
Another example is custom template tags. You need to register your custom tags with AppEngine’s framework:
register = webapp.template.create_template_register()
And then in each of your individual scripts you need to register the library. So for a library named ‘customfilters’ it would be:
webapp.template.register_template_library('customfilters')
w00kie has a good blog entry talking about this in more detail, but don’t expect a lot of existing Python libraries to be completely plug-n-play with AppEngine.
5. There is never too much error detection
When a user visits your URL on Facebook, Facebook will call the URL on AppEngine, AppEngine will use its framework to get data from the Internet, from its DataStore, and from Memcache, then return the result to Facebook which processes the FBML and displays the content to the user.
Unfortunately, just about anything can go wrong. I’ve had Facebook authentication fail even though I was logged in, I’ve had Facebook give up on waiting for AppEngine to render its page, I’ve had AppEngine throw errors when doing a simple urlfetch, and I’ve had third party APIs suddenly stop responding. These errors are rare and normally not reproducible, but you still don’t want your users trying to figure out what an "ApplicationError 5" means. So write your code to handle lots of exceptions.
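A minimal sketch of that kind of defensive code is below. The function fetch_with_fallback and its parameters are hypothetical; the fetch argument is whatever callable you use to retrieve a URL, and catching every Exception is a simplification that real code would probably narrow down.

```python
import logging


def fetch_with_fallback(fetch, url, retries=2, fallback=None):
    """Call fetch(url), retrying on failure, then return fallback.

    fetch is any callable that takes a URL and returns content. The
    broad except clause is deliberate for this sketch only.
    """
    for attempt in range(retries + 1):
        try:
            return fetch(url)
        except Exception as exc:
            logging.warning("fetch failed (attempt %d) for %s: %s",
                            attempt + 1, url, exc)
    # Every attempt failed: hand back something the page can still render.
    return fallback
```

The caller can then show a friendly "content unavailable" message whenever the fallback comes back, instead of a raw error.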
6. FBJS is your friend and is key to achieving scalability in Facebook apps on AppEngine
The home page of my Facebook app is a beast. The content you see on the home page comes from more than 30 URLs on 10 different domains and third party APIs. Waiting for AppEngine to download and render this content takes forever, but I was able to pull it off by breaking up the page into five separate pieces. There’s a shell page, and then within that shell page there are four modules which each use FBJS to make a separate AJAX call to AppEngine to retrieve and display their own content.
I’ve learned the hard way that putting all your code in one page can take forever to render and consume lots of CPU, and FBJS helps spread the page load out across multiple scripts.
7. Debugging FBJS is a real pain
While FBJS helps you scale out, debugging FBJS is a real pain. First, it only warns you of syntax errors, so if you have a logical error your script fails without warning. Facebook doesn’t report errors to the browser or allow you to use alerts, so the only solution I’ve found so far is to comment out your JS code one line at a time until you find the trouble spots. I would only advise doing this if you’re developing locally, otherwise you’ll quickly run into your quota limit for daily uploads to AppEngine.
8. If you’re retrieving external content, memcache is your best friend
As I mentioned earlier, the home page of my Facebook app gets most of its content from external URLs. For each URL, you fetch its content, process it into a native Python object (list or dict), and then render the content out via a template. This can eat up your CPU hours, slow down your response time, and make users give up on you.
Using memcache fixes all this. Memcache can store native Python objects, so once you’ve parsed a URL’s content into a native format, store the native object directly in memcache and retrieve it next time a user needs content from that URL.
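The pattern looks roughly like the sketch below. To keep it runnable anywhere, it uses a small in-process DictCache class as a stand-in for memcache; that class, get_content, and fetch_and_parse are all names invented for this example. On AppEngine, the memcache module's get and set calls would take the place of the stand-in.

```python
class DictCache(object):
    """In-process stand-in with get/set, used here instead of memcache."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        # Returns None on a miss.
        return self._data.get(key)

    def set(self, key, value):
        self._data[key] = value


def get_content(url, cache, fetch_and_parse):
    """Return the parsed content for url, fetching only on a cache miss."""
    cached = cache.get(url)
    if cached is not None:
        return cached
    parsed = fetch_and_parse(url)  # the slow part: download and parse
    cache.set(url, parsed)         # store the native Python object
    return parsed
```

The first request for a URL pays the full cost, and every later request gets the stored list or dict straight back.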
9. Use cron jobs to keep memcache current
Using memcache speeds up response time for all users except your first user. Since you shouldn’t be treating your first user any differently than the others, it’s worth setting up a script that keeps memcache refreshed with external content. This way all users will benefit from the speedup of memcache.
In my Facebook application, I’m retrieving content from a pool of around 3,000 different URLs, so I have set up a script that randomly picks 3-5 of these URLs, retrieves their content, and stores the result in memcache. I’ve also set up a cron job to call this caching script every minute or so, and it’s sped up the average response time of the page because the server never has to go out and retrieve 30 URLs of content at once. Also, if you are using third party APIs that put a limit on your usage, this is a great way to ensure you stay under those limits.
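A refresh script along those lines might look something like this. It is a sketch rather than my actual script: refresh_some and fetch_and_parse are hypothetical names, and a plain dict stands in for memcache so the example runs on its own.

```python
import random


def refresh_some(urls, cache, fetch_and_parse, low=3, high=5, rng=random):
    """Refresh a random handful of URLs and return the ones that worked.

    cache is a plain dict here, standing in for memcache.
    """
    count = min(len(urls), rng.randint(low, high))
    refreshed = []
    for url in rng.sample(urls, count):
        try:
            cache[url] = fetch_and_parse(url)
            refreshed.append(url)
        except Exception:
            # A failed URL keeps its old cached value until a later run.
            continue
    return refreshed


urls = ["http://example.com/feed/%d" % i for i in range(3000)]
cache = {}
done = refresh_some(urls, cache, lambda url: {"source": url})
print(len(done))
```

Because each run only touches a few URLs, the work per request stays small and calls to third party APIs are spread out over time.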
Right now, I’m executing the cron job from my own webserver, but AppEngine has said that cron support is on their roadmap, so hopefully in the future you’ll be able to support this entirely from within your AppEngine setup.
10. Once you’ve built an app or two with AppEngine, you’ll either love it or hate it.
I’ve really enjoyed developing apps with AppEngine, but I will admit it’s not for everyone. Anyone needing to do a lot of heavy data processing, or handle incredibly large data sets will experience nothing but frustration with AppEngine. However, for the majority of online projects, it’s a great way to build something scalable quickly, making it ideal for Facebook applications.
My first major Facebook application should be ready for public beta in the next week or two, so I’ll keep you posted about its progress.
Thanks for compiling this top 10. Very helpful.
I noticed that you pointed to the blog post instructions for using the Google Data APIs on App Engine, and since the initial announcement, I’ve tried to make things easier. Now you can use:
import gdata.alt.appengine
gdata.alt.appengine.run_on_appengine(your_service_client_instance)
This is explained here: http://code.google.com/appengine/articles/more_google_data.html#auth
Enjoyed the post, thanks for writing this up
Thanks for putting this together! I’ve been doing my own work with Facebook Apps on GAE, and it was highly instructive to read about your experiences.
One question: You talked about using DynDNS so that Facebook hits your local AppEngine dev server. I was wondering if there was a reason you preferred to do things this way instead of just setting the Canvas Callback URL of your Facebook app to http://127.0.0.1:8080 (or whatever port you have the dev server running on)?
Thanks again for the post!
Vivek
I agree about FBJS. Since the announcement of App Engine for Java, I’ve deployed a simple Facebook App. I think GAE is a good fit for this kind of development. Facebook apps serve a specific purpose, but they might get popular and need to scale.
My blog post about it: http://socialjava.blogspot.com/2009/04/facebook-apps-in-java-running-on-google.html
More info on http://www.socialjava.com. Includes source.
Carmen
Great article!
A few remarks:
1. for local machine FB development I found ssh reverse tunneling to be the easiest solution http://blog.kenweiner.com/2007/09/reverse-ssh-tunnel-for-facebook.html
2. cron jobs are now part of appengine api
http://code.google.com/appengine/docs/python/config/cron.html
Looking fwd to seeing your FB app ship!
Very Useful. Thanks a lot of sharing your thoughts.
Good tips and article! Thanks!
But if you don’t mind updating the tips sometime, please give us some tips on indexes. I worked on a project that demanded some extra indexes, and it was a bit of a pain for our developers. I guess this is a common issue for AppEngine applications.
Best regards,
To use your DynDNS tip, the following option is required when you start the dev appserver: '--address=0.0.0.0'. Otherwise my local dev appserver wouldn’t respond to requests for anything other than ‘localhost’. There’s also an “extra flags” box in the settings of my GUI launcher that takes the same option string.
Documented here:
http://code.google.com/appengine/docs/python/tools/devserver.html
I keep a virtual dedicated system around (I lease from linode). I’ve found that rather than using DynDNS, reverse proxies over SSH work quite well. I point my facebook application canvas to the IP of the public server, and I use ssh to open the reverse tunnel, using
ssh -fNR 8080:0:8080. This lets my test facebook application run on my local computer no matter where it happens to be (a cafe, a corporate network, my own private network with no DMZ).
Can I please have sample code?
Thanks,
Nirav
I am stuck with infinite redirect between facebook login and logout when I am using the facebook python sdk
(https://github.com/facebook/python-sdk/tree/master/examples/appengine/)
Did you encounter a similar problem by any chance?
Thanks,
Nirav