Archive for the 'Programming' Category


RESTful URL Design For Search And Collections

I’m trying to find the appropriate RESTful URL design for search and for collections of items.

The setup: we have two models, Cars and Garages, where Cars can be in Garages. Base URLs:


/car/xxx           (xxx = car id)
/garage/yyy     (yyy = garage id)

Now we want to provide a search for cars - e.g., show me all the blue sedans with 4 doors. What’s the appropriate URL?


1  - /cars/color/blue/type/sedan/doors/4
2  - /cars/color:blue/type:sedan/doors:4
3  - /cars/?color=blue&type=sedan&doors=4
4  - /car/search/...

None of these are satisfying.

1 through 3 use “cars” as the base (as opposed to “car”). So the pattern for doing searches / collections would be to pluralize the model. This seems ok.

1 has arbitrary ordering of the fields and no good way to distinguish fields versus their values. 2 is slightly better, but still doesn’t seem right.

3 uses the QUERYSTRING for the parameters instead of the PATHINFO, and frankly looks better to me, but I’ve heard of objections to using QUERYSTRING. The problem I have with it is it’s not consistent - if I was searching on a single field my URL would probably be: /cars/color/red or something like that. Having the URL drastically change form just because there are more search parameters seems wrong.

4 uses the “car” base url along with the verb “search”. That seems wrong - verbs shouldn’t be part of the URL, right? It’s been suggested several times though.

Now a slightly different case - let’s find all the cars in a given garage:


1  - /garage/yyy/cars
2  - /cars/?garage=yyy

1 seems pretty good in this case.
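For what it’s worth, here is a minimal Python sketch of the query-string style (option 3 for search, option 2 for garages). The helper names search_url and parse_filters are made up for illustration, and sorting the keys is just one assumed way to deal with the arbitrary-ordering complaint. It also shows that a single-field search keeps the same URL shape as a multi-field one.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def search_url(base, **filters):
    """Build a collection URL with filters in the query string.

    Keys are sorted so the same set of filters always yields the same URL.
    """
    return base + "?" + urlencode(sorted(filters.items()))


def parse_filters(url):
    """Turn the query string back into a {field: value} dict."""
    query = urlsplit(url).query
    return {field: values[0] for field, values in parse_qs(query).items()}


print(search_url("/cars/", color="blue", type="sedan", doors=4))
print(search_url("/cars/", color="red"))
print(parse_filters("/cars/?garage=yyy"))
```

One field or five, the URL is always the collection plus a query string, so the form doesn’t change as parameters are added.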

Please chime in with your thoughts, either in the comments here or in the Stackoverflow thread.

Setting Up Beanstalkd on Ubuntu for Python

beanstalkd is a promising in-memory queuing system in the mold of memcached (minimal configuration, just works) with client libraries in a variety of languages. The following worked for me for installing it on Ubuntu 8.04:


mkdir ~/packages

# pre-requisite: libevent.
cd ~/packages
wget http://monkey.org/~provos/libevent-1.4.8-stable.tar.gz
tar zxvf libevent-1.4.8-stable.tar.gz
cd libevent-1.4.8-stable
./configure
make
sudo make install

# add /usr/local/lib to your load library path so beanstalkd can find libevent
vi ~/.bashrc   (add the following somewhere near the end):
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/lib

(exit vi)
source ~/.bashrc

# need git in order to get latest code for beanstalkd
cd ~/packages
sudo apt-get install git-core

# grab beanstalkd
git clone http://xph.us/src/beanstalkd.git
cd beanstalkd
make

# now you should be able to start the beanstalkd daemon
# (TCP ports must be below 65536; 11300 is beanstalkd's default)
./beanstalkd -d -p 11300

# get pyyaml, a pre-requisite for the python beanstalkd client
cd ~/packages
wget http://pyyaml.org/download/pyyaml/PyYAML-3.06.tar.gz
tar zxvf PyYAML-3.06.tar.gz
cd PyYAML-3.06
sudo python setup.py install

# get the python beanstalkd client
cd ~/packages
svn checkout http://pybeanstalk.googlecode.com/svn/trunk/ pybeanstalk-read-only

cd pybeanstalk-read-only
sudo python setup.py install

# open two different shells (or use screen) and run one of the last two commands in each:
cd ~/packages/pybeanstalk-read-only/examples
python simple_clients.py producer localhost 11300
python simple_clients.py consumer localhost 11300
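
If you want to poke at the daemon without any client library, here is a rough sketch that speaks beanstalkd’s plain-text protocol directly. It does not use pybeanstalk. The function names build_put and put_job are my own, and the priority, delay and time-to-run defaults are arbitrary assumptions. Pass whatever port your daemon is listening on (11300 is beanstalkd’s default).

```python
import socket


def build_put(body, priority=1024, delay=0, ttr=60):
    """Format a beanstalkd `put` command for a bytes job body.

    Wire format: put <pri> <delay> <ttr> <bytes>\r\n<data>\r\n
    """
    header = "put %d %d %d %d\r\n" % (priority, delay, ttr, len(body))
    return header.encode("ascii") + body + b"\r\n"


def put_job(body, host="localhost", port=11300):
    """Send one job and return the server's reply line (e.g. 'INSERTED 1')."""
    with socket.create_connection((host, port)) as conn:
        conn.sendall(build_put(body))
        return conn.makefile("rb").readline().decode("ascii").strip()


# Only builds the bytes; call put_job(b"hello") once the daemon is running.
print(build_put(b"hello"))
```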

Stackoverflow: Surprisingly Good Source of Technical Answers

Stackoverflow is a newish service for developers - ask a question, get answers, vote on answers, build reputation. Sort of like Yahoo! Answers but for developers.

I’m surprised at how good the service is so far. I asked a question on IRC and the same question on Stackoverflow. Within 2 minutes I got an incorrect answer on IRC and two correct answers on Stackoverflow. The voting and reputation seem to really work and there are no IRC trolls / egos to deal with.

I’m hoping the site will maintain its usefulness as it grows. Well worth checking out. I’m here.

Drizzle: MySQL Based Slim / Cloud-Oriented DB

Drizzle is interesting:

Drizzle: A High-Performance Microkernel DBMS for Scale-Out Applications
Drizzle is a community-driven project based on the popular MySQL DBMS that is focused on MySQL’s original goals of ease-of-use, reliability and performance.

Headed up by Brian Aker, Director of Architecture at MySQL AB. Take a look at the MySQL Differences page and you’ll mostly see features removed and cleaned up, which is great. Designed for high levels of concurrency, targeted to “cloud” applications. Monty and Brian’s posts offer motivation for the project.

Something to keep an eye on.

Three

Three.

The number of programmers who will write most of the code in a system developed by a team of 24 engineers, two project managers, three group leaders, a quality lead and an office manager.

From Russ Olsen.

TraceMonkey: It’s A Big Deal

Mozilla announces TraceMonkey, a just-in-time compiler for Javascript. If you’ve watched Steve Yegge’s talk on Dynamic Languages (transcript) you’ve already had a taste of what the future could look like for dynamic languages - namely, performance on par with today’s low level languages.

Javascript started as an ugly language but has been steadily shedding its bad parts and adopting a beautiful functional style. With the performance piece figured out and a tremendously large number of installations and runtimes (just about every browser in existence has a Javascript engine), it could become the most important programming language of the near future.

Trying Mercurial

All the cool kids are into Git these days and I’ve been reading plenty of articles about how good it is and how to use it. The problem is, I don’t really have a problem with Subversion. I know I should, because all the cool kids do, but I just don’t run into a lot of issues with it. In the absence of a problem to solve it would simply be peer-pressure to give Git a shot.

So in an attempt to remain ever-so-independent I’m going to try a distributed system, but not Git. I’m going with Mercurial.

Actually mainly it’s because Mercurial seems significantly simpler, and I’m a simple guy. It’s also written in Python, which gives me a warm and fuzzy. And I’m finally motivated to try it because I’m going to try a code path which may not work out, and I understand these distributed systems deal with that well.

Hmm. The main thing I’d want from a source code control system would be a bit of packaging and deployment intelligence built in. Maybe something to minify and join my javascript files and mend the files that reference them. I’m extremely pleased not to have a “make” step anywhere in my process, but I do miss some of the capabilities.

If I’m making a mistake and I should go with Git, or perhaps CVS, do let me know.

Parsing (Top-Down) in Python

Excellent article on Simple Top-Down Parsing in Python. The nud and led business could be better explained, but the rest of the article and the code are great. I learned several things I hope to employ shortly.

I’m trying to remember if we studied this in compiler class or not. I think not, although I have terrible memory, so it’s possible we did.

Anyway, a companion tutorial article that would approach this strictly from the perspective of using the toolkit Effbot built in his article would be nice. In other words, knowing the under-the-hood details is fantastic and informative, but given the tools and helper functions built in the article one could fairly easily build a parser without worrying about how the helper functions are implemented. Sort of a user manual for building parsers given the helper functions.
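
To make the nud / led idea a bit more concrete, here is my own stripped-down sketch of a top-down operator precedence evaluator. It is not the toolkit from the article: the class names and binding powers are assumptions chosen for illustration, and it only handles integers, + and *. The gist is that nud is what a token does when it starts an expression (a literal returns its value), and led is what a token does when it appears to the right of an already-parsed left operand (an operator combines left with whatever follows).

```python
import re


class Literal:
    lbp = 0  # left binding power: literals never bind to the left

    def __init__(self, value):
        self.value = value

    def nud(self, parser):
        # "null denotation": token appears at the start of an expression
        return self.value


class Add:
    lbp = 10

    def led(self, parser, left):
        # "left denotation": token appears after a parsed left operand
        return left + parser.expression(self.lbp)


class Mul:
    lbp = 20  # binds tighter than Add

    def led(self, parser, left):
        return left * parser.expression(self.lbp)


class End:
    lbp = 0  # never binds, so the expression loop stops here


class Parser:
    def __init__(self, text):
        self.tokens = self.tokenize(text.strip())
        self.token = next(self.tokens)

    @staticmethod
    def tokenize(text):
        for number, operator in re.findall(r"\s*(?:(\d+)|(.))", text):
            if number:
                yield Literal(int(number))
            elif operator == "+":
                yield Add()
            elif operator == "*":
                yield Mul()
            else:
                raise SyntaxError("unknown operator: %r" % operator)
        yield End()

    def expression(self, rbp=0):
        current, self.token = self.token, next(self.tokens)
        left = current.nud(self)
        # Keep absorbing operators that bind tighter than our caller does.
        while rbp < self.token.lbp:
            current, self.token = self.token, next(self.tokens)
            left = current.led(self, left)
        return left


print(Parser("1 + 2 * 3").expression())
```

Because Mul has a higher binding power than Add, the multiplication is grouped first without any explicit grammar rules.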

Django: Retrieving Backward Related Objects

Another one in the category of always-forget-how-to-do-this-so-noting-here:


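from django.db import models

class Blog(models.Model):
    name = models.CharField(max_length=100)  # illustrative field

class Entry(models.Model):
    # Django 2.0 and later also require an on_delete argument here,
    # e.g. on_delete=models.CASCADE.
    blog = models.ForeignKey(Blog)
    headline = models.CharField(max_length=255)  # used by the filter below

b = Blog.objects.get(id=1)
b.entry_set.all() # Returns all Entry objects related to Blog.

# b.entry_set is a Manager that returns QuerySets.
b.entry_set.filter(headline__contains='Lennon')
b.entry_set.count()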

Full docs here.

Cross-Language Data Serialization and Exchange

Interesting new open source release from Google called Protocol Buffers. Language neutral data serialization and exchange via protocol definition and generated code for C++, Java, and Python.

Apparently Protocol Buffers are heavily used inside Google, so they look to be a robust implementation. Should be a good format for wire protocols.

They compare it to XML and tout its size and speed advantages. In a client/server implementation, however, JSON is the more likely alternative. I wonder how the size and speed compare.

Javascript Is The Guy With The Thing

In most programming languages (Java, C, Python, Perl) I’m generally thinking “I’ll put this thing on this shelf here, then I’ll do x, then I’ll pick up that thing, do some work on it, put the result over here,” and so forth.

With Javascript, particularly when used correctly, which for me means in the Way Of JQuery, the thought process is more like “When some event happens, this guy will wake up and he’ll know what to do. He’ll remember his name, what he was supposed to work on, and he’ll be carrying his own tools. He might get blocked at some point, but then he’ll just wait around and when he’s ready to go he’ll remember who he is, what he was doing, and how far along doing it he was. And when he’s done he’ll go away and along with him will go his tools and any other mess he made”.

Javascript is a lot more “guy with the thing” thinking than “what’s on this shelf here?” thinking. I guess that’s called closures, or something like that. Anyway, I’m liking it.

Photo by St-Even.

Put A Queue In Your Standard Toolkit

A friend forwarded me an email about yet another group using a database to implement what’s really a queue. Not surprisingly, performance is an issue.

Queues are still not part of the average developer’s standard set of tools. At least the Java world has a standard API and several good implementations to pick from. The scripting world is a hodge-podge, and I still haven’t found a great choice despite a good bit of looking.

I’m looking forward to a simple, commonly used queue interface / implementation that people can wrap their heads around and employ widely. Use of queues is one of the basic techniques for achieving scale, and we’re still lacking the basic tools to use it.

Photo by Sean Dreilinger.

D Lazy Evaluation Prettiness

This is kind of pretty:


void log(lazy char[] dg)
{
    if (logging)
        fwritefln(logfile, dg());
}

void foo(int i)
{
    log("Entering foo() with i set to " ~ toString(i));
}

Note the lazy keyword in the definition of the log function, which tells D to evaluate the argument only if it’s actually needed (i.e. lazily). When logging is off, the string concatenation in foo never happens.

Nice. Smells a little like Twisted’s deferred business, except different.
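
For comparison, here is a rough Python analogue. Python has no lazy keyword, so this sketch gets the same effect by passing a zero-argument callable that is only invoked when logging is on. The names describe, log, foo and the calls list are all mine, there purely to make the laziness visible.

```python
calls = []  # records every time a message is actually built


def describe(i):
    calls.append(i)
    return "Entering foo() with i set to " + str(i)


def log(make_message, enabled, sink):
    # make_message is only called (and the string only built) when enabled.
    if enabled:
        sink.append(make_message())


def foo(i, enabled, sink):
    log(lambda: describe(i), enabled, sink)


sink = []
foo(1, False, sink)  # logging off: describe() never runs
foo(2, True, sink)   # logging on: describe() runs once
print(calls, sink)
```

The D version is prettier because the call site looks like an ordinary argument. Here the lambda makes the deferral explicit.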

Via Raganwald.

OpenID for Email Verification?

I have a need to verify user email addresses, which I’ve been doing the traditional way - sending an email with a secret to the user’s address and having them reply or click on a URL.

Unfortunately this is not optimal - emails tend to not make it to the user, go into bulk/spam buckets, and are less real-time than I’d like. I’m looking for a better way.

I’m hoping OpenID will help me. I mainly care about Yahoo, Google, and Hotmail, all of which support OpenID to some extent.

I believe OpenID Simple Registration is what I’m looking for. I have a lot of homework to do to see which providers support SREG, how to use them, etc. I’ll post my progress here, and if you have knowledge / experience with this, please leave a comment below.

Using Django Signals To Watch For Changes To Instances

Say you want to monitor changes to instances of a model and update something based on the changes. In my example I wanted to maintain a sum of the values that had certain characteristics. You can accomplish this with Django Signals.

Signals are events that fire at various pre-defined moments - for example, before an instance is saved, after it’s saved, etc. You can subscribe to these events, allowing your callback handler to be called at those moments.

The code below subscribes to the post_init and post_save signals. post_init gets triggered when a model’s __init__ method is done executing, which generally means when a model instance is created for the first time or instantiated from a query to the DB. This is actually too frequent for the use case I have in mind (checking the before-modification and after-modification values of certain fields), but seems to be the only place I can hook in to get the pre-modification values.

post_save gets triggered after the instance is saved to the DB. The code below stores the pre-modification values in the pre_save dict when it gets triggered by the post_init signal, and checks them against the post-modification values when it gets triggered by the post_save signal.

Note that you’ll probably want to clean up pre_save periodically. Unfortunately post_init and post_save are not symmetrical (you’ll get a post_init anytime an instance is created, for example when you query the DB), so you can’t simply delete from pre_save when the post_save signal gets triggered.


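from django.db.models import signals

from myapp.models import Expense  # hypothetical path; import your own model

# Maps instance id -> (field1, field2) as they were when the instance was loaded.
pre_save = {}

def change_watcher(sender, instance, signal, *args, **kwargs):
    print("SIGNAL: %s %s %s %s %s" % (sender, instance.report, signal, args, kwargs))
    if signal == signals.post_init:
        pre_save[instance.id] = (instance.field1, instance.field2)
    else:
        # A brand-new instance had no id at post_init, so it may be missing here.
        old = pre_save.get(instance.id)
        if old is None:
            return
        if old[0] != instance.field1:
            print("Changed field1")
        if old[1] != instance.field2:
            print("Changed field2")

# Django 1.0 and later connect receivers on the signal object itself;
# the older dispatcher.connect(...) call was removed.
for signal in (signals.post_init, signals.post_save):
    signal.connect(change_watcher, sender=Expense)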
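
One possible way around the cleanup problem, sketched here in plain Python rather than Django: keep the snapshot on the instance itself instead of in a module-level dict, so it disappears whenever the instance does. This is an assumption on my part, not something I have running in the project. The Expense class below is a stand-in, not a real model, and remember / changed_fields are hypothetical names that would play the roles of the post_init and post_save handlers.

```python
WATCHED = ("field1", "field2")


def remember(instance):
    """Snapshot the watched fields onto the instance (post_init's job)."""
    instance._loaded_values = {name: getattr(instance, name) for name in WATCHED}


def changed_fields(instance):
    """Return names of watched fields that differ from the snapshot (post_save's job)."""
    changed = [name for name in WATCHED
               if getattr(instance, name) != instance._loaded_values[name]]
    remember(instance)  # what was just saved becomes the new baseline
    return changed


class Expense:
    """Plain-Python stand-in for a model instance."""

    def __init__(self, field1, field2):
        self.field1 = field1
        self.field2 = field2
        remember(self)


e = Expense(10, "rent")
e.field1 = 25
print(changed_fields(e))
```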

Django+MySQL: How To Fix Unicode (aka Mysterious Question Marks)

If you’re running into the problem where unicode items in your Django / MySQL project are displayed as question marks, here’s the likely problem and solution, found in this django-users thread:

The likely problem is that your MySQL encoding is set to latin1, as opposed to utf8. You can check this via:

 mysqld --verbose --help | grep character-set

You’ll probably see:

character-set-server              latin1

You want this to be utf8. To modify it, edit your my.cnf file (/etc/mysql/my.cnf on Ubuntu), adding the following lines to the appropriate sections:


[client]
…
default-character-set = utf8

[mysqld]
…
character-set-server=utf8
collation-server=utf8_unicode_ci
init_connect='set collation_connection = utf8_unicode_ci;'

Now restart mysql:


sudo /etc/init.d/mysql restart

And alter your existing tables to use the utf8 encoding:


mysql your_db_name

alter table your_table_name convert to character set utf8;

And that should do it.
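
If you’re wondering where the question marks come from in the first place, this small Python sketch shows the general mechanism. It is only an analogy for what happens when text is forced through a latin1 column or connection, not MySQL’s actual code path: characters that don’t exist in latin1 get replaced with "?" and can’t be recovered afterwards.

```python
text = "café 日本語"

# latin1 can represent "é" but not the Japanese characters,
# so each unrepresentable character is replaced with "?".
as_latin1 = text.encode("latin-1", errors="replace")
round_trip_latin1 = as_latin1.decode("latin-1")

# utf8 can represent everything, so the round trip is lossless.
as_utf8 = text.encode("utf-8")
round_trip_utf8 = as_utf8.decode("utf-8")

print(round_trip_latin1)
print(round_trip_utf8)
```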

Static Typing and Breath Mints

Laughed out loud at this one:

Static typing is like giving a drunk a bunch of breath mints and saying “Don’t drive drunk. But if you must, use these breath mints in case you get pulled over.”

Via Simon.

The Hamburger Theory of Threads and Processes

You’re busy making hamburgers and suddenly you get lots of customers. You want to scale your service to take care of more customers more quickly.

Threads: all of your workers share a single set of tools, utensils, and the same workspace. One puts mustard on the spreader, turns to grab the bun, and finds that someone else has put ketchup on there while he was turning. So you come up with complex rules about who must ask permission, under what circumstances, for grabbing what tools. Sometimes worker A is waiting for worker B to put down the knife, while B waits for A to put down the cheese, and they end up waiting forever.

Processes: each worker gets his own tools, utensils, and work space. Any sharing is explicit: worker A must intentionally pass the utensil to worker B.

More tools are used with processes, so in some sense it’s less efficient. But the rules are much simpler.

If the tools and utensils are very large and valuable, perhaps threads will work. Picture lots of bees each working on a piece of the beehive.

When the tools are small their duplication is less wasteful. The simplicity of the rules makes it easy for you to get your system going, add new items to the menu, and spend a lot less time worrying about your workers waiting for somebody else to put down the cheese.
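
Here is a small Python sketch of the thread side of the kitchen, with names of my own choosing: all workers share one counter (the shared tool), and the lock is the “ask permission first” rule. In the process version each worker would keep its own count and hand the result back explicitly, for example through a multiprocessing queue.

```python
import threading

orders_served = 0                 # the shared tool every worker touches
counter_lock = threading.Lock()   # the rule: ask permission before touching it


def worker(orders):
    global orders_served
    for _ in range(orders):
        with counter_lock:        # only one worker at the counter at a time
            orders_served += 1


def serve(workers=4, orders_each=1000):
    """Run the workers as threads and return the total orders served."""
    global orders_served
    orders_served = 0
    threads = [threading.Thread(target=worker, args=(orders_each,))
               for _ in range(workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return orders_served


print(serve())
```

Take the lock away and the total is no longer guaranteed, which is the mustard-and-ketchup problem in miniature.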

Photo by JustABigGeek.

Handy Javascript Scoping Trick

Neat Javascript function scoping trick from Dustin Diaz, master of cool Javascript tricks. His Javascript Video Tutorials were the first time I was exposed to proper Javascript (Dustin, do some more!).

Here’s the trick:


var o = 'hello world';

(function() {
  alert(this);
}).call(o);

And why is this neat? Because it lets you avoid the “that = this” funkiness (if you’ve had to do it you know what I’m talking about). Read Dustin’s post for actual info.

Coding Efficiency Improvement Tip: “Next” File

I don’t recall where I read this, but I’ve been using it to great effect: as you are about to stop coding, note what your next task should be in a file called NEXT.txt. When you come back to resume coding, look at NEXT.txt to get context for where you are and what you need to do.

I’ve found this simple practice reduces the penalty of starting and stopping very significantly.

Photo by Thomas Hawk
