Archive for April, 2009

Django-Piston: REST Framework for Django

7

Django-Piston is a promising-looking REST framework for Django. On first inspection it seems to have all the right attributes and setup. I hope to give it a try soon.

[Update] By the way, I’ve been using django-piston in a real project and like it quite a bit. I recommend it.

One question I have – while I agree HTTP PUT and DELETE are the right verbs to use for Update and Delete, in practice they’re not well supported and can cause confusion. I’m wondering if there’s a way to change the mapping to the following:

POST /resource  -- Create
POST /resource/id -- Update
POST /resource/id?action=delete -- Delete
GET /resource/id -- Read
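
To make the proposed mapping concrete, here is a minimal, framework-independent sketch of the dispatch logic. It is not django-piston's API: the function name `resolve_operation` and the `resource` parameter are hypothetical, and it assumes the only special query parameter is `action=delete`.

```python
from urllib.parse import urlsplit, parse_qs


def resolve_operation(method, url, resource="resource"):
    """Map an HTTP method and URL to a CRUD operation under the POST-only scheme."""
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    if not segments or segments[0] != resource or len(segments) > 2:
        raise ValueError("unknown path: %s" % parts.path)

    has_id = len(segments) == 2
    # parse_qs returns lists of values; take the first "action" if present.
    action = parse_qs(parts.query).get("action", [None])[0]
    method = method.upper()

    if method == "GET" and has_id:
        return "read"
    if method == "POST" and not has_id:
        return "create"
    if method == "POST" and has_id:
        if action is None:
            return "update"
        if action == "delete":
            return "delete"
    raise ValueError("unsupported request: %s %s" % (method, url))


print(resolve_operation("POST", "/resource"))
print(resolve_operation("POST", "/resource/42"))
print(resolve_operation("POST", "/resource/42?action=delete"))
print(resolve_operation("GET", "/resource/42"))
```

In a real project this logic would sit in whatever layer routes requests to handlers, so the handlers themselves stay unaware of the verb substitution.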

Value of The Printed Newspaper: Less Than Zero

0

We subscribed to the local paper, mainly to support a cause that was raising money by selling subscriptions.

Thursday through Sunday we get the paper physically delivered to our house. And I get to experience the value first-hand.

The value of the printed paper is less than zero. I’d prefer they stop delivering it immediately.

I end up with newspaper all over the house, I feel bad about the wasted paper, and I have to make room in my already overflowing recycling bin.

This is not a comment on the contents of the paper; it could be great stuff for all I know (I’m not really reading the local papers so much anymore). It is a comment on printed newspaper as a medium.

A few years ago I used to argue that paper is so much more convenient than electronic media that I’ll never switch. In those few years, unbeknownst to me, I did switch. Paper is now less convenient for me than carrying my little netbook around.

Preserve JavascriptDB: Yet Another Non-Traditional Data Store

0

Non-traditional data stores are coming fast and furious these days. Here’s another interesting one: Preserve with JavascriptDB. This one I’d like to check out.

On Magic Powers

1

As with most profound experiences in my life, this took place on a late Southwest flight after a long, exhausting day. The young man seated a few seats away began to scream, threaten, move violently, and curse wildly as the plane took off.

The initial reaction of people close by was fear, disgust, and horror, followed by understanding, compassion, and finally by a genteel unspoken agreement to pretend he didn’t exist.

The young man suffered from Tourette’s syndrome.

I realized at that moment that taking offense is a choice.

We chose not to be offended by this young man due to his condition.

Knowing this, I’ve gained a magic power. I can choose not to take offense. I can choose not to be bothered by things that really should bother me.

And once in a while I actually employ this power – instead of fuming and screaming bloody murder, sometimes I choose simply not to be offended and move on.

Not often enough though.

Proxies For Request Modification?

0

Interesting post from igvita on Ruby Proxies for Scale and Monitoring discussing the use of Ruby and EventMachine to create simple proxies for monitoring, benchmarking, content examination, and even request modification.

I’ve always wanted to do benchmarking as Ilya suggests. Real production traffic is the best way to test. Good stuff.

I’m tempted by the beanstalkd use case as well – he uses his proxy to detect and route certain requests to an archiving MySQL instance instead of to his beanstalkd instance. I’m leery of maintainability issues however – I’ve generally found that indirection, particularly at the wire protocol level, can quickly lead to hard-to-find bugs.

Something to experiment with at some point.

Notes On Distributed Key Stores

0

Leonard Lin posts his notes on distributed key stores. His requirements are fairly similar to mine so I read with interest. Short version: he likes Tokyo Cabinet / Tokyo Tyrant with his own consistent hashing scheme thrown on top. 

Btw, it’s interesting how much interest there suddenly is in distributed key-value stores – everyone I know is using or evaluating one. How did we live without them for so long? Gasp.
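
For reference, the general idea behind consistent hashing can be sketched in a few lines. This is a generic illustration, not Leonard's scheme (his notes are the place to look for that); the class name `HashRing`, the `replicas` parameter, and the node names are all made up for the example.

```python
import bisect
import hashlib


class HashRing:
    """A minimal consistent hash ring with virtual nodes."""

    def __init__(self, nodes, replicas=100):
        if not nodes:
            raise ValueError("at least one node is required")
        # Each node is placed on the ring at `replicas` pseudo-random points.
        self._ring = sorted(
            (self._hash("%s#%d" % (node, i)), node)
            for node in nodes
            for i in range(replicas)
        )
        self._hashes = [h for h, _ in self._ring]

    @staticmethod
    def _hash(key):
        return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)

    def node_for(self, key):
        # Walk clockwise to the first point at or after the key's hash,
        # wrapping around to the start of the ring if necessary.
        index = bisect.bisect(self._hashes, self._hash(key)) % len(self._ring)
        return self._ring[index][1]


ring = HashRing(["tt1", "tt2", "tt3"])
print(ring.node_for("user:1234"))
```

The appeal over a plain `hash(key) % N` is that adding a server only moves the keys that land on the new server's points; everything else stays where it was.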

Python Simple Inheritance Example

8

Another one of those things I tend to forget the syntax of, noting for easy future lookup:

 


class Base:
    def __init__(self, param):
        print("Base:", param)
    def method(self, param):
        print("Base.method:", param)

class Derived(Base):
    def __init__(self, param):
        # Run the base class initializer first.
        super().__init__(param)
        print("Derived:", param)
    def method(self, param):
        # Extend, rather than replace, the base implementation.
        super().method(param)
        print("Derived.method:", param)

>>> d = Derived("me")
Base: me
Derived: me
>>> d.method("you")
Base.method: you
Derived.method: you

Picasa To Flickr/Facebook Upload

6

Uploading pictures from Picasa to Flickr (or Facebook, or pretty much anything other than Picasa Web Albums) is still too difficult. There’s a nice project called picasa2flickr but it has some issues – it takes focus and blocks while uploading the images, which can take quite a while. I used to drag and drop from Picasa into Flickr Uploadr but that doesn’t seem to work anymore with Picasa 3. Thanks Googs.

The solution, it seems to me, should be something like this: select a number of pictures, right click on them, and have an “Upload to Flickr”, “Upload to Facebook”, etc., button. Clicking the button should simply send the list of files to a background uploader and return immediately. The background uploader can take its merry time uploading the pictures.
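
The "hand off and return immediately" part could look roughly like the sketch below. It only covers the queueing; the actual upload is a placeholder function passed in by the caller, and `BackgroundUploader`, `submit`, and `wait` are hypothetical names, not part of any Picasa, Flickr, or Facebook API.

```python
import queue
import threading


class BackgroundUploader:
    """Accept lists of files and upload them one by one on a worker thread."""

    def __init__(self, upload_func):
        self._upload = upload_func          # placeholder for the real upload call
        self._jobs = queue.Queue()
        worker = threading.Thread(target=self._run, daemon=True)
        worker.start()

    def submit(self, paths):
        # Returns right away; the worker picks the files up in order.
        for path in paths:
            self._jobs.put(path)

    def wait(self):
        # Block until everything queued so far has been processed.
        self._jobs.join()

    def _run(self):
        while True:
            path = self._jobs.get()
            try:
                self._upload(path)
            except Exception as exc:
                print("upload failed for %s: %s" % (path, exc))
            finally:
                self._jobs.task_done()


uploader = BackgroundUploader(lambda path: print("uploading", path))
uploader.submit(["beach.jpg", "birthday.jpg"])
uploader.wait()
```

The button handler would call `submit` and be done, so the photo application never blocks or loses focus while the pictures trickle out.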

I’m really tempted to go write this if only to stop being my wife’s “I need to upload photos to Facebook” tech support guy. Looks like there’s a Picasa Button API (via picasa2flickr) so this should generally be doable. Now to find the time.

Distributed Database Talk

0

Very informative PyCon talk on various fancy distributed data stores, including BigTable, Dynamo, Cassandra, and several others.

 

Is There Any Legitimate Reason To Follow > 1000 People On Twitter?

1

You can only pay attention to so many people. Unless I’m unaware of some magical tool or technology, 1000 people is way past the point where you can pay attention.

When I see people who are following 3000 or more people follow me, I have to wonder, do they have any legitimate reason? Or is their sole motivation to get me to follow them?

On Twitter Replacing RSS/Blogging

2

Simon Willison asks whether people are using Twitter as a replacement for RSS aggregators, and finds anecdotally that a sizable number are.

I can buy that. I still prefer RSS for a number of reasons, not the least of which is that the velocity of Twitter is too high – I don’t have time to keep up with a lot of the conversations, and by the time I get around to reading (if I do), it’s often too late to participate. It’s also annoying to see partial, nonsensical conversations between people you’re otherwise interested in. A lot of the old timers are using Twitter as the new IRC, and I’m not quite used to that yet.

Anyway, as a still-interested-in-RSS fellow I’m concerned not about people using Twitter as an RSS aggregator replacement, but about authors tweeting instead of blogging. I enjoy reading longer, more thoughtful pieces, and I feel like people are decreasing their blogging output in favor of increased tweeting. Looking at my RSS feeds it feels like a good number of the non-commercial authors are publishing less than they used to. I suppose if I wasn’t lazy I could actually calculate this instead of guessing, but, alas, I’m lazy.

So, Simon, perhaps you could ask the complement of the authors as well: are they using Twitter in place of writing blog posts?

Tokyo Cabinet Observations

12

I’m using Tokyo Cabinet with Python tc for a decent sized amount of data (~19G in a single hash table) on OS X. A few observations and oddities:

  • Writes slow down significantly as the database size grows. I’m writing 97 roughly equal sized batches to the tch table. The first batch takes ~40 seconds, and processing time seems to increase fairly linearly, with the last taking ~14 minutes. Not sure why this would be the case, but it’s discouraging. I’ll probably write a simple partitioning scheme to split the data into multiple databases and keep the size of each small, but it seems like this should be handled out of the box for me.
  • [Update] I implemented a simple partitioning scheme, and sure enough it makes a big difference. Apparently keeping the file size small (where small is < 500G) is important. Surprising – why doesn’t TC implement partitioning if it’s susceptible to performance issues with larger file sizes? Is this a python tc issue or a Tokyo Cabinet issue?
  • [Also] Seems I can only open 53-54 tc.HDB()’s before I get an ‘mmap error’, limiting how much I can partition.
  • Reading records that have already been read from the tch seems to go much faster on the second access (like an order of magnitude faster). I suspect this is the disk cache at work, but if anyone has extra info on this please enlighten me.

Another somewhat surprising aspect: using the tc library you’re essentially embedding Tokyo Cabinet in your app; I had assumed it was going to be network based access, but it’s not. You can do network access through Tokyo Tyrant, either using the memcached protocol or using pytyrant.
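
A simple partitioning scheme of the kind mentioned above can be sketched as follows. This is an illustration of the idea, not the code I actually ran: plain dicts stand in for the per-partition databases, and `PartitionedStore` and `open_partition` are hypothetical names. With the real library, `open_partition` would be the place to open one tc.HDB() per partition.

```python
import zlib


class PartitionedStore:
    """Spread keys over a fixed number of smaller stores by hashing the key."""

    def __init__(self, num_partitions, open_partition=None):
        if num_partitions < 1:
            raise ValueError("need at least one partition")
        # Assumption: each partition behaves like a dict. The default opener
        # just creates an in-memory dict so the sketch runs on its own.
        opener = open_partition or (lambda index: {})
        self._partitions = [opener(i) for i in range(num_partitions)]

    def _partition_for(self, key):
        # crc32 is stable across runs, unlike Python's built-in hash() for str.
        index = zlib.crc32(key.encode("utf-8")) % len(self._partitions)
        return self._partitions[index]

    def put(self, key, value):
        self._partition_for(key)[key] = value

    def get(self, key):
        return self._partition_for(key)[key]

    def sizes(self):
        return [len(p) for p in self._partitions]


store = PartitionedStore(num_partitions=16)
for i in range(1000):
    store.put("key%d" % i, "value%d" % i)
print(store.get("key42"))
print(store.sizes())
```

Given the mmap error noted above, the number of partitions has to stay below the 53-54 handles that could be opened at once.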

If you have enough traffic, the cost of servers outweighs the cost of programmers

1

Quote from Bill Venners (via):

If you have enough traffic, at some point the cost of servers outweighs the cost of programmers

Absolutely true, which is why places like Yahoo and Google are among the last bastions of very skilled C/C++ programmers.

Of course I should mention: you are not at that point. You really aren’t. So for now ignore this quote.

Twitter Scala/Ruby Drama

9

The Twitter folks decide to use Scala, and one of their prominent developers decides to write a book about it. Interesting, motivates me to take a look at Scala.

Particularly interesting is their happiness with the type system in Scala. I’ve found happiness with duck typing, and these guys are moving away from duck typing to something else, so another viewpoint for me to check out. Good.

But – blasphemy – they’re using Scala to replace Ruby. The Ruby community is incensed. Did these guys do their homework? Did they research every possible queuing system in existence before writing their own? Did they not try JRuby? Surely there’s a way to make it work with Ruby. These guys must be incompetent, lazy, or just plain stupid.

I’m not going to link to all the drama, but here is one of the most reasonable, well-written criticisms.

Now this is a reasonable criticism, and the comments do provide a good bit of insight and justification. Heck, even one of the authors of RabbitMQ justifies why the Twitter guys decided not to use RabbitMQ.

Fine. But this thing with the Ruby community getting bent out of shape whenever someone decides to use another language is getting old. From all appearances the Twitter folks did much more evaluation and study than 95% of the rest of the world would have. They decided to use something else. They’re writing a book about it.

So move on. Somebody found something they like better than Ruby. Shocking.

Not everybody is going to like your system. I thought DHH had already expressed how he feels about what he judges to be extraneous requirements. I think DHH meant he doesn’t care. Looks like the rest of the Ruby crowd cares deeply, religiously, fervently.

Less Painful Document To PDF Scanning

1

Needing to scan quite a few pages to PDF, without the scanning software that came with my printer and with a dead fax machine, I came up with the following fairly painless method. Noting here for future use:

  • Scan each page of the document with the command-line tool CmdTwain:
"\Program Files\CmdTwain\cmdtwain.exe" page1.jpg
  • Convert the scanned JPGs to a single PDF using ImageMagick:
"\Program Files\ImageMagick-6.4.0-Q16\convert.exe" -adjoin page*.jpg someDocument.pdf
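
For longer documents the two steps can be scripted. The sketch below reuses the two commands exactly as shown above; everything else is an assumption added for illustration: the function names, the zero-padded file names (so that page 10 does not sort before page 2), and the prompt between pages.

```python
import subprocess

# Install paths as used in the commands above; adjust for your machine.
CMDTWAIN = r"\Program Files\CmdTwain\cmdtwain.exe"
CONVERT = r"\Program Files\ImageMagick-6.4.0-Q16\convert.exe"


def page_name(number):
    # Zero-pad so the pages keep their order when sorted by name.
    return "page%03d.jpg" % number


def build_commands(num_pages, output_pdf):
    """Return the list of scan commands and the final combine command."""
    pages = [page_name(n) for n in range(1, num_pages + 1)]
    scans = [[CMDTWAIN, page] for page in pages]
    combine = [CONVERT, "-adjoin"] + pages + [output_pdf]
    return scans, combine


def scan_document(num_pages, output_pdf):
    scans, combine = build_commands(num_pages, output_pdf)
    for command in scans:
        input("Put the page for %s on the scanner and press Enter..." % command[-1])
        subprocess.run(command, check=True)
    subprocess.run(combine, check=True)


scans, combine = build_commands(3, "someDocument.pdf")
print(scans)
print(combine)
```

Calling `scan_document(3, "someDocument.pdf")` would then walk through the pages one at a time and produce the PDF at the end.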