Archive for October, 2007

Too Many Inboxes

0

It occurs to me that I have 6 personal email addresses, 2 work-related addresses, a couple of mobile devices each capable of at least SMS, 2 blogs with comments enabled, Facebook, Twitter, and even a MySpace account, giving me quite a few places to look for incoming messages. I can’t decide if this is a good or bad thing.

JavaScript Style Python…

0

In a very odd development, today I find myself missing various JavaScript idioms when writing Python code. This is strange because I profess to like Python and dislike JavaScript. Hmm. I guess every language has its positives.

JavaScript is a decent language actually, once you get past the initial ugliness.

Back Home

0

We’re back home, safe and sound, no damage in our immediate vicinity. The air quality is much better and life is heading towards normal.

Thanks to everyone who called or sent email, and sorry if I didn’t get back to you right away.

On a side note, the blog was down due to a temporary power outage here. I was surprised it didn’t automatically restart after the power came back on. Turns out there was still an Ubuntu 5 CD in the drive from when I first installed it! That’s pretty impressive – a couple of years of operation with no downtime.

Saima’s Wedding Reception

2

Saima’s wedding reception was this weekend in beautiful Malibu, California. Lots of pictures at the usual place. Here’s a small sampling.

First Dance

Jian

Kamran

Theresa and Roohi

Eid Mubarak from the Darugars

0

A very happy Eid Mubarak to everyone!


Competing With FaceBook

0

It’s hard to compete with FaceBook – they have so much of my context already. They know my “social graph”, as they say, and it’s a great pain to replicate that graph elsewhere.

Rumor is Google plans to compete with FaceBook on openness. That’s great. Open is wonderful and airy. And Fuzzy.

But it actually makes sense. When I think of what irks me about FaceBook, it’s where FaceBook is not open. For example, they don’t allow me to take my list of friends with me off FaceBook (even scraping is against their TOS). They don’t send me actual updates; only emails instructing me to check FaceBook for updates. They don’t have very good portability – you have to use the website, as opposed to, say, Twitter, which has a plethora of interfaces, from SMS to IM to various desktop apps.

Google already knows a lot of my social network – it’s in my Gmail account. It just acquired Jaiku, which is probably the most capable multi-device, multi-interface company out there. It hired Brad Fitzpatrick, who’s determined to open the social graph.

So if Google ends up offering an open social network with support for third party apps where my “social graph” data is belong to me and can be reached from any and all devices and most of my friends are already on there and my existing blog is seamlessly integrated in, then I’d be interested.

And there’s really no reason Google couldn’t do that. The only hitch I see is FaceBook’s willingness to allow monetization of apps to be completely owned by the app owner. It’s not in Google’s nature to be that open with monetization; they like to take their secret cut and give you what they deem appropriate.

There’s no reason Yahoo couldn’t do it either. Yahoo actually has more of my social graph – between Mail and Messenger, that’s everybody I know. Then there’s Flickr. So many properties could be put to use…

So let’s see what happens Nov 5th.

I HATE the MacBook Pro

7

I deeply despise this thing. A horrible, horrible machine.

Nothing works. The wireless drops connection every 5 minutes. I’ll be shocked if I stay connected long enough to post this.

It’s hot. Very hot.

It’s unusable with 1 GB of memory. It takes more than 30 seconds to switch windows. Hit Alt-tab, wait 30 seconds, finally you get a response.

It actually lags behind my keypresses. That’s right, I can type faster than this thing can deal with, and I’m not a very good typist.

I was convinced I had a faulty machine, but apparently this is normal behavior. “Oh, just don’t use Firefox”. “Oh, yeah it does that sometimes”.

I have to make a choice between using Eclipse or listening to music, because god forbid I have both open, the machine will do its very best impression of a rock.

I don’t know what people see in it, but this machine has been nothing but garbage for me. Yes, excellent, it’s a Unix underneath, but Linux 7 years ago was more responsive than this thing.

Bah. I’m just frustrated right now. If I have to reconnect to wifi and VPN one more time I’m going to accidentally run over it several times with a truck and get a regular PC….

Map/Reduce via Hadoop and Python

0

Michael Noll has a nice tutorial on how to write Hadoop map/reduce functions using CPython (not Jython).

I like this approach; access to full Python including extensions, with the ability to do highly distributed processing via Hadoop, without having to touch Java… works for me.
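A minimal sketch of what such a job can look like, assuming the approach is Hadoop Streaming (plain scripts that read records from stdin and write tab-separated key/value lines to stdout). The word-count task and the names mapper, reducer, and run_locally are illustrative assumptions, not code from the tutorial:

```python
from itertools import groupby


def mapper(lines):
    """Emit one 'word<TAB>1' record per word, as a streaming mapper would."""
    for line in lines:
        for word in line.split():
            yield "%s\t1" % word


def reducer(sorted_records):
    """Sum the counts per word; records must arrive sorted by key."""
    pairs = (record.rstrip("\n").split("\t", 1) for record in sorted_records)
    for word, group in groupby(pairs, key=lambda pair: pair[0]):
        yield "%s\t%d" % (word, sum(int(count) for _, count in group))


def run_locally(lines):
    """Simulate map -> sort -> reduce in one process, for testing without Hadoop."""
    return list(reducer(sorted(mapper(lines))))


for record in run_locally(["the quick fox", "the lazy dog the end"]):
    print(record)
```

In a real job the two functions would live in separate scripts that read sys.stdin and print each record, handed to Hadoop Streaming through its -mapper and -reducer options. The sort that run_locally does by hand is what Hadoop performs between the two phases.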

Pentax K100D In Hand, New Pictures

3

The Pentax K100D arrived on Friday, and today I tried it out. Shockingly, buying a DSLR doesn’t immediately make you into a professional photographer. But it does help.

Here are a few pics of the family:

Kamran
Rayyan
Roohi
Jian

Excite Giving Up the Ghost?

1

I use my Excite account as my “spam” email account – I use it to sign up for new services and receive “deal” emails from various retailers. I’ve noticed in the last month or two the rate of losing emails seems to have climbed significantly. A week ago I tested sending 3 emails to my Excite account and only 1 showed up. It’s gotten so bad it’s not even possible to use it for validation emails.

Is Excite finally giving up the ghost and going away? Am I the last user?

Fortunately I recently opened a Hotmail account which has been working well, so I’ll be switching spam duties to Hotmail…

Pathalog: Find User Paths by Analyzing Your HTTP Log Files

0

I needed a better way to visualize how people are using my site, beyond what the typical log analyzers report on. So I cooked up a quick hack to grab user paths from the HTTP log files and display them in a no-frills report. It’s already been useful – I found 3 distinct parts of the site that were obviously confusing in retrospect and improved them.

This is not a general purpose log analyzer – it doesn’t report on number of page views, bandwidth, etc. Instead it can be used to see what pages your users click on and in what order. It’s useful for sites that have a natural flow (are “applications”). For example, you can see what leads your users to sign up and what leads them to confusion.
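To illustrate the idea (this is not the actual pathalog.py code), here is a minimal sketch that groups requests by client address in the order they appear. It assumes log lines in Common Log Format and treats each address as one user; the names LOG_LINE and user_paths and the sample lines are made up for the example:

```python
import re
from collections import OrderedDict

# Matches the start of a Common Log Format line: host, timestamp, request.
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[([^\]]+)\] "(?:GET|POST) (\S+)')


def user_paths(lines):
    """Group requested URLs by client host, in the order they appear in the log."""
    paths = OrderedDict()
    for line in lines:
        match = LOG_LINE.match(line)
        if match:  # silently skip lines that are not in the expected format
            host, _timestamp, url = match.groups()
            paths.setdefault(host, []).append(url)
    return paths


sample = [
    '10.0.0.1 - - [10/Oct/2007:13:55:36 -0700] "GET / HTTP/1.1" 200 2326',
    '10.0.0.2 - - [10/Oct/2007:13:55:40 -0700] "GET /about HTTP/1.1" 200 512',
    '10.0.0.1 - - [10/Oct/2007:13:56:01 -0700] "GET /signup HTTP/1.1" 200 1024',
]
for host, urls in user_paths(sample).items():
    print(host, " -> ".join(urls))
```

A fuller version would probably split one address's requests into separate visits after a period of inactivity and filter out images and stylesheets; this sketch does neither.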

I’m making the code available in its current form in case people find it useful; if I waited till I packaged it up properly and made it friendly it’d join my long list of other never released projects. It requires Python, probably version 2.5, and can be invoked as:

python pathalog.py /path/to/your/access.log > paths.log

You can see a sample report here and grab the code from here under the MIT license.

The configuration is at the top of the pathalog.py file. You have the option of doing reverse DNS to get hostnames from the IP addresses in the log files, but reverse DNS can be quite time consuming, so you can turn it off. Note that the reverse DNS results are cached so subsequent runs are much faster.
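A cache like this can be sketched in a few lines. The following is an assumption about how one might do it, not the contents of reversedns.py: the JSON file format and the names load_cache, save_cache, and hostname_for are hypothetical, while socket.gethostbyaddr is the standard library call for reverse lookups.

```python
import json
import os
import socket


def load_cache(path):
    """Return the cached {ip: hostname} mapping, or an empty dict on first run."""
    if os.path.exists(path):
        with open(path) as handle:
            return json.load(handle)
    return {}


def save_cache(cache, path):
    """Write the mapping back to disk so the next run can reuse it."""
    with open(path, "w") as handle:
        json.dump(cache, handle)


def hostname_for(ip, cache, resolve=socket.gethostbyaddr):
    """Look up ip in the cache first; fall back to a (slow) reverse DNS query."""
    if ip not in cache:
        try:
            cache[ip] = resolve(ip)[0]  # first element is the primary hostname
        except (socket.herror, socket.gaierror):
            cache[ip] = ip  # lookup failed: keep the bare address
    return cache[ip]
```

The resolve parameter exists only so the lookup can be swapped out in tests. Failed lookups are cached too, so an address with no hostname is not retried on every run.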

I’ve tried it on Windows, Linux, and OS X. On Windows you’ll need to create a /tmp directory, or modify reversedns.py to use an alternative directory.

Membership Test Using Dict vs Set in Python

2

I was thinking of using Sets instead of dicts for a basic membership test (is item x in the set or not / does item x exist in the dict), but decided to do a quick benchmark first to see which is faster. I was a bit surprised that dict appears to be about 3 times faster than sets.Set. Perhaps this is not surprising; maybe sets do more for you. Updated: Simon points out that Python 2.5’s built-in set type is as fast as dict.

Here’s the code and results:

# Python 2 benchmark: membership tests on sets.Set, dict, and the built-in set.
from sets import Set  # the older, pure-Python set class
import random
import time

myset  = Set()
mydict = {}
mybuiltinset = set()

# Fill all three containers with the same random values.
for x in xrange(1000):
    r = random.randint(0, 10000)
    myset.add(r)
    mydict[r] = True
    mybuiltinset.add(r)

# 10 million membership tests against sets.Set.
set_start_time = time.clock()
counter = 0
for l in xrange(1000):
    for x in xrange(10000):
        if x in myset: counter += 1
set_stop_time = time.clock()
print "set: counter =", counter

# The same tests against the dict.
dict_start_time = time.clock()
counter = 0
for l in xrange(1000):
    for x in xrange(10000):
        if x in mydict: counter += 1
dict_stop_time = time.clock()
print "dict: counter =", counter

# The same tests against the built-in set.
builtin_start_time = time.clock()
counter = 0
for l in xrange(1000):
    for x in xrange(10000):
        if x in mybuiltinset: counter += 1
builtin_stop_time = time.clock()
print "builtinset: counter =", counter

print "time for set:", set_stop_time - set_start_time
print "time for dict:", dict_stop_time - dict_start_time
print "time for builtinset:", builtin_stop_time - builtin_start_time

Results:

set: counter = 938000
dict: counter = 938000
builtinset: counter = 938000
time for set: 12.89
time for dict: 3.67
time for builtinset: 3.69

This is Python 2.5 (r25:51918, Sep 19 2006, 08:49:13) on a MacBook Pro dual core something or another with 1 GB of RAM.
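For anyone repeating the comparison on Python 3, where the sets module no longer exists and time.clock has been removed, here is a rough equivalent using timeit. It is a sketch under the same setup (1000 random values, probes from 0 to 9999) but with fewer repetitions, so the absolute numbers are not comparable to the results above; count_hits is a helper name made up for the example.

```python
import random
import timeit

# Same data in both containers, as in the benchmark above.
values = [random.randint(0, 10000) for _ in range(1000)]
as_set = set(values)
as_dict = dict.fromkeys(values, True)
probes = range(10000)


def count_hits(container):
    """Count how many probe values are members of the container."""
    return sum(1 for x in probes if x in container)


# 100 passes over the probes, i.e. one million membership tests per container.
set_time = timeit.timeit(lambda: count_hits(as_set), number=100)
dict_time = timeit.timeit(lambda: count_hits(as_dict), number=100)
print("time for set:  %.3f" % set_time)
print("time for dict: %.3f" % dict_time)
```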

3 Practical Steps To Taking Better Pictures

0

Here are three practical steps to help beginners improve their people pictures:

1 – Buy a decent camera.
Megapixels don’t matter. Trust me. If you don’t trust me, trust Ken Rockwell. These days you can’t find a camera with less than 5 megapixels, and there’s no way you’re going to need more than that. Resist the urge to let the higher megapixel number sway your decision on what you buy.

Digital zoom is not a good thing. Avoid it. If you want zoom, buy a camera with an optical zoom lens.

Research your camera for 45 seconds before you buy it. DPReview is a great place to start.

Take a look at one of the “deal” sites to get a better price; Fatwallet is a good place to start, or you can take a look at my Yahoo Pipes Camera Fatwatcher, which looks for camera deals on Fatwallet. You may just be able to afford that camera you really want to buy.

2 – Step back from your subject and zoom in.
If your camera is close to the person you’re taking a picture of, you over-emphasize their nose and give the face an unfortunate shape. If you are farther away you achieve a more even, normal looking picture. Philip Greenspun has a nice explanation of why.

Give it a quick experiment; take a picture with your camera very close to your subject’s face, and then take several steps back, zoom in so their head takes the same amount of space in the viewfinder, and snap a second picture. Compare the two and you’ll see how much better the latter is.
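The effect is easy to put numbers on. As a back-of-the-envelope sketch, assume the size a feature is rendered at is inversely proportional to its distance from the lens, and that the tip of the nose sits about 10 cm closer to the camera than the ears (both are simplifying assumptions, and nose_exaggeration is just a name for the example):

```python
def nose_exaggeration(camera_to_ears_m, nose_depth_m=0.10):
    """Ratio of how large the nose is rendered relative to the ears.

    Rendered size is taken to be inversely proportional to distance from
    the camera, so a feature nose_depth_m closer to the lens is drawn
    larger by this factor.
    """
    return camera_to_ears_m / (camera_to_ears_m - nose_depth_m)


for distance in (0.3, 1.0, 3.0):
    extra = (nose_exaggeration(distance) - 1) * 100
    print("%.1f m away: nose drawn %.0f%% larger than the ears" % (distance, extra))
```

Under these assumptions the nose comes out about 50% oversized at 30 cm, about 11% at 1 m, and only about 3% at 3 m, which is why stepping back and zooming in gives a more natural face.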

3 – Pay attention to lighting and avoid the flash (where possible)
Lighting is very important to how well your picture turns out. Too much and too little light make it difficult to take good pictures. A good scenario is indirect natural light – for example, a sun-lit room without direct sunlight in the picture.

Flash generally makes for a terrible picture. If you have to use the flash, try step 2 above – back away from your subject and zoom in so the flash is not as harsh.

There is one common but counter-intuitive scenario where flash is actually helpful – direct, harsh sunlight. If your subject is under direct light, their face is likely to have strong shadows. Using the flash can fill in these spots, resulting in a better picture. This is what they call “Fill Flash”; this page has a good example of a picture with and without fill flash.

So, to take a nice picture of someone, find a place with plentiful indirect light – Greenspun suggests a lobby at a museum or university; I find out-of-direct-light in rooms with lots of windows also works well. Set up your subject, step away from them as far as reasonable, zoom in, and click away. Hopefully your camera will not need flash, but this depends largely on your camera. My old Canon S50 took great pictures without engaging the flash, but my newer Canon S500 will invariably trigger the flash in the exact same room with the same light.

Give these tips a try and let me know if you have others.