Nov 25, 2008

WOA isn't for the masses

Nick Gall's note on Web Oriented Architecture--basically SOA constrained to RESTful principles--got me thinking about the futility of end-user programmable services.

Given the difficulty even highly trained engineers have in creating good interfaces, only a select few will be able to take a list of arbitrary services and make anything like the "slightly less general interfaces" mentioned in the note. Of course, it's still worth it, because even a tiny percentage of the web's population is significant and can build things that benefit everyone. We just shouldn't expect the nirvana where every user is a programmer and creates a wonderful mashup of services.

I suspect instances of "serendipity" and "unexpected reuse" will exist at the services level, but they will be few and far between. As far as I can tell, I'm right so far...

Oct 28, 2008

Relative Times in PHP

While working on a side project I found myself in need of a function to produce relative times and dates--e.g. "3 minutes ago" instead of "October 28, 2008 12:18 AM". You'd think this would be easy to find in the age of Google, but I was quite disappointed. It turns out that there are some simple functions out there, but none that adequately take into account the new PHP 5 DateTime object (which avoids UNIX timestamps and therefore allows years prior to 1970 and after 2038) and perform well. So here's my solution for posterity:

/* Warning: this is a quick and dirty solution. I'm sure there
are some bugs or better ways of doing it. However, feel free
to use it however you like. */

function is_leap_year($year) {
    return ($year % 4 == 0 && $year % 100 != 0) || ($year % 400 == 0);
}

/* for relative dates */
$month_lengths = array(31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31);
$units = array("year", "month", "day", "hour", "minute", "second");
$unit_comp_max = array(12, 30, 24, 60, 60);

function relative_time($datetime) {
    global $month_lengths;
    global $units;
    global $unit_comp_max;

    $a = date_parse($datetime);
    $b = date_parse(date("r"));

    /* Walk from the largest unit (year) down to minutes. */
    for ($i = 0; $i < count($units) - 1; $i++) {
        $unit = $units[$i];
        $val = $b[$unit] - $a[$unit];
        if ($val > 0) {
            /* Borrow if the next smaller unit hasn't rolled over yet
               (e.g. "11 months ago", not "1 year ago"). */
            if ($a[$units[$i + 1]] > $b[$units[$i + 1]]) {
                $val = $val - 1;
                if ($unit == "month") {
                    /* Borrowing days from months depends on the month's length. */
                    $b['day'] = $b['day'] + $month_lengths[$a['month'] - 1];
                    if ($a['month'] == 2 && is_leap_year($a['year'])) {
                        $b['day'] = $b['day'] + 1;
                    }
                } else {
                    $b[$units[$i + 1]] = $b[$units[$i + 1]] + $unit_comp_max[$i];
                }
            }
            if ($val > 0) {
                return $val . " " . $unit . (($val > 1) ? "s ago" : " ago");
            }
        }
    }

    $val = $b['second'] - $a['second'];
    if ($val > 0) {
        return $val . " " . (($val > 1) ? "seconds ago" : "second ago");
    }

    return "only moments ago";
}
Please let me know if you have a better solution, but this seems to work for me at the moment.
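
For what it's worth, here's a quick usage sketch. The timestamps and outputs are made-up illustrations, assuming "now" is just after midnight on Oct 28, 2008; any string date_parse understands should work as input:

echo relative_time("2008-10-28 00:15:00"); // "3 minutes ago"
echo relative_time("2008-10-27 23:00:00"); // "1 hour ago"
echo relative_time("1968-10-28 00:18:00"); // "40 years ago" (pre-1970 works, since date_parse avoids timestamps)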

Aug 29, 2008

RIAs' DNA runs against the Web

Good post from DeWitt Clinton on the oppositional nature of RIAs to the Web. I agree with him and other RIA skeptics that a "richer" web is seductive but ultimately counterproductive. While DeWitt's argument about the inevitable force of the Web is compelling, Tim Bray's comments on Web vs. heavyweight application usability are even more so. For the vast majority of applications, simplicity and consistency should trump unique interfaces.

Aug 25, 2008

Kongregate sucks/rocks



Ok, this is a pretty geeky image to post, but I had to capture the moment when I saw it. I've been somewhat addicted to Kongregate the last couple of months, as it suits my need to play games, is free, and allows a quick 15 minutes or so of play in the spare time I have in the evenings. Like an RPG, it totally appeals to the collector part of my brain, encouraging me to play just a little more to acquire badges and points, even through bad games.

I used to have a completionist OCD, where I had to finish any book I opened, any game I played, or any movie I began watching, regardless of how terrible it was. Maybe this was due to thriftiness in my family genes--i.e. I paid good money for it, so I had better use it to its fullest. Whatever the cause, I've been able to fight it back in recent years, and I proudly have a stack of unfinished books and games (unfortunately this trait never seemed to translate to useful projects, so that stack remains sadly untouched).

Web games are perfect in the sense that they have a low cost of entry and can be left at any time if they suck. Kongregate plays on my weakness, though. On the one hand, I can't complain, because Kongregate rocks at giving me that fix, provides a damn good service (for free!), and most of the games are pretty good. On the other hand, it preys on that weakness in the same sense that many RPGs/MMORPGs do, with the dreaded treadmill--forcing you to perform mundane tasks to level up, grinding away at mediocre games for way too long.

This isn't Kongregate's fault, but it is a chance to rant at those RPGs. I've long believed that a much better approach for a truly broad audience would be to forgo the accretion strategy and come up with an engaging concept that would allow anyone to join in for a few minutes of play without having to worry about spending a ton of time levelling up or getting killed by those who have disproportionately more time or money.

Some online games have done better in this regard--Puzzle Pirates, for example, although that's largely derivative casual games mashed together, with little social interaction in the core gameplay. There's a ton of room for improvement. I predict the first game to do this right will blow World of Warcraft's numbers out of the water. I'm waiting, people, get on it! (I know I won't--it's one of those unfinished projects on my pile...)

UPDATE: Another potentially negative aspect of grinding: gold farming. Bruce: "Possibly there is evidence here that game design need looking at."

Aug 5, 2008

Vision and Action

There's a Japanese proverb I love: vision without action is a daydream; action without vision is a nightmare. I've come to believe that too many visions induce a B-grade suspense thriller that has a lot of action but ultimately leaves you feeling empty.

Aug 1, 2008

Offline code reviews are more efficient

I've been listening to the stackoverflow podcast by Jeff Atwood and Joel Spolsky. As with Joel's blog and columns, I find him alternately insightful and infuriatingly naive.

Anyway, the recent stackoverflow podcast #15 contains a segment on code reviews in which both Jeff and Joel agree that interactive face-to-face code reviews are more effective than offline reviews, in which reviewers read the code and send comments to the author. I have extensive experience with both, and I strongly disagree.

This has actually been studied, and there is good, hard research data showing that offline code reviews are just about as good as group- or meeting-based code reviews. Meetings find 4% more defects on average, which is statistically significant, but they take anywhere between 50% and 1200% longer to get there. That indicates diminishing returns, to say the least; hence they are less efficient. A good summary of these findings can be found in the free book Best Kept Secrets of Peer Code Review, specifically the "Brand New Information" chapter (note: you can ignore the sales pitch chapters at the end if you like--no disclaimer necessary: I don't own stock in the company or use their products).

Given Joel's earlier podcast rant about people on the internet blogging things based on anecdotes without research and data, I find this kind of ironic. Yet it is understandable: meeting-based code reviews do feel a lot better emotionally than offline reviews; there is less opportunity for misunderstanding, and most people do enjoy the social interaction. As good engineers, though, we should recognize that what feels good isn't always what's best for us, and do the right thing.

To be fair, Jeff and Joel are really lauding the learning that comes from the information- and tip-swapping that occurs during discussions, which has nothing to do with defect rates or efficiency. However, further reading of the literature shows that group-based code review tends to find few new defects, and those it does find tend to be surface-level in nature.

The book theorizes--and this is borne out by my own experience and that of my trusted coworkers--that really understanding code and algorithms at a sufficient level of depth takes time and concentration that is nigh impossible to achieve in a social setting.

I don't want to completely discount the learning aspect. If you have less experienced developers, then meeting-based reviews do help train them. However, you should do this consciously, with an understanding of the productivity hit your more senior employees are taking. Of course, there's nothing stopping you from doing offline reviews and then going over the results with junior developers.

What I've seen work is having primarily offline reviews with comments sent back to the author (and tracked in a system), and then having face-to-face (or voice-to-voice) meetings to clarify where necessary. This gets the benefit of concentrated brain cycles from the reviewer while maintaining human contact and communication where needed. In addition, some percentage of code can be targeted for meeting-style reviews to retain the learning benefits Jeff and Joel care about. Along with good code review guidelines and coding convention guidelines, this process can scale to larger (50+) teams with many smaller code reviews a day. It is also very effective for geographically distributed teams.

Jul 31, 2008

Bad Engineering Idea of the Week

I'm struggling to remember to post new stuff here, or maybe I'm struggling to find something new to say. Either way, I'm going to try a different strategy: post about software engineering things from my job--lessons learned, handy tips, interesting bugs, hard problems, etc.

Today's tidbit is a bad idea for fixing a bug. Our architecture has a primary server and a secondary server for backup purposes, both of which must be kept in sync to guarantee correct backup behavior. One new feature attempts to provide better error detection and feedback, a key part of which is determining whether the backup process is running.

For a little more context, the primary server already will not allow clients to connect until it handshakes with the backup process and verifies a synchronized starting point. There is a simple socket connection and protocol to determine if the backup process is listening and do the handshake. If the backup process is not available, the primary server polls the socket occasionally and waits forever.

The objective of the new feature is to watch the startup of the primary server and send status to a separate application that monitors the status of all servers and clients in the network. How would you solve this problem?

Well, one of our engineers decided to modify the startup batch script to log in to the secondary server, get a task list, and see if the backup process was running. If it wasn't, the script would fail immediately and stop the startup process. Why is this a bad idea? Let me count the ways...

  • It introduces a new interface between the two servers that didn't exist before, which adds complexity to the model.
  • It introduces technology not used elsewhere in the product, namely logging in to other servers and using non-portable command line tools.
  • It adds its own failure mode for incorrect login/passwords on the secondary server.
  • It couples the two servers in the startup sequence, as opposed to letting both start independently.
  • It requires starting over completely to recover from the identified failure mode.
  • It relies on the primary server configuration knowing the name of the backup process on a different machine.
  • The condition it detects is different from the information you want. It detects if a process is running or not, as opposed to detecting if the backup process is accepting connections and is ready to handshake (the process might be running, but more subtly broken, which this will not detect).
I'm sure there are more, but you get the idea. So what is the better solution? How about using the existing socket connection information instead of adding a redundant channel? Provide the status to the administration tool and wait for the backup to start up. An admin can easily diagnose the situation and get the backup process running if needed. You could add an optional timeout if the backup hasn't started in 20 minutes or so, but that's not really necessary.
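
To make that concrete, here's a minimal sketch of the idea in PHP (our product isn't PHP, and the host, port, and status strings here are hypothetical--this is just to show the shape of the check). The monitor asks whether the backup process is accepting connections on the same socket the handshake already uses, and reports status instead of aborting startup:

// Hypothetical sketch: reuse the existing handshake socket to report status.
function backup_status($host = "backup.example.com", $port = 9000) {
    // Can we connect on the socket the handshake protocol already uses?
    $sock = @fsockopen($host, $port, $errno, $errstr, 5 /* timeout in seconds */);
    if ($sock === false) {
        return "WAITING"; // not accepting connections yet; keep polling
    }
    fclose($sock);
    return "READY"; // listening; the real handshake can now verify sync
}

// The monitoring application polls this and displays it; startup never
// aborts, and an admin can act on a persistent "WAITING" status.

This detects the condition we actually care about--readiness to accept connections and handshake--without new logins, new tools, or coupled startup sequences.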

There are a couple of key lessons/principles at work here. One is loose coupling, which is almost always a win, generally without introducing much complexity. The other is reuse: don't add new stuff unless you really need it.

Jul 20, 2008

Java Fork/Join

Does anyone else find it ironic that Doug Lea's Fork/Join for Java reinvents green threads? Yeah, I'm a little late reading this, but there is other, more recent commentary on Fork/Join worth reading (the last paragraph says it all). Don't get me wrong: I think Doug Lea is one of the smartest people working on concurrency problems these days, but I can't help feeling that most of the work on Java concurrency is a big bandage on top of the gaping chest wound of the shared-state model.

Jun 3, 2008

PLT Scheme Turns 13

I'm glad to see PLT Scheme continue to thrive and strike out on its own in a way that both supports the R6RS standard and continues to innovate. Relegating mutable pairs to second-class status last year was a delightful move. Now if only they would get rid of shared state concurrency I'd be a total convert.

May 31, 2008

Gartner Top 10 "Technologies" For Next 4 Years

Gartner includes Semantics in its list of the top 10 disruptive technologies for the next 4 years, as item number 10. Some commenters on the blog link say it should be higher (assuming an ordered list), while calling virtualization and multicore "boring trends".

It should be no surprise that I think semantics should be taken off the list. It's not the next-4-year big thing, if it's any big thing at all. Multicore and virtualization are riding an exponential curve, which is the only real way to be disruptive. I believe semantics is not only difficult, but also linear. Thus, our progress in the semantic space will be far outpaced by exponential trends. I do believe we'll make progress, but my prediction is that it will be in the brute force space, aided by Moore's law over time.

Multicore isn't exciting per se, but the disruption it will drive in the software space is already visible. Concurrency is already huge if you look in the right places, and its increasing ubiquity will start to sink in very soon.

May 18, 2008

Interoperability is Hard

True interoperability between independent software products is hard. There are multiple levels on which you need to guarantee compatibility (read: agreement among all parties) in order for any deep interoperability to work correctly, including: transport, schema, semantics, ownership, and identity.

A lot of so-called interoperability works fairly well by severely limiting one or more of these aspects (most often identity and ownership), which is fine. However, most of the conversations I hear tend to revolve around the transport, paying minimal attention to schema, and almost none to semantics.

I think this is because 95% of your implementation time tends to get eaten up by the transport, which fools you into thinking it's the hardest part. In fact, semantics is often the hardest part, but that work is all done in design, which can easily get ignored. The problem is that a transport defect can be found and fixed in one product, whereas a semantic defect usually forces a schema change, which affects all products (and therefore usually doesn't get fixed, leading to poor interoperability).
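
A contrived example of what I mean by a semantic defect (the systems and fields here are hypothetical): two producers can emit data that is valid against the same schema while meaning different things, and nothing at the transport or schema level will ever flag it:

// Hypothetical: a shared schema says a shipment has a numeric "weight".
$from_system_a = array("id" => 1, "weight" => 10); // A means kilograms
$from_system_b = array("id" => 2, "weight" => 10); // B means pounds

// Both payloads parse and validate, so this "works" on every level
// except the one that matters:
$total_weight = $from_system_a["weight"] + $from_system_b["weight"]; // garbage

The only real fix is a schema change (say, an explicit units field) that every producer and consumer must adopt at once--which is exactly why such defects tend to go unfixed.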

Therefore, good interoperability requires that the meaning of your data be understood and agreed upon by all parties before you settle on the schema. This turns out to be quite hard to do--it's pretty difficult even if you're in control of all the moving pieces.

I'd like to think there's a way to decouple parts of interoperability to be able to iterate on standards after they are entrenched, but I haven't found it yet.

Apr 17, 2008

Adding meta-data to search

Fred Wilson covered an interesting search innovation by one of his portfolio companies, Indeed, in which they interpolate salaries for job postings that don't include a salary.

Fred categorizes this clever trick as adding "intelligence" to search, but it's really semantic extraction combined with search. This is a perfect example of how aspects of the semantic web will emerge. It's shallow, but very useful, and there's no requirement for exhaustive human meta-data entry or conformity to a standard.

Apr 11, 2008

Attractive but flawed

The Economist chimes in on the Semantic Web as well. I quote: "It sounds a mess and it is" ... reviews of Twine "have been mixed". But of course if people are investing millions it must be a good idea, right?

Listen, the idea of a semantic web is alluring, but it's just not going to happen. We can't agree on semantics in real life except in small groups or in very shallow ways. Computers just aren't going to be any better at it until we create something smarter than ourselves. These technologies are all parlor tricks when compared to the grand vision espoused by semantic web evangelists.

I'm not saying some of the automated semantic extraction technologies are useless. Some of them are very cool, and it is absolutely the way we should be heading (waiting for humans to tag everything is a waste of time). However, we need to recognize that while this path is taking us somewhere good, it will ultimately fall short of the vision--much in the same way that the AI field has given us some great improvements without approaching true artificial general intelligence.

Mar 26, 2008

Semantic Web Pattern

Shorter Alex Iskold:

Introduction: The Semantic Web means lots of things to lots of people, but it's important and real!
  1. Top-down approaches to semantic extraction (e.g. Google) are very successful and hard to compete with, but bottom-up approaches are possible now!
  2. RDF, Microformats, and meta-headers are used in narrow or limited applications, but we have choices!
  3. No one has a compelling killer app for the Semantic Web, but enterprises will buy anything that sounds good!
  4. APIs are available, so someone will build something cool any day now!
  5. Semantic search progress is practically non-existent, but a lot of people are trying!
  6. Since semantic search is a bust, more focused guess...I mean Semantic Extraction looks promising!
  7. Semantic databases are not production-ready yet and don't scale, but people are really working hard on the infrastructure to be ready when they take off!
Conclusion: The Semantic Web was promised to be just around the corner a decade ago, but we're just in the early stages and it holds such promise and is just around the corner!

Yes, I see a pattern...

Mar 12, 2008

Python, Lua, Ruby, or Scheme for Game Prototype?

I had an interesting idea for a casual game and was planning on prototyping it, but now I'm stuck on choosing a programming language to do so. I've had luck with Python and Pygame before, but for some reason can't get it to work on Vista.

Both Lua and Ruby look promising, with active game/SDL libraries, but I keep thinking that they won't meet the future needs of the game if I take it past the prototype stage. Lua's lack of an object-oriented model bugs me, but I think I could get around that. Performance is also a concern, though. Ruby is definitely getting better, but I also need to understand how far I can push Ruby's support for dynamic class members, and the online docs aren't so hot. Also, for either one I'll probably need to learn how to write extensions in C to implement some of my ideas.

I'd really like to pick Scheme, but the game support is poor for existing implementations, as is the final distribution model for those that aren't compile-to-C. I have a half-built Scheme implementation from a while ago that would be ideal, but then that's yet another distraction from really getting to the prototype.

I have a feeling I'll start with Ruby and move to something else as necessary. I initially avoided Ruby, but now that I'm taking a closer look it seems pretty good for this type of work.

Mar 4, 2008

Virtual Earth is not the goal

Rudy Rucker has an interesting and convincing post about the improbability of creating a mirror virtual Earth to replace the "real" one. Unfortunately, I think it conflates several issues, and he's only right about one of them--fortunately it's his main point!

To get the main argument out of the way, he's absolutely correct that there's very little cost-benefit in simulating an exact duplicate of the Earth, and that such a simulation is probably doomed to failure because of cost, complexity, and theoretical physical limits.

However, that doesn't mean that the Earth won't end up as computronium anyway. I can easily see a chain of events whereby we create a virtual world (low fidelity and incomplete), load ourselves up into it, create new types of reality, decide that the VR world is "better" along some cost-benefit axes, and continue to convert all available matter. This means we could end up never creating a good simulation of Earth, yet not keep the old one around either.

I think Rudy is placing some value on the existing world such that, since we can't perfectly recreate it, we'll keep the old one around because it's so great. While I sympathize and love the Earth as it is, I think the whole point of the Singularity is that we can't know the minds of post-singular beings and what value they will place on anything. It's very plausible that high-speed simulated worlds are more valuable--maybe even to the point of being a necessity for Darwinian survival--such that the slow-paced "real reality" is at best forgotten.

I don't think Stross speculates much on what post-singular beings do in computronium, but he does suggest that life there centers on completely new and complex forms of social interaction and economics. One need look no further than today's Web to see how little simulation is required to allow that to happen.

I'm not placing value judgements on these potential paths, but it's a safe bet that "matching nature" is not the value function that the post-singular beings will attempt to maximize.

Marc likes Obama

I respect Marc Andreessen, so I listen when he says he likes Obama. The last part of the post makes some good points:

First, Obama's organizational skills with the campaign should be a decent leading indicator of his ability to organize people in government. He's doing well so far, and this type of people management is what the job of President is all about.

Second, he's experienced different cultures around the world, which gives him an excellent perspective on foreign matters. While this doesn't make him an instant expert on foreign policy, even my limited traveling in the U.S. and abroad tells me that this is an incredibly important factor that few presidential candidates have.

Mar 1, 2008

Feeling good

So, obviously I'm out of surgery now. I'm at home, feeling pretty good. The pain is minimal and I'm just resting. I can't allow the neck area to get wet for the next two weeks, so that will make bathing a challenge, but all things considered this is going pretty well. Now I just have to wait for the pathology report.

Rapidly approaching singularity

I've recently been following umair's Bubblegeneration/Collectivegeneration blog and his dour predictions for the economy (and our whole social-political-economic structure). It's pretty clear that he's right about the direction things are headed and the fundamentally corrupt DNA in current governmental and corporate practices.

However, I'm still trying to reconcile this with the explosive progress being made in new technology and social structures, led by an increasingly connected world online. It's hard not to feel that we're spawning new political, social, and economic structures with more resilient DNA at an increasing pace.

Now, whether these forces balance out or one wins is too difficult to say at this time. One could claim that that's the inherent unpredictability of the Singularity. I, for one, am an optimist, and I believe that we'll settle into a new equilibrium that will be better overall. Unfortunately, I also think we'll have to struggle through a period of volatility to get there, with a lot of disruption and some pretty big losers as well as winners.

Feb 26, 2008

Fat Heads is full of Fat Heads

I've always liked Fat Heads on the South Side of Pittsburgh, but waiting over an hour for a table without anyone even apologizing is uncool. Our group of seven left for Piper's Pub across the street and we were seated immediately (and ate some great food). I'm not going to give any love to Fat Heads for a long time.

Feb 25, 2008

What of RIAs?

Rich Internet Applications (RIAs) are the new old rage, reminded as I am of it by Adobe's launch of AIR. I'm with Tim Bray on this, though, in that I think RIAs are not that great.

The web comes with a lot of constraints, and that is a Good Thing: it forces designers to get to the heart of a problem, which leads to simpler designs. These simple pieces also compose into larger applications (mashups and web service integration). I don't know if this path ends in nirvana, but it's certainly better than where we were before the Web.

So I agree with Tim that most of the richness claims are red herrings. However--and he doesn't seem to point this out--there is a compelling capability in being able to work offline using the same technology stack as the Web. Now, I don't believe Flash is the way to go here. Rather, some offline type of DHTML/AJAX that resyncs when connected would be pretty cool. However, this is a hard problem to solve--something I am currently struggling with at work (for a thick client app no less!)--so I am also skeptical of claims that RIAs or AIR will help much in that space.

Feb 24, 2008

Atonement

A decent movie. The acting was good, the dialogue well conceived, the cinematography excellent. I found the scoring inventive but also somewhat intrusive.

I felt the use of WWII to be a rookie mistake, in that the realities of it dwarfed the supposed main plotline and clashed with the fanciful storytelling aspects. However, that scene on the beach was incredible. Although the use of the war was awkward, Atonement far outshines The English Patient, but I don't favor it for Best Picture tonight.

If it's not one thing it's another

So my surgery is scheduled for this coming Friday. They're going to remove a mass on my left parotid gland. It will be a conservative procedure since the pathology report from the biopsy was inconclusive (my doc said those things were useless in these cases). It'll take a week to get the results back, but I should be back at work in a couple of days, albeit with a nice long set of stitches. This is far enough ahead of our Florida trip that it shouldn't be a problem.