Monday, October 20, 2008

USS Pampanito is a Steampunk Submarine



I had forgotten that the USS Pampanito was built at a time closer to Queen Victoria's reign than to today. Polished brass, great big chains, hydraulic lines and knobs everywhere!
There is also a great set of 360-degree views at http://www.maritime.org/tour/tatrvr.htm

Thursday, October 16, 2008

Scientific Computing Survey

-----------------------------------------------------------------
Interesting looking survey from Greg Wilson and others. If you're a scientist, I'd encourage you to add your information.

~Matt

Computers are as important to modern scientists as test tubes,
but we know surprisingly little about how scientists develop
and use software in their research. To find out, the University
of Toronto, Simula Research Laboratory, and the National Research
Council of Canada have launched an online survey in conjunction
with "American Scientist" magazine. If you have 20 minutes to take
part, please go to:

http://softwareresearch.ca/seg/SCS/scientific-computing-survey.html

Thanks in advance for your help!

Jo Hannay (Simula Research Laboratory)
Hans Petter Langtangen (Simula Research Laboratory)
Dietmar Pfahl (Simula Research Laboratory)
Janice Singer (National Research Council of Canada)
Greg Wilson (University of Toronto)

Tuesday, October 7, 2008

Preprocessor Warning Signs

Using #if 0 makes searching hard

Some symbol is not defined, so you grep for it and find it in a file. But you've forgotten that you disabled that part of the file with #if 0. So the symbol exists in the source but not in the .o file, leading to puzzled head-scratching. A better idea is to delete the unwanted lines, since you can recover them from version control whenever you want them, and perhaps leave a comment about what they were.

Try this to see how many times you've done this:


[mdoar]$ find . -name '*.[ch]' | xargs grep '#if 0' | wc -l
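
The grep above only matches the exact spelling "#if 0". As a rough sketch of the same count that also tolerates extra whitespace (for example "#  if   0"), something like the following Python could be used. The function names are made up for illustration and are not part of any tool.

```python
import re
from pathlib import Path

# Matches "#if 0" with optional whitespace before "#", after "#",
# and between "if" and "0". The \b stops it matching "#if 01" or "#if 0x1".
IF0 = re.compile(r"^\s*#\s*if\s+0\b")

def count_if0_lines(text):
    """Return the number of lines in text that open an '#if 0' block."""
    return sum(1 for line in text.splitlines() if IF0.match(line))

def count_if0_in_tree(root):
    """Map each .c/.h file under root to its '#if 0' count, skipping zeros."""
    counts = {}
    for path in sorted(Path(root).rglob("*.[ch]")):
        if not path.is_file():
            continue
        n = count_if0_lines(path.read_text(errors="replace"))
        if n:
            counts[str(path)] = n
    return counts
```

Calling count_if0_in_tree(".") from the top of a source tree gives a per-file breakdown, which makes it easier to see where the dead code has piled up.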


#ifdefs for unit tests mean that the code tested differs from the code used in the product


If #ifdef has to be used to make a function testable, then perhaps the function needs refactoring? If possible, use the build tool to create the tests, not the preprocessor. Some sample quotes at random from the web:

"I don't ever use ifdefs for unit testing because code should never know it's being unit tested."

"Tests must be non-invasive. I don't want to have to add #ifdef UNIT_TEST declarations into my production codebase as it'll end up making a mess and worse, could actually change behaviours. The framework and test code will be in an externally compiled project"

Tuesday, September 9, 2008

AtlasCamp



So I've booked my place at AtlasCamp, Atlassian's first customer/developer love-in happening this November up in Marin County, CA. I'm not really sure what to expect except lots of product discussions, and I suspect a drink or two. Maybe we'll write some code, I don't know. I'm looking forward to it.

Anyway, if you're also going and would like to carpool from around San Jose, I'll have a couple of places in my car.

~Matt

Tuesday, June 17, 2008

Response to "What is it like to write a technical book?"




This is a fair and honest summary by Baron Schwartz of what it's really like. I wrote a book (and proof-read it better than this post) in 2005 and probably won't get a chance to do it again ("at least not with this wife!" says my dearest).

What a well-established publisher brought to the table for me was a guaranteed outlet for the book, more marketing than I would have had the energy or time to do, and a US$6K advance. They also provided high-quality images, indexing and copy-editing.

There's a nice progress chart of my work shown above and here with the total word count and words/day over time. If I were ever to do it again, I would estimate an overall rate of 500 words/day requiring at least 200 7-hour sessions, so 1400 hours. For comparison, my 120-page doctoral thesis took about 4 months of 6-day weeks with 10 to 12 hours/day, so about 1000 hours.
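
For anyone who wants to plug in their own numbers, here is the arithmetic as a small Python sketch. It assumes one session per day (so 500 words per session) and takes a month as 52/12 weeks. Both are my assumptions for the sketch, not measurements.

```python
def book_estimate(sessions=200, hours_per_session=7, words_per_session=500):
    """Return (total_hours, total_words) for the book estimate."""
    return sessions * hours_per_session, sessions * words_per_session

def thesis_hours(months=4, days_per_week=6, hours_per_day=(10, 12)):
    """Return (low, high) total hours, taking a month as 52/12 weeks."""
    days = months * 52 * days_per_week / 12
    low, high = hours_per_day
    return days * low, days * high

print(book_estimate())   # (1400, 100000)
print(thesis_hours())    # (1040.0, 1248.0)
```

Under these assumptions the thesis comes out at roughly 1040 to 1248 hours, so "about 1000 hours" sits at the low end of that range.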

Saturday, June 14, 2008

Consulting is like waiting for a bus



Every few weeks I am contacted by people wanting help with their development environments. This is a Good Thing for a consultant, but I've started to notice something very odd: the new contacts always come in groups, just like buses. I never get one new contact in a day; it's always either none, two or three.

Now this might be explained if I did more public talks, or groups of colleagues recommended me, but I haven't been able to spot a pattern yet. It's not related to the day of the week either, though I know the budget cycle in companies affects when consultants are hired and fired. Anyone got any ideas? I wondered about synchronization theory but it seems like a stretch.

Wednesday, May 28, 2008

Making Bugzilla Read-only

I finally got this working after a few frustrating attempts. So, first, what not to do.

Change database-level user permissions

I actually got this working once but it took three or four iterations since Bugzilla needs to be able to write to some tables to let you log in. Also, once you have messed with the permissions, I find restoring them to the prior state to be fiddly. This approach is really at the wrong level.


The editbugs group


There's a system-level group in Bugzilla named "editbugs" that is apparently related to allowing a user to edit bugs. However, removing a user from the group does nothing that I could see.

What Worked For Me

1. For each product, check the Closed checkbox so that the product does not appear in the list of products where bugs can be created. This effectively stops new bugs from being created, though perhaps using the right URL might still work.

2. Define a new group named something like "CanStillEdit" that can be used for bugs and add it to all products. Don't add anyone to the new group yet.

3. For each product, edit the group memberships at the end of the configuration page and set the "Can Edit" checkbox for just the CanStillEdit group. Leave all of the other groups unchanged. Now only members of the CanStillEdit group can edit bugs. It took a few passes through the documentation about this feature to see what they meant.

If all goes well then trying to save a change to a bug will produce a big red box with an error message in it. Usually the error message is helpful ("you do not have permission to do that"), but sometimes it says things like "you tried to update X from to 123" which is confusing but harmless. Search works just as before.

Other details:

1. The other settings for the groups I was working with were Shown/NA.
2. Administrators can also still edit issues.
3. All this is for Bugzilla 2.x - I would hope that there are better ways to do this in the latest versions.

Monday, May 26, 2008

Big Red Button as a Psych Test



In a recent post Raymond Chen talks about Big Red Switches and various people weigh in with recollections about pressing such things. Well, I have in my possession a Big Red Button that I found at the excellent Triangle Machinery & Tool Co in San Jose. When I had a desk, I used to have it mounted with a thick cable running from it to under my desk. It wasn't connected to anything but it told me lots about the people around me.

Most people would ask "what's that for?" or "is it connected?", but occasionally there would be someone who would press the button and then ask the same questions. Those are the people you should never let into a server room!

Wednesday, May 14, 2008

Emacs, OS X and the EDITOR variable

This took a bit of fiddling to get right, so I offer it up to the Great Google Index with the hope that it may save someone else some time. The EDITOR environment variable is what is used by Unix applications (such as Subversion and git) when they want to start an editor for input. Starting emacs from the command line in OS X uses the open command.

To get this working with emacs and OS X Leopard I had to make the file /usr/bin/emacs contain:

#!/bin/sh
open -nWa /Applications/Emacs.app/Contents/MacOS/Emacs "$@"

This opens a file using a new instance of emacs and waits until emacs exits.

Three Hard Lessons in Consulting


1. Don't start the project until everything is ready. If a machine needs more memory to run the next version of an application, wait for the memory.

2. Stop working after the agreed period of time. It's better to disappoint a client and renegotiate than to surprise them with unexpected billable hours.

3. Windows infrastructure has a price. Locked directories, having to use Notepad, Wordpad or install Emacs, debugging complex service configurations: all of these things add up to about 25% more time taken during major tasks such as upgrades.

I'm officially and practically platform-independent in the services I provide. Linux, Unix, Windows, OS X, I do them all. But from now on I'm going to start charging different rates for Windows environments. Not a lot, just enough to offset the extra time and effort that they require.

Friday, May 2, 2008

My Dash Express



Funnily enough, I'm not much of a tech fan-boy. Generally what happens is that our friends are first adopters, and then we use their cast-offs a few years later. I wrote a book on a discarded laptop and I watch films on a second-hand screen, while sitting on a pre-loved sofa. It's not just that I'm cheap, it's also a green thing.

So what on earth persuaded me to spend $400 last month on a new GPS, the Dash Express? That's just what my dearest wife wanted to know as well. Now, she has no use for such navigation devices having spent most of her life in the Bay Area. But now that I travel to different clients for work I was getting lost more often than seemed sensible, even for my relaxed approach to navigation.

The attractions of the Dash over other GPS systems for me are:

- GPS, cellphone and wifi antennae are all used. The GPS is for positioning, the cellphone for traffic information and internet connections while on the road, and wifi for updates at home
- I can right click on an address in a web page and send it to my car. No more typing on tiny screens. This is great.
- I can search while I'm on the road. For example, when I get near my destination I realize I have to park somewhere, so I can search for parking right then.

The downsides are:

- $10/month subscription fee, though the first three months are free
- Routing is slow and favors highways
- I had to screw a bracket onto my dash to secure it since CA doesn't allow anything to be mounted on the windscreen.


So far I've been happy with it, though I'm expecting great things from the first few software updates later this year. It has certainly saved me from some major blunders since I've been using it. And my wife refers to the Dash as my mistress!

Atlassian User Group, Palo Alto


Well, I enjoyed the Atlassian User Group yesterday at Stanford. I'd forgotten just how complex getting a parking spot can be there, but I made it with only a few surprised comments from my new mistress. I gave a summary of my JIRA MultiSite post, but the things that I enjoyed most were:

- seeing what's coming up in JIRA 4.0 (nice work from Nick Menere and others)

- asking Mike (CEO) both technical and business questions and seeing him handle both well

- meeting lots of smart people using JIRA, with an eye to new clients of course.

The sponsors Stanford and WANdisco provided a good spread of food, which always helps such things. I'd recommend any JIRA administrators in the Bay Area come to the next one.

Tuesday, April 22, 2008

More bugs than revisions


(Debian bug count from http://master.debian.org/~ajt)

Heard as an aside: "that project's got more bugs than revisions!"

Which made me wonder what kind of project might reasonably expect that statement to be true. The space shuttle software as a whole, perhaps? Lots of testing implies lots of bugs, and scrupulous code reviews mean fewer commits.

At the other end of things, many small projects might get a few hundred revisions and then a few hundred bugs over time. Of course, most bugs will end up closed in one way or another.

Anybody able to point to a real project with more bugs than revisions?

Wednesday, April 16, 2008

Just costs you double

Me: "just" costs you double.
Them: Huh?
Me: Every time you use "just" to describe a feature or a process it tells me you've made a gross assumption about what I'll need to do ...

That comment on a post titled All I Need Is A Programmer made me nod in agreement. "Just" is a warning sign in conversations for me, like "always", "never", and the dreaded "trust me".

All those phrases make me think of exceptions. "Always? Well, there is that edge case ...". "Just? Why 'just'? What about ..." and I begin to wonder whether the person really knows what they're talking about.

Just say 'no' to 'just'. (You knew that was coming, right?)

Monday, March 17, 2008

The Mac Mini as a Laptop and VMWare Fusion

I needed a new laptop because the one I had borrowed had disk errors in the Windows registry, as disks do after too long running Windows. Since I find laptop screens, keyboards and mice uniformly awkward, I bought a laptop without the keyboard, mouse or screen - a Mac Mini. (Fry's, US$750)

The upsides, apart from being less than half the price of a decent Mac laptop, are that it really is compact and there are some decent virtualization choices (Bootcamp, Parallels and Fusion). I went with VMWare Fusion and overall, it behaves as you might wish. Easy to install, cheap at $100 and runs most VMWare images.

The downsides of not buying a laptop are that power outages become more significant without a battery, and perhaps the disk drive isn't mounted in a suitable manner for carrying the box in a bag every day? Swapping from VM to VM, or VM to host OS, sent the swap rate soaring, so I installed an extra 1GB of RAM to help with that. While installing the extra memory, I took a good look at the hard disk and it seems to be mounted just as it would be in a laptop, so I'm hoping that the disk will survive being moved every day. If anyone else has experience with Mac Mini hard disk lifetimes, I'd be interested to hear from you.

And so today, I had another "virtualization is great" moment. The server that I had traveled an hour to work on was busy having its motherboard replaced. Since what I actually wanted to work on was the OS configuration, I booted the Windows server image with Fusion on my little Mac Mini, did the necessary work, and moved right along. Very neat.

Tuesday, February 26, 2008



The February issue of Python Magazine is out and contains my article "Using Python and SOAP to create a CLI for JIRA" about the Python CLI that I wrote for JIRA a while ago. The article's summary reads:

Many web applications include an API that lets you interact with them from the command line as well as with a browser. In this article, Matthew shows how to build a command line interface for JIRA, a well-known issue tracking system, using Python and SOAP. JIRA is a Java application, but using SOAP allows you access to many of its features using just Python.


One hint that I wish I had remembered to add is that when you have a redirect to your JIRA server, for instance when http://jira.mycompany.com is redirected by Apache or IIS to http://jira.mycompany.com:8080, you may see your login mysteriously fail. The answer is that you have to use the redirected URL with the JIRA CLI. You can find out what the redirected URL is by running the CLI with the argument -v10 to increase the logging verbosity and looking at the line that starts with "Host". This example shows that the port to use is 8080.

*** Outgoing HTTP headers **********************************************
POST /rpc/soap/jirasoapservice-v2 HTTP/1.0
Host: localhost:8080
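
If you would rather not eyeball the log, the Host line can be pulled out with a few lines of Python. This is only a sketch that assumes the verbose output looks like the excerpt above. The function names are hypothetical and are not part of the JIRA CLI.

```python
def host_from_verbose_log(log_text):
    """Return the value of the first 'Host:' header line, or None."""
    for line in log_text.splitlines():
        if line.startswith("Host:"):
            return line[len("Host:"):].strip()
    return None

def split_host_port(host, default_port=80):
    """Split 'name:port' into (name, port); use default_port if none given."""
    name, sep, port = host.rpartition(":")
    if sep and port.isdigit():
        return name, int(port)
    return host, default_port

SAMPLE_LOG = """\
*** Outgoing HTTP headers **********************************************
POST /rpc/soap/jirasoapservice-v2 HTTP/1.0
Host: localhost:8080
"""

host = host_from_verbose_log(SAMPLE_LOG)
print(host)                    # localhost:8080
print(split_host_port(host))   # ('localhost', 8080)
```

The host and port that come back are the ones to put in the URL you give to the CLI.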

Wednesday, February 20, 2008

Finally, a use for those JIRA user properties

One of the most useful JIRA plugins I've found is the JIRA Toolkit, described as "a bunch of neat custom fields Atlassian have developed for their own use". As an aside, if they're so neat and useful then why aren't they in the core product?

One of the more recently developed fields is the View User Property custom field, which is currently only documented in the issue that last link refers to. This handy little field allows you to display properties that you previously added to a user, as a read-only field in each issue.

For example, add a property named "Company" to some of your users in JIRA, then install the JIRA Toolkit and create an instance of the View User Property field. Now configure it with "reporter:Company". Add the new field to some screens and the value of the reporter's Company property will show up in the issue. This also works with "assignee:Company". You can even take the user name from another custom field with "My Custom Field:Company". This will use the user name found in the custom field named "My Custom Field". Just use the JIRA custom field name, including any spaces.

I've just used this to associate a company name with every customer user, and I'm sure there are a number of other pieces of per-user information that could be displayed with this field. One missing piece for this field is the ability to search for issues with particular values.

Thursday, February 14, 2008

Evaluating JIRA Multisite


Given the number of organizations already using JIRA across a WAN, there is plenty of interest in finding ways to improve the experience. I've had a few clients suggest using distributed databases, changing HTTP caching behaviour or not using HTTPS. None of these are really great fixes, and all are complicated by the fact that JIRA keeps much of its data in a local Lucene index outside the database for performance reasons.

So when WANdisco announced a beta of JIRA Multisite last November in partnership with Atlassian, I was interested to see what it would do. It's billed as a high availability solution and in doing that it gives you local JIRA servers with all your data nicely synchronized. There is another approach that was announced at about the same time, the JIRA clustering solution Scarlet. I haven't evaluated Scarlet yet but it appears to have a single point of failure by default.

I contacted WANdisco to ask for an evaluation copy and they were happy to help. They have an existing replication tool for CVS and Subversion that they've connected to JIRA. You need their tool and their instance of JIRA. As an aside, though they are keeping up with each release of JIRA, I'd rather have instructions about how to modify my existing instance of JIRA to work with their tool, but I'll take what I can get for now.

To provide high availability you have to have 3 or more instances of JIRA, but since I was mainly interested in how each site's performance changed, I just set up two instances of JIRA, one in San Jose, CA and the other in Bangalore, India. The connection between the two sites is a clogged T1 at best and the team in India often have sluggish response times from JIRA.

Setup Experience

WANdisco wanted to set the tool up, but I did it myself in an hour for the two nodes. Instructions were beta quality, but not bad. After that piece of stubbornness, their tech founder worked out what I had done wrong in about an hour, and then together we had it all working in another hour. Three hours from scratch is pretty good as these things go.

Testing

I modified a bug in San Jose and watched the change appear in India a second or two later. Then I modified a bug in India and saw the change locally in about the same time. Just as expected. Then we stopped one of the JIRA servers, made some changes, waited a bit, restarted the server and saw the changes all get synchronized. Other users updated issues over the next month and the changes appeared just as expected. The big win was that the users in India saw their local response improve dramatically. The underlying WANdisco replication tool was rock solid for the month's evaluation.


Restrictions


The version I tested didn't synchronize attachments, but that has been added since then. You do have to use the same OS (and database I believe) for all the instances of JIRA. This was not a problem for me, but if you have a Windows server in one location and Linux in another, it won't work.

I didn't try HTTPS, but I did set up LDAP authentication and that worked as expected.

I'm pretty sure that if I wanted to go back to one instance of JIRA I could have exported the data and then reimported it into a non-multisite instance of JIRA.

Cost

Pricing is public and is US $7500 per instance of JIRA, which is about 50% more than the current Enterprise license cost. This seems about right given the cost of the tool and the target customers. Support comes from WANdisco and JIRA, in that order.

Summary

JIRA Multisite is still in its early stages, but it is very promising. It worked well for me with little effort, and provides good value for the price.

Wednesday, November 28, 2007

Perspective of a 3-year old


A few choice observations from my youngest son in the past few months. I'm posting them as reminders that not everyone thinks as we expect them to.

During the minor earthquake in San Jose, we all went outside. Afterwards he kept running outside then back inside again. When asked why, he said he was "looking for more earthquakes." (I guess we found one outside the first time?)

We came across a dried-up deer carcass, and I was teaching him to use a stick instead of his hands to touch roadkill. This led to a discussion about death and whether the deer would come back to life. I gently explained that the deer was done with its body now. He looked thoughtful for a moment, and then drove the stick through the carcass, exclaiming "you stay dead then!"

When asked during the Christmas Pageant rehearsal what kinds of animals were in the manger with Jesus, he piped up with "pterodactyls!". To his disappointment, he is going to be a lamb instead.

Tuesday, November 13, 2007

Choosing Project Names


The discussion What's in a Project Name? over at Coding Horror reminded me of section 3.6.1 of my book Practical Development Environments:

Project names are usually chosen by engineering groups, with one name for each significantly different version of the products that they are working on. There should be no need to change a project's name once it has been chosen. Product names, on the other hand, are the names that customers see, and these names are usually chosen to help a product sell or to become popular. Product names can change at the whim of a market research poll or a new VP of Sales.

Some general guidelines for choosing names for projects are:

Keep it short

Since project names may appear in filenames or source code, shorter project names are preferable; four to six characters is common. Longer names will only be abbreviated anyway, and usually in two different ways.

Use distinctive sounds

Project names should sound different from each other when spoken aloud by people whose native language is not the one used by the rest of the group. Even if everyone speaks English, having two projects named "ctest" and "seebest" is too close for comfort.

Use low-frequency letters

It's much easier to be confident that all references to a project name can be found if the name contains characters that are less common in the local language. This is a good argument for choosing project names that use unusual characters, such as the letters q and z for English.

Apocryphal aside: a few years ago there was a project named IDS that apparently had a function named IDSConnect. Then the project was renamed DIS and all its functions were renamed accordingly, which led to their function for creating connections being renamed to DISConnect. The letters d, i, and s are too common in English to simply reuse them in such an anagram.
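
To make the point concrete, here is a small Python sketch that counts how often a candidate project name turns up in some source text, whether or not the match really refers to the project. The sample identifiers and the name "quz" are invented for illustration.

```python
import re

def count_hits(name, text):
    """Count case-insensitive occurrences of name anywhere in text."""
    return len(re.findall(re.escape(name), text, flags=re.IGNORECASE))

# Hypothetical source lines; only the *Connect calls belong to a project.
SAMPLE = """
DISConnect(server);
disconnect(socket);
display(discard_count);
QUZConnect(server);
"""

print(count_hits("dis", SAMPLE))   # 4
print(count_hits("quz", SAMPLE))   # 1
```

Only one of the four "dis" hits is a real reference to the project, while the single "quz" hit is exactly the one you were looking for. Running the same count over a real source tree before settling on a name is a cheap sanity check.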

Make it unmarketable

Sometimes a project name will be reused as a product name, but not if it is already trademarked, or if you make it odd or crude enough! Project names don't have to have a theme, though that can be fun. They don't even have to be meaningful, just memorable with an obvious way of pronouncing the word. You can choose a number of suitable names once and then let people decide which one they want to use next. Names of stars, types of sushi, rare diseases, and characters from comic books are some ideas to start with for project names.