Friday, July 27, 2007

The largest, longest-lasting hack ever?

Hack, in the sense of "workaround", created by humans, elegant or not, and not restricted to computing.

My current contender is upper and lower case letters in English. It's like mixing two fonts in the same document. If scribes and early printers wanted ways to break up text, making different shapes for the same letters is an odd way to do it. Maybe space constraints were their real problem?

And what about double quotes? Someone came up with the idea of using a small mark before and after a word or phrase to set it apart. Then someone else decided to just use the same marks twice. I'm surprised we haven't seen triple and quad quotes appear over the centuries.

What's your vote for the longest-living and largest hack in the history of humanity?

Wednesday, July 25, 2007

Useful "FishEye for Jira" plugin

Jira is easy to integrate with CVS and Subversion, but there's a FishEye for Jira plugin that can give you even better integration. If you are using FishEye (think ViewVC on steroids, plus change logs) as a front-end to your CVS or Subversion repositories, you should definitely investigate this plugin.

The screenshot below shows how the plugin replaces the Commit tab in Jira with a new FishEye tab.

Why is this a better approach?

The default mechanism used by Jira to connect issues to commits is by looking for issue identifiers in commit messages. For example, a commit with the message "Fixing TST-1234 again" is probably referring to the issue TST-1234. So far, so good.
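As a rough illustration of that matching step, here is a minimal sketch in Python. The regular expression and the function name `find_issue_keys` are my own assumptions for illustration, not Jira's actual implementation, and the key format a real tracker accepts may differ.

```python
import re

# Assumed key shape: an uppercase project key, a hyphen, then digits (e.g. TST-1234).
# This is a guess at a typical format, not the pattern Jira itself uses.
ISSUE_KEY = re.compile(r"\b[A-Z][A-Z0-9]+-\d+\b")


def find_issue_keys(commit_message):
    """Return the distinct issue keys mentioned in a commit message, in order."""
    seen = []
    for key in ISSUE_KEY.findall(commit_message):
        if key not in seen:
            seen.append(key)
    return seen


print(find_issue_keys("Fixing TST-1234 again"))
```

A pattern this loose will also match strings such as "UTF-8", which is one reason the match is only "probably" a reference to an issue.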

The problem is that getting this information from CVS and tracking ongoing changes involves running "cvs log" and parsing the output. I recently came across one such file that was 90MB and took 10 minutes of churn to produce. Since the commit is only connected to the issue when this file is reparsed, there is often a sizable delay between making a commit and seeing it appear for the issue in Jira. Subversion integration with Jira is less of a load on the Subversion server, but still suffers from the same delay.

The FishEye plugin is much smarter than that. It uses the FishEye API to remotely query your FishEye instance each time that you refresh the FishEye tab of an issue inside Jira. FishEye has already indexed all the commits for its own purposes, so it can provide up-to-date results far faster than the out-of-the-box approach. Another benefit is that Jira no longer needs to load down the CVS server, or even contact the Subversion server.

Nitty Gritty

Installation is just like any other plugin, with some configuration in a .properties file. Remember to enable the API access in FishEye, and use the same names for repositories in the configuration file that you used when defining them in FishEye.

Wednesday, July 18, 2007

Priority Inflation

In most companies and projects, limited resources mean that as the ship date for a release approaches, only bugs with Priority 1 and 2 get fixed; the others are closed or deferred. Over time this practice leads to priority inflation. Someone entering a bug knows that this bug won't stop the product, but she remembers that none of her Priority 3 bugs got fixed last time and she really wants this one fixed, so she makes it a Priority 2. In the extreme, by a process of induction, all bugs become Priority 1 bugs and the purpose of the field is lost.

There's not much you can do about this except be aware of it happening and remind people what the priority fields are actually for.

The Lack of Difference between Priority, Severity, and Urgency

Chapter 7 of Practical Development Environments discusses tracking bugs. This article expands on ideas about bug priorities from there. The short version is "life is simpler if you rename all your priority fields to show who cares about one."

The strip is from Hans Bjordahl's very funny Bug Bash site.

Priority, Severity, Urgency, ... huh?

Most bugs have a field to indicate how serious the bug is. A common series of values goes something like this:
  • 1 - The bug stops the product, and no workaround is possible
  • 2 - The bug stops the product, but a workaround is possible
  • 3 - The bug breaks a minor part of the product
  • 4 - The bug is cosmetic or an irritation
To make this field useful, you have to make it really easy for users to find out what each value is supposed to mean, right when they are entering the bug. Some people expect higher values to mean that the bug is more important. Some bug trackers provide tiny little icons to confuse you still further because you can't remember what each icon means.

Where's that Thesaurus?

But there are more serious problems with this field in practice. The first problem is that English has a lot of words for some rather similar ideas. Quick, does "priority" mean the same as "importance"? What's the difference between "urgency" and "severity"? If English is not a user's first language, this makes using your bug tracker much harder work. Yes, the words do have specific meanings but if you have to stop and think about those meanings, then they're not good words to use. My suggestion is to pick one word and use it exclusively. I'm going to use "Priority" for the rest of this discussion, but you can choose your own favorite.

Priority for Who?

"But that's not right!" you mutter, "priority doesn't mean the same as urgency." Well, actually, I think it does, depending on whose priority we're referring to. For instance, the priority of a particular bug for the engineering team is a totally different thing from the priority of the very same bug for a Sales Engineer on site with a customer breathing down their neck. The different words usually mean priority for different groups of people. If they really do mean different things, then make it more obvious. For instance, use "Due Date" instead of "Urgency" to make the sense of time explicit.

My key suggestion is to have multiple priorities for a bug but name them per team. For example, the fields could be Development Priority, QA Priority, Support Priority, and so on. You can even have a "CFO Priority" for bugs that are stopping a contract being signed at the end of a quarter. That way, everyone gets to record which bugs really matter to them, and then they can work out which ones get worked on first. And as a side benefit, no-one gets offended when their favorite bug's priority is reduced to "Minor".

Of course, this approach means that the leaders of the teams involved in producing the product have to talk to each other regularly about what they actually want the teams to work on. Call me crazy, but that seems like a good idea to me.
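To make the per-team fields suggested above concrete, here is a minimal sketch in Python. The `Bug` class, the `ranked_for` helper, and the team names are all hypothetical, and I'm assuming that 1 is the most important value; no particular bug tracker works exactly this way.

```python
from dataclasses import dataclass, field


@dataclass
class Bug:
    summary: str
    # Maps a team name to that team's priority for this bug (assumed: 1 = most important).
    priorities: dict = field(default_factory=dict)

    def priority_for(self, team):
        """Return the team's priority, or None if that team has not set one."""
        return self.priorities.get(team)


def ranked_for(bugs, team):
    """Return the bugs this team has prioritized, most important first."""
    rated = [bug for bug in bugs if team in bug.priorities]
    return sorted(rated, key=lambda bug: bug.priorities[team])


bugs = [
    Bug("Export truncates long names", {"Support": 2, "Development": 3}),
    Bug("Login fails on site", {"Support": 1}),
]
for bug in ranked_for(bugs, "Support"):
    print(bug.priority_for("Support"), bug.summary)
```

Each team gets its own ordering of the same bugs, and no team's value overwrites another's. Deciding which list wins is still the conversation between the team leaders.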

Priorities that are set by Customers

One last thought: some companies allow customers to enter a priority when filing a bug. This usually becomes a "how irritated are you right now?" field. Which is useful data, but perhaps not what you originally expected to record in that field. However, when you change the value of the customer's Priority field, it's always a problem. If you increase the priority, the customer worries whether the problem is perhaps part of a bigger issue. If you decrease the priority, you appear to be minimizing his distress. If you provide this kind of field for customers, I suggest you allow them to change the values themselves.

Thursday, July 12, 2007

Unit Testing and the New Testament

Taken out of context and not the intended meaning, but last Sunday it struck me that this reads like a quote from some article about unit testing:

"Each one should test his own actions. Then he can take pride in himself, without comparing himself to somebody else, for each one should carry his own load."
Galatians 6:4,5 (NIV)

Wednesday, July 11, 2007

Review: Beautiful Code


for all developers beyond their first project.
9/10 for beginners who may not see as much of the beauty yet.

Edited by longtime O'Reilly editor Andy Oram and pragmatic academic Greg Wilson, Beautiful Code (O'Reilly, June 2007, 618 pages, US $44.99) is a collection of 33 chapters written by people who have all written lots of ugly code, and then written some very beautiful code. Each chapter is a chance for them to describe the ideas and code that they feel stand out in a humdrum world of temporary workarounds that last forever.

The list of authors is impressive, including "K&R" Kernighan, "Pearls" Bentley, and many of the names behind some substantial modern software projects, but the first thing that I noticed about this book was lots of code snippets, in languages that range from Fortran to Python. And the code snippets are indeed beautifully formatted (with a good lay-flat binding). But in the end the beautiful code is included to make the ideas behind it all the more concrete, in which it surely succeeds.

The range of topics is the next obvious aspect, from writing a regex matcher, to what makes an API useful. I've included the whole list at the end of this review and you'll likely see something that catches your own interest.

Another theme of the book is that almost no-one gets it right first time, but just look at what can be done when you keep trying. As we hear in preschool, "the process is as important as the product", and watching any expert at work is always educational. This book describes experts' mistakes, their wrong paths, "aha!" moments and then their final successes with admirable honesty.

So how much of a developer do you have to be to appreciate this book? My guess is that you should have written a good-sized project, and then admitted to yourself that your creation was ugly. Only then can you see beauty by comparison. This book is destined to become a classic.


Disclaimer: Greg Wilson reviewed my book Practical Development Environments in 2005, and O'Reilly sent me a copy of Beautiful Code for this review.

The book includes:

Chapter 1, A Regular Expression Matcher, by Brian Kernighan, shows how deep insight into a language and a problem can lead to a concise and elegant solution.

Chapter 2, Subversion's Delta Editor: Interface as Ontology, by Karl Fogel, starts with a well-chosen abstraction and demonstrates its unifying effects on the system's further development.

Chapter 3, The Most Beautiful Code I Never Wrote, by Jon Bentley, suggests how to measure a procedure without actually executing it.

Chapter 4, Finding Things, by Tim Bray, draws together many strands in Computer Science in an exploration of a problem that is fundamental to many computing tasks.

Chapter 5, Correct, Beautiful, Fast (In That Order): Lessons From Designing XML Verifiers, by Elliotte Rusty Harold, reconciles the often conflicting goals of thoroughness and good performance.

Chapter 6, Framework for Integrated Test: Beauty through Fragility, by Michael Feathers, presents an example that breaks the rules and achieves its own elegant solution.

Chapter 7, Beautiful Tests, by Alberto Savoia, shows how a broad, creative approach to testing can not only eliminate bugs but turn you into a better programmer.

Chapter 8, On-the-Fly Code Generation for Image Processing, by Charles Petzold, drops down a level to improve performance while maintaining portability.

Chapter 9, Top-Down Operator Precedence, by Douglas Crockford, revives an almost forgotten parsing technique and shows its new relevance to the popular JavaScript language.

Chapter 10, The Quest for an Accelerated Population Count, by Henry S. Warren, Jr., reveals the impact that some clever algorithms can have on even a seemingly simple problem.

Chapter 11, Secure Communication: The Technology of Freedom, by Ashish Gulhati, discusses the directed evolution of a secure messaging application that was designed to make sophisticated but often confusing cryptographic technology intuitively accessible to users.

Chapter 12, Growing Beautiful Code in BioPerl, by Lincoln Stein, shows how the combination of a flexible language and a custom-designed module can make it easy for people with modest programming skills to create powerful visualizations for their data.

Chapter 13, The Design of the Gene Sorter, by Jim Kent, combines simple building blocks to produce a robust and valuable tool for gene researchers.

Chapter 14, How Elegant Code Evolves With Hardware: The Case Of Gaussian Elimination, by Jack Dongarra and Piotr Luszczek, surveys the history of LINPACK and related major software packages, to show how assumptions must constantly be re-evaluated in the face of new computing architectures.

Chapter 15, The Long-Term Benefits of Beautiful Design, by Adam Kolawa, explains how attention to good design principles many decades ago helped CERN's widely used mathematical library (the predecessor of LINPACK) stand the test of time.

Chapter 16, The Linux Kernel Driver Model: The Benefits of Working Together, by Greg Kroah-Hartman, explains how many efforts by different collaborators to solve different problems led to the successful evolution of a complex, multithreaded system.

Chapter 17, Another Level of Indirection, by Diomidis Spinellis, shows how the flexibility and maintainability of the FreeBSD kernel is promoted by abstracting operations done in common by many drivers and filesystem modules.

Chapter 18, Python's Dictionary Implementation: Being All Things to All People, by Andrew Kuchling, explains how a careful design combined with accommodations for a few special cases allows a language feature to support many different uses.

Chapter 19, Multi-Dimensional Iterators in NumPy, by Travis E. Oliphant, takes you through the design steps that succeed in hiding complexity under a simple interface.

Chapter 20, A Highly Reliable Enterprise System for NASA's Mars Rover Mission, by Ronald Mak, uses industry standards, best practices, and Java technologies to meet the requirements of a NASA expedition where reliability cannot be in doubt.

Chapter 21, ERP5: Designing for Maximum Adaptability, by Rogerio Atem de Carvalho and Rafael Monnerat, shows how a powerful ERP system can be developed with free software tools and a flexible architecture.

Chapter 22, A Spoonful of Sewage, by Bryan Cantrill, lets the reader accompany the author through a hair-raising bug scare and a clever solution that violated expectations.

Chapter 23, Distributed Programming with MapReduce, by Jeff Dean and Sanjay Ghemawat, describes a system that provides an easy-to-use programming abstraction for large-scale distributed data processing at Google that automatically handles many difficult aspects of distributed computation, including automatic parallelization, load balancing, and failure handling.

Chapter 24, Beautiful Concurrency, by Simon Peyton Jones, removes much of the difficulty of parallel programs through Software Transactional Memory, demonstrated here using Haskell.

Chapter 25, Syntactic Abstraction: The syntax-case Expander, by Kent Dybvig, shows how macros, a key feature of many languages and systems, can be protected in Scheme from producing erroneous output.

Chapter 26, Labor-Saving Architecture: An Object-Oriented Framework for Networked Software, by William Otte and Douglas C. Schmidt, applies a range of standard object-oriented design techniques, such as patterns and frameworks, to distributed logging to keep the system flexible and modular.

Chapter 27, Integrating Business Partners the RESTful Way, by Andrew Patzer, demonstrates a designer's respect for his programmers by matching the design of a B2B web service to its requirements.

Chapter 28, Beautiful Debugging, by Andreas Zeller, shows how a disciplined approach to validating code can reduce the time it takes to track down errors.

Chapter 29, Treating Code as an Essay, by Yukihiro Matsumoto, lays out some challenging principles that drove his design of the Ruby programming language, and that, by extension, will help produce better software in general.

Chapter 30, When a Button Is All That Connects You to the World, by Arun Mehta, takes you on a tour through the astounding interface design choices involved in a text editing system that allow people with severe motor disabilities, like Professor Stephen Hawking, to communicate via a computer.

Chapter 31, Emacspeak: The Complete Audio Desktop, by TV Raman, shows how Lisp's advice facility can be used with Emacs to address a general need (generating rich spoken output) that cuts across all aspects of the Emacs environment, without modifying the underlying source code of a large software system.

Chapter 32, Code in Motion, by Laura Wingerd and Christopher Seiwald, lists some simple rules that have unexpectedly strong impacts on programming accuracy.

Chapter 33, Writing Programs for "The Book," by Brian Hayes, explores the frustrations of solving a seemingly simple problem in computational geometry, and its surprising resolution.

Tuesday, July 10, 2007

Using computers in another way

It seems to be the final destination for many machines - the doorstop. The lucky ones may become a foot rest or supports for the next generation of monitors. I'll have to bear that in mind next time I spend money on a machine - what else can I use that box for?

Friday, July 6, 2007

Field-level permissions in Jira

Notice how the Fix Version/s field is read-only? This is something that many people have wanted for a while with Jira. I've posted a description of how I restricted who can edit individual fields on the wiki for Atlassian Jira. It doesn't handle the case of completely hiding a field from a group of people, but It Worked For Me.


Tuesday, July 3, 2007

Writing an Effective Bug Report

This is a short excerpt from Practical Development Environments and contains a number of useful links to other guidelines.


The three key points to bear in mind when creating a bug report should be:

  • How to reproduce the bug, as precisely as possible, and how often this will make the bug appear
  • What should have happened, at least in your opinion
  • What actually happened, or at least as much information as you have recorded
Many applications can generate a textual description of how they have been installed and configured. If such a description is available, you should always add it to the bug. The application may also contain some tests to check that it is still configured correctly. If so, you should run the tests and attach their results too.
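One way to keep those key points in front of whoever files the bug is to treat them as required fields. The sketch below is only an illustration in Python; the `BugReport` class and its field names are hypothetical and not taken from any particular bug tracker.

```python
from dataclasses import dataclass


@dataclass
class BugReport:
    steps_to_reproduce: list   # how to reproduce the bug, one step per entry
    frequency: str             # how often the steps make the bug appear
    expected: str              # what should have happened
    actual: str                # what actually happened
    configuration: str = ""    # the application's own install/config description, if any

    def missing_fields(self):
        """Name the key points that have been left empty."""
        missing = []
        if not self.steps_to_reproduce:
            missing.append("steps to reproduce")
        if not self.expected.strip():
            missing.append("expected result")
        if not self.actual.strip():
            missing.append("actual result")
        return missing


report = BugReport(
    steps_to_reproduce=["Open the export dialog", "Choose CSV", "Click Save"],
    frequency="every time",
    expected="A CSV file is written",
    actual="",
)
print(report.missing_fields())
```

A form that refuses to submit while `missing_fields()` is non-empty is a blunt tool, but it catches the most common omissions before a developer has to ask for them.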

As well as providing correct and useful information in the bug, it's important to check that you behave as expected for the project. This is especially true for open source projects. Maybe you should ask questions first on a users' mailing list before escalating the issue to a developers' mailing list? Or you may mark a bug as maximum priority, because it's stopping your work, only to see it downgraded because no one else is blocked by that bug. Etiquette is important, and imperious commands to "fix this bug immediately" rarely help anything.

One common cause of frustration with bug tracking systems is related to how their information is added. Bugs are too often added with vague descriptions, missing information, premature conclusions, or a best guess at the real build label. A classic situation is when a complex program has a bug deep inside it, but the only error message that is visible is one from some unrelated area. That area often gets bugs from the deeper level assigned to it, much to the frustration of the developer responsible for it. Adding some good local documentation to wherever people add new bugs can go a long way to improving the quality of all the bugs.

Some useful documents with general advice about creating bugs include the Mozilla Project's Bug Writing Guidelines and Simon Tatham's How to Report Bugs Effectively. There are also many examples of more site-specific documents about writing bug reports; these contain product-specific information such as descriptions of what each part of the product does. Two of these are Opera's Guidelines for Filing Good Bug Reports and FreeBSD's Writing FreeBSD Problem Reports.

Slightly off-topic, but also useful, is Eric S. Raymond and Rick Moen's classic article How to Ask Questions the Smart Way. This document has some excellent reminders and strong opinions about how to interact well with groups of technical volunteers.