I was reminded by a recent project that many (most?) hardware failures occur at the connections between things. That's why percussive maintenance gets results. I was writing a tool to execute lots of commands remotely on multiple machines, with the different commands being synchronized from one machine. I chose to write the tool in python, and that made creating it nice and easy.
But the shell commands to be executed had to be communicated to each machine. Now I could have written all the commands and their arguments (some of which contained the dreaded spaces) into separate files, copied them over with scp and executed them remotely. Except that there was a fair amount of analyzing of results and deciding what to run next based on the results of the analysis. This is generally tedious to do with bash and can be hard to maintain.
So I decided to keep the control logic in the python tool and use ssh to send each command separately to the remote machines. Which works, but the extra headache of getting all the spaces escaped and quoted added a major amount of maintenance pain to the tool. I could have encapsulated it all more cleanly, I'm sure, but it all "just grew" in scope and size.
The final block in this tottering tower was that some of the commands had to be run using sudo, but not all of them. Getting all those sudo strings in the right place took a few hours of my life from me without feeling that I got much for it. So it goes.
I've tried using CORBA, WebServices etc over the years, but they feel pretty heavyweight for this sort of thing. So the question I'm asking myself is: how could I have done it better?