Monday, February 4, 2013

Bash Caveat - It's all just text


This is an important thing to consider when writing Bash scripts. In my experience it’s not necessarily the little command tricks you know that make you a better coder; it’s the underlying understanding of how things work.

You’re dealing with Text
Mentally keeping track of the contents of variables, or what’s being passed through a pipe, is actually rather simple in Bash. Everything is a string. There are no fancy object-oriented concepts to consider when dealing with data. It’s all just text. Take the following for example:

cat file | cut -f1 | sort -u | wc -l

While the above falls under the category of “useless use of cat”, it’s done to illustrate a point. You are taking the text output of one command and passing it as the text input of another. THAT’S IT. The “target” program that receives the data has its own rules for how to deal with the text. In the above case, cat opens the file and outputs its contents as the input for the cut command, which reads the text and (due to -f1) outputs the first tab-delimited field of each line. That output is passed directly to the sort command, which alphabetically sorts the lines and eliminates the duplicates (-u). sort then outputs this text, and the pipe (again) sends it to wc, which counts the lines (-l) and outputs the result.
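
To make the hand-off between commands concrete, here is a minimal sketch that runs the same pipeline one stage at a time. The sample data and the variable names (tmpfile, count) are made up for illustration; any tab-delimited file would do.

```shell
#!/usr/bin/env bash
# Hypothetical sample data: two tab-separated columns.
tmpfile=$(mktemp)
printf 'apple\t1\nbanana\t2\napple\t3\ncherry\t4\n' > "$tmpfile"

# Stage 1: the first tab-delimited field of every line.
cut -f1 "$tmpfile"

# Stage 2: the same text, sorted, with duplicates removed.
cut -f1 "$tmpfile" | sort -u

# Stage 3: the number of lines left over (3 for this sample data).
count=$(cut -f1 "$tmpfile" | sort -u | wc -l)
echo "unique first fields: $((count))"

rm -f "$tmpfile"
```

Each stage only ever sees lines of text on its input; none of them knows or cares which program produced it.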

The only thing programs like this are designed to do is mangle/modify/analyze text in some way.

The nice thing about only dealing with text is that you can see its state/contents at any point, simply by outputting it to the screen.
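
As a rough sketch of what that looks like in practice, the snippet below prints a variable and an intermediate pipe stage before moving on. The data and the names fruits, sorted and total are assumptions made for the example.

```shell
#!/usr/bin/env bash
# Hypothetical data: a variable holding three lines of text.
fruits=$(printf 'pear\napple\npear')

# Inspect a variable: printf shows exactly what it holds.
printf 'fruits is:\n%s\n' "$fruits"

# Inspect a pipe: capture an intermediate stage and print it
# before handing it to the next command.
sorted=$(printf '%s\n' "$fruits" | sort -u)
printf 'after sort -u:\n%s\n' "$sorted"

# Carry on with the rest of the pipeline using the captured text.
total=$(printf '%s\n' "$sorted" | wc -l)
echo "lines: $((total))"
```

When a script misbehaves, dropping a printf like this between two stages is often the quickest way to find out which stage produced the unexpected text.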

I believe that remembering you are only dealing with strings of text is one of the most important considerations when writing Bash scripts.

The other good thing about the "everything is a string" philosophy is that you can tell which programs were built for scripting and which were mainly built for human consumption. The main question to ask is: how much text parsing do I have to do to get some simple data out? If the answer is "a lot", then you may want to look for another tool/program that is more API-focused.
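
One way to picture the difference, as a sketch rather than a rule: getting a file's size in bytes from ls -l means picking a column out of output laid out for people, while wc -c reading from standard input prints only the number. The file and variable names here are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical sample file containing 6 bytes ("hello" plus a newline).
sample=$(mktemp)
printf 'hello\n' > "$sample"

# Human-oriented output: the size is buried among permissions, owner,
# group and date, so it has to be parsed out by column position.
size_from_ls=$(ls -l "$sample" | awk '{print $5}')

# Script-friendly output: wc -c on stdin prints just the byte count.
size_from_wc=$(wc -c < "$sample")

echo "ls: $size_from_ls  wc: $((size_from_wc))"
rm -f "$sample"
```

The ls version depends on the column layout staying the same; the wc version needs no parsing beyond tolerating optional leading whitespace.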
