This sub-series of blog posts will cover shortcuts and tips that I’ve found useful across various projects over the years. Hopefully they will save someone else some time too.

HOW TO ITERATE THROUGH COMMANDS USING FOR LOOPS

 

Sometimes it’s useful to be able to quickly iterate through a list while performing a series of actions from the Linux command line. This can easily be done using FOR loops, which take the following general format:

 

for VARIABLE in LIST; do COMMAND1; COMMAND2; COMMAND3…; done

where:

  • VARIABLE changes value in each iteration according to the values/items in LIST
  • LIST is a range, e.g. {1..10} or {A..Z}; a list of items, e.g. item1 item2 item3; or the results returned by a command, e.g. $(ls) or $(cat somefile.txt)
  • COMMAND is any command that you want to repeat or iterate through

For each iteration, the contents of VARIABLE can be referenced by using a ‘$’ prefix, e.g. $VARIABLE
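As a minimal sketch of the general format, the loop below runs two commands per iteration and is spread over several lines, which Bash treats the same way as the one-line form. The variable names item and upper, and the three list items, are placeholders chosen purely for illustration.

```shell
#!/usr/bin/env bash
# Multi-line layout of: for VARIABLE in LIST; do COMMAND1; COMMAND2; done
# "item", "upper" and the list contents are placeholders for this sketch.
for item in alpha beta gamma; do
    upper=$(printf '%s' "$item" | tr '[:lower:]' '[:upper:]')  # COMMAND1: upper-case the item
    echo "$item -> $upper"                                     # COMMAND2: print both forms
done
```

Note that VARIABLE keeps its last value once the loop has finished, so item still holds gamma afterwards.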

 

Iterating across a range in a set

For example, to create a list of numbers ranging from 10 to 20, we could type the following:

for number in {10..20}; do echo $number; done

Alternatively, to produce a list of items ranging from A to M, we could enter:

for letter in {A..M}; do echo Item $letter; done
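Two variations on ranges may be worth knowing. The sketch below assumes Bash 4 or later for the step syntax and assumes that seq is installed (it is part of GNU coreutils); the variable names first and last are illustrative.

```shell
#!/usr/bin/env bash
# An optional third field sets the step (needs Bash 4 or later).
evens=$(for number in {10..20..2}; do echo "$number"; done)
echo "$evens"

# Brace expansion happens before variable expansion, so {$first..$last}
# is not treated as a range. seq is one way around this.
first=3
last=6
from_seq=$(for number in $(seq "$first" "$last"); do echo "$number"; done)
echo "$from_seq"
```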

 

Iterating across discontinuous items in a list

FOR loops can also be used to iterate across discontinuous items in a list. For example, to create a list of vitamins we could enter the following:

for x in A B C D E K; do echo Vitamin $x; done
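Items in the list are separated by whitespace, so an item that itself contains a space needs to be quoted. The following is a small sketch of this; the items and the count variable are made up for the example.

```shell
#!/usr/bin/env bash
# The quotes keep "folic acid" together as a single list item.
# The items are illustrative only.
count=0
for x in A B12 "folic acid" K; do
    echo "Item: $x"
    count=$((count + 1))
done
echo "$count items"
```

Without the quotes, folic and acid would be treated as two separate items and the loop would run five times instead of four.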

 

Iterating across nested ranges and lists

Further to the previous scenarios, FOR loops can also iterate across lists where some or all of the items are actually sub-ranges.

For example, to create a list of the matrix metalloproteinases (MMPs), we could enter:

for enzyme_no in 1 2 3 7 8 9 10 11 12 13 14 15 16 17 19 20 21 23A 23B 24 25 26 27 28; do echo MMP$enzyme_no; done

There should be 24 MMPs in total. We can count the number of lines in our list using wc -l (see man wc for more details):

for enzyme_no in 1 2 3 7 8 9 10 11 12 13 14 15 16 17 19 20 21 23A 23B 24 25 26 27 28; do echo MMP$enzyme_no; done | wc -l

However, the above loop can also be condensed by grouping the list items (in this case numbers and letters) into sub-ranges or sets:

for enzyme_no in {1..3} {7..17} {19..21} 23{A..B} {24..28}; do echo MMP$enzyme_no; done

to give the same result.
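One way to convince ourselves of this is to preview the expansion with echo and then compare the output of the two loops directly. This is only a sketch; the variable names preview, explicit and condensed are ours.

```shell
#!/usr/bin/env bash
# echo shows how the shell expands the braces before the loop ever runs.
preview=$(echo {1..3} 23{A..B})
echo "$preview"

# Build both versions of the list and confirm that they are identical.
explicit=$(for enzyme_no in 1 2 3 7 8 9 10 11 12 13 14 15 16 17 19 20 21 23A 23B 24 25 26 27 28; do echo "MMP$enzyme_no"; done)
condensed=$(for enzyme_no in {1..3} {7..17} {19..21} 23{A..B} {24..28}; do echo "MMP$enzyme_no"; done)
if [ "$explicit" = "$condensed" ]; then echo "Both loops give the same list"; fi
```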

Before we move on to the final examples, we are going to save this MMP list by redirecting the output to a new file:

for enzyme_no in {1..3} {7..17} {19..21} 23{A..B} {24..28}; do echo MMP$enzyme_no; done > List_of_all_MMPs.txt
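The position of the redirect matters here. The sketch below contrasts a ‘>’ placed inside the loop with one placed after done; it writes to a temporary directory (the scratch variable and the filenames inside.txt and after.txt are assumptions for the example) so that no existing file is touched.

```shell
#!/usr/bin/env bash
scratch=$(mktemp -d)   # temporary directory so no existing file is overwritten

# '>' inside the loop reopens (and truncates) the file on every iteration.
for n in 1 2 3; do echo "MMP$n" > "$scratch/inside.txt"; done

# '>' after 'done' opens the file once and captures the output of the whole loop.
for n in 1 2 3; do echo "MMP$n"; done > "$scratch/after.txt"

inside_lines=$(wc -l < "$scratch/inside.txt")
after_lines=$(wc -l < "$scratch/after.txt")
echo "inside: $inside_lines line(s), after: $after_lines line(s)"
```

The first file ends up holding only the last item, whereas the second holds all three.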

 

Iterating across items in a list produced by a command (example 1)

The number of iterations for a FOR loop can also depend on the output of a command. To illustrate this, we are going to use the file “List_of_all_MMPs.txt” created by the previous example. For each MMP in “List_of_all_MMPs.txt”, we are going to create another file that contains a list of all the other MMPs except for that one. To do this, our variable, x, is going to iterate through the contents of “List_of_all_MMPs.txt” using:

cat List_of_all_MMPs.txt

To list all the MMPs in “List_of_all_MMPs.txt” while excluding each MMP as it becomes the focus of an iteration, we are going to use an inverted grep (‘-v’, i.e. return every line except those that match, e.g. MMP1) which matches against the entire line (‘-x’):

grep -vx MMP1 List_of_all_MMPs.txt

(Without the ‘-x’, grep would match against any portion of each line. For MMP1, as an example, this would also exclude MMP10 through to MMP17 and MMP19, since MMP18 is not in our list.)
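The effect of ‘-x’ can be checked by counting how many lines survive each form of the grep. This sketch rebuilds the list in a temporary file (the list variable is a stand-in for “List_of_all_MMPs.txt”) and uses grep's ‘-c’ option to print a count rather than the lines themselves.

```shell
#!/usr/bin/env bash
list=$(mktemp)   # temporary stand-in for List_of_all_MMPs.txt
for enzyme_no in {1..3} {7..17} {19..21} 23{A..B} {24..28}; do echo "MMP$enzyme_no"; done > "$list"

# -c prints the number of lines that grep selects instead of the lines themselves.
kept_partial=$(grep -c -v MMP1 "$list")    # MMP1 matched anywhere in the line
kept_whole=$(grep -c -v -x MMP1 "$list")   # MMP1 must match the entire line
echo "Without -x: $kept_partial lines kept; with -x: $kept_whole lines kept"
rm -f "$list"
```

Without ‘-x’ only 14 of the 24 lines are kept, because ten of them contain the text MMP1; with ‘-x’ the expected 23 remain.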

Then for each iteration, we are going to store the results in a separate file by redirecting the output using ‘>’.

Placing these all together in a FOR loop gives:

for x in $(cat List_of_all_MMPs.txt); do grep -vx "$x" List_of_all_MMPs.txt > "List_of_all_MMPs_except_${x}.txt"; done

This FOR loop should produce 24 text files – one for each of the 24 MMPs – and each containing a full list of the MMPs except for the MMP mentioned in the filename (i.e. 23 MMPs per file).
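These numbers can themselves be checked with one more FOR loop. The following sketch repeats the two previous steps in a scratch directory and then counts the files and their lines; the names workdir, file_count and bad_files are our own choices for the example.

```shell
#!/usr/bin/env bash
# Work in a scratch directory so nothing in the current directory is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Recreate the list and the per-MMP files as described above.
for enzyme_no in {1..3} {7..17} {19..21} 23{A..B} {24..28}; do echo "MMP$enzyme_no"; done > List_of_all_MMPs.txt
for x in $(cat List_of_all_MMPs.txt); do grep -vx "$x" List_of_all_MMPs.txt > "List_of_all_MMPs_except_${x}.txt"; done

# Count the generated files and flag any file that does not hold 23 lines.
file_count=0
bad_files=0
for f in List_of_all_MMPs_except_*.txt; do
    file_count=$((file_count + 1))
    lines=$(wc -l < "$f")
    if [ "$lines" -ne 23 ]; then
        echo "Unexpected line count in $f: $lines"
        bad_files=$((bad_files + 1))
    fi
done
echo "$file_count files checked, $bad_files with an unexpected line count"
```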

 

Iterating across items in a list produced by a command (example 2)

Finally, we are going to use another FOR loop, again driven by the output of a command, to bulk rename these 24 text files so that they have shorter filenames.

To do this, we are going to get a list of the files to rename using:

ls List_of_all_MMPs_except_*

and then rename each file using mv with a substitution of the form ${x/old/new}, which replaces the first occurrence of old in the value of x with new (more on this in a subsequent post).

After running the following:

for x in $(ls List_of_all_MMPs_except_*); do mv -v "$x" "${x/List_of_all_MMPs_except_/All_MMPs_except_}"; done

the filenames of these 24 text files should now begin with the shorter “All_MMPs_except_”.
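As an aside, the same rename can be written without ls, because the shell expands the wildcard into a list of filenames by itself. This is a sketch of that variant rather than a replacement for the loop above; it runs in a scratch directory on two stand-in files so that it is safe to try.

```shell
#!/usr/bin/env bash
workdir=$(mktemp -d)   # scratch directory holding two stand-in files
cd "$workdir" || exit 1
touch List_of_all_MMPs_except_MMP1.txt List_of_all_MMPs_except_MMP23A.txt

# The glob is expanded by the shell itself, so no call to ls is needed,
# and the quotes keep each filename intact even if it contains spaces.
for x in List_of_all_MMPs_except_*; do
    mv -v -- "$x" "${x/List_of_all_MMPs_except_/All_MMPs_except_}"
done
```

Our filenames contain no spaces, so both forms behave the same here; the difference only shows up with filenames that contain whitespace or wildcard characters.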

 

And so ends the whirlwind tour of the extremely versatile FOR loop.