We often find that the numerical results produced by our own (or someone else's) code are not in the form we need. To avoid painful hours of copy-and-paste sessions, I have compiled an overview of a few useful tools.
Note: For the rest of this page we will assume a tabulated (multiple columns) format of the data files.
cat file1 >> file2
The contents of file1 are simply appended to the end of file2.
sort -n file1 file2 ...
All files will be merged and sorted by the numeric value of the first entry (column).
paste file1 file2 ...
All files will be merged line by line such that their columns are placed next to one another (side by side).
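The three commands above can be tried out on throwaway data. The following is only a sketch: the file contents and the scratch directory are made up for illustration, and the append is done on a copy so that the sample inputs stay unchanged.

```shell
#!/bin/sh
# Hypothetical sample data: two small two-column files in a scratch directory.
tmp=$(mktemp -d)
printf '3 0.30\n1 0.10\n' > "$tmp/file1"
printf '2 0.20\n10 1.00\n' > "$tmp/file2"

# cat: copy file2, then append file1 to the end of the copy.
cp "$tmp/file2" "$tmp/appended"
cat "$tmp/file1" >> "$tmp/appended"
appended=$(cat "$tmp/appended")

# sort -n: merge both files and order the rows by the numeric value of column 1.
sorted=$(sort -n "$tmp/file1" "$tmp/file2")

# paste: put row i of file1 next to row i of file2, separated by a tab.
pasted=$(paste "$tmp/file1" "$tmp/file2")

printf '%s\n\n%s\n\n%s\n' "$appended" "$sorted" "$pasted"
rm -r "$tmp"
```

Note that sort -n places the row starting with 10 after the row starting with 3, whereas a plain sort would compare the entries as text and put 10 before 2.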
sed
Sed is a very powerful stream editor which allows us to manipulate data in various ways. A detailed description of all features would exceed the scope of this writeup. I find myself using the following features more than any other:
sed 's/search term/replace term/g' in-file > out-file
Replaces all instances of search term with replace term; the result is then written to out-file.
sed '/^[ \t]*$/d' in-file > out-file
This deletes any lines that are entirely empty (^$) as well as lines that look empty but contain only spaces (^[ ]*$) or tabs (^[\t]*$).
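Both sed one-liners can be checked on a small made-up file. This is a minimal sketch: the sample content and the terms foo and bar are hypothetical, and the POSIX class [[:space:]] is used in place of [ \t] on the assumption that not every sed implementation interprets \t inside brackets as a tab.

```shell
#!/bin/sh
# Hypothetical input: two data rows separated by an empty line,
# a line of spaces and a line holding a single tab.
tmp=$(mktemp -d)
printf '1 foo\n\n   \n\t\n2 foo\n' > "$tmp/in-file"

# Replace every occurrence of "foo" with "bar".
replaced=$(sed 's/foo/bar/g' "$tmp/in-file")

# Delete blank lines; [[:space:]] covers both spaces and tabs.
cleaned=$(sed '/^[[:space:]]*$/d' "$tmp/in-file")

printf '%s\n' "$cleaned"
rm -r "$tmp"
```

Only the two data rows survive the second command, while the first leaves the blank lines in place and changes just the text.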
awk
Although I mainly use Perl for more involved operations, awk can be quite useful for some simple logic/search operations. Once again, one very useful 'mini-script':
awk '{ if ($1 == $2) print $0; }' in-file > out-file
If the logical test of whether the value in column 1 equals that in column 2 returns TRUE ($1==$2), we print the entire line ($0). Any specific column ($i) can be printed instead.
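As a small illustration of the test described above, the sketch below runs it on a made-up three-column file. It uses awk's shorter pattern form, in which a condition without an action prints the whole line; the file content is hypothetical.

```shell
#!/bin/sh
# Hypothetical three-column input; only the first and third rows
# have equal values in columns 1 and 2.
tmp=$(mktemp -d)
printf '1 1 a\n1 2 b\n3 3 c\n' > "$tmp/in-file"

# Pattern-only form: a true condition prints the whole line ($0) by default.
whole=$(awk '$1 == $2' "$tmp/in-file")

# Explicit action: print only column 3 of the matching rows.
third=$(awk '$1 == $2 { print $3 }' "$tmp/in-file")

printf '%s\n%s\n' "$whole" "$third"
rm -r "$tmp"
```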
Similarly to awk we can also specify logic operations in gnuplot:
gnuplot> plot 'filename' u ($1==$3&&$4==0.0346?$3:1/0):($2)
The data to be plotted are taken from filename, with the x value taken from column 3 ($3) and the y value from column 2 ($2). The logic instruction in the x value specifies that only the data points which have equal values in columns 1 and 3 and, in addition, a value of exactly 0.0346 in column 4 will be used. For every other row the expression evaluates to 1/0, which is undefined, and gnuplot skips points with undefined values rather than plotting them.
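Before plotting, it can help to see which rows such a filter keeps. The sketch below expresses the same condition in awk and prints the x and y values that would end up in the plot; the four-column sample data are invented for illustration.

```shell
#!/bin/sh
# Hypothetical four-column data file.
tmp=$(mktemp -d)
printf '1 5 1 0.0346\n2 6 3 0.0346\n4 7 4 0.0100\n8 9 8 0.0346\n' > "$tmp/filename"

# Same condition as the gnuplot expression: column 1 equals column 3
# and column 4 equals 0.0346. Print column 3 (x) and column 2 (y).
kept=$(awk '$1 == $3 && $4 == 0.0346 { print $3, $2 }' "$tmp/filename")

printf '%s\n' "$kept"
rm -r "$tmp"
```

Only the first and last rows pass both tests, so only those two points would be drawn.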
One very useful feature is that all the data parsing techniques mentioned in the first section can also be used directly by gnuplot along with any other command line options. One example would be the combination of two files using paste:
gnuplot> plot "< paste file1 file2" u 1:2 w l
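This example plots column 2 against column 1 of the merged input and connects the points with lines (w l). Because the columns of file2 follow those of file1 in the merged input, this method allows one to involve more than one file in the data manipulation.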
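To see how the columns are numbered once two files have been pasted together, the merged stream can be inspected on the command line first. This is a sketch with made-up two-column files; selecting fields 1 and 4 with awk mirrors what u 1:4 would select in gnuplot.

```shell
#!/bin/sh
# Hypothetical inputs: two files with two columns each.
tmp=$(mktemp -d)
printf '1 0.10\n2 0.20\n' > "$tmp/file1"
printf '1 7.5\n2 8.5\n' > "$tmp/file2"

# Each merged row holds file1's columns as 1-2 and file2's columns as 3-4.
ncols=$(paste "$tmp/file1" "$tmp/file2" | awk 'NR == 1 { print NF }')

# x from file1 (column 1), y from file2 (merged column 4).
xy=$(paste "$tmp/file1" "$tmp/file2" | awk '{ print $1, $4 }')

printf '%s columns\n%s\n' "$ncols" "$xy"
rm -r "$tmp"
```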
While on the topic of command line manipulation, it is useful to know that any shell command can be executed from within gnuplot if it is preceded by a "!". For example, the following prints the current working directory while in gnuplot:
gnuplot> !pwd