Ahhh yes, the CSV format: it looks so easy to process. Yet an embedded newline here, a stray quote there, and it becomes an awful pain to work with. To that end, I give you csveach, a small perl script that attempts to make working with CSV files on the command line headache free.
Here's how it works. Suppose you've got some data:
# file: foo.csv Alice,555-1212,"To Be, Or not to Be, That is the question!" Bob,546-7777,"Better to keep your mouth shut and be thought a fool, then to open it and remove all doubt." Charlie,930-9808,"Faster. Just run Faster."
(Note the annoying embedded newlines)
First, you craft a short shell script that's going to get executed for each row in the CSV file. Each column is handed to the script as a parameter. $1 is the row number in the CSV file, $2 is the first column, $3 the second, and so on. Here's a trivial shell script that would extract just the name and phone number from the above:
# file: name.sh #!/bin/bash echo "$2: $3"
Finally, you run:
csveach ./name.sh foo.csv
Which gives the output:
Alice: 555-1212 Bob: 546-7777 Charlie: 930-9808
The shell script can just as easily work with the newline embedded text. For example:
# file: quote.sh #!/bin/bash quote=`echo "$4" | tr '\n' '|'` len=`echo $quote | wc -c` echo "$2's $len words of wisdom: $quote"
This can be run as:
csveach ./quote.sh foo.csv
and gives the output:
Alice's 44 words of wisdom: To Be,|Or not to Be,|That is the question!| Bob's 93 words of wisdom: Better to keep your mouth shut and be thought a fool,|then to open it and remove all doubt.| Charlie's 26 words of wisdom: Faster.|Just run Faster.|
By letting a shell script (or really any executable) do the heavy lifting, it's possible to reformat or process CSV data any way you want. And best of all, you can ignore all those exceptional cases that make CSV such a pain.
Enjoy!
#!/bin/perl ## ## For each row of a CSV file kick off a system command ## use Text::CSV; $script = shift(@ARGV); my $csv = Text::CSV->new ({ binary => 1, eol => $/ }); my $row_num = 0; foreach my $file (@ARGV) { open my $io, "<", $file or die "$file: $!"; while (my $row = $csv->getline ($io)) { $row_num++; my @args = @$row; unshift(@args, $row_num); (system $script, @args) == 0 or die "system @command failed: $?"; } }
I did something similar a long time ago, and did it again at my blog.
ReplyDeletepbewig - thanks for sharing.
ReplyDelete