I answered a question on the Perl forum today about splitting CSV. CSV is a comma-separated format; for example:
blah,blah,blah
You can put values in quotes:
blah,"blah blah",blah
And those quotes make commas within them meaningless too:
blah,"blah,blah,blah",blah
If we do the naive thing and implement it using split on a comma:
my @fields = split(/,/, $string);
Then we will obviously get the Wrong Answer, because the commas inside the quotes get split on too. The question was whether there is a regex we can use with split that will do the Right Thing. The answer was yes, though it took me a few minutes to come up with it. The catch is that the pattern must match nothing more than the comma we are splitting on, yet it still has to analyse the string ahead of (or behind) that comma to detect whether the comma is inside quotes.
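To make that concrete, here is a minimal sketch of one way to do it, which may or may not be the regex I ended up posting. It assumes a lookahead that only accepts a comma when an even number of double quotes remains between it and the end of the string, so a comma inside an open quoted value is skipped. The variable names are made up for illustration, and the sketch assumes one record per string with balanced quotes.

```perl
use strict;
use warnings;

my $string = 'blah,"blah,blah,blah",blah';

# Naive split: breaks on every comma, including the quoted ones.
my @naive = split(/,/, $string);

# Assumed approach: match a comma only if the rest of the string holds
# an even number of double quotes. The lookahead consumes nothing, so
# split removes just the comma. The groups are non-capturing because
# split would otherwise return captured text as extra fields.
my $comma_outside_quotes = qr/,(?=(?:[^"]*"[^"]*")*[^"]*$)/;

# A limit of -1 keeps trailing empty fields such as the one in 'a,b,'.
my @fields = split($comma_outside_quotes, $string, -1);

print scalar(@naive), " fields from the naive split\n";   # 5
print scalar(@fields), " fields from the lookahead split\n"; # 3
print "$_\n" for @fields;
```

The surrounding quotes stay on the second field, so they still need stripping afterwards. The lookahead also rescans the rest of the string for every comma, and it cannot cope with quoted values that span lines, so for real data a dedicated parser such as the CPAN module Text::CSV is the safer choice.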