Tuesday 15 May 2018

Useful code snippets for working with XML, CSV, and other flat file formats in Perl, Groovy, Python, Java, C, and C++

Lock files in Windows batch scripts: prevent more than one instance from running

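REM Check if another instance is running, and exit if true
REM (exit /b leaves only this script; a plain exit would also close the calling console)
IF EXIST ".lock" exit /b 0

REM Create the lock file, recording when and by whom the script was started
echo Batch file started at %time% %date% by %username%.> ".lock"

REM Script processing starts here

REM Script processing ends here

REM Remove the lock file so the next run can start
del ".lock"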
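
The batch script checks for .lock and creates it in two separate steps, so two instances started at almost the same moment could both pass the check. As a rough comparison only, here is a minimal Java sketch of the same lock-file idea in which the check and the creation happen in one call. The class and method names (LockFileDemo, acquire, release) are made up for this illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

class LockFileDemo {
    // Tries to create the lock file; returns false if it already exists.
    static boolean acquire(Path lock) {
        try {
            Files.createFile(lock); // existence check and creation are one atomic operation
            return true;
        } catch (FileAlreadyExistsException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Removes the lock file so the next run can start.
    static void release(Path lock) {
        try {
            Files.deleteIfExists(lock);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path lock = Paths.get(".lock");
        if (!acquire(lock)) {
            System.out.println("Another instance is running, exiting.");
            return;
        }
        try {
            System.out.println("Processing starts here");
        } finally {
            release(lock); // always clean up, even if processing fails
        }
    }
}
```

As with the batch version, a crash that skips the cleanup leaves a stale .lock behind, which then has to be deleted by hand.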


Replace XML tag data with Perl

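use strict;
use warnings;
use XML::Twig;

# Process every XML file in the current directory
for my $file ( glob "*.xml" ) {
    print "process file $file\n";

    XML::Twig->new(
        pretty_print  => 'indented',
        twig_handlers => {
            # Inside a handler, $_ is the matched element
            PaymentAmount => sub {
                $_->set_text( '0' )->flush;
            },
        },
    )->parsefile_inplace( $file, 'orig_*' );    # keeps a backup of the original file
}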

Groovy and Tokenize or Split

tokenize() works well when you want to discard fields or lines with no data, for example:

List myList =  inputStream.getText().tokenize("\n\r")

It cannot be used if you need to keep the field positions of a CSV or pipe-delimited record, because it does not return the entries with no data.

For example, tokenizing the following line on the pipe discards the empty second field: "Line1"||"Name"

Here we need to use split() instead:

List lHeader = sHeader.split("\\|")
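
To make the difference concrete, here is a small Java sketch (Groovy's split() is the same method as Java's String.split). It uses java.util.StringTokenizer as a stand-in for tokenize(), on the assumption that the two treat empty fields the same way. The class name, method names, and sample line are made up for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.StringTokenizer;

class EmptyFieldDemo {
    // Stand-in for tokenize(): runs of delimiters collapse, so empty fields vanish.
    static List<String> tokenize(String line, String delims) {
        List<String> out = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(line, delims);
        while (st.hasMoreTokens()) {
            out.add(st.nextToken());
        }
        return out;
    }

    // Splits on a literal pipe; the limit of -1 also keeps trailing empty fields.
    static List<String> splitKeepEmpty(String line) {
        return Arrays.asList(line.split("\\|", -1));
    }

    public static void main(String[] args) {
        String line = "Line1||Name|";
        System.out.println(tokenize(line, "|"));   // [Line1, Name]
        System.out.println(splitKeepEmpty(line));  // [Line1, , Name, ]
    }
}
```

Note that split() without a limit argument still drops empty fields at the end of the line; passing -1 as the second argument keeps them, which matters when the last columns of a record can be empty.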

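Another odd behavior: if you split on an unescaped pipe, "|", split() returns an array of single characters. This happens because split() takes a regular expression, and in a regular expression an unescaped | is the alternation operator. With nothing on either side of it, the pattern matches the empty string, so the input is split between every character. Escaping the pipe, as above, makes it a literal delimiter.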
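
The following Java sketch reproduces the behavior and shows two ways to split on a literal pipe. The class name, helper name, and sample string are made up for this illustration, and the first output shown assumes Java 8 or later (older versions add a leading empty string).

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class PipeSplitDemo {
    // Thin wrapper so each variant below goes through the same call.
    static String[] splitOn(String text, String regex) {
        return text.split(regex);
    }

    public static void main(String[] args) {
        String header = "id|name";

        // Unescaped pipe: an empty alternation that matches between every character.
        System.out.println(Arrays.toString(splitOn(header, "|")));
        // [i, d, |, n, a, m, e]

        // Escaped pipe: a literal delimiter.
        System.out.println(Arrays.toString(splitOn(header, "\\|")));
        // [id, name]

        // Pattern.quote() escapes any delimiter without having to remember the rules.
        System.out.println(Arrays.toString(splitOn(header, Pattern.quote("|"))));
        // [id, name]
    }
}
```

Pattern.quote() is handy when the delimiter comes from configuration rather than being hard-coded, since other common delimiters such as "." and "*" are regex metacharacters too.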
