Last modified on 03/05/18 14:11:53

Encoding JSON using Perl

Perl is not a strictly typed language, and this can have ramifications for producing JSON output. In our case it matters because the vso_jsoc_fetch.cgi CGI produces JSON that is critical for NetDRMS systems, and the exact JSON produced depends on the Perl "typing". That "typing" amounts to a set of internal flags associated with each variable, and these can be set in surprising ways. Worse, modules that try to determine a variable's type can interpret the flags differently, and the behavior can change with updates to the Perl version or to the JSON module in use. For instance, consider this code in vso_jsoc_fetch.cgi :

my $totalSize=0;
my $totalCount=0;
for my $key (keys %{$data}) {
  $totalSize += $data->{$key}->{'susize'};
  $totalCount +=1;
}

For many years this worked fine. In the JSON output, every $data->{$key}->{'susize'} appeared as a quoted string, something like "susize" : "10284610"

However, after cpanm was used to install a new module, the JSON perl module appears to have been updated as well. It turned out that doing math with $data->{$key}->{'susize'} in the line $totalSize += $data->{$key}->{'susize'}; changes some of the internal Perl flags on $data->{$key}->{'susize'}, so that it can be treated as a numeric type rather than a string. As far as I can tell, the old JSON module did not look at those flags, but the new one did. The result was that the value was written as a number rather than a string in the JSON output, which looked like this : "susize" : 10284610 (note that there are no quotes around the number).
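
The flag change described above can be observed without any JSON module at all. The following is a minimal sketch using only the core B module; the helper name flag_names, the key rec1 and the size value are made up for illustration, and the exact set of flags may differ between Perl versions.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use B ();

# Hypothetical helper: list which value slots of a scalar are marked valid.
# It only inspects the flags; it does not read the value itself.
sub flag_names {
    my ($ref) = @_;
    my $flags = B::svref_2object($ref)->FLAGS;
    my @names;
    push @names, 'IOK' if $flags & B::SVp_IOK();   # integer slot valid
    push @names, 'NOK' if $flags & B::SVp_NOK();   # float slot valid
    push @names, 'POK' if $flags & B::SVp_POK();   # string slot valid
    return join ',', @names;
}

# Made-up record standing in for one entry of the real $data.
my $data = { rec1 => { susize => "10284610" } };

my $before = flag_names(\$data->{rec1}->{'susize'});

my $totalSize = 0;
$totalSize += $data->{rec1}->{'susize'};   # numeric use of the hash value itself

my $after = flag_names(\$data->{rec1}->{'susize'});

print "before: $before\n";
print "after:  $after\n";
```

The value is never assigned to, yet after the addition it carries an integer flag in addition to its string flag. Whether an encoder then writes it with or without quotes depends on which flags that encoder checks.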

This was very hard to debug, since the vso_jsoc_fetch.cgi CGI was producing JSON that seemed, at a glance, to be OK, but in fact did not parse. Worse, this affected not only the local NetDRMS node, but every NetDRMS node that attempted to download from the local node.

One workaround is to modify the code to use a temporary variable to do the math with, like so :

my $totalSize=0;
my $totalCount=0;
for my $key (keys %{$data}) {
  my $tmpVar = $data->{$key}->{'susize'};
  $totalSize += $tmpVar;
  $totalCount +=1;
}

This means that the re-typing from string to numeric happens on $tmpVar and the typing of $data->{$key}->{'susize'} is left alone so that it remains a string.
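
A quick way to check that the workaround behaves as described is to look at the flags of the original hash values after the loop has run. This is a sketch under assumptions: the helper has_integer_slot and the two records are hypothetical, and only the core B module is used.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use B ();

# Hypothetical helper: true if the scalar behind $ref has a cached integer value.
sub has_integer_slot {
    my ($ref) = @_;
    return (B::svref_2object($ref)->FLAGS & B::SVp_IOK()) ? 1 : 0;
}

# Made-up records standing in for the real $data.
my $data = {
    rec1 => { susize => "10284610" },
    rec2 => { susize => "512" },
};

my $totalSize = 0;
for my $key (keys %{$data}) {
    my $tmpVar = $data->{$key}->{'susize'};   # copy of the string
    $totalSize += $tmpVar;                    # only the copy is used numerically
}

# Collect the keys whose original value picked up an integer flag.
my @touched = grep { has_integer_slot(\$data->{$_}->{'susize'}) } keys %{$data};

print "total: $totalSize\n";
print "originals with an integer flag: ", scalar(@touched), "\n";
```

If the second line reports anything other than 0, some other part of the code is still using the hash values numerically.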

A more robust solution, however, is to install the Cpanel::JSON::XS module and use it, together with Cpanel::JSON::XS::Type, to set the JSON type of each value explicitly. This is probably best demonstrated by the script below. It is probably not optimal to install a new Perl module on a production NetDRMS machine; however, it should be considered when setting up a new machine.

#!/usr/bin/perl

use warnings;
use strict;
use Cpanel::JSON::XS;
use Cpanel::JSON::XS::Type;

use Devel::Peek;

# Insofar as there is a "type" for perl variables, it's controlled
# by a set of flags associated with the variable. The Devel::Peek
# module lets you peek at the flags. IOK marks a valid integer value (IV),
# NOK a valid float (NV) and POK a valid string (PV); beyond that, it's
# kind of opaque. Also the flags are prone to change between versions of
# perl, and different versions of the JSON module may look at different flags.

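# Just to use Devel::Peek to look at some flags.
# A fresh string: only the string flags (POK) are set.
my $var = "3";
Dump $var;
print "^^^^^^^^^^\n";

# Reading $var in numeric context caches an integer in it,
# so the integer flags (IOK) are set alongside POK.
my $x = $var + 0;
Dump $var;
print "^^^^^^^^^^\n";

# Assigning a numeric result back to $var leaves only IOK set.
$var += 0;
Dump $var;
print "^^^^^^^^^^\n";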

# The bottom line is that it's probably bad practice, when encoding JSON,
# to rely on the flags. You're better off setting up a hash table that
# lets you set the type for each hash entry by name. What's below will
# ALWAYS print 'num' as an integer, 'amount' as floating point and
# 'name' as a string. It depends on having Cpanel::JSON::XS installed,
# which at the moment I don't, and I don't want to risk installing it
# on a running system. But when I set up a new system, I'm going to move
# vso_jsoc_fetch.cgi to using this methodology.

my $json = 
    Cpanel::JSON::XS->new->allow_nonref->allow_unknown->allow_blessed->pretty(1);

# Associate the hash names with types.
my $type = {
             'num' => JSON_TYPE_INT,
             'amount' => JSON_TYPE_FLOAT,
             'name' => JSON_TYPE_STRING
            };

# Put all entries in as integers
my $data1;
$data1->{"num"} = 1; $data1->{"amount"} = 2; $data1->{"name"} = 3;
my $body1 = $json->encode($data1, $type);
print $body1;

# Put all entries in as floating point.
my $data2;
$data2->{"num"} = 4.0; $data2->{"amount"} = 5.0; $data2->{"name"} = 6.0;
my $body2 = $json->encode($data2, $type);
print $body2;

# Put them all in as strings
my $data3;
$data3->{"num"} = "7.0"; $data3->{"amount"} = "8"; $data3->{"name"} = "Niles Oien";
my $body3 = $json->encode($data3, $type);
print $body3;

# No matter what, 'num' is printed as an integer, 'amount' as float, 'name' as string.
# For vso_jsoc_fetch.cgi we want to use string for everything.
# Niles March 2018.

exit 0;

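The above script first prints the three Devel::Peek dumps, which show the flags changing as $var is used numerically, and then JSON that is consistent in the typing of the variables :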

SV = PV(0x1f7fcf0) at 0x1faa008
  REFCNT = 1
  FLAGS = (PADMY,POK,pPOK)
  PV = 0x1f98300 "3"\0
  CUR = 1
  LEN = 16
^^^^^^^^^^
SV = PVIV(0x1fa9080) at 0x1faa008
  REFCNT = 1
  FLAGS = (PADMY,IOK,POK,pIOK,pPOK)
  IV = 3
  PV = 0x1f98300 "3"\0
  CUR = 1
  LEN = 16
^^^^^^^^^^
SV = PVIV(0x1fa9080) at 0x1faa008
  REFCNT = 1
  FLAGS = (PADMY,IOK,pIOK)
  IV = 3
  PV = 0x1f98300 "3"\0
  CUR = 1
  LEN = 16
^^^^^^^^^^
{
   "amount" : 2.0,
   "num" : 1,
   "name" : "3"
}
{
   "amount" : 5.0,
   "num" : 4,
   "name" : "6"
}
{
   "amount" : 8.0,
   "num" : 7,
   "name" : "Niles Oien"
}
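
Where a new module cannot be installed, a stopgap in the same spirit is to build a fresh string for every value just before encoding, so that the encoder only ever sees string flags. This is not the type-specification approach shown above; it is a sketch that assumes the core JSON::PP module as the encoder and a flat record. The helper stringify_values, the key count and the values are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use JSON::PP;   # core module; an assumption, the CGI may use another encoder

# Hypothetical helper: return a copy of a flat record in which every
# value is a newly built string, leaving the original record untouched.
sub stringify_values {
    my ($record) = @_;
    return { map { ($_ => "" . $record->{$_}) } keys %{$record} };
}

# Made-up record; the size is deliberately turned into a pure integer.
my $record = { susize => "10284610", count => 3 };
$record->{'susize'} += 0;

my $json = JSON::PP->new->canonical;   # canonical sorts the keys

my $raw  = $json->encode($record);                    # types follow the flags
my $safe = $json->encode(stringify_values($record));  # every value is a string

print "$raw\n";
print "$safe\n";
```

With JSON::PP this should print {"count":3,"susize":10284610} followed by {"count":"3","susize":"10284610"}. The cost is an extra copy of each record, which is why the explicit type table remains the cleaner option on a machine where Cpanel::JSON::XS is available.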

Niles Oien March 2018