When no special steps are taken, if a null value (blank cell) is encountered in evaluating an expression (usually because one of the columns it relies on has a null value in the row in question) then the result of the expression is also null.
It is possible to exercise more control than this, but it
requires a little bit of care,
because the expressions work in terms of primitive values
(numeric or boolean ones) which don't in general have a defined null
value. The name "null
"
in expressions gives you the java null
reference, but this cannot be matched against a primitive value
or used as the return value of a primitive expression.
For most purposes, the following two tips should enable you to work with null values:
NULL_
"
(use upper case) to the column name or $ID. This
will yield a boolean value which is true if the column contains
a blank or a floating point NaN (not-a-number) value,
and false otherwise.
Note that if combined with other boolean expressions,
this null test should come first, i.e. write
"NULL_i || i==999
" rather than
"i==999 || NULL_i
",
though this is only essential for integer or boolean variables.
NULL
"
(upper case). To return a null value from a non-numeric expression
(e.g. a String column) use the name "null
" (lower case).
Null values are often used in conjunction with the conditional
operator, "? :
"; the expression
test ? tval : fvalreturns the value
tval
if the boolean expression test
evaluates true, or fval
if test
evaluates false.
So for instance the following expression:
Vmag == -99 ? NULL : Vmagcan be used to define a new column which has the same value as the
Vmag
column for most values, but if Vmag
has the "magic" value -99 the new column will contain a blank.
The opposite trick (substituting a blank value with a magic one) can
be done like this:
NULL_Vmag ? -99 : VmagSome more examples are given in Section 10.8.
Note that for floating point data,
STILTS treats null
and NaN (Not-a-Number) values
somewhat interchangeably.
Blank values arising either from an input file format that can represent
missing values, or from processing that fails to provide a definite value,
are in most cases represented internally as null
for integer-type values and NaN for floating point values.
However in general users should not rely on distinguishing between
null
and NaN.