The usage of tgridmap is
   stilts <stilts-flags> tgridmap ifmt=<in-format> istream=true|false icmd=<cmds>
                                  ocmd=<cmds>
                                  omode=out|meta|stats|count|checksum|cgi|discard|topcat|samp|plastic|tosql|gui
                                  out=<out-table> ofmt=<out-format>
                                  coords=<expr> ... logs=true|false ...
                                  bounds=[<lo>]:[<hi>] ... binsizes=<size> ...
                                  nbins=<num> ...
                                  cols=<expr>[;<combiner>[;<name>]] ...
                                  combine=sum|sum-per-unit|count|count-per-unit|mean|median|Q1|Q3|min|max|stdev|stdev_pop|hit
                                  sparse=true|false
                                  runner=sequential|parallel|parallel<n>|partest
                                  [in=]<table>
If you don't have the stilts script installed, write "java -jar stilts.jar" instead of "stilts" - see Section 3.
The available <stilts-flags> are listed in Section 2.1.
For programmatic invocation, the Task class for this command is uk.ac.starlink.ttools.task.GridDensityMap.
Parameter values are assigned on the command line as explained in Section 2.3. They are as follows:
binsizes = <size> ...
(Double[])
Either this parameter or the nbins parameter must be supplied.
If supplied, this parameter must have the same number of words as the coords parameter.
bounds = [<lo>]:[<hi>] ...
(double[][])
If any of the bounds need to be determined automatically in this way, two passes through the data will be required, the first to determine bounds and the second to calculate the map.
If supplied, this parameter must have the same number of words as the coords parameter.
cols = <expr>[;<combiner>[;<name>]] ...
(String[])
Each item is composed of one, two or three tokens, separated by semicolon (";") characters:
   <expr>: (required) column name or expression using the expression language for the quantity to be aggregated.
   <combiner>: (optional) combination method, using the same options as for the combine parameter. If omitted, the value specified for that parameter will be used.
   <name>: (optional) name of output column; if omitted, the <expr> value (perhaps somewhat sanitised) will be used.
The default value is "1;count;COUNT", which simply provides an unweighted histogram, i.e. a count of the rows in each bin (aggregation of the value "1" using the combination method "count", yielding an output column named "COUNT").
[Default: 1;count;COUNT]
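The token structure of a cols item can be illustrated with a short Python sketch. This is not how STILTS itself parses the value; the names ColSpec and parse_col_item are hypothetical, the output name is not sanitised here, and treating an empty token as "omitted" is an assumption.

```python
from typing import NamedTuple


class ColSpec(NamedTuple):
    """Hypothetical container for one parsed cols item."""
    expr: str
    combiner: str
    name: str


def parse_col_item(item: str, default_combiner: str = "mean") -> ColSpec:
    """Split '<expr>[;<combiner>[;<name>]]' into its three parts.

    default_combiner stands in for the value of the combine parameter.
    Assumption: an empty optional token counts as omitted.
    """
    tokens = item.split(";")
    if not 1 <= len(tokens) <= 3 or not tokens[0]:
        raise ValueError(f"expected <expr>[;<combiner>[;<name>]], got {item!r}")
    expr = tokens[0]
    combiner = tokens[1] if len(tokens) > 1 and tokens[1] else default_combiner
    # The real tool may sanitise the expression to form a column name;
    # this sketch reuses the expression text unchanged.
    name = tokens[2] if len(tokens) > 2 and tokens[2] else expr
    return ColSpec(expr, combiner, name)
```

For instance, the default value "1;count;COUNT" splits into the expression "1", the combiner "count" and the output name "COUNT".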
combine = sum|sum-per-unit|count|count-per-unit|mean|median|Q1|Q3|min|max|stdev|stdev_pop|hit
(Combiner)
   sum: the sum of all the combined values per bin
   sum-per-unit: the sum of all the combined values per unit of bin size
   count: the number of non-blank values per bin (weight is ignored)
   count-per-unit: the number of non-blank values per unit of bin size (weight is ignored)
   mean: the mean of the combined values
   median: the median
   Q1: first quartile
   Q3: third quartile
   min: the minimum of all the combined values
   max: the maximum of all the combined values
   stdev: the sample standard deviation of the combined values
   stdev_pop: the population standard deviation of the combined values
   hit: 1 if any values present, NaN otherwise (weight is ignored)
   Q.nnn: quantile nnn (e.g. Q.05 is the fifth percentile)
Note this value may be overridden on a per-column basis by the cols parameter.
[Default: mean]
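To make the distinctions between some of these methods concrete, the following Python sketch applies a subset of them to the values falling in a single bin. It is an illustration only, not the STILTS implementation: the function name combine_bin is hypothetical, blank values are modelled as NaN, and the results for an empty bin or for a sample standard deviation of fewer than two values are assumptions. Quantile methods are left out because quantile conventions differ between implementations.

```python
import math
import statistics


def combine_bin(method, values, bin_size=1.0):
    """Combine the values in one bin; bin_size is the bin extent (hypothetical name)."""
    vals = [v for v in values if not math.isnan(v)]  # drop blank (NaN) values
    if method == "hit":
        return 1.0 if vals else math.nan
    if method == "count":
        return float(len(vals))
    if method == "count-per-unit":
        return len(vals) / bin_size
    if not vals:
        return math.nan  # assumption: other methods yield NaN for an empty bin
    if method == "sum":
        return math.fsum(vals)
    if method == "sum-per-unit":
        return math.fsum(vals) / bin_size
    if method == "mean":
        return statistics.fmean(vals)
    if method == "median":
        return statistics.median(vals)
    if method == "min":
        return min(vals)
    if method == "max":
        return max(vals)
    if method == "stdev":
        # Sample standard deviation (divides by n-1); undefined for n < 2.
        return statistics.stdev(vals) if len(vals) > 1 else math.nan
    if method == "stdev_pop":
        # Population standard deviation (divides by n).
        return statistics.pstdev(vals)
    raise ValueError(f"unsupported method in this sketch: {method}")
```

The per-unit variants differ from their plain counterparts only by the division by bin size, which is why they are useful when comparing grids made with different bin extents.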
coords = <expr> ...
(String[])
icmd = <cmds>
(ProcessingStep[])
Specifies processing to be performed on the input table given by the in parameter, before any other processing has taken place.
The value of this parameter is one or more of the filter commands described in Section 6.1.
If more than one is given, they must be separated by semicolon characters (";").
This parameter can be repeated multiple times on the same command line to build up a list of processing steps.
The sequence of commands given in this way defines the processing pipeline which is performed on the table.
Commands may alternatively be supplied in an external file, by using the indirection character '@'.
Thus a value of "@filename" causes the file filename to be read for a list of filter commands to execute.
The commands in the file may be separated by newline characters and/or semicolons, and lines which are blank or which start with a '#' character are ignored.
A backslash character '\' at the end of a line joins it with the following line.
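The rules for an external command file can be sketched in Python as follows. This is a simplified illustration rather than the parser STILTS uses: the function name read_filter_commands is hypothetical, line joining is assumed to happen before comment lines are discarded, and semicolons inside quoted arguments are not handled.

```python
from typing import List


def read_filter_commands(text: str) -> List[str]:
    """Turn the content of a command file into a list of filter commands."""
    # Step 1: a trailing backslash joins a line with the following one.
    logical_lines = []
    pending = ""
    for raw in text.splitlines():
        line = pending + raw
        pending = ""
        if line.endswith("\\"):
            pending = line[:-1]
            continue
        logical_lines.append(line)
    if pending:
        logical_lines.append(pending)

    # Step 2: skip blank lines and '#' comment lines, then split on semicolons.
    commands = []
    for line in logical_lines:
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        commands.extend(part.strip() for part in stripped.split(";") if part.strip())
    return commands
```

Under these assumptions, a file containing a comment line, the line "head 100; tail 10", a blank line, and "sort" continued onto a second line holding "x" yields the three commands "head 100", "tail 10" and "sort x".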
ifmt = <in-format>
(String)
Specifies the format of the input table given by the in parameter.
The known formats are listed in Section 5.1.1.
This flag can be used if you know what format your table is in.
If it has the special value (auto) (the default), then an attempt will be made to detect the format of the table automatically.
This cannot always be done correctly however, in which case the program will exit with an error explaining which formats were attempted.
This parameter is ignored for scheme-specified tables.
[Default: (auto)]
in = <table>
(StarTable)
The location of the input table. Forms that this value may take include:
   The special value "-", meaning standard input. In this case the input format must be given explicitly using the ifmt parameter. Note that not all formats can be streamed in this way.
   A scheme specification of the form :<scheme-name>:<scheme-args>.
   A system command line with either a "<" character at the start, or a "|" character at the end ("<syscmd" or "syscmd|"). This executes the given pipeline and reads from its standard output. This will probably only work on unix-like systems.
istream = true|false
(Boolean)
If set true, the input table specified by the in parameter will be read as a stream.
It is necessary to give the ifmt parameter in this case.
Depending on the required operations and processing mode, this may cause the read to fail (sometimes it is necessary to read the table more than once).
It is not normally necessary to set this flag; in most cases the data will be streamed automatically if that is the best thing to do.
However it can sometimes result in less resource usage when processing large files in certain formats (such as VOTable).
This parameter is ignored for scheme-specified tables.
[Default: false]
logs = true|false ...
(Boolean[])
If supplied, this parameter must have the same number of words as the coords parameter.
nbins = <num> ...
(Integer[])
Either this parameter or the binsizes parameter must be supplied.
If supplied, this parameter must have the same number of words as the coords parameter.
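The word-count rule shared by logs, bounds, binsizes and nbins can be checked before a command is launched. The Python sketch below assembles an argument list for tgridmap and validates it; the function name tgridmap_args and the file and column names used with it are hypothetical. It assumes that each coords word is a simple column name without embedded spaces, and that at least one of nbins and binsizes is required, as the parameter descriptions suggest.

```python
import shlex


def tgridmap_args(in_table, coords, nbins=None, binsizes=None,
                  bounds=None, logs=None, out=None):
    """Build a stilts tgridmap argument list, checking per-axis word counts."""
    if nbins is None and binsizes is None:
        raise ValueError("supply nbins or binsizes")
    per_axis = (("logs", logs), ("bounds", bounds),
                ("binsizes", binsizes), ("nbins", nbins))
    for pname, words in per_axis:
        if words is not None and len(words) != len(coords):
            raise ValueError(
                f"{pname} has {len(words)} words but coords has {len(coords)}")
    args = ["stilts", "tgridmap", f"in={in_table}",
            "coords=" + " ".join(coords)]
    for pname, words in per_axis:
        if words is not None:
            # Each multi-word parameter is passed as one space-separated value.
            args.append(f"{pname}=" + " ".join(str(w) for w in words))
    if out is not None:
        args.append(f"out={out}")
    return args


# Hypothetical two-dimensional grid over columns x and y.
args = tgridmap_args("cat.fits", ["x", "y"], nbins=[20, 30], out="grid.fits")
print(shlex.join(args))  # shell-quoted form of the command line
```

The list form can be passed directly to subprocess.run, which avoids shell quoting problems with the space-separated parameter values.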
ocmd = <cmds>
(ProcessingStep[])
Commands may alternatively be supplied in an external file, by using the indirection character '@'.
Thus a value of "@filename" causes the file filename to be read for a list of filter commands to execute.
The commands in the file may be separated by newline characters and/or semicolons, and lines which are blank or which start with a '#' character are ignored.
A backslash character '\' at the end of a line joins it with the following line.
ofmt = <out-format>
(String)
If it has the special value "(auto)" (the default), then the output filename will be examined to try to guess what sort of file is required, usually by looking at the extension.
If it's not obvious from the filename what output format is intended, an error will result.
This parameter must only be given if omode has its default value of "out".
[Default: (auto)]
omode = out|meta|stats|count|checksum|cgi|discard|topcat|samp|plastic|tosql|gui
(ProcessingMode)
The default mode is out, which means that the result will be written as a new table to disk or elsewhere, as determined by the out and ofmt parameters.
However, there are other possibilities, which correspond to uses to which a table can be put other than outputting it, such as displaying metadata, calculating statistics, or populating a table in an SQL database.
For some values of this parameter, additional parameters (<mode-args>) are required to determine the exact behaviour.
Possible values are
   out
   meta
   stats
   count
   checksum
   cgi
   discard
   topcat
   samp
   plastic
   tosql
   gui
Use the help=omode flag or see Section 6.4 for more information.
[Default: out]
out = <out-table>
(TableConsumer)
This parameter must only be given if omode has its default value of "out".
[Default: -]
runner = sequential|parallel|parallel<n>|partest
(RowRunner)
   sequential: runs using only a single thread
   parallel: runs using multiple threads for large tables, with parallelism given by the number of available processors
   parallel<n>: runs using multiple threads for large tables, with parallelism given by the supplied value <n>
   partest: runs using multiple threads even when tables are small (only intended for testing purposes)
Using parallel processing can speed up execution considerably; however, depending on the I/O operations required, it can also slow it down by disrupting patterns of disk access.
If the content of a file is on a solid state disk, or is already in cache, for instance because a similar command has been run recently, then parallel will probably be faster.
However, if the data is being read directly from a spinning disk, for instance because the file is too large to fit in RAM, then sequential or parallel<n> with a small <n> may be faster.
The value of this parameter should make only very tiny differences to the output table. If you notice significant discrepancies please report them.
[Default: parallel]
sparse = true|false
(Boolean)
[Default: true]