This section describes the usual way of reading a table or tables from
an external resource such as a file, URL,
DataSource etc, and
converting it into a StarTable object whose data
and metadata you can examine as described in
Section 2. These resources have in common that the
data from them can be read more than once; this is necessary in
general since depending on the data format and intended use
it may require more than one pass to provide the table data.
Reading a table in this way may or may not require local resources
such as memory or disk, depending on how the handler works;
see Section 4 for information on how to influence
such resource usage.
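To illustrate why a re-readable resource matters, here is a minimal sketch of a two-pass read that uses only the Java standard library. It is not taken from STIL: the `ReopenableSource` interface and the `TwoPassExample` class are hypothetical names invented for this example. The first pass counts the rows so that storage can be sized exactly, and the second pass reads the values.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

// Illustrative only: ReopenableSource is a hypothetical stand-in, not a STIL type.
class TwoPassExample {

    /** A resource that can hand out a fresh stream on every call. */
    interface ReopenableSource {
        InputStream open() throws IOException;
    }

    /** Reads one number per line; pass 1 sizes the array, pass 2 fills it. */
    static double[] readColumn( ReopenableSource src ) throws IOException {

        // Pass 1: count the rows.
        int nrow = 0;
        try ( BufferedReader rdr = reader( src ) ) {
            while ( rdr.readLine() != null ) {
                nrow++;
            }
        }

        // Pass 2: reopen the resource and read the values.
        double[] values = new double[ nrow ];
        try ( BufferedReader rdr = reader( src ) ) {
            for ( int i = 0; i < nrow; i++ ) {
                values[ i ] = Double.parseDouble( rdr.readLine().trim() );
            }
        }
        return values;
    }

    private static BufferedReader reader( ReopenableSource src )
            throws IOException {
        return new BufferedReader(
            new InputStreamReader( src.open(), StandardCharsets.UTF_8 ) );
    }
}
```

A one-shot input stream could not support the second pass without first being cached in memory or on disk, which is the kind of local resource usage mentioned above.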
The main class used to read a table in this way is
StarTableFactory.
The job of this class is to keep track of which input handlers are
registered and to use one of them to read data from an input stream and
turn it into one or more StarTables.
The basic rule is that you use one of the StarTableFactory's
makeStarTable or makeStarTables
methods to turn what you've got
(e.g. String, URL, DataSource)
into a
StarTable or a
TableSequence
(which represents a collection of StarTables) and away you go.
If no StarTable can be created (for instance because the file named doesn't
exist, or because it is not in any of the supported formats)
then some sort of
IOException or
TableFormatException
will be thrown.
Note that if the byte stream from the target resource
is compressed in one of the supported
formats (gzip, bzip2, Unix compress) it will be
uncompressed automatically
(the work for this is done by the
DataSource class).
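The general idea behind automatic decompression can be sketched with the standard library alone. The following is an illustration under stated assumptions, not the DataSource implementation: it handles only gzip, and the `GzipSniffer` class and its methods are hypothetical names. It peeks at the first two bytes of the stream and wraps the stream in a decompressor only if the gzip magic number (0x1f, 0x8b) is present.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Illustrative only: this is not how STIL's DataSource is written.
class GzipSniffer {

    /**
     * Returns a stream of uncompressed bytes, unwrapping gzip
     * if the gzip magic number is found at the start of the input.
     */
    static InputStream maybeDecompress( InputStream in ) throws IOException {
        BufferedInputStream bin = new BufferedInputStream( in );

        // Peek at the first two bytes, then rewind.
        bin.mark( 2 );
        int b0 = bin.read();
        int b1 = bin.read();
        bin.reset();

        boolean isGzip = b0 == 0x1f && b1 == 0x8b;
        return isGzip ? new GZIPInputStream( bin ) : bin;
    }

    /** Demonstration helper: gzips a byte array in memory. */
    static byte[] gzip( byte[] data ) throws IOException {
        ByteArrayOutputStream bout = new ByteArrayOutputStream();
        try ( GZIPOutputStream gz = new GZIPOutputStream( bout ) ) {
            gz.write( data );
        }
        return bout.toByteArray();
    }
}
```

With this approach the caller reads the same bytes whether or not the input was compressed, so the format handlers never need to know about compression.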
There are two distinct modes in which StarTableFactory
can work: automatic format detection and named format.
In automatic format detection mode, the type of data contained in
an input stream is determined by looking at it. What actually happens
is that the factory hands the stream to each of the handlers in its
default handler list
in turn, and the first one that recognises the format (usually based on
looking at the first few bytes) attempts to make a table from it.
If this fails, a handler may be identified by looking at the file name,
if available (e.g. a filename or URL ending in ".csv" will
be tried as a CSV file).
In this mode, you only need to specify the table location, like this:
    public StarTable loadTable( File file ) throws IOException {
        return new StarTableFactory().makeStarTable( file.toString() );
    }
This mode is available for formats such as
FITS, VOTable, ECSV, PDS4, Parquet, MRT, Feather and CDF
that can
be easily recognised, but is not reliable for text-based formats such as
comma-separated values without recognisable filenames.
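The detection strategy described above (try each handler against the leading bytes, then fall back to the file name) can be sketched as follows. This is a simplified model using only the standard library, not STIL code: the `FormatSniffer` and `Handler` types are hypothetical, and only two magic numbers are included (the `SIMPLE  =` keyword that opens a FITS primary header, and the `PAR1` marker that opens a Parquet file). A real handler may inspect the content much more thoroughly.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Optional;

// Illustrative only: these types are hypothetical and are not STIL classes.
class FormatSniffer {

    /** A format name, an optional magic number, and a filename extension. */
    static final class Handler {
        final String name;
        final byte[] magic;      // identifying leading bytes; empty if none
        final String extension;  // filename suffix used as a fallback

        Handler( String name, String magic, String extension ) {
            this.name = name;
            this.magic = magic.getBytes( StandardCharsets.US_ASCII );
            this.extension = extension;
        }

        /** True if the supplied leading bytes start with the magic number. */
        boolean matchesContent( byte[] head ) {
            if ( magic.length == 0 || head.length < magic.length ) {
                return false;
            }
            return Arrays.equals( Arrays.copyOf( head, magic.length ), magic );
        }

        /** True if the location ends with this handler's extension. */
        boolean matchesName( String location ) {
            return location != null
                && location.toLowerCase( Locale.ROOT ).endsWith( extension );
        }
    }

    /** Handlers are tried in list order, as in a default handler list. */
    static final List<Handler> HANDLERS = List.of(
        new Handler( "fits", "SIMPLE  =", ".fits" ),
        new Handler( "parquet", "PAR1", ".parquet" ),
        new Handler( "csv", "", ".csv" )   // no magic number: name only
    );

    /** Content is tried first; the file name is only a fallback. */
    static Optional<String> detect( byte[] head, String location ) {
        for ( Handler h : HANDLERS ) {
            if ( h.matchesContent( head ) ) {
                return Optional.of( h.name );
            }
        }
        for ( Handler h : HANDLERS ) {
            if ( h.matchesName( location ) ) {
                return Optional.of( h.name );
            }
        }
        return Optional.empty();
    }
}
```

In this model a CSV file named `table.csv` is identified by its extension, but the same bytes under the name `table.txt` are not identified at all, which is why named format mode is the safer choice for text-based formats.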
You can access and modify the list of
auto-detecting handlers using the
getDefaultBuilders method.
By default it contains only handlers for VOTable, CDF, FITS-like formats,
ECSV, PDS4, Parquet, MRT, Feather and GBIN.
In named format mode, you have to specify the name of the format as
well as the table location. This can be solicited from the user if it's
not known at build time; the known format names can be obtained from the
getKnownFormats method.
The list of format handlers
that can be used in this way can be accessed or
modified using the
getKnownBuilders method; it usually contains all
the ones in the default handler list, but doesn't have to.
Table construction in named format mode might look like this:
    public StarTable loadFitsTable( File file ) throws IOException {
        return new StarTableFactory().makeStarTable( file.toString(), "fits" );
    }
This mode also offers the possibility of configuring input handler
options in the handler name.
If the table format is known at build time, you can alternatively use the
makeStarTable method of the appropriate format-specific
TableBuilder.
For instance you could replace the above example with this:
        return new FitsTableBuilder()
              .makeStarTable( DataSource.makeDataSource( file.toString() ),
                              false, StoragePolicy.getDefaultPolicy() );
This slightly more obscure method offers more configurability but has
much the same effect; it may be slightly more efficient and may offer
somewhat more definite error messages in case of failure.
The various supplied TableBuilders (format-specific
input handlers) are listed in Section 3.6.
The javadocs detail variations on these calls. If you want to ensure that the table you get provides random access (see Section 2.3), you should do something like this:
    public StarTable loadRandomTable( File file ) throws IOException {
        StarTableFactory factory = new StarTableFactory();
        factory.setRequireRandom( true );
        StarTable table = factory.makeStarTable( file.toString() );
        return table;
    }
Setting the requireRandom flag on the factory
ensures that any table returned from its makeStarTable
methods returns true from its
isRandom
method.
(Note that prior to STIL version 2.1 this flag only provided a hint to the
factory that random tables were wanted; now it is enforced.)