GBIN format is a special-interest file format used within DPAC, the Data Processing and Analysis Consortium working on data from the Gaia astrometry satellite. It is based on java serialization, and in all of its various forms has the peculiarity that you only stand any chance of decoding it if you have the Gaia data model classes on your java classpath at runtime. Since the set of relevant classes is very large, and also depends on what version of the data model your GBIN file corresponds to, those classes will not be packaged with this software, so some additional setup is required to read GBIN files.
As well as the data model classes, you must provide on the runtime classpath the GaiaTools classes required for GBIN reading. The table input handler accesses these by reflection, to avoid an additional large library dependency for a rather niche requirement. It is likely that since you have to supply the required data model classes you will also have the required GaiaTools classes to hand as well, so this shouldn't constitute much of an additional burden for usage.
In practice, if you have a jar file or files for pretty much any
java library or application which is capable of reading a given
GBIN file, just adding it or them to the classpath at runtime
when using this input handler ought to do the trick.
Examples of such jar files are
the
MDBExplorerStandalone.jar
file available from
https://gaia.esac.esa.int/mdbexp/,
or the gbcat.jar
file you can build from the
CU9/software/gbcat/
directory in the DPAC subversion repository.
The GBIN format doesn't really store tables, it stores arrays of java objects, so the input handler has to make some decisions about how to flatten these into table rows.
In its simplest form, the handler basically looks for public instance
methods of the form getXxx()
and uses the Xxx
as column names.
If the corresponding values are themselves objects with suitable getter
methods, those objects are added as new columns instead.
This more or less follows the practice of the gbcat
(gaia.cu1.tools.util.GbinInterogator
) tool.
Method names are sorted alphabetically.
Arrays of complex objects are not handled well,
and various other things may trip it up.
See the source code (e.g. uk.ac.starlink.gbin.GbinTableProfile
)
for more details.
If the object types stored in the GBIN file are known to the
special metadata-bearing class
gaia.cu9.tools.documentationexport.MetadataReader
and its dependencies, and if that class is on the runtime classpath,
then the handler will be able to extract additional metadata as available,
including standardised column names,
table and column descriptions, and UCDs.
An example of a jar file containing this metadata class alongside
data model classes is GaiaDataLibs-18.3.1-r515078.jar
.
Note however at time of writing there are some deficiencies with this
metadata extraction functionality related to unresolved issues
in the upstream gaia class libraries and the relevant
interface control document
(GAIA-C9-SP-UB-XL-034-01, "External Data Centres ICD").
Currently columns appear in the output table in a more or less
random order, units and Utypes are not extracted,
and using the GBIN reader tends to cause a 700kbyte file "temp.xml"
to be written in the current directory.
If the upstream issues are fixed, this behaviour may improve.
Note: support for GBIN files is somewhat experimental. Please contact the author (who is not a GBIN expert) if it doesn't seem to be working properly or you think it should do things differently.
Note: there is a known bug in some versions of
GaiaTools (caused by a bug in its dependency library zStd-jni)
which in rare cases can fail to read all the rows in a GBIN input file.
If this bug is encountered by the reader, it will by default
fail with an error mentioning zStd-jni.
In this case, the best thing to do is to put a fixed version of zStd-jni
or GaiaTools on the classpath.
However, if instead you set the config option readMeta=false
the read will complete without error, though the missing rows will not
be recovered.
The handler behaviour may be modified by specifying
one or more comma-separated name=value configuration options
in parentheses after the handler name, e.g.
"gbin(readMeta=false,hierarchicalNames=true)
".
The following options are available:
readMeta = true|false
Setting this false can prevent failing on an error related to a broken version of the zStd-jni library in GaiaTools.
Note however that in this case the data read, though not reporting an error, will silently be missing some rows from the GBIN file.
(Default: true
)
hierarchicalNames = true|false
Astrometry_Alpha
", if false they may just be called "Alpha
".
In case of name duplication however, the hierarchical form is always used.
(Default: false
)
This format can be automatically identified by its content so you do not need to specify the format explicitly when reading GBIN tables, regardless of the filename.
Example:
Suppose you have the MDBExplorerStandalone.jar
file
containing the data model classes, you can read GBIN files by
starting TOPCAT like this:
topcat -classpath MDBExplorerStandalone.jar ...or like this:
java -classpath topcat-full.jar:MDBExplorerStandalone.jar uk.ac.starlink.topcat.Driver ...