The tables read by the tread
function and produced
by operating on them within JyStilts have a number of methods
defined on them.
These are explained below.
First, a number of special methods are defined which allow a table to behave in python like a sequence of rows:
__iter__
Allows a table to be treated as an iterable, so that "for row in table:" will iterate over all rows.
__len__ (random-access tables only)
Allows "len(table)" to count the number of rows.
This method is not available for tables with sequential access only.
__getitem__ (random-access tables only)
Allows expressions such as "table[3]" or "table[0:10]" to obtain the row or rows corresponding to a given row index or slice.
This method is not available for tables with sequential access only.
__add__, __mul__, __rmul__
Allow the operators "+" and "*" to be used with the sense of concatenation.
Thus "table1+table2" will produce a new table with the rows of table1 followed by the rows of table2.
Note this will only work if both tables have compatible columns.
Similarly "table*3" would produce a table like table but with all its rows repeated three times.
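Each row obtained in these ways is a tuple of cell values. A cell can be accessed by integer index, or by key, where the key is a column name or one of the column info objects returned by columns().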
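The sequence behaviour described above can be pictured with a small stand-in written in plain python. This is only a sketch: ToyTable and ToyRow are hypothetical names invented for illustration, they are not JyStilts classes, and they hold their rows in an ordinary list rather than reading table data. The code avoids version-specific syntax so that it runs under both the python 2 dialect used by JyStilts and python 3.

```python
class ToyRow(tuple):
    """Hypothetical row: a tuple of cells that also accepts column names as keys."""

    def __new__(cls, values, names):
        row = super(ToyRow, cls).__new__(cls, values)
        row._names = tuple(names)
        return row

    def __getitem__(self, key):
        # A string key is translated to the matching column index.
        if isinstance(key, str):
            key = self._names.index(key)
        return tuple.__getitem__(self, key)


class ToyTable(object):
    """Hypothetical random-access table backed by a list of tuples."""

    def __init__(self, names, rows):
        self.names = tuple(names)
        self.rows = [tuple(r) for r in rows]

    def __iter__(self):
        for values in self.rows:
            yield ToyRow(values, self.names)

    def __len__(self):
        return len(self.rows)

    def __getitem__(self, index):
        # An integer gives one row, a slice gives a list of rows.
        if isinstance(index, slice):
            return [ToyRow(v, self.names) for v in self.rows[index]]
        return ToyRow(self.rows[index], self.names)

    def __add__(self, other):
        # Concatenation only makes sense for compatible columns.
        if self.names != other.names:
            raise ValueError("tables do not have compatible columns")
        return ToyTable(self.names, self.rows + other.rows)

    def __mul__(self, count):
        return ToyTable(self.names, self.rows * count)

    __rmul__ = __mul__


t1 = ToyTable(("id", "jmag"), [("a", 14.4), ("b", 13.9)])
t2 = ToyTable(("id", "jmag"), [("c", 12.1)])
both = t1 + t2
print(len(both))                    # 3
print(both[2]["id"])                # c
print([row[0] for row in t1 * 2])   # ['a', 'b', 'a', 'b']
```

A real JyStilts table differs in at least one important respect: it may be sequential-only, in which case only the iteration part of this protocol is available.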
Sometimes, the result of a table operation will be a table which
does not have random access. For such tables you can iterate over
the rows, but not get their row values by indexing.
Non-random-access tables are also peculiar in that getRowCount
returns a negative value.
To take a table which may not have random access and make it capable of random access, use the random filter: "table=table.cmd_random()".
To a large extent it is possible to duplicate the functions of the
various STILTS commands by writing your own python code based on these
python-friendly table access methods.
Note however that such python-based processing is likely to be
much slower than the STILTS equivalents.
If performance is important to you, you should in most cases use the various cmd_* methods etc. for table processing.
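Returning to the question of random access, the conversion with cmd_random() can be made conditional, so that a table which is already random access is left alone. The following is a minimal sketch under two assumptions: it relies on isRandom(), a method of the STIL StarTable interface that is not mentioned in the text above, and it demonstrates the helper on SequentialStandIn, a hypothetical class invented here so that the example runs without JyStilts.

```python
def ensure_random(table):
    """Return table itself if it is random access, else a random-access version."""
    # isRandom() is assumed to be available via the StarTable interface.
    if table.isRandom():
        return table
    return table.cmd_random()


class SequentialStandIn(object):
    """Hypothetical stand-in for a table (not a JyStilts class)."""

    def __init__(self, rows, random=False):
        self._rows = list(rows)
        self._random = random

    def isRandom(self):
        return self._random

    def getRowCount(self):
        # Mirrors the behaviour described above: negative when not random access.
        return len(self._rows) if self._random else -1

    def cmd_random(self):
        return SequentialStandIn(self._rows, random=True)

    def __iter__(self):
        return iter(self._rows)


seq = SequentialStandIn([(1,), (2,)])
ran = ensure_random(seq)
print(seq.getRowCount())   # -1
print(ran.getRowCount())   # 2
```

With a real table the call would simply be "table = ensure_random(table)" before any use of len(table) or table[i].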
Second, some additional utility methods are defined:
count_rows()
Returns the number of rows in the table, counted as efficiently as possible.
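columns()
Returns the column info objects for the table, which describe each column's metadata. Useful methods on these objects include getName(), getUnitString() and getUCD(); str(column) will return its name.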
coldata(key)
Returns the values of the column identified by key. The key argument may be either an integer column index (if negative, counts backwards from the end), or the column name or info object. The returned value will always be iterable (has __iter__), but will only be indexable (has __len__ and __getitem__) if the table is random access.
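parameters()
Returns the table parameters as a mapping; this is a python-friendly alternative to the corresponding StarTable methods.
Note that as currently implemented, changing the values in the returned mapping will not change the actual table parameter values.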
write(location=None, fmt=None)
Writes the table. The optional location argument gives a filename or writable file object, and the optional fmt argument gives a format, one of the options listed in Section 5.1.1. If location is not supplied, output is to standard output, so in an interactive session it will be printed to the terminal. If fmt is not supplied, an attempt will be made to guess a suitable format based on the location.
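As an illustration of the coldata(key) method described above, the sketch below computes the mean of a column using iteration only, so it does not depend on the table being random access. The function name column_mean and the ColumnsStandIn class are hypothetical, introduced here so that the example runs without JyStilts; skipping None and NaN values is an assumption about how blank cells might appear, not something the text above specifies.

```python
def column_mean(table, key):
    """Mean of the non-blank values in one column, using iteration only."""
    total = 0.0
    count = 0
    for value in table.coldata(key):
        # Skip blanks: None, or NaN (the only value not equal to itself).
        if value is None or value != value:
            continue
        total += value
        count += 1
    if count == 0:
        return None
    return total / count


class ColumnsStandIn(object):
    """Hypothetical stand-in offering only a coldata() method (not a JyStilts class)."""

    def __init__(self, names, rows):
        self._names = list(names)
        self._rows = list(rows)

    def coldata(self, key):
        icol = self._names.index(key) if isinstance(key, str) else key
        # A generator: iterable, but neither len() nor indexing is available.
        return (row[icol] for row in self._rows)


demo = ColumnsStandIn(("id", "jmag"), [("a", 14.0), ("b", None), ("c", 13.0)])
print(column_mean(demo, "jmag"))   # 13.5
```

As noted earlier, a loop of this kind is likely to be much slower than an equivalent STILTS command on a large table.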
Third, a set of cmd_* methods corresponding to the STILTS filters is available; these are described in Section 4.4.
Fourth, a set of mode_* methods corresponding to the STILTS output modes is available; these are described in Section 4.5.
Finally, tables are also instances of the StarTable interface defined by STIL, which is the table I/O layer underlying STILTS. The full documentation can be found in the user manual and javadocs on the STIL page, and all the java methods can be used from JyStilts, but in most cases there are more pythonic equivalents provided, as described above.
Here are some examples of these methods in use:
    >>> import stilts
    >>> xsc = stilts.tread('/data/table/2mass_xsc.xml')  # read table
    >>> xsc.mode_count()  # show rows/column count
    columns: 6   rows: 1646844
    >>> print xsc.columns()  # full info on columns
    (id(String), ra(Double)/degrees, dec(Double)/degrees, jmag(Double)/mag, hmag(Double)/mag, kmag(Double)/mag)
    >>> print [str(col) for col in xsc.columns()]  # column names only
    ['id', 'ra', 'dec', 'jmag', 'hmag', 'kmag']
    >>> row = xsc[1000000]  # examine millionth row
    >>> print row
    (u'19433000+4003190', 295.875, 40.055286, 14.449, 13.906, 13.374)
    >>> print row[0]  # cell by index
    19433000+4003190
    >>> print row['ra'], row['dec']  # cells by col name
    295.875 40.055286
    >>> print len(xsc)  # count rows, maybe slow
    1646844
    >>> print xsc.count_rows()  # count rows efficiently
    1646844L
    >>> print (xsc+xsc).count_rows()  # concatenate
    3293688L
    >>> print (xsc*10000).count_rows()
    16468440000L
    >>> for row in xsc:  # select rows using python commands
    ...     if row[4] - row[3] > 3.0:
    ...         print row[0]
    ...
    11165243+2925509
    20491597+5119089
    04330238+0858101
    01182715-1013248
    11244075+5218078
    >>> # same thing using stilts (50x faster)
    >>> (xsc.cmd_select('hmag - jmag > 3.0')
    ...     .cmd_keepcols('id')
    ...     .write())
    +------------------+
    | id               |
    +------------------+
    | 11165243+2925509 |
    | 20491597+5119089 |
    | 04330238+0858101 |
    | 01182715-1013248 |
    | 11244075+5218078 |
    +------------------+
The following are all ways to obtain the value of a given cell in the table from the previous example.
    xsc.getCell(99, 0)
    xsc[99][0]
    xsc[99]['id']
    xsc.coldata(0)[99]
    xsc.coldata('id')[99]

Some of these methods may be more efficient than others. Note that none of these methods will work if the table has sequential-only access.
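For a table with sequential-only access, one possible fallback is to iterate up to the wanted row, since iteration is always available. The sketch below shows the idea; cell_at and RowStream are hypothetical names invented for illustration and are not part of JyStilts. Every preceding row has to be read, so this is likely to be slow for rows far into a large table, and converting the table with cmd_random() is probably the better choice when many cells are needed.

```python
from itertools import islice


def cell_at(table, irow, icol):
    """Return the cell at row irow, column icol, using iteration only."""
    if irow < 0:
        raise IndexError("negative row index not supported")
    # islice skips the first irow rows and yields at most one row.
    for row in islice(table, irow, irow + 1):
        return row[icol]
    raise IndexError("row index out of range")


class RowStream(object):
    """Hypothetical stand-in for a sequential-only table: iterable, not indexable."""

    def __init__(self, rows):
        self._rows = list(rows)

    def __iter__(self):
        return iter(self._rows)


stream = RowStream([("a", 1.5), ("b", 2.5), ("c", 3.5)])
print(cell_at(stream, 1, 0))   # b
```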