Tables read by the tread function, and tables produced by operating
on them within JyStilts, have a number of methods defined on them.
These are explained below.
First, a number of special methods are defined which allow a table to behave in python like a sequence of rows:
__iter__
This allows a table to be treated as an iterable, so that "for row in table:" will iterate over all rows.
__len__ (random-access tables only)
This allows the expression "len(table)" to be used to count the number of rows.
This method is not available for tables with sequential access only.
__getitem__ (random-access tables only)
This allows indexing expressions such as "table[3]" or "table[0:10]" to obtain the
row or rows corresponding to a given row index or slice.
This method is not available for tables with sequential access only.
__add__, __mul__, __rmul__
These allow the operators "+" and "*" to be used with
the sense of concatenation.
Thus "table1+table2" will produce a new table with the
rows of table1 followed by the rows of table2.
Note this will only work if both tables have compatible columns.
Similarly "table*3" would produce a table like
table but with all its rows repeated three times.
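In all these cases, each row is a tuple of the column values for that
row of the table. The tuple items (table cells) can be accessed by
integer index or by key, where the key is either a column name or
a column info object as returned by columns().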
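To make the sequence semantics above concrete, here is a minimal pure-python sketch of an object that supports the same special methods. MiniTable is a hypothetical stand-in written only for illustration: it is not the JyStilts implementation, it does not need the stilts module, and its choice to return a new table for a slice is an assumption of the sketch.

```python
class MiniTable(object):
    """Toy stand-in for a random-access table: a sequence of row tuples."""

    def __init__(self, colnames, rows):
        self.colnames = tuple(colnames)
        self.rows = [tuple(row) for row in rows]

    def __iter__(self):
        # "for row in table:" visits each row in order.
        return iter(self.rows)

    def __len__(self):
        # "len(table)" counts the rows.
        return len(self.rows)

    def __getitem__(self, key):
        # An integer gives one row; a slice gives a new table (sketch assumption).
        if isinstance(key, slice):
            return MiniTable(self.colnames, self.rows[key])
        return self.rows[key]

    def __add__(self, other):
        # Concatenation only makes sense for compatible columns.
        if self.colnames != other.colnames:
            raise ValueError("incompatible columns")
        return MiniTable(self.colnames, self.rows + other.rows)

    def __mul__(self, count):
        # "table*3" repeats all the rows three times.
        return MiniTable(self.colnames, self.rows * count)

    # "3*table" behaves the same as "table*3".
    __rmul__ = __mul__


t1 = MiniTable(["id", "mag"], [("a", 1.0), ("b", 2.0)])
t2 = MiniTable(["id", "mag"], [("c", 3.0)])
both = t1 + t2      # rows of t1 followed by rows of t2
tripled = 3 * t2    # the single row of t2 repeated three times
```

The real table objects wrap Java StarTable instances, so the details differ, but the row-sequence behaviour seen from python follows the same pattern.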
Sometimes, the result of a table operation will be a table which
does not have random access. For such tables you can iterate over
the rows, but not get their row values by indexing.
Non-random-access tables are also peculiar in that getRowCount
returns a negative value.
To take a table which may not have random access and make it capable
of random access, use the random
filter: "table=table.cmd_random()".
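This pattern can be wrapped in a small helper. The sketch below assumes the table exposes the StarTable method isRandom() as well as the cmd_random() method mentioned above; ensure_random is a hypothetical name chosen for this example and is not part of JyStilts.

```python
def ensure_random(table):
    """Return a table that supports random access.

    Assumes `table` provides isRandom() (StarTable interface) and
    cmd_random() (the JyStilts form of the random filter).
    """
    if table.isRandom():
        # Already random access: return it unchanged.
        return table
    # Sequential-only: ask the random filter for a random-access version.
    return table.cmd_random()
```

After "table = ensure_random(table)", expressions such as "table[3]" and "len(table)" should be usable whatever the access mode of the original table.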
To a large extent it is possible to duplicate the functions of the
various STILTS commands by writing your own python code based on these
python-friendly table access methods.
Note however that such python-based processing is likely to be
much slower than the STILTS equivalents.
If performance is important to you, you should in most cases
use the various cmd_* methods for table processing.
Second, some additional utility methods are defined:
count_rows()
Returns the number of rows in the table, determined in the most efficient way available.
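columns()
Returns a tuple of column info objects, one for each column of the table,
describing that column's metadata.
Useful methods on these objects include getName(),
getUnitString() and getUCD().
str(column) will return its name.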
coldata(key)
Returns a sequence of all the values in a single column.
The key argument may be either an integer
column index (if negative, counts backwards from the end),
or the column name or info object.
The returned value will always be iterable (has __iter__),
but will only be indexable
(has __len__ and __getitem__) if the table
is random access.
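The key handling described for coldata can be pictured with a short pure-python sketch. resolve_column_index is a hypothetical helper that works on a plain list of column names; it illustrates the documented behaviour (integer index, negative index counting backwards from the end, or column name) and is not the JyStilts code. Info-object keys are approximated by taking str(key), on the basis that str(column) returns the column name.

```python
def resolve_column_index(colnames, key):
    """Map a coldata-style key to a zero-based column index (illustration only)."""
    ncol = len(colnames)
    if isinstance(key, int):
        # Negative integers count backwards from the end.
        icol = key + ncol if key < 0 else key
        if not 0 <= icol < ncol:
            raise IndexError("column index out of range: %r" % (key,))
        return icol
    # Otherwise treat the key as a name (or an object whose str() is the name).
    name = str(key)
    try:
        return list(colnames).index(name)
    except ValueError:
        raise KeyError("no such column: %s" % name)


names = ["id", "ra", "dec", "jmag", "hmag", "kmag"]
first = resolve_column_index(names, 0)       # first column
last = resolve_column_index(names, -1)       # last column
by_name = resolve_column_index(names, "dec") # lookup by column name
```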
parameters()
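Returns a mapping of table parameter names to values.
This does not give all the information about the parameters;
for more detail use the relevant StarTable methods.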
Note that as currently implemented, changing the values in the
returned mapping will not change the actual table parameter values.
write(location=None, fmt=None)
Outputs the table.
The optional location argument gives a filename
or writable file object,
and the optional fmt argument gives a format, one of
the options listed in Section 5.1.1.
If location is not supplied, output is to standard output,
so in an interactive session it will be printed to the terminal.
If fmt is not supplied, an attempt will be made to guess
a suitable format based on the location.
Third, a set of cmd_* methods corresponding to the
STILTS filters are available;
these are described in Section 4.4.
Fourth, a set of mode_* methods corresponding to the
STILTS output modes are available;
these are described in Section 4.5.
Finally, tables are also instances of the StarTable interface defined by STIL, which is the table I/O layer underlying STILTS. The full documentation can be found in the user manual and javadocs on the STIL page. All the java methods can be used from JyStilts, but in most cases more pythonic equivalents are provided, as described above.
Here are some examples of these methods in use:
>>> import stilts
>>> xsc = stilts.tread('/data/table/2mass_xsc.xml') # read table
>>> xsc.mode_count() # show rows/column count
columns: 6 rows: 1646844
>>> print xsc.columns() # full info on columns
(id(String), ra(Double)/degrees, dec(Double)/degrees, jmag(Double)/mag, hmag(Double)/mag, kmag(Double)/mag)
>>> print [str(col) for col in xsc.columns()] # column names only
['id', 'ra', 'dec', 'jmag', 'hmag', 'kmag']
>>> row = xsc[1000000] # examine millionth row
>>> print row
(u'19433000+4003190', 295.875, 40.055286, 14.449, 13.906, 13.374)
>>> print row[0] # cell by index
19433000+4003190
>>> print row['ra'], row['dec'] # cells by col name
295.875 40.055286
>>> print len(xsc) # count rows, maybe slow
1646844
>>> print xsc.count_rows() # count rows efficiently
1646844L
>>> print (xsc+xsc).count_rows() # concatenate
3293688L
>>> print (xsc*10000).count_rows()
16468440000L
>>> for row in xsc: # select rows using python commands
...     if row[4] - row[3] > 3.0:
...         print row[0]
...
11165243+2925509
20491597+5119089
04330238+0858101
01182715-1013248
11244075+5218078
>>> # same thing using stilts (50x faster)
>>> (xsc.cmd_select('hmag - jmag > 3.0')
...     .cmd_keepcols('id')
...     .write())
+------------------+
| id               |
+------------------+
| 11165243+2925509 |
| 20491597+5119089 |
| 04330238+0858101 |
| 01182715-1013248 |
| 11244075+5218078 |
+------------------+
The following are all ways to obtain the value of a given cell in the table from the previous example.
xsc.getCell(99, 0)
xsc[99][0]
xsc[99]['id']
xsc.coldata(0)[99]
xsc.coldata('id')[99]
Some of these methods may be more efficient than others.
Note that none of these methods will work if the table has sequential-only
access.