public abstract class DataSource
extends java.lang.Object
As well as the ability to return a stream, a DataSource may also have a position, which corresponds to the 'ref' or 'frag' part of a URL (the bit after the #). This is an indication of a location in the stream; it is a string, and its interpretation is entirely up to the application (though may be specified by the documentation of specific DataSource subclasses).
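As a rough illustration of the relationship between a location and its position, the sketch below pulls the part after the '#' out of a location string using the standard java.net.URI class. The class name PositionSketch, its methods and the example URL are made up for this illustration and are not part of the DataSource API.

```java
import java.net.URI;

// Illustrative sketch only; not part of the DataSource API.
class PositionSketch {

    /** Returns the text after '#' in the location, or null if there is none. */
    static String positionOf(String location) {
        return URI.create(location).getFragment();
    }

    /** Returns the location with any '#position' suffix removed. */
    static String withoutPosition(String location) {
        int hash = location.indexOf('#');
        return hash < 0 ? location : location.substring(0, hash);
    }

    public static void main(String[] args) {
        String loc = "http://example.com/data/table.fits#3";   // made-up URL
        System.out.println(positionOf(loc));        // prints "3"
        System.out.println(withoutPosition(loc));   // prints "http://example.com/data/table.fits"
    }
}
```

What the string "3" means here (an extension number, a row, a label) is left to the application, as described above.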
As well as providing the facility for several different objects to
get their own copy of the underlying input stream, this class also
handles decompression of the stream.
Compression types are as understood by the associated Compression
class.
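To make the idea of transparent decompression concrete, here is a minimal sketch that recognises gzip data from its first two bytes (0x1f 0x8b) and wraps the stream in a java.util.zip.GZIPInputStream when they match. It assumes that detection is done by inspecting leading bytes; how the Compression class actually decides is not described here, and GzipMagicSketch and its methods are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper; stands in for whatever the Compression class does.
class GzipMagicSketch {

    /** True if the leading bytes carry the gzip magic number 0x1f 0x8b. */
    static boolean looksGzipped(byte[] leading) {
        return leading.length >= 2
            && (leading[0] & 0xff) == 0x1f
            && (leading[1] & 0xff) == 0x8b;
    }

    /** Returns a stream of uncompressed bytes, decompressing only if needed. */
    static InputStream decompressIfNeeded(byte[] raw) throws IOException {
        InputStream in = new ByteArrayInputStream(raw);
        return looksGzipped(raw) ? new GZIPInputStream(in) : in;
    }

    /** Gzips a byte array; used here only to build test input. */
    static byte[] gzip(byte[] plain) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(bos)) {
            gz.write(plain);
        }
        return bos.toByteArray();
    }

    /** Reads a stream to the end. */
    static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] packed = gzip("hello".getBytes(StandardCharsets.US_ASCII));
        byte[] unpacked = readAll(decompressIfNeeded(packed));
        System.out.println(new String(unpacked, StandardCharsets.US_ASCII));   // prints "hello"
    }
}
```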
For efficiency, a buffer of the bytes at the start of the stream called the 'intro buffer' is recorded the first time that the stream is read. This can then be used for magic number queries cheaply, without having to open a new input stream. In the case that the whole input stream is shorter than the intro buffer, the underlying input stream never has to be read again.
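The caching behaviour described above can be sketched roughly as follows. This is a simplified stand-in, not the class's actual code: IntroBufferSketch, openRaw and rawOpens are hypothetical names, and a byte array plays the role of the underlying data.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical stand-in for a source that caches its leading bytes.
class IntroBufferSketch {
    private final byte[] content;   // plays the role of the underlying data
    private final int introLimit;   // maximum number of bytes to cache
    private byte[] intro;           // filled in on first use
    int rawOpens = 0;               // counts how often a raw stream is opened

    IntroBufferSketch(byte[] content, int introLimit) {
        this.content = content;
        this.introLimit = introLimit;
    }

    /** Opens a fresh stream on the underlying data. */
    InputStream openRaw() {
        rawOpens++;
        return new ByteArrayInputStream(content);
    }

    /** Returns the cached leading bytes, reading them the first time only. */
    byte[] getIntro() throws IOException {
        if (intro == null) {
            byte[] buf = new byte[introLimit];
            int count = 0;
            try (InputStream in = openRaw()) {
                int n;
                while (count < introLimit
                        && (n = in.read(buf, count, introLimit - count)) > 0) {
                    count += n;
                }
            }
            intro = Arrays.copyOf(buf, count);   // may be shorter than introLimit
        }
        return intro;
    }

    /** Magic number query answered from the cache. */
    boolean startsWith(byte[] magic) throws IOException {
        byte[] head = getIntro();
        return head.length >= magic.length
            && Arrays.equals(Arrays.copyOf(head, magic.length), magic);
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "SIMPLE  =  T".getBytes(StandardCharsets.US_ASCII);
        IntroBufferSketch src = new IntroBufferSketch(data, 6);
        System.out.println(src.startsWith("SIMPLE".getBytes(StandardCharsets.US_ASCII)));   // true
        System.out.println(src.startsWith("XTENSION".getBytes(StandardCharsets.US_ASCII))); // false
        System.out.println(src.rawOpens);   // 1: the second query used the cache
    }
}
```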
Any implementation which implements getRawInputStream() in such a way as to return different byte sequences on different occasions may lead to unpredictable behaviour from this class.
See also: Compression
Modifier and Type | Field and Description |
---|---|
static int |
DEFAULT_INTRO_LIMIT |
static java.lang.String |
MARK_WORKAROUND_PROPERTY |
Constructor and Description |
---|
DataSource()
Constructs a DataSource with a default size of intro buffer.
|
DataSource(int introLimit)
Constructs a DataSource with a given size of intro buffer.
|
Modifier and Type | Method and Description |
---|---|
void |
close()
Closes any open streams owned and not yet dispatched by this
DataSource.
|
DataSource |
forceCompression(Compression compress)
Returns a DataSource representing the same underlying stream,
but with a forced compression mode compress.
|
Compression |
getCompression()
Returns an object which will handle any required decompression
for this stream.
|
java.io.InputStream |
getHybridInputStream()
Returns an input stream which appears just the same as the
one returned by getInputStream(), but only incurs the
expense of obtaining an actual input stream (by calling
getRawInputStream()) if more bytes are read than the
cached magic number.
|
java.io.InputStream |
getInputStream()
Returns an InputStream containing the whole of this DataSource.
|
static java.io.InputStream |
getInputStream(java.lang.String location,
boolean allowSystem)
Returns an input stream based on the given location string.
|
byte[] |
getIntro()
Returns the intro buffer, first reading it if this hasn't been
done before.
|
int |
getIntroLimit()
Returns the maximum length of the intro buffer.
|
long |
getLength()
Returns the length of the stream returned by getInputStream
in bytes, if known.
|
static boolean |
getMarkWorkaround()
Returns true if we are working around potential bugs in the
InputStream.mark(int)/InputStream.reset()
methods (common, including in J2SE classes).
|
java.lang.String |
getName()
Returns a name for this source.
|
java.lang.String |
getPosition()
Returns the position associated with this source.
|
protected abstract java.io.InputStream |
getRawInputStream()
Provides a new InputStream for this data source.
|
long |
getRawLength()
Returns the length in bytes of the stream returned by
getRawInputStream, if known.
|
java.lang.String |
getSystemId()
Returns a System ID for this DataSource; this is a string
representation of a file name or URL, as used by
Source and friends.
|
java.net.URL |
getURL()
Returns a URL which corresponds to this data source, if one exists.
|
static DataSource |
makeDataSource(java.lang.String loc)
Attempts to make a source given a string identifying its location
as a file, URL or system command output.
|
static DataSource |
makeDataSource(java.lang.String loc,
boolean allowSystem)
Attempts to make a source given a string identifying its location
as a file, URL or optionally a system command output.
|
static DataSource |
makeDataSource(java.net.URL url)
Makes a source from a URL.
|
void |
setCompression(Compression compress)
Sets the compression to be associated with this data source.
|
void |
setIntroLimit(int limit)
Sets the maximum size of the intro buffer to a new value.
|
static void |
setMarkWorkaround(boolean workaround)
Sets whether we want to work around bugs in InputStream mark/reset
methods.
|
void |
setName(java.lang.String name)
Sets the name of this source.
|
void |
setPosition(java.lang.String position)
Sets the position associated with this source.
|
java.lang.String |
toString()
Returns a short description of this source (name plus compression type).
|
public static final int DEFAULT_INTRO_LIMIT
public static final java.lang.String MARK_WORKAROUND_PROPERTY
public DataSource(int introLimit)
introLimit - the maximum number of bytes in the intro buffer
public DataSource()
protected abstract java.io.InputStream getRawInputStream() throws java.io.IOException
java.io.IOException
public java.net.URL getURL()
A URL.openConnection() method call on the URL returned by this method should provide a stream with the same content as the getRawInputStream() method of this data source. If no such URL exists or is known, then null should be returned.
If this source has a non-null position value, it will be appended to the main part of the URL after a '#' character (as the URL's ref part).
public int getIntroLimit()
public void setIntroLimit(int limit)
limit - the new maximum length of the intro buffer
public long getRawLength()
public long getLength()
public java.lang.String getName()
getURL() method (or some suitable class-specific method) should be used.
If this source has a position, it should probably form part of this name.
public void setName(java.lang.String name)
name - a name
See also: getName()
public java.lang.String getPosition()
public void setPosition(java.lang.String position)
position - the new position (may be null)
public java.lang.String getSystemId()
Returns a System ID for this DataSource; this is a string representation of a file name or URL, as used by Source and friends.
The return value may be null if none is known.
This does not contain any reference to the position.
public Compression getCompression() throws java.io.IOException
java.io.IOException
public byte[] getIntro() throws java.io.IOException
The returned buffer is the original not a copy - don't change its contents!
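Because the caller receives the internal array itself, any code that needs a scratch version should copy it first. A minimal sketch, using a hypothetical IntroCopySketch class in place of a real source:

```java
import java.nio.charset.StandardCharsets;

// Hypothetical class that hands out its internal array, like the method above.
class IntroCopySketch {
    private final byte[] intro = "SIMPLE".getBytes(StandardCharsets.US_ASCII);

    /** Returns the internal array itself, not a copy. */
    byte[] getIntro() {
        return intro;
    }

    public static void main(String[] args) {
        IntroCopySketch src = new IntroCopySketch();
        byte[] scratch = src.getIntro().clone();   // private copy, safe to modify
        scratch[0] = 'X';
        System.out.println((char) src.getIntro()[0]);   // prints "S": the cache is untouched
    }
}
```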
java.io.IOException
public void setCompression(Compression compress)
The effects of setting a compression to a mode (other than NONE) which does not match the actual compression mode of the underlying stream are undefined, so this method should be used with care.
compress - the compression mode encoding the underlying stream
public DataSource forceCompression(Compression compress)
As with setCompression(uk.ac.starlink.util.Compression), the consequences of using a different value of compress than the correct one (other than Compression.NONE) are unpredictable.
compress - the compression mode to be used for the returned data source
public java.io.InputStream getInputStream() throws java.io.IOException
java.io.IOException
public java.io.InputStream getHybridInputStream() throws java.io.IOException
Returns an input stream which appears just the same as the one returned by getInputStream(), but only incurs the expense of obtaining an actual input stream (by calling getRawInputStream()) if more bytes are read than the cached magic number. This is an efficient way to read if you need an InputStream but may only end up reading the first few bytes of it.
java.io.IOException
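One way to picture such a hybrid stream is a wrapper that serves bytes from the cached prefix and opens the real stream only when a read goes past it. The sketch below is an assumption about how this could be done, not the class's actual code; HybridStreamSketch, RawOpener and rawOpened are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical lazy stream: cached bytes first, real stream only on demand.
class HybridStreamSketch extends InputStream {

    /** Supplies a fresh stream positioned at the start of the data. */
    interface RawOpener {
        InputStream open() throws IOException;
    }

    private final byte[] intro;      // cached leading bytes
    private final RawOpener opener;  // how to get the real stream
    private int pos = 0;             // next index to serve from the cache
    private InputStream rest;        // real stream, opened lazily
    boolean rawOpened = false;       // exposed so the demo can observe it

    HybridStreamSketch(byte[] intro, RawOpener opener) {
        this.intro = intro;
        this.opener = opener;
    }

    @Override
    public int read() throws IOException {
        if (pos < intro.length) {
            return intro[pos++] & 0xff;   // served from the cache
        }
        if (rest == null) {
            rest = opener.open();
            rawOpened = true;
            // Discard the bytes that were already served from the cache.
            for (int i = 0; i < intro.length; i++) {
                if (rest.read() < 0) {
                    break;
                }
            }
        }
        return rest.read();
    }

    @Override
    public void close() throws IOException {
        if (rest != null) {
            rest.close();
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "0123456789".getBytes(StandardCharsets.US_ASCII);
        byte[] intro = Arrays.copyOf(data, 4);
        HybridStreamSketch in =
            new HybridStreamSketch(intro, () -> new ByteArrayInputStream(data));
        for (int i = 0; i < 4; i++) {
            in.read();
        }
        System.out.println(in.rawOpened);       // false: only cached bytes were read
        System.out.println((char) in.read());   // prints "4"
        System.out.println(in.rawOpened);       // true: the real stream was needed
        in.close();
    }
}
```

This sketch always opens the real stream once the cache is exhausted; a fuller version could skip that step when the cached bytes are known to hold the whole stream.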
public void close()
public java.lang.String toString()
Overrides: toString in class java.lang.Object
public static DataSource makeDataSource(java.lang.String loc) throws java.io.IOException
If a '#' character exists in the string, text after it will be interpreted as a position value. Otherwise, the position is considered to be null.
Note: this method presents a security risk if the loc string is vulnerable to injection. Consider using the variant method makeDataSource(loc,false) in such cases.
This method just calls makeDataSource(loc,true).
loc - the location of the data, with optional position
java.io.IOException - if loc does not name an existing readable file or valid URL
public static DataSource makeDataSource(java.lang.String loc, boolean allowSystem) throws java.io.IOException
The supplied loc may be one of the following:
if allowSystem=true: a string preceded by "<" or followed by "|", giving a shell command line (may not work on all platforms)
If a '#' character exists in the string, text after it will be interpreted as a position value. Otherwise, the position is considered to be null.
Note: setting allowSystem=true may introduce a security risk if the loc string is vulnerable to injection.
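The effect of the allowSystem flag can be illustrated with a small sketch that decides whether a location string would be treated as a shell command. This is only an approximation of the rule stated above ("<" prefix or "|" suffix); LocationSketch and its methods are hypothetical, and the real parsing may differ in details such as whitespace handling.

```java
// Hypothetical helper illustrating the "<cmdline" / "cmdline|" convention.
class LocationSketch {

    /** True only if system commands are allowed and loc uses command syntax. */
    static boolean isSystemCommand(String loc, boolean allowSystem) {
        boolean commandSyntax = loc.startsWith("<") || loc.endsWith("|");
        return allowSystem && commandSyntax;
    }

    /** Strips the "<" prefix or "|" suffix from a command-style location. */
    static String commandLine(String loc) {
        if (loc.startsWith("<")) {
            return loc.substring(1).trim();
        }
        if (loc.endsWith("|")) {
            return loc.substring(0, loc.length() - 1).trim();
        }
        throw new IllegalArgumentException("Not a command location: " + loc);
    }

    public static void main(String[] args) {
        String untrusted = "<cat data.txt";   // e.g. text typed into a web form
        System.out.println(isSystemCommand(untrusted, true));    // true: would run a command
        System.out.println(isSystemCommand(untrusted, false));   // false: never a command
        System.out.println(commandLine("cat data.txt |"));       // prints "cat data.txt"
    }
}
```

With allowSystem=false a string from an untrusted source is never interpreted as a command, which is why that setting is the safer choice for such input.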
loc - the location of the data, with optional position
allowSystem - whether to allow system commands using the format above
java.io.IOException - if loc does not name an existing readable file or valid URL
public static DataSource makeDataSource(java.net.URL url)
url - location of the data stream
public static java.io.InputStream getInputStream(java.lang.String location, boolean allowSystem) throws java.io.IOException
if allowSystem=true: a string preceded by "<" or followed by "|", giving a shell command line (may not work on all platforms)
Note: setting allowSystem=true may introduce a security risk if the location string is vulnerable to injection.
location - URL, filename, "cmdline|"/"<cmdline", or "-"
allowSystem - whether to allow system commands using the format above
java.io.FileNotFoundException - if location cannot be interpreted as a source of bytes
java.io.IOException - if there is an error obtaining the stream
public static boolean getMarkWorkaround()
Returns true if we are working around potential bugs in the InputStream.mark(int)/InputStream.reset() methods (common, including in J2SE classes).
The return value is dependent on the system property named MARK_WORKAROUND_PROPERTY.
public static void setMarkWorkaround(boolean workaround)
workaround - true to employ the workaround
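For context, mark/reset is the standard way to peek at the start of a stream and then rewind. The sketch below shows that pattern next to one alternative that avoids mark/reset altogether by using java.io.PushbackInputStream. How this class actually implements its workaround is not stated here, so the pushback approach is only an assumption for illustration; PeekSketch and its methods are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackInputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper comparing two ways to peek at the start of a stream.
class PeekSketch {

    /** Peeks at up to n bytes using mark/reset; the stream must support mark. */
    static byte[] peekWithMark(InputStream in, int n) throws IOException {
        if (!in.markSupported()) {
            throw new IOException("mark/reset not supported");
        }
        in.mark(n);
        byte[] buf = new byte[n];
        int count = readUpTo(in, buf);
        in.reset();   // rewind to the marked position
        return Arrays.copyOf(buf, count);
    }

    /** Peeks at up to n bytes without mark/reset, pushing the bytes back. */
    static byte[] peekWithPushback(PushbackInputStream in, int n) throws IOException {
        byte[] buf = new byte[n];
        int count = readUpTo(in, buf);
        if (count > 0) {
            in.unread(buf, 0, count);   // needs a pushback buffer of at least n bytes
        }
        return Arrays.copyOf(buf, count);
    }

    /** Reads until buf is full or the stream ends; returns the byte count. */
    private static int readUpTo(InputStream in, byte[] buf) throws IOException {
        int count = 0;
        int r;
        while (count < buf.length
                && (r = in.read(buf, count, buf.length - count)) > 0) {
            count += r;
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "abcdef".getBytes(StandardCharsets.US_ASCII);

        InputStream marked = new ByteArrayInputStream(data);
        System.out.println(new String(peekWithMark(marked, 3), StandardCharsets.US_ASCII));   // prints "abc"
        System.out.println((char) marked.read());   // prints "a": the stream was rewound

        PushbackInputStream pushed =
            new PushbackInputStream(new ByteArrayInputStream(data), 3);
        System.out.println(new String(peekWithPushback(pushed, 3), StandardCharsets.US_ASCII)); // prints "abc"
        System.out.println((char) pushed.read());   // prints "a": the bytes were pushed back
    }
}
```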