public abstract class TextExtractor extends Object implements AutoCloseable
Provides the base class for text extractors.
Modifier | Constructor and Description |
---|---|
protected | TextExtractor(InputStream stream, LoadOptions loadOptions): Initializes a new instance of the TextExtractor class. |
Modifier and Type | Method and Description |
---|---|
protected void | checkDisposed(): Checks whether the extractor is disposed. |
void | close(): Releases the unmanaged resources used by the extractor. |
void | dispose(): Releases the unmanaged resources used by the extractor. |
protected void | dispose(boolean disposing): Releases the unmanaged resources used by the extractor. |
String | extractAll(): Extracts all characters from the current position to the end of the text extractor and returns them as one string. |
String | extractLine(): Extracts a line of characters from the text extractor and returns the data as a string. |
protected String | extractText(): Extracts all characters from the current position to the end of the text extractor and returns them as one string. |
protected String | extractTextLine(): Extracts a line of characters from the text extractor and returns the data as a string. |
Charset | getEncoding(): Gets an encoding for the document. |
String | getMediaType(): Gets a media type for the document. |
protected String | getPassword(): Gets a password of the document. |
boolean | isDisposed(): Gets a value indicating whether the extractor is disposed. |
protected abstract String | prepareLine(): Returns a line of the text. |
void | reset(): Resets the current document. |
void | setEncoding(Charset value): Sets an encoding for the document. |
protected void | setMediaType(String value): Sets a media type for the document. |
protected TextExtractor(InputStream stream, LoadOptions loadOptions)
Initializes a new instance of the TextExtractor class.
Parameters:
stream - A stream of the document.
loadOptions - The options for loading the file.
Throws:
ArgumentNullException - if stream is null.
public boolean isDisposed()
Gets a value indicating whether the extractor is disposed.
public String getMediaType()
Gets a media type for the document.
protected void setMediaType(String value)
Sets a media type for the document.
Parameters:
value - A media type for the document, or null if the media type is not specified.
public Charset getEncoding()
Gets an encoding for the document.
public void setEncoding(Charset value)
Sets an encoding for the document.
Parameters:
value - An encoding for the document, or null if the encoding is not specified.
protected String getPassword()
Gets a password of the document.
public void dispose()
Releases the unmanaged resources used by the extractor.
public void close() throws Exception
Releases the unmanaged resources used by the extractor.
Specified by: close in interface AutoCloseable
Throws: Exception
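Because the class implements AutoCloseable, an extractor can be scoped with try-with-resources. The sketch below is only an illustration: DisposableSketch is a hypothetical stand-in, not the library class, and it assumes that close() simply delegates to dispose(), which the identical descriptions suggest but do not confirm.

```java
// Hypothetical stand-in: mirrors only close(), dispose() and isDisposed()
// from the listing above. It is NOT the library's TextExtractor.
class DisposableSketch implements AutoCloseable {
    private boolean disposed;

    public boolean isDisposed() {
        return disposed;
    }

    public void dispose() {
        disposed = true;
    }

    // Assumption: close() delegates to dispose().
    @Override
    public void close() throws Exception {
        dispose();
    }
}

class CloseDemo {
    // Uses the stand-in inside try-with-resources and reports its state afterwards.
    static boolean disposedAfterTry() {
        DisposableSketch extractor = new DisposableSketch();
        try (DisposableSketch scoped = extractor) {
            // Extraction calls would go here while the extractor is still open.
            if (scoped.isDisposed()) {
                throw new IllegalStateException("should still be open inside the block");
            }
        } catch (Exception e) {
            // close() declares a checked Exception, so the caller has to handle it.
            throw new IllegalStateException(e);
        }
        return extractor.isDisposed();
    }
}
```

Since close() is declared with throws Exception, callers either catch it as shown here or propagate it.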
public void reset()
Resets the current document. After a reset, the extractLine method returns the first line of the document.
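The effect of reset() on line-by-line reading can be sketched as follows. LineSketch is a hypothetical in-memory stand-in, not the library class, and it assumes that extractLine() returns null once no lines remain; the reference above does not state the end-of-document behavior.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in that mirrors extractLine() and reset() from the listing above.
class LineSketch {
    private final List<String> lines;
    private int position;

    LineSketch(String... lines) {
        this.lines = Arrays.asList(lines);
    }

    // Assumption: null signals that no lines remain.
    public String extractLine() {
        return position < lines.size() ? lines.get(position++) : null;
    }

    // Moves the read position back to the start of the document.
    public void reset() {
        position = 0;
    }
}

class ResetDemo {
    static String firstLineAfterReset() {
        LineSketch extractor = new LineSketch("alpha", "beta");
        // Consume every line until the stand-in reports the end.
        while (extractor.extractLine() != null) {
            // nothing to do with the lines in this sketch
        }
        extractor.reset();
        // After the reset, reading starts again from the first line.
        return extractor.extractLine();
    }
}
```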
public String extractLine()
Extracts a line of characters from the text extractor and returns the data as a string.
public String extractAll()
Extracts all characters from the current position to the end of the text extractor and returns them as one string.
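The phrase "from the current position" means that lines already returned by extractLine() are not repeated by extractAll(). A minimal sketch of that behavior follows. CursorSketch is a hypothetical stand-in, and the "\n" separator used to join the remaining lines is an assumption made for the example.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in that mirrors extractLine() and extractAll() from the listing above.
class CursorSketch {
    private final List<String> lines;
    private int position;

    CursorSketch(String... lines) {
        this.lines = Arrays.asList(lines);
    }

    // Assumption: null signals that no lines remain.
    public String extractLine() {
        return position < lines.size() ? lines.get(position++) : null;
    }

    // Returns everything from the current position to the end and moves the position there.
    public String extractAll() {
        String rest = String.join("\n", lines.subList(position, lines.size()));
        position = lines.size();
        return rest;
    }
}

class ExtractAllDemo {
    static String restAfterFirstLine() {
        CursorSketch extractor = new CursorSketch("alpha", "beta", "gamma");
        extractor.extractLine();        // consumes "alpha"
        return extractor.extractAll();  // only the remaining lines
    }
}
```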
protected void dispose(boolean disposing)
Releases the unmanaged resources used by the extractor.
Parameters:
disposing - true if invoked from dispose(); otherwise, false.
protected void checkDisposed()
Checks whether the extractor is disposed.
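One common way to combine dispose(), dispose(boolean), and checkDisposed() is a guard that rejects calls on a disposed object. The sketch below shows that pattern under stated assumptions: GuardedSketch is hypothetical, and both the idea that checkDisposed() throws and the choice of IllegalStateException are assumptions, since the reference above does not say what the library does.

```java
// Hypothetical stand-in for the dispose/guard pattern. It is NOT the library's TextExtractor.
class GuardedSketch {
    private boolean disposed;

    public boolean isDisposed() {
        return disposed;
    }

    public void dispose() {
        dispose(true);
    }

    // Releases resources once; repeated calls are harmless.
    protected void dispose(boolean disposing) {
        if (disposed) {
            return;
        }
        // A real extractor would release streams and buffers here.
        disposed = true;
    }

    // Assumption: the guard throws when the object has already been disposed.
    protected void checkDisposed() {
        if (disposed) {
            throw new IllegalStateException("The extractor is disposed");
        }
    }

    public String extractLine() {
        checkDisposed();
        return "line";
    }
}

class GuardDemo {
    static boolean throwsAfterDispose() {
        GuardedSketch extractor = new GuardedSketch();
        extractor.dispose();
        extractor.dispose(); // a second call is a no-op in this sketch
        try {
            extractor.extractLine();
            return false;
        } catch (IllegalStateException expected) {
            return true;
        }
    }
}
```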
protected String extractText()
Extracts all characters from the current position to the end of the text extractor and returns them as one string.
protected String extractTextLine()
Extracts a line of characters from the text extractor and returns the data as a string.
protected abstract String prepareLine()
Returns a line of the text.
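Since prepareLine() is the only abstract member, a concrete extractor mainly has to supply lines one at a time. The sketch below illustrates that template-method shape with hypothetical classes (ExtractorSketch, PlainTextSketch). It assumes that the public methods are built on prepareLine() and that null means no more lines, neither of which the reference above confirms.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical base class with the same shape as the listing above.
abstract class ExtractorSketch {
    // Subclasses supply one line per call; null is assumed to mean "no more lines".
    protected abstract String prepareLine();

    public String extractLine() {
        return prepareLine();
    }

    // Joins all remaining lines with '\n' (the separator is an assumption).
    public String extractAll() {
        StringBuilder text = new StringBuilder();
        boolean first = true;
        String line;
        while ((line = prepareLine()) != null) {
            if (!first) {
                text.append('\n');
            }
            text.append(line);
            first = false;
        }
        return text.toString();
    }
}

// Hypothetical subclass that reads plain UTF-8 text from a stream.
class PlainTextSketch extends ExtractorSketch {
    private final BufferedReader reader;

    PlainTextSketch(InputStream stream) {
        reader = new BufferedReader(new InputStreamReader(stream, StandardCharsets.UTF_8));
    }

    @Override
    protected String prepareLine() {
        try {
            return reader.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}

class PrepareLineDemo {
    static String run() {
        InputStream in = new ByteArrayInputStream("one\ntwo\nthree".getBytes(StandardCharsets.UTF_8));
        PlainTextSketch extractor = new PlainTextSketch(in);
        String first = extractor.extractLine();
        return first + "|" + extractor.extractAll();
    }
}
```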