com.groupdocs.parser

Class TextExtractor

    • Constructor Summary

      Constructors
      Modifier     Constructor and Description
      protected    TextExtractor(InputStream stream, LoadOptions loadOptions)
                   Initializes a new instance of the TextExtractor class.
    • Constructor Detail

      • TextExtractor

        protected TextExtractor(InputStream stream,
                     LoadOptions loadOptions)

        Initializes a new instance of the TextExtractor class.

        Parameters:
        stream - The input stream that contains the document.
        loadOptions - The options for loading the document.
        Throws:
        ArgumentNullException - if stream is null.
    • Method Detail

      • isDisposed

        public boolean isDisposed()

        Gets a value indicating whether the extractor is disposed.

        Returns:
        true if the extractor is disposed; otherwise, false.
      • getMediaType

        public String getMediaType()

        Gets a media type for the document.

        Returns:
        The media type of the document, or null if the media type is not specified.
      • setMediaType

        protected void setMediaType(String value)

        Sets a media type for the document.

        Parameters:
        value - The media type of the document, or null if the media type is not specified.
      • getEncoding

        public Charset getEncoding()

        Gets an encoding for the document.

        Returns:
        The encoding of the document, or null if the encoding is not specified.
      • setEncoding

        public void setEncoding(Charset value)

        Sets an encoding for the document.

        Parameters:
        value - The encoding of the document, or null if the encoding is not specified.
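
        Because getEncoding may return null, callers usually need a fallback. The following is a minimal sketch of that null handling. DocumentSettings and effectiveEncoding are hypothetical stand-ins written for this example, not part of com.groupdocs.parser, and the UTF-8 fallback is an assumption rather than documented library behavior.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class EncodingSketch {
    // Hypothetical stand-in that mirrors the documented getEncoding/setEncoding pair.
    static class DocumentSettings {
        private Charset encoding; // null means "not specified"

        Charset getEncoding() { return encoding; }

        void setEncoding(Charset value) { encoding = value; }
    }

    // Returns the configured encoding, or UTF-8 when none is specified.
    // The UTF-8 default is an assumption made for this sketch.
    static Charset effectiveEncoding(DocumentSettings settings) {
        Charset cs = settings.getEncoding();
        return cs != null ? cs : StandardCharsets.UTF_8;
    }

    public static void main(String[] args) {
        DocumentSettings settings = new DocumentSettings();
        System.out.println(effectiveEncoding(settings)); // prints "UTF-8"

        // Charset.forName resolves a charset by its canonical name or an alias.
        settings.setEncoding(Charset.forName("ISO-8859-1"));
        System.out.println(effectiveEncoding(settings)); // prints "ISO-8859-1"
    }
}
```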
      • getPassword

        protected String getPassword()

        Gets the password of the document.

        Returns:
        A string that represents the password of the document.
      • dispose

        public void dispose()

        Releases the unmanaged resources used by the extractor.

      • reset

        public void reset()

        Resets the current document.

        Resets the cursor's position, so that the next call to extractLine returns the first line of the document.

      • extractLine

        public String extractLine()

        Extracts a line of characters from the text extractor and returns the data as a string.

        Returns:
        The next line from the extractor, or null if all characters have been extracted.
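
        The null return value is the end-of-document signal, so a typical caller loops until extractLine returns null and can call reset to start over. The sketch below illustrates that contract only. LineSource is a hypothetical in-memory stand-in, not the library class, and it takes its lines directly instead of parsing a document.

```java
import java.util.ArrayList;
import java.util.List;

class ExtractLineSketch {
    // Hypothetical stand-in that mirrors the documented extractLine/reset contract.
    static class LineSource {
        private final String[] lines;
        private int cursor = 0;

        LineSource(String... lines) { this.lines = lines; }

        // Returns the next line, or null once every line has been returned.
        String extractLine() {
            return cursor < lines.length ? lines[cursor++] : null;
        }

        // Moves the cursor back to the first line.
        void reset() { cursor = 0; }
    }

    // Collects every line from the current position to the end.
    static List<String> readRemainingLines(LineSource source) {
        List<String> result = new ArrayList<>();
        String line;
        while ((line = source.extractLine()) != null) {
            result.add(line);
        }
        return result;
    }

    public static void main(String[] args) {
        LineSource source = new LineSource("first", "second");
        System.out.println(readRemainingLines(source)); // prints "[first, second]"

        source.reset();
        System.out.println(source.extractLine()); // prints "first"
    }
}
```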
      • extractAll

        public String extractAll()

        Extracts all characters from the current position to the end of the text extractor and returns them as one string.

        Returns:
        A string that contains all characters from the current position to the end of the text extractor.
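
        "From the current position" means that lines already consumed by extractLine are not returned again. The sketch below shows that interaction with a hypothetical in-memory stand-in (LineSource is not the library class). Joining the remaining lines with '\n' is an assumption of the sketch; the separator the real extractor uses is not stated here.

```java
import java.util.Arrays;

class ExtractAllSketch {
    // Hypothetical stand-in that shares one cursor between extractLine and extractAll.
    static class LineSource {
        private final String[] lines;
        private int cursor = 0;

        LineSource(String... lines) { this.lines = lines; }

        // Returns the next line, or null once every line has been returned.
        String extractLine() {
            return cursor < lines.length ? lines[cursor++] : null;
        }

        // Returns everything from the cursor to the end and moves the cursor to the end.
        // The '\n' separator is an assumption made for this sketch.
        String extractAll() {
            String rest = String.join("\n", Arrays.copyOfRange(lines, cursor, lines.length));
            cursor = lines.length;
            return rest;
        }
    }

    public static void main(String[] args) {
        LineSource source = new LineSource("header", "body 1", "body 2");
        String header = source.extractLine(); // consumes "header"
        String body = source.extractAll();    // "body 1" and "body 2" only
        System.out.println(header);
        System.out.println(body);
    }
}
```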
      • dispose

        protected void dispose(boolean disposing)

        Releases the unmanaged resources used by the extractor.

        Parameters:
        disposing - true if invoked from dispose(); otherwise, false.
      • checkDisposed

        protected void checkDisposed()

        Checks whether the extractor is disposed.
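
        The four members isDisposed, dispose(), dispose(boolean) and checkDisposed follow the shape of the common dispose pattern. The sketch below shows one way such a pattern is usually wired together. Resource is a hypothetical stand-in, not the library class, and throwing IllegalStateException from checkDisposed is an assumption: this page does not say what checkDisposed does when the extractor is already disposed.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

class DisposeSketch {
    // Hypothetical stand-in that owns a stream and releases it on dispose.
    static class Resource {
        private InputStream stream;
        private boolean disposed;

        Resource(InputStream stream) { this.stream = stream; }

        public boolean isDisposed() { return disposed; }

        // Public entry point; delegates to the overridable overload.
        public void dispose() { dispose(true); }

        protected void dispose(boolean disposing) {
            if (disposed) {
                return; // a repeated call is a no-op
            }
            if (disposing && stream != null) {
                try {
                    stream.close();
                } catch (IOException e) {
                    // Ignored in this sketch; real code may want to log it.
                }
                stream = null;
            }
            disposed = true;
        }

        // Assumption: use after dispose is reported with IllegalStateException.
        protected void checkDisposed() {
            if (disposed) {
                throw new IllegalStateException("The extractor is disposed.");
            }
        }

        // Every operation that touches the stream checks the disposed flag first.
        public int readByte() {
            checkDisposed();
            try {
                return stream.read();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }

    public static void main(String[] args) {
        Resource resource = new Resource(new ByteArrayInputStream(new byte[] {65}));
        try {
            System.out.println(resource.readByte()); // prints "65"
        } finally {
            resource.dispose(); // always release, even if reading fails
        }
        System.out.println(resource.isDisposed()); // prints "true"
    }
}
```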

      • extractText

        protected String extractText()

        Extracts all characters from the current position to the end of the text extractor and returns them as one string.

        Returns:
        A string that contains all characters from the current position to the end of the text extractor.
      • extractTextLine

        protected String extractTextLine()

        Extracts a line of characters from the text extractor and returns the data as a string.

        Returns:
        The next line from the extractor, or null if all characters have been extracted.
      • prepareLine

        protected abstract String prepareLine()

        Returns a line of the text.

        Returns:
        A string that represents a line of the text, or null if all characters have been read.
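
        Since prepareLine is the only abstract method listed, a subclass supplies the lines and the public extraction methods can be built on top of it. The sketch below illustrates that template-method arrangement with hypothetical stand-ins: LineExtractor and PlainTextExtractor are written for this example and are not the library classes, and how the real extractLine and extractAll use prepareLine internally is an assumption.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;

class PrepareLineSketch {
    // Hypothetical base class: public methods delegate to one abstract hook.
    abstract static class LineExtractor {
        public String extractLine() {
            return prepareLine();
        }

        // Joins the remaining lines with '\n' (separator chosen for the sketch).
        public String extractAll() {
            StringBuilder sb = new StringBuilder();
            boolean first = true;
            String line;
            while ((line = prepareLine()) != null) {
                if (!first) {
                    sb.append('\n');
                }
                sb.append(line);
                first = false;
            }
            return sb.toString();
        }

        // Returns the next line, or null when all characters have been read.
        protected abstract String prepareLine();
    }

    // Hypothetical subclass that reads lines from an in-memory string.
    static class PlainTextExtractor extends LineExtractor {
        private final BufferedReader reader;

        PlainTextExtractor(String text) {
            this.reader = new BufferedReader(new StringReader(text));
        }

        @Override
        protected String prepareLine() {
            try {
                return reader.readLine(); // null at end of input
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }

    public static void main(String[] args) {
        LineExtractor extractor = new PlainTextExtractor("a\nb\nc");
        System.out.println(extractor.extractLine()); // prints "a"
        System.out.println(extractor.extractAll());  // prints "b" and "c" on separate lines
    }
}
```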