com.groupdocs.parser

Class XmlTextExtractor

  • All Implemented Interfaces:
    AutoCloseable


    public final class XmlTextExtractor
    extends TextExtractor

    Provides a text extractor for XML documents.

    Reading text from an XML file:

     // Create a text extractor for an XML document (stream is an open InputStream)
     XmlTextExtractor extractor = new XmlTextExtractor(stream);
     // Extract all text from the document and print it
     System.out.println(extractor.extractAll());
      
    • Constructor Detail

      • XmlTextExtractor

        public XmlTextExtractor(String fileName)

        Initializes a new instance of the XmlTextExtractor class.

        Parameters:
        fileName - The path to the file.
      • XmlTextExtractor

        public XmlTextExtractor(String fileName,
                        LoadOptions loadOptions)

        Initializes a new instance of the XmlTextExtractor class.

        Parameters:
        fileName - The path to the file.
        loadOptions - The options for loading the file.
      • XmlTextExtractor

        public XmlTextExtractor(InputStream stream)

        Initializes a new instance of the XmlTextExtractor class.

        Parameters:
        stream - The input stream containing the document.
      • XmlTextExtractor

        public XmlTextExtractor(InputStream stream,
                        LoadOptions loadOptions)

        Initializes a new instance of the XmlTextExtractor class.

        Parameters:
        stream - The input stream containing the document.
        loadOptions - The options for loading the document.
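
One way to obtain the stream passed to the InputStream constructors is java.nio.file.Files.newInputStream inside a try-with-resources block. The sketch below is a minimal, library-free illustration: XmlStreamDemo, countBytes, and roundTrip are hypothetical names, and counting bytes merely stands in for the extraction step, because the extractor class is not assumed to be on the classpath here.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class; not part of com.groupdocs.parser.
class XmlStreamDemo {

    // Opens the file as an InputStream and returns how many bytes it holds.
    static int countBytes(Path xmlFile) {
        // try-with-resources closes the stream even if reading fails.
        try (InputStream stream = Files.newInputStream(xmlFile)) {
            // With the library available, the stream would be handed to
            // new XmlTextExtractor(stream) at this point instead.
            byte[] buffer = new byte[4096];
            int total = 0;
            int read;
            while ((read = stream.read(buffer)) != -1) {
                total += read;
            }
            return total;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Writes the XML to a temporary file, streams it back, then deletes the file.
    static int roundTrip(String xml) {
        try {
            Path xmlFile = Files.createTempFile("sample", ".xml");
            try {
                Files.write(xmlFile, xml.getBytes(StandardCharsets.UTF_8));
                return countBytes(xmlFile);
            } finally {
                Files.deleteIfExists(xmlFile);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("<note><body>Hello</body></note>"));
    }
}
```

Because the class implements AutoCloseable, the extractor itself could presumably be declared in the same try-with-resources header; that is an inference from the interface, not something this page states.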
    • Method Detail

      • reset

        public void reset()

        Resets the current document.

        Resets the cursor's position. The next call to extractLine will return the first line of the document.

        Overrides:
        reset in class TextExtractor
      • prepareLine

        protected String prepareLine()

        Returns the next line of the text.

        Specified by:
        prepareLine in class TextExtractor
        Returns:
        A string that represents a line of the text, or null if all characters have been read.
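
The contract described for reset and for line-by-line reading (a line per call, null once everything has been read, and a return to the first line after reset) can be illustrated with a small stand-in. This is a sketch under stated assumptions: LineCursor, readRemaining, and LineCursorDemo are hypothetical names that mimic the documented behaviour, not the library's implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in that mimics the documented contract; NOT the library's code.
final class LineCursor {
    private final String[] lines;
    private int position = 0;

    LineCursor(String text) {
        // Split on \n or \r\n; the -1 limit keeps trailing empty lines.
        this.lines = text.split("\\r?\\n", -1);
    }

    // Returns the next line, or null once every line has been read.
    String extractLine() {
        return position < lines.length ? lines[position++] : null;
    }

    // Moves the cursor back so the next extractLine() call returns the first line.
    void reset() {
        position = 0;
    }

    // Drains the remaining lines with the usual "read until null" loop.
    List<String> readRemaining() {
        List<String> out = new ArrayList<>();
        String line;
        while ((line = extractLine()) != null) {
            out.add(line);
        }
        return out;
    }
}

class LineCursorDemo {
    public static void main(String[] args) {
        LineCursor cursor = new LineCursor("first\nsecond\nthird");
        System.out.println(cursor.readRemaining()); // prints [first, second, third]
        System.out.println(cursor.extractLine());   // prints null: everything was read
        cursor.reset();
        System.out.println(cursor.extractLine());   // prints first
    }
}
```

The same read-until-null loop is the natural way to consume any reader with this contract; calling reset first guarantees the loop starts from the first line regardless of earlier reads.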