com.groupdocs.parser

Class WordsMetadataExtractor



  • public final class WordsMetadataExtractor
    extends MetadataExtractor

    Provides the functionality to extract the metadata from text documents.


    Supported formats:

    .DOC                     Microsoft Word Text document
    .DOT                     Microsoft Word Text template
    .DOCX                    Microsoft Office Open XML Text document
    .DOCM                    Microsoft Word 2007 Macro-enabled document
    .RTF                     Rich Text Format text file
    .ODT                     OpenDocument text
    .HTML (.XHTML, .HTM)     Hypertext Markup Language document
    .MHTML (.MHT)            Web Archive Single File
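
    A caller may want to check a file name against the list above before
    opening a stream. The following is a minimal sketch of such a check. The
    class WordsFormats and its method isSupported are hypothetical helpers
    written for this illustration; they are not part of com.groupdocs.parser,
    and the extractor itself may detect formats differently.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper for illustration; NOT part of com.groupdocs.parser.
final class WordsFormats {
    // Extensions copied from the "Supported formats" list above.
    private static final Set<String> SUPPORTED = new HashSet<>(Arrays.asList(
            "doc", "dot", "docx", "docm", "rtf", "odt",
            "html", "xhtml", "htm", "mhtml", "mht"));

    private WordsFormats() { }

    /** Returns true if the file name ends with one of the listed extensions. */
    static boolean isSupported(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0 || dot == fileName.length() - 1) {
            return false; // no extension at all
        }
        // Compare case-insensitively, so "REPORT.DOCX" matches "docx".
        String ext = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        return SUPPORTED.contains(ext);
    }
}
```

    The check looks only at the file name, so it says nothing about the
    actual content of the file.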

    Extracting the metadata:

     // Create a metadata extractor for text documents
     MetadataExtractor metadataExtractor = new WordsMetadataExtractor();
     // Extract the metadata from the stream
     // (stream is assumed to be an InputStream already opened by the caller)
     MetadataCollection metadata = metadataExtractor.extractMetadata(stream);
      
    • Constructor Detail

      • WordsMetadataExtractor

        public WordsMetadataExtractor()

        Initializes a new instance of the WordsMetadataExtractor class.

    • Method Detail

      • extractMetadataFromStream

        protected MetadataCollection extractMetadataFromStream(InputStream stream,
                                                   LoadOptions loadOptions)

        Extracts the metadata from the stream.

        Overrides:
        extractMetadataFromStream in class MetadataExtractor
        Parameters:
        stream - The stream of the document.
        loadOptions - The options for loading the file.


        This method must be overridden in inherited classes.

        Returns:
        A collection of the metadata.
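
        The relationship between a public extraction call and a protected
        stream hook overridden by a final subclass can be sketched with
        stand-in types. Everything below (SketchExtractor, SketchLoadOptions,
        SketchTextExtractor, and the "size" entry) is hypothetical and only
        illustrates the pattern. It assumes that the public method delegates
        to the protected one with default options, which this page does not
        confirm for MetadataExtractor.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;

// Stand-in types for illustration only; these are NOT the GroupDocs classes.
final class SketchLoadOptions { }

abstract class SketchExtractor {
    // Public entry point: delegates to the protected hook with default options.
    public final Map<String, String> extract(InputStream stream) {
        return extractFromStream(stream, new SketchLoadOptions());
    }

    // Hook that each format-specific subclass has to override.
    protected abstract Map<String, String> extractFromStream(
            InputStream stream, SketchLoadOptions options);
}

final class SketchTextExtractor extends SketchExtractor {
    @Override
    protected Map<String, String> extractFromStream(
            InputStream stream, SketchLoadOptions options) {
        // Toy "metadata": count the bytes available in the stream.
        long size = 0;
        byte[] buffer = new byte[4096];
        try {
            int read;
            while ((read = stream.read(buffer)) != -1) {
                size += read;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        Map<String, String> metadata = new LinkedHashMap<>();
        metadata.put("size", Long.toString(size));
        return metadata;
    }
}
```

        Callers use only the public method. The protected hook stays an
        implementation detail of each format-specific class.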