GetText

GetText()

Extracts text from the document.

public TextReader GetText()

Return Value

An instance of the TextReader class with the extracted text; null if text extraction isn’t supported.

Examples

The following example shows how to extract text from a document:

// Create an instance of the Parser class
using (Parser parser = new Parser(filePath))
{
    // Extract the text into the reader
    using (TextReader reader = parser.GetText())
    {
        // Print the text from the document
        // If text extraction isn't supported, the reader is null
        Console.WriteLine(reader == null ? "Text extraction isn't supported" : reader.ReadToEnd());
    }
}
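
Because GetText returns null rather than throwing when extraction isn't supported, the null check in the example above can be factored into a small helper. The following is only a sketch: ReadToEndOrDefault is a hypothetical extension method, not part of GroupDocs.Parser, and a StringReader stands in for the reader that the parser would return.

```csharp
using System;
using System.IO;

static class TextReaderExtensions
{
    // Hypothetical helper (an assumption, not a library API):
    // returns the fallback string when the reader is null,
    // otherwise reads the reader to the end.
    public static string ReadToEndOrDefault(this TextReader reader, string fallback)
    {
        return reader == null ? fallback : reader.ReadToEnd();
    }
}

static class Demo
{
    static void Main()
    {
        // Simulates the case where text extraction isn't supported
        TextReader unsupported = null;
        Console.WriteLine(unsupported.ReadToEndOrDefault("Text extraction isn't supported"));

        // Simulates a reader holding extracted text
        using (TextReader supported = new StringReader("Hello"))
        {
            Console.WriteLine(supported.ReadToEndOrDefault("Text extraction isn't supported"));
        }
    }
}
```

With a real parser, the call would look like parser.GetText().ReadToEndOrDefault("..."), with the returned reader still disposed by the caller.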

GetText(TextOptions)

Extracts text from the document using text options (to enable the fast raw text extraction mode).

public TextReader GetText(TextOptions options)

Parameter	Type	Description
options	TextOptions	The text extraction options.

Return Value

An instance of the TextReader class with the extracted text; null if text extraction isn’t supported.

Examples

The following example shows how to extract raw text from a document:

// Create an instance of the Parser class
using (Parser parser = new Parser(filePath))
{
    // Extract the raw text into the reader
    using (TextReader reader = parser.GetText(new TextOptions(true)))
    {
        // Print the text from the document
        // If text extraction isn't supported, the reader is null
        Console.WriteLine(reader == null ? "Text extraction isn't supported" : reader.ReadToEnd());
    }
}

GetText(int)

Extracts text from a document page.

public TextReader GetText(int pageIndex)

Parameter	Type	Description
pageIndex	Int32	The zero-based page index.

Return Value

An instance of the TextReader class with the extracted text; null if page text extraction isn’t supported.

Remarks

Learn more:

Extract text in Accurate mode

Examples

The following example shows how to extract text from each page of a document:

// Create an instance of the Parser class
using (Parser parser = new Parser(filePath))
{
    // Check if the document supports text extraction
    if (!parser.Features.Text)
    {
        Console.WriteLine("Document doesn't support text extraction.");
        return;
    }

    // Get the document info
    IDocumentInfo documentInfo = parser.GetDocumentInfo();
    // Check if the document has pages
    if (documentInfo.PageCount == 0)
    {
        Console.WriteLine("Document has no pages.");
        return;
    }

    // Iterate over pages
    for (int p = 0; p < documentInfo.PageCount; p++)
    {
        // Print the page number
        Console.WriteLine("Page {0}/{1}", p + 1, documentInfo.PageCount);

        // Extract the page text into the reader
        using (TextReader reader = parser.GetText(p))
        {
            // Print the text from the page
            // The null check is skipped because text extraction support was verified above
            Console.WriteLine(reader.ReadToEnd());
        }
    }
}
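
When the page texts are needed as data rather than printed, the loop above can be separated from the parser call. The following is a minimal sketch under stated assumptions: PageTextCollector and ReadPages are hypothetical names, not part of GroupDocs.Parser, and the getPageText delegate stands in for a call such as parser.GetText(pageIndex). The demo uses StringReader instances in place of real pages.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class PageTextCollector
{
    // Hypothetical helper (an assumption, not a library API).
    // Calls getPageText for each zero-based page index and collects the results.
    public static List<string> ReadPages(int pageCount, Func<int, TextReader> getPageText)
    {
        var pages = new List<string>(pageCount);
        for (int p = 0; p < pageCount; p++)
        {
            // A null reader is stored as a null entry instead of throwing
            using (TextReader reader = getPageText(p))
            {
                pages.Add(reader == null ? null : reader.ReadToEnd());
            }
        }
        return pages;
    }
}

static class Demo
{
    static void Main()
    {
        // Stand-in page contents used instead of a real document
        string[] fakePages = { "first page", "second page" };

        List<string> pages = PageTextCollector.ReadPages(
            fakePages.Length, p => new StringReader(fakePages[p]));

        for (int p = 0; p < pages.Count; p++)
        {
            Console.WriteLine("Page {0}/{1}: {2}", p + 1, pages.Count, pages[p]);
        }
    }
}
```

With a real parser, the delegate could be written as p => parser.GetText(p), and the page count taken from documentInfo.PageCount as in the example above.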

GetText(int, TextOptions)

Extracts text from a document page using text options (to enable the fast raw text extraction mode).

public TextReader GetText(int pageIndex, TextOptions options)

Parameter	Type	Description
pageIndex	Int32	The zero-based page index.
options	TextOptions	The text extraction options.

Return Value

An instance of the TextReader class with the extracted text; null if page text extraction isn’t supported.

Examples

The following example shows how to extract raw text from each page of a document:

// Create an instance of the Parser class
using (Parser parser = new Parser(filePath))
{
    // Check if the document supports text extraction
    if (!parser.Features.Text)
    {
        Console.WriteLine("Document doesn't support text extraction.");
        return;
    }

    // Get the document info
    DocumentInfo documentInfo = parser.GetDocumentInfo() as DocumentInfo;
    // Check if the document has pages
    if (documentInfo == null || documentInfo.RawPageCount == 0)
    {
        Console.WriteLine("Document has no pages.");
        return;
    }

    // Iterate over pages
    for (int p = 0; p < documentInfo.RawPageCount; p++)
    {
        // Print the page number
        Console.WriteLine("Page {0}/{1}", p + 1, documentInfo.RawPageCount);

        // Extract the raw page text into the reader
        using (TextReader reader = parser.GetText(p, new TextOptions(true)))
        {
            // Print the text from the page
            // The null check is skipped because text extraction support was verified above
            Console.WriteLine(reader.ReadToEnd());
        }
    }
}
