PageTextAreaOptions

Inheritance: java.lang.Object, com.groupdocs.parser.options.PageAreaOptions

public class PageTextAreaOptions extends PageAreaOptions

Provides the options used for extracting page text areas.

An instance of the PageTextAreaOptions class is passed as a parameter to the Parser.getTextAreas(PageTextAreaOptions) and Parser.getTextAreas(int, PageTextAreaOptions) methods. See the usage examples there.

Constructors

Constructor Description
PageTextAreaOptions(boolean useOcr) Initializes a new instance of the PageTextAreaOptions class with the OCR usage option.
PageTextAreaOptions(boolean useOcr, OcrOptions ocrOptions) Initializes a new instance of the PageTextAreaOptions class with the ability to set OCR options.
PageTextAreaOptions(String expression) Initializes a new instance of the PageTextAreaOptions class with the regular expression.
PageTextAreaOptions(String expression, Rectangle rectangle) Initializes a new instance of the PageTextAreaOptions class with the regular expression and rectangular area.
PageTextAreaOptions(String expression, Rectangle rectangle, double rectangleTolerance) Initializes a new instance of the PageTextAreaOptions class with the regular expression, rectangular area and the size of the ignored border.
PageTextAreaOptions(String expression, boolean matchCase, boolean uniteSegments, boolean ignoreFormatting, Rectangle rectangle) Initializes a new instance of the PageTextAreaOptions class.
PageTextAreaOptions(String expression, boolean matchCase, boolean uniteSegments, boolean ignoreFormatting, Rectangle rectangle, double rectangleTolerance) Initializes a new instance of the PageTextAreaOptions class with the size of the ignored border.

Methods

Method Description
getExpression() Gets the regular expression.
isMatchCase() Gets the value that indicates whether text case is taken into account.
isUniteSegments() Gets the value that indicates whether segments are united.
isIgnoreFormatting() Gets the value that indicates whether text formatting is ignored.
isUseOcr() Gets the value that indicates whether the OCR Connector is used to extract text.
getOcrOptions() Gets the additional options for OCR functionality.

PageTextAreaOptions(boolean useOcr)

public PageTextAreaOptions(boolean useOcr)

Initializes a new instance of the PageTextAreaOptions class with the OCR usage option.

Parameters:

Parameter Type Description
useOcr boolean The value that indicates whether the OCR functionality is used to extract text.

PageTextAreaOptions(boolean useOcr, OcrOptions ocrOptions)

public PageTextAreaOptions(boolean useOcr, OcrOptions ocrOptions)

Initializes a new instance of the PageTextAreaOptions class with the ability to set OCR options.

Parameters:

Parameter Type Description
useOcr boolean The value that indicates whether the OCR functionality is used to extract text.
ocrOptions OcrOptions The additional options for OCR functionality.

PageTextAreaOptions(String expression)

public PageTextAreaOptions(String expression)

Initializes a new instance of the PageTextAreaOptions class with the regular expression. The other options take their default values.

The following properties have default values:

  • MatchCase: false
  • UniteSegments: false
  • IgnoreFormatting: false
  • Rectangle: null

Parameters:

Parameter Type Description
expression java.lang.String The regular expression.
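As an illustration of the defaults listed above, the following is a minimal sketch of how an expression-only constructor could delegate to a full constructor. TextAreaOptionsSketch and its double[] rectangle field are hypothetical stand-ins, not GroupDocs types, and the delegation is an assumption about the design rather than the library's confirmed implementation.

```java
// Hypothetical stand-in used only to illustrate defaulting; not the GroupDocs class.
final class TextAreaOptionsSketch {
    private final String expression;
    private final boolean matchCase;
    private final boolean uniteSegments;
    private final boolean ignoreFormatting;
    // Stand-in for Rectangle: {left, top, width, height}, or null when no rectangle is given.
    private final double[] rectangle;

    // Expression only: the remaining options take the documented defaults.
    TextAreaOptionsSketch(String expression) {
        this(expression, false, false, false, null);
    }

    // Full form: every option is supplied by the caller.
    TextAreaOptionsSketch(String expression, boolean matchCase, boolean uniteSegments,
                          boolean ignoreFormatting, double[] rectangle) {
        this.expression = expression;
        this.matchCase = matchCase;
        this.uniteSegments = uniteSegments;
        this.ignoreFormatting = ignoreFormatting;
        this.rectangle = rectangle;
    }

    String getExpression() { return expression; }
    boolean isMatchCase() { return matchCase; }
    boolean isUniteSegments() { return uniteSegments; }
    boolean isIgnoreFormatting() { return ignoreFormatting; }
    double[] getRectangle() { return rectangle; }

    public static void main(String[] args) {
        TextAreaOptionsSketch options = new TextAreaOptionsSketch("\\d+");
        System.out.println("matchCase=" + options.isMatchCase()
                + ", uniteSegments=" + options.isUniteSegments()
                + ", ignoreFormatting=" + options.isIgnoreFormatting()
                + ", rectangle=" + options.getRectangle());
    }
}
```

With the real class, the equivalent check would be to construct PageTextAreaOptions with only an expression and read the values back through the getters documented on this page.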

PageTextAreaOptions(String expression, Rectangle rectangle)

public PageTextAreaOptions(String expression, Rectangle rectangle)

Initializes a new instance of the PageTextAreaOptions class with the regular expression and rectangular area. The other options take their default values.

The following properties have default values:

  • MatchCase: false
  • UniteSegments: false
  • IgnoreFormatting: false

Parameters:

Parameter Type Description
expression java.lang.String The regular expression.
rectangle Rectangle The rectangular area that contains page areas.

PageTextAreaOptions(String expression, Rectangle rectangle, double rectangleTolerance)

public PageTextAreaOptions(String expression, Rectangle rectangle, double rectangleTolerance)

Initializes a new instance of the PageTextAreaOptions class with the regular expression, rectangular area and the size of the ignored border. The other options take their default values.

Parameters:

Parameter Type Description
expression java.lang.String The regular expression.
rectangle Rectangle The rectangular area that contains page areas.
rectangleTolerance double The size of the border that is ignored when captured by the rectangular area. It is measured as a fraction of the text item height.
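The description of rectangleTolerance is brief, so the sketch below shows one possible reading of it: a text item still counts as inside the rectangular area if it overhangs an edge by no more than rectangleTolerance times its own height. This interpretation, the RectangleToleranceSketch class and its Box type are all assumptions made for illustration; the library's actual rule may differ.

```java
// Hypothetical geometry helper; not part of GroupDocs.Parser.
final class RectangleToleranceSketch {
    // Axis-aligned box described by its top-left corner and size.
    static final class Box {
        final double left, top, width, height;

        Box(double left, double top, double width, double height) {
            this.left = left;
            this.top = top;
            this.width = width;
            this.height = height;
        }

        double right() { return left + width; }
        double bottom() { return top + height; }
    }

    // Assumed rule: the item may overhang each edge of the area by up to
    // rectangleTolerance * item.height and still be treated as contained.
    static boolean contains(Box area, Box item, double rectangleTolerance) {
        double slack = rectangleTolerance * item.height;
        return item.left >= area.left - slack
                && item.top >= area.top - slack
                && item.right() <= area.right() + slack
                && item.bottom() <= area.bottom() + slack;
    }

    public static void main(String[] args) {
        Box area = new Box(0, 0, 300, 100);
        // The item's bottom edge is at 105, which is 5 units below the area.
        Box item = new Box(10, 95, 80, 10);
        System.out.println(contains(area, item, 0.0)); // overhang not allowed
        System.out.println(contains(area, item, 0.5)); // up to 5 units allowed
    }
}
```

Under this reading, a tolerance of 0 requires the item to lie fully inside the area, while larger values admit items that straddle the border.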

PageTextAreaOptions(String expression, boolean matchCase, boolean uniteSegments, boolean ignoreFormatting, Rectangle rectangle)

public PageTextAreaOptions(String expression, boolean matchCase, boolean uniteSegments, boolean ignoreFormatting, Rectangle rectangle)

Initializes a new instance of the PageTextAreaOptions class.

Parameters:

Parameter Type Description
expression java.lang.String The regular expression.
matchCase boolean The value that indicates whether text case is taken into account.
uniteSegments boolean The value that indicates whether segments are united.
ignoreFormatting boolean The value that indicates whether text formatting is ignored.
rectangle Rectangle The rectangular area that contains page areas.

PageTextAreaOptions(String expression, boolean matchCase, boolean uniteSegments, boolean ignoreFormatting, Rectangle rectangle, double rectangleTolerance)

public PageTextAreaOptions(String expression, boolean matchCase, boolean uniteSegments, boolean ignoreFormatting, Rectangle rectangle, double rectangleTolerance)

Initializes a new instance of the PageTextAreaOptions class with the size of the ignored border.

Parameters:

Parameter Type Description
expression java.lang.String The regular expression.
matchCase boolean The value that indicates whether text case is taken into account.
uniteSegments boolean The value that indicates whether segments are united.
ignoreFormatting boolean The value that indicates whether text formatting is ignored.
rectangle Rectangle The rectangular area that contains page areas.
rectangleTolerance double The size of the border that is ignored when captured by the rectangular area. It is measured as a fraction of the text item height.

getExpression()

public String getExpression()

Gets the regular expression.

Returns: java.lang.String - A string that represents the regular expression.

isMatchCase()

public boolean isMatchCase()

Gets the value that indicates whether text case is taken into account.

Returns: boolean - true if text case is taken into account; otherwise, false.
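To make the effect of this flag concrete, here is a minimal sketch built on java.util.regex rather than on GroupDocs.Parser. It assumes that the library's expression behaves like a standard Java regular expression and that matchCase = false corresponds to case-insensitive matching. MatchCaseSketch and its matches method are hypothetical names.

```java
import java.util.regex.Pattern;

// Hypothetical helper that mimics the matchCase switch with the standard regex API.
final class MatchCaseSketch {
    // Returns true if the expression occurs anywhere in the text.
    // When matchCase is false, letter case is ignored during matching.
    static boolean matches(String expression, String text, boolean matchCase) {
        int flags = matchCase ? 0 : (Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE);
        return Pattern.compile(expression, flags).matcher(text).find();
    }

    public static void main(String[] args) {
        String text = "INVOICE No. 42";
        System.out.println(matches("invoice", text, false)); // case ignored
        System.out.println(matches("invoice", text, true));  // case respected
    }
}
```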

isUniteSegments()

public boolean isUniteSegments()

Gets the value that indicates whether segments are united.

Returns: boolean - true if segments are united; otherwise, false.

isIgnoreFormatting()

public boolean isIgnoreFormatting()

Gets the value that indicates whether text formatting is ignored.

Returns: boolean - true if text formatting is ignored; otherwise, false.

isUseOcr()

public boolean isUseOcr()

Gets the value that indicates whether the OCR Connector is used to extract text.

Returns: boolean - true if the OCR functionality is used; otherwise, false.

getOcrOptions()

public OcrOptions getOcrOptions()

Gets the additional options for OCR functionality.

Returns: OcrOptions - An instance of the OcrOptions class with the additional OCR options.