Constants and Enumerations#

Constants and enumerations of MuPDF as implemented by MuPDF.NET. Each of the following variables is accessible.

Constants#

csRGB#

Predefined RGB colorspace ColorSpace(Utils.CS_RGB).

Type:

ColorSpace

csGRAY#

Predefined GRAY colorspace ColorSpace(Utils.CS_GRAY).

Type:

ColorSpace

csCMYK#

Predefined CMYK colorspace ColorSpace(Utils.CS_CMYK).

Type:

ColorSpace

CS_RGB#

1 – Type of ColorSpace is RGB

Type:

int

CS_GRAY#

2 – Type of ColorSpace is GRAY

Type:

int

CS_CMYK#

3 – Type of ColorSpace is CMYK

Type:

int

MUPDF_VERSION#

MuPDF version as a Tuple of integers, (major, minor, patch).

Type:

Tuple

VERSION#

(mupdfnet_version, mupdf_version, timestamp) – combined version information where timestamp is the generation point in time formatted as “YYYYMMDDhhmmss”.

Type:

Tuple
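The timestamp element uses the fixed-width “YYYYMMDDhhmmss” layout, so it can be turned into a DateTime with a custom format string. The following is a minimal sketch: the class name VersionTimestamp and the sample value are made up for illustration, and how the timestamp string is read out of VERSION is left to the caller.

```csharp
using System;
using System.Globalization;

public static class VersionTimestamp
{
    // Parses a "YYYYMMDDhhmmss" string such as the timestamp element of VERSION.
    public static DateTime Parse(string timestamp)
    {
        return DateTime.ParseExact(
            timestamp,
            "yyyyMMddHHmmss",              // .NET spelling of YYYYMMDDhhmmss (24-hour clock)
            CultureInfo.InvariantCulture,
            DateTimeStyles.None);
    }

    public static void Main()
    {
        // Made-up sample value, not a real MuPDF.NET build timestamp.
        DateTime generated = Parse("20240131093000");
        Console.WriteLine(generated.ToString("yyyy-MM-dd HH:mm:ss", CultureInfo.InvariantCulture));
        // prints 2024-01-31 09:30:00
    }
}
```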

Document Permissions#

Code                    Permitted Action
PDF_PERM_PRINT          Print the document
PDF_PERM_MODIFY         Modify the document’s contents
PDF_PERM_COPY           Copy or otherwise extract text and graphics
PDF_PERM_ANNOTATE       Add or modify text annotations and interactive form fields
PDF_PERM_FORM           Fill in forms and sign the document
PDF_PERM_ACCESSIBILITY  Obsolete, always permitted
PDF_PERM_ASSEMBLE       Insert, rotate, or delete pages, bookmarks, thumbnail images
PDF_PERM_PRINT_HQ       High quality printing
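Permission codes are bit flags, so an individual permission is tested with a bitwise AND against the document's permission value. The sketch below is self-contained and does not call MuPDF.NET. The numeric values are an assumption taken from the permission bits of the PDF specification (this page lists only the names), and PermissionDemo and IsPermitted are hypothetical names.

```csharp
using System;

public static class PermissionDemo
{
    // Assumed values: permission bits as defined by the PDF specification.
    // In real code, use the constants provided by MuPDF.NET instead.
    public const int PDF_PERM_PRINT = 1 << 2;
    public const int PDF_PERM_MODIFY = 1 << 3;
    public const int PDF_PERM_COPY = 1 << 4;
    public const int PDF_PERM_ANNOTATE = 1 << 5;
    public const int PDF_PERM_FORM = 1 << 8;
    public const int PDF_PERM_ACCESSIBILITY = 1 << 9;
    public const int PDF_PERM_ASSEMBLE = 1 << 10;
    public const int PDF_PERM_PRINT_HQ = 1 << 11;

    // True if every bit of 'flag' is set in 'permissions'.
    public static bool IsPermitted(int permissions, int flag)
    {
        return (permissions & flag) == flag;
    }

    public static void Main()
    {
        // Hypothetical permission value: printing and copying only.
        int permissions = PDF_PERM_PRINT | PDF_PERM_COPY | PDF_PERM_ACCESSIBILITY;

        Console.WriteLine(IsPermitted(permissions, PDF_PERM_PRINT));   // True
        Console.WriteLine(IsPermitted(permissions, PDF_PERM_MODIFY));  // False
    }
}
```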

PDF Optional Content Codes#

Code           Meaning
PDF_OC_ON      Set an OCG to ON temporarily
PDF_OC_TOGGLE  Toggle OCG status temporarily
PDF_OC_OFF     Set an OCG to OFF temporarily

PDF Encryption Method Codes#

Code                 Meaning
PDF_ENCRYPT_KEEP     do not change
PDF_ENCRYPT_NONE     remove any encryption
PDF_ENCRYPT_RC4_40   RC4 40 bit
PDF_ENCRYPT_RC4_128  RC4 128 bit
PDF_ENCRYPT_AES_128  Advanced Encryption Standard 128 bit
PDF_ENCRYPT_AES_256  Advanced Encryption Standard 256 bit
PDF_ENCRYPT_UNKNOWN  unknown

Font File Extensions#

The table shows the file extensions to use when saving font file buffers extracted from a PDF. The extension string is returned by Document.GetPageFonts(), Page.GetFonts() and Document.ExtractFont().

Ext  Description
ttf  TrueType font
pfa  PostScript for ASCII font (various subtypes)
cff  Type1C font (compressed font equivalent to Type1)
cid  character identifier font (PostScript format)
otf  OpenType font
n/a  not extractable, e.g. Base-14-Fonts, Type 3 fonts and others
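When saving an extracted font buffer, the extension string can be appended to the font name, with “n/a” treated as "nothing to save". This is a minimal sketch: FontFileName.Build is a hypothetical helper, the font names are placeholders, and obtaining the name, extension and buffer from MuPDF.NET is not shown.

```csharp
using System;

public static class FontFileName
{
    // Hypothetical helper: builds a file name from a font name and the
    // extension string described in the table above.
    // Returns null for "n/a" (or an empty extension), since such fonts
    // cannot be extracted.
    public static string Build(string fontName, string ext)
    {
        if (string.IsNullOrEmpty(ext) || ext == "n/a")
        {
            return null;
        }
        return fontName + "." + ext;
    }

    public static void Main()
    {
        Console.WriteLine(Build("ExampleFont", "ttf"));        // ExampleFont.ttf
        Console.WriteLine(Build("Helvetica", "n/a") == null);  // True
    }
}
```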

Inbuilt Fonts#

The following fonts are built into MuPDF.NET and can be used when creating fonts without referencing an external font file:

Reference  Font name
empty      Noto Serif Regular (default)
helv       Helvetica
heit       Helvetica-Oblique
hebo       Helvetica-Bold
hebi       Helvetica-BoldOblique
cour       Courier
coit       Courier-Oblique
cobo       Courier-Bold
cobi       Courier-BoldOblique
tiro       Times-Roman
tibo       Times-Bold
tiit       Times-Italic
tibi       Times-BoldItalic
symb       Symbol
zadb       ZapfDingbats

Example

MuPDF.NET.Font fontA = new MuPDF.NET.Font(""); // choose the Noto Serif Regular font
MuPDF.NET.Font fontB = new MuPDF.NET.Font("helv"); // choose the Helvetica font

Text Alignment#

TEXT_ALIGN_LEFT#

0 – align left.

TEXT_ALIGN_CENTER#

1 – align center.

TEXT_ALIGN_RIGHT#

2 – align right.

TEXT_ALIGN_JUSTIFY#

3 – align justify.

Text Extraction Flags#

Option bits controlling the amount of data that is parsed into a TextPage. This class is mainly used internally in MuPDF.NET.

For the MuPDF.NET programmer, a combination of these values (built with C#'s | operator, or simply with +) is passed in the flags integer, a parameter of all text search and text extraction methods. Depending on the individual method, different default combinations of the values are used. Choose a value that fits your situation. In particular, switch off image extraction unless you really need the images: the impact on performance and memory is significant.
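The sketch below shows how such a flags value can be built and how a single bit is switched off again. It is self-contained and does not call MuPDF.NET: the three constants are local copies using the values documented for each flag in this section, and the class name TextFlagDemo is made up.

```csharp
using System;

public static class TextFlagDemo
{
    // Local copies of three documented flag values (illustration only).
    public const int TEXT_PRESERVE_LIGATURES = 1;
    public const int TEXT_PRESERVE_WHITESPACE = 2;
    public const int TEXT_PRESERVE_IMAGES = 4;

    // Clears one flag bit and leaves all other bits untouched.
    public static int SwitchOff(int flags, int flag)
    {
        return flags & ~flag;
    }

    public static void Main()
    {
        int flags = TEXT_PRESERVE_LIGATURES | TEXT_PRESERVE_WHITESPACE | TEXT_PRESERVE_IMAGES;
        Console.WriteLine(flags);                                   // 7

        // Switch off image extraction.
        Console.WriteLine(SwitchOff(flags, TEXT_PRESERVE_IMAGES));  // 3

        // | is safe when a flag is repeated, + is not.
        Console.WriteLine(TEXT_PRESERVE_IMAGES | TEXT_PRESERVE_IMAGES);  // 4
        Console.WriteLine(TEXT_PRESERVE_IMAGES + TEXT_PRESERVE_IMAGES);  // 8
    }
}
```

Because + only matches | when every flag occurs exactly once, | is the safer choice when flags come from several places.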

TEXT_PRESERVE_LIGATURES#

1 – If set, ligatures are passed through to the application in their original form. Otherwise ligatures are expanded into their constituent parts, e.g. the ligature “ffi” is expanded into three separate characters f, f and i. Default is “on” in MuPDF.NET. MuPDF supports the following 7 ligatures: “ff”, “fi”, “fl”, “ffi”, “ffl”, “ft”, “st”.

TEXT_PRESERVE_WHITESPACE#

2 – If set, whitespace is passed through. Otherwise any type of horizontal whitespace (including horizontal tabs) will be replaced with space characters of variable width. Default is “on” in MuPDF.NET.

TEXT_PRESERVE_IMAGES#

4 – If set, then images will be stored in the TextPage. This causes the presence of (usually large!) binary image content in the output of text extractions of types “blocks”, “dict”, “json”, “rawdict”, “rawjson”, “html”, and “xhtml” and is the default there. If used with “blocks” however, only image metadata will be returned, not the image itself.

TEXT_INHIBIT_SPACES#

8 – If set, MuPDF will not try to add missing space characters where there are large gaps between characters. In PDF, the creator often does not insert spaces to point to the next character’s position, but will provide the direct location address. The default in MuPDF.NET is “off”, so spaces will be generated.

TEXT_DEHYPHENATE#

16 – Ignore hyphens at line ends and join with next line. Used internally with the text search functions. However, it is generally available: if on, text extractions will return joined text lines (or spans) with the ending hyphen of the first line eliminated. So two separate spans “first meth-” and “od leads to wrong results” on different lines will be joined to one span “first method leads to wrong results” and correspondingly updated bboxes: the characters of the resulting span will no longer have identical y-coordinates.
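The effect on the text can be pictured with plain strings. This is only an illustration of the result described above, not MuPDF's implementation: DehyphenateDemo.Join is a hypothetical helper, it ignores bounding boxes entirely, and joining non-hyphenated lines with a space is an assumption made for the example.

```csharp
using System;

public static class DehyphenateDemo
{
    // Joins two line texts. A trailing hyphen on the first line is dropped
    // and the second line is appended directly; otherwise the lines are
    // joined with a single space (assumption for this example).
    public static string Join(string first, string second)
    {
        if (first.EndsWith("-", StringComparison.Ordinal))
        {
            return first.Substring(0, first.Length - 1) + second;
        }
        return first + " " + second;
    }

    public static void Main()
    {
        Console.WriteLine(Join("first meth-", "od leads to wrong results"));
        // prints: first method leads to wrong results
    }
}
```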

TEXT_PRESERVE_SPANS#

32 – Generate a new line for every span. Not used (“off”) in MuPDF.NET, but available for your use. Every line in “dict”, “json”, “rawdict”, “rawjson” will contain exactly one span.

TEXT_MEDIABOX_CLIP#

64 – If set, characters entirely outside a page’s mediabox will be ignored. This is the default in MuPDF.NET.

TEXT_CID_FOR_UNKNOWN_UNICODE#

128 – If set, use raw character codes instead of U+FFFD. This is the default for text extraction in MuPDF.NET. If you want to detect when encoding information is missing or uncertain, toggle this flag and scan for the presence of U+FFFD (= '\uFFFD' in C#) code points in the resulting text.
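Scanning extracted text for U+FFFD needs no library support. The sketch below assumes the text has already been extracted with this flag switched off. ReplacementCharScan and the sample string are made up for illustration.

```csharp
using System;

public static class ReplacementCharScan
{
    // Counts U+FFFD (REPLACEMENT CHARACTER) code points in the given text.
    public static int Count(string text)
    {
        int count = 0;
        foreach (char c in text)
        {
            if (c == '\uFFFD')
            {
                count++;
            }
        }
        return count;
    }

    public static void Main()
    {
        // Made-up sample standing in for extracted page text.
        string extracted = "ab\uFFFDc\uFFFD";
        Console.WriteLine(Count(extracted));  // 2
    }
}
```

A non-zero count suggests that some characters on the page had missing or uncertain encoding information.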

The following constants represent the default combinations of the above for text extraction and searching:

TEXTFLAGS_TEXT#

TEXT_PRESERVE_LIGATURES | TEXT_PRESERVE_WHITESPACE | TEXT_MEDIABOX_CLIP | TEXT_CID_FOR_UNKNOWN_UNICODE

TEXTFLAGS_WORDS#

TEXT_PRESERVE_LIGATURES | TEXT_PRESERVE_WHITESPACE | TEXT_MEDIABOX_CLIP | TEXT_CID_FOR_UNKNOWN_UNICODE

TEXTFLAGS_BLOCKS#

TEXT_PRESERVE_LIGATURES | TEXT_PRESERVE_WHITESPACE | TEXT_MEDIABOX_CLIP | TEXT_CID_FOR_UNKNOWN_UNICODE

TEXTFLAGS_DICT#

TEXT_PRESERVE_LIGATURES | TEXT_PRESERVE_WHITESPACE | TEXT_MEDIABOX_CLIP | TEXT_PRESERVE_IMAGES | TEXT_CID_FOR_UNKNOWN_UNICODE

TEXTFLAGS_RAWDICT#

TEXT_PRESERVE_LIGATURES | TEXT_PRESERVE_WHITESPACE | TEXT_MEDIABOX_CLIP | TEXT_PRESERVE_IMAGES | TEXT_CID_FOR_UNKNOWN_UNICODE

TEXTFLAGS_HTML#

TEXT_PRESERVE_LIGATURES | TEXT_PRESERVE_WHITESPACE | TEXT_MEDIABOX_CLIP | TEXT_PRESERVE_IMAGES | TEXT_CID_FOR_UNKNOWN_UNICODE

TEXTFLAGS_XHTML#

TEXT_PRESERVE_LIGATURES | TEXT_PRESERVE_WHITESPACE | TEXT_MEDIABOX_CLIP | TEXT_PRESERVE_IMAGES | TEXT_CID_FOR_UNKNOWN_UNICODE

TEXTFLAGS_XML#

TEXT_PRESERVE_LIGATURES | TEXT_PRESERVE_WHITESPACE | TEXT_MEDIABOX_CLIP | TEXT_CID_FOR_UNKNOWN_UNICODE

TEXTFLAGS_SEARCH#

TEXT_PRESERVE_LIGATURES | TEXT_PRESERVE_WHITESPACE | TEXT_MEDIABOX_CLIP | TEXT_DEHYPHENATE
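The default combinations above reduce to three distinct integer values. The sketch below recomputes them from the individual flag values documented in this section and shows that removing the image bit from the “dict” default gives the plain-text default. The constants are local copies, and the names TextDefault, DictDefault and SearchDefault are hypothetical stand-ins, not MuPDF.NET identifiers.

```csharp
using System;

public static class DefaultTextFlags
{
    // Local copies of the documented flag values (illustration only).
    public const int TEXT_PRESERVE_LIGATURES = 1;
    public const int TEXT_PRESERVE_WHITESPACE = 2;
    public const int TEXT_PRESERVE_IMAGES = 4;
    public const int TEXT_DEHYPHENATE = 16;
    public const int TEXT_MEDIABOX_CLIP = 64;
    public const int TEXT_CID_FOR_UNKNOWN_UNICODE = 128;

    // Combination listed for text, words, blocks and XML.
    public const int TextDefault = TEXT_PRESERVE_LIGATURES | TEXT_PRESERVE_WHITESPACE
        | TEXT_MEDIABOX_CLIP | TEXT_CID_FOR_UNKNOWN_UNICODE;

    // Combination listed for dict, rawdict, HTML and XHTML.
    public const int DictDefault = TextDefault | TEXT_PRESERVE_IMAGES;

    // Combination that includes TEXT_DEHYPHENATE (used for searching).
    public const int SearchDefault = TEXT_PRESERVE_LIGATURES | TEXT_PRESERVE_WHITESPACE
        | TEXT_MEDIABOX_CLIP | TEXT_DEHYPHENATE;

    // Removes the image bit from any flags value.
    public static int WithoutImages(int flags)
    {
        return flags & ~TEXT_PRESERVE_IMAGES;
    }

    public static void Main()
    {
        Console.WriteLine(TextDefault);                                // 195
        Console.WriteLine(DictDefault);                                // 199
        Console.WriteLine(SearchDefault);                              // 83
        Console.WriteLine(WithoutImages(DictDefault) == TextDefault);  // True
    }
}
```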

Widget Constants#

Widget Types (field_type)#

PDF_WIDGET_TYPE_UNKNOWN 0
PDF_WIDGET_TYPE_BUTTON 1
PDF_WIDGET_TYPE_CHECKBOX 2
PDF_WIDGET_TYPE_COMBOBOX 3
PDF_WIDGET_TYPE_LISTBOX 4
PDF_WIDGET_TYPE_RADIOBUTTON 5
PDF_WIDGET_TYPE_SIGNATURE 6
PDF_WIDGET_TYPE_TEXT 7

Text Widget Subtypes (text_format)#

PDF_WIDGET_TX_FORMAT_NONE 0
PDF_WIDGET_TX_FORMAT_NUMBER 1
PDF_WIDGET_TX_FORMAT_SPECIAL 2
PDF_WIDGET_TX_FORMAT_DATE 3
PDF_WIDGET_TX_FORMAT_TIME 4

Field Flags (field_flags)#

Common to all field types:

PDF_FIELD_IS_READ_ONLY 1
PDF_FIELD_IS_REQUIRED 1 << 1
PDF_FIELD_IS_NO_EXPORT 1 << 2

Text widgets:

PDF_TX_FIELD_IS_MULTILINE  1 << 12
PDF_TX_FIELD_IS_PASSWORD  1 << 13
PDF_TX_FIELD_IS_FILE_SELECT  1 << 20
PDF_TX_FIELD_IS_DO_NOT_SPELL_CHECK  1 << 22
PDF_TX_FIELD_IS_DO_NOT_SCROLL  1 << 23
PDF_TX_FIELD_IS_COMB  1 << 24
PDF_TX_FIELD_IS_RICH_TEXT  1 << 25

Button widgets:

PDF_BTN_FIELD_IS_NO_TOGGLE_TO_OFF  1 << 14
PDF_BTN_FIELD_IS_RADIO  1 << 15
PDF_BTN_FIELD_IS_PUSHBUTTON  1 << 16
PDF_BTN_FIELD_IS_RADIOS_IN_UNISON  1 << 25

Choice widgets:

PDF_CH_FIELD_IS_COMBO  1 << 17
PDF_CH_FIELD_IS_EDIT  1 << 18
PDF_CH_FIELD_IS_SORT  1 << 19
PDF_CH_FIELD_IS_MULTI_SELECT  1 << 21
PDF_CH_FIELD_IS_DO_NOT_SPELL_CHECK  1 << 22
PDF_CH_FIELD_IS_COMMIT_ON_SEL_CHANGE  1 << 26
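A field_flags value is decoded by testing individual bits. Note that some bit positions are reused across widget kinds in the lists above (for example 1 << 25 is both PDF_TX_FIELD_IS_RICH_TEXT and PDF_BTN_FIELD_IS_RADIOS_IN_UNISON), so the widget type decides which names apply. The sketch below handles text widgets only. The constants are local copies of the values listed above, and FieldFlagDemo.DescribeTextField is a hypothetical helper.

```csharp
using System;
using System.Collections.Generic;

public static class FieldFlagDemo
{
    // Local copies of values listed above (illustration only).
    public const int PDF_FIELD_IS_READ_ONLY = 1;
    public const int PDF_FIELD_IS_REQUIRED = 1 << 1;
    public const int PDF_TX_FIELD_IS_MULTILINE = 1 << 12;
    public const int PDF_TX_FIELD_IS_PASSWORD = 1 << 13;

    // Returns readable names for the bits set in a text widget's field flags.
    public static List<string> DescribeTextField(int fieldFlags)
    {
        var names = new List<string>();
        if ((fieldFlags & PDF_FIELD_IS_READ_ONLY) != 0) names.Add("read-only");
        if ((fieldFlags & PDF_FIELD_IS_REQUIRED) != 0) names.Add("required");
        if ((fieldFlags & PDF_TX_FIELD_IS_MULTILINE) != 0) names.Add("multiline");
        if ((fieldFlags & PDF_TX_FIELD_IS_PASSWORD) != 0) names.Add("password");
        return names;
    }

    public static void Main()
    {
        // Hypothetical flags value: 1 | 4096 = 4097.
        int fieldFlags = PDF_FIELD_IS_READ_ONLY | PDF_TX_FIELD_IS_MULTILINE;
        Console.WriteLine(string.Join(", ", DescribeTextField(fieldFlags)));
        // prints: read-only, multiline
    }
}
```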

PDF Standard Blend Modes#

For an explanation, see the Adobe PDF Reference, page 324:

PDF_BM_Color "Color"
PDF_BM_ColorBurn "ColorBurn"
PDF_BM_ColorDodge "ColorDodge"
PDF_BM_Darken "Darken"
PDF_BM_Difference "Difference"
PDF_BM_Exclusion "Exclusion"
PDF_BM_HardLight "HardLight"
PDF_BM_Hue "Hue"
PDF_BM_Lighten "Lighten"
PDF_BM_Luminosity "Luminosity"
PDF_BM_Multiply "Multiply"
PDF_BM_Normal "Normal"
PDF_BM_Overlay "Overlay"
PDF_BM_Saturation "Saturation"
PDF_BM_Screen "Screen"
PDF_BM_SoftLight "Softlight"

Stamp Annotation Icons#

MuPDF has defined the following icons for rubber stamp annotations:

STAMP_Approved 0
STAMP_AsIs 1
STAMP_Confidential 2
STAMP_Departmental 3
STAMP_Experimental 4
STAMP_Expired 5
STAMP_Final 6
STAMP_ForComment 7
STAMP_ForPublicRelease 8
STAMP_NotApproved 9
STAMP_NotForPublicRelease 10
STAMP_Sold 11
STAMP_TopSecret 12
STAMP_Draft 13