Package org.apache.pdfbox.contentstream
Class PDFStreamEngine
java.lang.Object
org.apache.pdfbox.contentstream.PDFStreamEngine
- Direct Known Subclasses:
PDFGraphicsStreamEngine, PDFMarkedContentExtractor, PDFTextStripper
Processes a PDF content stream and executes certain operations.
Provides a callback interface for clients that want to do things with the stream.
- Author:
- Ben Litchfield
-
Constructor Summary
Constructors
Modifier    Constructor    Description
protected    PDFStreamEngine()    Creates a new PDFStreamEngine.
-
Method Summary
Modifier and Type    Method    Description
final void    addOperator(OperatorProcessor op)    Adds an operator processor to the engine.
protected void    applyTextAdjustment(float tx, float ty)    Applies a text position adjustment from the TJ operator.
void    beginMarkedContentSequence(COSName tag, COSDictionary properties)    Called when a marked content group begins.
void    beginText()    Called when the BT operator is encountered.
void    decreaseLevel()    Decrease the level.
void    endMarkedContentSequence()    Called when a marked content group ends.
void    endText()    Called when the ET operator is encountered.
PDAppearanceStream    getAppearance(PDAnnotation annotation)    Returns the appearance stream to process for the given annotation.
PDPage    getCurrentPage()
int    getGraphicsStackSize()
PDGraphicsState    getGraphicsState()
Matrix    getInitialMatrix()    Gets the stream's initial matrix.
int    getLevel()    Get the current level.
PDResources    getResources()
Matrix    getTextLineMatrix()
Matrix    getTextMatrix()
void    increaseLevel()    Increase the level.
protected void    operatorException(Operator operator, List<COSBase> operands, IOException e)    Called when an exception is thrown by an operator.
protected void    processAnnotation(PDAnnotation annotation, PDAppearanceStream appearance)    Process the given annotation with the specified appearance stream.
protected void    processChildStream(PDContentStream contentStream, PDPage page)    Process a child stream of the given page.
void    processOperator(String operation, List<COSBase> arguments)    This is used to handle an operation.
protected void    processOperator(Operator operator, List<COSBase> operands)    This is used to handle an operation.
void    processPage(PDPage page)    This will initialize and process the contents of the stream.
protected void    processSoftMask(PDTransparencyGroup group)    Processes a soft mask transparency group stream.
protected final void    processTilingPattern(PDTilingPattern tilingPattern, PDColor color, PDColorSpace colorSpace)    Process the given tiling pattern.
protected final void    processTilingPattern(PDTilingPattern tilingPattern, PDColor color, PDColorSpace colorSpace, Matrix patternMatrix)    Process the given tiling pattern.
protected void    processTransparencyGroup(PDTransparencyGroup group)    Processes a transparency group stream.
protected void    processType3Stream(PDType3CharProc charProc, Matrix textRenderingMatrix)    Processes a Type 3 character stream.
void    registerOperatorProcessor(String operator, OperatorProcessor op)    Deprecated. Use addOperator(OperatorProcessor) instead.
protected final void    restoreGraphicsStack(Deque<PDGraphicsState> snapshot)    Restores the entire graphics stack.
void    restoreGraphicsState()    Pops the current graphics state from the stack.
protected final Deque<PDGraphicsState>    saveGraphicsStack()    Saves the entire graphics stack.
void    saveGraphicsState()    Pushes the current graphics state to the stack.
void    setLineDashPattern(COSArray array, int phase)
void    setTextLineMatrix(Matrix value)
void    setTextMatrix(Matrix value)
void    showAnnotation(PDAnnotation annotation)    Shows the given annotation.
protected void    showFontGlyph(Matrix textRenderingMatrix, PDFont font, int code, String unicode, Vector displacement)    Deprecated. Use showFontGlyph(Matrix, PDFont, int, Vector) instead.
protected void    showFontGlyph(Matrix textRenderingMatrix, PDFont font, int code, Vector displacement)    Called when a glyph is to be processed.
void    showForm(PDFormXObject form)    Shows a form from the content stream.
protected void    showGlyph(Matrix textRenderingMatrix, PDFont font, int code, String unicode, Vector displacement)    Deprecated. Use showGlyph(Matrix, PDFont, int, Vector) instead.
protected void    showGlyph(Matrix textRenderingMatrix, PDFont font, int code, Vector displacement)    Called when a glyph is to be processed.
protected void    showText(byte[] string)    Process text from the PDF Stream.
void    showTextString(byte[] string)    Called when a string of text is to be shown.
void    showTextStrings(COSArray array)    Called when a string of text with spacing adjustments is to be shown.
void    showTransparencyGroup(PDTransparencyGroup form)    Shows a transparency group from the content stream.
protected void    showType3Glyph(Matrix textRenderingMatrix, PDType3Font font, int code, String unicode, Vector displacement)    Deprecated. Use showType3Glyph(Matrix, PDType3Font, int, Vector) instead.
protected void    showType3Glyph(Matrix textRenderingMatrix, PDType3Font font, int code, Vector displacement)    Called when a glyph is to be processed.
Point2D.Float    transformedPoint(float x, float y)    Transforms a point using the CTM.
protected float    transformWidth(float width)    Transforms a width using the CTM.
protected void    unsupportedOperator(Operator operator, List<COSBase> operands)    Called when an unsupported operator is encountered.
-
Constructor Details
-
PDFStreamEngine
protected PDFStreamEngine()
Creates a new PDFStreamEngine.
-
-
Method Details
-
registerOperatorProcessor
public void registerOperatorProcessor(String operator, OperatorProcessor op)
Deprecated. Use addOperator(OperatorProcessor) instead.
Register a custom operator processor with the engine.
- Parameters:
operator - The operator as a string.
op - Processor instance.
-
addOperator
public final void addOperator(OperatorProcessor op)
Adds an operator processor to the engine.
- Parameters:
op - operator processor
-
processPage
public void processPage(PDPage page) throws IOException
This will initialize and process the contents of the stream.
- Parameters:
page - the page to process
- Throws:
IOException - if there is an error accessing the stream
-
showTransparencyGroup
public void showTransparencyGroup(PDTransparencyGroup form) throws IOException
Shows a transparency group from the content stream.
- Parameters:
form - transparency group (form) XObject
- Throws:
IOException - if the transparency group cannot be processed
-
showForm
public void showForm(PDFormXObject form) throws IOException
Shows a form from the content stream.
- Parameters:
form - form XObject
- Throws:
IOException - if the form cannot be processed
-
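processSoftMask
protected void processSoftMask(PDTransparencyGroup group) throws IOException
Processes a soft mask transparency group stream.
- Parameters:
group - the transparency group.
- Throws:
IOException
-
processTransparencyGroup
protected void processTransparencyGroup(PDTransparencyGroup group) throws IOException
Processes a transparency group stream.
- Parameters:
group - the transparency group.
- Throws:
IOException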
-
processType3Stream
protected void processType3Stream(PDType3CharProc charProc, Matrix textRenderingMatrix) throws IOException
Processes a Type 3 character stream.
- Parameters:
charProc - Type 3 character procedure
textRenderingMatrix - the Text Rendering Matrix
- Throws:
IOException - if there is an error reading or parsing the character content stream.
-
processAnnotation
protected void processAnnotation(PDAnnotation annotation, PDAppearanceStream appearance) throws IOException
Process the given annotation with the specified appearance stream.
- Parameters:
annotation - The annotation containing the appearance stream to process.
appearance - The appearance stream to process.
- Throws:
IOException - If there is an error reading or parsing the appearance content stream.
-
processTilingPattern
protected final void processTilingPattern(PDTilingPattern tilingPattern, PDColor color, PDColorSpace colorSpace) throws IOException
Process the given tiling pattern.
- Parameters:
tilingPattern - the tiling pattern
color - color to use, if this is an uncoloured pattern, otherwise null.
colorSpace - color space to use, if this is an uncoloured pattern, otherwise null.
- Throws:
IOException - if there is an error reading or parsing the tiling pattern content stream.
-
processTilingPattern
protected final void processTilingPattern(PDTilingPattern tilingPattern, PDColor color, PDColorSpace colorSpace, Matrix patternMatrix) throws IOException
Process the given tiling pattern. Allows the pattern matrix to be overridden for custom rendering.
- Parameters:
tilingPattern - the tiling pattern
color - color to use, if this is an uncoloured pattern, otherwise null.
colorSpace - color space to use, if this is an uncoloured pattern, otherwise null.
patternMatrix - the pattern matrix, may be overridden for custom rendering.
- Throws:
IOException - if there is an error reading or parsing the tiling pattern content stream.
-
showAnnotation
public void showAnnotation(PDAnnotation annotation) throws IOException
Shows the given annotation.
- Parameters:
annotation - An annotation on the current page.
- Throws:
IOException - If an error occurred reading the annotation
-
getAppearance
public PDAppearanceStream getAppearance(PDAnnotation annotation)
Returns the appearance stream to process for the given annotation. May be used to render a specific appearance such as "hover".
- Parameters:
annotation - The current annotation.
- Returns:
The stream to process.
-
processChildStream
protected void processChildStream(PDContentStream contentStream, PDPage page) throws IOException
Process a child stream of the given page. Cannot be used with processPage(PDPage).
- Parameters:
contentStream - the child content stream
page - the current page
- Throws:
IOException - if there is an exception while processing the stream
-
beginText
public void beginText() throws IOException
Called when the BT operator is encountered. This method is for overriding in subclasses; the default implementation does nothing.
- Throws:
IOException - if there was an error processing the text
-
endText
public void endText() throws IOException
Called when the ET operator is encountered. This method is for overriding in subclasses; the default implementation does nothing.
- Throws:
IOException - if there was an error processing the text
-
showTextString
public void showTextString(byte[] string) throws IOException
Called when a string of text is to be shown.
- Parameters:
string - the encoded text
- Throws:
IOException - if there was an error showing the text
-
showTextStrings
public void showTextStrings(COSArray array) throws IOException
Called when a string of text with spacing adjustments is to be shown.
- Parameters:
array - array of encoded text strings and adjustments
- Throws:
IOException - if there was an error showing the text
-
applyTextAdjustment
protected void applyTextAdjustment(float tx, float ty) throws IOException
Applies a text position adjustment from the TJ operator. May be overridden in subclasses.
- Parameters:
tx - x-translation
ty - y-translation
- Throws:
IOException - if something went wrong
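The relationship between showTextStrings and applyTextAdjustment can be illustrated with a small stand-alone model. This is a sketch, not PDFBox code: the class TextAdjustmentSketch, its fields, and the extra parameters are hypothetical, and glyph advances are ignored. It assumes the rule from the PDF specification that a number in a TJ array is given in thousandths of a unit of text space and is subtracted from the current horizontal (or, in vertical writing mode, vertical) coordinate.

```java
import java.util.Arrays;
import java.util.List;

// Stand-alone model of TJ adjustments; none of these names come from PDFBox.
class TextAdjustmentSketch {
    float textX = 0f; // stand-in for the x translation of the text matrix
    float textY = 0f; // stand-in for the y translation of the text matrix

    // Moves the text position; a subclass could override this to observe gaps.
    void applyTextAdjustment(float tx, float ty) {
        textX += tx;
        textY += ty;
    }

    // Walks a TJ-style array of strings and numbers (hypothetical signature).
    void showTextStrings(List<Object> array, float fontSize,
                         float horizontalScaling, boolean vertical) {
        for (Object item : array) {
            if (item instanceof Number) {
                float tj = ((Number) item).floatValue();
                // Thousandths of a text space unit, subtracted from the position.
                float tx = vertical ? 0f : -tj / 1000f * fontSize * horizontalScaling;
                float ty = vertical ? -tj / 1000f * fontSize : 0f;
                applyTextAdjustment(tx, ty);
            }
            // Strings would be handed to showText(byte[]); omitted in this model.
        }
    }

    public static void main(String[] args) {
        TextAdjustmentSketch engine = new TextAdjustmentSketch();
        engine.showTextStrings(Arrays.<Object>asList("A", -250, "B"), 12f, 1f, false);
        System.out.println("x offset after adjustment: " + engine.textX);
    }
}
```

A negative number in the array therefore moves the following text further along the writing direction, which is why text extraction code often inspects these adjustments to detect word gaps.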
-
showText
protected void showText(byte[] string) throws IOException
Process text from the PDF Stream. You should override this method if you want to perform an action when encoded text is being processed.
- Parameters:
string - the encoded text
- Throws:
IOException - if there is an error processing the string
-
showGlyph
protected void showGlyph(Matrix textRenderingMatrix, PDFont font, int code, String unicode, Vector displacement) throws IOException
Deprecated. Use showGlyph(Matrix, PDFont, int, Vector) instead.
Called when a glyph is to be processed. This method is intended for overriding in subclasses; the default implementation does nothing.
- Parameters:
textRenderingMatrix - the current text rendering matrix, Trm
font - the current font
code - internal PDF character code for the glyph
unicode - the Unicode text for this glyph, or null if the PDF does not provide it
displacement - the displacement (i.e. advance) of the glyph in text space
- Throws:
IOException - if the glyph cannot be processed
-
showGlyph
protected void showGlyph(Matrix textRenderingMatrix, PDFont font, int code, Vector displacement) throws IOException
Called when a glyph is to be processed. This method is intended for overriding in subclasses; the default implementation does nothing.
- Parameters:
textRenderingMatrix - the current text rendering matrix, Trm
font - the current font
code - internal PDF character code for the glyph
displacement - the displacement (i.e. advance) of the glyph in text space
- Throws:
IOException - if the glyph cannot be processed
-
showFontGlyph
protected void showFontGlyph(Matrix textRenderingMatrix, PDFont font, int code, String unicode, Vector displacement) throws IOException
Deprecated. Use showFontGlyph(Matrix, PDFont, int, Vector) instead.
Called when a glyph is to be processed. This method is intended for overriding in subclasses; the default implementation does nothing.
- Parameters:
textRenderingMatrix - the current text rendering matrix, Trm
font - the current font
code - internal PDF character code for the glyph
unicode - the Unicode text for this glyph, or null if the PDF does not provide it
displacement - the displacement (i.e. advance) of the glyph in text space
- Throws:
IOException - if the glyph cannot be processed
-
showFontGlyph
protected void showFontGlyph(Matrix textRenderingMatrix, PDFont font, int code, Vector displacement) throws IOException
Called when a glyph is to be processed. This method is intended for overriding in subclasses; the default implementation does nothing.
- Parameters:
textRenderingMatrix - the current text rendering matrix, Trm
font - the current font
code - internal PDF character code for the glyph
displacement - the displacement (i.e. advance) of the glyph in text space
- Throws:
IOException - if the glyph cannot be processed
-
showType3Glyph
protected void showType3Glyph(Matrix textRenderingMatrix, PDType3Font font, int code, String unicode, Vector displacement) throws IOException
Deprecated. Use showType3Glyph(Matrix, PDType3Font, int, Vector) instead.
Called when a glyph is to be processed. This method is intended for overriding in subclasses; the default implementation does nothing.
- Parameters:
textRenderingMatrix - the current text rendering matrix, Trm
font - the current font
code - internal PDF character code for the glyph
unicode - the Unicode text for this glyph, or null if the PDF does not provide it
displacement - the displacement (i.e. advance) of the glyph in text space
- Throws:
IOException - if the glyph cannot be processed
-
showType3Glyph
protected void showType3Glyph(Matrix textRenderingMatrix, PDType3Font font, int code, Vector displacement) throws IOException
Called when a glyph is to be processed. This method is intended for overriding in subclasses; the default implementation does nothing.
- Parameters:
textRenderingMatrix - the current text rendering matrix, Trm
font - the current font
code - internal PDF character code for the glyph
displacement - the displacement (i.e. advance) of the glyph in text space
- Throws:
IOException - if the glyph cannot be processed
-
beginMarkedContentSequence
public void beginMarkedContentSequence(COSName tag, COSDictionary properties)
Called when a marked content group begins.
- Parameters:
tag - indicates the role or significance of the sequence
properties - optional properties
-
endMarkedContentSequence
public void endMarkedContentSequence()
Called when a marked content group ends.
-
processOperator
public void processOperator(String operation, List<COSBase> arguments) throws IOException
This is used to handle an operation.
- Parameters:
operation - The operation to perform.
arguments - The list of arguments.
- Throws:
IOException - If there is an error processing the operation.
-
processOperator
protected void processOperator(Operator operator, List<COSBase> operands) throws IOException
This is used to handle an operation.
- Parameters:
operator - The operation to perform.
operands - The list of arguments.
- Throws:
IOException - If there is an error processing the operation.
-
unsupportedOperator
protected void unsupportedOperator(Operator operator, List<COSBase> operands) throws IOException
Called when an unsupported operator is encountered.
- Parameters:
operator - The unknown operator.
operands - The list of operands.
- Throws:
IOException - if something went wrong
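How addOperator, processOperator, and unsupportedOperator might fit together can be shown with a stand-alone model. This is a sketch under stated assumptions, not the PDFBox implementation: the Processor interface is a hypothetical stand-in for OperatorProcessor, operands are plain objects rather than COSBase values, and it assumes that processors are looked up by the operator name they report.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Stand-alone model of operator dispatch; these types are not the PDFBox classes.
class OperatorDispatchSketch {
    // Hypothetical stand-in for an operator processor.
    interface Processor {
        String getName();
        void process(List<Object> operands);
    }

    private final Map<String, Processor> operators = new HashMap<>();
    final List<String> events = new ArrayList<>();

    // Registers a processor under the operator name it reports.
    void addOperator(Processor op) {
        operators.put(op.getName(), op);
    }

    // Looks up the processor for the operator and falls back to the callback.
    void processOperator(String name, List<Object> operands) {
        Processor processor = operators.get(name);
        if (processor != null) {
            processor.process(operands);
        } else {
            unsupportedOperator(name, operands);
        }
    }

    // A subclass could override this; the model only records the event.
    void unsupportedOperator(String name, List<Object> operands) {
        events.add("unsupported " + name);
    }

    public static void main(String[] args) {
        OperatorDispatchSketch engine = new OperatorDispatchSketch();
        engine.addOperator(new Processor() {
            public String getName() { return "BT"; }
            public void process(List<Object> operands) { engine.events.add("handled BT"); }
        });
        engine.processOperator("BT", new ArrayList<>());
        engine.processOperator("XX", new ArrayList<>());
        System.out.println(engine.events);
    }
}
```

Keeping the fallback in an overridable method lets a subclass decide whether unknown operators are ignored, logged, or treated as errors.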
-
operatorException
protected void operatorException(Operator operator, List<COSBase> operands, IOException e) throws IOException
Called when an exception is thrown by an operator.
- Parameters:
operator - The unknown operator.
operands - The list of operands.
e - the thrown exception.
- Throws:
IOException - if something went wrong
-
saveGraphicsState
public void saveGraphicsState()
Pushes the current graphics state to the stack.
-
restoreGraphicsState
public void restoreGraphicsState()
Pops the current graphics state from the stack.
-
saveGraphicsStack
protected final Deque<PDGraphicsState> saveGraphicsStack()
Saves the entire graphics stack.
- Returns:
the saved graphics state stack.
-
restoreGraphicsStack
protected final void restoreGraphicsStack(Deque<PDGraphicsState> snapshot)
Restores the entire graphics stack.
- Parameters:
snapshot - the graphics state stack to be restored.
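The difference between saving a single graphics state and saving the entire stack can be sketched with a stand-alone model. The class GraphicsStackSketch and its State type are hypothetical stand-ins (State plays the role of PDGraphicsState), and the sketch assumes that saving the stack hands the old stack to the caller while work continues on a fresh stack seeded with a copy of the current state.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Stand-alone model; State stands in for PDGraphicsState and is hypothetical.
class GraphicsStackSketch {
    static final class State {
        float lineWidth;
        State(float lineWidth) { this.lineWidth = lineWidth; }
        State copy() { return new State(lineWidth); }
    }

    private Deque<State> stack = new ArrayDeque<>();

    GraphicsStackSketch() {
        stack.push(new State(1f));
    }

    State current() { return stack.peek(); }
    int size() { return stack.size(); }

    // Pushes a copy of the current state, like the q operator.
    void saveGraphicsState() { stack.push(current().copy()); }

    // Pops the current state, like the Q operator.
    void restoreGraphicsState() { stack.pop(); }

    // Hands the whole stack to the caller and continues on a one-element stack.
    Deque<State> saveGraphicsStack() {
        Deque<State> snapshot = stack;
        stack = new ArrayDeque<>();
        stack.push(snapshot.peek().copy());
        return snapshot;
    }

    // Discards whatever the nested stream left behind and reinstates the snapshot.
    void restoreGraphicsStack(Deque<State> snapshot) { stack = snapshot; }

    public static void main(String[] args) {
        GraphicsStackSketch engine = new GraphicsStackSketch();
        engine.saveGraphicsState();
        Deque<State> snapshot = engine.saveGraphicsStack();
        engine.current().lineWidth = 5f; // change made by a nested stream
        engine.saveGraphicsState();      // unbalanced save inside the nested stream
        engine.restoreGraphicsStack(snapshot);
        System.out.println(engine.size() + " states, line width " + engine.current().lineWidth);
    }
}
```

Snapshotting the whole stack protects the caller from a nested stream that has unbalanced save and restore operations, which a single push and pop would not.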
-
getGraphicsStackSize
public int getGraphicsStackSize()
- Returns:
Returns the size of the graphicsStack.
-
getGraphicsState
public PDGraphicsState getGraphicsState()
- Returns:
Returns the graphicsState.
-
getTextLineMatrix
public Matrix getTextLineMatrix()
- Returns:
Returns the textLineMatrix.
-
setTextLineMatrix
public void setTextLineMatrix(Matrix value)
- Parameters:
value - The textLineMatrix to set.
-
getTextMatrix
public Matrix getTextMatrix()
- Returns:
Returns the textMatrix.
-
setTextMatrix
public void setTextMatrix(Matrix value)
- Parameters:
value - The textMatrix to set.
-
setLineDashPattern
public void setLineDashPattern(COSArray array, int phase)
- Parameters:
array - dash array
phase - dash phase
-
getResources
public PDResources getResources()
- Returns:
the stream's resources. This is mainly to be used by the OperatorProcessor classes.
-
getCurrentPage
public PDPage getCurrentPage()
- Returns:
the current page.
-
getInitialMatrix
public Matrix getInitialMatrix()
Gets the stream's initial matrix.
- Returns:
the initial matrix.
-
transformedPoint
public Point2D.Float transformedPoint(float x, float y)
Transforms a point using the CTM.
- Parameters:
x - x-coordinate of the point to be transformed.
y - y-coordinate of the point to be transformed.
- Returns:
the transformed point.
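Transforming a point by the CTM (current transformation matrix) amounts to an affine mapping. The following sketch uses only java.awt.geom and is not the PDFBox implementation: the class CtmPointSketch and the float[] representation of the matrix are assumptions for illustration. It takes the six PDF matrix numbers [a b c d e f] and computes x' = a*x + c*y + e and y' = b*x + d*y + f.

```java
import java.awt.geom.AffineTransform;
import java.awt.geom.Point2D;

// Stand-alone model of a CTM point transform; not part of PDFBox.
class CtmPointSketch {
    // Applies a CTM given as the six PDF matrix numbers [a b c d e f].
    static Point2D.Float transformedPoint(float[] ctm, float x, float y) {
        // AffineTransform takes (m00, m10, m01, m11, m02, m12) = (a, b, c, d, e, f).
        AffineTransform at = new AffineTransform(
                ctm[0], ctm[1], ctm[2], ctm[3], ctm[4], ctm[5]);
        Point2D.Float result = new Point2D.Float();
        at.transform(new Point2D.Float(x, y), result);
        return result;
    }

    public static void main(String[] args) {
        // Scale by 2 in both directions, then translate by (10, 20).
        float[] ctm = {2f, 0f, 0f, 2f, 10f, 20f};
        Point2D.Float p = transformedPoint(ctm, 3f, 4f);
        System.out.println(p.x + ", " + p.y);
    }
}
```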
-
transformWidth
protected float transformWidth(float width)
Transforms a width using the CTM.
- Parameters:
width - the width value to be transformed.
- Returns:
the transformed width value.
-
getLevel
public int getLevel()
Get the current level. This can be used to decide whether a recursion has gone too deep and an operation should be skipped to avoid a stack overflow.
- Returns:
the current level.
-
increaseLevel
public void increaseLevel()
Increase the level. Call this before running a potentially recursive operation.
-
decreaseLevel
public void decreaseLevel()
Decrease the level. Call this after running a potentially recursive operation, e.g. in a "finally" block. A log message is shown if the level drops below 0, which indicates that the increaseLevel() and decreaseLevel() calls are not balanced.
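The intended pairing of getLevel(), increaseLevel(), and decreaseLevel() can be sketched without PDFBox. The following is a minimal stand-alone model, not the library's code: LevelGuardSketch, MAX_LEVEL, and visit are hypothetical names, and the depth limit of 3 is an arbitrary value chosen for the example.

```java
// Stand-alone model of the level counter; none of these names come from PDFBox.
class LevelGuardSketch {
    static final int MAX_LEVEL = 3; // arbitrary limit chosen for the example

    private int level = 0;
    private int skipped = 0;

    int getLevel() { return level; }
    int getSkipped() { return skipped; }
    void increaseLevel() { ++level; }
    void decreaseLevel() { --level; }

    // Simulates a potentially recursive operation, such as a form that draws itself.
    void visit(int remainingNesting) {
        if (getLevel() >= MAX_LEVEL) {
            skipped++; // recursion too deep: skip instead of overflowing the stack
            return;
        }
        increaseLevel();
        try {
            if (remainingNesting > 0) {
                visit(remainingNesting - 1);
            }
        } finally {
            decreaseLevel(); // always runs, so the counter stays balanced
        }
    }

    public static void main(String[] args) {
        LevelGuardSketch engine = new LevelGuardSketch();
        engine.visit(10);
        System.out.println("level=" + engine.getLevel() + " skipped=" + engine.getSkipped());
    }
}
```

Because the decrement sits in a "finally" block, the level returns to its starting value even when the nested operation throws.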
-