Parser (PDF Clown 0.0.8 API Reference)

it.stefanochizzolini.clown.documents.contents.tokens
Class Parser

java.lang.Object
  it.stefanochizzolini.clown.documents.contents.tokens.Parser

public class Parser
extends Object

Content stream parser [PDF:1.6:3.7.1].

Version:: 0.0.8
Author:: Stefano Chizzolini (http://www.stefanochizzolini.it)

Constructor Summary
`Parser(PdfDataObject contentStream)` For internal use only.

Method Summary
`PdfDataObject`	`getContentStream()` Gets the content stream on which parsing is done.
`protected static int`	`getHex(int c)`
`long`	`getLength()`
`long`	`getPosition()`
`IInputStream`	`getStream()` Gets the current stream.
`int`	`getStreamIndex()` Gets the current stream index.
`Object`	`getToken()` Gets the currently-parsed token.
`TokenTypeEnum`	`getTokenType()` Gets the currently-parsed token type.
`protected static boolean`	`isDelimiter(int c)` Evaluates whether a character is a delimiter [PDF:1.6:3.1.1].
`protected static boolean`	`isEOL(int c)` Evaluates whether a character is an EOL marker [PDF:1.6:3.1.1].
`protected static boolean`	`isWhitespace(int c)` Evaluates whether a character is a white-space character [PDF:1.6:3.1.1].
`boolean`	`moveNext()` Parses the next token [PDF:1.6:3.1].
`boolean`	`moveNext(int offset)`
`ContentObject`	`parseContentObject()` Parses the next content object [PDF:1.6:4.1], whether it is a single operation or a graphics object.
`List<ContentObject>`	`parseContentObjects()`
`Operation`	`parseOperation()`
`protected PdfDirectObject`	`parsePdfObject()` Parses the current PDF object [PDF:1.6:3.2].
`void`	`seek(long position)`
`void`	`skip(long offset)`
`boolean`	`skipWhitespace()` Moves the pointer to the last whitespace character after the current position, so that the next read returns the first non-whitespace character.

Methods inherited from class java.lang.Object
`clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait`

Constructor Detail

Parser

public Parser(PdfDataObject contentStream)

For internal use only.

Method Detail

getHex

protected static int getHex(int c)

isDelimiter

protected static boolean isDelimiter(int c)

Evaluates whether a character is a delimiter [PDF:1.6:3.1.1].

isEOL

protected static boolean isEOL(int c)

Evaluates whether a character is an EOL marker [PDF:1.6:3.1.1].

isWhitespace

protected static boolean isWhitespace(int c)

Evaluates whether a character is a white-space character [PDF:1.6:3.1.1].
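
The three character tests above can be sketched from the character classes defined in the PDF specification: white-space is NUL, HT, LF, FF, CR and SP; EOL markers are CR and LF; delimiters are `( ) < > [ ] { } / %`. The class name `PdfCharClasses` is hypothetical and the code is a standalone illustration, not PDF Clown's actual implementation.

```java
/** Illustrative helper (hypothetical name); not part of PDF Clown. */
final class PdfCharClasses {
    private PdfCharClasses() {}

    /** White-space characters: NUL (0), HT (9), LF (10), FF (12), CR (13), SP (32). */
    static boolean isWhitespace(int c) {
        return c == 0 || c == 9 || c == 10 || c == 12 || c == 13 || c == 32;
    }

    /** EOL markers: CR (13) and LF (10). */
    static boolean isEOL(int c) {
        return c == 13 || c == 10;
    }

    /** Delimiter characters: ( ) < > [ ] { } / % */
    static boolean isDelimiter(int c) {
        return c >= 0 && "()<>[]{}/%".indexOf(c) >= 0;
    }
}
```

Under this classification, any character that is neither white-space nor a delimiter is a regular character, which is what a tokenizer accumulates into names, numbers and operators.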

getContentStream

public PdfDataObject getContentStream()

Gets the content stream on which parsing is done.

Remarks

A content stream may be made up of either a single stream or an array of streams.

getLength

public long getLength()

getPosition

public long getPosition()

getStream

public IInputStream getStream()

Gets the current stream.

getStreamIndex

public int getStreamIndex()

Gets the current stream index.
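
Because a content stream may be an array of streams (see the remark under getContentStream), a reader has to move from one stream to the next while remembering which one it is in. The following is a minimal sketch of that idea over plain byte arrays. `MultiStreamReader` and its methods are hypothetical names chosen for illustration; how PDF Clown itself tracks the current stream is not documented on this page.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Illustrative reader over several byte arrays (hypothetical; not PDF Clown code). */
final class MultiStreamReader {
    private final List<byte[]> streams;
    private int streamIndex = 0; // which array is currently being read
    private int offset = 0;      // next unread byte within that array

    MultiStreamReader(List<byte[]> streams) {
        if (streams.isEmpty()) {
            throw new IllegalArgumentException("At least one stream is required.");
        }
        this.streams = streams;
    }

    /** Convenience factory for the example: one stream per string. */
    static MultiStreamReader of(String... parts) {
        List<byte[]> list = new ArrayList<byte[]>();
        for (String part : parts) {
            list.add(part.getBytes(StandardCharsets.US_ASCII));
        }
        return new MultiStreamReader(list);
    }

    /** Returns the next byte, crossing stream boundaries, or -1 when all streams are exhausted. */
    int read() {
        while (true) {
            byte[] current = streams.get(streamIndex);
            if (offset < current.length) {
                return current[offset++] & 0xFF;
            }
            if (streamIndex == streams.size() - 1) {
                return -1; // no further stream to move to
            }
            streamIndex++;
            offset = 0;
        }
    }

    int getStreamIndex() {
        return streamIndex;
    }
}
```

In this sketch the stream index only changes when a read actually needs a byte from the following stream, so it always names the stream that supplied the most recent byte.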

getToken

public Object getToken()

Gets the currently-parsed token.

Returns:: The current token.

getTokenType

public TokenTypeEnum getTokenType()

Gets the currently-parsed token type.

Returns:: The current token type.

moveNext

public boolean moveNext(int offset)
                 throws FileFormatException

Parameters:: offset - Number of tokens to be skipped before reaching the intended one.
Throws:: FileFormatException

moveNext

public boolean moveNext()
                 throws FileFormatException

Parses the next token [PDF:1.6:3.1].

Contract

Preconditions:
1. To properly parse the current token, the pointer MUST be just before its start (leading whitespace is ignored).
Postconditions:
1. When this method terminates, the pointer IS at the last byte of the current token.
Invariants:
1. The byte-level position of the pointer IS, at any time during token parsing, at the end of the current token (where the 'current token' represents the token-level position of the pointer).
Side-effects:
1. See Postconditions.

Returns:: Whether a new token was found.
Throws:: FileFormatException
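
The contract above, together with the `offset` parameter of the other overload, can be illustrated with a toy tokenizer. This is a minimal sketch under simplifying assumptions: tokens are separated by white-space only (delimiters, strings and comments are ignored), and `MiniTokenizer` with all of its members is a hypothetical stand-in, not the PDF Clown implementation.

```java
import java.nio.charset.StandardCharsets;

/** Toy white-space tokenizer (hypothetical; not PDF Clown's Parser). */
final class MiniTokenizer {
    private final byte[] data;
    private int position = -1; // index of the last byte consumed; -1 means before the first byte
    private String token;

    MiniTokenizer(String text) {
        this.data = text.getBytes(StandardCharsets.US_ASCII);
    }

    private static boolean isWhitespace(int c) {
        return c == 0 || c == 9 || c == 10 || c == 12 || c == 13 || c == 32;
    }

    /** Parses the next token; afterwards the pointer is at the token's last byte. */
    boolean moveNext() {
        int i = position + 1;
        // Precondition: the pointer is just before the token; leading white-space is ignored.
        while (i < data.length && isWhitespace(data[i])) i++;
        if (i >= data.length) {
            token = null;
            position = data.length - 1;
            return false; // no further token
        }
        int start = i;
        while (i < data.length && !isWhitespace(data[i])) i++;
        token = new String(data, start, i - start, StandardCharsets.US_ASCII);
        position = i - 1; // Postcondition: pointer at the last byte of the current token.
        return true;
    }

    /** Skips 'offset' tokens, then parses the intended one. */
    boolean moveNext(int offset) {
        for (int skipped = 0; skipped < offset; skipped++) {
            if (!moveNext()) return false;
        }
        return moveNext();
    }

    String getToken() { return token; }

    int getPosition() { return position; }
}
```

For the input `"  10 20 re"`, the first `moveNext()` yields `10` with the pointer at index 3 (the `0`), so the next call starts just before the following token, as the precondition requires.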

parseContentObject

public ContentObject parseContentObject()
                                 throws FileFormatException

Parses the next content object [PDF:1.6:4.1], whether it is a single operation or a graphics object.

Throws:: FileFormatException
Since:: 0.0.4
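
In a PDF content stream, an operation is written in postfix form: its operands come first and the operator keyword closes it. The sketch below groups an already-tokenized sequence into operations on that basis. It is an illustration only: `OperationGrouper`, `Op` and the simplified operand test are hypothetical, and graphics objects (which span several operations) are not modelled.

```java
import java.util.ArrayList;
import java.util.List;

/** Groups postfix tokens into operations (hypothetical sketch; not PDF Clown code). */
final class OperationGrouper {
    /** One operator together with the operands that preceded it. */
    static final class Op {
        final String operator;
        final List<String> operands;

        Op(String operator, List<String> operands) {
            this.operator = operator;
            this.operands = operands;
        }
    }

    /** Simplified test: numbers, names, strings and arrays count as operands. */
    static boolean isOperand(String token) {
        if (token.isEmpty()) return false;
        char first = token.charAt(0);
        if (first == '/' || first == '(' || first == '<' || first == '[') return true;
        return token.matches("[+-]?(\\d+\\.?\\d*|\\.\\d+)");
    }

    static List<Op> group(List<String> tokens) {
        List<Op> operations = new ArrayList<Op>();
        List<String> pending = new ArrayList<String>();
        for (String token : tokens) {
            if (isOperand(token)) {
                pending.add(token); // operands accumulate until an operator arrives
            } else {
                operations.add(new Op(token, new ArrayList<String>(pending)));
                pending.clear();
            }
        }
        if (!pending.isEmpty()) {
            throw new IllegalArgumentException("Operands without a closing operator: " + pending);
        }
        return operations;
    }
}
```

With the tokens `10 20 m 30 40 l S`, this produces three operations: `m` with operands `[10, 20]`, `l` with operands `[30, 40]`, and `S` with no operands.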

parseContentObjects

public List<ContentObject> parseContentObjects()
                                        throws FileFormatException

Throws:: FileFormatException

parseOperation

public Operation parseOperation()
                         throws FileFormatException

Throws:: FileFormatException

parsePdfObject

protected PdfDirectObject parsePdfObject()
                                  throws FileFormatException

Parses the current PDF object [PDF:1.6:3.2].

Contract

Preconditions:
1. When this method is invoked, the pointer MUST be at the first token of the requested object.
Postconditions:
1. When this method terminates, the pointer IS at the last token of the requested object.
Invariants:
1. (none).
Side-effects:
1. See Postconditions.

Throws:: FileFormatException

seek

public void seek(long position)

skip

public void skip(long offset)

skipWhitespace

public boolean skipWhitespace()

Moves the pointer to the last whitespace character after the current position, so that the next read returns the first non-whitespace character.
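
The pointer movement described above can be sketched over a byte array as follows. This is an illustration under assumptions: the meaning of the boolean return value is not documented here, so the sketch assumes it reports whether a non-whitespace byte remains to be read; `WhitespaceSkipper` and its members are hypothetical names.

```java
import java.nio.charset.StandardCharsets;

/** Illustrative pointer over a byte array (hypothetical; not PDF Clown code). */
final class WhitespaceSkipper {
    private final byte[] data;
    private int position; // index of the byte most recently consumed

    WhitespaceSkipper(String text, int position) {
        this.data = text.getBytes(StandardCharsets.US_ASCII);
        this.position = position;
    }

    private static boolean isWhitespace(int c) {
        return c == 0 || c == 9 || c == 10 || c == 12 || c == 13 || c == 32;
    }

    /**
     * Advances onto the last white-space byte of the run that follows the
     * current position. Assumption: returns whether any byte is left to read.
     */
    boolean skipWhitespace() {
        while (position + 1 < data.length && isWhitespace(data[position + 1])) {
            position++;
        }
        return position + 1 < data.length;
    }

    /** Reads the byte after the pointer, or -1 at the end of the data. */
    int read() {
        return position + 1 < data.length ? data[++position] & 0xFF : -1;
    }

    int getPosition() { return position; }
}
```

Stopping on the last white-space byte, rather than on the first non-whitespace one, means the subsequent read consumes the first significant byte without any need to step back.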
