MimeDir
extends Parser
MimeDir parser.
This class parses iCalendar 2.0 and vCard 2.1, 3.0 and 4.0 files. The parse method returns one of the following two objects:
- Sabre\VObject\Component\VCalendar
- Sabre\VObject\Component\VCard
Table of Contents
- OPTION_FORGIVING = 1
- Turning on this option makes the parser more forgiving.
- OPTION_IGNORE_INVALID_LINES = 2
- If this option is turned on, any lines we cannot parse will be ignored by the reader.
- $charset : string
- By default all input will be assumed to be UTF-8.
- $input : resource
- The input stream.
- $lineBuffer : string|null
- We need to look ahead 1 line every time to see if we need to 'unfold' the next line.
- $lineIndex : mixed
- The real current line number.
- $options : int
- Bitmask of parser options.
- $rawLine : string
- Contains a 'raw' representation of the current line.
- $root : Component
- Root component.
- $startLine : int
- In the case of unfolded lines, this property holds the line number for the start of the line.
- $SUPPORTED_CHARSETS : mixed
- The list of character sets we support when decoding.
- __construct() : mixed
- Creates the parser.
- parse() : Document
- Parses an iCalendar or vCard file.
- setCharset() : mixed
- By default all input will be assumed to be UTF-8.
- setInput() : mixed
- Sets the input buffer. Must be a string or stream.
- unescapeValue() : string|array<string|int, string>
- Unescapes a property value.
- parseDocument() : mixed
- Parses an entire document.
- parseLine() : Node
- Parses a line, and if it hits a component, it will also attempt to parse the entire component.
- readLine() : string
- Reads a single line from the buffer.
- readProperty() : mixed
- Reads a property or component from a line.
- extractQuotedPrintableValue() : string
- Gets the full quoted printable value.
- unescapeParam() : mixed
- Unescapes a parameter value.
Constants
OPTION_FORGIVING
Turning on this option makes the parser more forgiving.
public
mixed
OPTION_FORGIVING
= 1
In the case of the MimeDir parser, this means that the parser will accept slashes and underscores in property names, and it will also attempt to fix Microsoft vCard 2.1's broken line folding.
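As a rough illustration of what "more forgiving" means for property names, the sketch below compares a strict name pattern with a lenient one. The function name `isValidPropertyName` and both character classes are assumptions made for this example; the expressions the parser actually uses may differ.

```php
<?php
// Hypothetical helper: the character classes below are assumptions,
// not the parser's own expressions.
function isValidPropertyName(string $name, bool $forgiving = false): bool
{
    // Strict mode: letters, digits, hyphen and dot.
    // Forgiving mode: additionally accept underscores and slashes.
    $chars = $forgiving ? 'A-Za-z0-9\-._\/' : 'A-Za-z0-9\-.';

    return preg_match('/^[' . $chars . ']+\z/', $name) === 1;
}

var_dump(isValidPropertyName('X-CUSTOM'));         // bool(true)
var_dump(isValidPropertyName('X_CUSTOM/1'));       // bool(false)
var_dump(isValidPropertyName('X_CUSTOM/1', true)); // bool(true)
```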
OPTION_IGNORE_INVALID_LINES
If this option is turned on, any lines we cannot parse will be ignored by the reader.
public
mixed
OPTION_IGNORE_INVALID_LINES
= 2
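Because the two option values are distinct powers of two, they can be combined into the single integer that `$options` expects. The sketch below shows the bitwise arithmetic with local stand-in constants that mirror the documented values; it does not load the library.

```php
<?php
// Local stand-ins mirroring the documented constant values.
const OPTION_FORGIVING = 1;
const OPTION_IGNORE_INVALID_LINES = 2;

// Combine both options into one integer with bitwise OR.
$options = OPTION_FORGIVING | OPTION_IGNORE_INVALID_LINES;

// Check for an individual flag with bitwise AND.
$forgiving = ($options & OPTION_FORGIVING) !== 0;
$ignoreInvalid = ($options & OPTION_IGNORE_INVALID_LINES) !== 0;

var_dump($options, $forgiving, $ignoreInvalid); // int(3), bool(true), bool(true)
```

With the real class, the same value would be written as `MimeDir::OPTION_FORGIVING | MimeDir::OPTION_IGNORE_INVALID_LINES`.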
Properties
$charset
By default all input will be assumed to be UTF-8.
protected
string
$charset
= 'UTF-8'
However, both iCalendar and vCard might be encoded using different character sets. The character set is usually set in the mime-type.
If this is the case, use setCharset to specify that a different encoding will be used. If this is set, the parser will automatically convert all incoming data to UTF-8.
$input
The input stream.
protected
resource
$input
$lineBuffer
We need to look ahead 1 line every time to see if we need to 'unfold' the next line.
protected
string|null
$lineBuffer
If the line we looked ahead at turns out not to be a continuation, we store it here so the next read can use it.
$lineIndex
The real current line number.
protected
mixed
$lineIndex
= 0
$options
Bitmask of parser options.
protected
int
$options
$rawLine
Contains a 'raw' representation of the current line.
protected
string
$rawLine
$root
Root component.
protected
Component
$root
$startLine
In the case of unfolded lines, this property holds the line number for the start of the line.
protected
int
$startLine
= 0
$SUPPORTED_CHARSETS
The list of character sets we support when decoding.
protected
static mixed
$SUPPORTED_CHARSETS
= ['UTF-8', 'ISO-8859-1', 'Windows-1252']
This would be a const expression but for now we need to support PHP 5.5
Methods
__construct()
Creates the parser.
public
__construct([mixed $input = null ], int $options) : mixed
Optionally, it's possible to parse the input stream here.
Parameters
- $input : mixed = null
- $options : int
- Any parser options (OPTION constants).
Return values
mixed
parse()
Parses an iCalendar or vCard file.
public
parse([string|resource|null $input = null ], int $options) : Document
Pass a stream or a string. If null is passed, the existing buffer is used.
Parameters
- $input : string|resource|null = null
- $options : int
Return values
Document
setCharset()
By default all input will be assumed to be UTF-8.
public
setCharset(string $charset) : mixed
However, both iCalendar and vCard might be encoded using different character sets. The character set is usually set in the mime-type.
If this is the case, use setCharset to specify that a different encoding will be used. If this is set, the parser will automatically convert all incoming data to UTF-8.
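The conversion described above can be sketched as follows. This is a minimal illustration, not the parser's own code: the helper name `toUtf8` is hypothetical, the exception on an unsupported charset is a choice made for this sketch, and it assumes the mbstring extension is available.

```php
<?php
// Hypothetical helper; the list mirrors the documented $SUPPORTED_CHARSETS.
function toUtf8(string $data, string $charset = 'UTF-8'): string
{
    $supported = ['UTF-8', 'ISO-8859-1', 'Windows-1252'];
    if (!in_array($charset, $supported, true)) {
        // Rejecting unknown charsets is an assumption of this sketch.
        throw new InvalidArgumentException('Unsupported charset: ' . $charset);
    }

    // UTF-8 input needs no conversion.
    if ($charset === 'UTF-8') {
        return $data;
    }

    // Requires the mbstring extension.
    return mb_convert_encoding($data, 'UTF-8', $charset);
}

$latin1 = "Caf\xE9"; // "Café" encoded as ISO-8859-1
echo toUtf8($latin1, 'ISO-8859-1'), "\n";
```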
Parameters
- $charset : string
Return values
mixed
setInput()
Sets the input buffer. Must be a string or stream.
public
setInput(resource|string $input) : mixed
Parameters
- $input : resource|string
Return values
mixed
unescapeValue()
Unescapes a property value.
public
static unescapeValue(string $input[, string $delimiter = ';' ]) : string|array<string|int, string>
vCard 2.1 says:
- Semi-colons must be escaped in some property values, specifically ADR, ORG and N.
- Semi-colons must be escaped in parameter values, because semi-colons are also used to separate values.
- No mention of escaping backslashes with another backslash.
- Newlines are not escaped either; instead, QUOTED-PRINTABLE is used to span values over more than 1 line.
vCard 3.0 says:
- (rfc2425) Backslashes, newlines (\n or \N) and commas must be escaped, all the time.
- Commas are used as delimiters in multiple values.
- (rfc2426) Adds to this that the semi-colon MUST also be escaped, as in some properties the semi-colon is used as a separator.
- Properties using semi-colons: N, ADR, GEO, ORG
- Both ADR and N's individual parts may be broken up further with a comma.
- Properties using commas: NICKNAME, CATEGORIES
vCard 4.0 (rfc6350) says:
- Commas must be escaped.
- Semi-colons may be escaped, an unescaped semi-colon may be a delimiter, depending on the property.
- Backslashes must be escaped
- Newlines must be escaped as either \N or \n.
- Some compound properties may contain multiple parts themselves, so a comma within a semi-colon delimited property may also be unescaped to denote multiple parts within the compound property.
- Text-properties using semi-colons: N, ADR, ORG, CLIENTPIDMAP.
- Text-properties using commas: NICKNAME, RELATED, CATEGORIES, PID.
Even though the spec says that commas must always be escaped, the example for GEO in Section 6.5.2 seems to violate this.
iCalendar 2.0 (rfc5545) says:
- Commas or semi-colons may be used as delimiters, depending on the property.
- Commas, semi-colons, backslashes, newline (\N or \n) are always escaped, unless they are delimiters.
- Colons shall not be escaped.
- Commas can be considered the 'default delimiter' and are described as the delimiter in cases where the order of the multiple values is insignificant.
- Semi-colons are described as the delimiter for 'structured values'. They are specifically used as a delimiter in REQUEST-STATUS, RRULE, GEO and EXRULE. EXRULE is deprecated, however.
As for the $delimiter parameter:
If the delimiter is not set (null), this method just returns a string. If it is a comma or a semi-colon, the string is split on that character and an array is always returned.
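To make the interaction between escaping and delimiters concrete, here is a simplified, self-contained sketch that splits on unescaped delimiters and resolves backslash escapes in one pass. The function `unescapeAndSplit` is a hypothetical stand-in written for this example; it does not reproduce every version-specific rule listed above, and how it treats unknown escapes is an assumption.

```php
<?php
/**
 * Hypothetical, simplified stand-in for illustration only.
 *
 * @return string|string[]
 */
function unescapeAndSplit(string $input, ?string $delimiter = ';')
{
    $parts = [];
    $current = '';
    $length = strlen($input);

    for ($i = 0; $i < $length; ++$i) {
        $char = $input[$i];

        if ($char === '\\' && $i + 1 < $length) {
            // A backslash escapes the next character.
            $next = $input[++$i];
            if ($next === 'n' || $next === 'N') {
                $current .= "\n";
            } elseif ($next === '\\' || $next === ';' || $next === ',') {
                $current .= $next;
            } else {
                // Unknown escape: keep it verbatim (an assumption of this sketch).
                $current .= '\\' . $next;
            }
        } elseif ($delimiter !== null && $char === $delimiter) {
            // An unescaped delimiter ends the current part.
            $parts[] = $current;
            $current = '';
        } else {
            $current .= $char;
        }
    }
    $parts[] = $current;

    // Without a delimiter there is exactly one part, returned as a string.
    return $delimiter === null ? $parts[0] : $parts;
}

print_r(unescapeAndSplit('Smith\\, Inc.;Sales\\;Support', ';'));
// Two parts: "Smith, Inc." and "Sales;Support"
```

The real method is static, so a call would presumably look like `MimeDir::unescapeValue($value, ';')`.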
Parameters
- $input : string
- $delimiter : string = ';'
Return values
string|array<string|int, string>
parseDocument()
Parses an entire document.
protected
parseDocument() : mixed
Return values
mixed
parseLine()
Parses a line, and if it hits a component, it will also attempt to parse the entire component.
protected
parseLine(string $line) : Node
Parameters
- $line : string
- Unfolded line.
Return values
Node
readLine()
Reads a single line from the buffer.
protected
readLine() : string
This method strips any newlines and also takes care of unfolding.
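Unfolding means joining a physical line that starts with a space or tab onto the line before it, after dropping that one whitespace character. The sketch below illustrates the rule on a whole string at once; the real method reads from a stream and looks ahead one line at a time. The helper name `unfoldLines` is hypothetical.

```php
<?php
// Hypothetical helper illustrating unfolding; not the parser's own code.
function unfoldLines(string $data): array
{
    $unfolded = [];
    foreach (preg_split('/\r\n|\n|\r/', $data) as $physical) {
        $isContinuation = $physical !== ''
            && ($physical[0] === ' ' || $physical[0] === "\t")
            && count($unfolded) > 0;

        if ($isContinuation) {
            // Drop the single leading whitespace character and append.
            $unfolded[count($unfolded) - 1] .= substr($physical, 1);
        } elseif ($physical !== '') {
            $unfolded[] = $physical;
        }
    }

    return $unfolded;
}

$data = "BEGIN:VCARD\r\nNOTE:This is a long\r\n  folded note\r\nEND:VCARD\r\n";
print_r(unfoldLines($data));
// Three logical lines; the second is "NOTE:This is a long folded note"
```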
Return values
string
readProperty()
Reads a property or component from a line.
protected
readProperty(mixed $line) : mixed
Parameters
- $line : mixed
Return values
mixed
extractQuotedPrintableValue()
Gets the full quoted printable value.
private
extractQuotedPrintableValue() : string
We need a special method for this, because newlines have a meaning both in vCards and in quoted-printable encoding.
This method does not do any decoding.
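The sketch below shows one way to gather such a value. In quoted-printable, a trailing "=" marks a soft line break, so the value continues on the next physical line. The helper `collectQuotedPrintable` and its array-of-lines input are assumptions made for this example; the real method works on the parser's internal line state.

```php
<?php
// Hypothetical helper: gathers a quoted-printable value that spans several
// physical lines. It deliberately performs no decoding.
// Assumes the starting line contains a colon separating name and value.
function collectQuotedPrintable(array $lines, int $start): string
{
    // Everything after the first colon on the starting line is the value.
    $value = substr($lines[$start], strpos($lines[$start], ':') + 1);

    // A trailing "=" is a soft line break: keep reading physical lines,
    // preserving the newline between them.
    for ($i = $start + 1; substr($value, -1) === '=' && $i < count($lines); ++$i) {
        $value .= "\n" . $lines[$i];
    }

    return $value;
}

$lines = [
    'NOTE;ENCODING=QUOTED-PRINTABLE:First line=0D=0A=',
    'second line',
    'TEL:555-0100',
];
var_dump(collectQuotedPrintable($lines, 0));
```

The collected string still contains the encoded sequences and the soft break; it could be decoded afterwards, for instance with PHP's built-in `quoted_printable_decode()`.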
Return values
string
unescapeParam()
Unescapes a parameter value.
private
unescapeParam(string $input) : mixed
vCard 2.1:
- Does not mention a mechanism for this. In addition, double quotes are never used to wrap values.
- This means that parameters can simply not contain colons or semi-colons.
vCard 3.0 (rfc2425, rfc2426):
- Parameters may be surrounded by double quotes.
- If this is not the case, semi-colon, colon and comma may simply not occur (the comma is used to separate multiple parameter values, though).
- If it is surrounded by double-quotes, it may simply not contain double-quotes.
- This means that a parameter can in no case encode double-quotes, or newlines.
vCard 4.0 (rfc6350)
- Behavior seems to be identical to vCard 3.0
iCalendar 2.0 (rfc5545)
- Behavior seems to be identical to vCard 3.0
Parameter escaping mechanism (rfc6868) :
- This rfc describes a new way to escape parameter values.
- New-line is encoded as ^n
- ^ is encoded as ^^.
- " is encoded as ^'
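The three caret sequences above can be decoded in a single pass, which matters because `^^n` must become a literal caret followed by `n` rather than a newline. The following is a minimal sketch of that decoding; the function name `decodeCaretEscapes` is hypothetical and this is not the private method itself.

```php
<?php
// Hypothetical helper illustrating RFC 6868 decoding of a parameter value.
function decodeCaretEscapes(string $input): string
{
    // Match a caret followed by n, ^ or ' and replace each match once,
    // left to right, so "^^n" yields "^n" and not a newline.
    return preg_replace_callback(
        "/\\^([n^'])/",
        function (array $matches): string {
            switch ($matches[1]) {
                case 'n':
                    return "\n";
                case '^':
                    return '^';
                default: // the apostrophe
                    return '"';
            }
        },
        $input
    );
}

echo decodeCaretEscapes("George ^'Babe^' Ruth"), "\n"; // George "Babe" Ruth
```

Any other caret sequence, such as `^x`, is left untouched by this sketch.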
Parameters
- $input : string