#include <Tokenizer.h>


Public Member Functions

 Tokenizer (SBuf &s)
 
bool quotedString (SBuf &value, const bool http1p0=false)
 
bool quotedStringOrToken (SBuf &value, const bool http1p0=false)
 
SBuf buf () const
 yet unparsed data
 
SBuf::size_type parsedSize () const
 number of parsed bytes, including skipped ones
 
bool atEnd () const
 whether the end of the buffer has been reached
 
const SBuf & remaining () const
 the remaining unprocessed section of buffer
 
void reset (const SBuf &newBuf)
 reinitialize processing for a new buffer
 
bool token (SBuf &returnedToken, const CharacterSet &delimiters)
 
bool prefix (SBuf &returnedToken, const CharacterSet &tokenChars, SBuf::size_type limit=SBuf::npos)
 
bool suffix (SBuf &returnedToken, const CharacterSet &tokenChars, SBuf::size_type limit=SBuf::npos)
 
bool skipSuffix (const SBuf &tokenToSkip)
 
bool skip (const SBuf &tokenToSkip)
 
bool skip (const char tokenChar)
 
bool skipOne (const CharacterSet &discardables)
 
SBuf::size_type skipAll (const CharacterSet &discardables)
 
bool skipOneTrailing (const CharacterSet &discardables)
 
SBuf::size_type skipAllTrailing (const CharacterSet &discardables)
 
bool int64 (int64_t &result, int base=0, bool allowSign=true, SBuf::size_type limit=SBuf::npos)
 

Protected Member Functions

SBuf consume (const SBuf::size_type n)
 convenience method: consumes up to n bytes, counts, and returns them
 
SBuf::size_type success (const SBuf::size_type n)
 convenience method: consume()s up to n bytes and returns their count
 
SBuf consumeTrailing (const SBuf::size_type n)
 convenience method: consumes up to n last bytes and returns them
 
SBuf::size_type successTrailing (const SBuf::size_type n)
 convenience method: consumes up to n last bytes and returns their count
 
void undoParse (const SBuf &newBuf, SBuf::size_type cParsed)
 reset the buffer and parsed stats to a saved checkpoint
 

Private Member Functions

bool qdText (SBuf &value, const bool http1p0)
 parse the internal component of a quoted-string, and the terminal DQUOTE
 
void checkpoint ()
 
void restoreLastCheckpoint ()
 

Private Attributes

SBuf savedCheckpoint_
 
SBuf::size_type savedStats_
 

Detailed Description

Lexical processor extended to tokenize HTTP/1.x syntax.

See Also
Parser::Tokenizer for more detail

Definition at line 22 of file Tokenizer.h.

Constructor & Destructor Documentation

Http::One::Tokenizer::Tokenizer ( SBuf &  s)
inline

Definition at line 25 of file Tokenizer.h.

Member Function Documentation

SBuf Parser::Tokenizer::buf ( ) const
inline, inherited

Definition at line 35 of file Tokenizer.h.

References Parser::Tokenizer::buf_.

Referenced by checkpoint(), and testTokenizer::testTokenizerInt64().

void Http::One::Tokenizer::checkpoint ( )
inline, private

SBuf Parser::Tokenizer::consume ( const SBuf::size_type  n)
protected, inherited

SBuf Parser::Tokenizer::consumeTrailing ( const SBuf::size_type  n)
protected, inherited

Definition at line 40 of file Tokenizer.cc.

References SBuf::consume(), debugs, and SBuf::npos.

bool Parser::Tokenizer::int64 ( int64_t &  result,
int  base = 0,
bool  allowSign = true,
SBuf::size_type  limit = SBuf::npos 
)
inherited

Extracts an int64_t at the beginning of the buffer.

strtoll(3)-like function: tries to parse a 64-bit integer, optionally signed (see allowSign), at the beginning of the parse buffer, in the base specified by the caller or guessed from the input when base is 0; consumes the parsed characters.

Parameters
result     Output value. Not touched if parsing is unsuccessful.
base       Specify base to do the parsing in, with the same restrictions as strtoll. Defaults to 0 (meaning guess).
allowSign  Whether to accept a '+' or '-' sign prefix.
limit      Maximum count of characters to convert.
Returns
whether the parsing was successful

Definition at line 209 of file Tokenizer.cc.

References INT64_MAX, INT64_MIN, SBuf::rawContent(), xisalpha, xisdigit, and xisupper.

Referenced by GetOtherPid(), Http::One::TeChunkedParser::parseChunkExtension(), Http::One::TeChunkedParser::parseChunkSize(), Security::PeerOptions::parseOptions(), ConnStateData::parseProxy1p0(), Http::One::ResponseParser::parseResponseFirstLine(), Http::One::ResponseParser::parseResponseStatusAndReason(), testTokenizer::testTokenizerInt64(), and Security::PeerOptions::updateTlsVersionLimits().
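
The behaviour described above can be illustrated with a minimal standalone sketch. It uses std::string_view in place of SBuf, and the free function parseInt64() and its helper digitValue() are hypothetical stand-ins, not Squid's implementation. Treating overflow as a parse failure is an assumption, as is the way the base is guessed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <string_view>

// Value of an ASCII digit or letter; 99 means "not a digit in any base up to 36".
static int digitValue(const char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'z') return c - 'a' + 10;
    if (c >= 'A' && c <= 'Z') return c - 'A' + 10;
    return 99;
}

// Hypothetical stand-in for Parser::Tokenizer::int64(); `buf` plays the role
// of the tokenizer's unparsed data.
bool parseInt64(std::string_view &buf, std::int64_t &result, int base = 0,
                const bool allowSign = true,
                const std::string_view::size_type limit = std::string_view::npos)
{
    assert(base == 0 || (base >= 2 && base <= 36));
    const std::string_view s = buf.substr(0, limit); // at most `limit` characters
    std::size_t pos = 0;

    bool negative = false;
    if (pos < s.size() && (s[pos] == '+' || s[pos] == '-')) {
        if (!allowSign)
            return false;
        negative = (s[pos] == '-');
        ++pos;
    }

    if (base == 0) { // guess the base from the prefix, in the spirit of strtoll(3)
        if (s.size() - pos >= 3 && s[pos] == '0' &&
                (s[pos + 1] == 'x' || s[pos + 1] == 'X') &&
                digitValue(s[pos + 2]) < 16) {
            base = 16;
            pos += 2;
        } else if (pos < s.size() && s[pos] == '0') {
            base = 8;
        } else {
            base = 10;
        }
    }

    // Largest magnitude that still fits: 2^63 for negative values, 2^63-1 otherwise.
    const std::uint64_t maxMagnitude =
        static_cast<std::uint64_t>(std::numeric_limits<std::int64_t>::max()) + (negative ? 1 : 0);
    const std::uint64_t ubase = static_cast<std::uint64_t>(base);

    std::uint64_t acc = 0;
    std::size_t digits = 0;
    for (; pos < s.size(); ++pos, ++digits) {
        const int d = digitValue(s[pos]);
        if (d >= base)
            break; // first character that is not a digit in this base
        const std::uint64_t ud = static_cast<std::uint64_t>(d);
        if (acc > (maxMagnitude - ud) / ubase)
            return false; // would overflow int64_t (assumed to be a failure)
        acc = acc * ubase + ud;
    }
    if (digits == 0)
        return false; // result and buf are left untouched

    if (!negative)
        result = static_cast<std::int64_t>(acc);
    else if (acc == 0)
        result = 0;
    else
        result = -static_cast<std::int64_t>(acc - 1) - 1;

    buf.remove_prefix(pos); // consume the parsed characters
    return true;
}
```

With this sketch, parsing "1234 rest" yields 1234 and leaves " rest" unparsed, while a failed parse leaves both the output value and the buffer as they were.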

SBuf::size_type Parser::Tokenizer::parsedSize ( ) const
inline, inherited

Definition at line 38 of file Tokenizer.h.

References Parser::Tokenizer::parsed_.

Referenced by checkpoint(), and Ftp::Server::parseOneRequest().

bool Parser::Tokenizer::prefix ( SBuf &  returnedToken,
const CharacterSet &  tokenChars,
SBuf::size_type  limit = SBuf::npos 
)
inherited

Extracts all sequential permitted characters up to an optional length limit.

Note that Tokenizer cannot tell whether the prefix will continue when/if more input data becomes available later.

Return values
true   one or more characters were found, the sequence (string) is placed in returnedToken
false  no characters from the permitted set were found

Definition at line 79 of file Tokenizer.cc.

References debugs, CharacterSet::name, and SBuf::npos.

Referenced by Http::One::Parser::getHeaderField(), Ftp::Server::handleFeatReply(), mainParseOptions(), Http::One::TeChunkedParser::parseChunkExtension(), Http::One::RequestParser::parseMethodField(), Ftp::Server::parseOneRequest(), Security::PeerOptions::parseOptions(), ConnStateData::parseProxy1p0(), Http::One::RequestParser::parseRequestFirstLine(), Http::One::ResponseParser::parseResponseStatusAndReason(), Http::One::RequestParser::parseUriField(), testTokenizer::testTokenizerPrefix(), and testTokenizer::testTokenizerSkip().
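
As an illustration of the semantics above, here is a minimal standalone sketch. It substitutes std::string_view for SBuf and a plain string of permitted characters for CharacterSet; the free function prefix() is a hypothetical stand-in, not Squid's implementation.

```cpp
#include <cassert>
#include <string_view>

// Hypothetical stand-in for Parser::Tokenizer::prefix(); `buf` plays the role
// of the tokenizer's unparsed data.
bool prefix(std::string_view &buf, std::string_view &returnedToken,
            const std::string_view tokenChars,
            const std::string_view::size_type limit = std::string_view::npos)
{
    // Only the first `limit` characters are eligible.
    const std::string_view window = buf.substr(0, limit);

    // Length of the leading run of permitted characters.
    auto len = window.find_first_not_of(tokenChars);
    if (len == std::string_view::npos)
        len = window.size(); // the run reaches the end of the available data

    assert(len <= buf.size());
    if (len == 0)
        return false; // nothing found; buf and returnedToken are untouched

    returnedToken = buf.substr(0, len);
    buf.remove_prefix(len);
    return true;
}
```

Applied to "GET /index.html" with the uppercase letters as the permitted set, the sketch extracts "GET" and leaves " /index.html". When the run reaches the end of the data it still succeeds, which mirrors the note that the tokenizer cannot tell whether the prefix would continue with more input.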

bool Http::One::Tokenizer::qdText ( SBuf &  value,
const bool  http1p0 
)
private

bool Http::One::Tokenizer::quotedString ( SBuf &  value,
const bool  http1p0 = false 
)

Attempt to parse a quoted-string lexical construct.

Governed by:

  • RFC 1945 section 2.1 " A string of text is parsed as a single word if it is quoted using double-quote marks.

    quoted-string  = ( <"> *(qdtext) <"> )
    
    qdtext         = <any CHAR except <"> and CTLs,
                     but including LWS>
    

    Single-character quoting using the backslash ("\") character is not permitted in HTTP/1.0. "

  • RFC 7230 section 3.2.6 " A string of text is parsed as a single value if it is quoted using double-quote marks.

    quoted-string = DQUOTE *( qdtext / quoted-pair ) DQUOTE
    qdtext        = HTAB / SP / %x21 / %x23-5B / %x5D-7E / obs-text
    obs-text      = %x80-FF
    "

Parameters
http1p0  Whether to apply HTTP/1.0 parsing rules. HTTP/1.0 does not permit \-escaped characters.

Definition at line 14 of file Tokenizer.cc.

References checkpoint(), qdText(), and Parser::Tokenizer::skip().
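
The two grammars quoted above can be sketched as a small standalone parser. This is an illustration only, not Squid's implementation: std::string_view and std::string stand in for SBuf, the free function quotedString() and its helper predicates are hypothetical, and the quoted-pair character set is taken from RFC 7230 section 3.2.6 rather than from this page.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// qdtext per the RFC 1945 excerpt: any CHAR except <"> and CTLs, but including LWS.
static bool isQdText1p0(const unsigned char c)
{
    if (c == '\t' || c == '\r' || c == '\n' || c == ' ')
        return true; // LWS octets
    return c > 0x20 && c < 0x7F && c != '"';
}

// qdtext per the RFC 7230 excerpt (backslash 0x5C is excluded).
static bool isQdText1p1(const unsigned char c)
{
    return c == '\t' || c == ' ' || c == 0x21 || (c >= 0x23 && c <= 0x5B) ||
           (c >= 0x5D && c <= 0x7E) || c >= 0x80;
}

// Octets allowed after the backslash: HTAB / SP / VCHAR / obs-text.
static bool isQuotedPairChar(const unsigned char c)
{
    return c == '\t' || c == ' ' || (c >= 0x21 && c <= 0x7E) || c >= 0x80;
}

// Hypothetical stand-in for Http::One::Tokenizer::quotedString().
bool quotedString(std::string_view &buf, std::string &value, const bool http1p0 = false)
{
    std::string_view s = buf; // work on a copy so that failure leaves buf untouched
    if (s.empty() || s.front() != '"')
        return false;
    s.remove_prefix(1); // opening DQUOTE

    std::string out;
    while (!s.empty()) {
        const unsigned char c = static_cast<unsigned char>(s.front());
        s.remove_prefix(1);

        if (c == '"') { // terminal DQUOTE: commit the result
            assert(s.size() < buf.size()); // s is always a tail of buf
            value = out;
            buf = s;
            return true;
        }
        if (!http1p0 && c == '\\') { // quoted-pair, HTTP/1.1 only
            if (s.empty() || !isQuotedPairChar(static_cast<unsigned char>(s.front())))
                return false;
            out += s.front(); // keep the escaped octet, drop the backslash
            s.remove_prefix(1);
            continue;
        }
        if (!(http1p0 ? isQdText1p0(c) : isQdText1p1(c)))
            return false; // octet not allowed inside the quotes
        out += static_cast<char>(c);
    }
    return false; // input ended before the terminal DQUOTE
}
```

For the input `"a\"b" tail`, the HTTP/1.1 mode of this sketch yields the value `a"b`, whereas the HTTP/1.0 mode treats the backslash as an ordinary character and stops at the second double quote, yielding `a\`.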

bool Http::One::Tokenizer::quotedStringOrToken ( SBuf &  value,
const bool  http1p0 = false 
)

Attempt to parse a (token / quoted-string ) lexical construct.

Definition at line 25 of file Tokenizer.cc.

References CharacterSet::TCHAR.

Referenced by Http::One::TeChunkedParser::parseChunkExtension().

void Parser::Tokenizer::reset ( const SBuf &  newBuf)
inline, inherited

void Http::One::Tokenizer::restoreLastCheckpoint ( )
inline, private

Definition at line 69 of file Tokenizer.h.

References savedCheckpoint_, savedStats_, and Parser::Tokenizer::undoParse().

bool Parser::Tokenizer::skip ( const char  tokenChar)
inherited

skips a given single character

Returns
whether the character was skipped

Definition at line 171 of file Tokenizer.cc.

References debugs.

SBuf::size_type Parser::Tokenizer::skipAllTrailing ( const CharacterSet &  discardables)
inherited

Removes all sequential trailing characters from the set, in any order.

Returns
the number of characters removed

Definition at line 193 of file Tokenizer.cc.

References debugs, CharacterSet::name, and SBuf::npos.

Referenced by Auth::SchemesConfig::expand(), Http::One::RequestParser::parseRequestFirstLine(), and Http::One::RequestParser::skipTrailingCrs().
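
A minimal standalone sketch of this trailing-trim behaviour follows. It uses std::string_view in place of SBuf and a plain string of characters in place of CharacterSet; the free function skipAllTrailing() is a hypothetical stand-in, not Squid's implementation.

```cpp
#include <cassert>
#include <string_view>

// Hypothetical stand-in for Parser::Tokenizer::skipAllTrailing().
std::string_view::size_type skipAllTrailing(std::string_view &buf,
                                            const std::string_view discardables)
{
    // Index of the last character that must be kept, if there is one.
    const auto last = buf.find_last_not_of(discardables);
    const auto keep = (last == std::string_view::npos) ? 0 : last + 1;

    const auto removed = buf.size() - keep;
    assert(removed <= buf.size()); // precondition of remove_suffix()
    buf.remove_suffix(removed);
    return removed;
}
```

Trimming "\r" from "GET / HTTP/1.1\r\r" with this sketch removes two characters and leaves "GET / HTTP/1.1"; a buffer with no trailing discardable characters is returned unchanged with a count of 0.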

bool Parser::Tokenizer::skipOneTrailing ( const CharacterSet &  discardables)
inherited

Removes a single trailing character from the set.

Returns
whether a character was removed

Definition at line 182 of file Tokenizer.cc.

References debugs, and CharacterSet::name.

Referenced by Http::One::RequestParser::parseHttpVersionField(), Http::One::RequestParser::skipTrailingCrs(), and testTokenizer::testTokenizerSuffix().

bool Parser::Tokenizer::skipSuffix ( const SBuf &  tokenToSkip)
inherited

Skips a given suffix character sequence (string). Operates on the trailing end of the buffer.

Note that Tokenizer cannot tell whether the buffer will gain more data when/if more input becomes available later.

Returns
whether the exact character sequence was found and skipped

Definition at line 143 of file Tokenizer.cc.

References debugs, SBuf::length(), and SBuf::npos.

Referenced by Http::One::RequestParser::parseHttpVersionField(), and testTokenizer::testTokenizerSuffix().
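
The exact-match rule can be shown with a minimal standalone sketch that uses std::string_view in place of SBuf. The free function skipSuffix() is a hypothetical stand-in, not Squid's implementation; how the real method treats an empty tokenToSkip is not stated here, so the sketch's handling of that case (success, nothing removed) is an assumption.

```cpp
#include <cassert>
#include <string_view>

// Hypothetical stand-in for Parser::Tokenizer::skipSuffix().
bool skipSuffix(std::string_view &buf, const std::string_view tokenToSkip)
{
    if (buf.size() < tokenToSkip.size())
        return false; // the buffer is too short to end with the token

    // Compare the tail of the buffer with the token, character for character.
    const auto tailStart = buf.size() - tokenToSkip.size();
    if (buf.substr(tailStart) != tokenToSkip)
        return false; // buf is left untouched

    assert(tokenToSkip.size() <= buf.size()); // precondition of remove_suffix()
    buf.remove_suffix(tokenToSkip.size());
    return true;
}
```

For example, skipping " HTTP/1.1" at the end of "GET / HTTP/1.1" leaves "GET /", while a partial match such as " HTTP/1.0" fails and changes nothing.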

SBuf::size_type Parser::Tokenizer::success ( const SBuf::size_type  n)
protected, inherited

Definition at line 33 of file Tokenizer.cc.

SBuf::size_type Parser::Tokenizer::successTrailing ( const SBuf::size_type  n)
protected, inherited

Definition at line 55 of file Tokenizer.cc.

bool Parser::Tokenizer::suffix ( SBuf &  returnedToken,
const CharacterSet &  tokenChars,
SBuf::size_type  limit = SBuf::npos 
)
inherited

Extracts all sequential permitted characters up to an optional length limit. Operates on the trailing end of the buffer.

Note that Tokenizer cannot tell whether the buffer will gain more data when/if more input becomes available later.

Return values
true   one or more characters were found, the sequence (string) is placed in returnedToken
false  no characters from the permitted set were found

Definition at line 100 of file Tokenizer.cc.

References SBuf::consume(), i, SBuf::rbegin(), and SBuf::rend().

Referenced by Http::One::RequestParser::parseHttpVersionField(), and testTokenizer::testTokenizerSuffix().
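
A minimal standalone sketch of extraction from the trailing end follows, with std::string_view in place of SBuf and a plain string of permitted characters in place of CharacterSet. The free function suffix() is a hypothetical stand-in, not Squid's implementation; applying the limit to the last characters of the buffer is an assumption.

```cpp
#include <cassert>
#include <string_view>

// Hypothetical stand-in for Parser::Tokenizer::suffix().
bool suffix(std::string_view &buf, std::string_view &returnedToken,
            const std::string_view tokenChars,
            const std::string_view::size_type limit = std::string_view::npos)
{
    // Only the last `limit` characters are eligible (assumed meaning of limit).
    const std::string_view window =
        (limit < buf.size()) ? buf.substr(buf.size() - limit) : buf;

    // Length of the trailing run of permitted characters.
    const auto last = window.find_last_not_of(tokenChars);
    const auto len = (last == std::string_view::npos)
                     ? window.size()
                     : window.size() - (last + 1);
    if (len == 0)
        return false; // nothing found; buf and returnedToken are untouched

    assert(len <= buf.size()); // precondition of remove_suffix()
    returnedToken = buf.substr(buf.size() - len);
    buf.remove_suffix(len);
    return true;
}
```

Extracting trailing digits from "HTTP/1.1" with this sketch returns "1" and leaves "HTTP/1."; with a limit of 2, "abc123" gives "23" and leaves "abc1".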

bool Parser::Tokenizer::token ( SBuf &  returnedToken,
const CharacterSet &  delimiters 
)
inherited

Basic strtok(3): Skips all leading delimiters (if any), extracts all characters up to the next delimiter (a token), and skips all trailing delimiters (at least one must be present).

Want to extract delimiters? Use prefix() instead.

Note that Tokenizer cannot tell whether the trailing delimiters will continue when/if more input data becomes available later.

Returns
true if found a non-empty token followed by a delimiter

Definition at line 61 of file Tokenizer.cc.

References DBG_DATA, debugs, CharacterSet::name, and SBuf::npos.

Referenced by AppendTokens(), Auth::SchemesConfig::expand(), and testTokenizer::testTokenizerToken().
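
The three steps described above (skip leading delimiters, extract the token, skip trailing delimiters) can be sketched as follows. This is a minimal standalone illustration with std::string_view in place of SBuf and a plain string of delimiter characters in place of CharacterSet; the free function token() is a hypothetical stand-in, not Squid's implementation, and leaving the buffer untouched on failure is an assumption.

```cpp
#include <cassert>
#include <string_view>

// Hypothetical stand-in for Parser::Tokenizer::token().
bool token(std::string_view &buf, std::string_view &returnedToken,
           const std::string_view delimiters)
{
    // Step 1: find where the leading delimiters (if any) end.
    const auto start = buf.find_first_not_of(delimiters);
    if (start == std::string_view::npos)
        return false; // empty buffer or nothing but delimiters

    // Step 2: the token runs up to the next delimiter, which must be present.
    const auto end = buf.find_first_of(delimiters, start);
    if (end == std::string_view::npos)
        return false; // no trailing delimiter yet; buf is left untouched
    assert(end > start); // the token is never empty

    returnedToken = buf.substr(start, end - start);

    // Step 3: skip all trailing delimiters.
    const auto next = buf.find_first_not_of(delimiters, end);
    buf.remove_prefix(next == std::string_view::npos ? buf.size() : next);
    return true;
}
```

On "  a,b c" with space and comma as delimiters, successive calls to this sketch return "a" and then "b"; a third call fails because "c" is not yet followed by a delimiter, so "c" stays in the buffer.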

void Parser::Tokenizer::undoParse ( const SBuf &  newBuf,
SBuf::size_type  cParsed 
)
inline, protected, inherited

Member Data Documentation

SBuf Http::One::Tokenizer::savedCheckpoint_
private

Definition at line 71 of file Tokenizer.h.

Referenced by checkpoint(), and restoreLastCheckpoint().

SBuf::size_type Http::One::Tokenizer::savedStats_
private

Definition at line 72 of file Tokenizer.h.

Referenced by checkpoint(), and restoreLastCheckpoint().


The documentation for this class was generated from the following files:

 
