87 changes: 66 additions & 21 deletions src/asm/lexer.cpp
@@ -2147,47 +2147,92 @@ static Token yylex_RAW() {
return Token(T_(YYEOF));
}

// This function is called when capturing `REPT`/`FOR` loops and `MACRO` bodies,
// and when skipping unexecuted `IF`/`ELIF`/`ELSE` blocks and `REPT`/`FOR` loops.
// It expects that these constructs' `ENDC`/`ENDR`/`ENDM` closing tokens are only
// valid at the start of their lines, which enables ignoring everything except
// the leading keyword in lines that have one (as well as line continuations).
//
// The only keywords it needs to recognize are case-insensitive `IF`, `ELIF`,
// `ELSE`, `ENDC`, `REPT`, `FOR`, `ENDR`, and `ENDM` (not `MACRO`).
//
// Note that when these constructs are *evaluated*, they can perform expansions
// (for macro args, interpolations, and macro invocations) which may produce
// tokens that would change how these constructs were captured or skipped, if
// they had been produced during the capture/skip non-evaluating phase.
static Token skipToLeadingKeyword() {
assume(!lexerState->enableExpansions);

template<bool Quick, typename PeekFnT, typename ShiftFnT, typename NextLineFnT>
static Token skipToLeadingKeyword(PeekFnT peekFn, ShiftFnT shiftFn, NextLineFnT nextLineFn) {
Comment on lines +2150 to +2151
Member

This is missing some documentation of what the parameters should be, including the template parameter `Quick`.

Contributor Author

I thought the names were clear from context: they're functions that stand in for `peek`, `shiftChar`, and `nextLine`.

Agreed that the effect of `bool Quick` is unclear. Ideally I'd just be able to write `if (peekFn == peek && shiftFn == shiftChar)` instead of `if (Quick)`, but that's not valid compile-time C++. The parameter's only use is to possibly increment `lexerState->expansionScanDistance`, making up for what the complete, non-specialized `peek()` and `shiftChar()` do themselves. So maybe call it `bool IncScanDistance` or `bool UpdateScanDistance`? Or, from a different perspective, `bool IsRealPeek` (the opposite truth value from the one it has here)? Or `bool IsOptimizedPeek`? (That was my thought process at first, and then I abbreviated "IsOptimizedPeek/IsQuickPeek" to "IsQuick" or just "Quick".)
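
For illustration only, here is a minimal sketch of how the flag and the callable parameters might be documented if the `UpdateScanDistance` name were chosen. Everything in it is hypothetical and simplified: `ScanState`, `scanDistance`, and `scanWord` are stand-ins invented for this sketch, not the PR's code. It also uses `if constexpr`, which is one possible way to branch on a `bool` template parameter; the diff itself uses a plain `if (Quick)`.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for the lexer state; only the field relevant here.
struct ScanState {
    std::size_t scanDistance = 0;
};

// Reads a run of ASCII letters using caller-supplied character accessors.
//
// Template parameters:
//   UpdateScanDistance: true when `peekFn`/`shiftFn` are lightweight
//       replacements that do no bookkeeping, so this function must bump
//       `state.scanDistance` itself; false when the accessors already do so.
// Parameters:
//   peekFn:  returns the next character without consuming it, or -1 at end.
//   shiftFn: consumes the character last returned by `peekFn`.
template<bool UpdateScanDistance, typename PeekFnT, typename ShiftFnT>
std::string scanWord(ScanState &state, PeekFnT peekFn, ShiftFnT shiftFn) {
    std::string word;
    for (int c = peekFn(); (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z'); c = peekFn()) {
        word += static_cast<char>(c);
        shiftFn();
    }
    // There has been one more call to `peekFn` than to `shiftFn`;
    // account for it here only if the accessors did not.
    if constexpr (UpdateScanDistance) {
        if (state.scanDistance == 0) {
            ++state.scanDistance;
        }
    }
    return word;
}
```

With lambdas over a plain buffer as the accessors, a caller would instantiate `scanWord<true>(state, peekFn, shiftFn)`; with accessors that do their own bookkeeping, it would use `scanWord<false>(...)`. A descriptive flag name plus a parameter comment like the one above is one way to address the documentation request.
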

for (;;) {
int c = peekFn();
if (lexerState->atLineStart) {
lexerState->atLineStart = false;
if (int c = skipChars(isBlankSpace); c == EOF) {
while (isBlankSpace(c)) {
shiftFn();
c = peekFn();
}
if (c == EOF) {
return Token(T_(YYEOF));
} else if (isLetter(c)) {
std::string keyword(1, c);
for (c = nextChar(); continuesIdentifier(c); c = nextChar()) {
shiftFn();
for (c = peekFn(); continuesIdentifier(c); c = peekFn()) {
keyword += c;
shiftFn();
}
if (auto search = keywords.find(keyword); search != keywords.end()) {
// There has been one more call to `peekFn` than to `shiftFn`.
// If they are optimized "quick" functions for `ViewedContent`,
// instead of `peek` and `shiftChar`, they have not updated
// `lexerState->expansionScanDistance`, so it must be incremented
// if it was previously zero.
if (Quick && lexerState->expansionScanDistance == 0) {
++lexerState->expansionScanDistance;
}
return Token(search->second);
}
}
}
if (int c = bumpChar(); c == EOF) {
shiftFn();
if (c == EOF) {
return Token(T_(YYEOF));
} else if (isNewline(c)) {
handleCRLF(c);
nextLine();
// Like `handleCRLF` but calling generic `shiftFn`
if (c == '\r' && peekFn() == '\n') {
shiftFn();
}
nextLineFn();
lexerState->atLineStart = true;
}
}
}

// This function is called when capturing `REPT`/`FOR` loops and `MACRO` bodies,
// and when skipping unexecuted `IF`/`ELIF`/`ELSE` blocks and `REPT`/`FOR` loops.
// It expects that these constructs' `ENDC`/`ENDR`/`ENDM` closing tokens are only
// valid at the start of their lines, which enables ignoring everything except
// the leading keyword in lines that have one (as well as line continuations).
//
// The only keywords it needs to recognize are case-insensitive `IF`, `ELIF`,
// `ELSE`, `ENDC`, `REPT`, `FOR`, `ENDR`, and `ENDM` (not `MACRO`).
//
// Note that when these constructs are *evaluated*, they can perform expansions
// (for macro args, interpolations, and macro invocations) which may produce
// tokens that would change how these constructs were captured or skipped, if
// they had been produced during the capture/skip non-evaluating phase.
static Token skipToLeadingKeyword() {
assume(!lexerState->enableExpansions);

if (std::holds_alternative<ViewedContent>(lexerState->content)
&& lexerState->expansionStack.empty()) {
// Optimize the common case (a fully-read assembly file without ongoing
// expansions) to avoid the bookkeeping of `peek` and `shiftChar`.
auto &view = std::get<ViewedContent>(lexerState->content);
char const *ptr = view.span.ptr.get();
auto quickPeek = [&]() { return view.offset < view.span.size ? ptr[view.offset] : EOF; };
auto quickNextLine = []() { ++lexerState->lineNo; };
if (lexerState->capturing) {
assume(lexerState->captureBuf == nullptr);
auto quickCaptureShiftChar = [&]() {
++view.offset;
++lexerState->captureSize;
};
return skipToLeadingKeyword<true>(quickPeek, quickCaptureShiftChar, quickNextLine);
} else {
auto quickShiftChar = [&]() { ++view.offset; };
return skipToLeadingKeyword<true>(quickPeek, quickShiftChar, quickNextLine);
}
} else {
return skipToLeadingKeyword<false>(peek, shiftChar, nextLine);
}
}
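
// Illustrative aside, not part of the diff: the comment above says that only
// the leading keyword of each line matters while capturing or skipping. The
// standalone sketch below shows that idea over a plain buffer. It is a
// simplified assumption-laden model: `LeadingKeyword` and `findLeadingKeyword`
// are hypothetical names, identifier characters are reduced to ASCII letters,
// digits, and `_`, and line continuations are not handled.

```cpp
#include <cctype>
#include <cstddef>
#include <optional>
#include <set>
#include <string>
#include <string_view>

struct LeadingKeyword {
    std::string name;   // Keyword, upper-cased
    std::size_t lineNo; // 1-based line on which it was found
};

// Returns the first block keyword that appears as the first word of a line.
inline std::optional<LeadingKeyword> findLeadingKeyword(std::string_view src) {
    static std::set<std::string> const keywords = {
        "IF", "ELIF", "ELSE", "ENDC", "REPT", "FOR", "ENDR", "ENDM",
    };
    std::size_t pos = 0;
    std::size_t lineNo = 1;
    while (pos < src.size()) {
        // At the start of a line: skip leading blanks.
        while (pos < src.size() && (src[pos] == ' ' || src[pos] == '\t')) {
            ++pos;
        }
        // Collect the first word, folding case so the match is case-insensitive.
        std::string word;
        while (pos < src.size()
               && (std::isalnum(static_cast<unsigned char>(src[pos])) || src[pos] == '_')) {
            word += static_cast<char>(std::toupper(static_cast<unsigned char>(src[pos])));
            ++pos;
        }
        if (keywords.count(word) != 0) {
            return LeadingKeyword{word, lineNo};
        }
        // Ignore everything else on the line.
        while (pos < src.size() && src[pos] != '\n' && src[pos] != '\r') {
            ++pos;
        }
        if (pos < src.size()) {
            // Treat CRLF as a single newline.
            if (src[pos] == '\r' && pos + 1 < src.size() && src[pos + 1] == '\n') {
                ++pos;
            }
            ++pos;
            ++lineNo;
        }
    }
    return std::nullopt;
}
```

// In this sketch, a keyword inside a comment or string (for example
// `db "ENDM"`) is never matched, because only the first word of each line is
// compared against the keyword set.
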

static Token skipIfBlock(bool toEndc) {
lexer_SetMode(LEXER_NORMAL);
