Commit graph

40 commits

Author SHA1 Message Date
Carl Worth
e9397867dd Collapse multiple spaces in input down to a single space.
This is what gcc does, and it's actually less work to do
this. Previously we were having to save the contents of space tokens
as a string, but we don't need to do that now.

We extend test #0 to exercise this feature here.
2010-05-25 17:08:07 -07:00
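The whitespace collapsing described in e9397867dd can be illustrated with a minimal sketch. This is not glcpp's lexer code; the function name `collapse_spaces` is hypothetical, and treating tabs the same as spaces is an assumption.

```python
import re

def collapse_spaces(line: str) -> str:
    """Replace each run of spaces or tabs with a single space.

    Hypothetical helper: it models the behaviour described in the
    commit message, not the actual flex rules used by glcpp.
    """
    return re.sub(r"[ \t]+", " ", line)

print(collapse_spaces("a   b \t c"))  # prints "a b c"
```

Because every run of whitespace becomes exactly one space, the contents of a space token never need to be stored, which is the simplification the commit message mentions.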
Carl Worth
f8ec4e0be8 Add a test #0 to ensure that we don't do any inadvertent token pasting.
This simply ensures that spaces in the input line are preserved.
2010-05-25 17:06:17 -07:00
Carl Worth
ae6517f4a8 Implement expansion of object-like macros.
For this we add an "active" string_list_t to the parser. This makes
the current expansion_list_t in the parser obsolete, but we don't
remove that yet.

With this change we can now start passing some actual tests, so we
turn on real testing in the test suite again. I expect to implement
things more or less in the same order as before, so the test suite now
halts on first error.

With this change the first 8 tests in the suite pass, (object-like
macros with chaining and recursion).
2010-05-25 15:24:59 -07:00
Carl Worth
9fb8b7a495 Make the lexer pass whitespace through (as OTHER tokens) for text lines.
With this change, we can recreate the original text-line input
exactly. Previously we were inserting a space between every pair of
tokens so our output had a lot more whitespace than our input.

With this change, we can drop the "-b" option to diff and match the
input exactly.
2010-05-25 15:04:32 -07:00
Carl Worth
808401fd79 Store parsed tokens as token list and print all text lines.
Still not doing any macro expansion just yet. But it should be fairly
easy from here.
2010-05-25 14:52:43 -07:00
Carl Worth
3ff8167084 Starting over with the C99 grammar for the preprocessor.
This is a fresh start with a much simpler approach for the flex/bison
portions of the preprocessor. This isn't functional yet, (produces no
output), but can at least read all of our test cases without any parse
errors.

The grammar here is based on the grammar provided for the preprocessor
in the C99 specification.
2010-05-25 14:38:15 -07:00
Carl Worth
00f1ec421e Add test for '/', '<<', and '>>' in #if expressions.
These operators have been supported already, but were not covered in
existing tests yet. So this test passes already.
2010-05-24 11:41:36 -07:00
Carl Worth
bb9315f804 Add test of bitwise operators and octal/hexadecimal literals.
This new test covers several features from the last few commits.

This test passes already.
2010-05-24 11:33:07 -07:00
Carl Worth
bcbd587b0f Implement all operators specified for GLSL #if expressions (with tests).
The operator coverage here is quite complete. The one big thing
missing is that we are not yet doing macro expansion in #if
lines. This makes the whole support fairly useless, so we plan to fix
that shortcoming right away.
2010-05-24 10:37:38 -07:00
Carl Worth
b20d33c5c6 Implement #if, #else, #elif, and #endif with tests.
So far the only expression implemented is a single integer literal,
but obviously that's easy to extend. Various things including nesting
are tested here.
2010-05-20 22:27:07 -07:00
Carl Worth
323421db65 Remove "unnecessary" whitespace from some tests.
This whitespace was not part of anything being tested, and it
introduces differences (that we don't actually care about) between the
output of "gcc -E" and glcpp.

Just eliminate this extra whitespace to reduce spurious test-case
failures.
2010-05-20 14:05:37 -07:00
Carl Worth
660bda057a Stop ignoring whitespace while testing.
Some time back the output of glcpp started differing from the output of
"gcc -E" in the amount of whitespace emitted. At the time, I
switched the test suite to use "diff -w" to ignore this. This was a
mistake since it ignores whitespace entirely. (I meant to use "diff
-b" which ignores only changes in the amount of whitespace.)

So bugs have since been introduced that the test suite doesn't
notice. For example, glcpp is producing "twotokens" where it should be
producing "two tokens".

Let's stop ignoring whitespace in the test suite, which currently
introduces lots of failures---some real and some spurious.
2010-05-20 14:01:59 -07:00
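The difference between "diff -w" and "diff -b" described in 660bda057a can be modelled for a single line of text. This is only a rough sketch: the two function names are hypothetical, and the real diff options work on whole files rather than one pair of strings.

```python
import re

def equal_ignoring_all_whitespace(a: str, b: str) -> bool:
    """Roughly what "diff -w" does: whitespace is ignored entirely."""
    return re.sub(r"\s+", "", a) == re.sub(r"\s+", "", b)

def equal_ignoring_whitespace_amount(a: str, b: str) -> bool:
    """Roughly what "diff -b" does: only the amount of whitespace is ignored."""
    def squeeze(s: str) -> str:
        return re.sub(r"\s+", " ", s).rstrip()
    return squeeze(a) == squeeze(b)

# The bug from the commit message: "twotokens" emitted instead of "two tokens".
print(equal_ignoring_all_whitespace("two tokens", "twotokens"))     # True, bug hidden
print(equal_ignoring_whitespace_amount("two tokens", "twotokens"))  # False, bug caught
```

Under the "-w" model the incorrect token pasting goes unnoticed, while the "-b" model still tolerates harmless differences such as "two   tokens" versus "two tokens".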
Carl Worth
805ea6afe6 Add test (and fix) for a function argument of a macro that expands with a comma.
The fix here is quite simple (and actually only deletes code). When
expanding a macro, we don't return a ',' as a unique token type, but
simply let it fall through to the generic case.
2010-05-20 12:06:33 -07:00
Carl Worth
9f3d2c4e3d Add support for commas within parenthesized groups in function arguments.
The specification says that commas within a parenthesized group,
(that's not a function-like macro invocation), are passed through
literally and not considered argument separators in any outer macro
invocation.

Add support and a test for this case. This support makes a third
occurrence of the same "FUNC_MACRO (" shift/reduce conflict appear, so
expect that.

This change does introduce a fairly large copy/paste block in the
grammar which is unfortunate. Perhaps if I were more clever I'd find a
way to share the common pieces between argument and argument_or_comma.
2010-05-20 08:46:54 -07:00
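The comma rule described in 9f3d2c4e3d can be sketched outside the grammar as a simple depth counter. This is an illustration under stated assumptions, not the bison productions the commit adds: `split_arguments` is a hypothetical name, and it works on the raw text between the outer parentheses of an invocation rather than on tokens.

```python
def split_arguments(text: str) -> list:
    """Split macro-invocation arguments on top-level commas only.

    Commas inside a nested parenthesized group are passed through
    literally and do not separate arguments.
    """
    args, current, depth = [], [], 0
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        if ch == "," and depth == 0:
            # A top-level comma ends the current argument.
            args.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    args.append("".join(current).strip())
    return args

print(split_arguments("a, (b, c), d"))  # prints ['a', '(b, c)', 'd']
```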
Carl Worth
b569383bbd Avoid re-expanding a macro name that has once been rejected from expansion.
The specification of the preprocessor in C99 says that when we see a
macro name that we are already expanding that we refuse to expand it
now, (which we've done for a while), but also that we refuse to ever
expand it later if seen in other contexts at which it would be
legitimate to expand.

We add a test case for that here, and fix it to work. The fix takes
advantage of a new token_t value for tokens and argument words along
with the recently added IDENTIFIER_FINALIZED token type which
instructs the parser to not even look for another expansion.
2010-05-20 08:01:44 -07:00
Carl Worth
5d21142545 Like previous fix, but for object-like macros (and add a test).
The support for an object-like macro within a macro-invocation
argument was also implemented at one level too high in the
grammar. Fortunately, this is a very simple fix.
2010-05-19 07:57:03 -07:00
Carl Worth
59ca98990f Fix bug as in previous fix, but with multi-token argument.
The previous fix added FUNC_MACRO to a production one level higher in the
grammar than it should have. So it prevented a FUNC_MACRO from
appearing as part of a multi-token argument rather than just alone as
an argument. Fix this (and add a test).
2010-05-19 07:49:47 -07:00
Carl Worth
69f390d609 Fix bug (and test) for an invocation using macro name as a non-macro argument
This adds a second shift/reduce conflict to our grammar. It's basically the
same conflict we had previously, (deciding to shift a '(' after a FUNC_MACRO)
but this time in the "argument" context rather than the "content" context.

It would be nice to not have these, but I think they are unavoidable
(without a lot of pain at least) given the preprocessor specification.
2010-05-19 07:42:42 -07:00
Carl Worth
be0e2e9b2a Fix bug (and add tests) for a function-like macro defined as itself.
This case worked previously, but broke in the recent rewrite of
function-like macro expansion. The recursion was still terminated
correctly, but any parenthesized expression after the macro name was
still being swallowed even though the identifier was not being
expanded as a macro.

The fix is to notice earlier that the identifier is an
already-expanding macro. We let the lexer know this through the
classify_token function so that an already-expanding macro is lexed as
an identifier, not a FUNC_MACRO.
2010-05-19 07:29:22 -07:00
Carl Worth
d476db38fe Add several tests where the defined value of a macro is (or looks like) a macro
Many of these look quite similar to existing tests that are handled
correctly, yet none of these work. For example, in test 30 we have a
simple non-function macro "foo" that is defined as "bar(baz(success))"
and obviously non-function macro expansion has been working for a long
time.  Similarly, if we had text of "bar(baz(success))" it would be
expanded correctly as well.

But when this otherwise functioning text appears as the body of a
macro, things don't work at all.

This is pointing out a fundamental problem with the current
approach. The current code does a recursive expansion of a macro
definition, but this doesn't involve the parsing machinery, so it
can't actually handle things like an arbitrary nesting of parentheses.

The fix will require the parser to stuff macro values back into the
lexer to get at all of the existing machinery when expanding macros.
2010-05-18 22:09:57 -07:00
Carl Worth
1a29500e72 Fix (and add test for) function-like macro invocation with newlines.
The test has a newline before the left parenthesis, and newlines to
separate the parentheses from the argument.

The fix involves more state in the lexer to only return a NEWLINE
token when terminating a directive. This is very similar to our
previous fix with extra lexer state to only return the SPACE token
when it would be significant for the parser.

With this change, the exact number and positioning of newlines in the
output is now different compared to "gcc -E" so we add a -B option to
diff when testing to ignore that.
2010-05-17 13:21:13 -07:00
Carl Worth
acf87bc034 Fix bug (and add test) for a function-like-macro appearing as a non-macro.
That is, when a function-like macro appears in the content without
parentheses it should be accepted and passed on through, (previously
the parser was regarding this as a syntax error).
2010-05-17 10:34:29 -07:00
Carl Worth
420d05a15b Add test and fix bug leading to infinite recursion.
The test case here is simply "#define foo foo" and "#define bar foo"
and then attempting to expand "bar".

Previously, our termination condition for the recursion was overly
simple---just looking for the single identifier that began the
expansion. We now fix this to maintain a stack of identifiers and
terminate when any one of them occurs in the replacement list.
2010-05-17 10:15:23 -07:00
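The termination condition described in 420d05a15b can be sketched for object-like macros as follows. This is a simplified model, not glcpp's implementation: the `expand` function, the dictionary of token lists, and the tuple used as the stack of active identifiers are all assumptions made for illustration.

```python
def expand(tokens, macros, active=()):
    """Expand object-like macros, refusing to re-expand any active name.

    `macros` maps a macro name to its replacement list (a list of tokens).
    `active` is the stack of identifiers currently being expanded.
    """
    out = []
    for tok in tokens:
        if tok in macros and tok not in active:
            # Push the name on the stack while its replacement list is expanded.
            out.extend(expand(macros[tok], macros, active + (tok,)))
        else:
            # Either not a macro, or a macro that is already being expanded.
            out.append(tok)
    return out

# The test case from the commit message: "#define foo foo" and "#define bar foo".
macros = {"foo": ["foo"], "bar": ["foo"]}
print(expand(["bar"], macros))  # prints ['foo']
```

Checking only the identifier that began the expansion ("bar") would never stop the inner "foo" from expanding into itself, which is why the whole stack is consulted.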
Carl Worth
81f01432bd Don't return SPACE tokens unless strictly needed.
This reverts the unconditional return of SPACE tokens from the lexer
from commit 48b94da099.

That commit seemed useful because it kept the lexer simpler, but the
presence of SPACE tokens is causing lots of extra complication for the
parser itself, (redundant productions other than whitespace
differences, several productions buggy in the case of extra
whitespace, etc.)

Of course, we'd prefer to never have any whitespace token, but that's
not possible with the need to distinguish between "#define foo()" and
"#define foo ()". So we'll accept a little bit of pain in the lexer,
(enough state to support this special-case token), in exchange for
keeping most of the parser blissfully ignorant of whether tokens are
separated by whitespace or not.

This change does mean that our output now differs from that of "gcc -E",
but only in whitespace. So we test with "diff -w" now to ignore those
differences.
2010-05-14 17:13:00 -07:00
Carl Worth
4eb2ccf261 Add test with extra whitespace in macro definitions and invocations.
This whitespace is not dealt with in an elegant way yet so this test
does not pass currently.
2010-05-14 17:03:43 -07:00
Carl Worth
f6ae186cfd Add test invoking a macro with an argument containing (non-macro) parentheses.
The macro invocation is defined to consume all text between a set of
matched parentheses. We previously tested for inner parentheses from a
nested function-like macro invocation. Here we test for inner
parentheses occurring on their own, (not part of another macro
invocation).
2010-05-14 16:51:54 -07:00
Carl Worth
92e7bf0f50 Add test for composed invocation of function-like macros.
This is a case like "foo(bar(x))" where both foo and bar are defined
function-like macros. This is not yet parsed correctly so this test
fails.
2010-05-14 11:50:33 -07:00
Carl Worth
db272e6e6f Add test for function-like macro invocations with multiple-token arguments.
These are not yet parsed correctly, so these tests fail.
2010-05-14 11:50:27 -07:00
Carl Worth
3014073311 Add test where a macro formal parameter is the same as an existing macro.
This is a well-defined condition, but something that currently trips up
the implementation. Should be easy to fix.
2010-05-14 09:53:50 -07:00
Carl Worth
af71ba41bd Add tests exercising substitution of arguments in function-like macros.
This capability is the only thing that makes function-like macros
interesting. This isn't supported yet so these tests fail for now.
2010-05-14 09:53:50 -07:00
Carl Worth
27bc8930ba Add some whitespace variations to test 15.
This shows two minor failures in our current parsing (resulting in
whitespace-only changes, so not that significant):

  1. We are inserting extra whitespace between tokens not originally
     separated by whitespace in the replacement list of a macro
     definition.

  2. We are swallowing whitespace separating tokens in the general
     content.
2010-05-14 09:20:13 -07:00
Carl Worth
67c27afc16 Add test for an object-like macro with a definition beginning with '('
Our current parser sees "#define foo (" as an identifier token
followed by a '(' token and parses this as a function-like macro.

That would be correct for "#define foo(" but the preprocessor
specification treats this whitespace as significant here so this test
currently fails.
2010-05-14 09:20:13 -07:00
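The distinction tested in 67c27afc16 can be sketched with a small classifier. This is a hedged illustration, not the flex/bison logic in glcpp: `classify_define` and the `DEFINE` pattern are hypothetical, and the pattern only inspects the start of the directive.

```python
import re

# A '(' counts as the start of a parameter list only when it follows the
# macro name immediately, with no whitespace in between.
DEFINE = re.compile(r"#\s*define\s+([A-Za-z_]\w*)(\()?")

def classify_define(line: str):
    """Return (name, kind) for a #define directive."""
    m = DEFINE.match(line.strip())
    if m is None:
        raise ValueError("not a #define directive")
    kind = "function-like" if m.group(2) else "object-like"
    return m.group(1), kind

print(classify_define("#define foo("))   # prints ('foo', 'function-like')
print(classify_define("#define foo ("))  # prints ('foo', 'object-like')
```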
Carl Worth
4abc3dec72 Add tests for the structure of function-like macros.
These test only the most basic aspect of parsing of function-like
macros.  Specifically, none of the definitions of these function-like
macros use the arguments of the function.

No function-like macros are implemented yet, so all of these fail for
now.
2010-05-13 09:35:50 -07:00
Carl Worth
a68e668b17 Add test case to define, undef, and then again define a macro.
Happily, this is another test case that works just fine without any
additional code.
2010-05-12 13:14:08 -07:00
Carl Worth
7bdd1f36d9 Add test for #undef.
Which hasn't been implemented yet, so this test fails.
2010-05-12 13:11:23 -07:00
Carl Worth
39cd7c2f2e Add test for an empty definition.
Happily this one passes without needing any additional code.
2010-05-12 12:49:07 -07:00
Carl Worth
df2ab5b992 Add tests defining a macro to be a literal and another macro.
These 3 new tests are modeled after 3 existing tests but made slightly
more complex since now instead of defining a new macro to be an
existing macro, we define it to be replaced with two tokens, (one a
literal, and one an existing macro).

These tests all fail currently because the replacement lookup is
currently happening on the basis of the entire replacement string
rather than on a list of tokens.
2010-05-11 12:39:29 -07:00
Carl Worth
34db0d332e Add a couple more tests for chained #define directives.
One with the chained defines in the opposite order, and one with the
potential to trigger an infinite-loop bug through mutual
recursion. Each of these tests passes already.
2010-05-11 12:35:06 -07:00
Carl Worth
49206ef4c8 Add test for chained #define directives.
Where one macro is defined in terms of another macro. The current
implementation does not yet deal with this correctly.
2010-05-11 12:29:22 -07:00
Carl Worth
e8c790b3ce Add a very simple test for the pre-processor.
Validate desired test cases by ensuring the output of glcpp matches
the output of the gcc preprocessor, (ignoring any lines of the gcc
output beginning with '#').

Only one test case so far with a trivial #define.
2010-05-10 16:21:10 -07:00
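The comparison described in e8c790b3ce can be sketched as a small check on two captured outputs. This is only a model of the idea, not the test script added by the commit: `outputs_match` is a hypothetical name, and how the two outputs are produced and captured is left out.

```python
def outputs_match(glcpp_output: str, gcc_output: str) -> bool:
    """Compare glcpp output with gcc preprocessor output.

    Lines of the gcc output beginning with '#' (its line markers)
    are ignored, as described in the commit message.
    """
    expected = [line for line in gcc_output.splitlines()
                if not line.startswith("#")]
    return glcpp_output.splitlines() == expected

# Assumed sample outputs for a trivial "#define foo bar" followed by "foo".
gcc_out = '# 1 "test.c"\n# 1 "<built-in>"\n\nbar\n'
print(outputs_match("\nbar\n", gcc_out))  # True
```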