Internal IR for compiler args #14748

Draft
dcbaker wants to merge 8 commits into mesonbuild:master from dcbaker:submit/compile-args-abstraction

Member

@dcbaker dcbaker commented Jun 25, 2025

This is set to draft because it's very much a Work in Progress.

The way we handle compiler arguments in Meson is to manipulate raw strings. This works fine when the majority of your compilers are the same, or at least attempt to be compatible. Unfortunately, Meson has to deal with many different toolchains, which often have incompatible argument syntax.

This manifests in a number of different ways:

  • It makes de-duplicating or removing arguments difficult. Consider the high complexity of the CompileArgs class.
  • It makes checking for a kind of argument difficult. For example, we have a check for -Wl,-rlink or -rlink inside of build.py, but that could miss -rlink for many compilers.
  • It means when converting a compiler argument from one language to another, like when proxying linker arguments, we need to determine the format the argument is already in and then attempt to convert.
  • It makes it more difficult to set fields in the Xcode and Visual Studio backends, which generally do not expect features to be set by compiler arguments, but by setting specific values. It is very easy to see this in the VS backend code, which has long if blocks converting string command line arguments to XML fields.

In order to address this I propose moving to an abstract IR for these arguments. This abstract IR would simplify de-duplicating and cleaning arguments, as well as converting their forms. Each Compiler provides a mechanism to convert arguments from their format into the abstract IR, and a method to convert the abstract IR back into concrete arguments in their expected format. We then manipulate the abstract IR more easily. Consider something like this:

[Define('X'), Opaque('-ffoobar'), Warning('foo'), Warning('foo'), Warning('foo', enable=False), Define('X')]

In this format it's pretty easy to look and see that we have two copies of -DX and two copies of -Wfoo along with -Wno-foo, cancelling both of those out.
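
As a rough illustration of the de-duplication described above (this is not the code from this PR), the following sketch models the three node types from the example as frozen dataclasses. The field names, the `dedup` helper, and the "last warning flag wins" rule are assumptions made for the sketch; `Opaque` nodes are left untouched because nothing is known about them.

```python
import dataclasses


@dataclasses.dataclass(frozen=True)
class Define:
    name: str


@dataclasses.dataclass(frozen=True)
class Opaque:
    value: str


@dataclasses.dataclass(frozen=True)
class Warning:  # shadows the builtin Warning inside this sketch only
    name: str
    enable: bool = True


def dedup(args):
    """Hypothetical helper: drop repeated defines, keep the last setting per warning."""
    # Index of the last occurrence of each warning name (assumes later flags win).
    last = {a.name: i for i, a in enumerate(args) if isinstance(a, Warning)}
    seen_defines = set()
    out = []
    for i, a in enumerate(args):
        if isinstance(a, Warning):
            if last[a.name] != i:
                continue  # overridden by a later flag for the same warning
        elif isinstance(a, Define):
            if a in seen_defines:
                continue  # exact duplicate of an earlier define
            seen_defines.add(a)
        # Opaque nodes always pass through unchanged.
        out.append(a)
    return out


args = [Define('X'), Opaque('-ffoobar'), Warning('foo'), Warning('foo'),
        Warning('foo', enable=False), Define('X')]
print(dedup(args))
```

With the list from the example, this leaves one `Define('X')`, the opaque argument, and the final `Warning('foo', enable=False)`.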

Below is a very trivial implementation of this idea, with just enough implemented for GCC or Clang to compile the one trivial test (run directly). It also makes use of Python features we currently can't rely on; I'll rewrite those later.

Known TODO items:

  • Handle arguments that can be split but not required to be split (such as -D and -U with GCC/Clang)
  • Handle arguments that must be split (like --cfg with Rust)

@dnicolodi
Member

FWIW, I like this!

:param enable: If true then enable the warning, otherwise suppress it.
"""

target: str
Member

I think an enable member went missing here.

Member Author

Oops, the enable docstring is a copy-paste error. There is no -Wno-error=... in any compiler AFAIK, so no need to represent it.

Member

@eli-schwartz eli-schwartz Jul 7, 2025

s/any/every/ ?

So I guess yeah, if it can't be universally represented it might as well be opaque whenever it appears.

Member Author

@dcbaker dcbaker Jul 7, 2025

Originally I had Warning() with both enable and error fields, but realized that you could end up with the invalid state of error disabled. I guess we could leave it and have some kind of validation pass later, but I prefer to not be able to model invalid state if possible.

Contributor

-Wno-error is valid at least on gcc and clang. Given f.c:

int main()
{
	puts("aa");
}

you have:

$ gcc f.c -Wno-error=implicit-function-declaration -std=c99
f.c: In function ‘main’:
f.c:3:9: warning: implicit declaration of function ‘puts’ [-Wimplicit-function-declaration]


$ clang f.c -Wno-error=implicit-function-declaration -std=c99
f.c:3:2: warning: call to undeclared function 'puts'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]

Without the flag you get an error with both compilers.

Member Author

Interesting, GCC does not document that on their warnings page, just -W, -Wno- and -Werror=.

Since it does exist I'll fold it.

Member

?

-Werror=
    Make the specified  warning into an error. The specifier for a warning is appended; for
    example -Werror=switch turns the warnings controlled by -Wswitch into errors. This switch
    takes a negative form, to be used to negate -Werror for specific warnings; for example
   -Wno-error=switch makes -Wswitch warnings not be errors, even when -Werror is in effect.

Contributor

Maybe you could have an enum DISABLED, DEFAULT_LEVEL, WARNING, ERROR corresponding to -Wno-, -W, -Wno-error=, -Werror=.
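
A minimal sketch of that enum suggestion, assuming GCC/Clang spelling. The names `WarningLevel` and `lower_gnu`, and the choice to store the flag prefix as the enum value, are illustrative and not taken from the PR.

```python
import dataclasses
import enum


class WarningLevel(enum.Enum):
    # Each member's value is the GCC/Clang prefix it lowers to.
    DISABLED = '-Wno-'
    DEFAULT_LEVEL = '-W'
    WARNING = '-Wno-error='
    ERROR = '-Werror='


@dataclasses.dataclass(frozen=True)
class Warning:  # shadows the builtin Warning inside this sketch only
    name: str
    level: WarningLevel = WarningLevel.DEFAULT_LEVEL


def lower_gnu(warning):
    """Hypothetical lowering of one Warning node to a GCC/Clang-style flag."""
    return warning.level.value + warning.name


print(lower_gnu(Warning('switch', WarningLevel.WARNING)))
```

A single field with four states avoids the enable/error pair discussed earlier, where one of the combinations has no sensible meaning.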

Member Author

@dcbaker dcbaker Jul 7, 2025

Ah, I was looking at the htmlized version of the info page. It's probably there but not readily obvious.

dcbaker added 8 commits July 7, 2025 13:02
Which I've somehow failed to add all of these years.
The goal is to be able to convert arguments from raw arguments that are
passed in via the `c_args` and friends methods, as well as from options
and the environment, into an abstract IR. This IR is then passed around
internally, and the backend is responsible for lowering it into the
format that it wants.

I'm hoping that this will simplify some amount of passing arguments
between languages (like C arguments being passed to non-c compilers, and
linker arguments between languages) as well as simplify the non-ninja
backends which generally use a different mechanism than raw compiler
arguments.
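
To make the "lowering" step in this commit message concrete, here is a small sketch, not the PR's implementation, that renders the same IR list in GCC-style and MSVC-style syntax. The node fields and the `lower`, `lower_gnu`, and `lower_msvc` helpers are hypothetical; only the `-D`/`-U` and `/D`//`/U` spellings are real flags.

```python
import dataclasses
from typing import Optional


@dataclasses.dataclass(frozen=True)
class Define:
    name: str
    value: Optional[str] = None  # None means a bare define with no "=value"


@dataclasses.dataclass(frozen=True)
class Undefine:
    name: str


@dataclasses.dataclass(frozen=True)
class Opaque:
    value: str


def lower(args, define_flag, undefine_flag):
    """Turn IR nodes back into concrete argument strings."""
    out = []
    for a in args:
        if isinstance(a, Define):
            suffix = '' if a.value is None else f'={a.value}'
            out.append(f'{define_flag}{a.name}{suffix}')
        elif isinstance(a, Undefine):
            out.append(undefine_flag + a.name)
        else:
            out.append(a.value)  # opaque arguments pass through verbatim
    return out


def lower_gnu(args):
    return lower(args, '-D', '-U')


def lower_msvc(args):
    return lower(args, '/D', '/U')


ir = [Define('X', '1'), Define('Z'), Undefine('Y')]
print(lower_gnu(ir))
print(lower_msvc(ir))
```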
…arguments

These will be used to convert the arguments in many cases.
@dcbaker dcbaker force-pushed the submit/compile-args-abstraction branch from bdccf01 to fd3bc13 July 7, 2025 20:08
Member Author

dcbaker commented Jul 7, 2025

@dnicolodi thanks for the reviews! I've fixed the things you pointed out.

Also in the latest update, linker arguments are now handled. Along with this, the check for manually specified -rpath arguments has gotten better: it no longer needs to check for strings, but can instead just see if there are any instances of Rpath() in the list.
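
The kind of check described here could look roughly like the sketch below. Only the `Rpath` name comes from the comment above; its `path` field, the `Opaque` node, and the `has_manual_rpath` helper are assumptions for illustration.

```python
import dataclasses


@dataclasses.dataclass(frozen=True)
class Rpath:
    path: str  # hypothetical field; the real node may differ


@dataclasses.dataclass(frozen=True)
class Opaque:
    value: str


def has_manual_rpath(args):
    """True if any IR node is an Rpath, however it was originally spelled."""
    return any(isinstance(a, Rpath) for a in args)


print(has_manual_rpath([Opaque('-O2'), Rpath('/opt/lib')]))
```

Because each compiler's parser would already have turned its own spelling into an `Rpath` node, the check itself no longer needs to know about `-Wl,` prefixes or other syntax differences.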

There's still a lot of plumbing work to do here, apart from implementing all of this for various compilers.

elif arg.startswith('-U'):
    ret.append(arguments.Undefine(arg.removeprefix('-U')))
elif arg.startswith('-l'):
    ret.append(arguments.LinkerSearch(arg.removeprefix('-l')))
Member

search?



@dataclasses.dataclass
class Undefine:
Member

For warnings we have enable=True, why not the same for Define?

Contributor

@bonzini bonzini Jul 7, 2025

In fact since (EDIT: fixed) -Dfoo and -Dfoo=1 are the same, could Undefine('foo') be replaced by Define('foo', value=None)?

Member

They are NOT the same, one sets it to implicitly 1 and the other, explicitly the null string.

For both, ifdef says "yes it is defined" and redefining may result in compiler warnings.

Member Author

Yeah, I really want to make it impossible to represent invalid state. So we basically have three distinct cases:

  • set define to value
  • set define to nothing
  • remove define

So... would representing it as:

class Define:
    name: str
    value: str | bool

work correctly?

You'd have value = False mean -Ufoo, value = True mean -Dfoo, and value = <str> mean -Dfoo=<str>?
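
One way to read that proposal, sketched with GCC/Clang spelling. The default of `True` and the `lower_gnu` helper are assumptions; only the `name` and `value` fields come from the comment above.

```python
from __future__ import annotations

import dataclasses


@dataclasses.dataclass(frozen=True)
class Define:
    name: str
    value: str | bool = True  # assumed default: a bare -Dname


def lower_gnu(define):
    """Hypothetical lowering covering undefine, bare define, and define with value."""
    if define.value is False:
        return '-U' + define.name
    if define.value is True:
        return '-D' + define.name
    # Any string, including the empty string, is emitted after '='.
    return f'-D{define.name}={define.value}'


for d in (Define('foo', False), Define('foo'), Define('foo', ''), Define('foo', 'bar')):
    print(lower_gnu(d))
```

The identity checks (`is False`, `is True`) keep the empty string distinct from `False`, so "set define to nothing" and "remove define" stay separate cases.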

Contributor

You're right, sorry. I meant that -Dfoo and -Dfoo=1 could be canonicalized to a single form (Define('foo', '1') in the abstract form, whereas the concrete form could be either) and value=None could be used for undefining.

Member

Maybe, but would that mean the user sets -Dfoo=1 and we translate to -Dfoo (or vice versa)?

Contributor

@bonzini bonzini Jul 7, 2025

Yes... OTOH you do want to collapse -Dfoo -Dfoo=1 to just one of them. Setting value='1' would also make __eq__ easier to implement.

Maybe some kind of have_equals: bool could be added to make round-trip conversion possible, and it would be ignored by __eq__.
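
A sketch of how the have_equals idea could be expressed with a dataclass field that is excluded from comparison. The `parse_define` and `lower_gnu` helpers, and the use of `None` for undefine, follow the suggestions in this thread but are otherwise hypothetical.

```python
import dataclasses
from typing import Optional


@dataclasses.dataclass(frozen=True)
class Define:
    name: str
    value: Optional[str] = '1'  # None means undefine
    # Remembered only for round-tripping; ignored by __eq__ and __hash__.
    have_equals: bool = dataclasses.field(default=True, compare=False)


def parse_define(arg):
    """Parse a joined -D/-U argument such as '-Dfoo', '-Dfoo=1' or '-Ufoo'."""
    if arg.startswith('-U'):
        return Define(arg[2:], None, have_equals=False)
    name, sep, value = arg[2:].partition('=')
    if sep:
        return Define(name, value, have_equals=True)
    return Define(name, '1', have_equals=False)  # canonicalize -Dfoo to value '1'


def lower_gnu(define):
    if define.value is None:
        return '-U' + define.name
    if not define.have_equals and define.value == '1':
        return '-D' + define.name  # reproduce the original bare spelling
    return f'-D{define.name}={define.value}'


a, b = parse_define('-Dfoo'), parse_define('-Dfoo=1')
print(a == b, lower_gnu(a), lower_gnu(b))
```

Here `-Dfoo` and `-Dfoo=1` compare equal and hash the same, so they collapse during de-duplication, while each still lowers back to the spelling it was parsed from.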

Member Author

I don't know that we have to guarantee we pass exactly the same arguments as long as we pass equivalent arguments. Especially since we already attempt to do some level of de-duplication. I'm perfectly fine with -Dfoo= becoming -Dfoo=1.

    k, v = arg.split('=')
else:
    k, v = arg, None
ret.append(arguments.Define(k, v))
Member

Defines can be spelled on the command line as -D KEY=value (with the space after the -D switch) meaning that they are seen as two arguments, not as one. Thus, I think that arguments parsing needs to be more sophisticated. Maybe using something like argparse may make the code simpler.
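
For illustration, a hand-rolled parser (rather than argparse) that accepts both the joined and the split spelling might look like this. The node definitions and the `to_ir` name are assumptions, not the PR's code. `str.partition` is used so that a value containing `=` does not break the split.

```python
import dataclasses
from typing import Optional


@dataclasses.dataclass(frozen=True)
class Define:
    name: str
    value: Optional[str] = None


@dataclasses.dataclass(frozen=True)
class Undefine:
    name: str


@dataclasses.dataclass(frozen=True)
class Opaque:
    value: str


def to_ir(argv):
    """Convert GCC/Clang-style arguments to IR, accepting '-DX' and '-D X'."""
    ret = []
    it = iter(argv)
    for arg in it:
        if arg in ('-D', '-U'):
            # Split form: the operand is the next element of argv.
            operand = next(it, None)
            if operand is None:
                raise ValueError(f'{arg} requires an operand')
        elif arg.startswith(('-D', '-U')):
            operand = arg[2:]  # joined form
        else:
            ret.append(Opaque(arg))
            continue
        if arg.startswith('-U'):
            ret.append(Undefine(operand))
        else:
            key, sep, value = operand.partition('=')
            ret.append(Define(key, value if sep else None))
    return ret


print(to_ir(['-D', 'KEY=value', '-DX', '-U', 'Y', '-O2']))
```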

Member Author

It's possible we will have to do something more sophisticated before this can land. Right now my biggest concern is getting to the point where I can prove that the IR can do everything we currently do with the POSIX-style string parsing, and solve some of the difficult problems we can't solve easily. So, for the moment I'd rather not get hung up here, but I will leave these open to keep them in mind.

Member Author

This is also an issue we'll need to resolve for Rust because --cfgs must be passed with a space in them.
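
Taking that statement as given, the lowering side of this can be sketched by letting one IR node expand to more than one argument. The `Cfg` node and `lower_rust` helper are hypothetical names, not part of the PR.

```python
import dataclasses


@dataclasses.dataclass(frozen=True)
class Cfg:  # hypothetical IR node for a Rust --cfg value
    value: str


@dataclasses.dataclass(frozen=True)
class Opaque:
    value: str


def lower_rust(args):
    """Lower IR to an argument list, emitting --cfg and its value separately."""
    out = []
    for a in args:
        if isinstance(a, Cfg):
            out.extend(['--cfg', a.value])  # two argv elements for one node
        else:
            out.append(a.value)
    return out


print(lower_rust([Cfg('feature="x"'), Opaque('-O')]))
```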

        k, v = arg, None
    ret.append(arguments.Define(k, v))
elif arg.startswith('-U'):
    ret.append(arguments.Undefine(arg.removeprefix('-U')))
Member

The same is true for -U.

Contributor

bonzini commented Feb 23, 2026

#15577 should be revisited once this work is complete.
