Eigen compiler suite compared to LLVM etc.

Added by Rochus Keller over 1 year ago

I was made aware of the Eigen Compiler Suite today (https://github.com/rochus-keller/Oberon/discussions/55). This looks like a magnificent project and an incredible amount of work, thanks for sharing it!

Now that I have gained a first insight into the documentation and the source code, please allow me to ask you the following questions.

  1. Functionally, the comparison with LLVM and Clang is apparent. Where do you see the strengths of Eigen compared to the technologies mentioned?
  2. With around 65kSLOC C++ (~90kSLOC including assembler and def files) the project seems closer to e.g. QBE (https://c9x.me/compile/) or MIR (https://github.com/vnmakarov/mir) than to LLVM. Would it make sense to position Eigen as a flexible compiler backend, i.e. as an alternative to the technologies mentioned? Where do you see the strengths compared to QBE and MIR?
  3. The development of Eigen seems to have started around 2020. Can you disclose where Eigen is used, and how far away the project is from an official release?
  4. There is an AMD64 backend. Are there plans for an x86 backend?

Replies (39)

RE: Eigen compiler suite compared to LLVM etc. - Added by Florian Negele over 1 year ago

Hi Rochus, thanks for the interest.

Functionally, the comparison with LLVM and Clang is apparent. Where do you see the strengths of Eigen compared to the technologies mentioned?

It is a much simpler project and thus maintainable by a single person, hence its name. See also https://ecs.openbrace.org/manual/faq.html

Where do you see the strengths compared to QBE and MIR?

I don't know these projects. The ECS is not only a back-end but aims to be a simple and minimalistic but complete and self-contained toolchain, see https://ecs.openbrace.org/manual/manualse2.html

Can you disclose where Eigen is used, and how far away the project is from an official release?

I don't know where it is used or when it will be officially released. A release candidate might be version 0.2, see https://software.openbrace.org/projects/ecs/roadmap

There is an AMD64 backend. Are there plans for an x86 backend?

The AMD64 back-end can generate 16-bit, 32-bit and 64-bit machine code, see https://ecs.openbrace.org/manual/manualse43.html

RE: Eigen compiler suite compared to LLVM etc. - Added by Rochus Keller over 1 year ago

Thanks for your reply.

and thus maintainable by a single person, hence its name.

That holds provided the single person knows all the different architectures in the necessary detail, as well as all architecture-specific details of the code generator, the C++23 standard, and all object file formats with their platform-specific details. There are probably only a handful of people on the planet who can do all this themselves. Apparently you're one of them, which is impressive.

I don't know these projects.

But how did you come to build this large, comprehensive toolchain if there wasn't a specific customer? Can the project be considered a spin-off from A2 or from another ETH project? Are you also professionally involved in compiler development, or how did you arrive at the choice of architectures, e.g. AVR, M68k, PPC, etc., which tend to indicate a very specific application purpose?

An official release candidate might be version 1.0

Thanks for the link, but I don't see an entry for 1.0; the most recent one I see is 0.5 and there are no line items.

RE: Eigen compiler suite compared to LLVM etc. - Added by Florian Negele over 1 year ago

There are probably only a handful of people on the planet who can do all this themselves.

I don't know about this, but I would be all the more impressed by people who can understand and maintain the complete code base of big open-source compilers like GCC and Clang.

Can the project be considered a spin-off from A2 or from another ETH project?

It started as a hobbyist project and was later used for rapid prototyping for some personal research at ETH but there is no official affiliation. The architectures are basically chosen based on popularity and personal interest.

Thanks for the link, but I don't see an entry for 1.0

I meant version 0.2, sorry. I changed my previous message accordingly.

RE: Eigen compiler suite compared to LLVM etc. - Added by Rochus Keller over 1 year ago

research at ETH

What projects are you working on at ETH?

I meant version 0.2

Ok, I see, thanks. Now I think I have understood that "versions" represent feature sets rather than the sequential evolution of the system. Btw. according to version 0.1 the ARM32 implementation seems not to have started, but it exists anyway in the code.

Meanwhile I was able to successfully build the toolchain on Linux x64. I will try to use the generated executable to build the toolchain itself for my i586 machine and continue to use it there. Is there a description how to accomplish this?

Would you consider it feasible to port your code to C++11, or does it depend too heavily on C++17 or 14?

RE: Eigen compiler suite compared to LLVM etc. - Added by Florian Negele over 1 year ago

Btw. according to version 0.1 the ARM32 implementation seems not to have started, but it exists anyway in the code.

It is completed, but it only implements ARMv7, not the newer architectures like the other ARM-based back-ends, which is why I reset the progress bar for the moment.

Is there a description how to accomplish this?

The C++ front-end is nowhere near able to accomplish this, sorry. You might be more successful compiling directly with GCC on your x64 machine using the -m32 and -march=i586 flags, or with a cross-target version of GCC or Clang.
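
To confirm that a build made with the -m32 and -march=i586 flags really targets 32-bit x86, a small check like the following can be compiled with the same flags. This is only a sketch: the function names are made up for illustration, and it relies on the `__i386__` macro that GCC and Clang predefine for 32-bit x86 targets.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>

// Width of a pointer, in bits, for the target this file was compiled for.
inline std::size_t target_pointer_bits() {
    return sizeof(void*) * CHAR_BIT;
}

// True only when the compiler targets 32-bit x86 (e.g. GCC with -m32).
inline bool is_32bit_x86_build() {
#if defined(__i386__)
    // A 32-bit x86 target is expected to use 32-bit pointers.
    assert(target_pointer_bits() == 32);
    return true;
#else
    return false;
#endif
}
```

Calling is_32bit_x86_build() from a test program built with the cross-compilation flags should return true; on a plain x64 build it returns false.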

Would you consider it feasible to port your code to C++11, or does it depend too heavily on C++17 or 14?

I think that is a lot of work as the code base changed to C++14 and then C++17 almost six years ago. But I can provide the last C++11 version if that helps your endeavour.

RE: Eigen compiler suite compared to LLVM etc. - Added by Rochus Keller over 1 year ago

it only implements ARMv7

Ok, thanks. That's more than good enough for my purpose.

The C++ front-end is nowhere near able to accomplish this, sorry.

So I have mistakenly confused the intention of version 0.1 with the status. I will then try my luck with cross-compiling. Anyway, implementing a C++23 compiler looks like a lifetime's work from my humble perspective.

changed to C++14 and then C++17

What was the primary intention for this?

I can provide the last C++11 version

But would backporting all new features and fixes you added to the backend during the last six years not be much more work than migrating from the C++17 version (possibly using a transpiler)? I personally still implement most of my side projects in C++03 with a minimal Qt kernel (https://github.com/rochus-keller/LeanQt) instead of the standard library, mostly to reduce dependencies and complexity.

RE: Eigen compiler suite compared to LLVM etc. - Added by Florian Negele over 1 year ago

What was the primary intention for this?

The code base revealed a lot of shortcomings of the compilers I used at the time. Upgrading to newer versions seemed to be a good opportunity to also switch to the newer standards.

But would backporting all new features and fixes you added to the backend during the last six years not be much more work than migrating from the C++17 version (possibly using a transpiler)?

Yes of course, but do such tools actually exist?

I personally still implement most of my side projects in C++03 with a minimal Qt kernel instead of the standard library, mostly to reduce dependencies and complexity.

That is certainly an interesting approach. I stuck to the standard library instead but for the same reasons.

RE: Eigen compiler suite compared to LLVM etc. - Added by Rochus Keller over 1 year ago

but do such tools actually exist?

It can be done by using the Clang libraries and rewriting the AST, as e.g. demonstrated here: https://github.com/neobrain/cftf and https://www.youtube.com/watch?v=Rk2NOee4D7o.

I stuck to the standard library instead but for the same reasons.

Personally I consider the tendency to shift more and more statements to header files a complete madness. I consider GCC 4.8 the best GCC version ever. The generated code is n times faster compared with today's versions and compile times are also much faster. I also find that C++ is turning into an increasingly terrible language, and it's more and more likely that a given code will only work on a subset of the available compilers. C++11 brought a few advantages, but is essentially syntactic sugar throughout. You could do everything before with Boost, or even Qt, without the exorbitant compile times.

RE: Eigen compiler suite compared to LLVM etc. - Added by Florian Negele over 1 year ago

It can be done by using the Clang libraries and rewriting the AST

Thanks, will give it a try.

I consider the tendency to shift more and more statements to header files a complete madness

C++ modules will hopefully reduce that complexity and overhead. I regard move semantics the single most important addition to C++11 which finally completed the resource management facilities and thus can hardly be considered syntactic sugar.

RE: Eigen compiler suite compared to LLVM etc. - Added by Rochus Keller over 1 year ago

C++ modules will hopefully reduce that complexity and overhead

There is no reason for this assumption, as templates must still be present quasi in source form (even if somehow encoded), and as before will be the primary cause of slowing down the compilation process and increasing the size of the machine code.

I regard move semantics the single most important addition to C++11

From my point of view, this is a rather unfortunate extension, which significantly increases the complexity of the specification and language, and leaves even more implicit decisions to the compiler, which are even less comprehensible to the average programmer (often one can only find out via debugger or disassembly whether a given compiler actually makes a move or not). In addition, the functionality was already available before, e.g. via swap in the STL, or via Boost.Move (which also demonstrates well that even the new "move semantics" is syntactic sugar). Even in Qt the problem was already sufficiently well solved with implicitly shared classes.
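
The two idioms under discussion can be put side by side. The sketch below uses a hypothetical Buffer class (not taken from ECS, Boost, or Qt) and transfers ownership of a heap allocation once through a C++11 move constructor and once through the older swap idiom. The helper names take_by_move and take_by_swap are likewise invented for illustration.

```cpp
#include <algorithm>  // std::swap lived here before C++11
#include <cassert>
#include <cstddef>
#include <utility>    // std::move, and std::swap since C++11

// Hypothetical resource-owning class used only for this comparison.
class Buffer {
public:
    explicit Buffer(std::size_t n)
        : size_(n), data_(n ? new char[n]() : nullptr) {}
    ~Buffer() { delete[] data_; }

    // C++11: the move constructor takes over the allocation
    // and leaves the source empty.
    Buffer(Buffer&& other) noexcept
        : size_(other.size_), data_(other.data_) {
        other.size_ = 0;
        other.data_ = nullptr;
    }

    Buffer(const Buffer&) = delete;
    Buffer& operator=(const Buffer&) = delete;

    // Pre-C++11 idiom: exchange the internals of two objects.
    void swap(Buffer& other) {
        std::swap(size_, other.size_);
        std::swap(data_, other.data_);
    }

    std::size_t size() const { return size_; }

private:
    std::size_t size_;
    char* data_;
};

// Ownership transfer expressed with move semantics.
inline Buffer take_by_move(Buffer& source) {
    Buffer result(std::move(source));
    assert(source.size() == 0);  // this move constructor empties the source
    return result;
}

// The same transfer expressed with swap only.
inline void take_by_swap(Buffer& source, Buffer& target) {
    Buffer empty(0);
    empty.swap(source);  // 'empty' now owns the data, 'source' is empty
    target.swap(empty);  // 'target' owns the data; its old contents are
                         // released when 'empty' goes out of scope
}
```

Both helpers leave the source with size 0 and the destination owning the original allocation; the difference lies in whether the transfer is spelled out by hand or selected by overload resolution.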

RE: Eigen compiler suite compared to LLVM etc. - Added by Rochus Keller over 1 year ago

Last night I was finally able to build the suite so that the result runs on my old EliteBook main development machine. I eventually had to set up a Debian 12 i386 VM and build with -march=i586 and -static. The resulting binaries are around five times bigger than the dynamically linked versions, but at least they run on my old machine.

I must also say that the more I read about and experiment with your compiler suite, the more amazed and impressed I am. This is really magnificent work.

Currently I try to run the Hennessy benchmarks (attached). There doesn't seem to be an Input.Time() equivalent around, so I was thinking about an external call to e.g. the C standard library. Am I right to assume that linking with pre-existing C static or dynamic libraries is not yet supported? What do you recommend I should do?

RE: Eigen compiler suite compared to LLVM etc. - Added by Florian Negele over 1 year ago

Am I right to assume that linking with pre-existing C static or dynamic libraries is not yet supported?

It is supported. You can compile your benchmark together with a C++ source file that includes runtime/linuxlib.hpp and uses the LIBRARY and FUNCTION macros, see runtime/libsdl.cpp for an example. You then just have to add an external forward declaration in Oberon for each C function, see libraries/oberon/api.sdl.mod.

In case you just need a monotonic timer, you can also use the Linux module which already provides the necessary wrappers for some system calls. The following module provides a simple Oberon interface for the sys_clock_getres and sys_clock_gettime system calls. The frequency is expressed in Hertz:

MODULE Timer;

IMPORT SYSTEM, Linux IN API;

CONST Clock = Linux.CLOCK_MONOTONIC;

TYPE Counter* = HUGEINT;

PROCEDURE GetCounter* (): Counter;
VAR timespec: Linux.Timespec; result: Counter;
BEGIN
    ASSERT (Linux.ClockGetTime (Clock, SYSTEM.ADR (timespec)) = 0);
    result := timespec.tv_sec; result := result * 1000000; INC (result, timespec.tv_nsec DIV 1000); RETURN result;
END GetCounter;

PROCEDURE GetFrequency* (): Counter;
VAR timespec: Linux.Timespec;
BEGIN
    IF Linux.ClockGetRes (Clock, SYSTEM.ADR (timespec)) # 0 THEN RETURN 0 END;
    ASSERT ((timespec.tv_sec = 0) & (timespec.tv_nsec = 1)); RETURN 1000000;
END GetFrequency;

END Timer.
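
For comparison, roughly the same interface can be written in C++ against the POSIX clock_gettime and clock_getres functions. This is a sketch rather than part of ECS: the names get_counter and get_frequency are chosen here to mirror the Oberon module, and unlike the module above it does not assert a 1 ns clock resolution.

```cpp
#include <cassert>
#include <cstdint>
#include <time.h>  // POSIX: clock_gettime, clock_getres, CLOCK_MONOTONIC

typedef std::int64_t Counter;

// Monotonic time in microseconds, mirroring GetCounter.
inline Counter get_counter() {
    timespec ts = {};
    const int status = clock_gettime(CLOCK_MONOTONIC, &ts);
    assert(status == 0);
    (void)status;  // silence the unused-variable warning under NDEBUG
    return static_cast<Counter>(ts.tv_sec) * 1000000 + ts.tv_nsec / 1000;
}

// Ticks per second of get_counter (in Hertz), or 0 if the monotonic
// clock cannot be queried.
inline Counter get_frequency() {
    timespec res = {};
    if (clock_getres(CLOCK_MONOTONIC, &res) != 0) return 0;
    return 1000000;  // get_counter counts microseconds
}
```

An elapsed time in seconds is then the difference of two get_counter() values divided by get_frequency().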

RE: Eigen compiler suite compared to LLVM etc. - Added by Rochus Keller over 1 year ago

Great, thanks; and sorry for my seemingly silly questions, but the amount of information is just overwhelming (which is great, but unusual); I should first spend a week to read everything, but I'm impatient.

I was able to build the attached version of Hennessy with the ECS Oberon compiler; I also ran the same module (just with the Oakwood Input and Out modules instead) with the C and CIL code generated by my OBX compiler. Please find the results in the attached PDF. It's interesting to note that ECS achieves about the same performance as Mono 5. The results are likely not representative though, since Hennessy only consists of micro-benchmarks. I will try to compile my C++ version of the Are-we-fast-yet with ECS, which produces more reliable performance figures.

Please also note that I get an illegal instruction error in the FFT benchmark with the ECS generated code. I didn't analyze the cause yet.

linking with pre-existing C static or dynamic libraries ... It is supported.

Does this mean that I could link the ECS generated code with a static library (*.a, *.lib) generated e.g. by the GCC or MSVC toolchain, or even with the object files generated by these compilers?

You can compile your benchmark together with a C++ source file

You mentioned above that the ECS C++ compiler is not yet able to compile ECS itself. What is missing? What kind of C++ or C programs can it compile? Would it make sense to port a C compiler to ECS (e.g. chibicc, which is small and includes the preprocessor, see https://github.com/rui314/chibicc)? Or is this already sufficiently covered by your C++ compiler?

RE: Eigen compiler suite compared to LLVM etc. - Added by Florian Negele over 1 year ago

Please also note that I get an illegal instruction error in the FFT benchmark with the ECS generated code.

Your code revealed a stack alignment problem, thanks for reporting. Please use the attached patch file.

Does this mean that I could link the ECS generated code with a static library

I only tested it with dynamic libraries (*.so, *.dll).

What kind of C++ or C programs can it compile?

It is basically C with namespaces but a lot of things are untested or not implemented. Classes, exceptions, and especially templates are not supported.

alignment.diff (770 Bytes) Stack alignment fix

RE: Eigen compiler suite compared to LLVM etc. - Added by Rochus Keller over 1 year ago

Thank you very much for your support.

I applied the fix, recompiled, and re-ran the benchmark. Now FFT indeed doesn't crash anymore, but it is very slow (a factor 100 slower than the optimized C version, see attachment). The other benchmark maintained the factor.

RE: Eigen compiler suite compared to LLVM etc. - Added by Florian Negele over 1 year ago

Now FFT indeed doesn't crash anymore, but it is very slow

The compiler uses the legacy floating-point unit per default in 32-bit mode, see https://ecs.openbrace.org/manual/manualse44.html. In case your machine supports media floating-point instructions, you can change this behaviour by changing the second to last parameter of line 51 in file tools/compiler.cpp to true. Please note that the compiler is not designed to do a lot of optimising.
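
Whether a given machine supports media floating-point instructions can be checked at run time. The sketch below assumes that "media floating-point" refers to the SSE family (here SSE2), following AMD's terminology, and it uses the `__builtin_cpu_init` and `__builtin_cpu_supports` built-ins of GCC and Clang on x86. The function name supports_sse2 is made up for illustration; on other compilers or architectures it simply reports false.

```cpp
#include <cassert>

// Reports whether the CPU running this program supports SSE2.
inline bool supports_sse2() {
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__i386__) || defined(__x86_64__))
    __builtin_cpu_init();  // initialise the CPU feature data
    const bool sse2 = __builtin_cpu_supports("sse2") != 0;
#if defined(__x86_64__)
    assert(sse2);  // SSE2 is part of the x86-64 baseline
#endif
    return sse2;
#else
    return false;  // unknown compiler or non-x86 target
#endif
}
```

On a 32-bit machine such as the i586-class target discussed here, a false result would suggest staying with the legacy floating-point unit.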

RE: Eigen compiler suite compared to LLVM etc. - Added by Rochus Keller over 1 year ago

Thank you again for your support.

This is amazing! Now the generated code is even significantly faster than the unoptimized GCC, and anyway faster than TCC (which does virtually no optimization either), see attached PDF. Factor 2.5 off the optimized GCC 4.8 result is good and makes your compiler backend very attractive!

Please note that the compiler is not designed to do a lot of optimising.

I was aware of that, but I was suspicious because the FFT result was so far off the results of the unoptimized GCC and TCC, which looked like an issue. Even now I wonder why we are a factor two slower in FFT than TCC and unoptimized GCC; I assume ECS has yet another option to keep up, doesn't it?

RE: Eigen compiler suite compared to LLVM etc. - Added by Rochus Keller over 1 year ago

PS: attached is the C version of the Hennessy.Mod generated with my current OBX IDE in case you want to make experiments yourself.

RE: Eigen compiler suite compared to LLVM etc. - Added by Florian Negele over 1 year ago

Thanks. I was wondering what the purpose of Uniform11 is because it seems to discard its result?

I assume ECS has yet another option to keep up, doesn't it?

None I am aware of, sorry.

RE: Eigen compiler suite compared to LLVM etc. - Added by Rochus Keller over 1 year ago

I was wondering what the purpose of Uniform11 is because it seems to discard its result?

Yes, looks strange; but it's the same as in the original Hennessy.Mod found in OLR or Ofront, and I didn't question it. Even the version migrated to Oberon+ maintains it.

RE: Eigen compiler suite compared to LLVM etc. - Added by Rochus Keller over 1 year ago

Brief update of my journey:

  • Yesterday evening I started to migrate the subset of your source code which seems relevant for my purpose to C++11. It's about 16kSLOC, and it seems it doesn't require too many changes. In a few hours I was able to successfully compile amd64.cpp, object.cpp, driver.cpp, amd64assembler.cpp, amd64generator.cpp, code.cpp, and assembly.cpp. So this looks feasible if need be, and I leave it for now.
  • Today I finally managed to run cppamd32 and cppcode with a suitable example. I now study the generated intermediate code in parallel with your book and hope to understand it well enough to be able to re-target the codegen.c file of the chibicc compiler from AMD64 to your intermediate language. Fortunately, there are a few days off. Let's see how far I get.

RE: Eigen compiler suite compared to LLVM etc. - Added by Rochus Keller over 1 year ago

Have you actually tried to compile the ECS code with VisualStudio? I just installed VS2022 (CL.exe version 19.39), added all files required for cdamd32 with defines to a VS console project, and get a couple of errors (e.g. cannot convert AMD64::Operand::Size to Code::Type::Size, or accessing protected enum AMD64::Operand::Model, etc.).

RE: Eigen compiler suite compared to LLVM etc. - Added by Rochus Keller over 1 year ago

I did some experiments with obcode and cppcode (using ecsd -i).

When I try to process the output of obcode with cdamd32, I get syntax errors (see attached simple.cod and Hennessy.cod). The output of cppcode seems to work though (see attached hello.cod). Could you please check?

RE: Eigen compiler suite compared to LLVM etc. - Added by Florian Negele over 1 year ago

Have you actually tried to compile the ECS code with VisualStudio?

Development began with MSVC but I later basically abandoned it because there were a lot of standard-conformance issues like the one you described. I tried several times with each new version of CL but most of the issues never got fixed.

When I try to process the output of obcode with cdamd32, I get syntax errors

These two compilers are not designed to be used in a chain, since obcode targets a virtual platform which differs from the target of cdamd32. The platform is described using a data structure called layout. obcode uses the so-called standard layout (see tools/stdlayout.hpp), while cdamd32 uses a layout that describes the x86 platform (see tools/amd64generator.cpp, line 212). You should be able to use these two compilers in a chain if you make the platform of obcode match that of cdamd32.
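
The idea that two tools can only be chained when their platform descriptions agree can be sketched as follows. Everything here is hypothetical: the Layout struct, its three members, and the example values are invented for illustration and do not reproduce the actual ECS layout type or the contents of tools/stdlayout.hpp.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical, heavily simplified platform description.
struct Layout {
    std::size_t integer_size;     // bytes
    std::size_t pointer_size;     // bytes
    std::size_t stack_alignment;  // bytes
};

inline bool same_platform(const Layout& a, const Layout& b) {
    return a.integer_size == b.integer_size &&
           a.pointer_size == b.pointer_size &&
           a.stack_alignment == b.stack_alignment;
}

// Intermediate code produced for one layout can be consumed by a
// back-end only if that back-end assumes the same layout.
inline bool can_chain(const Layout& front_end, const Layout& back_end) {
    assert(front_end.pointer_size > 0 && back_end.pointer_size > 0);
    return same_platform(front_end, back_end);
}

// Illustrative values only; not the real ECS numbers.
inline Layout example_virtual_layout() {
    Layout layout = {8, 8, 8};
    return layout;
}

inline Layout example_x86_layout() {
    Layout layout = {4, 4, 4};
    return layout;
}
```

With these made-up values, can_chain(example_virtual_layout(), example_x86_layout()) is false, which corresponds to the mismatch between obcode and cdamd32 described above.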

RE: Eigen compiler suite compared to LLVM etc. - Added by Rochus Keller over 1 year ago

began with MSVC but I later basically abandoned it because there were a lot of standard-conformance issues

That's another challenge of "modern modern" C++; the more modern, the more cross-compiler issues. I took this as an opportunity to finish the C++11 migration of the cdamd32 tool; it now compiles on my old machine with GCC 4.8 and generates the same output as the original for my test cases; here is the source code: https://github.com/rochus-keller/EiGen

LeanQt has turned out not to be necessary; I just had to add a cross-platform is_directory function.
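
A C++11-compatible is_directory can be built on stat, without std::filesystem. The sketch below is one plausible shape, assuming POSIX stat on Unix-like systems and the CRT function _stat on Windows; it is not the function from the EiGen repository.

```cpp
#include <cassert>
#include <sys/types.h>
#include <sys/stat.h>  // stat (POSIX) or _stat (Windows CRT)

// Returns true if 'path' names an existing directory.
inline bool is_directory(const char* path) {
    assert(path != nullptr);
#if defined(_WIN32)
    struct _stat info;
    return _stat(path, &info) == 0 && (info.st_mode & _S_IFDIR) != 0;
#else
    struct stat info;
    return stat(path, &info) == 0 && S_ISDIR(info.st_mode);
#endif
}
```

A path that does not exist, or that names a regular file, yields false because either the stat call fails or the mode bits do not mark a directory.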

Now I will try to make it compile with MSVC 2014 (which is the version I use for my other projects which include some C++11 code bases).

since obcode targets a virtual platform which differs from the target of cdamd32.

Ok, I see, thanks. But apparently the output of the C++ compiler is compatible, which is actually what I want (I'm looking for a robust cross-platform backend for my Micron language which is supposed to use the C ABI). I will check the code for the difference anyhow.
