Cross Compilation and Module Archive Organization

TODO(aretm,30/10/2016) CHECK THAT THIS PAGE MATCHES THE IMPLEMENTATION!

There exists a separation between the host environment where sac2c runs, and one or more target environments where SaC programs/binaries run.

For example, when compiling SaC programs for freestanding OpenRISC on FPGA, the sac2c binary may run on an x86 Linux host computer. In this example, the host environment is x86-linux-gnu, and there is one target environment, called or1k-elf.

Tree files must be dynamically loadable by sac2c, so their binary format is always that of the host environment, specifically the one used to compile sac2c.

Nevertheless, there may be different implementations of the same SaC module (hence with the same name) for different target envs. Because the implementations are different, the Tree files are different too. As a result, we must support multiple Tree files (potentially up to 1 per target env), although they are all compiled using the same binary format (that of sac2c).

Of course, as soon as the host env and the target env(s) are different, they will require different C tool chains. So one must be careful to compile the Tree C sources with the host env's C tools, not the target envs' C tools.

Within one target env, there is support for zero or more SaC Binary Interfaces (SBIs), which are mutually incompatible with regard to interoperability.

For example, code generated for different RC protocols is not mutually interoperable, even though it may be binary compatible with the target env's ABI.

Within one SBI, the SaC compiler may be invoked with different code generation options for Module files. These options influence how code is generated, but the resulting code remains mutually interoperable within that SBI.

For example, options that hint at the cache sizes for with-loop optimizations do not influence interoperability of the generated code.

The different sets of “options” used in this way must be identifiable via configuration, e.g. in sac2crc. The generated binary files may also need to be identified separately. For this, we introduce the notion of variants.

Variants obey the following invariant: all variants of one Module file for a given SBI are mutually interoperable.

  1. The target mechanism in sac2crc presents a flattened version of all three levels (target environment, SBI, variant)! Each target has a target environment specified in TARGET_ENV, and an SBI specified in SBI.
  2. The file structure described below shows that at any given time we only keep a single level 3 version (variant) for any combination of level 1 (TARGET_ENV) and level 2 (SBI)! If several targets with identical TARGET_ENV and SBI need to coexist, either two copies of the sources are required or a new SBI needs to be introduced.
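As an illustration of point 2, here is a minimal Python sketch that groups targets by their (TARGET_ENV, SBI) pair and reports the groups that would end up sharing one directory. The function name and the target `seq_bigcaches` are hypothetical; only `seq`, `mt_pth`, and the two resource names come from this page.

```python
from collections import defaultdict


def find_colliding_targets(resolved):
    """Group targets by (TARGET_ENV, SBI) and return the groups with more than one member.

    `resolved` maps a target name to its already-resolved resources.
    Targets in the same group would share one output directory and therefore
    cannot coexist without a second source copy or a new SBI.
    """
    groups = defaultdict(list)
    for name, resources in resolved.items():
        groups[(resources["TARGET_ENV"], resources["SBI"])].append(name)
    return {key: sorted(names) for key, names in groups.items() if len(names) > 1}


# Hypothetical example: "seq_bigcaches" differs from "seq" only in tuning options.
example = {
    "seq": {"TARGET_ENV": "host", "SBI": "seq"},
    "seq_bigcaches": {"TARGET_ENV": "host", "SBI": "seq"},
    "mt_pth": {"TARGET_ENV": "host", "SBI": "mt-pth"},
}
collisions = find_colliding_targets(example)
```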

SaC's private heap manager (PHM) gets conditionally linked with generated programs. It provides two APIs: the regular malloc-style API, and an “advanced” API with allocation hints.

There are currently four ways to compile programs:

  • generate code to use the malloc-style API, and link directly with libc,
  • generate code to use the malloc-style API, and link with PHM,
  • generate code to use the PHM API, and link with PHM,
  • generate code to use the PHM API, and link with a “compat” module which provides PHM-style wrappers around malloc/free but without PHM functionality.

With the proposed changes the selection happens as follows:

  • which API is used is decided by the SBI. A new sac2crc resource USE_PHM_API is defined for this purpose.
  • all programs are linked with libphm.a by default, or libphm.compat.a if the flag -nophm is provided at link time, or libphm.diag.a if the flag -check h is used.
  • on SBIs without the PHM API, libphm.a, libphm.compat.a and libphm.diag.a are all empty libraries.
  • on SBIs with the PHM API, libphm.a and libphm.diag.a contain the PHM, and libphm.compat.a contains the malloc wrappers.
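The selection rules above can be sketched in Python as follows. This is only an illustration, not the sac2c implementation: the function names are hypothetical, and the precedence of -check h over -nophm when both are given is an assumption, since the text does not say which one wins.

```python
def select_phm_library(nophm=False, check_heap=False):
    """Return the PHM archive to link with.

    nophm      -- True when -nophm was given at link time
    check_heap -- True when -check h was given
    Assumption: -check h takes precedence if both flags are present.
    """
    if check_heap:
        return "libphm.diag.a"
    if nophm:
        return "libphm.compat.a"
    return "libphm.a"


def phm_library_contents(use_phm_api):
    """Describe what each archive contains for an SBI, keyed by file name.

    use_phm_api mirrors the USE_PHM_API resource of the SBI.
    """
    if not use_phm_api:
        # SBIs without the PHM API ship three empty archives.
        return {
            "libphm.a": "empty",
            "libphm.diag.a": "empty",
            "libphm.compat.a": "empty",
        }
    return {
        "libphm.a": "PHM",
        "libphm.diag.a": "PHM",
        "libphm.compat.a": "malloc wrappers",
    }
```

Because the archive names are the same on every SBI, the link step does not need to know whether the SBI uses the PHM API; only the contents of the archives differ.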

On one filesystem root, there may be one or more host environments (multiple binary installs of the same version of sac2c). So we identify the host environment like everyone else does, using a canonical specifier string.

Multiple versions of sac2c may also be installed side by side, so we adopt the naming scheme used by gcc and everyone else:

$prefix
    `--- bin
    |     `------ sac2c-VERSION
    `--- share
    |        `------ doc
    |        |        `---- sac2c
    |        `------ man
    |        |        `---- man1
    |        |                `----- sac2c.1
    |        `------ sac2c
    |                 `---- VERSION1
    |                 |        `---- sac2crc
    |                 `---- VERSION2
    |                 |        `---- sac2crc
    |                 `---- ...
    `--- lib / sac2c / VERSION / modlibs
    |        `------ target-env1
    |        |           `--------- sbi1 (eg "mt-rc1")
    |        |           `--------- ...
    |        |           `--------- sbi2 (eg "mt-rc2")
    |        |           |            `--------- libScalarArithMod.a
    |        |           |            `--------- libScalarArithMod-bigcaches.a
    |        |           |            `--------- libScalarArithMod-specx1000.a
    |        |           |            `--------- libScalarArithMod-variant1.a
    |        |           |            `--------- libScalarArithMod-variant2.a
    |        |           |            `--------- libScalarArithMod-....a
    |        |           `--------- sbi3 (eg "cuda")
    |        |           |            `--------- libScalarArithMod.so
    |        |           `--------- sbi4 (eg "cuda+omp")
    |        |                        `--------- libScalarArithMod.so
    |        `------ target-env2
    |        `------ ...
    `--- lib / sac2c / VERSION / rt
    |        `------ target-env1
    |        |           `--------- sbi1 (eg "mt-rc1")
    |        |           |            `--------- libsac.a
    |        |           |            `--------- libphm.a
    |        |           |            `--------- libphm.diag.a
    |        |           |            `--------- libphm.compat.a
    |        |           `--------- sbi2 (eg "mt-rc2")
    |        |           |            `--------- libsac.a
    |        |           |            `--------- libphm.a
    |        |           |            `--------- libphm.diag.a
    |        |           |            `--------- libphm.compat.a
    |        |           `--------- ...
    |        `------ target-env2
    |        `------ ...
    |
    `--- libexec / sac2c / VERSION /
                    `----- sac2c
                    `----- sac2c-d
                    `----- sac2c-p
                    `----- tree
                             `--- target-env1
                             |      `----- libFibreIOTree.so
                             |      `----- ...
                             `--- target-env2
                                    `----- libFibreIOTree.so
                                    `----- ...
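
The layout above can be expressed as a few path-building helpers. The following Python sketch is read directly off the tree; the function names are hypothetical, the version string "1.2" in the usage note is made up, and the `lib<Module>Mod` / `lib<Module>Tree` naming is inferred from the file names shown rather than confirmed against the implementation.

```python
from pathlib import PurePosixPath


def modlib_path(prefix, version, target_env, sbi, module, variant=None, ext="a"):
    """Path of a module archive: one directory per (target env, SBI), variants as file-name suffixes."""
    suffix = "-" + variant if variant else ""
    name = "lib{}Mod{}.{}".format(module, suffix, ext)
    return PurePosixPath(prefix) / "lib" / "sac2c" / version / "modlibs" / target_env / sbi / name


def runtime_dir(prefix, version, target_env, sbi):
    """Directory holding libsac.a and the libphm*.a archives for one (target env, SBI)."""
    return PurePosixPath(prefix) / "lib" / "sac2c" / version / "rt" / target_env / sbi


def tree_path(prefix, version, target_env, module):
    """Path of a Tree file: keyed by target env only, since its binary format is the host's."""
    name = "lib{}Tree.so".format(module)
    return PurePosixPath(prefix) / "libexec" / "sac2c" / version / "tree" / target_env / name
```

For instance, `modlib_path("/usr/local", "1.2", "or1k-elf", "mt-rc2", "ScalarArith", "bigcaches")` yields `/usr/local/lib/sac2c/1.2/modlibs/or1k-elf/mt-rc2/libScalarArithMod-bigcaches.a`. Note that the Tree path has no SBI component, matching the tree above.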

We want to support installing with --prefix=/usr/local (or maybe even --prefix=/usr), which means the whole SaC hierarchy will overlap with whatever other tools are already there. This is why it is important to add sub-directories share/doc/sac2c, libexec/sac2c, lib/sac2c, and so on.

Alert: the following section needs re-writing by Hans or Artem!

We distinguish between:

  1. default path(s): the directories populated during installation, which may be “system wide”
  2. custom path(s): the directories written to by sac2c, which are likely not system wide (especially when the user does not have write permission to the installation directories)

At invocation time, a search path must be decided:

  1. the search path starts with $prefix (the default path)
  2. if a ~/.sac2c file does not exist, then:
    1. sac2c asks the user for a custom destination directory for generated files. If the user has no preference, sac2c suggests the path ~/.sac;
    2. the decided path is written to ~/.sac2c.
  3. ~/.sac2c is loaded, and the path in there is added at the beginning of the search path.
  4. If the user specifies (on the command line) an additional prefix, prepend that to the search path.
  5. Prepend the directory of -o to the search path.
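
The search path steps above can be sketched in Python as follows. This is a hedged illustration only: the function names are hypothetical, the `ask_user` callback stands in for the interactive prompt, and the assumption that ~/.sac2c holds a single line containing the path is not stated on this page.

```python
import os


def load_custom_path(rc_file, ask_user):
    """Steps 2 and 3: create the rc file on first use, then return the path stored in it.

    ask_user is called with the suggested default ("~/.sac") and returns the chosen path.
    Assumption: the rc file contains just the path on one line.
    """
    if not os.path.exists(rc_file):
        chosen = ask_user("~/.sac")
        with open(rc_file, "w") as handle:
            handle.write(chosen + "\n")
    with open(rc_file) as handle:
        return handle.read().strip()


def build_search_path(prefix, custom_path, cli_prefix=None, output_dir=None):
    """Return the search path, highest priority first."""
    search = [prefix]                  # step 1: the default path
    search.insert(0, custom_path)      # step 3: path loaded from ~/.sac2c
    if cli_prefix is not None:
        search.insert(0, cli_prefix)   # step 4: extra prefix from the command line
    if output_dir is not None:
        search.insert(0, output_dir)   # step 5: directory of -o
    return search
```

Since every step prepends, the directory of -o is searched first and $prefix last.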

At invocation time, an output path must be decided:

  1. if -o is specified, use that as output path.
  2. if -o is not specified and -install is not specified, use . as output path.
  3. if -install is specified, then:
    1. the output path is initialized to $prefix (the default path)
    2. if a ~/.sac2c file does not exist, then:
      1. sac2c asks the user for a custom destination directory for generated files. If the user has no preference, sac2c suggests the path ~/.sac;
      2. the decided path is written to ~/.sac2c.
    3. ~/.sac2c is loaded, and the output path is set to the content of that file.

All of this applies to modules. If we encounter something that is not a module, then -install is not valid. Moreover, if -o is not specified, the output file name defaults to ./a.out; if -o is specified, the output file name is set to that.
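
The output decision described above can be sketched in Python as follows. The function name and parameters are hypothetical. Two assumptions are made because the text leaves them open: -o takes precedence over -install when both are given (following the order of the list), and the path stored in ~/.sac2c has already been loaded and is passed in as `custom_path`.

```python
def decide_output(is_module, custom_path, o=None, install=False):
    """Return the output path (modules) or the output file name (programs).

    is_module   -- True when compiling a module
    custom_path -- path loaded from ~/.sac2c (see the search path steps)
    o           -- value of -o, or None if not given
    install     -- True when -install was given
    """
    if not is_module:
        if install:
            raise ValueError("-install is only valid for modules")
        return o if o is not None else "./a.out"
    if o is not None:
        return o          # -o given: use it (assumed to win over -install)
    if not install:
        return "."        # neither -o nor -install: current directory
    # -install: the initial value $prefix is always replaced by the content
    # of ~/.sac2c, so the custom path is the effective result.
    return custom_path
```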

We want to reuse the existing “target” mechanism of the sac2crc parser, namely:

  • inheritance
  • flexibility: letting each level override variables of the levels it inherits from, or append values to them.

We do this by adding new resources:

  • TARGET_ENV identifies the target environment;
  • SBI identifies the SBI.

We also eliminate LIB_VARIANT; it is obsoleted by the joint use of TARGET_ENV and SBI.

The power of the target mechanism is at the same time its weakness. In order to avoid complete chaos we informally distinguish between three different types of targets:

  1. basic targets that define a combination of TARGET_ENV and a base SBI. For example, we have:
    target default_sbi:
    TARGET_ENV       :=  "host"
    SBI              :=  "seq"
    MT_MODE          :=  0
    BACKEND          :=  "C99"
    RC_METHOD        :=  "local"
    ...
    
    target default_mt :: default_sbi:
    SBI              := "mt"
    MT_MODE          :=  1
    USE_PHM_API      :=  1
    
    target omp :: default_sbi:
    SBI              := "omp"
    BACKEND          := "omp"
    MT_MODE          :=  2
    MT_LIB           := "omp"
    CFLAGS           += " "
    LDFLAGS          += " "
  2. modifier targets that define properties or variants that are mainly orthogonal to the basic targets. These targets do not inherit from any basic targets. Examples are:
    target add_pth:
    LIBS             +=  " -D_THREAD_SAFE -pthread "
    CFLAGS           +=  " -D_THREAD_SAFE -pthread"
    SBI              +=  "-pth"
    MT_LIB           :=  "pthread"
    
    target add_lpel:
    SBI              +=  "-lpel"
    MT_LIB           :=  "lpel"
    CFLAGS           +=  " "
    LIBS             :=  "  -llpel "
    RC_METHOD        :=  "local_pasync_norc_desc"
    
    
    target add_rt:
    SBI              +=  "-rtspec"
    RTSPEC           :=  1
  3. pre-defined final targets that are often-used combinations of one basic target with a set of modifier targets. These should be defined by means of inheritance only. Examples are:
    target seq :: default_sbi:
    
    target mt_pth :: default_mt :: add_pth:
    
    target mt_pth_rt :: mt_pth :: add_rt:
    
    target mt_lpel :: default_mt :: add_lpel:
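
To make the interplay of inheritance, `:=` and `+=` concrete, here is a minimal Python sketch that resolves a final target into its resources. It is not the sac2crc parser: the data layout and the `resolve` function are hypothetical, only a subset of the resources above is modelled, and the evaluation order (parents left to right, then the target's own assignments) is an assumption.

```python
# Each target: (list of parents, list of (resource, operator, value)).
TARGETS = {
    "default_sbi": ([], [("TARGET_ENV", ":=", "host"), ("SBI", ":=", "seq"), ("MT_MODE", ":=", 0)]),
    "default_mt": (["default_sbi"], [("SBI", ":=", "mt"), ("MT_MODE", ":=", 1), ("USE_PHM_API", ":=", 1)]),
    "add_pth": ([], [("SBI", "+=", "-pth"), ("MT_LIB", ":=", "pthread")]),
    "add_rt": ([], [("SBI", "+=", "-rtspec"), ("RTSPEC", ":=", 1)]),
    "seq": (["default_sbi"], []),
    "mt_pth": (["default_mt", "add_pth"], []),
    "mt_pth_rt": (["mt_pth", "add_rt"], []),
}


def resolve(name, targets=TARGETS, env=None):
    """Apply all parents in order, then the target's own assignments.

    ":=" overrides a resource, "+=" appends to its current (string) value.
    Assumption: a parent reachable along two paths would be applied twice.
    """
    env = {} if env is None else env
    parents, assignments = targets[name]
    for parent in parents:
        resolve(parent, targets, env)
    for key, op, value in assignments:
        if op == ":=":
            env[key] = value
        else:
            env[key] = env.get(key, "") + value
    return env
```

Under these assumptions, `resolve("mt_pth_rt")` yields TARGET_ENV "host" and SBI "mt-pth-rtspec": the basic target sets the SBI, and each modifier target appends its own suffix to it.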

For better maintainability of the developer version, the monolithic sac2crc file is split up into several files that contain logically coherent sets of targets. All of these can be found in the setup directory. They are combined via autoconf/configure through the file sac2crc.pre.in in the same directory.