Writing Custom Startup Code

From MorphOS Library

Revision as of 15:37, 29 December 2012 by Krashan

Grzegorz Kraszewski

Every C programming handbook says that program execution starts from the main() function. That is also the impression we get when writing a program, and nothing we see seems to disprove it. It is not true, however. There are at least a few, and sometimes more than a dozen, kilobytes of code between the start of program execution and the first line of main(). Most of this code is not really needed in most cases.

What does this code do? To begin with, it is perfectly possible to have a program without any startup code at all. The system can jump into main() right away. Unfortunately, such a program would run only from a command line window. It would crash when started from Ambient. This is because Ambient sends a special message to the message port of a freshly created process. This message serves two purposes. Firstly, it contains Ambient launch parameters, namely the program icon descriptor and optionally descriptors of other icons that have been shift-clicked or dropped onto a panel. Secondly, a reply to the message signals to Ambient that the program has finished its execution. Sending a reply is obligatory. This is the minimal set of things to be done by startup code. In practice it should also open the required shared libraries. When one wants to use the C standard library, either with libnix or ixemul.library, the startup code also creates a "standard environment" for the C standard library and POSIX functions. Because of this, the startup code linked when using one of these libraries is quite complex, and therefore long.

Reasons for Writing Own Startup

The main advantage of a custom startup is its small size. The reduction in program startup time is negligible. A very short startup suits very short programs (for example shell commands) of a few kB in size. In this case the standard startup may easily be longer than the program code itself. One can also use a custom startup just for the satisfaction of making the program shorter by those few kilobytes. Custom startup code cannot be used when the program uses ixemul.library. When the program is linked with libnix, the possibility of using a custom startup depends on the standard C library functions used. Most of them do not need any preparation and will work with any startup. Some more complex functions, however, require constructors to be executed in the startup. If we use such functions, we will get linker errors about unresolved symbols. In such a case the choice is simple: one must either replace these functions with something else, or just use the standard startup code. A custom startup is therefore useful mostly when the standard C library is not used at all (in favour of the native MorphOS API), or when only simple functions from it are used.

If we are still determined to use our own startup, it is time to tell the compiler about it. Skipping the standard startup is done with the -nostartfiles argument. So when we want to use our startup with libnix, we use -nostartfiles together with -noixemul. Programmers wanting to go the pure MorphOS API way (without the C library) should use the -nostdlib option, which also implies -nostartfiles.

Let's Write It

Before we start to write the code, note that besides the things executed before calling the main() function, some code must also be called after it returns. So we also have "cleanup code". As this code is usually placed in the same function (the one that calls main()), both parts are commonly called just startup code.

As mentioned before, program execution does not really start from the main() function. Where does it start then? When an ELF executable is loaded from disk, a section named ".text" is found and the operating system jumps to the start of its contents. When a program is written in C, this means the start of the first function in the code, as in C there is no way to write code outside of a function. It must be noted that a C compiler may reorder functions in a single object file. The GCC 2.95.3 compiler never does it, but the aggressive optimizer of GCC 4 can change the order of functions. Fortunately this is done only inside a single source file. To make sure that our startup function will be the first, it must be placed in a separate file. Then the resulting object file must be linked as the first one, as linking order is always preserved. After this important note it is time for the code:

#include <proto/exec.h>
#include <proto/dos.h>
#include <dos/dos.h>
#include <workbench/startup.h>

The code starts by including the needed header files. We will need two basic system libraries: exec.library and dos.library. This explains why the standard startup code, be it libnix or ixemul.library, opens these two libraries: it simply needs them for itself.

struct Library *SysBase;
struct Library *DOSBase;

As our code will use these two libraries, we need to define their bases.

extern ULONG Main(struct WBStartup *wbmessage);

This is a declaration of the main function of our program. As the object file containing startup code should contain only one function (the entry one), the rest of the code has to be moved to other object files, for reasons explained above. That is why the main function has to be declared here, as we call it from the startup code. Alternatively its declaration may be placed in some header file and included here. The name Main() is arbitrary; it can be anything. I've just given it the typical name, capitalizing the first letter to avoid a possible name conflict with the standard library. The argument of Main() is the startup message (mentioned above) sent by Ambient. If we do not plan to use it inside Main(), we can just declare it this way:

extern ULONG Main(void);

The next important thing is to define a mysterious global symbol __abox__.

ULONG __abox__ = 1;

While not needed in the code, this symbol is used by the system executable loader to differentiate between MorphOS ELF executables and other possible PowerPC ELF binaries. If there is no __abox__ defined, our code won't run.

ULONG Start(void)
{
  struct Process *myproc = 0;
  struct Message *wbmessage = 0;
  BOOL have_shell = FALSE; 
  ULONG return_code = RETURN_OK;

Start() is the code entry point. Again, the name of this function is not important; it may be anything. It just has to be the first function in the linked executable. Some local variables, which will be needed later, are declared here. myproc will contain a pointer to our process, and wbmessage will hold the Ambient startup message pointer. The variable have_shell will be used to detect whether the program has been started from a shell console or from Ambient. Finally, return_code is just the return code of the program; it will be returned to the system. The return value is usually 0 when the program has executed successfully, and the RETURN_OK constant is just 0.

  SysBase = *(struct Library**)4L;

Time for initialization of the SysBase, the base of exec.library. The library is always open. For historical and backward compatibility reasons the base pointer is always placed by the system at address $00000004, so we just take it from there. Having exec.library available, our code can check whether it has been started from shell or from Ambient:

  myproc = (struct Process*)FindTask(0);
  if (myproc->pr_CLI) have_shell = TRUE;

This information is taken from the Process structure, which is simply the system's process descriptor. The exec.library call FindTask() returns the caller's own descriptor if 0 is passed as its argument. If we are started from Ambient, receiving its message is compulsory:

  if (!have_shell)
  {
    WaitPort(&myproc->pr_MsgPort);
    wbmessage = GetMsg(&myproc->pr_MsgPort);
  }

The startup message is sent to the process system port, so we receive it there. The message may then be passed to our Main() function if we plan to make some use of it, like handling additional icon arguments.

  if (DOSBase = OpenLibrary((STRPTR)"dos.library", 0))
  {

The next step is opening dos.library, which is done in a pretty standard way. In fact this minimal startup code does not need it. There are two reasons to open it anyway. Firstly, it is hard to imagine a program that does not need dos.library; even "Hello world!" needs it. Secondly, all standard startup codes open it, so the main code usually takes it for granted. So my startup behaves conventionally and opens dos.library as well.

    return_code = Main((struct WBStartup*)wbmessage);

Yes, after these few lines we are ready to call the main code. As stated above, passing the startup message from Ambient is optional. On the other hand, receiving the result and passing it back to the system later is obligatory.

    CloseLibrary(DOSBase);
  }
  else return_code = RETURN_FAIL;

From this point the startup code becomes cleanup code. Note also that proper error handling must be done. dos.library is closed, but if opening it failed earlier, the result of execution is changed to RETURN_FAIL. This is the most severe failure level and means total inability to execute. In practice MorphOS can't boot if dos.library is not present in the system. But OpenLibrary() may fail for other reasons, for example a simple lack of free memory. The startup code then has to handle it in some reasonable way.
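RETURN_FAIL is the top of a four-level scale of return codes defined in dos/dos.h. As a small sketch (the helper name WorseCode() is hypothetical, the __MORPHOS__ macro is assumed to be predefined by the MorphOS compiler, and the fallback definitions are there only so that the fragment compiles where dos/dos.h is unavailable), cleanup code that can itself fail might keep the more severe of two results:

```c
#ifdef __MORPHOS__
#include <dos/dos.h>
#else
/* Fallback values mirroring dos/dos.h, for compiling elsewhere. */
#define RETURN_OK     0   /* no problems, success */
#define RETURN_WARN   5   /* a warning only */
#define RETURN_ERROR 10   /* something went wrong */
#define RETURN_FAIL  20   /* complete or severe failure */
#endif

/* Hypothetical helper: keep the more severe of two return codes.
   Works because the codes grow with severity. */
unsigned long WorseCode(unsigned long a, unsigned long b)
{
  return (a > b) ? a : b;
}
```

For example, WorseCode(return_code, RETURN_WARN) leaves a RETURN_FAIL from an earlier step untouched, but raises a RETURN_OK to a warning.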

  if (wbmessage)
  {
    Forbid();
    ReplyMsg(wbmessage);
  }

This snippet of code handles the Ambient startup message. Even if we make no use of it, it must be replied to at exit. But what does Forbid() do here? This function halts system multitasking; specifically, it prevents the system process scheduler from switching our process away. Usually this may be done for a very short period of time only and must be followed by a matching Permit(). At first glance this code makes no sense then: a process stops process switching and... exits. We have to know one important thing, however: process switching is automatically reenabled when the process which called Forbid() ends. So here is what happens:

  • Our task calls Forbid(), so no other process can interrupt it.
  • It replies to the Ambient startup message. As multitasking is stopped, Ambient is unable to receive it yet. The message just waits at its message port.
  • Our task exits. Then the system restores multitasking.
  • Ambient gets CPU time and receives the message. Note that at this point it is absolutely certain, that our task does not exist anymore. Possibility of a race condition is eliminated. Without Forbid() it could be possible that our process is removed from the system while it still executes.

Of course multitasking halt period is extremely short, because our cleanup code ends immediately after replying to Ambient:

  return return_code;
}
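With Start() complete, it may help to see how a Main() could make use of the startup message it receives, for instance to handle the additional icon arguments mentioned earlier. The following is only a sketch and not part of the original example: the helper names CountExtraIcons() and ExtraIconName() are hypothetical, the __MORPHOS__ macro is assumed to be predefined by the MorphOS compiler, and the stand-in types in the #else branch are assumptions that exist only so the fragment compiles on other systems. It relies on the sm_NumArgs and sm_ArgList fields of struct WBStartup, where the first entry describes the program itself.

```c
#ifdef __MORPHOS__
#include <exec/types.h>
#include <workbench/startup.h>
#else
/* Stand-in definitions (assumption): just enough of the real
   structures to compile this sketch on another system. */
#include <stdint.h>
typedef int32_t LONG;
struct WBArg     { intptr_t wa_Lock; char *wa_Name; };
struct WBStartup { LONG sm_NumArgs; struct WBArg *sm_ArgList; };
#endif

/* Hypothetical helper: number of icons passed in addition to the
   program's own icon. Returns 0 for a shell start (no message). */
LONG CountExtraIcons(const struct WBStartup *wbmessage)
{
  if (!wbmessage) return 0;                 /* started from shell */
  if (wbmessage->sm_NumArgs < 1) return 0;  /* defensive check */
  return wbmessage->sm_NumArgs - 1;         /* entry 0 is the program */
}

/* Hypothetical helper: name of the extra icon with the given index
   (0-based), or a null pointer if the index is out of range. */
const char *ExtraIconName(const struct WBStartup *wbmessage, LONG index)
{
  if (index < 0 || index >= CountExtraIcons(wbmessage)) return 0;
  return (const char *)wbmessage->sm_ArgList[index + 1].wa_Name;
}
```

Inside Main() one would typically loop from 0 to CountExtraIcons(wbmessage) - 1. Each entry's wa_Lock is a lock on the directory containing the object and wa_Name is the object's name within it, so both are needed to open the file.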

$VER: – program identification string

This topic is not strictly related to startup code, but the version string is usually placed in it, so I've decided to write a few words about it. The version string is a short text in a defined format. It contains the program name, version and revision number, compilation date and optionally copyright or author info. The version string is decoded by many applications including Ambient, the system command version, the Installer program and more. The text starts with $VER:, so it can be easily found in the program executable. As version tools search for the version string from the start of the executable file, it is best if the version string is placed as close to the beginning of the file as possible. If the version string is declared as a simple string constant, it is unfortunately placed in one of the ELF data sections. These sections are placed after the code section by the linker. However, we can force the version string to be placed in the code section:

__attribute__ ((section(".text"))) UBYTE VString[] =
  "$VER: program 1.0 (21.6.2011) © 2011 morphos.pl\r\n";

Using the GCC-specific extension __attribute__, we can push the string into the ELF section named .text, which is the code section. As the startup code object is linked first, the version string will appear at the beginning of the executable, just after the code of the Start() function. Why after? It is simple: if we placed it before the real code, the operating system would jump "into" the string, trying to execute it, and then of course it would crash.
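To illustrate why the position of the string matters, here is a rough sketch of how a version tool could locate and decode it. This is not the code of the system version command; the function names FindVersionString() and ParseVersion() are made up for this example, and the parser assumes the common "name version.revision" layout at the start of the string.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Scan a raw file image from its start and return a pointer to the text
   following the "$VER:" tag (leading spaces skipped), or NULL if the tag
   is absent. Assumes the string in the image is NUL-terminated. */
const char *FindVersionString(const char *image, size_t size)
{
  static const char tag[] = "$VER:";
  const size_t taglen = sizeof(tag) - 1;
  size_t i;

  for (i = 0; i + taglen <= size; i++)
  {
    if (memcmp(image + i, tag, taglen) == 0)
    {
      const char *p = image + i + taglen;
      const char *end = image + size;

      while (p < end && *p == ' ') p++;
      return (p < end) ? p : NULL;
    }
  }
  return NULL;
}

/* Decode "name version.revision ..." into its parts.
   Returns 1 on success, 0 if the text does not match this layout. */
int ParseVersion(const char *text, char name[32], int *version, int *revision)
{
  return sscanf(text, "%31s %d.%d", name, version, revision) == 3;
}
```

The scan is linear from the first byte, so the closer the string is to the start of the file, the sooner such a search ends.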

A Complete Example

A complete "Hello world!" example with custom startup code shows the described ideas at work. It uses only the MorphOS API, so it is compiled with the -nostdlib option. The executable size is 1 592 bytes. For comparison, the libnix startup with printf() gives 30 964 bytes; when one replaces printf() with the MorphOS Printf() from dos.library, it is still 13 500 bytes.

As the project consists of two *.c files, a simple makefile is added to it. The example may be compiled just by entering make in a console.