Question

In all of the calling conventions explained, the return value is stored in a 32-bit register (EAX). What happens when the return value does not fit in a 32-bit register? Write a program to experiment and evaluate your answer. Does the mechanism change from compiler to compiler?

Answer

Let us consider the following C++ code:

extern "C" __declspec(noinline) unsigned __int64 __stdcall fun(
  void
) {
  return 0x4141414142424242;
}

Compiling it with x86 msvc v19.latest C/C++ compiler on godbolt.org generates the following assembly code:

_fun@0  PROC
        push    ebp
        mov     ebp, esp
        mov     eax, 1111638594                     ; 42424242H
        mov     edx, 1094795585                     ; 41414141H
        pop     ebp
        ret     0
_fun@0  ENDP

Compiling it with x86-64 icx 2022.0.0 (Intel's next-generation LLVM-based C/C++ compiler), targeting 32-bit x86, generates equivalent assembly code (minus the frame-pointer setup, which is controlled by compiler options):

fun:
        mov     eax, 1111638594
        mov     edx, 1094795585
        ret

From the above, we can see that the hard-coded 64-bit integer value is returned in EDX:EAX: the high-order DWORD is returned in the EDX register and the low-order DWORD in the EAX register.
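
To make the EDX:EAX split concrete, here is a minimal sketch that models it in portable C++. The names `RegisterPair`, `split_edx_eax`, and `join_edx_eax` are hypothetical and only illustrate the layout; they do not touch real registers.

```cpp
#include <cstdint>

// Hypothetical helper type: models the EDX:EAX register pair.
struct RegisterPair {
    std::uint32_t eax;  // low-order DWORD
    std::uint32_t edx;  // high-order DWORD
};

// Split a 64-bit value the way it is laid out across EDX:EAX.
RegisterPair split_edx_eax(std::uint64_t value) {
    RegisterPair regs;
    regs.eax = static_cast<std::uint32_t>(value & 0xFFFFFFFFu);
    regs.edx = static_cast<std::uint32_t>(value >> 32);
    return regs;
}

// Recombine the two halves, as the caller does after the call returns.
std::uint64_t join_edx_eax(RegisterPair regs) {
    return (static_cast<std::uint64_t>(regs.edx) << 32) | regs.eax;
}
```

For the value used above, `split_edx_eax(0x4141414142424242)` yields `eax == 0x42424242` and `edx == 0x41414141`, matching the two `mov` instructions in the listings.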

The mechanism can in principle change from compiler to compiler, but in practice it is governed by a standardized calling convention (here, Microsoft stdcall) that compilers for the same platform follow. On the 32-bit x86 architecture, the common C/C++ calling conventions adhere to the following rules for returning integral types:

  1. size of return value <= 32 bits: the return value is stored in EAX
  2. 32 bits < size of return value <= 64 bits: the return value is stored in EDX:EAX
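
The size rules for integral return values on 32-bit x86 can be condensed into a small helper. This is a simplified sketch: `ReturnLocation` and `integral_return_location` are hypothetical names, and the function looks only at the size, ignoring the compiler-specific details of how aggregates are classified.

```cpp
#include <cstddef>

// Hypothetical enum naming where a return value ends up on 32-bit x86.
enum class ReturnLocation { Eax, EdxEax, HiddenPointer };

// Simplified, size-only classification (an assumption: real compilers
// also consider the kind of type, not just its size).
ReturnLocation integral_return_location(std::size_t size_in_bytes) {
    if (size_in_bytes <= 4) {
        return ReturnLocation::Eax;        // fits in one 32-bit register
    }
    if (size_in_bytes <= 8) {
        return ReturnLocation::EdxEax;     // split across EDX (high) and EAX (low)
    }
    return ReturnLocation::HiddenPointer;  // returned through caller-allocated memory
}
```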

When the return value is larger than 64 bits, as with large structures, the caller allocates space for it on the stack and passes the address of that space to the callee as an implicit (hidden) first parameter. The callee writes the return value into that space and returns the same address in the EAX register.
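
The hidden-parameter mechanism can be pictured by writing the transformation out by hand. The sketch below is illustrative only: `Big`, `make_big`, and `make_big_lowered` are hypothetical names, and the second function merely approximates what a compiler arranges behind the scenes.

```cpp
#include <cstdint>

// Hypothetical 16-byte structure, too large to fit in EDX:EAX.
struct Big {
    std::uint32_t a, b, c, d;
};

// What the programmer writes: return the structure by value.
Big make_big() {
    Big value = {1, 2, 3, 4};
    return value;
}

// Roughly what happens underneath: the caller passes the address of
// storage it allocated, the callee fills it in and returns the same
// address (which ends up in EAX on 32-bit x86).
Big* make_big_lowered(Big* result) {
    result->a = 1;
    result->b = 2;
    result->c = 3;
    result->d = 4;
    return result;
}
```

A caller of the lowered form would declare a local `Big storage;` and call `make_big_lowered(&storage)`, which is the role the compiler plays automatically for `make_big()`.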

Floating-point values are returned either in an SSE register (XMM0) or on the x87 FPU register stack (ST(0)), depending on the calling convention.