A recent question in one of the discussion forum led to the general question of using some of the performance libraries in Mac OS X with a specific emphasis on mixing C and Fortran, as well as implied data types in LAPACK. In this mini-tutorial I’m going to cover some of the basics of mixed C and Fortran using LAPACK (really the clapack interface) from the Accelerate framework. I’ll cover more of Accelerate when I resume that tutorial series.
C and Fortran
Calling functions/subroutines across C and Fortran can be a mixed bag. This is almost always due to name mangling of the routines that occurs at compile time. In general, without user intervention of any kind (such as typdefs) expect the C frontend to preserve the routine symbols in the form defined in the source file (plus a leading underscore). For example:
void foo_(){ return; }
void FoO(){ return; }
Will produce symbols in the compiled file that look like this (‘man nm’ for more information on symbol lookup):
_foo_
_FoO
You’ll note that in both cases the symbols have a leading underscore. C compilers typically do this to all external scope items to avoid name clashes from the runtime language support mechanisms. However, you’ll want to otherwise note that the case and trailing underscores are preserved as is.
Fortran on the other hand is slightly different. What you type and what you get are often very different. It can be complicated by the fact that the behavior can also be different depending on the compiler you use or the operating system. For example with gfortran:
subroutine foo()
subroutine FoO()
Will produce the same symbol:
_foo_
You’ll notice the following. First, a trailing underscore was automatically added. Second, in both cases the code produced a symbol in all lowercase. Now if you actually tried to write a Fortran program with both of those subroutines defined and compiled as is, the compiler would complain and halt. The trailing underscores with Fortran compilers are there for historical reasons and compatibility across compiler vendors. To suppress trailing underscores you would use a flag like -fno-underscoring in gfortran/g77. To use underscores (or to keep a second underscore from getting appended if you decide to name your subroutines with underscores for C compatability) or not depends on the situation. If you decide that you will manually include underscores in your names you may need to use -fno-second underscore. It can be tricky and confusing…
But what about the case sensitivity issue? There are ways to preserve case so that foo and FoO are treated differently although the settings vary depending on the compiler (for example with g77 you can use -fcase-preserve, the same flag does not appear to work with gfortran).
What to Expect
All things being equal, which they never are in practice, my recommendation is to expect that the C compiler will create symbols from routine names exactly as they appear in the source and for Fortran expect the names to be converted to lowercase with a trailing underscore. If you assume that, and depending on the code, mix of C and Fortran sources that may not be a valid assumption.
So let’s create an example, you’ll need gfortran installed on your computer and gcc which comes with Xcode or from here. We will do all of this from terminal for simplicity.
In the folder called part1 are a C and Fortran file. We are going to start from Fortran so open that first.
We’ve got our main program, and a subroutine called foo, foo then calls cfoo. So open the C file. In the C file, there is function called cfoo_ (note the trailing underscore). We’ve added the trailing underscore because, if you’ll recall, we are expecting the Fortran compiler to generate a trailing underscore.
Let’s compile them and see what happens:
gcc -c cfile.c
gfortran -o test fortran.f cfile.o
Before we run it, let’s check all of the symbols that contain foo:
[buster:part1] bubb% nm test | grep -i foo
00002a64 T _cfoo_
00002a38 T _foo_
And then let’s run the program:
[buster:part1] bubb% ./test
Hello from C
Sweet! We’ve now got a program that will make calls from Fortran to C. Let’s play some ping pong. Look at the C and Fortran files in part2. Basically we added a new subroutine in Fortran and call to it from C. Again, we are relying on the fact (in C) that the Fortran compiler will append a trailing underscore. You can get around having to mangle your functions using typedefs and compiler flags (but I won’t cover that here).
[buster:part2] bubb% gcc -c cfile.c
[buster:part2] bubb% gfortran -o test fortran.f cfile.o
[buster:part2] bubb% nm test | grep -i foo
000029cc T _cfoo_
00002948 T _ffoo_
0000291c T _foo_
[buster:part2] bubb% ./test
Hello from C
Hello from Fortran
Passing Variables
This is simple. By default, Fortran (77 at least) passes everything by address. In C it’s by value. So, when calling from Fortran to C you either need to explicitly send in a value (typically using %VAL(myVariable)) or more commonly in C, you take the value as a pointer.
Look in the folder called part3. We’ve modified foo (fortran.f) to send in a variable by address and by value. You’ll need g77 for this example as I’m not sure if gfortran supports the %VAL extension yet.
[buster:part3] bubb% gcc -c cfile.c
[buster:part3] bubb% g77 -o test fortran.f cfile.o -lSystemStubs
[buster:part3] bubb% ./test
Value from Fortran foo by reference: 1
Value from Fortran foo by value : 1
Similarly if you wanted to call the same C function from within C, you’d either pass in the address of the variable. Look at part4 to see more:
[buster:part4] bubb% ./test
Value from Fortran foo by reference: 1
Value from Fortran foo by value : 1
Calling from cfoo2
Value from Fortran foo by reference: 1
Value from Fortran foo by value : 1
Note in the C code we passed in the address of the variable as the first argument and the variable directly in the second.
Finally you can do pass through from Fortran to C to call another function in C. Loot at part5 for more information:
[buster:part5] bubb% gcc -c cfile.c
[buster:part5] bubb% g77 -o test fortran.f cfile.o -lSystemStubs
[buster:part5] bubb% ./test
Value from Fortran foo by reference: 1
Value from Fortran foo by value : 1
Note that we passed the first argument straight through (we already have a pointer and we need one for the first argument) and for the second we dereferenced it to send in the value at the address of that pointer.
“That’s great, but what does this have to do with LAPACK”
Ok, Ok… I’m getting to it. Let’s fast forward to LAPACK as implemented in Accelerate using the clapack entry.
The LAPACK header file is located in:
/System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Headers/
Open the file clapack.h in that directory.
At the top you’ll see:
#if defined(__LP64__) /* In LP64 match sizes with the 32 bit ABI */
typedef int __CLPK_integer;
typedef int __CLPK_logical;
typedef float __CLPK_real;
typedef double __CLPK_doublereal;
typedef __CLPK_logical (*__CLPK_L_fp)();
typedef int __CLPK_ftnlen;
#else
typedef long int __CLPK_integer;
typedef long int __CLPK_logical;
typedef float __CLPK_real;
typedef double __CLPK_doublereal;
typedef __CLPK_logical (*__CLPK_L_fp)();
typedef long int __CLPK_ftnlen;
#endif
typedef struct { __CLPK_real r, i; } __CLPK_complex;
typedef struct { __CLPK_doublereal r, i; } __CLPK_doublecomplex;
So to the first part of the question in the post/thread mentioned above. What are the data types that you need. That’s easy. The data types are typedef’d for you by Apple, so in reality you send in the the data type that corresponds to the normal C type. Some might argue that you should use the data type as it’s typedef’d by Apple. And they may be right. But unless I have to recreate some complex structure, it makes the most sense to simply use the intrinsic type that C (or Fortran provide).
Now the first place to look for the actual usage and implementation is in the documentation:
The documentation will cover in some detail what LAPACK can (and cannot) do. It will also cover issues related to sizing. In particular if you download the source of LAPACK and look at the file for the subroutine of interest you will see information regarding sizing and size dependencies, in addition to information about the variables and function purpose. What I’ll cover is how to access it in Accelerate from Fortran and C.
Look in the folder called part6. You will see a single Fortran source file with a coll to cgeql2. To compile the code and get it to link against Accelerate do the following:
[buster:part6] bubb% gfortran -o test fortran.f -Wl,-framework -Wl,Accelerate
[buster:part6] bubb% ./test
Of course the calculation doesn’t really do anything or print anything out. But if you wanted to use it with your own data, that is how you would.
How does this look from C?
From C you can use either the direct entry to a function if you are using Accelerate. For example a call from C to CGELQ2, can be done as:
cgelq2(&m;,&n;,a,&lda;,tau,work,&info;);
Or as:
cgelq2_(&m;,&n;,a,&lda;,tau,work,&info;);
Note that in both cases scalar values (m,n,lda and info) are passed in by address. Remember that in C arrays (either static or dynamic) are really pointers to a location in memory, so those don’t need any modification.
One other note, if you do as I said above and do NOT use the typedef’s in Accelerate for the data types, the compiler will complain. In particular if you define your own complex or double complex data type. Although the warning is harmless (because the data types are actually correct), this is one of those instances where it might be better to use the type defined by the header.
Am I giving mixed advice? Yes. But in the end you need to evaluate what is most important for YOUR code. Clean, self documenting code is important, but not at the expense of compile time errors or checks that can make debugging harder later on. So when in doubt use the typdef in the header, and make a note about that in your code (so someone else looking at your code doesn’t have to dig into the framework headers to see what is going on).
Ok that’s it for now.
Leave a Reply