Friday, October 10, 2008

FPU issues when interoperating between Delphi and .NET

In a recent project we needed to do some interop between CodeGear Delphi and MS .NET: we had .NET code that needed to be wrapped as a COM object so it could be called from native code such as Delphi.

The developer of the .NET library created a C++ client to test the library, and everything worked fine.
Then we developed the Delphi client, and after some googling we were good to go. Here are some articles describing the details:

http://interop.managed-vcl.com/netinterop_csharp.php
http://www.drbob42.com/examines/examin36.htm
http://www.blong.com/Conferences/BorCon2004/Interop2/COMNetInterop.htm#CCW

But then... one method threw an exception:

Project TestClient.exe raised exception class EOleException with message 'Overflow or underflow in the arithmetic operation'.

What is this? The method does some serious calculations, but it works fine from C++ and from other .NET clients that have been using the library for years without a hitch, so what gives?
I did not have Visual Studio on the computer where I run Delphi, so I could not debug it properly, and even if I had, I guess I would have struggled a bit to get it working. Anyway, I started to pepper the library code with logging statements to see what was happening, and sure enough, I found one point in the code that threw the exception.
There was just one problem: we do not normally get any exception at that point in the code...

Boiled down to basics, this is the code in C#:

double x = Double.NaN;
bool b = Double.IsNaN(x);

Here it is the IsNaN call that throws, so I modified it a bit:

double x = Double.NaN;
bool b = (x==2.0);

And now it is the x==2.0 comparison that throws??
Then I tried:

double x = 3.2;
bool b = (x==2.0);

This worked... and this:

double x = 3.2;
bool b = Double.IsNaN(x);

That worked too... Hm. Firing up Lutz Roeder's Reflector (yes, I know, it's Red Gate's now, but for me it will always be Lutz's Reflector) and searching for Double.NaN, I found:

public const double NaN = (double) 1.0 / (double) 0.0;

WHAT? But isn't 1/0 a divide by zero? So why does this work in plain C#, but not when called from Delphi?
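For reference, here is a minimal sketch of what plain C# does with such a division, assuming the FPU exception masks are at the .NET defaults. `FloatDivisionDemo` and `Divide` are hypothetical names used only for illustration. Floating-point division by zero follows IEEE 754 and yields infinity or NaN instead of throwing; only integer division throws `DivideByZeroException`.

```csharp
using System;

// Hypothetical helper, for illustration only.
static class FloatDivisionDemo
{
    // Takes the operands as parameters so the C# compiler cannot
    // fold the division into a constant.
    public static double Divide(double numerator, double denominator)
    {
        return numerator / denominator;
    }
}
```

With the default masks, `FloatDivisionDemo.Divide(1.0, 0.0)` returns positive infinity and `FloatDivisionDemo.Divide(0.0, 0.0)` returns NaN, with no exception in either case.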
At this point I remembered that some years ago, when Delphi was my tool of choice, there was always some funky stuff going on in Delphi's RTL package, but what could this be?

I started to toss some ideas at a friend of mine, mashi, who is a serious bit fiddler, and out of the blue he just asked me:

"Maybe you need to flip a bit or two in the FPU's control register?"

I was stumped. The FPU?? Why is that? He then told me that some DirectX libraries also change the FPU settings to get more juice out of some calculations.

Whoa, that's funny, but after looking at this article:

http://webster.cs.ucr.edu/AoA/Windows/HTML/RealArithmetic.html

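I found that there are 6 bits in the FPU's control register (bits 0 to 5, the exception masks) that actually control exception handling by the FPU. When a mask bit is 1 the corresponding exception is suppressed; when it is 0 the FPU raises it.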

[Image: bit layout of the FPU control register] (It's the yellow ones we are after.)

So, to confirm, we need to find the current state. Luckily both Delphi and Visual Studio have debug windows that show the value of the CTRL register, and the values are very different: in Delphi it's 1272, and in C# it's 027F (both numbers are hex). So, if we look at them as binary numbers:

.Net   027F = 0000 0010 0111 1111
Delphi 1272 = 0001 0010 0111 0010



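For us, the interesting one is bit number 2, the zero-divide mask (it's zero-based, so for you non-geeks, who gave up long before reading this far anyway, it's bit 3 from the right). This bit is set to 1 in .NET and to 0 in Delphi, so a floating-point divide by zero is silently handled in .NET but raises an exception under Delphi's settings.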
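To read all six mask bits of the two values at once, a small decoder helps. This is a sketch only: `FpuControlWord` and `Unmasked` are hypothetical names, and the bit order is the standard x87 one (bit 0 invalid operation up to bit 5 precision).

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, for illustration only.
static class FpuControlWord
{
    // The six x87 exception-mask bits, bit 0 first.
    static readonly string[] MaskNames =
    {
        "invalid operation", // bit 0
        "denormal operand",  // bit 1
        "zero divide",       // bit 2
        "overflow",          // bit 3
        "underflow",         // bit 4
        "precision"          // bit 5
    };

    // Returns the exceptions that are NOT masked (mask bit = 0),
    // i.e. the ones the FPU will actually raise.
    public static string Unmasked(int controlWord)
    {
        var names = new List<string>();
        for (int bit = 0; bit < MaskNames.Length; bit++)
        {
            if ((controlWord & (1 << bit)) == 0)
            {
                names.Add(MaskNames[bit]);
            }
        }
        return string.Join(", ", names.ToArray());
    }
}
```

`FpuControlWord.Unmasked(0x027F)` returns an empty string (everything masked), while `FpuControlWord.Unmasked(0x1272)` returns "invalid operation, zero divide, overflow". So bit 2 is not the only mask bit that differs; bits 0 and 3 do as well. Which of them is behind the NaN failures is not something the two values alone settle.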




After some more googling, I found this:


http://www.stats.uwo.ca/faculty/murdoch/software/compilingDLLs/pascal.html


It also says something about this under the heading "Preserving Registers":

The only problem that is likely to arise is with the floating point processor (FPU) registers. Some versions of Delphi change the FPU control word upon entry to a DLL (but this is not true of Delphi 5);





Yeah, I know, the R thingy that the article is about is probably far away from this, but it told me what I needed to hear. :)




So, armed with this knowledge, I wanted to change the CTRL register to see what would happen, and what do you know, even MS has a KB article about this very problem, although in a slightly different context:




PRB: System.Arithmetic Exception Error When You Change the Floating-Point Control Register in a Managed Application

http://support.microsoft.com/kb/326219




So, after adding this simple line to the C# code:


_controlfp(_CW_DEFAULT, 0xfffff);

everything works..
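The line above is not valid C# on its own: `_controlfp` is a C runtime function and `_CW_DEFAULT` is a macro from `float.h`, so both have to be brought into managed code. A minimal sketch of how that might look, assuming a 32-bit x86 process and that `msvcrt.dll` is the CRT to import from (the KB article may name a different CRT DLL). `FpuControl` and `ResetToDefault` are hypothetical names, and the constant values are the x86 ones from `float.h` as far as I know, so check them against your own headers.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical wrapper, for illustration only.
static class FpuControl
{
    // x86 values of the float.h flags (assumed, verify against float.h).
    public const uint EM_INEXACT    = 0x00000001;
    public const uint EM_UNDERFLOW  = 0x00000002;
    public const uint EM_OVERFLOW   = 0x00000004;
    public const uint EM_ZERODIVIDE = 0x00000008;
    public const uint EM_INVALID    = 0x00000010;
    public const uint EM_DENORMAL   = 0x00080000;
    public const uint RC_NEAR       = 0x00000000; // round to nearest
    public const uint PC_53         = 0x00010000; // 53-bit precision

    // Equivalent of _CW_DEFAULT: every exception masked.
    public const uint CW_DEFAULT = RC_NEAR | PC_53 | EM_INVALID | EM_ZERODIVIDE
        | EM_OVERFLOW | EM_UNDERFLOW | EM_INEXACT | EM_DENORMAL;

    // Same mask as in the call above.
    public const uint MaskAll = 0x000FFFFF;

    [DllImport("msvcrt.dll", CallingConvention = CallingConvention.Cdecl)]
    static extern uint _controlfp(uint newControl, uint mask);

    // Masks all FPU exceptions again and returns the resulting
    // control word (in the CRT's format, not the raw hardware word).
    public static uint ResetToDefault()
    {
        return _controlfp(CW_DEFAULT, MaskAll);
    }
}
```

The method that the Delphi client calls would then start with `FpuControl.ResetToDefault();` before doing any floating-point work.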


What we still don't know, though: will this have any side effects? I guess time will tell.





I guess we should save the current CTRL word and restore it when we return from our method, but I'm not yet sure whether Delphi also does that, so we need to do some tests before we conclude. At least the NaN code is not giving us any problems now.
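The save-and-restore idea could be sketched like this, again assuming a 32-bit x86 process and `msvcrt.dll` as the CRT. Calling `_controlfp` with a mask of 0 only reads the control word, which gives the value to put back afterwards. `FpuScope` and `RunWithDefaultFpu` are hypothetical names, 0x9001F is the assumed x86 value of `_CW_DEFAULT`, and this has not been tested against a Delphi host.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical helper, for illustration only.
static class FpuScope
{
    const uint CwDefault = 0x0009001F; // assumed x86 value of _CW_DEFAULT
    const uint MaskAll   = 0x000FFFFF; // same mask as in the call above

    [DllImport("msvcrt.dll", CallingConvention = CallingConvention.Cdecl)]
    static extern uint _controlfp(uint newControl, uint mask);

    [DllImport("msvcrt.dll", CallingConvention = CallingConvention.Cdecl)]
    static extern uint _clearfp();

    // Runs 'body' with all FPU exceptions masked, then restores whatever
    // the caller (for example the Delphi host) had set.
    public static void RunWithDefaultFpu(Action body)
    {
        uint saved = _controlfp(0, 0);   // mask 0: read only, change nothing
        _controlfp(CwDefault, MaskAll);
        try
        {
            body();
        }
        finally
        {
            // Clear pending exception flags first, so unmasking them
            // again does not fire on the caller's next FPU instruction.
            _clearfp();
            _controlfp(saved, MaskAll);
        }
    }
}
```

The COM-visible method would wrap its work as `FpuScope.RunWithDefaultFpu(delegate { /* calculations */ });`, so the Delphi side gets its own settings back whether the call succeeds or throws.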
