Sunday, November 11, 2012

SAS XPT Files, IBM Floats, and C#

Well, I recently was asked if I could support XPT files, you know, the SAS transport format. Well, SAS transport formats are documented and fairly straight-forward to read so 'why not?'.

Got started coding, client provided a nice sample, all went well. Around 8-10 hours after starting, I was happily reading SAS transport files and writing them to a tab-delimited file.

However, things weren't happy on Big Rock Candy Mountain.Seems the numbers were off. A little bit of reading in the SAS XPORT Documentation indicated the problem:

"All floating-point numbers in the file are stored using the IBM mainframe representation. If
your application is to read from or write to transport files, it will be necessary to
convert native floating-point numbers to or from the transport representation."


Not a problem. Let's go track down a C# library that handles IBM 370 floats and does the conversion. One small problem...I couldn't find one. What I did find was that this was a nasty little problem. I like to dabble in CompSci stuff up until the point of pain and we were at a severe pain threshhold. Anyone wishing to read what is involved, check the SAS link above.

After playing with lots of things for several days, I settled on using John Hou's great article on CodeProject:

Transform Between IEEE, IBM, or VAX floating point

Fortunately, I had a license for Tangible Software  C++ to C# convertor. That did a lot of lift (80-90%) and left me to do final cleanup. Once I got that all down, next up was to get the endianness correct on the parms passing in and then suddenly I saw a correct value passed back.

Long weekend dealing with it but glad it is over.

The final C# code that creates the conversion is available on John's aoriginal article as an alternative and I will post it on Savian as well. I learned more about SEF, floats, 4/8 byte variances, and everythig else than I ever wanted to know.

However, tonight I sent my client a translated XPT file and it appears to be correct. And I now have a .NET dll for handling the issue. That made all of the difference in the world...

9 comments:

Lex Jansen said...

FYI:

Usage Note 46944: Try newly powered transport macros that support the SAS Version 5 transport format and new data set features:

http://support.sas.com/kb/46/944.html

These provide an extension to the XPT format.

Alan Churchill said...

The macros are not usable on a system where SAS is not installed.

That was the need here. XPT files should be consumable outside of SAS.

Nasir Amin said...

hmm,

you could have used the SaS local provider which can be installed standlone and is free.

Using this provider you just have to use OLEDB to read the SaS xpt files. Just 5 lines of code in total!

Nasir Amin said...

Alan,

Another issue that I haven't seen resolved by the provider is the numeric format. In Sas XPT files there can be a Format for a numeric column. For example you can store date as a numeric value but provide a format in the meta data in the xpt header to say which format to use for it. The sas Viewer will actually format such a column for you.

The problem is the OLEDB provider for sas does not support formatting.

The custom code you wrote, how do you think you are going to deal with this situation because IBM floats code will actually take the numeric data and convert it to completely different values in C#.

Alan Churchill said...

Nasir,

I did not want a reliance on any SAS software for the conversion. The code was going into a commercial tool and I could not use the SAS OleDb provider in this case. While a user could use the SAS provider, you cannot wrap their dll into another product.

You can read numerics using the SAS provider but the numbers come in as text.

On the format front, I associate the format name to the numeric but you have to be able to read the sas7bcat to convert the numeric to the formatted values. That is the last challenge: good you caught it.

Anonymous said...

Alan,
Have you shared the c# code for this anywhere?

Alan Churchill said...

The C# code to do the conversions is on CodeProject. The C# code to do the whole XPT format is available for purchase. Just contact me if needed.

Anonymous said...

Hi Allan - I have the exact same issue you had. I am trying to read in IBM 4-byte floating points. I also found Johns article but dont have a c# converter.

Your link to skydrive doesnt work for me. Is there any other way you could make the c# code avialable??

Any help would be very appreciated before I loose my mind!!

Alan Churchill said...

I fixed the link on the article. Here it is in case:

http://www.codeproject.com/KB/applications/492449/Converters.zip

Cisco Information Server (CIS) and MS OLAP

Ok, so my recent issue was integrating CIS with Microsoft OLAP cubes. The normal way to do this is to set up MS OLAP (SSAS) with a website a...