Well, I recently was asked if I could support XPT files, you know, the SAS transport format. Well, SAS transport formats are documented and fairly straight-forward to read so 'why not?'.
Got started coding, client provided a nice sample, all went well. Around 8-10 hours after starting, I was happily reading SAS transport files and writing them to a tab-delimited file.
However, things weren't happy on Big Rock Candy Mountain.Seems the numbers were off. A little bit of reading in the SAS XPORT Documentation indicated the problem:
"All floating-point numbers in the file are stored using the IBM mainframe representation. If
your application is to read from or write to transport files, it will be necessary to
convert native floating-point numbers to or from the transport representation."
Not a problem. Let's go track down a C# library that handles IBM 370 floats and does the conversion. One small problem...I couldn't find one. What I did find was that this was a nasty little problem. I like to dabble in CompSci stuff up until the point of pain and we were at a severe pain threshhold. Anyone wishing to read what is involved, check the SAS link above.
After playing with lots of things for several days, I settled on using John Hou's great article on CodeProject:
Transform Between IEEE, IBM, or VAX floating point
Fortunately, I had a license for Tangible Software C++ to C# convertor. That did a lot of lift (80-90%) and left me to do final cleanup. Once I got that all down, next up was to get the endianness correct on the parms passing in and then suddenly I saw a correct value passed back.
Long weekend dealing with it but glad it is over.
The final C# code that creates the conversion is available on John's aoriginal article as an alternative and I will post it on Savian as well. I learned more about SEF, floats, 4/8 byte variances, and everythig else than I ever wanted to know.
However, tonight I sent my client a translated XPT file and it appears to be correct. And I now have a .NET dll for handling the issue. That made all of the difference in the world...
This blog is designed to show various ways to use Data Virtualization, technologies, and SAS with Microsoft technologies with an eye toward outside of the box thinking.
Subscribe to:
Post Comments (Atom)
SAS throwing RPC error
If you are doing code in C# and get this error when creating a LanguageService: The RPC server is unavailable. (Exception from HRESULT:...
-
I am finally ready with my SAS dataset reader/writer for .NET. It is written in 100% managed code using .NET 3.5. The dlls can be found here...
-
I was just tasked to read in LDAP records so we could cross-reference userids with login identifiers and general ledger information. Using...
-
Well, around 14 months ago, I started on a journey to understand the SAS dataset so I could read and write one independently. Originally, I ...
8 comments:
The macros are not usable on a system where SAS is not installed.
That was the need here. XPT files should be consumable outside of SAS.
hmm,
you could have used the SaS local provider which can be installed standlone and is free.
Using this provider you just have to use OLEDB to read the SaS xpt files. Just 5 lines of code in total!
Alan,
Another issue that I haven't seen resolved by the provider is the numeric format. In Sas XPT files there can be a Format for a numeric column. For example you can store date as a numeric value but provide a format in the meta data in the xpt header to say which format to use for it. The sas Viewer will actually format such a column for you.
The problem is the OLEDB provider for sas does not support formatting.
The custom code you wrote, how do you think you are going to deal with this situation because IBM floats code will actually take the numeric data and convert it to completely different values in C#.
Nasir,
I did not want a reliance on any SAS software for the conversion. The code was going into a commercial tool and I could not use the SAS OleDb provider in this case. While a user could use the SAS provider, you cannot wrap their dll into another product.
You can read numerics using the SAS provider but the numbers come in as text.
On the format front, I associate the format name to the numeric but you have to be able to read the sas7bcat to convert the numeric to the formatted values. That is the last challenge: good you caught it.
Alan,
Have you shared the c# code for this anywhere?
The C# code to do the conversions is on CodeProject. The C# code to do the whole XPT format is available for purchase. Just contact me if needed.
Hi Allan - I have the exact same issue you had. I am trying to read in IBM 4-byte floating points. I also found Johns article but dont have a c# converter.
Your link to skydrive doesnt work for me. Is there any other way you could make the c# code avialable??
Any help would be very appreciated before I loose my mind!!
I fixed the link on the article. Here it is in case:
http://www.codeproject.com/KB/applications/492449/Converters.zip
Post a Comment