Sunday, July 26, 2009

Created my first SAS dataset without using SAS

Well, around 14 months ago, I started on a journey to understand the SAS dataset so I could read and write one independently. Originally, I needed the functionality for a client but it became an obsession to figure it out. Client work and personal matters interfered non-stop but I have finally written my first SAS dataset using C# and .NET.

It isn't pretty (no data is in it) but it is a dataset and that is what matters. Putting data in is minor so I am not concerned about that.

A lot of effort still remains to get it all worked out but I am happy to be at this point. After cleaning the code up and finishing it, I plan on making interoperability tools to simplify interfacing with SAS, especially in areas where SAS needs help (e.g., SSIS and MS Office).

It is a good day but it has taken a brutal number of hours to accomplish this feat. I would not recommend that others spend this much time on it: it is simply mind-numbing.

[Update 07/27/2009]

I just wrote my first data into the dataset. It is only numerics right now but it is a start.

[Update 07/29/2009, 3AM]

I just created a dataset that contains text fields. I had an extraneous bug that made me think it was failing when it wasn't. Such is the life of a coder. Anyway, on to cleaning up code that I left hard-coded and making the system a bit more robust.

Let me be clear: I am tired of looking at hex code.
476F6F646E69676874

[Update 08/27/2009]

Progress continues. After I created my first dataset, the goal was to get rid of any hard-coded values and make it more dynamic. While doing that work, I hit a major logic issue and had to focus on figuring that out. I have made it through that and am now mopping up the remnants and tightening up the logic. Since the goal is to ship a .NET dll for developers, I also need to consider how a developer codes to the dll. Basically, I am in cleanup mode and testing additional variables, labels, etc. to see where things break or do not work as intended.

Stay tuned.

11 comments:

Alex Lyman said...

Congrats, Alan!

Savian said...

Thanks Alex.

It was tough. A lot harder than you and I figured when we were driving from Grand Junction.

Trying to pay the bills and work on it at the same time was also challenging. I figured I was going to finalize it last December but client work leapt onto the stage.

Phil Rack said...

Excellent work! I bet there's a good demand to be able to access the sas7bdat file from other products, etc...

Perhaps you can replace the broken SAS/ODBC and SAS/OLE DB drivers so one can read AND write SAS data sets.

Anonymous said...

An excellent achievement Alan.

Alex Thomas

Unknown said...

Alan,

You're writing directly to the sas7bdat format with no intermediate provider? You are brave.

There are many nuances with encoding to be careful of. Even though the sas7bdat file is compatible across platforms, each platform/architecture combo has a native encoding to deal with.

Chris (@SAS)

Savian said...

Chris,

Yes, I am aware of that issue. I am focused on the Windows format and will ignore the complexities found elsewhere for now.

One step at a time...

Peter Souk said...

How did you go about doing this? I'd appreciate some code samples.

Savian said...

No chance on code samples. Without SAS's permission, I would be violating copyright law. How did I do it? I am now at around 1200 hours of wasted time looking at binary and hex and trying to randomly guess number patterns. Think about the hardest crossword puzzle ever, no Google, and it is 10x the size and all numeric. Sheesh!!

If I knew what was involved when I started, I probably would have passed.

Because there are potential legal issues here, I will skip revealing anything, but it was a major bear to get to where I am. My goal is to build a .NET library that people can buy and use on their own, but it will be distributed in compiled form only.
