I have spent the last week working out how a .NET developer will use the interface to the SAS dataset. This is a particular concern because once the initial DLL is released, the interface will be hard to change. Getting it 'mostly' correct up front is important.
After a lot of thought, here is what I have so far:
//Initiate logging
Savian.SaviDataSet.Logging.StartLog(@"c:\temp\SaviDataSetErrors.log");
//Create SAS dataset object
SasDataSet ds = new SasDataSet();
//Add 4 variables
ds.AddVariable("AA1", "Var1");
ds.AddVariable("AA2", "Var2");
ds.AddVariable("AA3", "Var3", 10, SasVariableType.Character,"$5.","$7.");
ds.AddVariable("AA4", "Var4", 10, SasVariableType.Character);
//Add observations
//Bulk insertion
for (int i = 0; i < 1000; i++)
{
    object[] values = new object[] {1, 2, "Test_" + (i + 1).ToString("00#"), "Test2", "Test3", "Test4", "Test5"};
    ds.AddObservation(i, values);
}
//Modify observation - Similar to a .NET dataset
ds.Observations[0]["AA1"].Value = "Test";
//Write dataset
ds.WriteDataSet("TEMP", @"c:\temp\test.sas7bdat");
//Stop logging
Savian.SaviDataSet.Logging.CloseLog();
The challenging part was adding observations, since each one is really an array of values laid across the existing variable metadata.
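To make that idea concrete, here is a minimal sketch of checking one row of values against the variable metadata before accepting it. It is not the library's actual code: `SketchVariable`, `SketchVariableType`, and `ObservationSketch.Check` are hypothetical names invented for illustration, and the rules (one value per variable, strings only for character variables, null treated as missing) are assumptions.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-ins for illustration only; not the library's real types.
public enum SketchVariableType { Numeric, Character }

public sealed class SketchVariable
{
    public string Name { get; }
    public SketchVariableType Type { get; }
    public int Length { get; }

    public SketchVariable(string name, SketchVariableType type, int length)
    {
        Name = name;
        Type = type;
        Length = length;
    }
}

public static class ObservationSketch
{
    // Lays one row of values across the variable metadata and returns a list
    // of problems; an empty list means the row fits the metadata.
    public static List<string> Check(IReadOnlyList<SketchVariable> variables, object[] values)
    {
        var problems = new List<string>();

        // Assumption: exactly one value is required per defined variable.
        if (values.Length != variables.Count)
        {
            problems.Add($"Expected {variables.Count} values but got {values.Length}.");
            return problems;
        }

        for (int i = 0; i < values.Length; i++)
        {
            SketchVariable v = variables[i];
            object value = values[i];
            if (value == null) continue; // treated as a missing value in this sketch

            // Anything that is not a string is treated as numeric in this sketch.
            bool isText = value is string;
            if (v.Type == SketchVariableType.Character && !isText)
                problems.Add($"{v.Name}: expected character data.");
            else if (v.Type == SketchVariableType.Numeric && isText)
                problems.Add($"{v.Name}: expected numeric data.");
            else if (isText && ((string)value).Length > v.Length)
                problems.Add($"{v.Name}: value exceeds length {v.Length}.");
        }
        return problems;
    }
}
```

Returning a list of problems rather than throwing on the first one is a design choice for the sketch: during a bulk insert it lets the caller log everything wrong with a row in one pass.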
The other area I am focused on is keeping the terminology consistent with how SAS refers to things. Hence 'observation' instead of 'row' and 'variable' instead of 'column'.
I have also been working on validation so that invalid data does not make it into the dataset. I can't prevent everything, but I am making a good-faith effort to minimize it. Formats and informats, name lengths, length values, the dataset name, and so on all have to be verified to make sure they comply. So far, so good, but more checking is underway.
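As an illustration of the kind of check involved, here is a small sketch of name and length validation. It assumes the standard SAS naming rules (at most 32 characters, starting with a letter or underscore, followed only by letters, digits, or underscores) and the 1 to 32,767 byte range for character variable lengths. `SasNameSketch` and its methods are hypothetical names, not part of the library, and format/informat checking is left out.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration only; not the library's real API.
public static class SasNameSketch
{
    // One leading letter or underscore, then up to 31 letters, digits, or
    // underscores. \z anchors at the true end of the string, so a trailing
    // newline is rejected.
    private static readonly Regex NamePattern =
        new Regex(@"^[A-Za-z_][A-Za-z0-9_]{0,31}\z", RegexOptions.CultureInvariant);

    // True when the name satisfies the assumed rules for a variable or dataset name.
    public static bool IsValidName(string name)
    {
        return !string.IsNullOrEmpty(name) && NamePattern.IsMatch(name);
    }

    // True when the length is within the assumed range for a character variable.
    public static bool IsValidCharacterLength(int length)
    {
        return length >= 1 && length <= 32767;
    }
}
```

With these rules, names such as "AA1" and "TEMP" pass, while a name that starts with a digit or runs past 32 characters is rejected before anything is written.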
Working on this makes you realize how much effort SAS has put into the dataset over the years, and how much work goes into keeping things compliant and workable.
4 comments:
Hi, Alan, do you want to publish the dll file when it is good enough?
I will be opening up an alpha soon. I want to run some more tests first and make sure it does what I expect, within reason. I know issues will turn up once it is in the field, but I would like to minimize them.
Getting the interface for the programmers sorted out was critical, IMO.
Hey, Alan, can you share dll?..
Contact CozyRoc at CozyRoc.com or contact me directly via my website at savian.net