Friday, December 07, 2007

SAS and LINQ - Part 1

So, I am just getting my feet wet with LINQ, the new integrated query language within .NET. There are loads of articles on LINQ elsewhere. What I want to do is to illustrate LINQ with SAS and this is my starting point.

Look at the following code:


using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Data;
using System.Drawing;
using System.Linq;
using System.Text;
using System.Xml.Linq;
using System.Windows.Forms;

namespace WindowsFormsApplication2
{
public partial class Form1 : Form
{
public Form1()
{
InitializeComponent();
}

private void btnGetData_Click(object sender, EventArgs e)
{
XElement table =XElement.Parse(
@"<TABLE>
<CLASS>
<Name> Alfred </Name>
<Sex> M </Sex>
<Age> 14 </Age>
<Height> 69 </Height>
<Weight> 112.5 </Weight>
</CLASS>
<CLASS>
<Name> Henry </Name>
<Sex> M </Sex>
<Age> 14 </Age>
<Height> 63.5 </Height>
<Weight> 102.5 </Weight>
</CLASS>
<CLASS>
<Name> James </Name>
<Sex> M </Sex>
<Age> 12 </Age>
<Height> 57.3 </Height>
<Weight> 83 </Weight>
</CLASS>
<CLASS>
<Name> Jeffrey </Name>
<Sex> M </Sex>
<Age> 13 </Age>
<Height> 62.5 </Height>
<Weight> 84 </Weight>
</CLASS>
<CLASS>
<Name> John </Name>
<Sex> M </Sex>
<Age> 12 </Age>
<Height> 59 </Height>
<Weight> 99.5 </Weight>
</CLASS>
<CLASS>
<Name> Philip </Name>
<Sex> M </Sex>
<Age> 16 </Age>
<Height> 72 </Height>
<Weight> 150 </Weight>
</CLASS>
<CLASS>
<Name> Robert </Name>
<Sex> M </Sex>
<Age> 12 </Age>
<Height> 64.8 </Height>
<Weight> 128 </Weight>
</CLASS>
<CLASS>
<Name> Ronald </Name>
<Sex> M </Sex>
<Age> 15 </Age>
<Height> 67 </Height>
<Weight> 133 </Weight>
</CLASS>
<CLASS>
<Name> Thomas </Name>
<Sex> M </Sex>
<Age> 11 </Age>
<Height> 57.5 </Height>
<Weight> 85 </Weight>
</CLASS>
<CLASS>
<Name> William </Name>
<Sex> M </Sex>
<Age> 15 </Age>
<Height> 66.5 </Height>
<Weight> 112 </Weight>
</CLASS>
</TABLE>");


var obs = (from o in table.Elements("CLASS")
where (string)o.Element("Sex") == " M "
select new
{
Name = (string) o.Element("Name"),
Age = (string) o.Element("Age")
}
);
List observations = new List();
foreach (var ob in obs)
{
observations.Add(ob.Name) ;
}
}
}
}



The above is C# code.

Notice that the input data is a SAS XML representation (I am doing it inline right now as a demo. It could easily come from a file)? Notice the SQL-like code toward the bottom? That is LINQ.

One of the oft-used criticisms of low-level languages is that they were fine for a lot of things except processing data. For that, we needed languages such as SAS or SQL. LINQ doesn't substitute for the database but what it does do is enable us to combine data processing with the full-blown power of a language like C#. The syntax isn't as elegant as SAS (by any means) but what it does provide is a way to juice up the power of data processing against SAS data when needed.

SAS is OleDb and ODBC compliant. As of yet, Microsoft has not provided support for either one within LINQ. When it does, I will post examples of using LINQ against a SAS dataset directly rather than via XML.

The above represents a fraction of the power of LINQ and it is in its first iteration. Using the same query language against any data source (SAS, XML, SQL Server, etc.) and then combine that with the power of a language such as C# holds a lot of potential. Power SAS users, especially on the ETL side, should keep an eye on LINQ because it offers up a great way to do data processing.

Monday, November 26, 2007

Demarcation

Good systems practice is to have demarcations between systems layers. This involves separation of the following layers (at a minimum):

Application
Business logic
Data access
Data

SAS programmers though tend to muddle all together:

libname indata ...; <-- Data Access
data mydata...; <--- Business Logic
ods html; proc report...; <--- Application Layer / UI

A better way to handle is to treat all layers as standalone and get them out of each other's space. The easiest way for a SAS coder to accomplish this is to use macro libraries to get it all started:

%GetFinanceData() ;
%DetermineProfit() ;
%OutputResults() ;

However, even this is constrained since everything is called together and a single program controls what happens top-to-bottom. Rather than doing the final piece as a part of the batch job, consider doing it on demand and perhaps through other means:

%GetFinanceData() ;
%DetermineProfit() ;

...wait until use requests information....

Get information on demand and make the presentation logic extraneous to SAS. Someone could even keep it in SAS ODS but the idea would be to pull it out to an application layer that is completely separated from how the information was formulated.

SAS programmers would be better off putting in extremely strong lines of demarcations between layers. This makes code easier to maintain, easier to change, and easier to understand.

Longer term, web service calls will simplify this even more but the concept of processing silos can be done now. Break your code apart so everything isn't glued together so tightly that code reuse is hard and code libraries are non-existent or little used.

C# and DBF files

I have a client that asked me to read DBF files generated by their SAS application. Well, this posting is to explain to someone what i have learned in reading DBF files using C#. The main thing I learned was no spaces in the file path.

Here is the code that worked for me:

string connectionString = "Driver={Microsoft dBase Driver (*.dbf)};SourceType=DBF;SourceDB=C:\projects\Client\Data\WeeklyAvailableComparison;Exclusive=No; Collate=Machine;NULL=NO;DELETED=NO;BACKGROUNDFETCH=NO;";
string selectCommand = @"SELECT * FROM C:\projects\Client\Data\WeeklyAvailableComparison\ac_141.dbf";
DataTable dt = new DataTable();

OdbcConnection oConn = new OdbcConnection();
oConn.ConnectionString = connectionString;
oConn.Open();
OdbcCommand oCmd = oConn.CreateCommand();
oCmd.CommandText = selectCommand;

dt.Load(oCmd.ExecuteReader());

I tried doing a 8.3 conversion and it wouldn't work for the select clause.

I hope this helps someone.

Alan

Wednesday, July 11, 2007

Using SAS\Share and C#

Ok, I went a little crazy getting SAS\Share working with OleDb from C#. Since I got it figured out, someone else may have the same issue.

Here is how I did it. First of all, make sure you client machine and the SAS\Share server can talk to each other. I tested from a development machine that has SAS on it. Otherwise, a test app will have to be built.

In the web.config, I have these settings:

<add key=\"ShareConnectionString\" value=\"Provider=sas.ShareProvider.1;Data Source=share1;Location=192.168.1.199;User ID=Administrator;Password=<span color=\">*****</span>;Mode=ReadShare Deny None\"/>
<add key=\"SasLibrary\" value=\"demo\">
<add key=\"SasDataSet\" value=\"user_info\">

Here is the C# code:

private static DataTable LoadSasDataSet()
{
try
{
string library = ConfigurationManager.AppSettings["SasLibrary"] ;
string dataset = ConfigurationManager.AppSettings["SasDataSet"] ;
string conn = ConfigurationManager.AppSettings["ShareConnectionString"];

DataSet sasDs = new DataSet();
OleDbConnection sas = new OleDbConnection(conn);
sas.Open();

OleDbCommand sasCommand = sas.CreateCommand();
sasCommand.CommandType = CommandType.TableDirect;
sasCommand.CommandText = library + "." + dataset;

OleDbDataAdapter da = new OleDbDataAdapter(sasCommand);
da.Fill(sasDs);
sas.Close();

return sasDs.Tables[0];
}
catch (Exception ex)
{
Common.AddLogEntry("ERROR", "Unable to load SAS dataset", "", DateTime.Now, ex.ToString());
return null;
}
}

Keep in mind that there is also SQL capability within Share so that would be an option as well.

Tuesday, February 13, 2007

SAS, ASP.NET AJAX, Web Services

In preparation for SAS Global Forum 2007, I have posted 2 videos showing 1) How to set up a web service to read a SAS dataset, 2) How to consume that service in an ASP.NET AJAX page. Simple demos but they should help get people started:

There are several videos demonstrating the concepts involved. They can be found here:

http://utilities.savian.net

Look under the video tab.

Friday, January 19, 2007

Code generation, macros, et al

As a SAS programmer evolves, they start to think:

"Hey, I can create SAS code using data step or macros!"

So they start to go down the happy path of code generation:

data _null_;
file "myplace";
put "data test ;
/ " set a; "
/ " x = wereHavingFunFunc(y)" ;
... etc.
run;

...and pretty soon they have elaborate SAS code in macros, etc. that does nothing but generate code. If you don't use my utility SasEncase to help you do this, you are doing a LOT of extra work but that is another story.

Well, I think the next level of thought (at least for me) was:

"Hey! Why can't I just use ANY programming language to generate SAS code"

Now, think about that for a sec. ANY programming language can be used to generate SAS code.

The reason why SAS is an entry into this area is because that is what all of us know and love. But, don't confine yourself to just using SAS for code generation. Instead, pick other languages as the need arises.

public void CreateSomeSasCode()
{
StreamWriter sw = new StreamWriter(@"c:\temp\mySasCode.sas") ;
sw.WriteLine("data test;") ;
....
}

Now, that's better, a little C# to play with.

Ok, so you would have to wrap a lot of code.

I think I have evolved. Now I actually write the SAS code with macro parms and store them on a server. Then you can just call the stored process and pass the parameters using web services:

DataSet ds = SasServer.Services.ExecuteStoredProcedure("mySasCode.sas", @"%let outdata = 'c:\temp'");

Now, I think we have nirvana: SAS doing what it does best being driven by a modern OOP environment.

Dropping flyballs in left field,
Alan

Cisco Information Server (CIS) and MS OLAP

Ok, so my recent issue was integrating CIS with Microsoft OLAP cubes. The normal way to do this is to set up MS OLAP (SSAS) with a website a...