This blog is designed to show various ways to use data virtualization technologies and SAS with Microsoft technologies, with an eye toward outside-the-box thinking.
Sunday, April 22, 2012
SAS & Excel Via Local Provider
A question came up on SAS-L about how to get Excel to read SAS datasets using the Data Sources within Excel. Here are some screenshots showing how it works:
Wednesday, April 18, 2012
Hash a SAS Value
Sometimes it is useful to hash a value so that a unique key can be built into the data. For example, say you are looking at a system performance log. You have a PID, a process name, and a user. A system reuses PIDs all the time, so the PID alone cannot identify a process uniquely over the course of a day.
In order to get a unique value, you could concatenate the values into one:
000789654 || WeeklyProcess || gertre5
We are assuming that there is no need to ever reverse the values. This is a key assumption.
There is an undocumented function in SAS called CRCXX1 that can create a unique hash. Here is some code illustrating it:
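data A;
  input name :$200. gender :$8. state :$20.;
  /* Concatenate the fields and strip all blanks to form one key string */
  x = compress(name||gender||state);
  /* Undocumented function: returns a numeric hash of the string */
  y = CRCXX1(x);
  put x= y=32. ;
datalines;
Churchill,Alan Male Colorado
Churchill,John Male Colorado
;
run;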
The results:
884  data A;
885  input name :$200. gender :$8. state :$20.;
886  x = compress(name||gender||state);
887  y = CRCXX1(x);
888  put x= y=32. ;
889  datalines;

x=Churchill,AlanMaleColorado y=1558070123
x=Churchill,JohnMaleColorado y=837584169
NOTE: The data set WORK.A has 2 observations and 5 variables.
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds

892  ;
893  run;
This could be very valuable when you need to tighten up processing and the original field values are throwaway. The person who mentioned the undocumented function says it is good for about 1 million unique values before collisions start to appear. Above that, go with the MD5 function.
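For the MD5 route mentioned above, a minimal sketch might look like the following. It reuses the same sample rows. The data set name B, the variable key_md5, and the '|' delimiter are illustrative choices, not part of the original example. The delimiter is added so that field boundaries stay unambiguous after concatenation.

```sas
/* Sketch only: hash the concatenated key with the documented MD5 function. */
data B;
  input name :$200. gender :$8. state :$20.;
  /* 200 + 8 + 20 characters plus two delimiters */
  length x $230 key_md5 $32;
  /* CATX strips each value and joins them with '|' (delimiter is an assumption) */
  x = catx('|', name, gender, state);
  /* MD5 returns 16 raw bytes; $HEX32. renders them as 32 hex characters. */
  /* TRIM drops the trailing blanks that pad x to its declared length.   */
  key_md5 = put(md5(trim(x)), $hex32.);
  put x= key_md5=;
datalines;
Churchill,Alan Male Colorado
Churchill,John Male Colorado
;
run;
```

The key is stored as a 32-character hex string rather than a number, so it takes more space than the CRCXX1 result. If CRCXX1 returns a 32-bit value, which the sample output is consistent with but which is an assumption here, the birthday approximation n^2 / (2 * 2^32) gives roughly 116 expected colliding pairs at one million distinct keys, so it may be worth checking for duplicate hashes well before that point.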