Friday, August 5, 2011

Macro: May I Have This Date?

In my last job I wrote a SAS program that read in 150+ quotes (mostly snarky, oxymoronic, or anti-religious, and all of my choosing), randomly selected one, then printed it at the bottom of a daily email (batch verification) sent to the project biostatisticians.  (I'm not sure how I justified doing this when there were more pressing tasks to be completed.  Oh wait -- I remember -- a little levity was needed to temper the dysfunction infecting the project.)

At any rate, the random selection of the daily quote in my SAS program required just a couple of steps.  First, I assigned a 'random' number from the uniform distribution to each quote, with the 'seed' being the current date.  (SAS's ranuni() falls back on the system clock when the seed is zero or negative, but in an effort to minimize confusion, I chose to be explicit.)  I then sorted the quotes in ascending order according to the randomly generated numbers, with the first one being selected for output to the daily email.  I wanted to duplicate this process in Stata (probably should have been working on my proposal...oh well), but instead of outputting the quote to an automatically generated email, I sought to write a program (.ado file) that would display a randomly selected quote in the output window whenever I called my user-written command, -quote-.  Seemed straightforward enough, until I had to assign the current date to a macro in Stata...

In both SAS and Stata, dates are numbers:  the value assigned to a date is the number of days elapsed since (or preceding) January 1, 1960.  This means that unless you apply a date format to the date variables (or macros) you create (e.g. "08/05/2011", "20110805", "August 5, 2011", etc.), you'll simply get the number of days since 01/01/1960 (e.g. 18,844).  This makes working with dates relatively simple and intuitive, assuming that the system date is stored as a number.  In SAS, you can assign the current date to a variable or macro by invoking the system-stored current date via the today() function (per SAS documentation:  "a function that returns a SAS date value corresponding to the date on which the SAS program is initiated").

In Stata, however, the system date value is accessed by invoking either the global macro, $S_DATE, or the system value for the current date [c(current_date)], and both hold the date as a string rather than a number.  If I were interested in simply printing the current date on, say, an updated daily graph, I could just assign the system date to a macro, let's call it cdate, and be done with it, since it already reads as the actual date ("5 Aug 2011").  But since I wanted the date to be a number (for seeding purposes), I needed the value to be the numeric date (i.e. the number of days since 01/01/1960).  This required a quick search of the Statalist Archives to find out if any other users had dealt with the issue and if so, how.  Fortunately for me, others had, and the workaround was rather simple:  remove the spaces from the date string, then convert it to a Stata date with date(), declaring the components to be in day-month-year order ("DMY").  (I didn't find exactly what I needed on the listserv, although I was able to tweak the suggestions proffered to get what I needed.)  I then assigned this date value to a Stata macro.  The code I used to assign a random number from the uniform distribution to each quote, in both Stata and SAS, follows:


Stata
local cdate = date(subinstr("$S_DATE" , " " , "" , .), "DMY")
* set seed to current date...
set seed `cdate'
* assign random number from uniform distribution
gen xselect = runiform()

SAS
*Generate random num for each quote using the uniform distrib.;
data RandomNumbers;
  do i=1 to &NumQuotes;
    r=ranuni(today());
    output;
  end;
run;
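
To make the "dates are numbers" point concrete on the Stata side, here is a minimal sketch using the example date from this post, hard-coded so the result doesn't change from day to day.  The macro name d is just for illustration.  The last line is an assumption worth checking on your own install:  as far as I can tell, date() tolerates the spaces in c(current_date), in which case the subinstr() step is a harmless extra.

```stata
* convert the example date string to a numeric Stata date
local d = date("5 Aug 2011", "DMY")

* unformatted, the date is just days since 01jan1960
display `d'
* -> 18844

* the same number shown with a few date display formats
display %td `d'
* -> 05aug2011
display %tdCCYYNNDD `d'
* -> 20110805
display %tdNN/DD/CCYY `d'
* -> 08/05/2011

* today's date as a number (value depends on the day you run it)
display date(c(current_date), "DMY")
```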

In the SAS program, I had to generate the random numbers in a separate dataset, with the number of observations equaling the number of quotes (&NumQuotes), then merge said dataset with the dataset containing the quotes.  In Stata this step wasn't necessary -- I was able to create a new variable, xselect, containing randomly generated numbers from the uniform distribution directly in the dataset containing the quotes.  In both the Stata and SAS programs, I then sorted by the variable containing the randomly generated numbers -- xselect in Stata and r in SAS -- and chose the first quote (i.e. the quote with the lowest random number value).
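
The sort-and-select step on the Stata side can be sketched roughly as follows.  This is a toy version, not the original program:  the three placeholder quotes, the variable name qtext, and the macro name todays are all made up for illustration, and the seed is hard-coded to the numeric date for 5 Aug 2011 so the run is repeatable.

```stata
* toy stand-in for the quotes dataset (qtext is a hypothetical name)
clear
input str40 qtext
"First placeholder quote"
"Second placeholder quote"
"Third placeholder quote"
end

* seed with a numeric date (18844 = 5 Aug 2011), as in the post
set seed 18844

* one uniform random number per quote, then sort ascending
gen xselect = runiform()
sort xselect

* the quote with the lowest random number is the pick
local todays = qtext[1]
display as text "Quote of the day: "
display as result "    `todays'"
```

Because the seed only changes when the date does, every call on the same day lands on the same quote.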

After identifying the first quote in each program, I assigned the quote to a macro then output the quote to either an automatically generated email (SAS) or the output window (Stata).  With my departure from my previous job, I'm obviously no longer subjecting myself to a barrage of daily emails, although the snark, oxymoronica, and anti-religious quotes are just one short command away in Stata...

. quote
Quote of the day: 
    Maybe this world is just another planet's hell. (Aldous Huxley)
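
For anyone wanting to roll their own, a command along these lines could be wrapped up roughly as below.  This is a hypothetical sketch rather than the actual quote.ado:  the file name quotes.dta and the string variable qtext are assumptions, and the file is assumed to sit in the working directory.

```stata
* quote.ado: hypothetical sketch, not the original file
program define quote
    version 11
    * leave whatever data the user has in memory untouched
    preserve
    * quotes.dta and its string variable qtext are assumed names
    quietly use "quotes.dta", clear
    * numeric version of today's date, used as the seed
    local cdate = date(subinstr("$S_DATE", " ", "", .), "DMY")
    set seed `cdate'
    * random number per quote, held in a temporary variable
    tempvar xselect
    gen double `xselect' = runiform()
    sort `xselect'
    display as text "Quote of the day: "
    display as result "    " qtext[1]
    restore
end
```

One caveat with this design:  set seed resets Stata's random-number state for the rest of the session, so it's best not to call the command in the middle of a simulation.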
