SAS macro variable and Python arguments

Python function arguments and SAS macro variables | 18 January 2022

Python functions are a lot like SAS macros. What is the analogy to SAS macro variable in Python?

The general idea of entering arguments to a Python function is similar to SAS macro variable for SAS functions, although the details are different. We see “args” and “kwargs” a lot in Python code. They are naming convention representing * and ** bring us:

  • positional arguments, and
  • keyword arguments.

In Python, the *, the asterisk (not be mistakened as “asteroid”) and ** can be used in two different context of a function:

  1. when defining a function inputs
  2. when calling the function

When they are used in defining functions, they allow us to use unlimited number of arguments of their respective type.

When used in calling a function, they allow unpacking an iterable (list, tuple, or dictionary). When used with a dictionary, * represents the key, whereas ** represents the value.

1. When defining a function

In Python, *: mean that the argument can be any length of positional arguments (conventionally written as *arg).

**: means that the argument can be any length of keyword arguments (conventionally epresented by **arg.

Positional arguments

fixed number of positional arguments

Positional arguments are explanatory. For a refresher in SAS, here is an example from Russ Tyndall’s SAS Blog.

codepositional argument.sas
%macro test(var1,var2,var3);                                                                                                            
 %put &=var1;                                                                                                                           
 %put &=var2;                                                                                                                           
 %put &=var3;                                                                                                                           
%mend test;                                                                                                                             
 
/** Each value corresponds to the position of each variable in the definition. **/ 
/** Here, I am passing numeric values.                                         **/                                                            
%test(1,2,3)                                                                                                                            
/** The first position matches with var1 and is given a null value.            **/                                                             
%test(,2,3)                                                                                                                             
/** I pass no values, so var1-var3 are created with null values.               **/                                                             
%test()                                                                                                                                 
/** The first value contains a comma, so I use %STR to mask the comma.         **/                                                             
/** Otherwise, I would receive an error similar to this: ERROR: More           **/
/** positional parameters found than defined.                                  **/                                                             
%test(%str(1,1.1),2,3)                                                                                                                  
/** Each value corresponds to the position of each variable in the definition. **/ 
/** Here, I am passing character values.                                       **/                                                            
%test(a,b,c) 
/** I gave the first (var1) and second (var2) positions a value of             **/
/** b and c, so var3 is left with a null value.                                **/                                                             
%test(b,c)

For comparison, here is an example in Python. Except the null value case that raises an error instead of being ignored, Python results are similar to SAS.

codepositional argument.py
def test(a,b,c):
    print('var1 =',a, 'var1 =', b,'var1 =',c)
test(1,2,3)
# var1 = 1 var1 = 2 var1 = 3
test( ,2,3)
#   File "<ipython-input-40-977619a1b726>", line 1
#     test( ,2,3)
#           ^
# SyntaxError: invalid synta
test("(1,1,1)",2,3)
# var1 = (1,1,1) var1 = 2 var1 = 3

The above comparisons were done using a fixed number of arguments.

Unlimited positional arguments

For unlimited number of positional arguments, in Python we just add *.

codeunlimited positional argument.py

def test2(*args
    for n, i in enumerate(args):
        print('var%d='%n,i)
test2(1,2,3,4,5,6)
# var0= 1
# var1= 2
# var2= 3
# var3= 4
# var4= 5
# var5= 6

In comparison, for unlimited positional arguments in SAS, the PARMBUFF option creates a macro variable called &SYSPBUFF that contains the entire list of parameter values. This let us pass in a varying number of parameter values.

Some practical examples are many of the functions in matplotlib.

plt.subplots(nrows=1, ncols=1, *, sharex=False, sharey=False, squeeze=True, subplot_kw=None, gridspec_kw=None, **fig_kw)

Here, the * and **fig_kw allow us to give unlimited positional and keyword arguments, for example, like this one, where dict() converts an assignment statement into a dictionary. Using dict() is easier than typing {}.

# as in ax.annotate()
textprops=dict(color="w")
# plotting style as in plt.plot()
style = dict(color='k', alpha=0.6)

sns.boxplot(x=tips["total_bill"],
                medianprops={'color':'white'},
                boxprops = dict(linestyle='--', linewidth=3, color='darkgoldenrod'))

Keyword arguments

They are defined with an “=” sign. This is common in both Python and SAS.

2. When calling a function

Can only supply it with exactly the same number of parameters as in function, and the same type

I think this is unique to Python. And it can be confusing without an example.

codecalling function.py
def test3(a,b):
    print(a,b)
test3(1,2)
# 1 2

test3(*{'a':1,'b':2})
# a b

test3(**{'a':1,'b':2})
# 1 2

When the argument is given in the format of a dictionary, * tells Python to use the keys in the dictionary for the function, two **, tells Python to use the values in the dictionary and plug into the function. We can think of it as if the first * locates the key, and then the second * locates the value associated with the key.

Python SAS
‘keyword’, i.e.use the key * &
use the value: ** &&

The object that holds that arguments in Python can be any iterable: list, tuple, and dictionary.

codearguments_mixed.py
In [1]: def sum(a,b):
   ...:     return a+b
   ...:

In [2]: values=[1,2]

In [3]: sum(*values)
Out[3]: 3

In [4]: values=(1,2)

In [5]: sum(*values)
Out[5]: 3

In [6]: values={'a':1,'b':2}

In [7]: sum(*values)
Out[7]: 'ab'

In [8]: sum(**values)
Out[8]: 3

In [9]: def sum(a, b, c, d):
   ...:     return a + b + c + d
   ...:

In [10]: values1 = (1, 2)
    ...: values2 = { 'c': 10, 'd': 15 }
    ...: s = sum(*values1, **values2)
    ...:

In [11]: s
Out[11]: 28

In [12]: args = (1, 2)
    ...: kwargs={ 'c': 10, 'd': 15 }
    ...: s = sum(*args, **kwargs)
    ...:
    ...:

In [13]: s
Out[13]: 28

The last two examples are identical,except that the names of the inputs changed from “value1” and “value2” to “args” and “kwargs”, respectively.

In SAS the && is used on composite macro variable.

Whereas in Python, the ** means the values associated with the keys.

In a sense it is like in SAS where it is kind of like composite function in math g(f(x)), you plug in f(x) first and then get g(x)). The first * gets the keys, and the second ** use the keys to get the values.

3. Scan or loop through

Of course, in SAS, we can also use the %scan method to process an unlimited number of parameters that are held by one macro variables.

codemacro to transform time series.sas
%macro ts_transform(dsn);
%do i =2 to &n;
%let var = %scan(%quote(%var_list), &i, " ");
%put &var;
PROC EXPAND DATA = &dsn OUT = transformed METHOD= NONE;
D date;
CONVERT &var. = &var._ma4/TRANSOUT = (MOVAVE 4);  #moving average
CONVERT &var. = &var._cma4/TRANSOUT = (CMOVAVE 4); #center moving average
CONVERT &var. = &var._wma4/TRANSOUT = (MOVAVE 1 2 3 4); #weighted moving average
CONVERT &var. = &var._log/TRANSOUT = (LOG);
CONVERT &var. = &var._1/TRANSOUT = (LAG);
CONVERT &var. = &var._2/TRANSOUT = (LAG, 2);
CONVERT &var. = &var._3/TRANSOUT = (LAG, 3);
CONVERT &var. = &var.1_/TRANSOUT = (LEAD);
CONVERT &var. = &var.2_/TRANSOUT = (LEAD, 2);
CONVERT &var. = &var.3_/TRANSOUT = (LEAD, 3);
RUN;

DATA transformed;
SET transformed;
&var._yoy = DIF4(&var)/&var._4*100;
&var._yoy1 = LAG1(&var._yoy);
&var._qoq = DIF1(&var)/&var._1*100;
&var._qoq1 = LAG1(&var._qoq);
&var._myoy = DIF4(&var._ma4)/&var._ma4*100;
%END;
%MEND;

PROC SQL;
SELECT NAME INTO: v_lt SEPARATED BY " "
FROM DICTIONARY.COLUMNS
WHERE LIBNAME = LOWCASE("sc") AND MEMNAME =  LOWCASE("my_data") AND LOWCASE(NAME) NOT LIKE "%date";
QUIT;
%PUT &v_lt;
PROC SQL;
SELECT NVAR INTO: n 
FROM DICTIONARY.TABLES
WHERE LIBNAME = LOWCASE("sc") AND MEMNAME =  LOWCASE("my_data");
QUIT;
/* run the macro */
%ts_transform(SC.my_data);

In Python similar logic that are commonly used. For example,

countries = ['usa','china',...,'moon']
def country_func(some_list):
    for i in some_list:
        ...

Summary

We see “args” and “kwargs” a lot in Python code. They are naming convention representing * and ** bring us:

  • positional arguments, and
  • keyword arguments.

When they are used in defining functions, they allow us to use unlimited number of arguments of their respective type.

When used in calling a function, they allow unpacking an iterable (list, tuple, or dictionary). When used with a dictionary, * represents the key, whereas ** represents the value.

Appendix: SAS macro review

Lastly, the two common ways of using macro variable that holds a flexible number of inputs in SAS is to use

  1. DATA NULL
  2. PROC SQL After creating the list, we loop through the list to process operations, and this is often accomplished using a combination of the %scan function and the %DO statements.

%scan, and in various combination with PROC SQL, INTO, CALL SYMPUT, %DO,&&VAR&I type of syntax.

The following are from Arthur L. Carpenter Storing and Using a List of Values in a Macro Variable. To build a macro list:

  1. DATA NULL: the character variable ALLVAR is used to accumulate the list of variable names, and it is only after all the values are stored in ALLVAR is the macro variable (&VARLIST) created.

In the example, the list comes from meta data “metaclass” extracted using PROC CONTENTS.

codemacro list using data _null_.sas
proc contents data=sashelp.class noprint out=metaclass;
 run;

data _null_;
 length allvars $1000;
 retain allvars ' ';
 set metaclass end=eof;
 allvars = trim(left(allvars))||' '||left(name);
 if eof then call symput('varlist', allvars);
 run;
%put &varlist;
  1. PROC SQL. In this example, we are extracting each kind of the meta data information into a separate macro variable, each holding its respective list of information: name, type and length.
codemacro list using proc sql.sas
proc sql noprint;
 select name ,type, length
 into :varlist separated by ' ',
 :typlist separated by ' ',
 :lenlist separated by ' '
 from metaclass;
 quit;
%let cntlist = &sqlobs;
%put &varlist;
%put &typlist;
%put &lenlist;
%put &cntlist;