<sub>2025-06-01</sub> <sub>#data-science #sas-programming </sub> <sup>[[maps-of-content|🌐 Maps of Content - All Notes]] </sup> <sup>Series: [[sas-programming-1-essentials|SAS Programming 1: Essentials]]</sup> <sup>Topic: [[sas-programming-1-essentials#Lesson 1 Essentials|Lesson 1: Essentials]]</sup>

# SAS Programming Syntax and Structure

> [!abstract]- Overview
>
> SAS programs are built like well-organized conversations: each step performs a specific task, each statement communicates a clear instruction, and proper punctuation (semicolons) ensures the message is understood.
>
> **Key Concepts**:
>
> - **Program Architecture**: DATA steps (create/modify data) and PROC steps (analyze data)
> - **Statement Structure**: Keywords + instructions + semicolon termination
> - **Error Diagnosis**: Log messages reveal exactly where and why code fails
>
> **Critical Connections**: Steps contain statements, statements require semicolons, and the log provides feedback for everything that happens. Master these three relationships and you control SAS.
>
> **Must Remember**: All statements end with semicolons, spacing aids humans (not SAS), and when something is underlined in the log but looks correct, check immediately before it for the real problem.
> [!code]- SAS Syntax Reference
>
> |Command/Syntax|Purpose|Example|
> |---|---|---|
> |**Program Structure**|||
> |`DATA tablename;`|Begin DATA step|`data myclass;`|
> |`PROC procedure;`|Begin PROC step|`proc print data=myclass;`|
> |`RUN;`|End most steps|`run;`|
> |`QUIT;`|End some PROC steps|`quit;`|
> |**Data Manipulation**|||
> |`SET tablename;`|Read input data|`set sashelp.class;`|
> |`variable = expression;`|Create/modify variable|`heightcm = height * 2.54;`|
> |`WHERE condition;`|Filter rows|`where age < 13;`|
> |`VAR variables;`|Specify variables|`var age height weight;`|
> |**Global Statements**|||
> |`TITLE "text";`|Add report title|`title "My Report";`|
> |`LIBNAME ref "path";`|Define data library|`libname mylib "/data/";`|
> |`OPTIONS settings;`|Set session options|`options nodate;`|
> |**Comments**|||
> |`* comment text;`|Single line comment|`* This calculates BMI;`|
> |`/* comment */`|Multi-line comment|`/* Data prep section */`|

## Understanding SAS Architecture

Think of SAS programs as **assembly lines** where each station (step) has a specific job. Just as a car moves from welding to painting to final inspection, your data flows through carefully designed steps that transform, analyze, and present information.

### The Two Essential Building Blocks

**DATA Steps: The Workshop**

DATA steps are where the magic of data transformation happens. Like a skilled craftsperson, DATA steps can:

- Read raw materials (input data)
- Shape and refine them (filter, calculate, combine)
- Create finished products (new SAS tables)

**PROC Steps: The Specialists**

PROC steps are like calling in experts for specific tasks. Need statistical analysis? Call PROC MEANS. Want a formatted report? PROC PRINT is your specialist. Each procedure has been perfected for particular analytical tasks.
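To make the workshop/specialist split concrete, here is a minimal sketch of one DATA step handing its output to one PROC step. The table name `work.teens` and the variable `bmi` are hypothetical names chosen for this illustration, and the BMI formula assumes that `sashelp.class` stores height in inches and weight in pounds.

```sas
/* Sketch only: work.teens and bmi are hypothetical names for this example */

/* DATA step (the workshop): read, filter, calculate, write a new table */
data work.teens;
    set sashelp.class;                  /* read raw materials */
    where age >= 13;                    /* filter: keep students aged 13 and over */
    bmi = (weight / height**2) * 703;   /* calculate: BMI from pounds and inches */
run;

/* PROC step (the specialist): summarize the new column */
proc means data=work.teens mean min max;
    var bmi;
run;
```

The DATA step never prints anything itself; it only produces the table that the PROC step then analyzes.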
> [!tip] The Assembly Line Analogy
>
> **DATA → PROC → PROC** is like **Prepare → Analyze → Report**
>
> A typical workflow: DATA step cleans and structures your data, first PROC analyzes it, second PROC creates a polished report. Each step builds on the previous one's output.

## Program Structure: The Foundation

### Basic Step Anatomy

Every SAS step follows a predictable pattern, like a well-written paragraph:

```sas
data myclass;                   /* Topic sentence: "I'm creating a table" */
    set sashelp.class;          /* Supporting detail: "Using this source" */
    heightcm = height * 2.54;   /* Supporting detail: "Adding this calculation" */
run;                            /* Conclusion: "Execute these instructions" */
```

### The Three-Step Program Pattern

Here's a complete program demonstrating the classic prepare-analyze-report workflow:

```sas
/* Step 1: Prepare the data */
data myclass;
    set sashelp.class;
    heightcm = height * 2.54;
run;

/* Step 2: Create a detailed report */
proc print data=myclass;
run;

/* Step 3: Generate summary statistics */
proc means data=myclass;
    var age heightcm;
run;
```

> [!note] Global Statements: The Background Players
>
> Some statements work **outside** the step structure:
>
> ```sas
> title "Student Height Analysis";   /* Sets title for all following output */
> options nodate;                    /* Removes date from output */
>
> data myclass;                      /* Now the regular steps begin */
>     set sashelp.class;
>     /* ... */
> run;
> ```
>
> Global statements affect your entire SAS session and don't need RUN statements.

## Statement Structure: The Grammar of SAS

### The Semicolon Rule: Non-Negotiable

In SAS, the semicolon is like the period in English: it marks the end of a complete thought. **Every statement must end with a semicolon.** This isn't a suggestion; it's the law of SAS.
```sas
/* Correct */
data test;
    set mydata;
    newvar = oldvar * 2;
run;

/* Incorrect - missing semicolons will cause errors */
data test
    set mydata
    newvar = oldvar * 2
run
```

### Spacing and Style: For Humans, Not SAS

SAS doesn't care about your formatting, but your future self (and colleagues) will thank you for clean, readable code:

```sas
/* SAS understands this perfectly but humans struggle */
data test;set mydata;newvar=oldvar*2;run;proc print;run;

/* Much better - same logic, human-friendly format */
data test;
    set mydata;
    newvar = oldvar * 2;
run;

proc print data=test;
run;
```

> [!tip] Quick Formatting
>
> **SAS Studio**: Click the Format Code button
> **SAS Enterprise Guide**: Edit → Format Code (or Ctrl+I)
>
> Let the software handle the tedious formatting while you focus on the logic.

### Case Flexibility: SAS is Relaxed

Unlike some programming languages, SAS is case-insensitive for unquoted values:

```sas
/* All of these are equivalent */
DATA MyClass;
data myclass;
Data MYCLASS;
dAtA mYcLaSs;
```

**Exception**: Text inside quotes is case-sensitive:

```sas
title "My Report";   /* Different from */
title "MY REPORT";   /* this one */
```

## Comments: Your Code's Documentation

Comments are like sticky notes for your future self: use them generously to explain **why** you're doing something, not just what you're doing.

### Single-Line Comments

```sas
* This entire statement is a comment;
data myclass;
    set sashelp.class;
    * Calculate height in centimeters for international use;
    heightcm = height * 2.54;
run;
```

### Multi-Line Comments

```sas
/* This program analyzes student height data
   Created: January 2024
   Author: Your Name
   Purpose: Demonstrate metric conversions */
data myclass;
    set sashelp.class;
    heightcm = height * 2.54;
run;
```

> [!warning] Comment Placement
>
> Single-line comments (`*`) must be complete statements ending with semicolons.
> A comment can share a line with another statement, but it still needs its own closing semicolon. Without one, SAS treats the following code as part of the comment:
>
> ```sas
> /* Incorrect: the comment has no closing semicolon */
> data myclass; * This won't work
>
> /* Correct */
> * Create analysis dataset;
> data myclass;
> ```

## Error Diagnosis

### Reading the Crime Scene (SAS Log)

When SAS encounters problems, it leaves detailed evidence in the log. Learning to read these clues is crucial for successful programming.

> [!note] The Golden Rule of Error Diagnosis
>
> **When something is underlined in the log but looks syntactically correct, the real problem is usually immediately before it.**

### Common Syntax Errors and Their Signatures

> [!caution]- Missing Semicolon
> **Missing Semicolon**
>
> ```plaintext
> ERROR: Syntax error, expecting one of the following: ;, DATA, PROC, RUN, QUIT
> ```
>
> **Diagnostic clue**: The next valid keyword is underlined, but it looks correct.

> [!caution]- Misspelled Keyword
> **Misspelled Keyword**
>
> ```plaintext
> WARNING 14-169: Assuming the symbol DATA was misspelled as daat.
> ```
>
> **Diagnostic clue**: When the typo is close to a valid keyword, SAS guesses what you meant and keeps going; when it cannot guess, it reports an error for the statement instead.

> [!caution]- Invalid Option
> **Invalid Option**
>
> ```plaintext
> ERROR: Option AVERAGE is not recognized in the PROC MEANS statement.
> ```
>
> **Diagnostic clue**: The error message often suggests valid alternatives.

### Real-World Debugging Example

> [!example]- Example Scenario
> Let's examine a program with multiple errors:
>
> ```sas
> daat mycars;                              /* Error 1: Misspelled DATA */
>     set sashelp.cars;
>     AvgMPG = mean(mpg_city, mpg_highway);
> run;
>
> proc print data=mycars                    /* Error 2: Missing semicolon */
>     var make model type avgmpg;
>     where AvgMPG > 35;
> run;
>
> proc means data=mycars average min max;   /* Error 3: Invalid option */
>     var avgmpg;
> run;
> ```
>
> **Log Analysis**:
>
> 1. **DAAT warning**: SAS recognizes this as a likely typo for DATA and continues
> 2. **VAR underlined**: Looks correct, so check before it: the semicolon after PROC PRINT is missing
> 3. **AVERAGE underlined**: Invalid option; should be MEAN

## Practical Application: Building Your First Program

> [!example]- Example Scenario
> Let's create a complete program that demonstrates all these concepts:
>
> ```sas
> /* ========================================
>    Student Height Analysis Program
>    Purpose: Convert heights and analyze data
>    ======================================== */
>
> * Set report title;
> title "International Student Height Analysis";
>
> * Create enhanced dataset;
> data student_metrics;
>     set sashelp.class;
>
>     * Set the length first so the longest category (7 characters) is not truncated;
>     length height_category $ 7;
>
>     * Convert height to centimeters for international use;
>     height_cm = height * 2.54;
>
>     * Categorize students by height;
>     if height_cm < 150 then height_category = "Short";
>     else if height_cm < 170 then height_category = "Average";
>     else height_category = "Tall";
>
>     * Round to one decimal place for cleaner display;
>     height_cm = round(height_cm, 0.1);
> run;
>
> * Display first 10 records to verify calculations;
> proc print data=student_metrics (obs=10);
>     var name age height height_cm height_category;
> run;
>
> * Generate summary statistics by height category;
> proc means data=student_metrics mean min max n;
>     var height_cm;
>     class height_category;
> run;
>
> * Clear the title for future programs;
> title;
> ```
>
> > [!tip] Program Organization Best Practices
> >
> > 1. **Header comments** explain the program's purpose
> > 2. **Section comments** mark major program divisions
> > 3. **Inline comments** explain complex logic
> > 4. **Consistent indentation** shows statement relationships
> > 5. **Descriptive variable names** make code self-documenting

## Connecting It All Together

The **fundamental pattern** is always the same:

1. **Structure your work** into logical steps (DATA for preparation, PROC for analysis)
2. **Communicate clearly** with proper statement syntax and semicolons
3. **Listen to feedback** through log messages and respond appropriately

---

Reference:

- SAS Programming 1: Essentials