PROC SQL in SAS: Writing Queries Like a Pro

Published: March 15, 2026 | Author: Editorial Team | Last Updated: March 15, 2026
Published on sasbase.com | March 15, 2026

PROC SQL brings the full power of Structured Query Language into SAS, allowing you to query, join, and transform data using syntax familiar to anyone with database experience. For many operations, PROC SQL is more concise than equivalent DATA step code, and understanding both approaches makes you a more versatile SAS programmer.

PROC SQL Basics

PROC SQL operates within a single PROC block — you can run multiple SELECT statements without repeating PROC SQL and QUIT. The basic structure: PROC SQL; SELECT col1, col2 FROM library.dataset WHERE condition ORDER BY col1; QUIT;. Unlike most PROC steps, PROC SQL does not require a RUN statement — it executes immediately, and QUIT ends the PROC SQL session.

SAS datasets are referenced using the library.dataset notation in the FROM clause, just as in DATA steps. You can create new datasets with the CREATE TABLE statement: CREATE TABLE work.summary AS SELECT dept, AVG(salary) AS avg_sal FROM mydata.employees GROUP BY dept;

Joining Tables

PROC SQL supports all standard SQL join types. An INNER JOIN: SELECT a.*, b.dept_name FROM employees a INNER JOIN departments b ON a.dept_id = b.dept_id;. Unlike DATA step MERGE, SQL joins do not require pre-sorting. LEFT JOIN keeps all rows from the left table with NULLs where no match exists on the right. FULL JOIN and RIGHT JOIN are also supported.

Performance note: For very large datasets, DATA step MERGE with pre-sorted data can outperform SQL joins because SAS can process sequential sorted reads more efficiently than hash-based join operations. For moderate datasets and complex multi-table queries, PROC SQL is usually preferable for readability and conciseness.

Subqueries and Calculated Variables

PROC SQL supports correlated and non-correlated subqueries in WHERE clauses, FROM clauses (inline views), and HAVING clauses. Calculated columns use the CALCULATED keyword to reference columns defined in the same SELECT: SELECT salary, salary * 1.1 AS new_salary, CALCULATED new_salary - salary AS raise FROM employees;. This avoids repeating expressions.

Using Macro Variables with SQL

The INTO clause stores query results into macro variables: SELECT COUNT(*) INTO :nrows FROM mydata.employees; stores the count in macro variable &nrows, which can then be used in subsequent code or messages. For multiple values, INTO :varname SEPARATED BY ',' creates a comma-delimited list useful for dynamic WHERE IN() clauses.

Read our overview of SAS statistical procedures to move from data preparation to analysis, or explore our resources for recommended SAS learning materials.

← Back to Blog | Home

Subscribe to Our Newsletter

Join 10,000+ subscribers. Get expert insights delivered weekly.

No spam. Unsubscribe anytime.