PROC SQL    

 e-Guide

Mind Map 

 Getting Started

Training Videos: 

Proc SQL e-Guide I, Proc SQL e-Guide I

 Best of Both Worlds - Proc SQL and Data Step Joins

Ready to Become Really Productive using Proc SQL?
 SQL Tutorial
  Maximizing Productivity and Efficiency using Proc SQL ($120)

Smarter Proc SQL searches - It is the only SAS® Procedure that most closely resembles the DATA Step. Learn how to create simple queries to complex joins from these top SAS® papers or by access the Proc SQL Summary Sheet.  

See also DATA Step and MergeSAS Macro Programming, Dictionary tables and SAS Program Efficiency, Large Dataset and System Options.  See SAS® blog on adding group descriptive statistics.  See also Lab Data blog for examples of complex clinical trials data analysis.   

Basic Structure of SELECT statement:

SELECT name, sex FROM sashelp.class WHERE sex = 'F'
GROUP BY sex HAVING weight > avg(weight) ORDER BY name;

Basic Structure of JOINs: Worksheet

CREATE table mytable as SELECT class.name, students.sex FROM sashelp.class as class, mylib.students as students WHERE class.name = students.name and class.sex = students.sex;

CREATE table mytable as SELECT class.name, students.sex FROM sashelp.class as class LEFT JOIN mylib.students as students ON class.name = students.name and class.sex = students.sex;

  ______________________________________________

Proc SQL


______________________________________________

Proc SQL features by clauses 

 

 CALCULATED

(Required before new variable)


SUMMARY FUNCTIONS 


SUBQUERIES

X 


Detail/Summary Level 

 

5. SELECT

 

 X (For CASE clause, specify as SELECT CASE CALCULATED NEW_VAR WHEN ...)

 X 

(Can re-merge fixed value by group variable to calculate differences in fixed and variable values)

 

 

 1. FROM

or JOIN

   

 X

(Dynamic SELECT clause of KEY and outer SELECT variables)

 

2. WHERE

or ON

 

X

 X

(When joining tables, first detail subset then apply summary function in second PROC SQL)

 

X

 Along with joins by key variables, any additional conditons with 'AND' will subset at the detail or summary level.  Order is first detail level subset and then summary level subset.

 3. GROUP BY

 

X


 X

   

 

4. HAVING

 

 X

 X

(When joining tables, can summary subset to exclude records post join)

 

X

 Second Summary Subset
 6. ORDER BY

 X

 X

   
 
Horizontal Summary Functions (Across multiple variables, Detail level subset)

Vertical Group Summary Functions (Across records for 1 variable, Summary level subset)

SELECT SEX, MAX(WEIGHT, HEIGHT) FROM SASHELP.CLASS WHERE AGE > 10 ORDER BY SEX;

 
SELECT SEX, COUNT(NAME) FROM SASHELP.CLASS GROUP BY SEX HAVING COUNT(NAME) > 2;

 ORDER BY, WHERE

GROUP BY, HAVING 

______________________________________________

______________________________________________


______________________________________________

           

______________________________________________

Steps for Building Datasets with Proc SQL

______________________________________________


______________________________________________

The Ultimate Proc SQL Example

______________________________________________

Complex and Fuzzy Join with Repeated Table, Summary Variables and Conditons

Not easily possible with DATA Step: 1. _DDAY dataset is referenced twice with different conditions.  2. DATEPART functions are applied in ON conditions.  3. COALESCE() function is applied to assure non-missing variable. 4. Summary functions are applied in SELECT statement to assure on record by GROUP BY variables SUBJECT, PKDTF and PKREFID.  5. CALCULATED is applied in GROUP BY clause since SUBJECT was created in SELECT clause.  6. Can JOIN by variables from any of the four datasets.  7. Can have flexibility in JOINs such as FULL, LEFT and RIGHT.

CREATE table qc_tpt as
SELECT unique coalesce(a.subject, b.subjid, substr(c.usubjid, 17, 4))
 as subject, b.pkdtf, a.drawdt, a.timepoint, a.pk_code, b.pkrefid,
 max(c.ddayf) as nddayf format=datetime16., (b.pkdtf - calculated
 nddayf)/(60*60) as hrdiff format=4.1,
 max(d.ddayf) as nddayf3 format=datetime16., (b.pkdtf - calculated
 nddayf3)/(60*60) as hrdiff3 format=4.1
FROM _pk2 as a
FULL JOIN _pkcrf as b on a.subject=b.subjid and b.pkrefid=a.pk_code
FULL JOIN _dday as c on b.subject=substr(c.usubjid, 17, 4)
 and ( (a.timepoint ^= 'Predose' and c.ddayf <= b.pkdtf) or
 (a.timepoint = 'Predose' and (datepart(c.ddayf) = datepart(b.pkdtf))
 or (c.ddayf <= b.pkdtf) ) )
FULL JOIN _dday as d on c.subject=substr(d.usubjid, 17, 4)
 and (d.ddayf <= b.pkdtf)
GROUP BY calculated subject, b.pkdtf, b.pkrefid;  

Fuzzy Join with EQT to link related dates

CREATE TABLE fpgms as
SELECT UNIQUE scan(a.pgname, 1) as spgname, (coalesce(d.pgname,
f.pgname, g.pgname)) as pgname,
max(d.dslabel) as dslabel,
max(d.spname) as spname, max(d.spgdt) as spgdt format=date9.,
max(a.pdttm) as spgdtm format=datetime16., max(e.pdttm) as lpgdtm
format=datetime16.,
max(d.vpname) as vpname, max(d.vpgdt) as vpgdt format=date9.,
max(b.pdttm) as vpgdtm format=datetime16.,
max(c.pdttm) as vfrdtm format=datetime16.,
max(f.pdttm) as rtfdtm format=datetime16.,
max(g.pdttm) as dpgdtm format=datetime16.
FROM spgms as a
FULL JOIN vpgms as b on strip(scan(a.pgname, 1)) EQT strip(scan(b.pgname, 1))
FULL JOIN vfrms as c on strip(scan(b.pgname, 1)) EQT strip(scan(c.pgname, 1))
FULL JOIN pindex as d on strip(scan(c.pgname, 1)) EQT strip(scan(d.pgname, 1))
FULL JOIN lpgms as e on strip(scan(d.pgname, 1)) EQT strip(scan(e.pgname, 1))
FULL JOIN rtfs as f on strip(scan(e.pgname, 1)) EQT strip(scan(f.pgname, 1))
FULL JOIN dpgms as g on strip(scan(f.pgname,1)) EQT strip(scan(g.pgname, 1))
GROUP BY calculated pgname
;

   * Drop non-targets that already have a target visit match;
   CREATE new_vwm3nt (where=(xtra = .)) as
   SELECT a.*, b.t_vn as xtra FROM new_vwm3nta as a LEFT JOIN new_vwm3t as b
   on a.subjid=b.subjid and a.lbtestid=b.lbtestid and a.t_vn=b.t_vn;

   * Remerge group counts - <group_variable>n or other group descriptive stats;    * Select all source variables;
   * Left join to keep all records;
   * Subquery - count and link by name, group by sex;
   create table class2 as
   select  a.*, b.sexn from sashelp.class as a
   left join
   (select name, count(name) as sexn from sashelp.class  where sex > ' ' group by sex
   ) as b on a.name=b.name;

 See SAS blog - Something for nothing?  Adding group descriptive statistics. 

______________________________________________ 

Many-to-Many Join SAS Example - Note that LEFT JOIN is applied to keep all AE records, dataset WHERE option can be applied to AE_MED final dataset if needed.

**** MERGE MEDICATIONS WITH ADVERSE EVENTS;
proc sql;
   create table ae_meds as
   select a.subject_id, a.ae_start, a.ae_stop, a.adverse_event,
          c.cm_start, c.cm_stop, c.conmed from
   aes as a  left join  conmeds as c
   on (a.subject_id = c.subject_id) and
      ( (a.ae_start <= c.cm_start <= a.ae_stop) or
        (a.ae_start <= c.cm_stop <= a.ae_stop) or
       ((c.cm_start < a.ae_start) and (a.ae_stop < c.cm_stop)) );
quit;

Generally requires AE starting and AE ending date variables and is applied to combine AEs with ConMeds or PK with Dosing.

General Steps: 1. Merge all records with each other. 

2. Condition keeps any overlap of ConMeds visit dates in-between starting and ending AE visit dates based on three rules: a. Starting ConMeds between AE start and AE stop visit dates, b. Stopping ConMeds between AE start and AE stop visit dates, c. Starting ConMeds before AE start and stopping ConMeds after AE stop visit dates. 

* Code to standardize list of values with sequence numbers starting from 1;

proc sql;
 create table _seq_n as
   select unique sex
   from sashelp.class  where sex > ' '
   group by sex;
quit;

* Create sequence variable;
data _seq_n;
 set _seq_n;
 seq_n + 1;
run;

* Attach sequence variable to dataset;
proc sql;
 create table class2 as
   select a.*, b.seq_n
   from sashelp.class  as a left join _seq_n as b on a.sex=b.sex;
quit;

______________________________________________

SAS® Institute Book Guide PROC SQL Examples

SAS® Reference PROC SQL Syntax

SAS® Reference Proc SQL Summary Functions - ExamplesSyntax

 

 Beginner SAS Programmer

Beginner Paper PROC SQL for DATA Step Die-hards, Christianna S. Williams

Animated Guide PROC SQL, Russ Lavery

HOW Paper Ready to Become Really Productive Using Proc SQL?

SQL Set Operators Presentation

An Intro to PROC SQL: Simple & Complex Queries Melissa Navarro

 

 Advanced SAS Programmer

Unique PROC SQL PROC SQL - Is it a Required Tool for Good SAS Programming?, Ian Whitlock

The Power of PROC SQL, Amanda Reilly [Presentation] [Paper]

Subqueries in - WHERE, HAVING, FROM, SELECT, SET, and JOINS

Subquery Paper 1 The Many Users of SQL Subqueries, Tasha Chapman

Subquery Paper 2 Working with Subquery in the SQL Procedure, Lei Zhang and Danbo Yi [Correlated Subquery]

Subquery Paper 3  SQL SUBQUERIES: Usage in Clinical Programming, Pavan Vemuri

Subquery Paper 4 An In-Line View to a SQL, Darryl Putnam

Validation Paper Clinical Trial Data Validation: Using SAS PROC SQL Effectively

SAS® Paper Dear Miss SASAnswers: A Guide to Efficient PROC SQL Coding

Database Accessing Using The SAS System

Compare Efficiency with DATA Step

 

 Macro SAS Programmer

HOW Macro Paper How To Use Proc SQL select into for List Processing, Ronald Fehd [Manage Lists]

 

SAS Slides by SAS Techies

Fun  Puzzle  Answer

Search for Proc SQL Papers by Kirk Lafler

______________________________________________

1. A Hands-On Tour Inside the World of PROC SQL, Kirk Lafler

2. Undocumented and Hard-to-find SQL Features, Kirk Lafler

3. Kirk’s Ten Best PROC SQL Tips and Techniques, Kirk Lafler

4. Ten Great Reasons to Learn SAS Software's SQL Procedure, Kirk Lafler

5. Introduction to Using Proc SQL, Thomas J. Winn

6. Merging Tables in DATA Step vs. PROC SQL: Convenience and Efficiency Issues, Gajanan Bhat, Raj Suligavi

7. Proc SQL – A Primer for SAS® Programmers, Jimmy DeFoor [Having]

8. Top Ten Reasons to Use PROC SQL, Weiming Hu

9. Using the Magical Keyword INTO: in PROC SQL, Thiru Satchi

10. What would I do without PROC SQL and the Macro Language, Jeff Abolafia

11. Application of Some Advanced PROC SQL, Features in Clinical Trial Programming, Changhong Shi, Sylvianne B. Roberge

12. Simplifying NDA Programming with Proc SQL, Aileen L. Yam

13.DATA Step vs. PROC SQL: Issues with common variables when combining data sets, Curt Edmonds, Sunil Gupta

14. PROC SQL: From SELECT to Pass-Through SQL, Christopher Schacherer, Michelle Detry

15. PROC SQL: A Powerful tool to Impove Your Data Quality, Keh-Dong Shiang  [COUNT(), NMISS(), MISSING(), UNIQUE()]

16. Get SAS®sy with PROC SQL, Amie Bissonett

17. Top 10 Most Powerful Functions for PROC SQL, Chao Huang, Yu Fu [COUNT(), NMISS(), MISSING(), UNIQUE()]

18. Getting More Out of “INTO” in PROC SQL: An Example for Creating Macro
Variables, Mary-Elizabeth Eddlestone

19. A Visual Introduction to SQL Joins, Kirk Paul Lafler

20. SQL or not SQL When Performing Many-to-Many Merge – A Realistic Example, Wenjie Wang, Jade Huang

21. Using the Power of SAS SQL, Jessica Wang

22. Principles of Writing Readable SQL, Kenneth W. Borowiak

23. Introduction to Proc SQL, Katie Minten Ronk [Proc SQL options, Remerge, Percentages]

24. Combining Summary Level Data with Individual Records, Frank Ivis

25. The SQL Procedure: Tricks and Techniques for Efficiency and Simplicity, Sreekanth Middela, Suhas Sanjee [MONOTONIC()]

26. Helpful Undocumented Features in SAS®, Wei Cheng [MONOTONIC(), EQT]

27. Best Practices: Subset Without Getting Upset, Mary Rosenbloom, Kirk Lafler

28. SAS Data Views: A Virtual View of Data, John C. Boling

29. Solving Business Problems with the SQL Procedure, Kirk Lafler, Sunil Gupta

30. Frame Your View of Data with the SQL Procedure, Kirk Lafler

31. Diving into SAS Software with the SQL Procedure, Kirk Lafler

32. Five Little Known, But Highly Valuable and Widely Usable, PROC SQL Programming Techniques, Kirk Lafler

33. Nifty Uses of SQL Reflexive Join and Sub-query in SAS®, Cynthia Trinidad

34. A Comparison Between SQL and Data Step and Some SAS® Procedures, LaTonya Murphy, Lan Tran

35. Top 10 SQL Tricks in SAS®, Chao Huang [MEDIAN]

36. The SQL Procedure: When and How to Use It?, Ying Feng

37. Improving Maintenance and Performance of SQL queries, Bas van Bakel, Rick Pagie [Index]

38. Leveraging SQL Return Codes When Querying Relational Databases, John Bentley [SQLOBS]

39. Taking Advantage of Missing Values in Proc SQL, Ya Huang

40. PROC SQL for SQL Die-hards, Barbara Ross, Mary Jessica Bennett

41. Matching SAS® Data Sets with PROC SQL: If at First You Don’t Succeed, Match, Match Again, Imelda Go

42. Mastering Data Summarization with PROC SQL Christianna Williams

43. PROC SQL for PROC SUMMARY Stalwarts, Christianna Williams

44. SQL Step by Step: An advanced tutorial for business users, Judy Loren, Gregory Barnes Nelson

45. Efficient SQL for Pharma… and Other Industries, Chris Olinger [Advance]

46. MERGING vs. JOINING: Comparing the DATA Step with SQL, Malachy Foley

47. When PROC SQL Is Overwhelmed, How to Use the SyncJoin Algorithm to Match Millions of Records Fast, Houliang Li

48. Proc SQL Tips and Techniques - How to get the most out of your queries, Kevin McGowan, Brian Spruell

49. Ditch the Data Memo: Using Macro Variables and Outer Union Corresponding in PROC SQL to Create Data Set Summary Tables, Andrea Shane [Append table rows]

50. DATA Step vs. PROC SQL: What’s a neophyte to do?, Craig Dickstein, Ray Pass

51. Yes, Proc SQL is Great, But I Love Data Step Programming, What's a SAS User To Do?, Mira J. Shapiro

52. Queries, Joins, and WHERE Clauses, Oh My!! Demystifying PROC SQL, Christianna Williams [Advanced Joins, Subquery]

53. SQL SET OPERATORS: SO HANDY VENN YOU NEED THEM, Howard Schreier

54. Intermediate PROC SQL, Thomas Winn [Correlated Subquery]

55. Advanced Programming Techniques with PROC SQL, Kirk Paul Lafler [First., Last.]

56. Using Data Set Options in PROC SQL, Kenneth Borowiak

57. Relational Database Schemes and SAS Software SQL Solutions, Sigurd W. Hermansen

58. Update, Insert, and Carry-Forward Operations in Database Tables Using SAS® Enterprise Guide, Thomas E. Billings, Sreenivas Mullagiri [Integrity Constraints]

59. PROC SQL: Tips and Translations for Data Step Users, Susan Marcella, Gail Jorgensen

60. Labor-saving SQL Constructs, Brian Fairfield-Carter, David Carr

61. Cool SQL Tricks Russ Lavery

62. Fun with Address Matching: Use of the COMPGED Function and the SQL Procedure, Bianca Salas, Alexandra Varga, and Elizabeth Shuster

63. Fuzzy Matching Programming Techniques Using SAS® Software, Stephen Sloan, Kirk Paul Lafler

64. Getting Your Random Sample in Proc SQL, Richard Severino

65. Confessions of a PROC SQL Instructor, Charu Shankar

66. Fun with PROC SQL, Darryl Putnam [having count(sales)>1) ]

______________________________________________

Connect to key SAS® books and references for more information

PROC SQL: Beyond the Basics using SAS by Kirk Lafler

The Essential PROC SQL Handbook for SAS Users by Katherine Prairie

PROC SQL by Example: Using SQL within SAS by Howard Schreier

Webinar: Proc SQL Tips and Techniques by Kirk Lafler

Webinar: Dictionary Tables by Kirk Lafler

Exploring the Contents of the DICTIONARIES Table and VDCTNRY SASHELP View for SAS 9.2, Kirk Lafler

Visual SQL by Tasha Chapman

Proc SQL vs. DATA Step: Issues with common variables when combining data sets

SAS Blog - Five Reasons to use the SAS Data Step or Proc SQL

WHERE expressions 1 2 3

SAS Functions:

Referential Integrity Constraints:


SAS® Savvy Class:Maximizing Productivity and Efficiency using Proc SQL

Powered by Wild Apricot Membership Software