PROC SQL (Part 5): Inner Joins

I. Understanding Joins

II. Cartesian Product

III. How PROC SQL Processes a Join

IV. Inner Joins

1. Purpose

2. Two methods

3. Syntax

4. Examples

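I. Understanding Joins

A join is a query that combines data from two or more tables horizontally. The joined tables do not need to have the same number of rows or columns.

There are two general types of join:

Inner join: returns only the rows that match in both tables
Outer join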

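II. Cartesian Product

Concept: when processing any type of join, PROC SQL conceptually starts by building a Cartesian product, which contains every possible combination of rows from all of the tables.

Generating a Cartesian product: if you list several tables in the FROM clause but specify no WHERE clause, PROC SQL returns the Cartesian product of those tables.

Notes:

In a Cartesian product, every row of the first table is combined with every row of the second table.
Columns that share a name are not overwritten; both appear in the result.
The number of rows in the Cartesian product equals the product of the row counts of the contributing tables.
When you run a query that involves a Cartesian product that cannot be optimized, PROC SQL writes the following note to the SAS log:
NOTE: The execution of this query involves performing one or more Cartesian product joins that cannot be optimized.
In many cases PROC SQL can optimize how the join is processed, which minimizes the resources needed to generate the Cartesian product.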

Example:

work.one        work.two
x   a           x   b
---------       ---------
1   A           2   x
2   B           3   y
3   D           5   v

Cartesian product
x   a   x   b
-------------
1   A   2   x
1   A   3   y
1   A   5   v
2   B   2   x
2   B   3   y
2   B   5   v
3   D   2   x
3   D   3   y
3   D   5   v
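
The Cartesian product above can be reproduced with a short sketch. It assumes a SAS session with a writable WORK library; the DATA steps simply re-create the two sample tables with the values shown in this section.

```sas
/* Re-create the two sample tables from this section */
data work.one;
    input x a $;
    datalines;
1 A
2 B
3 D
;
run;

data work.two;
    input x b $;
    datalines;
2 x
3 y
5 v
;
run;

/* Two tables in FROM and no WHERE clause:
   PROC SQL returns every combination, 3 x 3 = 9 rows */
proc sql;
    select *
        from work.one, work.two;
quit;
```

Because no join condition is given, the SAS log for this step should also contain the Cartesian product note quoted earlier.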

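III. How PROC SQL Processes a Join

Conceptually, PROC SQL processes a join in the following order:

Builds the Cartesian product of the rows from the specified tables.
Evaluates each row of the Cartesian product against the join condition (in the WHERE or ON clause) and any other subsetting conditions, and removes every row that does not satisfy them.
If summary functions are specified, summarizes the rows that qualify.
Returns the rows to be displayed in the output.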
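
The processing order can be traced on the small sample tables. This is only an illustrative sketch under the same assumptions as before (a SAS session with a WORK library, tables re-created from the values in this article); the column aliases ProductRows and MatchingRows are names chosen here for illustration.

```sas
data work.one;
    input x a $;
    datalines;
1 A
2 B
3 D
;
run;

data work.two;
    input x b $;
    datalines;
2 x
3 y
5 v
;
run;

proc sql;
    /* Step 1: the unfiltered Cartesian product has 9 rows */
    select count(*) as ProductRows
        from work.one, work.two;

    /* Step 2: only rows that satisfy the join condition remain
       (x=2 and x=3 appear in both tables) */
    select one.x, a, b
        from work.one, work.two
        where one.x = two.x;

    /* Step 3: a summary function is applied to the qualifying rows */
    select count(*) as MatchingRows
        from work.one, work.two
        where one.x = two.x;
quit;
```

The first query reports 9, the second returns the rows for x=2 and x=3, and the third reports 2.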

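IV. Inner Joins

1. Purpose:

An inner join combines the rows that match between two tables. It is also called a conventional join.

2. Two methods

List the two tables in the FROM clause, separated by the INNER JOIN keywords, and use the ON clause to state how rows should be matched.
List the tables in the FROM clause, separated by commas, and join them on the matching condition given in the WHERE clause.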
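
As a sketch of the two methods side by side, the following runs both forms against the sample tables work.one and work.two (re-created here from the values used in this article, assuming a WORK library is available). Both queries should return the same two rows.

```sas
data work.one;
    input x a $;
    datalines;
1 A
2 B
3 D
;
run;

data work.two;
    input x b $;
    datalines;
2 x
3 y
5 v
;
run;

proc sql;
    /* Method 1: INNER JOIN keywords in FROM, match condition in ON */
    select one.x, a, b
        from work.one inner join work.two
        on one.x = two.x;

    /* Method 2: comma-separated tables in FROM, match condition in WHERE */
    select one.x, a, b
        from work.one, work.two
        where one.x = two.x;
quit;
```

Which form to use is largely a matter of style; the INNER JOIN form keeps the join condition (ON) visually separate from other subsetting conditions (WHERE).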

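3. Syntax:

FROM: names the first table, the join type (INNER JOIN), and the second table

ON: specifies the join condition (the columns compared in the join condition must have the same data type; otherwise PROC SQL does not perform the join)

table: the name of a source table

<other clauses>: optional PROC SQL clauses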

SELECT column-1<,...column-n> 
	FROM table-1 | view-1 
		INNER JOIN table-2| view-2 
		<INNER JOIN...table-n | view-n> 
		ON table1.column=table2.column 
		<other clauses>;

4. Examples:

Using a FROM clause with the INNER JOIN keywords

proc sql;
    select *                     /* output columns: x a x b */
        from work.one inner join work.two
        on one.x=two.x;
quit;

/* Eliminate the duplicate column */
proc sql;
    select one.x, a, b           /* output columns: x a b */
        from work.one inner join work.two
        on one.x=two.x;
quit;

proc sql;
    select one.*, b              /* output columns: x a b */
        from work.one inner join work.two
        on one.x=two.x;
quit;

/* Rename one of the duplicate columns with a column alias */
proc sql;
    select one.x as ID, two.x, a, b  /* output columns: ID x a b */
        from work.one inner join work.two
        on one.x=two.x;
quit;

Joining tables in which several rows have matching values

work.three      work.four
x   a           x   b
---------       ---------
1   A1          2   x1
1   A2          2   x2
2   B1          3   y
2   B2          5   v
4   D
```
proc sql; 
    select * 
        from work.three inner join work.four 
        on three.x=four.x; 
quit;
```
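Output (each of the two x=2 rows in work.three is paired with each of the two x=2 rows in work.four; row order is not guaranteed without ORDER BY):
x   a   x   b
-------------
2   B1  2   x1
2   B1  2   x2
2   B2  2   x1
2   B2  2   x2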
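
To check this many-to-many behavior, the two tables can be re-created and joined as in the following sketch. It assumes a WORK library is available; the ORDER BY clause is an addition made here so that the row order is deterministic.

```sas
data work.three;
    input x a $;
    datalines;
1 A1
1 A2
2 B1
2 B2
4 D
;
run;

data work.four;
    input x b $;
    datalines;
2 x1
2 x2
3 y
5 v
;
run;

proc sql;
    /* Two rows with x=2 on each side give 2 x 2 = 4 output rows */
    select three.x, a, b
        from work.three inner join work.four
        on three.x = four.x
        order by a, b;
quit;
```

The result pairs B1 with x1 and x2, then B2 with x1 and x2.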

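The AS keyword defines a table alias in the FROM clause. The first query below qualifies columns with the full table names; the second does the same with the aliases s and p.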

proc sql; 
    title 'Employee Names and Job Codes'; 
    select staffmaster.empid, lastname, firstname, jobcode
        from certadv.staffmaster inner join certadv.payrollmaster 
        on staffmaster.empid=payrollmaster.empid; 
quit;

proc sql; 
    title 'Employee Names and Job Codes'; 
    select s.empid, lastname, firstname, jobcode 
        from sasuser.staffmaster as s inner join sasuser.payrollmaster as p 
        on s.empid=p.empid; 
quit;

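Table aliases are needed in the following two situations:
1. A table is joined to itself (a self-join, or reflexive join)
from sasuser.staffmaster as s1, sasuser.staffmaster as s2
2. You need to reference columns from tables that have the same name but are stored in different libraries
from certadv.flightdelays as af, certadvf.flightdelays as wf
where af.delay > wf.delay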
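
A self-join is easier to see on a tiny table. The sketch below uses a hypothetical table, work.staff, with made-up columns EmpID, Name, and MgrID; none of these come from the original data and they are used purely for illustration.

```sas
/* Hypothetical table for illustration only */
data work.staff;
    input EmpID Name $ MgrID;
    datalines;
1 Lee .
2 Kim 1
3 Roy 1
;
run;

proc sql;
    /* The same table appears twice, so the aliases e and m
       are needed to tell the two copies apart */
    select e.Name as Employee, m.Name as Manager
        from work.staff as e inner join work.staff as m
        on e.MgrID = m.EmpID;
quit;
```

Kim and Roy are each matched to their manager Lee. Lee has a missing MgrID that matches no EmpID, so the inner join drops that row.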

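A more complex PROC SQL inner join

/*
Create a report that shows the names of all employees who live in New York,
displayed as first initial and last name (R. Long), together with job code and age.
The report should also be sorted by job code and age.
sasuser.Staffmaster----EmpID, LastName, FirstName, State
sasuser.Payrollmaster--EmpId, JobCode, DateOfBirth
*/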
proc sql outobs=15; 
title 'New York Employees'; 
	select substr(firstname,1,1) || '. ' || lastname as Name, 
	       jobcode, int((today() - dateofbirth)/365.25) as Age 
		from sasuser.payrollmaster as p inner join 
		     sasuser.staffmaster as s 
		on p.empid = s.empid 
		where state='NY' 
		order by 2,3 ;
quit;

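The SELECT clause uses functions and expressions to create two new columns, Name and Age.

PROC SQL inner join with summary functions

/*
Summarize two columns for New York employees in each job code:
the number of employees and their average age.
sasuser.Staffmaster----EmpID, LastName, FirstName, State
sasuser.Payrollmaster--EmpId, JobCode, DateOfBirth
*/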
proc sql outobs=15; 
title 'Average Age of New York Employees'; 
	select jobcode,count(p.empid) as Employees, 
	       avg(int((today() - dateofbirth)/365.25)) format=4.1 as AvgAge 
		from sasuser.payrollmaster as p inner join
		     sasuser.staffmaster as s 
		on p.empid= s.empid 
		where state='NY' 
		group by jobcode
		order by jobcode;
quit;
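
The sasuser tables may not be available in every environment. The following sketch runs the same summary query against two small stand-in tables in WORK. All rows are invented for illustration and are not the real Staffmaster or Payrollmaster data; only the column names follow the layout listed above.

```sas
/* Hypothetical sample rows, not the real sasuser data */
data work.staffmaster;
    length EmpID $4 LastName $12 FirstName $12 State $2;
    input EmpID $ LastName $ FirstName $ State $;
    datalines;
1001 Long Russell NY
1002 Chen Mei NY
1003 Diaz Ana CT
1004 Park Joon NY
;
run;

data work.payrollmaster;
    length EmpID $4 JobCode $3;
    input EmpID $ JobCode $ DateOfBirth :date9.;
    format DateOfBirth date9.;
    datalines;
1001 PT1 15MAR1975
1002 PT1 02JUL1982
1003 ME2 21NOV1990
1004 ME2 09JAN1968
;
run;

proc sql;
    title 'Average Age of New York Employees (sample data)';
    select jobcode,
           count(p.empid) as Employees,
           avg(int((today() - dateofbirth)/365.25)) format=4.1 as AvgAge
        from work.payrollmaster as p inner join work.staffmaster as s
        on p.empid = s.empid
        where state = 'NY'
        group by jobcode
        order by jobcode;
quit;
title;  /* clear the title so it does not carry over to later output */
```

With these rows, ME2 has one New York employee and PT1 has two; the employee in CT is removed by the WHERE clause before the groups are summarized. The AvgAge values depend on the date the query is run.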

Article URL: https://blog.csdn.net/weixin_44450031/article/details/107134926
