【postgresql 基础入门】多表联合查询 join与union 并，交，差等集合操作，两者的区别之处-Toy模板网

这篇具有很好参考价值的文章主要介绍了【postgresql 基础入门】多表联合查询 join与union 并，交，差等集合操作，两者的区别之处。希望对大家有所帮助。如果存在错误或未考虑完全的地方，请大家不吝赐教，您也可以点击"举报违法"按钮提交疑问。

多表数据联合查询

专栏内容：

postgresql内核源码分析

手写数据库toadb

并发编程

开源贡献：

toadb开源库

个人主页：我的主页
管理社区：开源数据库
座右铭：天行健，君子以自强不息；地势坤，君子以厚德载物.

系列文章

入门准备
postgrersql基础架构
快速使用
初始化集群
数据库服务管理
psql客户端使用
pgAdmin图形化客户端
数据库的使用
创建数据库
数据库操作
表的使用
表的创建
表的操作
数据查询
数据查询
多表联合查询
数据操作
插入数据的方式

前言

postgresql 数据库是一款通用的关系型数据，在开源数据库中能与商业数据媲美，在业界也越来越流行。

因为是开源数据库，不仅公开源码，还有很多使用案例，好用的插件，所以它的慢慢变成了数据库的先驱和标准，通过postgresql可以很好从使用到原理，彻底搞懂；

如果是学习编程，也可以学到丰富的编程知识，数据结构，编程技巧，它里面还有很多精妙的架构设计，分层思想，可以灵活定制的思想。

本专栏主要介绍postgresql 入门使用，数据库维护管理，通过这些使用来了解数据库原理，慢慢了解postgresql是什么样的数据库，能做那些事情，以及如何做好服务，最关键的是这些知识都是面试的必备项。

概述

我们在实际应用中查询数据，往往涉及到多表的数据，如何使用一条SQL就能得到结果呢？

本文就来分享一下，多表数据的查询方法，并举例说明它们使用的技巧；

原理介绍

多表数据的联合查询，在postgresql 中有两个基本方法：

join 连接操作；
union 联合子查询；

多表join连接，其实就是通过某个列作为纽带，将多个实际的表连接成一张大表，然后在大表上进行查询；

而union 与 join 完全不同， union 通过联合多个子查询结果，也就是说union 操作的是查询结果，将多个结果集合并成一个结果集，然后在这个总结果集上再进行二次查询处理；
也就是我们数学中的集合的几种

而更加总结的话，就涉及到关系代数中对于集合的操作：
集合操作主要包括以下几种：

并集操作（Union）：将两个集合合并成一个集合，包括所有属于两个集合的元素。
交集操作（Intersection）：将两个集合的公共元素组成一个新的集合。
差集操作（Difference）：从一个集合中去掉属于另一个集合的元素，剩下的元素组成一个新的集合。
对称差集操作（Symmetric Difference）：将属于一个集合但不属于另一个集合的元素，以及属于另一个集合但不属于一个集合的元素组成一个新的集合。
笛卡尔积操作（Cartesian Product）：将两个集合的所有可能有序对组成一个新的集合。

多表 join 连接操作

在PostgreSQL中，多表查询是通过使用连接（JOIN）和交叉连接（CROSS JOIN）等操作来实现的。

连接操作是指将两个或多个表按照指定的条件进行关联，以获得它们之间的关系数据。
下面我们举例来说明，首先创建两张表custom 和 order;

以下是一个使用PostgreSQL进行JOIN操作的案例：

假设我们有两个表：customers和orders。customers表包含客户的信息，而orders表包含订单的信息。这两个表通过一个共同的字段customer_id相关联。

首先，让我们创建这两个表并插入一些数据：

CREATE TABLE customers (
  customer_id INT PRIMARY KEY,
  name VARCHAR(50),
  email VARCHAR(50)
);

CREATE TABLE orders (
  order_id INT PRIMARY KEY,
  customer_id INT,
  order_date DATE,
  total_amount DECIMAL(10,2)
);

INSERT INTO customers (customer_id, name, email)
VALUES (1, 'John Doe', 'john@example.com'),
       (2, 'Jane Smith', 'jane@example.com'),
       (3, 'Bob Johnson', 'bob@example.com'),
       (4, 'Steven John', 'steven@example.com'),
       (5, 'Kenidy', 'Kenidy@example.com');

INSERT INTO orders (order_id, customer_id, order_date, total_amount)
VALUES (1, 1, '2023-01-01', 100.00),
       (2, 1, '2023-02-01', 200.00),
       (3, 2, '2023-02-15', 150.00),
       (4, 3, '2023-03-01', 75.00);

1. 内连接（INNER JOIN）

将两个表中的行进行匹配，返回满足连接条件的行。语法如下：

postgres=# SELECT * FROM customers INNER JOIN orders ON customers.customer_id = orders.customer_id;
 customer_id |    name     |      email       | order_id | customer_id | order_date | total_amount
-------------+-------------+------------------+----------+-------------+------------+--------------
           1 | John Doe    | john@example.com |        1 |           1 | 2023-01-01 |       100.00
           1 | John Doe    | john@example.com |        2 |           1 | 2023-02-01 |       200.00
           2 | Jane Smith  | jane@example.com |        3 |           2 | 2023-02-15 |       150.00
           3 | Bob Johnson | bob@example.com  |        4 |           3 | 2023-03-01 |        75.00
(4 rows)

这里特意用 * 查出结果集中的所有列，让我们可以清晰看到结果集的全貌，可以看到join后的结果集，是两表的所有列的合并；

对于内联连，只是列出了符合连接条件的行，大家想一想，还有什么写法可以达到这种效果。

对于，条件写到where子句中也可以，所以内联接与where条件是等价的；

postgres=# select * from customers,orders where customers.customer_id = orders.customer_id;
 customer_id |    name     |      email       | order_id | customer_id | order_date | total_amount
-------------+-------------+------------------+----------+-------------+------------+--------------
           1 | John Doe    | john@example.com |        1 |           1 | 2023-01-01 |       100.00
           1 | John Doe    | john@example.com |        2 |           1 | 2023-02-01 |       200.00
           2 | Jane Smith  | jane@example.com |        3 |           2 | 2023-02-15 |       150.00
           3 | Bob Johnson | bob@example.com  |        4 |           3 | 2023-03-01 |        75.00
(4 rows)

2. 左连接（LEFT JOIN）

在内连接的基础上，将左侧表中的所有行都包含在结果集中，即使右侧表中没有匹配的行。语法如下：

postgres=# select * from customers left join orders on  customers.customer_id = orders.customer_id;
 customer_id |    name     |       email        | order_id | customer_id | order_date | total_amount
-------------+-------------+--------------------+----------+-------------+------------+--------------
           1 | John Doe    | john@example.com   |        1 |           1 | 2023-01-01 |       100.00
           1 | John Doe    | john@example.com   |        2 |           1 | 2023-02-01 |       200.00
           2 | Jane Smith  | jane@example.com   |        3 |           2 | 2023-02-15 |       150.00
           3 | Bob Johnson | bob@example.com    |        4 |           3 | 2023-03-01 |        75.00
           5 | Kenidy      | Kenidy@example.com |          |             |            |
           4 | Steven John | steven@example.com |          |             |            |
(6 rows)

左联接后的结果集，列也是两表的合并，而行数与之前不同，左边表列表出所有行，而右边的表只列出了符合条件的行，对于左表多出的行，右表以空代替；

3. 右连接（RIGHT JOIN）

在内连接的基础上，将右侧表中的所有行都包含在结果集中，即使左侧表中没有匹配的行。语法如下：

postgres=# select * from customers right join orders on  customers.customer_id = orders.customer_id;
 customer_id |    name     |      email       | order_id | customer_id | order_date | total_amount
-------------+-------------+------------------+----------+-------------+------------+--------------
           1 | John Doe    | john@example.com |        1 |           1 | 2023-01-01 |       100.00
           1 | John Doe    | john@example.com |        2 |           1 | 2023-02-01 |       200.00
           2 | Jane Smith  | jane@example.com |        3 |           2 | 2023-02-15 |       150.00
           3 | Bob Johnson | bob@example.com  |        4 |           3 | 2023-03-01 |        75.00
(4 rows)

右联接与左联接类似，结果集的行包括右表的所有行，左表只有符合联接表条件行；

4. 全连接（FULL JOIN）

相当于在左连接和右连接的基础上，同时做左连接和右连接，并返回两侧表中所有满足条件的行。语法如下：

postgres=# select * from customers full join orders on  customers.customer_id = orders.customer_id;
 customer_id |    name     |       email        | order_id | customer_id | order_date | total_amount
-------------+-------------+--------------------+----------+-------------+------------+--------------
           1 | John Doe    | john@example.com   |        1 |           1 | 2023-01-01 |       100.00
           1 | John Doe    | john@example.com   |        2 |           1 | 2023-02-01 |       200.00
           2 | Jane Smith  | jane@example.com   |        3 |           2 | 2023-02-15 |       150.00
           3 | Bob Johnson | bob@example.com    |        4 |           3 | 2023-03-01 |        75.00
           5 | Kenidy      | Kenidy@example.com |          |             |            |
           4 | Steven John | steven@example.com |          |             |            |
(6 rows)

全联接就是包括左右两条的所有行，没有符合条件的行以空代替；

多表union 操作

PostgreSQL中的联合查询是一种将多个SELECT语句的结果组合成一个结果集的方法。它允许您从多个表或查询中获取数据，并根据指定的条件将它们组合在一起。

联合查询的基本语法如下：

SELECT column1, column2, ...
FROM table1
UNION/UNION ALL/EXCEPT/INTERSECT
SELECT column1, column2, ...
FROM table2
WHERE condition;

这里有几个关键部分：

SELECT语句：用于指定要检索的列和表。
UNION、UNION ALL、EXCEPT和INTERSECT：这些关键字用于指定要执行的联合操作类型。
WHERE子句：可选的条件，用于筛选结果。

联合类型说明

UNION：返回两个查询结果的并集，但会删除重复的行。
UNION ALL：返回两个查询结果的并集，包括重复的行。
EXCEPT：返回第一个查询结果中存在但在第二个查询结果中不存在的行。
INTERSECT：返回两个查询结果中共有的行。

请注意，使用联合查询时，确保每个查询中选择的列数和列类型是一致的，否则可能会导致错误。

1. union

合并两个表的数据并删除重复行

postgres=# select  customer_id from customers union select customer_id from orders ;
 customer_id
-------------
           2
           3
           5
           4
           1
(5 rows)

这将返回一个结果集，其中包含两个表中所有不重复的行；

两个select 子句中的列数和类型必须一致才行，这样两个结果集才能合并到一起。

2. union all

合并两个表的数据并保留重复行

如果我们希望保留两个表中的所有行，包括重复的行，那么可以使用 UNION ALL 运算符。

postgres=# select  customer_id from customers union all select customer_id from orders ;
 customer_id
-------------
           1
           2
           3
           4
           5
           1
           1
           2
           3
(9 rows)

这将返回一个结果集，其中包含两个表中所有的行，包括重复的行。
同时，如果想要对结果进行排序；

可以使用 ORDER BY 子句。例如：

postgres=# select  customer_id from customers union all select customer_id from orders order by customer_id asc;
 customer_id
-------------
           1
           1
           1
           2
           2
           3
           3
           4
           5
(9 rows)

当然也可以加where 等其它子句；

3. except

获得两个集合的差，也就是前者集合中包括，而不属于后者集合的行；

postgres=# select  customer_id from customers except select customer_id from orders ;
 customer_id
-------------
           5
           4
(2 rows)

也就是查询还没有产生订单的客户ID列表；

3. intersect

INTERSECT运算符用于找出两个SELECT语句结果集的交集。它的语法如下：

postgres=# select  customer_id from customers intersect select customer_id from orders order by customer_id asc;
 customer_id
-------------
           1
           2
           3
(3 rows)

这个类似于inner join，找到有订单的客户id列表；