SQL: Database Practice Problems

Frank99 2018-08-02 13:10

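This is a personal summary, so I only list the problems that were useful to me or that I found difficult.

When I first saw the problem below, I tried numbering the rows with the window function ROW_NUMBER() OVER(), but I could not get it to handle consecutive runs properly.

I looked up other people's solutions online and am recording one here. The idea is to treat Logs as three copies of the same table, join each copy to the next on Id - 1 so that the three rows are adjacent, and then keep only the rows where Num is equal in all three. This assumes the Id values are consecutive with no gaps.

Write a SQL query to find all numbers that appear at least three times consecutively.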

+----+-----+
| Id | Num |
+----+-----+
| 1  |  1  |
| 2  |  1  |
| 3  |  1  |
| 4  |  2  |
| 5  |  1  |
| 6  |  2  |
| 7  |  2  |
+----+-----+
For example, given the Logs table above, 1 is the only number that appears at least three times consecutively.

+-----------------+
| ConsecutiveNums |
+-----------------+
| 1               |
+-----------------+

select distinct log1.Num from  Logs log1 join Logs log2 on log1.Id = log2.Id-1 
                        join Logs log3 on log2.Id = log3.Id -1
                        where log1.Num = log2.Num and log2.Num = log3.Num
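
To make the three-way self-join easy to try out, here is a minimal sketch that runs the same query against the sample data. It uses Python's built-in sqlite3 module instead of SQL Server, which is an assumption made only so the example is self-contained; the helper name consecutive_nums is hypothetical.

```python
import sqlite3

def consecutive_nums(rows):
    """Return the numbers that appear at least three times in a row (by Id)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Logs (Id INTEGER PRIMARY KEY, Num INTEGER)")
    conn.executemany("INSERT INTO Logs (Id, Num) VALUES (?, ?)", rows)
    query = """
        SELECT DISTINCT l1.Num
        FROM Logs l1
        JOIN Logs l2 ON l1.Id = l2.Id - 1   -- l2 is the row right after l1
        JOIN Logs l3 ON l2.Id = l3.Id - 1   -- l3 is the row right after l2
        WHERE l1.Num = l2.Num AND l2.Num = l3.Num
        ORDER BY l1.Num
    """
    result = [num for (num,) in conn.execute(query)]
    conn.close()
    return result

sample = [(1, 1), (2, 1), (3, 1), (4, 2), (5, 1), (6, 2), (7, 2)]
print(consecutive_nums(sample))  # [1]
```

DISTINCT matters here: a run of four equal values matches the join twice, once for each window of three rows.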

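-- Another consecutive-run problem

X city built a new stadium. The daily visitor count is recorded in three columns: sequence number (id), date (date), and number of visitors (people).

Write a query to find the peak periods: stretches of three or more consecutive days in which every day has at least 100 visitors.

For example, the table stadium: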

+------+------------+-----------+
| id   | date       | people    |
+------+------------+-----------+
| 1    | 2017-01-01 | 10        |
| 2    | 2017-01-02 | 109       |
| 3    | 2017-01-03 | 150       |
| 4    | 2017-01-04 | 99        |
| 5    | 2017-01-05 | 145       |
| 6    | 2017-01-06 | 1455      |
| 7    | 2017-01-07 | 199       |
| 8    | 2017-01-08 | 188       |
+------+------------+-----------+
For the sample data above, the output is:

+------+------------+-----------+
| id   | date       | people    |
+------+------------+-----------+
| 5    | 2017-01-05 | 145       |
| 6    | 2017-01-06 | 1455      |
| 7    | 2017-01-07 | 199       |
| 8    | 2017-01-08 | 188       |
+------+------------+-----------+
Note:
There is only one row per day, and the date increases as id increases.


;WITH stadium(id,date,people) AS
(
  SELECT 1,'2017-01-01',10 union all
  SELECT 2,'2017-01-02',109 union all
  SELECT 3,'2017-01-03',150 union all
  SELECT 4,'2017-01-04',99 union all
  SELECT 5,'2017-01-05',145 union all
  SELECT 6,'2017-01-06',1455 union all
  SELECT 7,'2017-01-07',199 union all
  SELECT 8,'2017-01-08',188
)
, temp as(
  -- filter once here (at least 100 people) so the joins below need no extra conditions
  select * from stadium where people >= 100
)
,final_tab as (
  -- t2 is the middle day of three consecutive busy days
  select t2.* from temp t1 join temp t2 on t1.id+1 = t2.id
                           join temp t3 on t2.id+1 = t3.id
)
-- return every middle day together with the day before it and the day after it;
-- UNION (not UNION ALL) removes the duplicates produced by overlapping windows
select s.* from stadium s join final_tab f on s.id+1 = f.id
union
select id
       ,date
       ,people from final_tab
union
select s.* from stadium s join final_tab f on s.id-1 = f.id
order by id
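
For comparison, row numbering can be made to work on this kind of problem when it is combined with the id: among the rows that pass the filter, id - ROW_NUMBER() stays constant within a run of consecutive ids and changes whenever there is a gap. The sketch below is one possible way to write that, not the approach used above. It assumes SQLite 3.25 or later (for window functions) through Python's sqlite3 module, and the names peak_days, busy, sized, grp and run_length are made up for illustration.

```python
import sqlite3

SAMPLE = [
    (1, "2017-01-01", 10), (2, "2017-01-02", 109), (3, "2017-01-03", 150),
    (4, "2017-01-04", 99), (5, "2017-01-05", 145), (6, "2017-01-06", 1455),
    (7, "2017-01-07", 199), (8, "2017-01-08", 188),
]

def peak_days(rows, threshold=100, min_run=3):
    """Return rows that belong to a run of at least min_run consecutive busy days."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE stadium (id INTEGER PRIMARY KEY, date TEXT, people INTEGER)")
    conn.executemany("INSERT INTO stadium VALUES (?, ?, ?)", rows)
    query = """
        WITH busy AS (
            -- consecutive ids share the same value of id - row_number
            SELECT id, date, people,
                   id - ROW_NUMBER() OVER (ORDER BY id) AS grp
            FROM stadium
            WHERE people >= :threshold
        ),
        sized AS (
            -- attach the length of its run to every row
            SELECT id, date, people,
                   COUNT(*) OVER (PARTITION BY grp) AS run_length
            FROM busy
        )
        SELECT id, date, people
        FROM sized
        WHERE run_length >= :min_run
        ORDER BY id
    """
    result = conn.execute(query, {"threshold": threshold, "min_run": min_run}).fetchall()
    conn.close()
    return result

for row in peak_days(SAMPLE):
    print(row)
```

On the sample data the busy ids are 2, 3, 5, 6, 7, 8, which gives grp values 1, 1, 2, 2, 2, 2, so only the run 5 to 8 is long enough.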

LeetCode difficulty: Hard
Task:
-- Top three salaries in each department
The Employee table holds all employees. Every employee has an Id, a salary, and a department Id.

+----+-------+--------+--------------+
| Id | Name  | Salary | DepartmentId |
+----+-------+--------+--------------+
| 1  | Joe   | 70000  | 1            |
| 2  | Henry | 80000  | 2            |
| 3  | Sam   | 60000  | 2            |
| 4  | Max   | 90000  | 1            |
| 5  | Janet | 69000  | 1            |
| 6  | Randy | 85000  | 1            |
+----+-------+--------+--------------+
The Department table holds all departments of the company.

+----+----------+
| Id | Name     |
+----+----------+
| 1  | IT       |
| 2  | Sales    |
+----+----------+
Write a SQL query to find the employees who earn the top three salaries in each department. For the tables above, the query should return:

+------------+----------+--------+
| Department | Employee | Salary |
+------------+----------+--------+
| IT         | Max      | 90000  |
| IT         | Randy    | 85000  |
| IT         | Joe      | 70000  |
| Sales      | Henry    | 80000  |
| Sales      | Sam      | 60000  |
+------------+----------+--------+

Solution 1:


;with Employee(Id,Name,Salary,DepartmentId) AS
(
  -- Salary is numeric here; as a string it would sort alphabetically
  select 1,'Joe',70000,1 union all
  select 2,'Henry',80000,2 union all
  select 3,'Sam',60000,2 union all
  select 4,'Max',90000,1 union all
  select 5,'Janet',69000,1 union all
  select 6,'Randy',85000,1
)
, Department(Id,Name) AS(
  SELECT 1,'IT'
  UNION ALL
  SELECT 2,'Sales'
)
select d.Name as Department,e.Name as Employee,e.Salary from
(
  -- number the employees of each department from the highest salary down
  SELECT *,ROW_NUMBER()OVER(partition by DepartmentId order by Salary desc) as rn FROM Employee
) e
join Department d on e.DepartmentId = d.Id
where e.rn<=3
order by d.Id ASC

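Solution 2:
ROW_NUMBER gives employees with equal salaries different numbers, so a tie can push someone out of the top three. DENSE_RANK gives equal salaries the same rank and leaves no gaps, so the filter keeps everyone whose salary is among the three highest distinct values.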
-- uses the same Employee and Department CTEs as Solution 1
select d.Name as Department,e.Name as Employee,e.Salary from
(
  SELECT *,DENSE_RANK()OVER(partition by DepartmentId order by Salary desc) as rnk FROM Employee
) e
join Department d on e.DepartmentId = d.Id
where e.rnk<=3
order by d.Id ASC
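
The sample data contains no tied salaries, so both solutions return the same rows for it. The sketch below adds one hypothetical employee (Will, 70000, IT) who ties with Joe, to show where ROW_NUMBER and DENSE_RANK differ. It assumes SQLite 3.25 or later through Python's sqlite3 module; the helper top_three is illustrative only.

```python
import sqlite3

EMPLOYEES = [
    (1, "Joe", 70000, 1), (2, "Henry", 80000, 2), (3, "Sam", 60000, 2),
    (4, "Max", 90000, 1), (5, "Janet", 69000, 1), (6, "Randy", 85000, 1),
    (7, "Will", 70000, 1),  # hypothetical extra row, not in the original data
]
DEPARTMENTS = [(1, "IT"), (2, "Sales")]

def top_three(rank_function):
    """rank_function must be 'ROW_NUMBER' or 'DENSE_RANK'."""
    if rank_function not in ("ROW_NUMBER", "DENSE_RANK"):
        raise ValueError("unsupported ranking function")
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Employee (Id INTEGER, Name TEXT, Salary INTEGER, DepartmentId INTEGER)")
    conn.execute("CREATE TABLE Department (Id INTEGER, Name TEXT)")
    conn.executemany("INSERT INTO Employee VALUES (?, ?, ?, ?)", EMPLOYEES)
    conn.executemany("INSERT INTO Department VALUES (?, ?)", DEPARTMENTS)
    query = f"""
        SELECT d.Name, e.Name, e.Salary
        FROM (
            SELECT *, {rank_function}() OVER (
                PARTITION BY DepartmentId ORDER BY Salary DESC
            ) AS rnk
            FROM Employee
        ) e
        JOIN Department d ON e.DepartmentId = d.Id
        WHERE e.rnk <= 3
        ORDER BY d.Id, e.Salary DESC, e.Name
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result

it_rows_row_number = [r for r in top_three("ROW_NUMBER") if r[0] == "IT"]
it_rows_dense_rank = [r for r in top_three("DENSE_RANK") if r[0] == "IT"]
print(len(it_rows_row_number), len(it_rows_dense_rank))  # 3 4
```

With ROW_NUMBER only one of Joe and Will survives the filter, and which one is not defined; with DENSE_RANK both share rank 3 and both are returned.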

LeetCode difficulty: Easy
Given a Weather table, write a SQL query to find the Id of every date whose temperature is higher than that of the previous day (yesterday).

+---------+------------------+------------------+
| Id(INT) | RecordDate(DATE) | Temperature(INT) |
+---------+------------------+------------------+
|       1 |       2015-01-01 |               10 |
|       2 |       2015-01-02 |               25 |
|       3 |       2015-01-03 |               20 |
|       4 |       2015-01-04 |               30 |
+---------+------------------+------------------+
For example, for the Weather table above, return the following Ids:

+----+
| Id |
+----+
|  2 |
|  4 |
+----+
select Id from Weather w where Temperature > (select Temperature from Weather l  where dateadd(day,1,l.RecordDate) = w.RecordDate)
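
A small sketch of the same correlated subquery, runnable with Python's sqlite3 module. SQLite has no dateadd, so date(RecordDate, '+1 day') is used as a stand-in; that substitution and the helper name warmer_days are assumptions for illustration.

```python
import sqlite3

WEATHER = [
    (1, "2015-01-01", 10), (2, "2015-01-02", 25),
    (3, "2015-01-03", 20), (4, "2015-01-04", 30),
]

def warmer_days(rows):
    """Return the Ids of days that are warmer than the calendar day before them."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Weather (Id INTEGER, RecordDate TEXT, Temperature INTEGER)")
    conn.executemany("INSERT INTO Weather VALUES (?, ?, ?)", rows)
    query = """
        SELECT w.Id
        FROM Weather w
        WHERE w.Temperature > (
            -- yesterday's row: its date plus one day equals today's date
            SELECT l.Temperature
            FROM Weather l
            WHERE date(l.RecordDate, '+1 day') = w.RecordDate
        )
        ORDER BY w.Id
    """
    result = [row[0] for row in conn.execute(query)]
    conn.close()
    return result

print(warmer_days(WEATHER))  # [2, 4]
```

Comparing dates rather than Ids means a missing day is handled correctly: if there is no row for yesterday, the subquery returns NULL and the row is filtered out.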

LeetCode difficulty: Easy
Write a SQL query to get the second highest salary (Salary) from the Employee table.

+----+--------+
| Id | Salary |
+----+--------+
| 1  | 100    |
| 2  | 200    |
| 3  | 300    |
+----+--------+
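For example, given the Employee table above, the query should return 200 as the second highest salary. If there is no second highest salary, the query should return null.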

+---------------------+
| SecondHighestSalary |
+---------------------+
| 200                 |
+---------------------+
Solution:
SELECT MAX(SALARY) AS SecondHighestSalary FROM Employee WHERE Salary <(SELECT MAX(SALARY) FROM Employee)
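
The query meets the "return null" requirement because MAX over an empty set of rows yields NULL. A minimal sketch to check that behaviour, using Python's sqlite3 module as an assumed stand-in for the LeetCode environment (second_highest is a hypothetical helper):

```python
import sqlite3

def second_highest(salaries):
    """Return the second highest distinct salary, or None if there is none."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Employee (Id INTEGER PRIMARY KEY, Salary INTEGER)")
    conn.executemany("INSERT INTO Employee (Salary) VALUES (?)", [(s,) for s in salaries])
    row = conn.execute("""
        SELECT MAX(Salary) AS SecondHighestSalary
        FROM Employee
        WHERE Salary < (SELECT MAX(Salary) FROM Employee)
    """).fetchone()
    conn.close()
    return row[0]

print(second_highest([100, 200, 300]))  # 200
print(second_highest([100]))            # None
```

A table where every row has the same salary also returns None, since no salary is strictly below the maximum.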

LeetCode difficulty: Medium
The Employee table holds all employees. Every employee has an Id, a salary, and a department Id.

+----+-------+--------+--------------+
| Id | Name  | Salary | DepartmentId |
+----+-------+--------+--------------+
| 1  | Joe   | 70000  | 1            |
| 2  | Henry | 80000  | 2            |
| 3  | Sam   | 60000  | 2            |
| 4  | Max   | 90000  | 1            |
+----+-------+--------+--------------+
The Department table holds all departments of the company.

+----+----------+
| Id | Name     |
+----+----------+
| 1  | IT       |
| 2  | Sales    |
+----+----------+
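Write a SQL query to find the employees who have the highest salary in each department. For the tables above, Max has the highest salary in the IT department and Henry has the highest salary in the Sales department.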

+------------+----------+--------+
| Department | Employee | Salary |
+------------+----------+--------+
| IT         | Max      | 90000  |
| Sales      | Henry    | 80000  |
+------------+----------+--------+

/* Write your T-SQL query statement below */
select d.Name as Department
      ,e.Name as Employee
      ,e.Salary as Salary
from Employee e 
      join Department d on e.DepartmentId = d.Id
      join(  SELECT DepartmentId
                    ,max(Salary) as Salary 
                    FROM Employee  Group by DepartmentId
            ) 
             grp on e.Salary = grp.Salary 
             and e.DepartmentId = grp.DepartmentId



Write a SQL query to delete all duplicate emails from the Person table, keeping only the row with the smallest Id for each duplicated email.

+----+------------------+
| Id | Email            |
+----+------------------+
| 1  | john@example.com |
| 2  | bob@example.com  |
| 3  | john@example.com |
+----+------------------+
Id is the primary key of this table.
For example, after running your query, the Person table above should contain the following rows:

+----+------------------+
| Id | Email            |
+----+------------------+
| 1  | john@example.com |
| 2  | bob@example.com  |
+----+------------------+

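--- LeetCode only accepts MySQL syntax for this problem, which was rather annoying. I am not sure whether my submission was accepted, but I am recording my answer here anyway.

# Write your MySQL query statement below
# Keep the smallest Id of every email and delete all other rows.
# GROUP BY Email is required; without it min(Id) would be a single value for the whole table.
# The extra derived table (as n) is needed because MySQL does not allow a subquery
# to select directly from the table that is being deleted from.
delete from Person
where Id not in (select id
                 from (select min(Id) as id
                       from Person
                       group by Email) as n)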
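
To check the rule "keep the smallest Id per email" on more than one duplicated address, here is a minimal sketch using Python's sqlite3 module. SQLite is an assumption made for runnability; unlike MySQL it accepts the subquery on the same table directly, so no wrapping derived table is used. The helper dedupe and the short test addresses are hypothetical.

```python
import sqlite3

def dedupe(rows):
    """Delete duplicate emails, keeping the row with the smallest Id for each email."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Person (Id INTEGER PRIMARY KEY, Email TEXT)")
    conn.executemany("INSERT INTO Person VALUES (?, ?)", rows)
    conn.execute("""
        DELETE FROM Person
        WHERE Id NOT IN (
            SELECT MIN(Id) FROM Person GROUP BY Email   -- one surviving Id per email
        )
    """)
    remaining = conn.execute("SELECT Id, Email FROM Person ORDER BY Id").fetchall()
    conn.close()
    return remaining

people = [(1, "john@example.com"), (2, "bob@example.com"), (3, "john@example.com")]
print(dedupe(people))  # [(1, 'john@example.com'), (2, 'bob@example.com')]
```

Grouping by Email is what makes the query safe when several different addresses are duplicated at once.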
                                     

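LeetCode difficulty: Medium
Xiaomei is an information technology teacher at a middle school. She has a table seat that stores the students' names and their corresponding seat ids.

The id column increases continuously.

Xiaomei wants to swap the seats of every two adjacent students.

Can you write a SQL query that outputs the result Xiaomei wants?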

 

Example:

+---------+---------+
|    id   | student |
+---------+---------+
|    1    | Abbot   |
|    2    | Doris   |
|    3    | Emerson |
|    4    | Green   |
|    5    | Jeames  |
+---------+---------+
If the input is the table above, the output is:

+---------+---------+
|    id   | student |
+---------+---------+
|    1    | Doris   |
|    2    | Abbot   |
|    3    | Green   |
|    4    | Emerson |
|    5    | Jeames  |
+---------+---------+
Note:

If the number of students is odd, the last student's seat does not need to change.


                                             
;with seat(id,student) as(                                             
SELECT 1 ,'Abbot' union all
SELECT 2 ,'Doris' union all
SELECT 3 ,'Emerson' union all
SELECT 4 ,'Green' union all
SELECT 5 ,'Jeames'
)
,odd_tab as(
select * from seat where id %2 = 1
)
,even_tab as(
select * from seat where id %2 = 0
)
select o.id,isnull(e.student,o.student) as student from odd_tab o left join even_tab e on o.id +1 = e.id
union ALL
select e.id,o.student from odd_tab o join even_tab e on o.id +1 = e.id
order by id asc 
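
The odd/even split above can be tried out with the sketch below, which runs the same query shape through Python's sqlite3 module. SQLite's ifnull replaces T-SQL's isnull; that substitution and the helper name swap_seats are assumptions for illustration.

```python
import sqlite3

SEATS = [(1, "Abbot"), (2, "Doris"), (3, "Emerson"), (4, "Green"), (5, "Jeames")]

def swap_seats(rows):
    """Swap each odd seat with the even seat after it; a trailing odd seat stays."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE seat (id INTEGER PRIMARY KEY, student TEXT)")
    conn.executemany("INSERT INTO seat VALUES (?, ?)", rows)
    query = """
        WITH odd_tab AS (SELECT * FROM seat WHERE id % 2 = 1),
             even_tab AS (SELECT * FROM seat WHERE id % 2 = 0)
        -- odd seats take the next student, or keep their own if there is none
        SELECT o.id AS id, ifnull(e.student, o.student) AS student
        FROM odd_tab o LEFT JOIN even_tab e ON o.id + 1 = e.id
        UNION ALL
        -- even seats take the student from the odd seat before them
        SELECT e.id, o.student
        FROM odd_tab o JOIN even_tab e ON o.id + 1 = e.id
        ORDER BY id
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result

for row in swap_seats(SEATS):
    print(row)
```

The LEFT JOIN in the first branch is what keeps the last student in place when the number of students is odd.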

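LeetCode difficulty: Hard, although I personally do not think it deserves that label.

The Trips table holds all taxi trips. Each trip has a unique key Id, while Client_Id and Driver_Id are foreign keys to Users_Id in the Users table. Status is an enum type with the members ('completed', 'cancelled_by_driver', 'cancelled_by_client').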

+----+-----------+-----------+---------+--------------------+----------+
| Id | Client_Id | Driver_Id | City_Id |        Status      |Request_at|
+----+-----------+-----------+---------+--------------------+----------+
| 1  |     1     |    10     |    1    |     completed      |2013-10-01|
| 2  |     2     |    11     |    1    | cancelled_by_driver|2013-10-01|
| 3  |     3     |    12     |    6    |     completed      |2013-10-01|
| 4  |     4     |    13     |    6    | cancelled_by_client|2013-10-01|
| 5  |     1     |    10     |    1    |     completed      |2013-10-02|
| 6  |     2     |    11     |    6    |     completed      |2013-10-02|
| 7  |     3     |    12     |    6    |     completed      |2013-10-02|
| 8  |     2     |    12     |    12   |     completed      |2013-10-03|
| 9  |     3     |    10     |    12   |     completed      |2013-10-03| 
| 10 |     4     |    13     |    12   | cancelled_by_driver|2013-10-03|
+----+-----------+-----------+---------+--------------------+----------+
The Users table holds all users. Each user has a unique key Users_Id. Banned indicates whether the user is banned, and Role is an enum type with the members ('client', 'driver', 'partner').

+----------+--------+--------+
| Users_Id | Banned |  Role  |
+----------+--------+--------+
|    1     |   No   | client |
|    2     |   Yes  | client |
|    3     |   No   | client |
|    4     |   No   | client |
|    10    |   No   | driver |
|    11    |   No   | driver |
|    12    |   No   | driver |
|    13    |   No   | driver |
+----------+--------+--------+
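Write a SQL query to find the cancellation rate of requests made by unbanned users between 2013-10-01 and 2013-10-03. For the tables above, your query should return the following result, with the cancellation rate (Cancellation Rate) rounded to two decimal places.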

+------------+-------------------+
|     Day    | Cancellation Rate |
+------------+-------------------+
| 2013-10-01 |       0.33        |
| 2013-10-02 |       0.00        |
| 2013-10-03 |       0.50        |
+------------+-------------------+



Create table   Trips (Id int,Client_Id int, 
                       Driver_Id int, City_Id int, 
                       Status varchar(50), 
                       Request_at varchar(50));
Create table   Users (Users_Id int,Banned varchar(50), Role varchar(50));
Truncate table Trips;


insert into Trips (Id, Client_Id, Driver_Id,City_Id, Status, Request_at) values ('1', '1', '10', '1', 'completed','2013-10-01');
insert into Trips (Id, Client_Id, Driver_Id,City_Id, Status, Request_at) values ('2', '2', '11', '1','cancelled_by_driver', '2013-10-01');
insert into Trips (Id, Client_Id, Driver_Id,City_Id, Status, Request_at) values ('3', '3', '12', '6', 'completed','2013-10-01');
insert into Trips (Id, Client_Id, Driver_Id,City_Id, Status, Request_at) values ('4', '4', '13', '6','cancelled_by_client', '2013-10-01');
insert into Trips (Id, Client_Id, Driver_Id,City_Id, Status, Request_at) values ('5', '1', '10', '1', 'completed','2013-10-02');
insert into Trips (Id, Client_Id, Driver_Id,City_Id, Status, Request_at) values ('6', '2', '11', '6', 'completed','2013-10-02');
insert into Trips (Id, Client_Id, Driver_Id,City_Id, Status, Request_at) values ('7', '3', '12', '6', 'completed','2013-10-02');
insert into Trips (Id, Client_Id, Driver_Id,City_Id, Status, Request_at) values ('8', '2', '12', '12', 'completed','2013-10-03');
insert into Trips (Id, Client_Id, Driver_Id,City_Id, Status, Request_at) values ('9', '3', '10', '12', 'completed','2013-10-03');
insert into Trips (Id, Client_Id, Driver_Id,City_Id, Status, Request_at) values ('10', '4', '13', '12','cancelled_by_driver', '2013-10-03');
Truncate table Users;
insert into Users (Users_Id, Banned, Role)values ('1', 'No', 'client');
insert into Users (Users_Id, Banned, Role)values ('2', 'Yes', 'client');
insert into Users (Users_Id, Banned, Role)values ('3', 'No', 'client');
insert into Users (Users_Id, Banned, Role)values ('4', 'No', 'client');
insert into Users (Users_Id, Banned, Role)values ('10', 'No', 'driver');
insert into Users (Users_Id, Banned, Role)values ('11', 'No', 'driver');
insert into Users (Users_Id, Banned, Role)values ('12', 'No', 'driver');
insert into Users (Users_Id, Banned, Role)values ('13', 'No', 'driver');


select  t.Request_at as [Day],Round((sum(case when Status like '%cancelled%' then 1 else 0 end) * 1.0 / count(status) ),2) as [Cancellation Rate]
from  Trips t join USERs u 
                   on t.client_id = u.users_id 
                   where Request_at between '2013-10-01' and '2013-10-03'
                         and u.Role = 'client' and Banned = 'No'
                         group by Request_at
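
The conditional-sum pattern in the query (count the cancelled trips with CASE, divide by the number of trips per day) can be checked against the sample data with the sketch below. It runs on Python's sqlite3 module rather than SQL Server, which is an assumption for runnability, and like the query above it filters on the client's Banned flag only. The helper cancellation_rates is hypothetical.

```python
import sqlite3

TRIPS = [
    (1, 1, 10, 1, "completed", "2013-10-01"),
    (2, 2, 11, 1, "cancelled_by_driver", "2013-10-01"),
    (3, 3, 12, 6, "completed", "2013-10-01"),
    (4, 4, 13, 6, "cancelled_by_client", "2013-10-01"),
    (5, 1, 10, 1, "completed", "2013-10-02"),
    (6, 2, 11, 6, "completed", "2013-10-02"),
    (7, 3, 12, 6, "completed", "2013-10-02"),
    (8, 2, 12, 12, "completed", "2013-10-03"),
    (9, 3, 10, 12, "completed", "2013-10-03"),
    (10, 4, 13, 12, "cancelled_by_driver", "2013-10-03"),
]
USERS = [
    (1, "No", "client"), (2, "Yes", "client"), (3, "No", "client"), (4, "No", "client"),
    (10, "No", "driver"), (11, "No", "driver"), (12, "No", "driver"), (13, "No", "driver"),
]

def cancellation_rates():
    """Return (day, cancellation rate) pairs for trips requested by unbanned clients."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Trips (Id INTEGER, Client_Id INTEGER, Driver_Id INTEGER, "
                 "City_Id INTEGER, Status TEXT, Request_at TEXT)")
    conn.execute("CREATE TABLE Users (Users_Id INTEGER, Banned TEXT, Role TEXT)")
    conn.executemany("INSERT INTO Trips VALUES (?, ?, ?, ?, ?, ?)", TRIPS)
    conn.executemany("INSERT INTO Users VALUES (?, ?, ?)", USERS)
    query = """
        SELECT t.Request_at AS Day,
               ROUND(SUM(CASE WHEN t.Status LIKE 'cancelled%' THEN 1 ELSE 0 END) * 1.0
                     / COUNT(*), 2) AS CancellationRate
        FROM Trips t
        JOIN Users u ON t.Client_Id = u.Users_Id
        WHERE t.Request_at BETWEEN '2013-10-01' AND '2013-10-03'
          AND u.Banned = 'No'
        GROUP BY t.Request_at
        ORDER BY t.Request_at
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result

for day, rate in cancellation_rates():
    print(day, rate)
```

Multiplying by 1.0 before dividing forces floating-point division; without it the integer division would truncate every rate below 1 to 0.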

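Summary:

I have now worked through all the SQL problems on LeetCode, and the overall difficulty felt moderate. Where I got stuck was the consecutive-run problems; there are two of them, the first and second problems in this post.
The next time I see a consecutive-run problem, my first thought should be a multi-way self-join on id+1, id+2, ..., id+n rather than row numbering, since numbering by itself was not enough to solve these problems.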
