SQL interview question: "find users who share the same IPs"

Given a table with three fields, user_id, ip, and log_time, find every pair of users who share at least 3 IPs.

1. Preparing the table data


-- Create the table
create table dms.user_login_log (
    user_id   string
   ,ip        string
   ,log_time  string
);

-- Insert the sample data (user_id is a string column, so the ids are quoted)
insert into dms.user_login_log values
('102','192.168.10.101','2022-05-10 11:04:30'),
('102','192.168.10.102','2022-05-10 11:05:00'),
('102','192.168.10.103','2022-05-10 11:06:00'),
('102','192.168.10.104','2022-05-10 11:07:00'),
('101','192.168.10.101','2022-05-10 11:00:00'),
('101','192.168.10.101','2022-05-10 11:01:00'),
('101','192.168.10.102','2022-05-10 11:02:00'),
('101','192.168.10.103','2022-05-10 11:03:00'),
('101','192.168.10.104','2022-05-10 11:04:00'),
('103','192.168.10.102','2022-05-10 11:08:00'),
('103','192.168.10.103','2022-05-10 11:08:00'),
('103','192.168.10.104','2022-05-10 11:10:00'),
('104','192.168.10.103','2022-05-10 11:11:00'),
('104','192.168.10.104','2022-05-10 11:12:00'),
('105','192.168.10.105','2022-05-10 11:13:00');

2. Requirement analysis

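What we ultimately want are the user pairs that share at least 3 IPs. With the sample data, users 101 and 102 share 4 IPs, 101 and 103 share 3, and 102 and 103 share 3. This can be done with a self-join: deduplicate the table to one row per (user_id, ip), join that result to itself on ip, and count the matching rows for each pair of users.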
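To make the self-join idea concrete, here is a minimal sketch that stops before the counting step and simply lists each shared IP per user pair. It assumes the dms.user_login_log table from section 1 and a Hive version that supports common table expressions; the names user_ip, user_a, user_b, and shared_ip are illustrative only.

```sql
-- Step 1: one row per (user_id, ip), so repeated logins from one IP count once.
with user_ip as (
    select user_id, ip
    from dms.user_login_log
    group by user_id, ip
)
-- Step 2: join on ip; every output row is one IP shared by two users.
select
    a.user_id as user_a,
    b.user_id as user_b,
    a.ip      as shared_ip
from user_ip a
join user_ip b
  on a.ip = b.ip
where a.user_id > b.user_id   -- keep each pair once and drop self-pairs
order by user_a, user_b, shared_ip;
```

With the sample data, the pair (102, 101) appears in four rows, one per shared IP, while the pair (104, 101) appears in only two. Counting these rows per pair is what the final query does.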

3. The SQL implementation

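-- A1 and A2 are the same deduplicated (user_id, ip) set. Joining them on ip
-- yields one row per IP shared by two users.
select
    A1.user_id as user_id_1
   ,A2.user_id as user_id_2
   ,count(1)   as shared_ip_cnt
from (
    select user_id, ip
    from dms.user_login_log
    group by user_id, ip
) A1
join (
    select user_id, ip
    from dms.user_login_log
    group by user_id, ip
) A2
  on A1.ip = A2.ip
where A1.user_id > A2.user_id   -- keep each pair once and drop self-pairs
group by A1.user_id, A2.user_id
having count(1) >= 3;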
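When checking the result by hand, it can help to see which IPs each pair actually shares. The following is one possible variant, not part of the original solution; it relies on Hive's built-in collect_set, sort_array, and concat_ws functions, and the column names shared_ip_cnt and shared_ips are made up for illustration.

```sql
select
    a.user_id                                     as user_a,
    b.user_id                                     as user_b,
    count(1)                                      as shared_ip_cnt,
    -- collect the shared IPs into an array, sort it, and join it into one string
    concat_ws(',', sort_array(collect_set(a.ip))) as shared_ips
from (select user_id, ip from dms.user_login_log group by user_id, ip) a
join (select user_id, ip from dms.user_login_log group by user_id, ip) b
  on a.ip = b.ip
where a.user_id > b.user_id
group by a.user_id, b.user_id
having count(1) >= 3;
```

Because both sides are already deduplicated, count(1) and the size of the collected set agree for every pair.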

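The result is as follows. With the sample data the query returns three rows (row order is not guaranteed): (102, 101, 4), (103, 101, 3), and (103, 102, 3).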
