Common SQL Aggregate Functions

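Data analysis means running computations over large amounts of data to derive its overall characteristics, and that relies on the statistical properties of the data. This article covers the common, general-purpose SQL aggregate functions. They are used constantly, so you should know them well and understand what each one means in business terms.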

Overview

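Function	Description	Notes
count()	Number of rows	Null values are not counted
sum()	Sum	Ignores null values; where booleans are treated as numbers, True counts as 1 and False as 0
max()	Maximum	For a time field, the most recent (latest) time
min()	Minimum	For a time field, the earliest time
avg()	Average	Ignores null values; equals sum divided by the count of non-null values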

Each of these functions returns a single value, so it is a good idea to give the returned field an alias.
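The null handling described in the table can be checked with a small runnable sketch. It uses Python's built-in sqlite3 module purely so the queries run without a Hive installation; the `scores` table and its four rows are made up for illustration and are not the article's `students` data. SQLite evaluates a comparison to 1 or 0, which is what makes the `sum(math >= 70)` line work here; other engines may need an explicit cast or a conditional expression.

```python
import sqlite3

# Hypothetical demo table, not the article's data: one math score is null.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (name TEXT, math INTEGER)")
conn.executemany(
    "INSERT INTO scores VALUES (?, ?)",
    [("a", 80), ("b", 60), ("c", None), ("d", 70)],
)

row = conn.execute(
    """
    SELECT count(math)     AS cnt,       -- null score is not counted
           sum(math)       AS total,     -- null score is ignored
           avg(math)       AS avg_math,  -- total / cnt, not total / 4
           min(math)       AS min_math,
           max(math)       AS max_math,
           sum(math >= 70) AS passed     -- true adds 1, false adds 0
    FROM scores
    """
).fetchone()

cnt, total, avg_math, min_math, max_math, passed = row
print(row)  # (3, 210, 70.0, 60, 80, 2)
```

Note that the average is 210 / 3 = 70.0, not 210 / 4: the row with a null score contributes to neither the sum nor the count.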

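Other notes:

count(*) and count(1) count every row, including rows that contain null values, so they avoid the null-skipping behavior of count(column)
sum(1) gives the same result as count(*) on a non-empty table, because it adds 1 for every row (sum(*) is not valid in most SQL engines)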
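To see the difference between these counting forms side by side, here is a minimal sketch, again run through Python's sqlite3 module as a stand-in engine. The `demo` table is hypothetical and contains one null name.

```python
import sqlite3

# Hypothetical three-row table; the middle row has a null name.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo (name TEXT)")
conn.executemany("INSERT INTO demo VALUES (?)", [("a",), (None,), ("c",)])

counts = conn.execute(
    "SELECT count(name), count(*), count(1), sum(1) FROM demo"
).fetchone()
print(counts)  # (2, 3, 3, 3)

# On an empty table count(*) is 0, while sum(1) is null (None in Python).
conn.execute("DELETE FROM demo")
empty = conn.execute("SELECT count(*), sum(1) FROM demo").fetchone()
print(empty)  # (0, None)
```

The second query shows why count(*) is usually the safer choice: sum(1) only matches it when there is at least one row.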

Examples

The examples in this article use the same data as the lesson on selecting specific fields (the students table).

-- Maximum: 88
select max(math) as max_math from students
-- Minimum: 54
select min(math) as min_math from students
-- Number of students, i.e. the number of rows with a non-null name: 9
select count(name) as count_name from students
-- Can also be written this way: 9; the asterisk counts rows with null values too
select count(*) as count_data from students
-- Average math score: 73.44444444444444
select avg(math) as avg_math from students
-- Total of math scores: 661
select sum(math) as sum_math from students

Combining with DISTINCT

count() is often combined with DISTINCT to return the number of distinct values after deduplication:

-- Number of classes: 3
select count(distinct class) as class_qty from students
-- Number of genders: 2
select count(distinct gender) as gender_qty from students
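The effect of DISTINCT inside count() is easiest to see next to a plain count(). The sketch below uses Python's sqlite3 module and a made-up `demo_students` table (not the article's data) in which two students share a class and one has no class recorded.

```python
import sqlite3

# Hypothetical rows: c1 appears twice, and one class value is null.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo_students (name TEXT, class TEXT)")
conn.executemany(
    "INSERT INTO demo_students VALUES (?, ?)",
    [("a", "c1"), ("b", "c1"), ("c", "c2"), ("d", None)],
)

result = conn.execute(
    """
    SELECT count(class)          AS class_rows,  -- non-null values, duplicates kept
           count(DISTINCT class) AS class_qty    -- non-null values, duplicates removed
    FROM demo_students
    """
).fetchone()
print(result)  # (3, 2)
```

Both forms skip the null class; only the DISTINCT form collapses the repeated c1.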

Use with GROUP BY

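Once data is grouped with GROUP BY, every output field that is not a grouping column must be wrapped in an aggregate function that says how to compute it for each group. The functions above are the ones most often used for this.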
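As a rough illustration of aggregates applied per group, the following sketch groups a made-up `demo_students` table by class. It runs on SQLite through Python's sqlite3 module; the table, its rows, and the aliases are assumptions for the example, not the article's data.

```python
import sqlite3

# Hypothetical rows: two students in c1, one in c2.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo_students (name TEXT, class TEXT, math INTEGER)")
conn.executemany(
    "INSERT INTO demo_students VALUES (?, ?, ?)",
    [("a", "c1", 80), ("b", "c1", 60), ("c", "c2", 90)],
)

grouped = conn.execute(
    """
    SELECT class,                 -- grouping column, output as-is
           count(*)  AS n,        -- rows in each class
           avg(math) AS avg_math, -- average score per class
           max(math) AS max_math  -- best score per class
    FROM demo_students
    GROUP BY class
    ORDER BY class
    """
).fetchall()
print(grouped)  # [('c1', 2, 70.0, 80), ('c2', 1, 90.0, 90)]
```

Each aggregate is computed once per class, so the result has one row per group rather than one row overall.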

See also: aggregate statistical functions.
