SELECT
mat.tenant_id, mat.org_id ,
sum(case when serv.priorservresult in (0,2,4,5) and rel.match_value <= rule.match_value then 1 else NULL end) as numerator
from mat
left join serv
on serv.animal_id = mat.animal_id
and serv.servdate = mat.event_date
and serv.parity = mat.total_parity
left join rel on rel.sow_animal_id = mat.animal_id and rel.deleted = '0'
left join rule on rule.tenant_id = mat.tenant_id and rule.deleted = '0'
where mat.src_org_id in (0,1)
group by mat.tenant_id, mat.org_id
;
Error message:
SQL Error [1105] [HY000]: errCode = 2, detailMessage = select list expression not produced by aggregation output (missing from GROUP BY clause?): sum(CASE WHEN serv.priorservresult IN (0, 2, 4, 5) AND rel.match_value <= rule.match_value THEN 1 ELSE NULL END)
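Solution:
1. Remove the GROUP BY first and compute the detail-level rows, then wrap that query in an outer SELECT that performs the GROUP BY aggregation (i.e. use a subquery).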
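A minimal sketch of that rewrite, using the same tables and join conditions as the failing query. The alias `t` and the column name `hit` are made up for illustration, and this has not been verified against a specific Doris build:

```sql
SELECT
    t.tenant_id,
    t.org_id,
    SUM(t.hit) AS numerator              -- step 2: aggregate the detail rows
FROM (
    -- step 1: detail rows only, no GROUP BY and no aggregate function
    SELECT
        mat.tenant_id,
        mat.org_id,
        CASE
            WHEN serv.priorservresult IN (0, 2, 4, 5)
             AND rel.match_value <= rule.match_value THEN 1
            ELSE NULL
        END AS hit
    FROM mat
    LEFT JOIN serv
           ON serv.animal_id = mat.animal_id
          AND serv.servdate  = mat.event_date
          AND serv.parity    = mat.total_parity
    LEFT JOIN rel  ON rel.sow_animal_id = mat.animal_id AND rel.deleted  = '0'
    LEFT JOIN rule ON rule.tenant_id    = mat.tenant_id AND rule.deleted = '0'
    WHERE mat.src_org_id IN (0, 1)
) t
GROUP BY t.tenant_id, t.org_id;
```

Because the CASE expression is evaluated in the inner query, the outer query only aggregates a plain column.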
When a key column is NULL, the following error is raised:
Column 'birth_date' is NOT NULL, however, a null value is being written into it. You can set job configuration 'table.exec.sink.not-null-enforcer'='drop' to suppress this exception and drop such records silently.
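Two possible ways to handle this on the Flink SQL side are sketched below. The table names `doris_sink` and `src` and the columns other than `birth_date` are hypothetical. Only the configuration key comes from the error message, and the `SET` syntax shown is the Flink SQL client form, which may differ between Flink versions:

```sql
-- Option A (hypothetical table and column names):
-- filter out rows whose key column is NULL before they reach the sink.
INSERT INTO doris_sink
SELECT animal_id, birth_date, remark
FROM src
WHERE birth_date IS NOT NULL;

-- Option B: let Flink drop such rows silently, as the error message suggests.
-- Run this before submitting the INSERT job.
SET 'table.exec.sink.not-null-enforcer' = 'drop';
```

Option A makes the filtering visible in the job itself. Option B hides the dropped rows, so it is worth checking why the values are NULL first.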
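Doris version: 1.1
1. Key columns cannot be of type TEXT or STRING; otherwise table creation fails.
2. Some of the key columns are of type DATE. When Flink sinks into Doris table B, most values in the DATE columns become NULL.
Solution:
1. Create the table first (create table ();), then run insert into xxx select xxx.
2. See the official documentation (Doris 1.1): https://doris.apache.org/zh-CN/docs/sql-manual/sql-reference/Data-Definition-Statements/Create/CREATE-TABLE-AS-SELECT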
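The "create the table first, then insert" approach might look like the sketch below. All table and column names (`target_tbl`, `source_tbl`, `animal_id`, `birth_date`, `remark`) are hypothetical, and the PROPERTIES value is copied from the style used elsewhere in these notes. Adjust it to the cluster:

```sql
-- Step 1: declare the key column types explicitly (no TEXT/STRING in the key).
CREATE TABLE target_tbl (
    `animal_id`  BIGINT       NOT NULL COMMENT "",
    `birth_date` DATE         NOT NULL COMMENT "",
    `remark`     VARCHAR(255) NULL     COMMENT ""
) ENGINE=OLAP
UNIQUE KEY(`animal_id`, `birth_date`)
DISTRIBUTED BY HASH(`animal_id`) BUCKETS 1
PROPERTIES (
    "replication_allocation" = "tag.location.default: 3"
);

-- Step 2: load the data, casting to the declared types.
INSERT INTO target_tbl
SELECT
    animal_id,
    CAST(birth_date AS DATE),
    CAST(remark AS VARCHAR(255))
FROM source_tbl;
```

Declaring the schema by hand avoids relying on the types that CREATE TABLE AS SELECT would infer.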
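Note: on Doris 1.1.0, the approach given in the official documentation was tried several times without success. A DBA was asked to help verify it and said there is a bug.
Solution:
1. Create an auxiliary table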
CREATE TABLE `nums` (
`id` int(11) NULL COMMENT ""
) ENGINE=OLAP
UNIQUE KEY(`id`)
COMMENT "OLAP"
DISTRIBUTED BY HASH(`id`) BUCKETS 1
PROPERTIES (
"replication_allocation" = "tag.location.default: 3",
"in_memory" = "false",
"storage_format" = "V2"
);
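Demonstration:
Sample data: [{"sex": "1", "Code": "03", "reserved": "0", "offCode": "aa"}, {"sex": "1", "reserved": "0", "Code": "04", "offCode": "bb"}]
Code sample:
Notes:
1. Populate the nums table with the integers from 1 up to the maximum array length found in the table being queried.
2. Join the source table against nums so that each array element becomes its own row: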
SELECT os.piglets,
CONCAT('$.[',nums.id-1,'].sex') as aa, -- builds the JSON path for the current element, e.g. $.[0].sex
LENGTH(piglets) as piglets_len, -- original string length
LENGTH(REPLACE(piglets,'},{','')) as piglets_length, -- length after removing every "},{"; each separator removed shortens the string by 3
(LENGTH(piglets)-LENGTH(REPLACE(piglets,'},{','')))/3 as piglets_sub_len, -- number of "},{" separators; adding 1 gives the number of array elements
get_json_string(piglets,CONCAT('$.[',nums.id-1,'].sex')) AS sex,
get_json_string(piglets,CONCAT('$.[',nums.id-1,'].offCode')) AS offspringCode,
get_json_string(piglets,CONCAT('$.[',nums.id-1,'].earno')) AS earno
FROM os
-- one output row per array element; assumes elements are separated by exactly "},{" with no whitespace
JOIN ods.nums ON nums.id<=(LENGTH(piglets)-LENGTH(REPLACE(piglets,'},{','')))/3+1
WHERE nums.id<=30
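As a sketch of note 1, the nums table could be filled with a plain multi-row INSERT. The upper bound of 30 matches the `nums.id<=30` filter in the query above, and it assumes the current database is the one that holds nums. The second statement checks the element-count arithmetic on a literal string, a compact two-element array invented for illustration with no whitespace between elements:

```sql
-- Fill nums with 1..30 (30 is the cap used by the query above).
INSERT INTO nums (id) VALUES
    (1), (2), (3), (4), (5), (6), (7), (8), (9), (10),
    (11), (12), (13), (14), (15), (16), (17), (18), (19), (20),
    (21), (22), (23), (24), (25), (26), (27), (28), (29), (30);

-- Sanity check of the counting expression:
-- one "},{" separator is removed (3 characters), so 3 / 3 + 1 = 2 elements.
SELECT (LENGTH(s) - LENGTH(REPLACE(s, '},{', ''))) / 3 + 1 AS element_count
FROM (
    SELECT '[{"sex":"1","Code":"03"},{"sex":"1","Code":"04"}]' AS s
) t;
```

If the stored JSON contains a space after the comma (`}, {`), the `'},{'` pattern no longer matches and the count comes out too low, so the separator should be checked against the real data.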
Error details:
[ERROR] 2022-10-21 09:28:27.993 [TaskLogInfo- - [taskAppId=TASK-7274374882592_782-66761-574830]] - execute sql error: ERROR: extra data after last expected column
Where: COPY xxx, line 1: "643743610690&6270887802992&62680403352&1900-08e7957c004&641645783816&YF…"
SQL statement "copy XXX.xxx from '/data/doris_outfile/dwd_anc_event_remove_f679fa9a70ee4237-a6c0a88ac0ae1d08_0.csv' delimiter '&' null '\N' csv"
PL/pgSQL function XXX.xxx(text,text,text) line 4 at EXECUTE
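Cause:
1. The Doris table and the PG table do not have the same columns.
2. Some column values contain the delimiter, so a row exported from Doris is parsed as having more fields than the PG table has columns.
Solution:
1. Add the missing columns.
2. Handle the delimiter in the ETL.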
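One way to handle the delimiter on the Doris side is to strip it from free-text columns in the export query. This is a sketch only: the table name is taken from the file name in the log above, the column names `animal_id` and `remark` are hypothetical, and replacing `&` with a space is an assumption about what the downstream data can tolerate:

```sql
-- Hypothetical column names; '&' is the COPY delimiter shown in the log.
SELECT
    animal_id,
    REPLACE(remark, '&', ' ') AS remark   -- remove the delimiter from text values
FROM dwd_anc_event_remove;
```

Choosing a delimiter that cannot occur in the data would be an alternative to cleaning the values.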
Comparing a text-typed column against date values can intermittently give wrong results.
Example: dt_date >= '2023-01-01' AND dt_date <= '2023-05-31'
Solution: change the column type to date.
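Until the column type can be changed, an explicit cast in the predicate is one possible stopgap. The table name `some_tbl` is hypothetical. The likely cause assumed here is that text values are compared as strings, so a value such as '2023-5-31' (not zero-padded) sorts after '2023-05-31' and falls outside the range:

```sql
-- Hypothetical table name; dt_date is the text column from the example above.
SELECT *
FROM some_tbl
WHERE CAST(dt_date AS DATE) >= '2023-01-01'
  AND CAST(dt_date AS DATE) <= '2023-05-31';
```

Values that cannot be parsed as dates become NULL under the cast and are filtered out, which is worth checking before relying on the result.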