I recently had a requirement to package data files and upload them to a server, parsing the files during upload, cleaning the data, and loading it into a database.
The project uses Spring Boot + Thymeleaf + MyBatis, with pagination handled by the MyBatis plugin PageHelper and a simple Page wrapper class.
The complete project code is on GitHub; if you have a similar requirement, feel free to take a look. This article only covers the key steps.
The Kettle-related Maven dependencies in pom.xml:
<dependency>
    <groupId>pentaho-kettle</groupId>
    <artifactId>kettle-core</artifactId>
    <version>${kettle-version}</version>
</dependency>
<dependency>
    <groupId>pentaho-kettle</groupId>
    <artifactId>kettle-engine</artifactId>
    <version>${kettle-version}</version>
</dependency>
<dependency>
    <groupId>pentaho-kettle</groupId>
    <artifactId>kettle-dbdialog</artifactId>
    <version>${kettle-version}</version>
</dependency>
<dependency>
    <groupId>org.codehaus.janino</groupId>
    <artifactId>janino</artifactId>
    <version>${janino-version}</version>
</dependency>
Note the last dependency (janino): it is not needed for simple jobs and transformations, but complex jobs will fail with an error without it.
# Kettle-related configuration
kettle:
  filerepository:
    path: D:/ch/Kettle-repo/test
    id: kettleRepo
    name: kettleRepo
    description: Enshi kettle file repository
  templates: # data template file path
    path: D:/ch/Kettle-repo/templates
  log:
    level: basic # one of nothing, error, minimal, basic, detailed, debug, rowlevel
    path: D:/hx/log/kettle_log
Note: kettle.filerepository.path is the path of the local Kettle file repository. The Java code below uses this path to open the repository and load the jobs and transformations stored in it (a database repository works too, if you prefer).
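The constants referenced in the code below (KETTLE_REPO_ID, KETTLE_REPO_PATH, and so on) are not shown in the post; presumably they are bound from the YAML above, for example via @Value. A minimal sketch, assuming field injection:
// Sketch only: binds the application.yml values above to the fields the code below uses.
@Value("${kettle.filerepository.path}")
private String KETTLE_REPO_PATH;
@Value("${kettle.filerepository.id}")
private String KETTLE_REPO_ID;
@Value("${kettle.filerepository.name}")
private String KETTLE_REPO_NAME;
@Value("${kettle.filerepository.description}")
private String KETTLE_REPO_DESC;
@Value("${kettle.log.level}")
private String KETTLE_LOG_LEVEL;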
/**
 * Set up the Kettle file repository environment
 **/
public KettleFileRepository fileRepositoryCon() throws KettleException {
    String msg;
    // Initialization (done once at startup, see the StartInit bean below)
    /*EnvUtil.environmentInit();
    KettleEnvironment.init();*/
    // Repository meta object
    KettleFileRepositoryMeta fileRepositoryMeta = new KettleFileRepositoryMeta(this.KETTLE_REPO_ID, this.KETTLE_REPO_NAME, this.KETTLE_REPO_DESC, this.KETTLE_REPO_PATH);
    // File-based repository
    KettleFileRepository repo = new KettleFileRepository();
    repo.init(fileRepositoryMeta);
    // Connect to the repository with the default username and password
    repo.connect("", "");
    if (repo.isConnected()) {
        msg = "Kettle file repository [" + KETTLE_REPO_PATH + "] connected successfully";
        logger.info(msg);
        return repo;
    } else {
        msg = "Failed to connect to Kettle file repository [" + KETTLE_REPO_PATH + "]";
        logger.error(msg);
        throw new KettleDcException(msg);
    }
}
public void callTrans(String transPath, String transName, Map<String, String> namedParams, String[] clParams) throws Exception {
    String msg;
    KettleFileRepository repo = this.fileRepositoryCon();
    TransMeta transMeta = this.loadTrans(repo, transPath, transName);
    // The transformation
    Trans trans = new Trans(transMeta);
    // Set named parameters
    if (null != namedParams) {
        for (Iterator<Map.Entry<String, String>> it = namedParams.entrySet().iterator(); it.hasNext();) {
            Map.Entry<String, String> entry = it.next();
            trans.setParameterValue(entry.getKey(), entry.getValue());
        }
    }
    trans.setLogLevel(this.getLogerLevel(KETTLE_LOG_LEVEL));
    // Execute and wait for completion
    trans.execute(clParams);
    trans.waitUntilFinished();
    // Collect the log of this run
    String logChannelId = trans.getLogChannelId();
    LoggingBuffer appender = KettleLogStore.getAppender();
    String logText = appender.getBuffer(logChannelId, true).toString();
    logger.info(logText);
    // Propagate errors as an exception
    if (trans.getErrors() > 0) {
        msg = "There are errors during transformation execution!";
        logger.error(msg);
        throw new KettleDcException(msg);
    }
}
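For comparison with the job invocation shown later, a hypothetical call of callTrans might look like this ("/dc", "demo_trans" and the named parameter "filePath" are illustrative, not from the project):
Map<String, String> namedParams = new HashMap<>();
// Hypothetical named parameter defined in the transformation itself
namedParams.put("filePath", "D:/ch/data/upload");
kettleManager.callTrans("/dc", "demo_trans", namedParams, null);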
public boolean callJob(String jobPath, String jobName, Map<String, String> variables, String[] clParams) throws Exception {
    String msg;
    KettleFileRepository repo = this.fileRepositoryCon();
    JobMeta jobMeta = this.loadJob(repo, jobPath, jobName);
    Job job = new Job(repo, jobMeta);
    // Pass variables to the job; the job script reads them as ${variableName}
    if (null != variables) {
        for (Iterator<Map.Entry<String, String>> it = variables.entrySet().iterator(); it.hasNext();) {
            Map.Entry<String, String> entry = it.next();
            job.setVariable(entry.getKey(), entry.getValue());
        }
    }
    // Set the log level
    job.setLogLevel(this.getLogerLevel(KETTLE_LOG_LEVEL));
    job.setArguments(clParams);
    job.start();
    job.waitUntilFinished();
    // Collect the log of this run
    String logChannelId = job.getLogChannelId();
    LoggingBuffer appender = KettleLogStore.getAppender();
    String logText = appender.getBuffer(logChannelId, true).toString();
    logger.info(logText);
    if (job.getErrors() > 0) {
        msg = "There are errors during job execution!";
        logger.error(msg);
        throw new KettleDcException(msg);
    }
    return true;
}
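The getLogerLevel helper used by callTrans and callJob is not shown in the post. A minimal sketch of what it might look like, assuming it simply maps the level string from application.yml to Kettle's org.pentaho.di.core.logging.LogLevel enum:
// Sketch only: translate the configured string into a Kettle LogLevel.
private LogLevel getLogerLevel(String level) {
    if (null == level) {
        return LogLevel.BASIC; // assumed default
    }
    switch (level.trim().toLowerCase()) {
        case "nothing":  return LogLevel.NOTHING;
        case "error":    return LogLevel.ERROR;
        case "minimal":  return LogLevel.MINIMAL;
        case "detailed": return LogLevel.DETAILED;
        case "debug":    return LogLevel.DEBUG;
        case "rowlevel": return LogLevel.ROWLEVEL;
        default:         return LogLevel.BASIC;
    }
}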
/**
 * Load a transformation
 */
private TransMeta loadTrans(KettleFileRepository repo, String transPath, String transName) throws Exception {
    String msg;
    // Find the repository directory by its string path
    RepositoryDirectoryInterface dir = repo.findDirectory(transPath);
    if (null == dir) {
        msg = "Transformation path [" + repo.getRepositoryMeta().getBaseDirectory() + transPath + "] does not exist in the Kettle repository!";
        throw new KettleDcException(msg);
    }
    TransMeta transMeta = repo.loadTransformation(repo.getTransformationID(transName, dir), null);
    if (null == transMeta) {
        msg = "Transformation [" + transName + "] does not exist in Kettle repository directory [" + dir.getPath() + "]!";
        throw new KettleDcException(msg);
    }
    return transMeta;
}
/**
 * Load a job
 */
private JobMeta loadJob(KettleFileRepository repo, String jobPath, String jobName) throws Exception {
    String msg;
    // Find the repository directory by its string path
    RepositoryDirectoryInterface dir = repo.findDirectory(jobPath);
    if (null == dir) {
        msg = "Job path [" + repo.getRepositoryMeta().getBaseDirectory() + jobPath + "] does not exist in the Kettle repository!";
        throw new KettleDcException(msg);
    }
    JobMeta jobMeta = repo.loadJob(repo.getJobId(jobName, dir), null);
    if (null == jobMeta) {
        msg = "Job [" + jobName + "] does not exist in Kettle repository directory [" + dir.getPath() + "]!";
        throw new KettleDcException(msg);
    }
    return jobMeta;
}
Invocation:
Map<String, String> variables = new HashMap<>();
// Pass in the path the uploaded file was extracted to
variables.put("param", FileUtil.getBasePath(t_relative_path));
boolean re = kettleManager.callJob(t_job_path, t_job_name, variables, null);
Maven dependencies for database access:
<dependency>
    <groupId>com.alibaba</groupId>
    <artifactId>druid</artifactId>
    <version>${druid-version}</version>
</dependency>
<dependency>
    <groupId>mysql</groupId>
    <artifactId>mysql-connector-java</artifactId>
    <version>${mysql-connector-java-version}</version>
</dependency>
<dependency>
    <groupId>org.mybatis.spring.boot</groupId>
    <artifactId>mybatis-spring-boot-starter</artifactId>
    <version>${mybatis-spring-boot-starter-version}</version>
</dependency>
The database connection pool is Druid, configured in druid.properties.
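The post does not show druid.properties; a minimal sketch of what it might contain, with keys following the spring.datasource prefix bound by the @ConfigurationProperties annotation below, and all values as placeholders:
# Sketch only: placeholder values bound to the DruidDataSource bean below
spring.datasource.url=jdbc:mysql://localhost:3306/dataclean?useUnicode=true&characterEncoding=utf8
spring.datasource.username=root
spring.datasource.password=123456
spring.datasource.driverClassName=com.mysql.jdbc.Driver
spring.datasource.initialSize=5
spring.datasource.maxActive=20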
The MyBatis configuration for Spring Boot is completed with the @Configuration annotation, as follows:
@Configuration
@PropertySource(value = "classpath:druid.properties")
public class SpringConfig {
    @Bean(name = "dataSource")
    @ConfigurationProperties(prefix = "spring.datasource")
    public DataSource druidDataSource() {
        DruidDataSource druidDataSource = new DruidDataSource();
        return druidDataSource;
    }

    /*==================MyBatis configuration====================*/
    @Bean(name = "sqlSessionFactory")
    @Primary
    public SqlSessionFactory sqlSessionFactory(@Qualifier("dataSource") DataSource dataSource) throws Exception {
        // This line is required; without it the packaged jar cannot resolve MyBatis type aliases
        VFS.addImplClass(SpringBootVFS.class);
        SqlSessionFactoryBean bean = new SqlSessionFactoryBean();
        bean.setDataSource(dataSource);
        // Set the main MyBatis configuration file
        ResourcePatternResolver resolver = new PathMatchingResourcePatternResolver();
        Resource mybatisConfigXml = resolver.getResource("classpath:mybatis/mybatis-config.xml");
        bean.setConfigLocation(mybatisConfigXml);
        // Set the path MyBatis scans for mapper.xml files (essential; otherwise the mapper.xml files are not found)
        Resource[] mapperResources = resolver.getResources("classpath:mybatis/mapper/*.xml");
        bean.setMapperLocations(mapperResources);
        // Set the type alias package so parameterType and resultType in mapper.xml need not be fully qualified
        bean.setTypeAliasesPackage("com.ch.dataclean.model");
        return bean.getObject();
    }

    @Bean(name = "sqlSessionTemplate")
    @Primary
    public SqlSessionTemplate sqlSessionTemplate(@Qualifier("sqlSessionFactory") SqlSessionFactory sqlSessionFactory) throws Exception {
        return new SqlSessionTemplate(sqlSessionFactory);
    }

    // Initialize the Kettle environment
    @Bean(name = "KettleEnvironmentInit")
    public StartInit startInit() {
        return new StartInit();
    }
}
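The StartInit bean registered above is what actually initializes Kettle. The post does not show it, but given the commented-out lines in fileRepositoryCon(), a minimal sketch could be:
// Sketch only: initialize the Kettle engine once at application startup.
public class StartInit implements InitializingBean {
    @Override
    public void afterPropertiesSet() throws Exception {
        // Load environment variables / kettle.properties
        EnvUtil.environmentInit();
        // Register plugins and initialize the Kettle engine
        KettleEnvironment.init();
    }
}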
Important: when configuring the SqlSessionFactory bean, be sure to include [VFS.addImplClass(SpringBootVFS.class);]. Without it everything runs fine in IDEA, but once packaged as a jar MyBatis cannot resolve the type aliases, even when the @Alias("") annotation is used.
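For illustration, a hypothetical model class in the alias package (FileModel and the alias "fileModel" are made up for this example):
// Sketch only: with setTypeAliasesPackage plus SpringBootVFS in place,
// mapper.xml can use resultType="fileModel" instead of the fully qualified class name.
package com.ch.dataclean.model;

import org.apache.ibatis.type.Alias;

@Alias("fileModel")
public class FileModel {
    private Integer id;
    private String name;
    // getters/setters omitted
}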
public interface DAO {
    /**
     * Save an object
     */
    public Object save(String str, Object obj) throws Exception;
    /**
     * Update an object
     */
    public Object update(String str, Object obj) throws Exception;
    /**
     * Delete an object
     */
    public Object delete(String str, Object obj) throws Exception;
    /**
     * Find a single object
     */
    public Object findForObject(String str, Object obj) throws Exception;
    /**
     * Find a list of objects
     */
    public Object findForList(String str, Object obj) throws Exception;
    /**
     * Find objects and wrap them into a Map
     */
    public Object findForMap(String sql, Object obj, String key, String value) throws Exception;
}
@Repository
public class DaoSupport implements DAO {
    @Resource(name = "sqlSessionTemplate")
    private SqlSessionTemplate sqlSessionTemplate;

    /**
     * Save an object
     */
    public Object save(String str, Object obj) throws Exception {
        return sqlSessionTemplate.insert(str, obj);
    }

    /**
     * Batch save
     */
    public Object batchSave(String str, List objs) throws Exception {
        return sqlSessionTemplate.insert(str, objs);
    }

    /**
     * Update an object
     */
    public Object update(String str, Object obj) throws Exception {
        return sqlSessionTemplate.update(str, obj);
    }

    /**
     * Batch update
     */
    public void batchUpdate(String str, List objs) throws Exception {
        SqlSessionFactory sqlSessionFactory = sqlSessionTemplate.getSqlSessionFactory();
        // Batch executor with auto-commit disabled
        SqlSession sqlSession = sqlSessionFactory.openSession(ExecutorType.BATCH, false);
        try {
            if (objs != null) {
                for (int i = 0, size = objs.size(); i < size; i++) {
                    sqlSession.update(str, objs.get(i));
                }
            }
            sqlSession.flushStatements();
            sqlSession.commit();
            sqlSession.clearCache();
        } finally {
            sqlSession.close();
        }
    }

    // ... the remaining DAO methods (delete, findForObject, findForList, findForMap) follow the same pattern
}
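A hypothetical caller of the DAO layer (DataService, the FileModel type from the earlier example, and the statement id "DataMapper.batchUpdate" are illustrative, not from the project):
// Sketch only: one flush per batch instead of one round trip per row.
@Service
public class DataService {
    @Resource(name = "daoSupport")
    private DaoSupport dao;

    public void updateFiles(List<FileModel> files) throws Exception {
        dao.batchUpdate("DataMapper.batchUpdate", files);
    }
}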
<dependency>
    <groupId>com.github.pagehelper</groupId>
    <artifactId>pagehelper</artifactId>
    <version>5.1.8</version>
</dependency>
This article uses PageHelper 5.1.8; any version after 5.0 will do. Versions before 5.0 lack the following method, so they cannot set the sort parameter this way:
public static <E> Page<E> startPage(int pageNum, int pageSize, String orderBy)
/**
 * Description: PageHelper pagination wrapper class
 * Created by Aaron on 2018/11/21
 */
public class Page<T> {
    private int pageNum = 1;
    private int pageSize = 10;
    private int startRow;
    private int endRow;
    private long total;
    private int pages;
    // Sort clause
    private String orderBy;
    private List<T> rows;
    // setters/getters omitted here
    /**
     * Paged query
     */
    public Page<T> queryForPage(SqlSessionTemplate sqlSessionTemplate, String sqlMappingStr, Map<String, Object> param, Page<T> page) {
        if (null != this.orderBy && !this.orderBy.trim().isEmpty()) {
            PageHelper.startPage(page.getPageNum(), page.getPageSize(), this.orderBy);
        } else {
            PageHelper.startPage(page.getPageNum(), page.getPageSize());
        }
        List<T> list = sqlSessionTemplate.selectList(sqlMappingStr, param);
        PageInfo<T> pageInfo = new PageInfo<>(list);
        page.setPageNum(pageInfo.getPageNum());
        page.setPageSize(pageInfo.getPageSize());
        page.setRows(list);
        page.setTotal(pageInfo.getTotal());
        page.setPages(pageInfo.getPages());
        page.setStartRow(pageInfo.getStartRow());
        page.setEndRow(pageInfo.getEndRow());
        return page;
    }
}
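A hypothetical paged query through the wrapper (the FileModel type and the mapper statement id "DataMapper.selectByCondition" are illustrative):
// Sketch only: the instance's orderBy is passed through to PageHelper.startPage.
Page<FileModel> page = new Page<>();
page.setPageNum(1);
page.setPageSize(10);
page.setOrderBy("id desc");
Map<String, Object> param = new HashMap<>();
param.put("name", "test");
page = page.queryForPage(sqlSessionTemplate, "DataMapper.selectByCondition", param, page);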
Closing remarks: this project can also serve as a Spring Boot starter project; anyone learning Spring Boot is welcome to use it as a reference, since it captures some of the ideas I worked out while integrating Spring Boot with other frameworks. I used to write pagination by hand, and the drawback was always the extra SQL needed to count the total rows for each paged query. PageHelper, so far, feels genuinely convenient; how it performs I can't say yet, but that's a question for later: ship first, iterate after. That's all for now; staying up late is bad for you, and my head is no longer clear. The project source is at kettle-springboot; it's fairly simple, and I hope it's useful to you. Good night, everyone.