Connecting WEKA to MySQL, Oracle, and SQL Server

1. Preparation

Windows 7 Ultimate, 64-bit
JDK1.6.0_35
weka-3-6-10-x64.exe
SQLServer2005
mysql-6.0.0
Oraclewin64_11gR2_database

Microsoft SQL Server 2005 JDBC Driver 1.2--->sqljdbc.jar
MySQL Driver for JDBC--->mysql-connector-java-5.1.6-bin.jar
Oracle Driver for JDBC--->ojdbc6.jar


2. Double-click weka-3-5-7.exe to install WEKA


3. Go to the WEKA installation directory

3.1. Extract weka.jar

Directory structure after extraction:
[Weka-3-5]
|____...
|____[weka]
     |____[META-INF]
          |____...
     |____[weka]
          |____...
|____...

3.2. Create a lib directory and copy the database JDBC driver jars into /lib

Directory structure when finished:
[Weka-3-5]
|____...
|____[weka]
     |____[META-INF]
          |____...
     |____[weka]
          |____...
|____[lib]
     |____mysql-connector-java-5.1.6-bin.jar
     |____ojdbc6.jar
     |____sqljdbc.jar
|____...


4. Set the environment variables

WEKA_HOME
C:/Program Files/Weka-3-5

ClassPath
.;%WEKA_HOME%/lib/sqljdbc.jar;%WEKA_HOME%/lib/mysql-connector-java-5.1.6-bin.jar;%WEKA_HOME%/lib/ojdbc6.jar;%JAVA_HOME%/lib/tools.jar;%JAVA_HOME%/lib/dt.jar

Path

C:\Program Files\Java\jre6\bin;%JAVA_HOME%\bin

Once these are set, WEKA can find the database driver jars placed in /lib.
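
If you want to confirm that the driver jars really are visible on the classpath before going further, a small check like the one below can help. It is only a sketch of my own and not part of WEKA: the class name DriverCheck and the method isLoadable are made up for illustration, and the three driver class names are the ones used in the props files in this article. It is written so that it should compile on JDK 1.6.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of WEKA: reports which JDBC driver classes
// can be loaded from the current classpath.
class DriverCheck {

    // Returns true if the class can be found and loaded, false otherwise.
    static boolean isLoadable(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException e) {
            return false; // the jar is not on the classpath
        } catch (LinkageError e) {
            return false; // the class was found but could not be initialized
        }
    }

    public static void main(String[] args) {
        // Driver class names as they appear in the props files of this article.
        Map<String, String> drivers = new LinkedHashMap<String, String>();
        drivers.put("MySQL", "com.mysql.jdbc.Driver");
        drivers.put("SQL Server", "com.microsoft.sqlserver.jdbc.SQLServerDriver");
        drivers.put("Oracle", "oracle.jdbc.driver.OracleDriver");

        for (Map.Entry<String, String> e : drivers.entrySet()) {
            String status = isLoadable(e.getValue()) ? "found" : "NOT found";
            System.out.println(e.getKey() + ": " + status);
        }
    }
}
```

Compile it with javac DriverCheck.java and run java DriverCheck from a newly opened command prompt, so that the updated ClassPath variable is picked up. Any driver reported as "NOT found" points to a wrong jar name or path in ClassPath.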


5. Edit DatabaseUtils.props

Go to %WEKA_HOME%/weka/weka/experiment/ and you will see:
...
DatabaseUtils.props
DatabaseUtils.props.hsql
DatabaseUtils.props.mssqlserver2005
DatabaseUtils.props.mssqlserver
DatabaseUtils.props.mysql
DatabaseUtils.props.odbc
DatabaseUtils.props.oracle
DatabaseUtils.props.postgresql
...
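WEKA reads DatabaseUtils.props at run time.
The other files, named 'DatabaseUtils.props.<database name>', are templates that WEKA ships for specific databases.
First rename DatabaseUtils.props to something else, e.g. DatabaseUtils.props.sample.
Then rename DatabaseUtils.props.mysql to DatabaseUtils.props (assuming we want to connect to a MySQL database).
Open the new DatabaseUtils.props and you will see the following sections ('#' marks a comment; my own notes are in [square brackets]):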

[Version info]
# Database settings for MySQL 3.23.x, 4.x
[I am connecting to MySQL 6, so I changed this to --> # Database settings for MySQL 6.x]
#
# url:     http://www.mysql.com/
# jdbc:    http://www.mysql.com/products/connector/j/
# author:  Fracpete (fracpete at waikato dot ac dot nz)
# version: $Revision: 1.3 $
[JDBC version --> # version: $Revision: 5.1 $]

# JDBC driver (comma-separated list)
jdbcDriver=org.gjt.mm.mysql.Driver
[Change to --> jdbcDriver=com.mysql.jdbc.Driver]

# database URL
jdbcURL=jdbc:mysql://server_name:3306/database_name
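[I suggest leaving this unchanged, so that later, inside WEKA, you can connect to whichever MySQL database you need by editing 'server_name' and 'database_name'. Hard-coding it here, e.g. jdbcURL=jdbc:mysql://localhost:3306/foodmart, also works, since you can still change it inside WEKA, but it looks less professional.]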

# specific data types
# string, getString() = 0;    --> nominal
# boolean, getBoolean() = 1;  --> nominal
# double, getDouble() = 2;    --> numeric
# byte, getByte() = 3;        --> numeric
# short, getByte()= 4;        --> numeric
# int, getInteger() = 5;      --> numeric
# long, getLong() = 6;        --> numeric
# gloat, getFloat() = 7;      --> numeric
# date, getDate() = 8;        --> date
# text, getString() = 9;      --> string
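[This is the key part. WEKA supports only four attribute types: nominal, numeric, string, and date, so every data type used in the database has to be mapped onto one of these four.]
[Change the block above to: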
# specific data types
string, getString() = 0;    --> nominal
boolean, getBoolean() = 1;  --> nominal
double, getDouble() = 2;    --> numeric
byte, getByte() = 3;        --> numeric
short, getByte()= 4;        --> numeric
int, getInteger() = 5;      --> numeric
long, getLong() = 6;        --> numeric
gloat, getFloat() = 7;      --> numeric
date, getDate() = 8;        --> date
text, getString() = 9;      --> string
TINYINT=3
SMALLINT=4
#SHORT=4
SHORT=5
INTEGER=5
INT=5
BIGINT=6
LONG=6
REAL=7
NUMERIC=2
DECIMAL=2
FLOAT=2
DOUBLE=2
CHAR=0
TEXT=0
VARCHAR=0
LONGVARCHAR=9
BINARY=0
VARBINARY=0
LONGVARBINARY=9
BIT=1
BLOB=9
DATE=8
TIME=8
DATETIME=8
TIMESTAMP=8
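This draws on posts by other users plus some googling of my own. The commonly used MySQL data types are all covered here, so WEKA should no longer fail to recognize a column type.
Note that some of the '#' characters above have to be removed!
The appendix provides the DatabaseUtils.props files I prepared:
DatabaseUtils.props.mssqlserver2005_ok
DatabaseUtils.props.mysql6_ok
DatabaseUtils.props.oracle10g_ok
Name the files however you like; just remember to rename the one you use to DatabaseUtils.props.]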

# other options
CREATE_DOUBLE=DOUBLE
CREATE_STRING=TEXT
CREATE_INT=INT
checkUpperCaseNames=false
checkLowerCaseNames=false
checkForTable=true
[Other settings; no need to change them for now]
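
To make the role of the numeric codes in the "specific data types" block more concrete, here is a rough sketch of how such mappings can be read and interpreted. It is not WEKA's own code. The names TypeMappingDemo, parse, wekaType and wekaTypeFor are hypothetical, GEOMETRY is just a stand-in for a type that has no mapping, and the sketch assumes the file is read as a standard Java properties file. The code-to-type table comes from the legend in the props file itself.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

// Hypothetical illustration, not WEKA code: reads type mappings written in
// the DatabaseUtils.props style and translates each numeric code into the
// attribute type named in the file's own legend.
class TypeMappingDemo {

    static Properties parse(String text) throws IOException {
        Properties props = new Properties();
        props.load(new StringReader(text)); // lines starting with '#' are skipped
        return props;
    }

    // Legend from the props file: 0-1 nominal, 2-7 numeric, 8 date, 9 string.
    static String wekaType(int code) {
        if (code == 0 || code == 1) return "nominal";
        if (code >= 2 && code <= 7) return "numeric";
        if (code == 8) return "date";
        if (code == 9) return "string";
        return "unknown code " + code;
    }

    // Looks up a SQL type name and returns the attribute type it maps to.
    static String wekaTypeFor(Properties props, String sqlType) {
        String code = props.getProperty(sqlType);
        if (code == null) return "unmapped";
        return wekaType(Integer.parseInt(code.trim()));
    }

    public static void main(String[] args) throws IOException {
        // A few lines in the same form as the MySQL mappings above.
        String sample = "#SHORT=4\nSHORT=5\nVARCHAR=0\nDECIMAL=2\n"
                + "DATETIME=8\nBLOB=9\nBIT=1\n";
        Properties props = parse(sample);
        String[] sqlTypes = {"VARCHAR", "BIT", "DECIMAL", "SHORT",
                             "DATETIME", "BLOB", "GEOMETRY"};
        for (String t : sqlTypes) {
            System.out.println(t + " -> " + wekaTypeFor(props, t));
        }
    }
}
```

One side effect worth knowing: with the '#' removed, a legend line such as "string, getString() = 0; --> nominal" is no longer a comment. java.util.Properties splits it at the first whitespace, which yields a key "string," that does not collide with any of the type names, so it appears to be harmless here.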


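6. Rebuild weka.jar and replace the original jar

WEKA reads weka.jar at run time, so after making the changes you have to repackage the jar and replace the original one before WEKA can connect to the database.

6.1. From the command line, go to %WEKA_HOME%/weka
6.2. Run jar cvf weka.jar weka/*.* (when packaging, the java_cup folder never made it into the jar, which caused errors later; dragging java_cup into the jar afterwards fixed it)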

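6.3. In %WEKA_HOME%/weka you will find the newly packaged weka.jar (refresh the folder if it is not there)

6.4. Copy weka.jar from %WEKA_HOME%/weka to %WEKA_HOME% (I suggest renaming the original weka.jar to weka.jar.sample as a backup. If you later build several weka.jar files for different databases, rename each one to 'weka.jar.<database name>' and strip the suffix when you need it, so the grunt work only has to be done once.)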
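
Since a missing java_cup folder only shows up as an error later, it can be worth checking the rebuilt jar before launching WEKA. The sketch below is my own illustration, not a WEKA tool: JarContentsCheck and missingEntries are hypothetical names, and the entry names it looks for (weka/experiment/DatabaseUtils.props, and java_cup/ at the top level of the jar) are assumptions based on the directory layout described above.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

// Hypothetical helper, not part of WEKA: lists required entries that are
// absent from a jar file.
class JarContentsCheck {

    // Names ending in "/" are treated as folders (any entry below them
    // counts); all other names must match an entry exactly.
    static List<String> missingEntries(File jarPath, String[] required)
            throws IOException {
        List<String> missing = new ArrayList<String>();
        JarFile jar = new JarFile(jarPath);
        try {
            for (String name : required) {
                if (!contains(jar, name)) {
                    missing.add(name);
                }
            }
        } finally {
            jar.close();
        }
        return missing;
    }

    private static boolean contains(JarFile jar, String name) {
        Enumeration<JarEntry> entries = jar.entries();
        while (entries.hasMoreElements()) {
            String entryName = entries.nextElement().getName();
            boolean match = name.endsWith("/")
                    ? entryName.startsWith(name)
                    : entryName.equals(name);
            if (match) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) throws IOException {
        File jar = new File(args.length > 0 ? args[0] : "weka.jar");
        String[] required = {
            "weka/experiment/DatabaseUtils.props", // the edited configuration
            "java_cup/"                            // assumed top-level location
        };
        List<String> missing = missingEntries(jar, required);
        System.out.println(missing.isEmpty()
                ? "all required entries present"
                : "missing: " + missing);
    }
}
```

Run it as java JarContentsCheck weka.jar from the folder that holds the new jar. If java_cup/ is reported as missing, add the folder to the jar as described in step 6.2 before copying the jar into %WEKA_HOME%.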


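7. Run WEKA

An odd problem: launching Weka 3.5 (without console) cannot connect to any database (MySQL, Oracle, or SQL Server) and reports that no suitable JDBC driver was found, whereas launching Weka 3.5 (with console) works for all of them. I would welcome an explanation from anyone who knows why.

I ignored it; it works, and you get a console thrown in.

7.1. Launch Weka 3.5 (with console)

7.2. Choose Applications ---> Explorer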

7.3. Choose Open DB...

7.4. Choose User...
Change Database URL, Username, and Password to match your setup.

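7.5. Choose Connect
Watch the messages in the Info area at the bottom of the window!
... = true   ---> Congratulations, the connection succeeded!
... = false   ---> It failed. Don't lose heart: go back through the steps one by one; you are not far from true.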


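7.6. After a successful connection the cursor moves to the Query field automatically, ready for your SQL. I typed in a very simple statement and chose Execute to run it.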

7.7. After the statement runs successfully, the data appears in the Result area.

7.8. Choose OK. WEKA has now loaded the data and shows its summary information; from here you can do whatever you like with it.

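7.9. What happens if I write no SQL statement and choose OK right after connecting? WEKA complains that there is a problem connecting to the database and that no suitable driver was found, and displays nothing. So tell it which data you want, or there is nothing to work with.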
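
When the Explorer reports false and it is not clear whether the driver, the URL, or the login is at fault, trying the same connection with plain JDBC outside WEKA can narrow it down. The following is a minimal sketch under my own assumptions, not what WEKA does internally: JdbcSmokeTest and tryQuery are hypothetical names, and the user name, password, and SELECT 1 query in main are placeholders to replace with your own values.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

// Hypothetical helper, not part of WEKA: opens a JDBC connection, runs one
// query, and reports either the row count or the reason for the failure.
class JdbcSmokeTest {

    static String tryQuery(String driverClass, String url, String user,
                           String password, String sql) {
        try {
            // Load the driver explicitly so that it registers itself with
            // DriverManager; older drivers are not picked up automatically.
            Class.forName(driverClass);
        } catch (ClassNotFoundException e) {
            return "failed: driver class not found: " + driverClass;
        }
        Connection con = null;
        try {
            con = DriverManager.getConnection(url, user, password);
            Statement st = con.createStatement();
            ResultSet rs = st.executeQuery(sql);
            int rows = 0;
            while (rs.next()) {
                rows++;
            }
            return "ok: " + rows + " row(s)";
        } catch (SQLException e) {
            return "failed: " + e.getMessage();
        } finally {
            if (con != null) {
                // Closing the connection also releases its statement and result set.
                try { con.close(); } catch (SQLException ignored) { }
            }
        }
    }

    public static void main(String[] args) {
        // Placeholder values: adjust the URL, user, password, and query.
        System.out.println(tryQuery(
                "com.mysql.jdbc.Driver",
                "jdbc:mysql://localhost:3306/foodmart",
                "root", "your_password",
                "SELECT 1"));
    }
}
```

A "driver class not found" result points back to the ClassPath setting in step 4, while a message from the database itself (bad login, unknown database) means the driver is fine and the values entered in step 7.4 need another look.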


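8. Reference posts

C6H5NO2
Guide to connecting WEKA to a database (MySQL edition)
http://bbs2.wekacn.org/viewtopic.php?f=2&t=216&sid=13cdd0c42079a4b719d5d54b83855780

kongter
Connecting WEKA to a MySQL database (Windows XP edition), without extracting the jar or configuring DatabaseUtils.props
http://bbs2.wekacn.org/viewtopic.php?f=2&t=293&sid=13cdd0c42079a4b719d5d54b83855780

P.S. This is the method I tried first, and I had no luck getting it to work. It requires editing %WEKA_HOME%/RunWeka.bat, which sacrifices the integrity of the original program just to get a connection, and even once it is configured you have to edit the file again whenever you switch databases. I therefore do not recommend it.

数据挖掘青年(DMman)
How to connect WEKA to a database
http://blogger.org.cn/blog/more.asp?name=DMman&id=24991

P.S. This is the more orthodox method; without this post, this write-up would not exist.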


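9. Appendix (DatabaseUtils.props configurations)

These are fairly complete configurations that should be enough for most uses. The "specific data types" section follows each database's data type documentation; rarely used types are not listed. (Admittedly, for a few of the more exotic types I could not work out how WEKA should map them. If anyone has a more complete set of mappings, please share.)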

9.1. DatabaseUtils.props.mssqlserver2005_ok

# Database settings for Microsoft SQL Server 2005 Express Edition
#
# url:     http://www.microsoft.com/
# jdbc:    http://msdn2.microsoft.com/en-us/data/aa937724.aspx
# author:  Fracpete (fracpete at waikato dot ac dot nz)
# version: $Revision: 1.2 $

# JDBC driver (comma-separated list)
jdbcDriver=com.microsoft.sqlserver.jdbc.SQLServerDriver

# database URL
jdbcURL=jdbc:sqlserver://server_name;databaseName=database_name

# specific data types
string, getString() = 0;    --> nominal
boolean, getBoolean() = 1;  --> nominal
double, getDouble() = 2;    --> numeric
byte, getByte() = 3;        --> numeric
short, getByte()= 4;        --> numeric
int, getInteger() = 5;      --> numeric
long, getLong() = 6;        --> numeric
gloat, getFloat() = 7;      --> numeric
date, getDate() = 8;        --> date
text, getString() = 9;      --> string
bit=1
tinyint=3
smallint=4
int=5
bigint=6
smallmoney=2
money=2
numeric=2
decimal=2
float=2
real=2
smalldatetime=8
datetime=8
timestamp=8
char=0
text=0
varchar=0
nchar=0
ntext=0
nvarchar=0
binary=0
varbinary=0
image=0
uniqueidentifier=9
rowversion=9

# other options
CREATE_DOUBLE=DOUBLE PRECISION
CREATE_STRING=VARCHAR(8000)
CREATE_INT=INT
checkUpperCaseNames=false
checkLowerCaseNames=false
checkForTable=true

9.2. DatabaseUtils.props.mysql6_ok

# Database settings for MySQL 6.x
#
# url:     http://www.mysql.com/
# jdbc:    http://www.mysql.com/products/connector/j/
# author:  Fracpete (fracpete at waikato dot ac dot nz)
# version: $Revision: 5.1 $

# JDBC driver (comma-separated list)
jdbcDriver=com.mysql.jdbc.Driver

# database URL
jdbcURL=jdbc:mysql://server_name:3306/database_name

# specific data types
string, getString() = 0;    --> nominal
boolean, getBoolean() = 1;  --> nominal
double, getDouble() = 2;    --> numeric
byte, getByte() = 3;        --> numeric
short, getByte()= 4;        --> numeric
int, getInteger() = 5;      --> numeric
long, getLong() = 6;        --> numeric
gloat, getFloat() = 7;      --> numeric
date, getDate() = 8;        --> date
text, getString() = 9;      --> string
TINYINT=3
SMALLINT=4
#SHORT=4
SHORT=5
INTEGER=5
INT=5
BIGINT=6
LONG=6
REAL=7
NUMERIC=2
DECIMAL=2
FLOAT=2
DOUBLE=2
CHAR=0
TEXT=0
VARCHAR=0
LONGVARCHAR=9
BINARY=0
VARBINARY=0
LONGVARBINARY=9
BIT=1
BLOB=9
DATE=8
TIME=8
DATETIME=8
TIMESTAMP=8

# other options
CREATE_DOUBLE=DOUBLE
CREATE_STRING=TEXT
CREATE_INT=INT
checkUpperCaseNames=false
checkLowerCaseNames=false
checkForTable=true

9.3. DatabaseUtils.props.oracle10g_ok (tested and working on Oracle 11g)

# Database settings for Oracle 10g Express Edition
#
# General information on database access can be found here:
# http://weka.wikispaces.com/Databases
#
# url:     http://www.oracle.com/
# jdbc:    http://www.oracle.com/technology/software/tech/java/sqlj_jdbc/
# author:  Fracpete (fracpete at waikato dot ac dot nz)
# version: $Revision: 5836 $

# JDBC driver (comma-separated list)
jdbcDriver=oracle.jdbc.driver.OracleDriver

# database URL
jdbcURL=jdbc:oracle:thin:@server_name:1521:database_name

# specific data types
# string, getString() = 0;    --> nominal
# boolean, getBoolean() = 1;  --> nominal
# double, getDouble() = 2;    --> numeric
# byte, getByte() = 3;        --> numeric
# short, getByte()= 4;        --> numeric
# int, getInteger() = 5;      --> numeric
# long, getLong() = 6;        --> numeric
# float, getFloat() = 7;      --> numeric
# date, getDate() = 8;        --> date
# text, getString() = 9;      --> string
# time, getTime() = 10;       --> date

VARCHAR2=0
NUMBER=2
DOUBLE_PRECISION=2
TIMESTAMP=8
CHAR=0
NCHAR=0
NVARCHAR2=0
RAW=9
BINARY_FLOAT=2
DATE=8
ROWID=9

# other options
CREATE_INT=INTEGER
CREATE_STRING=VARCHAR2(4000)
CREATE_DOUBLE=NUMBER
CREATE_DATE=TIMESTAMP
DateFormat=yyyy-MM-dd HH:mm:ss
checkUpperCaseNames=true
checkForTable=true

# All the reserved keywords for this database
# Based on the keywords listed at the following URL (2009-04-13):
# http://download.oracle.com/docs/cd/B19306_01/server.102/b14200/ap_keywd.htm
Keywords=\
  ACCESS,\
  ADD,\
  ALL,\
  ALTER,\
  AND,\
  ANY,\
  AS,\
  ASC,\
  AUDIT,\
  BETWEEN,\
  BY,\
  CHAR,\
  CHECK,\
  CLUSTER,\
  COLUMN,\
  COMMENT,\
  COMPRESS,\
  CONNECT,\
  CREATE,\
  CURRENT,\
  DATE,\
  DECIMAL,\
  DEFAULT,\
  DELETE,\
  DESC,\
  DISTINCT,\
  DROP,\
  ELSE,\
  EXCLUSIVE,\
  EXISTS,\
  FILE,\
  FLOAT,\
  FOR,\
  FROM,\
  GRANT,\
  GROUP,\
  HAVING,\
  IDENTIFIED,\
  IMMEDIATE,\
  IN,\
  INCREMENT,\
  INDEX,\
  INITIAL,\
  INSERT,\
  INTEGER,\
  INTERSECT,\
  INTO,\
  IS,\
  LEVEL,\
  LIKE,\
  LOCK,\
  LONG,\
  MAXEXTENTS,\
  MINUS,\
  MLSLABEL,\
  MODE,\
  MODIFY,\
  NOAUDIT,\
  NOCOMPRESS,\
  NOT,\
  NOWAIT,\
  NULL,\
  NUMBER,\
  OF,\
  OFFLINE,\
  ON,\
  ONLINE,\
  OPTION,\
  OR,\
  ORDER,\
  PCTFREE,\
  PRIOR,\
  PRIVILEGES,\
  PUBLIC,\
  RAW,\
  RENAME,\
  RESOURCE,\
  REVOKE,\
  ROW,\
  ROWID,\
  ROWNUM,\
  ROWS,\
  SELECT,\
  SESSION,\
  SET,\
  SHARE,\
  SIZE,\
  SMALLINT,\
  START,\
  SUCCESSFUL,\
  SYNONYM,\
  SYSDATE,\
  TABLE,\
  THEN,\
  TO,\
  TRIGGER,\
  UID,\
  UNION,\
  UNIQUE,\
  UPDATE,\
  USER,\
  VALIDATE,\
  VALUES,\
  VARCHAR,\
  VARCHAR2,\
  VIEW,\
  WHENEVER,\
  WHERE,\
  WITH

# The character to append to attribute names to avoid exceptions due to
# clashes between keywords and attribute names
KeywordsMaskChar=_

#flags for loading and saving instances using DatabaseLoader/Saver
nominalToStringLimit=50
idColumn=auto_generated_id

 

Source: http://blog.csdn.net/senaku/article/details/2225943
