1. First, run the stored procedure dbo.arachnode_usp_arachnode.net_RESET_DATABASE, or execute the following from Program.cs in the Arachnode.Console project:
ArachnodeDAO arachnodeDAO = new ArachnodeDAO();
arachnodeDAO.ExecuteSql("EXEC [dbo].[arachnode_usp_arachnode.net_RESET_DATABASE]");
_crawler.Crawl(new CrawlRequest(new Discovery("http://taobao.com"), int.MaxValue, UriClassificationType.Domain | UriClassificationType.FileExtension, UriClassificationType.Domain | UriClassificationType.FileExtension, 1));
2. In the SQL Server 2008 database, run the following statements against the cfg.Configuration table:
use [arachnode.net]
update cfg.Configuration
set Value = 'D:\LuceneDotNetIndex\Index'
where [KEY] = 'LuceneDotNetIndexDirectory'
update cfg.Configuration
set Value = 'D:\LuceneDotNetIndex\DownloadedFiles'
where [KEY] = 'DownloadedFilesDirectory'
update cfg.Configuration
set Value = 'D:\LuceneDotNetIndex\DownloadedImages'
where [KEY] = 'DownloadedImagesDirectory'
update cfg.Configuration
set Value = 'D:\LuceneDotNetIndex\DownloadedWebPages'
where [KEY] = 'DownloadedWebPagesDirectory'
update cfg.Configuration
set Value = 'D:\LuceneDotNetIndex\ConsoleOutputLogs'
where [KEY] = 'ConsoleOutputLogsDirectory'
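To confirm that the five updates in step 2 took effect, a read-only query along these lines can be run afterwards. It is a minimal sketch that assumes only the table and column names already used above ([KEY] and Value in cfg.Configuration).

```sql
use [arachnode.net]
-- List the directory settings changed in step 2 so the new paths can be checked by eye
select [KEY], Value
from cfg.Configuration
where [KEY] in ('LuceneDotNetIndexDirectory',
                'DownloadedFilesDirectory',
                'DownloadedImagesDirectory',
                'DownloadedWebPagesDirectory',
                'ConsoleOutputLogsDirectory')
```

Each returned row should show a path under D:\LuceneDotNetIndex. A key missing from the result means the corresponding update matched no row.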
3. In the cfg.CrawlActions table of the database, set the field to the following value:
AutoCommit=true|LuceneDotNetIndexDirectory=D:\LuceneDotNetIndex\Index|CheckIndexes=false|RebuildIndexOnLoad=false|WebPageIDLowerBound=1|WebPageIDUpperBound=100000
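Step 3 does not name the column that holds this settings string, so it may help to list the table's rows first and locate the one whose value contains LuceneDotNetIndexDirectory. This is a minimal sketch that assumes nothing beyond the table name given above.

```sql
use [arachnode.net]
-- Inspect every crawl action row to find the column that stores the pipe-delimited settings
select *
from cfg.CrawlActions
```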
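4. Configure the database connection:
In Arachnode.Configuration, set
connectionString="Data Source=HENRYWEN-TUCU\SQLEXPRESS;Initial Catalog=arachnode.net;Integrated Security=True;Connection Timeout=3600;"
or right-click the Function project -- Properties -- Database -- Connection String and set it there.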
5. In the development tool (VS2008), turn off 'Just My Code' (a Visual Studio option):
Tools -- Options -- Debugging -- uncheck "Enable Just My Code"
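6. Enable CLR integration. Run the SQL Server Surface Area Configuration tool, choose Surface Area Configuration for Features, select CLR Integration, check "Enable CLR integration", and save the configuration.
On SQL Server 2008, enable CLR with the following statements: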
-- Allow advanced options to be changed
exec sp_configure 'show advanced options', '1';
go
reconfigure;
go
-- Turn on CLR integration
exec sp_configure 'clr enabled', '1';
go
reconfigure;
go
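Whether CLR integration is actually on can be checked with the sys.configurations catalog view. The following is a sketch of such a check, not part of the original procedure.

```sql
-- value is the configured setting; value_in_use is what the server is currently running with
select name, value, value_in_use
from sys.configurations
where name = 'clr enabled'
```

If value is 1 but value_in_use is still 0, the reconfigure statement has not been applied yet.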
7. Open a new query and execute the stored procedure "[dbo].[arachnode_usp_arachnode.net_RESET_DATABASE]".
8. Open a new query and execute "ALTER DATABASE [arachnode.net] SET TRUSTWORTHY ON" to give the database the appropriate permissions.
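The TRUSTWORTHY setting can be verified afterwards through the sys.databases catalog view. This is a minimal sketch, assuming the database is named arachnode.net as in the steps above.

```sql
-- is_trustworthy_on = 1 means the TRUSTWORTHY property is set for the database
select name, is_trustworthy_on
from sys.databases
where name = 'arachnode.net'
```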