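Ruby's Kernel module already implements a set of I/O-related methods: gets, open, print, printf, putc, puts, readline, readlines, and test.
11.1 IO Objects
Ruby provides a base class, IO, whose subclasses include File and BasicSocket. An IO object is a bidirectional channel between a Ruby program and some external resource.
11.2 Opening and Closing Files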
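The class relationships described here can be checked directly from Ruby. This is a minimal sketch that assumes only the standard library (socket is required so that BasicSocket is defined).

```ruby
require 'socket'

# File and BasicSocket both descend from IO
puts File.ancestors.include?(IO)          # prints true
puts BasicSocket.ancestors.include?(IO)   # prints true

# The standard output stream is an IO object too
puts STDOUT.is_a?(IO)                     # prints true
```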
    # The second argument is the mode: "r" (read), "w" (write), "r+" (read-write)
    file = File.new("testfile", "r")
    # ... process the file
    file.close
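File.new returns a File object. File.open behaves similarly, except that when it is given a block it calls the block with the opened file as its argument and closes the file automatically once the block finishes.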
    File.open("testfile", "r") do |file|
      # ... process the file
    end # <- file automatically closed here
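The block form is the preferred way to work with files: if an exception is raised during processing, File.open closes the file before the exception propagates.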
    # File.open behaves roughly like the following
    class File
      def File.open(*args)
        result = f = File.new(*args)
        if block_given?
          begin
            result = yield f
          ensure
            f.close
          end
        end
        result
      end
    end
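The claim that the file is closed even when the block raises can be checked with a small experiment. This is a sketch only: open_and_fail is a hypothetical helper introduced for illustration, and a Tempfile stands in for "testfile".

```ruby
require 'tempfile'

# Hypothetical helper: opens +path+ with a block that raises, then returns
# the File object so its state can be inspected afterwards.
def open_and_fail(path)
  leaked = nil
  begin
    File.open(path, "r") do |file|
      leaked = file              # keep a reference to inspect later
      raise "processing failed"
    end
  rescue RuntimeError
    # the exception still propagates out of File.open; we swallow it here
  end
  leaked
end

tmp = Tempfile.new("testfile")   # throwaway file standing in for "testfile"
file = open_and_fail(tmp.path)
puts file.closed?                # prints true
tmp.close!                       # close and delete the temporary file
```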
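11.3 Reading and Writing Files
gets reads a line from standard input. When a script is run with file names on the command line, gets reads from those files instead.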
    # copy.rb
    while line = gets
      puts line
    end

    $ ruby copy.rb
    These are lines
    These are lines
    that I am typing
    that I am typing
    ^D

    # Specify a file when running the script
    $ ruby copy.rb testfile
    This is line one
    This is line two

    # Open the file explicitly and print it line by line
    File.open("testfile") do |file|
      while line = file.gets
        puts line
      end
    end
Reading with iterators
    # IO#each_byte yields the file's contents one 8-bit byte at a time
    # Integer#chr converts the byte value to the corresponding character
    File.open("testfile") do |file|
      file.each_byte.with_index do |ch, index|
        print "#{ch.chr}:#{ch}"
        break if index > 10
      end
    end
produces:
    T:84h:104i:105s:115 :32i:105s:115 :32l:108i:105n:110e:101
IO#each_line reads the file line by line.
    # String#dump makes the newline characters visible
    File.open("testfile") do |file|
      file.each_line { |line| puts "Got #{line.dump}" }
    end
produces:
    Got "This is line one\n"
    Got "This is line two\n"
    Got "This is line three\n"
    Got "And so on...\n"
    # IO#each_line accepts a custom line separator; this example uses "e"
    File.open("testfile") do |file|
      file.each_line("e") { |line| puts "Got #{line.dump}" }
    end
produces:
    Got "This is line"
    Got " one"
    Got "\nThis is line"
    Got " two\nThis is line"
    Got " thre"
    Got "e"
    Got "\nAnd so on...\n"
    # Using IO.foreach
    IO.foreach("testfile") { |line| puts line }
You can also read the whole file into a single String, or into an Array of Strings with one element per line.
    # read into a string
    str = IO.read("testfile")
    str.length   #=> 66
    str[0, 30]   #=> "This is line one\nThis is line "
    # read into an array
    arr = IO.readlines("testfile")
    arr.length   #=> 4
    arr[0]       #=> "This is line one\n"
Note: I/O operations frequently raise exceptions, so remember to wrap calls to these APIs in begin ... rescue ... end.
Writing files
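A minimal sketch of that advice follows. read_or_nil is a hypothetical helper, and the file name is assumed not to exist. The Errno::* errors raised by file operations (Errno::ENOENT, Errno::EACCES, and so on) are subclasses of SystemCallError, so rescuing that class together with IOError covers the common cases.

```ruby
# Hypothetical helper: returns the file's contents, or nil if it cannot be read.
def read_or_nil(path)
  begin
    File.read(path)
  rescue SystemCallError, IOError => e
    # Report the problem on stderr instead of crashing
    warn "Could not read #{path}: #{e.message}"
    nil
  end
end

# Assumes no file with this name exists in the current directory
p read_or_nil("no_such_file.txt")   # prints nil (after a warning on stderr)
```

Whether to return nil, retry, or re-raise depends on the program; the point is only that the failure is handled deliberately.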
    # Note the "w", which opens the file for writing
    File.open("output.txt", "w") do |file|
      file.puts "Hello"
      file.puts "1+2=#{1+2}"
    end
nil is written to a file as an empty string.
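The behaviour with nil can be observed without touching the disk. The sketch below writes to an in-memory stream from the standard library (StringIO) rather than a real file. captured is a hypothetical helper, and the results assume the default output separators.

```ruby
require 'stringio'

# Hypothetical helper: returns exactly what the block wrote to the stream.
def captured
  out = StringIO.new
  yield out
  out.string
end

p captured { |io| io.print nil }             # prints ""
p captured { |io| io.puts nil }              # prints "\n"
p captured { |io| io.print "a", nil, "b" }   # prints "ab"
```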
Doing I/O with Strings
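The StringIO class is similar to Java's StringReader and StringWriter: it reads from and writes to a String while offering the same reading and writing methods as IO.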
    require 'stringio'
    ip = StringIO.new("now is\nthe time\nto learn\nRuby!")
    op = StringIO.new("", "w")
    ip.each_line do |line|
      op.puts line.reverse
    end
    op.string #=> "\nsi won\n\nemit eht\n\nnrael ot\n!ybuR\n"
11.4 Network Communication
    require 'socket'
    client = TCPSocket.open('127.0.0.1', 'www')
    client.send("OPTIONS /~dave/ HTTP/1.0\n\n", 0) # 0 means standard packet
    puts client.readlines
    client.close
produces:
    HTTP/1.1 200 OK
    Date: Mon, 27 May 2013 17:31:00 GMT
    Server: Apache/2.2.22 (Unix) DAV/2 PHP/5.3.15 with Suhosin-Patch mod_ssl/2.2.22 OpenSSL/0.9.8r
    Allow: GET,HEAD,POST,OPTIONS
    Content-Length: 0
    Connection: close
    Content-Type: text/html
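The client above needs a web server listening locally. To try the same socket calls without one, the sketch below starts a throwaway server in the same process and talks to it over the loopback interface. loopback_exchange and the "You said:" reply format are hypothetical, introduced only for illustration.

```ruby
require 'socket'

# Hypothetical helper: starts a one-shot server on a free local port,
# sends +request+ to it from a client, and returns the server's reply lines.
def loopback_exchange(request)
  server = TCPServer.new('127.0.0.1', 0)   # port 0 lets the OS pick a free port
  port = server.addr[1]

  responder = Thread.new do
    conn = server.accept
    line = conn.gets                       # read one request line
    conn.puts "You said: #{line.chomp}"
    conn.close                             # closing signals EOF to the client
  end

  client = TCPSocket.open('127.0.0.1', port)
  client.puts request
  reply = client.readlines                 # read until the server closes
  client.close
  responder.join
  server.close
  reply
end

p loopback_exchange("hello")   # prints ["You said: hello\n"]
```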
The lib/net library provides higher-level wrappers for application protocols (FTP, HTTP, POP, SMTP, and telnet).
    require 'net/http'
    http = Net::HTTP.new('pragprog.com', 80)
    response = http.get('/book/ruby3/programming-ruby-1-9')
    if response.message == "OK"
      puts response.body.scan(/<img alt=".*?" src="(.*?)"/m).uniq[0,3]
    end
produces:
    http://pragprog.com/assets/logo-c5c7f9c2f950df63a71871ba2f6bb115.gif
    http://pragprog.com/assets/drm-free80-9120ffac998173dc0ba7e5875d082f18.png
    http://imagery.pragprog.com/products/99/ruby3_xlargecover.jpg?1349967653
At a higher level still, open-uri lets you open a URL like a file:
    require 'open-uri'
    # Note: on Ruby 3.0 and later, call URI.open instead; Kernel#open no longer handles URLs
    open('http://pragprog.com') do |f|
      puts f.read.scan(/<img alt=".*?" src="(.*?)"/m).uniq[0,3]
    end
produces:
    http://pragprog.com/assets/logo-c5c7f9c2f950df63a71871ba2f6bb115.gif
    http://pragprog.com/assets/drm-free80-9120ffac998173dc0ba7e5875d082f18.png
    http://imagery.pragprog.com/products/353/jvrails2_xlargebeta.jpg?1368826914
11.5 Parsing HTML
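    # Match with a regular expression; the m option in %r{...}m lets . match newlines,
    # so the title can span several lines
    require 'open-uri'
    # (Ruby 3.0+: use URI.open instead of open)
    page = open('http://pragprog.com/titles/ruby3/programming-ruby-1-9').read
    if page =~ %r{<title>(.*?)</title>}m
      puts "Title is #{$1.inspect}"
    end
produces:
    Title is "The Pragmatic Bookshelf | Programming Ruby 1.9"
The nokogiri library provides much more powerful support for parsing HTML.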
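The effect of the m option is easiest to see on a string with a line break inside the tag. In Ruby, m makes . match newline characters as well. It does not control how many matches are found. The sample page below is hypothetical.

```ruby
# Hypothetical sample page with a line break inside the title
PAGE = "<html><head><title>Programming\nRuby</title></head></html>"

# Without m, . stops at the newline, so the title is not found
p PAGE[%r{<title>(.*?)</title>}, 1]     # prints nil

# With m, . also matches "\n", so the match can span lines
p PAGE[%r{<title>(.*?)</title>}m, 1]    # prints "Programming\nRuby"
```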
    require 'open-uri'
    require 'nokogiri'
    # (Ruby 3.0+: use URI.open instead of open)
    doc = Nokogiri::HTML(open("http://pragprog.com/"))
    puts "Page title is " + doc.xpath("//title").inner_html
    # Output the first paragraph in the div with an id="copyright"
    # (nokogiri supports both xpath and css-like selectors)
    puts doc.css('div#copyright p')
    # Output the second hyperlink in the site-links div using xpath and css
    puts "\nSecond hyperlink is"
    puts doc.xpath('id("site-links")//a[2]')
    puts doc.css('#site-links a:nth-of-type(2)')
Nokogiri can also update and create HTML and XML.