Link: http://www.troubleshooters.com/codecorn/ruby/symbols.htm
CONTENTS
- Introduction
- What do symbols look like?
- What do they resemble in other languages?
- How are symbols implemented?
- What are symbols?
- What are symbols not?
- What can symbols do for you?
- What are the advantages and disadvantages of symbols?
- Summary
Introduction
Overwhelmingly, Ruby conforms to Eric Raymond's Rule of Least Surprise. However, the concept of Ruby Symbols recently precipitated a rather lively 29 participant, 97 post (and counting) thread on the [email protected] mailing list, complete with disagreements and killfile pronouncements. Perhaps some documentation is called for :-)
I'm writing this documentation for a specific audience: People who want to use Ruby but are not Ruby veterans. Maybe they've used Ruby, maybe they haven't, but they're not Ruby veterans. For the understanding of this specific audience, this documentation is written with a minimum of Ruby specific content. Instead, this documentation relies on general programming concepts. In the end, this document will enable the Ruby Newbie to use symbols correctly, every time, so that their code runs and does what they intend it to do. That is the sole goal of this documentation.
Real Ruby veterans understand symbols intuitively, so they don't need this documentation. Indeed, a Ruby veteran might look at the documentation you're now reading and call it inaccurate, because the concepts introduced in this documentation do not match the actual Ruby implementation of symbols. This document does not claim to explain the Ruby implementation of symbols, but instead explains how an application programmer can think of them and use them to attain the desired results, as well as to read code containing symbols.
Symbols can be viewed on many levels:
- What do symbols look like?
- What do they resemble in other languages?
- How are symbols implemented?
- What are symbols?
- What are symbols not?
- What can symbols do for you?
- What are the advantages of symbols?
Some of these levels are explained in this document, and several are not necessary to correct use of symbols. Read on...
What do symbols look like?
This is the one area where everyone agrees. Most symbols look like a colon followed by a non-quoted string:
:myname
Another way to make a symbol is with a colon followed by a quoted string, which is how you make a symbol whose string representation contains spaces:
:'Steve was here and now is gone'
The preceding is also a symbol. Its string representation is:
"Steve was here and now is gone"
#!/usr/bin/env ruby
puts :'I love Ruby.'
puts :'I love Ruby.'.to_i
[slitt@mydesk slitt]$ ./test.rb
I love Ruby.
10263
[slitt@mydesk slitt]$
When using quotes in a symbol, you can use either single or double quotes, as long as the beginning and ending quotes are the same type. Single or double, the string and numeric representations are identical, and the object_id is the same.
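Here is a quick sketch of the quoting rule. It assumes a current Ruby interpreter, where Symbol#to_i is no longer available, so only the string form and the object_id are compared. The variable names are purely illustrative.

```ruby
#!/usr/bin/env ruby
# Both quoting styles name the same symbol.
single = :'I love Ruby.'
double = :"I love Ruby."

puts single.to_s == double.to_s            # same string representation
puts single.object_id == double.object_id  # same object ID
puts single.equal?(double)                 # one and the same object
```

All three lines should print true, because the quote style only affects how the symbol is written, not which symbol it is.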
Symbols are immutable. Their value remains constant during the entirety of the program. They never appear on the left side of an assignment. You'll never see this:
:myname = "steve"
If you were to try that, you'd get the following error message:
[slitt@mydesk slitt]$ ./test.rb
./test.rb:37: parse error, unexpected '=', expecting $
:myname = "steve"
^
[slitt@mydesk slitt]$
Symbols ARE used like this:
mystring = :steveT
Or this:
mystring = :steveT.to_s
Or this:
myint = :steveT.to_i
Or this:
attr_reader :steveT
Now you at least know what we're talking about. Naturally, you still have plenty of questions. Read on...
What do they resemble in other languages?
I'm not qualified to answer this question. In the long run, it doesn't matter. Trying to answer this question at the start of your Ruby career can muddle the issue.
How are symbols implemented?
The only really authoritative answer to this question is to read the C code from which Ruby (actually the ruby executable) is built. However, if you're new to Ruby, or a person who uses Ruby because he likes it but doesn't need to be a foremost Ruby authority, this is an answer you do not need at this time, and it's probably best for the time being to ignore all discussions of how symbols are implemented in Ruby.
What are symbols?
It's a string. No it's an object. No it's a name.
There are elements of truth in each of the preceding assertions, and yet in my opinion they are not valuable, partially because they depend on a deep knowledge of Ruby to understand their significance. I prefer to answer the question "what are symbols" in a language independent manner:
A Ruby symbol is a thing that has both a number (integer) representation and a string representation.
In its actual Ruby implementation, the symbol does not contain either a string or a number -- the string and number are kept somewhere else. That's not important for understanding how it works, however, so feel free to think of the symbol as containing the string and number if that's easier to visualize. In your code, you can derive the number representation with the :mysymbol.to_i syntax, and the string representation with the :mysymbol.to_s syntax. In most situations, a symbol yields the string representation even without the to_s conversion.
The string representation of the symbol is MUCH more important than the number part. As a matter of fact, the number part is seldom used.
Let's explore further using code:
#!/usr/bin/env ruby
puts :steve
puts :steve.to_s
puts :steve.to_i
puts :steve.class
The preceding code prints four lines. The first line prints the string representation because that's how the puts() method is set up. The second line is an explicit conversion to string. The third is an explicit conversion to integer. The fourth prints the type of the symbol. The preceding code results in the following output:
[slitt@mydesk slitt]$ ./test.rb
steve
steve
10257
Symbol
[slitt@mydesk slitt]$
The first line shows the string representation of the symbol. Note that the string representation is identical to the string following the colon. The second line first converts the symbol to a String object using to_s, and then prints it. The output is the same, but the explicit conversion to String object offers some added capabilities you might (or might not) need. This is discussed later in this document.
The third line shows the integer representation of the symbol. It is a non-meaningful and pretty much non-useful number that cannot be changed. The fourth line shows that the symbol is an object of the Symbol class.
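On Ruby 1.9 and later, Symbol#to_i no longer exists, so the third line of the four-line program raises NoMethodError there. The following is a rough equivalent for current interpreters. Using object_id as the stand-in for the "number part" is my assumption for illustration, not a claim about how Ruby stores symbols.

```ruby
#!/usr/bin/env ruby
sym = :steve

puts sym            # puts prints the string representation
puts sym.to_s       # explicit conversion to a String
puts sym.object_id  # an Integer that identifies this symbol object
puts sym.class      # the class of the object: Symbol
```

The printed integer will differ from machine to machine and from Ruby version to Ruby version, which is one more reason not to rely on it.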
Now let's explore some code that proves that the symbol's value cannot be changed at runtime:
#!/usr/bin/env ruby
:steve = "Big Steve"
[slitt@mydesk slitt]$ ./test.rb
./test.rb:2: parse error, unexpected '=', expecting $
:steve = "Big Steve"
^
[slitt@mydesk slitt]$
Well, that failed miserably. Maybe if we explicitly change the string representation:
#!/usr/bin/env ruby
:steve.to_s = "Big Steve"
[slitt@mydesk slitt]$ ./test.rb
./test.rb:2: undefined method `to_s=' for :steve:Symbol (NoMethodError)
[slitt@mydesk slitt]$
No go on strongarming the string part. What about the integer?
[slitt@mydesk slitt]$ ./test.rb
./test.rb:2: undefined method `to_i=' for :steve:Symbol (NoMethodError)
[slitt@mydesk slitt]$
Can't strongarm the integer. Of course, to_i and to_s were never meant to be set methods -- they're get methods (actually they're conversions, but you needn't consider that right now), but it's pretty obvious that a symbol cannot be changed at runtime. In computer science speak, it's immutable.
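The same conclusion can be checked without triggering errors. This is a minimal sketch assuming a current Ruby, where every symbol reports itself as frozen. The variable names are illustrative.

```ruby
#!/usr/bin/env ruby
sym = :steve

puts sym.frozen?              # true: the symbol itself cannot be modified
puts sym.respond_to?(:to_s=)  # false: there is no setter for the string part

# to_s hands back a String. Working on a copy of that String
# changes the copy only; the symbol stays exactly as it was.
copy = sym.to_s.dup
copy << " the Great"
puts copy                     # steve the Great
puts sym                      # steve
```

The dup call is there so the example modifies a plain, unshared String rather than whatever object to_s returned.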
One last point. In a single program, every occurrence of an identically named symbol is actually the same object. This is not true of strings. Watch this:
#!/usr/bin/env ruby
puts :myvalue.object_id
puts :myvalue.object_id
puts "myvalue".object_id
puts "myvalue".object_id
[slitt@mydesk slitt]$ ./test.rb
2625806
2625806
537872172
537872152
[slitt@mydesk slitt]$
As you can see, both times :myvalue was used, it had the same object ID, whereas the two uses of "myvalue" produced two different object IDs. This is how symbols can save memory.
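Comparing raw object IDs by eye is error prone, so here is a sketch that asks Ruby directly whether two references point at the same object, using equal?. The strings are built with String.new so that each one is guaranteed to be a freshly allocated object regardless of interpreter settings; that choice is mine, not part of the original program.

```ruby
#!/usr/bin/env ruby
sym_a = :myvalue
sym_b = :myvalue
str_a = String.new("myvalue")  # String.new always allocates a new object
str_b = String.new("myvalue")

puts sym_a.equal?(sym_b)  # true: one shared symbol object
puts str_a == str_b       # true: the characters are the same
puts str_a.equal?(str_b)  # false: two separate String objects
```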
Based on what's been presented in this section, we can add to the language independent answer to the question "what is a symbol":
- A Ruby symbol is a thing that has both a number (integer) representation and a string representation.
- The string representation is much more important and used much more often.
- The value of a Ruby symbol's string part is the name of the symbol, minus the leading colon.
- A Ruby symbol cannot be changed at runtime.
- Multiple uses of the same symbol have the same object ID and are the same object.
Now let's inject just a little bit of Ruby specific terminology. Almost everything in Ruby is an object, and symbols are no exception. They're objects.
What are symbols not?
A Symbol is Not a String
A Ruby symbol is not a string. Ruby string objects have methods such as capitalize, and center. Ruby symbols have no such methods:
#!/usr/bin/env ruby
mystring = :steve.capitalize
puts mystring
[slitt@mydesk slitt]$ ./test.rb
./test.rb:2: undefined method `capitalize' for :steve:Symbol (NoMethodError)
[slitt@mydesk slitt]$
As an aside, if you want to capitalize the string representation of a symbol, you can first convert it to a string:
#!/usr/bin/env ruby
mystring = :steve.to_s.capitalize
puts mystring
[slitt@mydesk slitt]$ ./test.rb
Steve
[slitt@mydesk slitt]$
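The same check can be made without raising an error by asking each object whether it responds to a method. This sketch uses center, the other String method named above. Which String methods a Symbol lacks varies between Ruby versions, so treat the exact answers as an assumption about your interpreter rather than a fixed rule.

```ruby
#!/usr/bin/env ruby
sym = :steve

puts "steve".respond_to?(:center)  # true: String has center
puts sym.respond_to?(:center)      # false on current Rubies: Symbol does not

# Convert first, then use the String method on the result.
puts sym.to_s.center(11, "*")      # ***steve***
```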
A Symbol is not (Just) a Name
The following illustrates the use of a symbol as a name:
attr_reader :length
You're naming both a get method (length()) and an instance variable (@length).
However, symbols can be used to hold any sort of immutable string. A symbol could be used as a constant (but you'd probably use an identifier starting with a capital letter instead). The point is, symbols are not restricted to just names.
That being said, symbols are used as names quite often, so although equating a symbol to a name is not correct, saying symbols are often used to hold names is a reasonable assertion.
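To make the "symbol as a name" idea concrete, here is a small sketch. The Box class and its value are hypothetical, made up for this illustration; attr_reader, respond_to?, public_send and instance_variable_get are standard Ruby methods that each accept a symbol naming a method or variable.

```ruby
#!/usr/bin/env ruby
# Hypothetical class used only for this illustration.
class Box
  attr_reader :length   # names the getter and, through it, @length

  def initialize(length)
    @length = length
  end
end

box = Box.new(42)
puts box.length                           # 42: the getter attr_reader made
puts box.respond_to?(:length)             # true: the symbol names a method
puts box.public_send(:length)             # 42: call the method by its name
puts box.instance_variable_get(:@length)  # 42: the symbol names a variable
```

In every call the symbol carries nothing but a name; the method receiving it decides what to do with that name.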
A Symbol is an Object, but So What?
No doubt about it, a symbol is an object, but so what. Almost everything in Ruby is an object, so saying a symbol is an object says nothing distinctive about symbols.
What can symbols do for you?
A symbol is a way to pass string information, always assuming that:
- The string needn't be changed at runtime.
- The string doesn't need methods of class String.
Because a symbol can be converted to a string with the .to_s method, you can create a string with the same value as the symbol's string representation, and then you can change that string at will and use all String methods.
A great many applications of symbols could be handled by strings. For instance, you can do either the customary:
attr_writer :length
Or you can do the avant-garde:
attr_writer "length"
Both preceding code statements create a setter method called length=, which in turn sets an instance variable called @length. If this seems like magic to you, then keep in mind that the magic is done by attr_writer, not by the symbol. The symbol (or the string equivalent) just functions as a string to tell attr_writer what it should name the method it creates, and what that method should name the instance variable it creates.
To see, in a simplified manner, how attr_writer does its "magic", check out this program:
#!/usr/bin/env ruby
def make_me_a_setter(thename)
  eval <<-SETTERDONE
    def #{thename}(myarg)
      @#{thename} = myarg
    end
  SETTERDONE
end
class Example
  make_me_a_setter :symboll
  make_me_a_setter "stringg"
  def show_symboll
    puts @symboll
  end
  def show_stringg
    puts @stringg
  end
end
example = Example.new
example.symboll("ITS A SYMBOL")
example.stringg("ITS A STRING")
example.show_symboll
example.show_stringg
In the preceding, function make_me_a_setter is a greatly simplified version of attr_writer. It does not implement the equal sign, so to use the setter you must put the argument in parentheses instead of after an equal sign. It does not iterate through multiple arguments, so each make_me_a_setter call can take only one argument, which is why we call it individually for both :symboll and "stringg".
With the setters made, the application programmer can access the setters as example.symboll("ITS A SYMBOL"). The following is the output of the program:
[slitt@mydesk slitt]$ ./test.rb
ITS A SYMBOL
ITS A STRING
[slitt@mydesk slitt]$
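For comparison, here is a sketch of the same idea written with define_method and instance_variable_set instead of eval. This is an alternative I am swapping in for illustration, not how attr_writer itself is implemented. The SetterMaker module and the DefinedExample class are hypothetical names.

```ruby
#!/usr/bin/env ruby
# Hypothetical helper module; not part of Ruby itself.
module SetterMaker
  def make_me_a_setter(thename)
    # define_method accepts either a symbol or a string as the method name.
    define_method(thename) do |myarg|
      instance_variable_set("@#{thename}", myarg)
    end
  end
end

class DefinedExample
  extend SetterMaker            # makes make_me_a_setter a class-level method
  make_me_a_setter :symboll
  make_me_a_setter "stringg"

  def show_symboll
    puts @symboll
  end

  def show_stringg
    puts @stringg
  end
end

example = DefinedExample.new
example.symboll("ITS A SYMBOL")
example.stringg("ITS A STRING")
example.show_symboll
example.show_stringg
```

As in the eval version, the symbol and the string are interchangeable here because each one only supplies a name. One difference is that this version defines the setters on DefinedExample alone rather than at the top level.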
In most situations, you could use a string instead of a symbol. Perhaps using a string would decrease performance to some degree. If a literal string is used repeatedly, it will certainly consume more memory than its symbol counterpart. Perhaps using a string would be less readable, or less customary. But you can usually use a string in any situation you can use a symbol. With one exception...
If you need a "string" (term used loosely) that must not be changed, then you need a symbol, because a symbol's value cannot be changed at runtime.
What are the advantages and disadvantages of symbols?
Symbols generally have performance benefits. Each symbol is identified to the programmer by its name (for instance, :mysymbol), but the program can identify it by its numeric representation, which of course is a quicker lookup.
When two strings are compared, somewhere in some abstraction layer pointers must walk both strings, looking for a mismatch. When two Ruby symbols are compared, if their numeric representation is equal, then the symbols are equal. If you were to use :mysymbol twenty times in your program, every usage of :mysymbol would refer to exactly the same object with exactly the same numeric representation and exactly the same string representation. This can enhance performance.
Because every :mysymbol refers to exactly one object and yet "defines" (I use the term loosely) a literal string, it saves considerable memory over using a literal string every time, because each true literal string consumes memory, whereas once a symbol is defined, additional usages consume no additional memory. So if you use the same literal string tens or hundreds of times, substitute symbols. Hash keys are an excellent example.
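Here is a small sketch of the hash key case. The hashes and their contents are made up for illustration. It shows only that the same symbol key in two different hashes is one shared object; it does not measure memory or speed.

```ruby
#!/usr/bin/env ruby
# Two unrelated hashes that use the same symbols as keys.
first  = { :name => "Steve", :language => "Ruby" }
second = { :name => "Linda", :language => "Perl" }

puts first[:name]   # Steve
puts second[:name]  # Linda

# The :name key in both hashes is the very same object.
puts first.keys.first.equal?(second.keys.first)  # true
```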
The granddaddy of all advantages is also the granddaddy of disadvantages: symbols can't be changed at runtime. If you need something that absolutely, positively must remain constant, and yet you don't want to use an identifier beginning with a capital letter (a constant), then symbols are what you need.
The big advantage of symbols is that they are immutable (can't be changed at runtime), and sometimes that's exactly what you want.
Sometimes that's exactly what you don't want. Most usage of strings requires manipulation -- something you can't do (at least directly) with symbols.
Another disadvantage of symbols is they don't have the String class's rich set of instance methods. The String class's instance methods make life easier. Much easier.
Summary
Ruby symbols generated a 97 post thread on the [email protected] list. There were many disagreements, some of which got a little heated. Twenty people, most of them smarter than me, had conflicting views of how to explain symbols. So if anyone tells you he has the one true way (TM) to explain symbols, he's probably wrong.
No matter what their understanding of symbols, Ruby veterans know how to use them to get the desired results. The problem is, Ruby newbies don't know how to use symbols to get the desired results, and yet they must listen to and try to learn from the often conflicting explanations from Ruby veterans. Ruby veterans often base their explanations on Ruby specific customs, riffs or even internal implementations, further distancing their explanations from the Ruby newbie.
This document is aimed specifically at the Ruby newbie. It uses very little Ruby specific implementation in its explanation of symbols. It does not argue fine points, but instead bestows the information the newbie REALLY needs to know in order to USE symbols to accomplish his coding goals, as well as to read code containing symbols.
The following statements are handy in using (or not using) symbols:
- A Ruby symbol looks like a colon followed by characters. (:mysymbol)
- A Ruby symbol is a thing that has both a number (integer) and a string.
- The value of a Ruby symbol's string part is the name of the symbol, minus the leading colon.
- A Ruby symbol cannot be changed at runtime.
- Neither its string representation nor its integer representation can be changed at runtime.
- Ruby symbols are useful in preventing modification.
- Like most other things in Ruby, a symbol is an object.
- When designing a program, you can usually use a string instead of a symbol.
- Except when you must guarantee that the string isn't modified.
- Symbol objects do not have the rich set of instance methods that String objects do.
- After the first usage of :mysymbol all further usages of :mysymbol take no further memory -- they're all the same object.
- Ruby symbols save memory over large numbers of identical literal strings.
- Ruby symbols enhance runtime speed to at least some degree.