Ruby (in progress!)

Note: Based on Programming Ruby book

Table of Contents

Chapter 1 - Programming Ruby Book Summary

Interactive Ruby Shell

  • Start with irb
  • Show warnings with -w flag (i.e. ruby -w - "Hi") - to enable warnings, read program from STDIN and pass param
  • Show irb threads with jobs
  • Create irb subsessions with irb, switch to one with fg 1 (fg <session_id>), and return to parent session with fg 0
  • Create irb subsessions within a class to edit it or an instance (i.e. irb MyClassName)
  • Check current session self
  • Exit subsession with irb_exit
  • Kill irb threads kill 1,2,3
  • Load files with load 'file/subdirectory/myfile.rb'
  • Tab Completion (press TAB to see method options after start typing)
  • Show IRB History of commands cat ~/.irb-history
  • Create .irbrc to customise IRB

Executable Ruby Programs

  • Run plain Ruby file
ruby main.rb
  • Run executable file having shebang notation at start of file (i.e. #!/usr/bin/ruby)
./main.rb

Ruby Object-Oriented Terms

  • Objects are created and manipulated from a Class.
  • Real-world concepts modelled in code are grouped into categories represented by Classes (state held in variables plus the methods that use it)
  • Classes define:
  • Class Instances (Objects) are created from the same base Class using the constructor method new
  • Class Instances have
  • Object Identifier (unique)
  • Object Instance Variables (Object State)
  • Object Instance Methods (invoke by calling method on receiver with message containing method name and parameters)
  • Accessibility Constraints

Ruby Code Evaluation, Convention and Styling

  • Parentheses used due to precedence rules (i.e. puts(foo('hi')))
  • String literals single quote minimal processing (i.e. puts('line1\nline2'))
  • String literals double quote higher processing (i.e. puts("line1\nline2"))
    • Substitutes sequences with binary (i.e. \n with new line character)
    • Expression Interpolation of sequences #{expression}
  • Last expression evaluated in method is returned (i.e. return not required) is Idiomatic Ruby
  • Naming Conventions:
    • Local variables, method parameters/names - start with lowercase letter or _
    • Global variables - prefix with $
    • Instance variables (multiword) - start with @ (i.e. @foo_bar)
    • Class variables - start with @@ (i.e. @@foo_bar). Avoid them: class variables are shared with subclasses, changes propagate upwards if defined first in the child, and they are not actually private to the defining class (leakage occurs)
    • Class/Module names and Constants - start with uppercase letter camelcase (i.e. class FooBar)
    • Constants - capitalised (i.e. FOO_BAR)
    • Method names - may end with ? (aka Predicate Methods), ! (aka Bang Methods modify receiver in place), or =
  • nil is an object that represents nothing (the absence of a value)
    • nil.kind_of? Object # => true
  • Constants
  • Reference: Programming Ruby book
OUTER_CONST = 99
class Const
  def get_const
    CONST
  end
  CONST = OUTER_CONST + 1
end
Const.new.get_const # => 100
Const::CONST # => 100
::OUTER_CONST # => 99
Const::NEW_CONST = 123

Ruby Global Symbols

  • Refer to pages 314-316 (Exception Info, Pattern Matching Variables, I/O Variables, Environment Variables, Standard Objects, Global Constants) and text Library on page 747 (i.e. $~, etc) for interpretation

Ruby Operators

  • High to low precedence shown on page 318

Ruby Expressions

  • Chain assignment a = b = c = 0

Ruby Comparables

  • == [1, 2] == [1, 2] # => true

Ruby Object IDs

  • Fixnums have odd object_id. Bignum have even object_id
  • Smallest Bignum is 4611686018427387904
(4611686018427387903..4611686018427387904).each { |i| puts "Number #{i} has:\n\tObject ID: #{i.object_id}\n\tData Type: #{i.class}" }
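# Output (64-bit Ruby before 2.4, where Fixnum and Bignum were still separate classes; since 2.4 both report Integer):
# Number 4611686018427387903 has: Object ID: 9223372036854775807, Data Type: Fixnum
# Number 4611686018427387904 has: Object ID: 70346687541440, Data Type: Bignum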

Ruby Standard Types

  • Numbers
  • Floating-point numbers do not always have an exact internal representation, and calling the Integer method on an inexact value truncates it (rounds toward zero). Overcome by adding 0.5 to a positive value before calling Integer so it rounds to the nearest integer instead (or call round).
  • Use BigDecimal for financial calculations
  • Integers outside a certain range are automatically stored/converted in objects of class Bignum
  • Integers may have optional leading base indicator (i.e. decimal 0d200 == 200)
  • Underscores ignored in digit strings so may use as comma (i.e. 100_000_000 == 100000000)
  • All numbers are objects and respond to various messages (i.e. num.abs)
  • Expressions with strings containing only numbers must be explicitly converted to number (i.e. v1 = "5"; Integer(v1))
  • Operations applied to two numbers of different class Type result in a value of the more general one (i.e. integer+float # => float)
  • mathn library used to return most natural representation (i.e. 5/7 # => 5/7)
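
A minimal sketch of the number rules above, assuming a Ruby where the bigdecimal library is available via require. The variable names are illustrative only and not from the book.

```ruby
require 'bigdecimal'

# Floats are binary approximations, so some decimal sums are inexact.
inexact = 0.1 + 0.2
float_equal = (inexact == 0.3)                 # false

# Integer() truncates toward zero; round goes to the nearest integer.
truncated = Integer(2.7)                       # 2
rounded   = 2.7.round                          # 3

# BigDecimal keeps decimal fractions exact (build it from strings).
exact = BigDecimal("0.1") + BigDecimal("0.2")
decimal_equal = (exact == BigDecimal("0.3"))   # true

# Strings holding digits must be converted explicitly before arithmetic.
v1 = "5"
sum = Integer(v1) + 1                          # 6

# Mixing integer and float gives the more general class (Float).
mixed = 1 + 2.0                                # 3.0

# Underscores in numeric literals are ignored.
big = 100_000_000
```

Building BigDecimal values from strings rather than floats avoids carrying the float's inexact representation into the decimal value.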

  • Strings
  • Escape sequences (only \\ and \') supported between single quotes '' (or delimiters %q{...}): 'escape "\\"' and %q{escape "\\"} both give escape "\", and 'luke\'s' gives luke's
  • Escape sequences supported between double quotes "" (or delimiters %Q{...}) "#{'a '*3}" == %Q{#{'a '*3}} # => a a a
  • Interpolated puts "#{ def my(a) a; end; my('test')}"
  • here document
string = <<END_OF_STRING
  my string
END_OF_STRING
  • Default String Encoding (Ruby 2) is UTF-8: puts RUBY_VERSION; "".encoding # => #<Encoding:UTF-8>

  • Character Constants
  • Strings of length 1 created by preceding a character with a question mark (i.e. ?a # => "a", ?\n # => "\n")

  • String Idioms (Common)
  • chomp strip trailing new lines \n from file line read
  • "2:33".split(/:/) to use RegEx and split string around defined character
  • squeeze! to alter string in place trimming off repeated characters (i.e. x.squeeze!(" ") )
  • "2:33".scan(/\d+/) breaks string into chunks based on specified RegEx pattern (i.e. one or more digits) # => ["2", "33"]
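
The idioms above can be combined on a single line of input. This is a sketch only: the sample line and the variable names are made up for illustration.

```ruby
# A hypothetical line as it might be read from a file.
line = "2:33  Fly   Me  To\n"

line = line.chomp                      # drop the trailing newline
line.squeeze!(" ")                     # collapse runs of spaces in place
time, title = line.split(" ", 2)       # split into at most two fields

# scan returns every chunk matching the pattern (runs of digits here).
mins, secs = time.scan(/\d+/)
seconds = mins.to_i * 60 + secs.to_i
```

After this runs, time is "2:33", title is "Fly Me To" and seconds is 153.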

  • Ranges (as Sequences, Conditional Expressions, or Intervals)
  • Range objects only contain references to two objects. Converting large range to an array may use much memory
  • Sequences have Start and End points and produce successive values in sequence
  • Sequences (Inclusive) uses range operators .. (i.e. 1..10)
  • Sequences (Excludes high value) uses range operators ... (i.e. 0..."luke".length)
  • Check if specific value is in Sequence range = 0..9; range.include?(5) # => true
  • Find max value in Sequence range = 0..9; range.max # => 9
  • Filter Sequence range = 0..9; range.reject {|i| i < 5 } # => [5, 6, 7, 8, 9]
  • Accumulator on Sequence range = 0..9; range.inject(:+) # => 45
  • Convert Sequence to Array ('aaa'..'aac').to_a # => [“aaa”, “aab”, “aac”]
  • Convert Sequence to Enum e = ('aaa'..'aac').to_enum; e.next # => "aaa"; e.next # => "aab"
  • Ranges on Objects must respond to succ with next object in sequence and be Comparable using <=> (returning -1 for less than, 0 for equal to, or 1 for greater than)
  • Range using Conditional Expression with Match Operator (match position) name = "luke"; puts name if name =~ /0/ .. name =~ /3/
  • Range in Interval
age = gets.to_i
case age
when 0..35    then puts "young"
when 36..100  then puts "old"
when 101      then puts "retired"
else
  puts "unknown"
end
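
The rule that range endpoints must respond to succ and <=> can be sketched with a small custom class. PowerOfTwo is a hypothetical class written for this note, not part of Ruby.

```ruby
# Hypothetical value class usable as a Range endpoint.
class PowerOfTwo
  include Comparable          # derives <, >, == and friends from <=>
  attr_reader :value

  def initialize(value)
    @value = value
  end

  # Next object in the sequence.
  def succ
    PowerOfTwo.new(@value * 2)
  end

  # Ordering used by Range to know when to stop.
  def <=>(other)
    @value <=> other.value
  end
end

range  = PowerOfTwo.new(4)..PowerOfTwo.new(32)
powers = range.map(&:value)    # [4, 8, 16, 32]
```

Range only stores the two endpoints; the intermediate objects are produced by succ while iterating.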

Ruby Booleans

  • Values are true if they are not nil or false
  • 0 and "" are also true (only nil and false are false)
  • && behaves like and, but and has lower precedence
  • Shortcircuit Evaluation
    • false && true # => false (returns first argument)
    • nil && true # => nil (returns first argument)
    • true && whatever # => whatever (returns second argument)
  • No assignment if variable already set var ||= "whatever" (i.e. var = “whatever” unless var)
  • Assignment of var1 value if it is true, otherwise assign var2 value var = var1 || var2
  • Not not true == !false # => false
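  • Defined? defined? nil # => "expression" (defined? returns nil only when its argument is not defined, i.e. defined? no_such_var # => nil)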

  • == tests for equal value (not type)
  • === compares items with the target in a case statement when clause
  • eql? matches type and value, equal? matches object_id, <=> returns -1, 0, or 1 for less than, equal to, or greater than (nil if not comparable)
    • 1.to_f === 1.to_i # => true
    • 1.to_f.eql? 1.to_i # => false
    • "1".eql? 1 # => false
    • "1".eql? "1" # => true
    • "1".equal? "1" # => false
    • "1" <=> 1 # => nil
    • "1" <=> "1" # => 0
    • "1" <=> "2" # => -1
    • "1" <=> "-1" # => 1
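
A short sketch of the truthiness and short-circuit rules in this section. Variable names are illustrative.

```ruby
# Only nil and false are falsy; 0, "" and [] all count as true.
truthy = [0, "", [], nil, false].map { |v| v ? true : false }

# && returns the first falsy operand, otherwise the last operand.
a = nil && "never evaluated"
b = "left" && "right"

# || returns the first truthy operand.
c = nil || "fallback"

# ||= assigns only when the variable is currently nil or false.
setting = nil
setting ||= "default"
setting ||= "ignored"

# `and` binds more loosely than `=`, unlike &&.
x = true && false        # x = (true && false)
y = true and false       # (y = true) and false
```

Here truthy is [true, true, true, false, false], a is nil, b is "right", c is "fallback", setting stays "default", x is false and y is true.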

Ruby Variables

  • Variables hold reference pointing to an object (not the object itself) likely stored in a heap
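test1 = "Test1"
puts "#{test1.class}, #{test1.object_id}"
test2 = test1        # test2 references the same String object as test1
test2[0] = "B"       # so the change is visible through test1 too
puts "Test1: #{test1}, #{test1.class}, #{test1.object_id}. Test2: #{test2}, #{test2.class}, #{test2.object_id}"
test1.freeze         # freezes the object, not the variable
test2[0] = "P"       # raises FrozenError (RuntimeError before Ruby 2.5)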
  • Parallel Assignment to Swap positions of two variables
a, b = 1, 2
a, b = b, a                 # a => 2, b => 1
  • Excess elements assigned are discarded
a, b = 1, 2, 3, 4           # a => 1, b => 2 (3 and 4 are discarded)
  • Splat RHS expands before assignment
a, b, c, d, e = *(1..5)     # a => 1, b => 2, c => 3, d => 4, e => 5
  • Splat as greedy LHS
  • Only one LHS splat allowed. Soaks up RHS values leaving enough for remaining LHS values
a, *b, c = 1, 2, 3, 4       # a => 1, b => [2, 3], c => 4
  • Raw asterisk to Ignore some RHS values
first, *, last = 1,2,3,4,5,6    # first=1, last=6
  • Nested Assignment
a, (b,*c), d = 1,[2,3,4],5    # a=1, b=2, c=[3, 4], d=5

Ruby Struct

  • Data structure containing attributes (i.e. Struct.new(:name, :age, :sex) )
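
A minimal sketch of a Struct in use. Person and its adult? method are hypothetical names chosen for this example.

```ruby
# Struct.new generates a class with accessors for each attribute.
# The optional block adds extra methods to that class.
Person = Struct.new(:name, :age, :sex) do
  def adult?
    age >= 18
  end
end

luke = Person.new("luke", 35, :male)
luke.age += 1                      # attributes are readable and writable

same = (luke == Person.new("luke", 36, :male))   # compared by value
fields = luke.to_a
```

After this runs, luke.age is 36, same is true and fields is ["luke", 36, :male].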

Ruby Arrays

  • Arrays hold a collection of references pointing to objects in array index positions
  • Array operation [] used for indexes is an instance method of class Array and may be overridden by subclasses
  • Array index are >= 0. nil is returned if no object at an index. Providing negative index counts from the end
  • Arrays allow Integer as key and any Object value
  • Array Objects (Strings Only) Created and Initialised using Shortcut (i.e. %w{ 99 a}) # => ["99", "a"]

  • Create and initialise array object using array literal: a = [1.23, "abc", 123]
  • Get class: a.class
  • Get array length: a.length
  • Get element value at index: puts "element 1 is #{a[1]}"
  • Get element with no value: a[3] # => nil
  • Get range of element values: a[-2, 2] # => ["abc", 123] # [start, count]
  • Get range of element values: a[-2..-1] # => ["abc", 123]
  • Get first n element values; a.first(2) # => [1.23, "abc"]
  • Get last n element values; a.last(2) # => ["abc", 123]
  • Set element values: a[5] = 2.34 # => [1.23, "abc", 123, nil, nil, 2.34]
  • Inspect array: a.inspect
  • Stack implemented using Array:
stack = []
stack.push("a").push("c").pop # => "c"
  • FIFO Queue implemented using Array:
queue = []
queue.push("a").push("c").shift # => "a"
p queue  # => ["c"]
  • Examples:
# generate array where each element is word contained in a sentence string
def words_from_string(string)
  string.downcase.scan(/[\w']+/)
end

Ruby Hash (aka Associative Arrays / Maps / Dictionaries)

  • Indexed Data Structure collections of objects accessible using a key
  • Hashes allow any Object as key or entry value
  • Iterating over entries returns in same item order as when added

  • Create and initialise hash literal with unique KV mapping: h = { 'moo' => 'cow', :dog => 'bark' }
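  • Alternative for Symbol keys (Ruby v1.9+): h = { moo: 'cow', dog: 'bark' } (the quoted form 'moo': 'cow' needs Ruby 2.2+ and also creates the Symbol key :moo, not the String 'moo')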
  • Create and initialise hash object with default value returned:
h = Hash.new(0)
h[9999] # => 0
  • Get class: h.class
  • Get hash length: h.length
  • Get element with no value: h['dog'] # => nil, h[:dog] # => ‘bark’
  • Get element entry value for key: h['blah'] # => 0
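
A common use of a hash with a default value is counting. The sketch below reuses words_from_string from the Arrays section; count_frequency and the sample sentence are assumptions made for illustration.

```ruby
def words_from_string(string)
  string.downcase.scan(/[\w']+/)
end

# Hypothetical helper: tally how often each word occurs.
def count_frequency(words)
  counts = Hash.new(0)                 # missing keys read as 0
  words.each { |word| counts[word] += 1 }
  counts
end

counts = count_frequency(words_from_string("The cat and the hat"))
top = counts.max_by { |_word, n| n }   # pair with the highest count
```

Because hashes keep insertion order, counts.keys is ["the", "cat", "and", "hat"], and top is ["the", 2].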

Ruby Symbols

  • Symbol literals are always unique and do not require a value. Used for Keys in a Hash

Ruby Control Structures

  • if elsif else end
  • _ if _
  • while _ and _ \ end
  • _ while _
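
A minimal sketch of the four forms listed above. The describe method and the sample values are illustrative only.

```ruby
# if / elsif / else / end; the if expression's value is returned.
def describe(n)
  if n < 0
    "negative"
  elsif n.zero?
    "zero"
  else
    "positive"
  end
end

# Statement modifier: body first, condition after.
results = []
results << describe(5) if 5 > 0

# while ... end loop.
square = 2
while square < 1000
  square *= square
end

# Modifier while.
count = 0
count += 1 while count < 3
```

The loop squares 2 repeatedly (4, 16, 256, 65536) and stops once the value reaches 1000 or more, so square ends as 65536.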

Ruby Regular Expressions (built-in)

  • Ruby uses Onigmo RegEx Library (many extensions over traditional Unix RegEx). Onigmo is extension of Oniguruma regular expression engine.
  • RegEx allows Find Pattern Match, Extract String Matching Pattern, Change String Matching Pattern with Substitute
  • Create RegEx object pattern between slash character delimiters (i.e. match x or y /x|y/)
  • Match Operator =~ (!~ for negated) to match against a RegEx and return starting position or nil (i.e. puts 'hello' =~ /lo/ # => 3)
  • Not Match Operator !~
  • Pass split to a RegEx /\s*\|\s*/ to break string into separate tokens whenever encounter vertical bar symbol
  • Failure in Pattern Matching returns nil/false
  • Match String against Pattern /luke/ =~ "xxxlukexxx" # => 3
  • Match Special Character /\*/ (i.e. single asterisk)
  • sub Replace/Substitute First String Matching Pattern and return new string (i.e. "luke".sub(/luke/, "james") )
  • gsub Replace/Substitute All (Global) Strings Matching Pattern and return new string (i.e. "lukelukepeterjohn".gsub(/luke|john/, "james") )
  • gsub! Replace/Substitute All (Global) Strings Matching Pattern modify original string in place (i.e. name = "lukeluke"; name.gsub!(/luke/, "james") )
  • 'a\b\c'.gsub(/\\/, '\\\\\\\\') # => "a\\b\\c"
  • 'a\b\c'.gsub(/\\/) { '\\\\' } # => "a\\b\\c" Note: Use block form of gsub so string substitution analysed once during syntax pass only

  • Escape special characters used in pattern match (i.e. /\(no\)/ =~ "(no)")
  • RegEx /mm\/dd/ and Regexp.new("mm/dd") and %r{mm/dd} are equivalent (used with =~ they return character position of match)

  • RegEx /mm|dd/.match("mlm") (returns MatchData Object encapsulating match info or nil)
  • .pre_match returns string before match (i.e. /mm/.match("aammccdd").pre_match # => "aa")
  • .post_match returns string after match (i.e. /mm/.match("aammccdd").post_match # => "ccdd")
  • [0] returns matched portion (i.e. /mm/.match("aammccdd")[0])
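
The ->match<- arrow notation used in several examples in these notes can be reproduced with a small helper built on MatchData. show_regexp is a hypothetical helper name used here for illustration.

```ruby
# Returns the string with the matched portion wrapped in -> and <-.
def show_regexp(string, pattern)
  match = pattern.match(string)
  return "no match" unless match
  "#{match.pre_match}->#{match[0]}<-#{match.post_match}"
end

simple   = show_regexp("aammccdd", /mm/)
grouped  = show_regexp("the red angry sky", /red (ball|angry) sky/)
no_match = show_regexp("aammccdd", /zz/)
```

Here simple is "aa->mm<-ccdd", grouped is "the ->red angry sky<-" and no_match is "no match".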

  • RegEx Options used to Configure pattern matching of strings
  • Use characters after terminator for /mm\/dd/ (RegEx Object Literal)
  • Use constants as second parameter of constructor for Regexp.new("mm/dd")

  • RegEx Options include:
  • i Case insensitive
  • o Substitute once (any #{...} interpolation in the pattern is performed only the first time it is evaluated)
  • m Multiline mode so . matches any character including new line character (normally ignored)
  • x Extended mode allows making RegEx more readable
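
A short sketch of the options in use, both as literal suffixes and as a constructor constant. The sample strings and variable names are illustrative.

```ruby
# i: case insensitive.
case_insensitive = ("Hello" =~ /hello/i)

# m: let . match a newline as well.
dot_default   = ("a\nb" =~ /a.b/)
dot_multiline = ("a\nb" =~ /a.b/m)

# x: whitespace and comments inside the pattern are ignored.
time_pattern = /
  (\d\d)   # hours
  :
  (\d\d)   # minutes
/x
hours, minutes = time_pattern.match("12:45").captures

# Options as the second constructor parameter.
from_constructor  = Regexp.new("mm/dd", Regexp::IGNORECASE)
constructor_match = (from_constructor =~ "MM/DD")
```

Here case_insensitive, dot_multiline and constructor_match are 0, dot_default is nil, and hours and minutes are "12" and "45".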

  • Anchors
  • \s match whitespace
  • \S match non-whitespace
  • \d match decimal digit
  • \D match non-digit
  • \w match word
  • \W match non-word
  • \R match linebreak (i.e. \r\n)
  • ^ only match when pattern at start of line (i.e. /^xxno/ =~ "no")
  • $ match end of line
  • \A match start of string
  • \z match end of string (\Z matches end of string before \n)
  • \b match only occurrences at word boundaries (at start or end of a word) (i.e. /\be/ =~ "pet eat the" # => 4 )
  • \B match only occurrences not at word boundaries (not at start or end of a word) (i.e. /\Be/ =~ "pet eat the" # => 1 )
  • \& last match
  • \+ last matched group
  • \` string prior to match
  • \' string after match

  • Character classes
  • [abc$s#] match any in this Set of characters (including special)
  • [A-Fa-f][^0-9] matches sequence of characters in current UTF-8 encoding (i.e. first instance where two letters are adjacent)
  • [^0-9] negate matching (i.e. find where is not a letter /[^\w]/ =~ "a a")
  • (?d) default mode, matching on "über.": /(?d)\w+/ # => ü->ber<-. (also /(?d)\W+/ # => ->ü<-ber.)
  • (?a) matches ASCII only /(?a)\w+/ # => ü->ber<-.
  • (?u) matches full Unicode /(?u)\w+/ # => ->über<-. (also note /(?u)\W+/ # => über->.<-)
  • POSIX characters (corresponding to ctype(3) macros) pattern matched (i.e. /[[:space:]]/)
  • \p{...} matches characters with a specific Unicode property or encoding-aware class (i.e. /\p{Digit}/)
  • /./ matches any character except newline

  • Intersection
  • && Intersection of characters matched (i.e. [a-f&&[^de]])

  • Alternation
  • | matches construct before or after it, but has low precedence (use RegEx Grouping to override default precedence)

  • Repetition
  • Warning: /a*/ always matches as all strings contain zero or more a’s
  • Greedy (match as much of string as possible)
  • Lazy (stop after minimum work performed to achieve task)
  • r* matches zero or more occurrences of r
  • r+ matches one or more occurrences of r
  • /ab+/ matches an ‘a’ followed by one or more b’s (not sequence of ab’s)
  • /(an)+/ matches the sequence ‘an’ one or more times
  • r? matches zero or one occurrence of r
  • r{m,n} matches from m to n occurrences of r
  • r{m,} matches at least m occurrences of r
  • r{,n} matches at most n occurrences of r
  • r{m} matches exactly m occurrences of r

  • Grouping
  • Parenthesise stores results of partial pattern matches
  • Note: within pattern sequence \1 is first group match, \2 is second group match, etc
  • Note: outside pattern use $1 and $2 for same purpose
  • Note: using MatchData object returned by match allows use of [0] for the whole match and [1], [2], etc to get subpatterns
  • /red (ball|angry) sky/ # => the ->red angry sky<-
  • Name a group using (?<name>... and then subsequently refer to named group with \k<name>
  • Mix Group Names and Position-based references (i.e. /(?<hour>\d\d):(?<min>\d\d)(..)/ =~ "12:45pm"; "Hour #{hour}, minute #{$2}" # => “Hour 12, minute 45” )
  • \k<name> used to subsequently refer to Named Captures

  • Backslash Substitution
  • puts "luke:schoen".sub(/(\w+):(\w+)/, '\2, \1') # => schoen, luke
  • puts "luke:schoen".sub(/(?<first>\w+):(?<last>\w+)/, '\k<last>, \k<first>') # => schoen, luke

  • Commenting Regex
  • (?# comment) embeds a comment inside the pattern

  • Back References
  • Ignored by using (?:re) (makes ‘re’ a group without generating back references i.e. $2, $7, etc where find this specific group)
date = "12/25/2016"
date =~ %r{(\d+)(/|:)(\d+)(/|:)(\d+)}
[$1,$2,$3,$4,$5] # => ["12", "/", "25", "/", "2016"]
date =~ %r{(\d+)(?:/|:)(\d+)(?:/|:)(\d+)}
[$1,$2,$3] # => ["12", "25", "2016"]
  • Lookahead
  • ?=re Zero-Width Positive Lookahead extension to match pattern #1 that is followed by another pattern #2, but only want to consume pattern #1 in the match (looks forward for context of match without affecting $&)
  • negated form ?!re
"luke, schoen, and another".scan(/[a-z]+(?=,)/) # => ["luke", "schoen"]
  • Lookbehind
  • ?<=re Zero-Width Positive Lookbehind extension (or negated form ?<!re)
res = []; "joeschoen lukeschoen".scan(/(?<=luke|joe)schoen/) { |word| res << [word, Regexp.last_match.offset(0)[0]] }; p res
  • Backreferences using Named Matches
  • \k<name> used to subsequently refer to Named Groups (by matching whatever is matched by the subpattern)
timeframe = "12:15-12:45"
# Backreference using numbering
timeframe =~ /(\d\d):\d\d-\1:\d\d/ # => 0

# Backreference using Name Matching
timeframe =~ /(?<hour>\d\d):\d\d-\k<hour>:\d\d/ # => 0
  • Negative Backreferences numbers (relative numbers that count backward)
  • i.e. match four-letter palindrome (\k<-1> is the most recently closed group, \k<-2> the one before it)
"abba" =~ /(.)(.)\k<-1>\k<-2>/ # => 0
  • Invoke Subpattern
  • \g<name> re-executes the match in the subpattern (DRY). i.e. in below example it returns match when red, green, or blue is the 1st, 3rd, and 5th word
/(?<color>red|green|blue) \w+ \g<color> \w+ \g<color>/ =~ "red sun blue moon green"
  • Recursively invoke a named subpattern in a structured RegEx using \g<name> (the x option keeps it readable)
# brace expression is an open brace, then a sequence of zero or more chars or nested brace expressions, then a closing brace
regex_valid_syntax = /
  \A                            # from start of string
    (?<brace_expression>        # create a group named brace_expression
      {                         # start of the brace_expression
        (                       # zero or more occurrences of either
          [^{}]                 # anything other than braces
        |                       # or
          \g<brace_expression>  # nested brace expression
        )*
      }
    )
  \Z                            # until end of string
/x                              # allow writing understandable expression with indentation
  • Positive Lookahead using Zero-Length Assertions to return overlapping matches
# positive lookahead using zero-length assertions, since matches cannot overlap
evidence = "hackddanger"
guilty_terms = [/danger/i, /hack/i, /ckdd/i]
regex_guilty_terms = /(?=(#{Regexp.union(guilty_terms)}))/
result = [].tap do |arr|
  evidence.scan(regex_guilty_terms) do |x|
    arr << [$1, $~.begin(1)]
    # alternative syntax: arr << [x.first, Regexp.last_match.offset(1)[0]]
  end
end

Ruby Blocks/Closures (similar to Anonymous Functions)

  • Blocks are Closures that remember context they are defined in and uses the context when called
  • Code Blocks are chunks of code between braces { ... } (single line) or do ... end (multi-line) passed around as parameters associated with call to method invocations and callbacks
  • Implement Block after method call/invocation and other parameters. Methods invoke associated block one or more times using yield statement
  • Blocks are an additional implicit parameter passed to the method
  • Variables declared inside block with same name as variables declared beforehand outside block in same scope are the same
  • Avoid name conflicts by declaring block-local variables after semicolon in parameter list
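a = "a"
["b"].each { |char; a| a = char }   # a after the semicolon is block-local
puts a                              # => a (outer a untouched)
["b"].each { |char| a = char }      # no block-local, so outer a is assigned
puts a                              # => b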
  • Parameters passed to a block are local to the block
def owner(name)
  yield(name)
end
owner('luke') { |name| puts "#{name} owns" }
  • Convert Block to an Object by Prefixing last parameter in method definition with ampersand &my_block
  • When method is called Ruby converts the code block named my_block to an object of class Proc and assigns it to the method parameter as a variable
  • Call the call method with parameters (if any) on the Proc object to invoke the code
def pass_in_block(&my_block)
	my_block
end

block_as_object = pass_in_block { |my_param1| puts "Parameter1 is #{my_param1}" }
block_as_object.call(10) # => Parameter1 is 10
  • Example: Pass Multiple Blocks directly into Method
def my_method(condition, large, small)
	condition ? large.call : small.call
end

5.times { |val| my_method(val >= 3, -> { puts "#{val} is large" }, -> { puts "#{val} is small" }) }
  • Example: Pass Multiple Args and Block pointer parameter into Method
proc1 = -> (*args, &block) { puts "args = #{args.inspect}"; block.call }
proc1.call(1, 2, 3) { puts "in block1" }
  • Example: Refactor multiple puts using Lambda
operator = gets
number = Integer(gets)
if operator =~ /^t/
  puts((1..10).collect {|n| n*number }.join(", "))
else
  puts((1..10).collect {|n| n+number }.join(", "))
end

if operator =~ /^t/
  calc = lambda {|n| n*number }
else
  calc = lambda {|n| n+number }
end
puts((1..10).collect(&calc).join(", "))
  • Next, Break, Redo
  • Add to iterations and blocks
  ...
	next if el.nil? 							# skip to next el
	 											# skip el starting with #
	break if el[/end/]							# shortcut first match or nil
	redo if el.gsub!(/`(.*?)`/) { eval($1) }	# retry after substitution
	...

Ruby I/O Routine - Simple Interface

  • I/O methods for Standard Input/Output for use with filters are implemented in Kernel Module (gets, open, print, printf, putc, puts, readline, readlines, test)
  • Output arguments with newline character after each puts
  • Output arguments without newline print
  • Outputs arguments under control of format string (i.e. printf("Name: %s,\nAge: %5.2f", "luke", 35.67) substitutes string and a float with min 5 chars with 2 after decimal point, and embeds a newline)
  • Outputs of puts and print output to standard output by default
  • Input read next line from standard input stream gets (i.e. puts "old" if gets.to_i > 35, converting because gets returns a String)
  • Note: gets returns nil when reaches end of input (i.e. while entry = gets \ print entry \ end)
  • Command-Line variable Arguments accessible using array ARGV (i.e. test.rb: puts ARGV.size, execute: ruby test.rb hi bye # => 2)
  • Files named as Command-Line Arguments are readable as one concatenated input stream using ARGF (it is a stream object, not an array)
  • Append objects to an output IO stream with << (i.e. STDOUT << 34 << " Sydney" << "\n")

Ruby I/O Objects

  • IO single base class handles input/output
  • IO subclasses include: File, BasicSocket
  • IO objects provide bidirectional channel b/w Ruby app and external source

  • File Subclass

.new returns new File object

  • read r / write w / both r+
  • close file ensures buffered data written and resources freed (not guaranteed until garbage collection when exception occurs)
file = File.new("testfile", "r")
file.close

.open invokes block passing opened File as param and closes File when block exits

  • Benefit: closes file automatically and always ensures buffered data written and resources freed even when exception occurs

  • Loops for reading file with block

File.open("testfile", "r") { |file|
  while line = file.gets
    puts line
  end
}
  • Iterators for reading/writing file with block
  • Default line ending \n appears
  • Custom line ending e
File.open("testfile") { |file|
  file.each_line("e") { |line| puts "Reading #{line.dump}" }
} 
File.open("testfile", "w") { |file|
  file.puts "1 + 1 = #{1+1}"
} 
  • Iterators for reading file with block
  • I/O source opened for reading with iterator called once for each line in file and auto closed
IO.foreach("testfile") { |line| puts "Reading #{line.dump}" }
  • Iterators for reading file with block and assignment
  • Assign entire file retrieved to string or array of lines
str = IO.read("testfile")
arr = IO.readlines("testfile")
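
The block, iterator and whole-file forms above can be tried without touching real files by working in a temporary directory. This is a sketch under that assumption; the file name and contents are made up.

```ruby
require 'tmpdir'

lines_seen = []
whole = nil
as_array = nil

# Dir.mktmpdir removes the directory and its contents when the block ends.
Dir.mktmpdir do |dir|
  path = File.join(dir, "testfile")

  # Block form: the file is closed automatically, even on exceptions.
  File.open(path, "w") do |file|
    file.puts "line one"
    file.puts "line two"
  end

  # Iterate line by line; the file is opened and closed for us.
  File.foreach(path) { |line| lines_seen << line.chomp }

  # Read everything as one string, or as an array of lines.
  whole    = File.read(path)
  as_array = File.readlines(path)
end
```

File inherits these class methods from IO, so File.foreach, File.read and File.readlines correspond to the IO calls shown above.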
  • StringIO Objects
  • Read/write strings (not files)
require 'stringio'

ip = StringIO.new("now is\nthe time\nto learn\nRuby!")
op = StringIO.new("", "w")
ip.each_line { |line| op.puts line.reverse }
op.string # => "\nsi won\n\nemit eht\n\nnrael ot\n!ybuR\n"

Ruby Iterators

  • each (with_index) instead of for ... in for iterating over array passing elements to block
["a", "b"].each.with_index { |element, index| puts "Index #{index} is element: #{element}" }
  • map or collect for iterating over array passing elements to block where new array constructed
  • reduce or inject applying accumulator to collection returning final value
  • initial iteration accumulator is set to parameter of inject (i.e. 3)
  • subsequent iterations accumulator is set to value returned by previous block call
  • final value of inject is value returned by block the last time it is called
[2,10,10].inject(3) {|accumulator, element| accumulator*element} # => 600
  • select for filtering values from a collection
  • reject is opposite of select
  • find returns the first element in the collection for which the block is true (nil if there is none)
%w( luke claudia ).each { |name| puts name }
2.times { print "*\n" }
1.upto(3) { |i| print i }
50.step(80, 5) { |i| print i }
arr = 50.step(80, 5).to_a
arr.include? 50
('a'..'f').each { |char| print char }
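
A compact sketch of the collection iterators described above, run against one sample array. Variable names are illustrative.

```ruby
numbers = [1, 2, 3, 4, 5, 6]

doubled   = numbers.map { |n| n * 2 }       # new array of block results
evens     = numbers.select(&:even?)         # keep matching elements
odds      = numbers.reject(&:even?)         # drop matching elements
first_big = numbers.find { |n| n > 3 }      # first matching element
missing   = numbers.find { |n| n > 10 }     # nil when nothing matches

# reduce threads an accumulator through the collection.
total = numbers.reduce(0) { |acc, n| acc + n }
```

Here doubled is [2, 4, 6, 8, 10, 12], evens is [2, 4, 6], odds is [1, 3, 5], first_big is 4, missing is nil and total is 21.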

Ruby Language Extensions

# Requires `gem install facets`
require 'facets/proc'

# Proc invocation sugar (note: extends enumerable)
class Proc
  def in?(enumerable)
    self.bind_to(enumerable).call
  end
end

# Compose some (Enumerable) methods 
any_nils = Proc.new{ any?(&:nil?) }

# Tests
puts any_nils.in?([1, 2, nil])    #-> true
puts any_nils.in?([nil, 2, 3])    #-> true
puts any_nils.in?([1, 2, 3])      #-> false

Ruby Enumerators (external Iterators)

  • Ruby built-in Enumerator class implements external iterators
  • Create Enumerator object calling to_enum method (or enum_for in collections)
  • Iterator Methods using Enumerators
s = "abc"
a = [1, "a"]
h = { earth: "planet", person: "human" }

# Create Enumerators
enum_s = s.each_char
enum_a = a.to_enum
enum_h = h.to_enum

enum_a.next
enum_a.next
enum_h.next
enum_h.next

# Loop terminates after iterating over all Enumerator object values
loop { puts "#{enum_a.next}" }
loop { puts "#{enum_h.next}" }

# Enumerator built-in method with item value and index parameters
result = []
enum_s.each_with_index { |item, index| result << [item, index] }
result # => [["a", 0], ["b", 1], ["c", 2]]

# Create Enumerator explicitly
s.enum_for(:each_char).to_a # => ["a", "b", "c"]
  • Chains of Lazy Enumerators
  • Lazy Enumerator instance returned by calling .lazy on Generator Enumerator to convert it and providing built-in support to filter the infinite sequences (i.e. select, map) generated by an Enumerator.
  • Lazy Enumerator allows calls (i.e. select) to return as it only consumes values on demand (not infinite sequence of values from Generator Enumerator)
  • Lazy Enumerator methods chained each return a new enumerator with specific logic applied to input data collection only when requested
# Helper method for Integer class
def Integer.all
  Enumerator.new do |yielder, n: 0|
    loop { yielder.yield(n += 1) }
  end.lazy
end

# Lazy filter methods as Procs (Closure) that return new Enumerator objects (for readability and reusability)
# Note: Proc called using `-> (arg1, arg2) { ... }` instead of `Proc.new { |params| ... }` or `lambda { |params| ... }` generates error if incorrect number of arguments passed to Proc
# Note: `->` represents Lambda character
# Reference: http://ruby-doc.org/core-2.2.0/Proc.html
multiple_of_three     = -> n { (n % 3).zero? }
contain_value_of_two  = -> n { n.to_s =~ /2/ }

# Retrieve array containing the first 10 multiples of 3 that also contain the digit 2
p Integer \
  .all \
  .select(&multiple_of_three) \
  .select(&contain_value_of_two) \
  .first(10)
  • Blocks to define code to run under Transaction Control
  • Simpler and less error prone approach using Blocks rather than Linear code
class FileList

  # Class method called by block so files manage their own lifecycle
  def self.open_and_process(*args)
    begin
      f = File.open(*args)
      yield f
    rescue
      print "Error opening and processing file"
    else
      print "File opened and processed without error"
    ensure
      f.close if f && !f.closed?   # f is nil when File.open itself failed
    end
  end
end

contents = []
FileList.open_and_process("file_list/testfile", "r") { |file| while line = file.gets; contents << line; end }

Ruby Object-Oriented System Design of Classes

  • O-O Class Designs representing External Things:
    • Identify resources to deal with using Class entities
    • Represent each captured input data reading as a generated Class Instance Row
    • Represent all captured input data as collection of all Class Instance Objects
    • Decide Internal State (Class Instance Variables) and use initialize method to setup Object in usable state
    • Decide External State of Object (Attributes) appearance from outside Class to users (exposed using Attributes aka Getter/Setter Methods) that use the initialized state
    • Decide on other actions for Class (regular Class Instance Methods) that use the initialized state
  • O-O Class Design Pattern for representing Internal Things:

e.g.

  • Problem Statement A: we want to represent data for each Person on Earth
    • What does the representing? PeopleOnEarth class
  • Problem Statement B: we want to 1) consolidate and 2) summarise data feed inputs from CSV files
    • What consolidates the data? CsvReader class
    • What summarises the data? CsvReader class
  • Problem Statement C: we want to parse data from CSV
    • What does parsing data from CSV? Ruby CSV Library
    • Note: Create a PeopleOnEarth Object by extracting data from columns of each CSV row we iterate over and append the PeopleOnEarth Object Instance to a Class Instance Variable that was created in the initialize method
  • O-O File Organisation:
    • Organise source code into multiple Class files and a main driver program file
    • Benefits: Separate files instead of monolithic in same file to ease automated tests and ease class reuse
  • O-O Class Access Control

    • Control access to Class Instance Object Methods to control changes to its state
    • Expose Logical Class Interface and hide details of implementation (prevent usage that causes tight coupling)
    • Do not expose methods that may cause invalid Object state
  • Create Class Definition
class PeopleOnEarth

  # Initialize Object. Transfer and store parameter information in Local Instance Variables when constructed to set Object State before initialize returns 
  # Store unique State within Class Instance Object in distinct set of Local Instance Variables @

  def initialize(name, age)
    @name = name
    
    # Accept any object for age parameter that converts to a Float otherwise raise Exception
    @age = Float(age)
  end

  # Stored Instance Variables are available to Instance Methods of Class Instance Objects
  def to_s
    "Name: #{@name}, Age: #{@age}"
  end
end
  • Create Class Instance Object of class PeopleOnEarth and Set Object’s State with distinct identity
  • Ruby allocates memory to hold uninitialised object
  • Ruby calls the object’s initialize method passing parameters given to .new a_person = PeopleOnEarth.new("luke", 35)
  • Pass a Class Instance Object to puts method that calls to_s to get string representation
  • Without Overriding to_s Instance Method puts a_person # => #<PeopleOnEarth:0x007f8b3285fe20> p a_person # => #<PeopleOnEarth:0x007fc4db892388 @name="luke", @age=35.0>
  • With Override of to_s Instance Method puts a_person # => Name: luke, Age: 35.0
  • Override default implementation of to_s to improve rendered formatting of objects
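  • Example (a minimal sketch reusing the PeopleOnEarth class above): puts calls to_s, while p calls inspect, so only puts is affected by the override. The object address in the inspect output varies per run and is elided here.

```ruby
class PeopleOnEarth
  def initialize(name, age)
    @name = name
    @age = Float(age)
  end

  # Override the default to_s inherited from Object
  def to_s
    "Name: #{@name}, Age: #{@age}"
  end
end

a_person = PeopleOnEarth.new("luke", 35)
puts a_person   # puts calls to_s   => Name: luke, Age: 35.0
p a_person      # p calls inspect   => #<PeopleOnEarth:0x... @name="luke", @age=35.0>
```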

Ruby Methods - Object Attributes to Access/Manipulate Object State

  • Instance Variable Internal State is Private to a Class Instance Object
  • Attributes of an Object are Externally visible (Public)
  • Default values for method arguments def my_method(arg1="luke", arg2=arg1)
  • Value returned by method is the last statement executed by it
  • return statement exits the current method, even from inside a loop (use break to exit only the loop)
  • Passing Hash as argument to a method does not require braces (i.e. my_method(:arg1, arg2_hash_key1: "hi", arg2_hash_key2: "bye")) if it follows the normal arguments and precedes any splat and block arguments
  • Parallel assignment to collect multiple return values (i.e. return a, b inside the method, then a, b = my_method at the call site)
  • Splat an Argument to capture multiple arguments (in Array) assigned to single parameter def my_method(arg1, *other_args)
  • Useful when a subclass method does not use the arguments itself but passes them all on to the superclass with super
class Child < Parent
  def inherited_method(*not_used)
    # local processing
    super
  end
end
  • Double Splat ** to send/receive as Hash (Single Splat * only receives as Array): collects extra keyword arguments passed as options to a method into a hash parameter
def my_method(arg1, arg2, arg3: 0, **arg_other1)
  p [arg1, arg2, arg3, arg_other1]
end
options = { arg3: 100, arg_other3: "c", arg_other2: "b", arg_other1: "a" }
# ** is required on Ruby 3+, where a positional Hash is no longer converted to keyword arguments
my_method(:arg2, :arg1, **options) # => [:arg2, :arg1, 100, {:arg_other3=>"c", :arg_other2=>"b", :arg_other1=>"a"}]
  • Expand Collections in Method Calls using Splat
  def my_method(a, b, c, d, e, f)
    p "#{a} #{b} #{c} #{d} #{e} #{f}"
  end
  # *(2..4) expands the Range itself; *[2..4] would pass a single Range argument
  my_method("a", *[1,2], *(2..4)) # => "a 1 2 2 3 4"

Ruby Methods - Object Accessor Method Common Idiom and Shortcut (Getter)

  • Accessor Methods Long Way to access and return values of Instance Variables
def name
  @name
end

def age
  @age
end

a_person = PeopleOnEarth.new("luke", 35)
a_person.name
a_person.age
  • Accessor Methods Shortcut attr_reader
    • Ruby decouples Class Instance Variables and Accessor Methods
    • attr_reader creates Accessor Methods so variables do not need to be declared
    • Uses Symbols to conveniently reference the name: the Symbol :name names the attribute, while calling the method name returns its value
attr_reader :name, :age

Ruby Methods - Object Attributes (Setter Methods)

  • attr_accessor - Both Read and Write Access
  • attr_reader - Read-only Access
  • attr_writer - Write-only Access

  • Create Ruby method with name ending with = as target for assigning to Class Object Instance Variable
...

attr_accessor :age

def age=(new_age)
  @age = new_age
end

...
# invoke Setter Method in the PeopleOnEarth Class Object Instance passing new age as argument
a_person.age = a_person.age + 1
puts "New Age: #{a_person.age}" 
  • Assign to Instance Variable using Method Chaining
def <<(new_rating)
  @new_rating = new_rating
  self # return the receiver to allow method chaining
end
  • Instance Method accepts Array where last element contains value assigned when called
def []=(*params)
  @group[params.pop] = params
end
c = MyClass.new
c[1,2] = :group_b
puts c # => { :group_b => [1,2] }
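  • Example (a self-contained sketch of the two fragments above; the Groups class name and the @ratings and @group variables are hypothetical, chosen only for illustration):

```ruby
class Groups
  attr_reader :ratings, :group

  def initialize
    @ratings = []
    @group = {}
  end

  # Returning self lets calls be chained: g << 3 << 5
  def <<(new_rating)
    @ratings << new_rating
    self
  end

  # For g[1, 2] = :group_b, params is [1, 2, :group_b];
  # the assigned value is always the last element
  def []=(*params)
    value = params.pop
    @group[value] = params
  end
end

g = Groups.new
g << 3 << 5
g[1, 2] = :group_b
p g.ratings           # => [3, 5]
p g.group[:group_b]   # => [1, 2]
```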

Ruby Methods - Object Virtual Attributes (Setter Methods)

  • Virtual Attribute Methods create Virtual Instance Variables (hiding implementation and difference b/w Class Instance Variables and Calculated Values so outside class they appear like ordinary attribute but having no internal Class Instance Variable)
  • Benefit: Uniform Access Principle - Internal Class implementation changes do not impact and require changes to code using the Class
  • Note: Floating-point numbers do not always have exact internal representation and calling Integer method on inexact value truncates rounding down. Overcome by adding 0.5 before calling Integer method to round up instead.
  • Note: Use BigDecimal for financial calculations
def age_in_months
  Integer(age*12 + 0.5)
end

...

a_person.age_in_months
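  • Example (a minimal sketch; the age_in_months= writer is an assumed addition to show that a virtual attribute can also be assigned to, mapping the value back onto @age):

```ruby
class PeopleOnEarth
  attr_accessor :age

  def initialize(name, age)
    @name = name
    @age = Float(age)
  end

  # Virtual attribute reader: calculated, no @age_in_months exists
  def age_in_months
    Integer(age * 12 + 0.5)
  end

  # Virtual attribute writer: stores the value as years in @age
  def age_in_months=(months)
    @age = months / 12.0
  end
end

a_person = PeopleOnEarth.new("luke", 35)
a_person.age_in_months         # => 420
a_person.age_in_months = 426
a_person.age                   # => 35.5
```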

External Dependencies

# load the CSV library dependency (the name must be quoted)
require 'csv'

# load custom Class dependency (from file in same directory)
require_relative 'people_on_earth'
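
A rough sketch of how the CsvReader from the design notes earlier might use the CSV library. The column names name and age, the read_csv_text and average_age method names, and parsing from a String rather than a file are all assumptions made for illustration:

```ruby
require 'csv'

class PeopleOnEarth
  attr_reader :name, :age

  def initialize(name, age)
    @name = name
    @age = Float(age)
  end
end

class CsvReader
  attr_reader :people

  def initialize
    @people = []   # instance variable that collects the parsed objects
  end

  # Consolidate: parse CSV text and append one object per row
  def read_csv_text(csv_text)
    CSV.parse(csv_text, headers: true) do |row|
      @people << PeopleOnEarth.new(row["name"], row["age"])
    end
  end

  # Summarise: here, simply the average age
  def average_age
    return 0.0 if @people.empty?
    @people.sum(&:age) / @people.size
  end
end

reader = CsvReader.new
reader.read_csv_text("name,age\nluke,35\nleia,33\n")
reader.people.size   # => 2
reader.average_age   # => 34.0
```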

Access Control

  • Note: initialize method is always Private
  • Note: Methods are Public by default
  • Note: Access control determined dynamically as program runs (not statically)

  • Public Methods - No access control
  • Protected Methods - Only invokable by any Class Instance Object of defining class and associated subclasses
  • Private Methods - Only callable in context of current calling object receiver (self)

  • Approach #1 - Set default access control of subsequently defined methods
class MyClass
  def method1
  end
  
protected # or private/public
  def method2
  end

end 
  • Approach #2 - Set default access control of named methods listed as arguments to access control function
class MyClass

  def method1
  end
  
  def method2
  end
  
  def method3
  end
  
  def method4
  end
  
  # public, protected, or private
  public    :method1, :method2
  protected :method3
  private   :method4
end
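
A minimal sketch of how the three access levels behave at runtime. The Account class and its method names are hypothetical:

```ruby
class Account
  def initialize(balance)
    @balance = balance
  end

  # Public: may call a protected method on another Account
  def richer_than?(other)
    balance > other.balance
  end

  # Public: calls a private method without an explicit receiver
  def report
    "Balance: #{formatted}"
  end

  protected

  def balance
    @balance
  end

  private

  def formatted
    format("%.2f", @balance)
  end
end

a = Account.new(100)
b = Account.new(50)
a.richer_than?(b)   # => true
a.report            # => "Balance: 100.00"

begin
  a.balance         # protected: not callable from outside the class
rescue NoMethodError => e
  puts e.message
end
```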

Ruby Sharing Functionality

  • Avoid duplicating functionality (DRY codebase)
  • Generic functionality to inject across different classes
  • Ruby is Single-Inheritance language as a Ruby class has only one direct parent
  • Ruby provides controlled multiple-inheritance-like capability when Ruby classes include functionality of any number of Mixins

  • Class-Level Inheritance
  • Disadvantage is tight coupling
  • Inheritance allows creation of subclasses (child of parent superclass)
  • Child inherits capabilities of parent class including methods
  • Calling super sends a message to the parent of the current object requesting to invoke method of same name and passing parameter
  • Object-Oriented programming requires subclasses to ensure inherited initialisation gets run with a call to super in the initialize method of the subclass
class Parent
  def public_parent_method
    puts "public_parent_method of #{self}"
  end

  protected
  
    def protected_parent_method
      puts "protected_parent_method of #{self}"
    end

  private
  
    def private_parent_method
      puts "private_parent_method of #{self}"
    end
end

class Child < Parent
end

c = Child.new
c.public_parent_method
Parent.superclass # => Object
Child.superclass  # => Parent
  • Mixins (using Modules) and Metaprogramming
  • Modules are not a class and do not have instances
  • include references a Module. require first when Module in separate file
  • include Module in class definitions to make Module Instance Methods available to class
  • module_function <module_method_name> to access Module instance methods directly from outside module
  • Mixins that require their own state to be stored should be written as a class
  • Design with flexible Mixins/Metaprogramming uses Composition and avoids tight coupling of Class-level Inheritance
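
A minimal sketch of include and module_function. The Greetable and Utils modules and the Person and Robot classes are hypothetical names used only for illustration:

```ruby
module Greetable
  # Instance method: available to every class that includes the module.
  # It relies on the host class providing a name method.
  def greet
    "Hi, I am #{name}"
  end
end

module Utils
  def shout(text)
    text.upcase + "!"
  end
  # Makes shout callable directly on the module: Utils.shout
  module_function :shout
end

class Person
  include Greetable
  attr_reader :name

  def initialize(name)
    @name = name
  end
end

class Robot
  include Greetable

  def name
    "RX-1"
  end
end

Person.new("luke").greet               # => "Hi, I am luke"
Robot.new.greet                        # => "Hi, I am RX-1"
Utils.shout("hello")                   # => "HELLO!"
Person.ancestors.include?(Greetable)   # => true
```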

  • Pub/Sub Observable O-O Design Pattern
  • Reference https://ruby-doc.org/stdlib-1.9.3/libdoc/observer/rdoc/Observable.html

  • Clone instance

  • clone shallow copies an object instance
  • does copy instance variables of object
  • does copy tainted state of obj
  • does copy frozen state of the object (unlike dup)
  • does copy frozen object instance variable state
  • does copy any associated singleton class
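  • does NOT copy the objects that instance variables reference (copy and original share them)
  • copy is shallow, so in-place changes to a shared referenced object are visible from the original, but reassigning an instance variable on the copy is not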

  • Duplicate instance

  • dup shallow copies an obj (duplicates just the state of an object)
  • does copy instance variables of obj
  • does copy tainted state of obj
  • does NOT copy frozen object state
  • does copy frozen attributes of instance variables
  • copy is shallow, so in-place changes to a shared referenced object are visible from the original, but reassigning an instance variable on the copy is not
  • does not copy any associated singleton class
  • does NOT copy the objects that instance variables reference (copy and original share them)
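
A minimal sketch contrasting clone and dup on a frozen object that has a singleton method. The Doc class is a hypothetical name:

```ruby
class Doc
  attr_reader :tags

  def initialize
    @tags = ["a"]
  end
end

original = Doc.new
def original.hello   # singleton method on this one object
  "hello"
end
original.freeze

cloned = original.clone
duped  = original.dup

cloned.frozen?               # => true  (clone keeps the frozen state)
duped.frozen?                # => false (dup does not)
cloned.respond_to?(:hello)   # => true  (clone keeps the singleton class)
duped.respond_to?(:hello)    # => false

# Both copies are shallow: @tags is the same Array in all three objects
duped.tags << "b"
original.tags                # => ["a", "b"]
```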

Ruby Shell Commands (Command Expansion)

  • Enclose string in Backquotes (aka Backticks) or %x Delimiters (i.e. i = 1 id=#{i} %x{ls -al}). By default this causes the operating system to execute the enclosed command, and the value returned is the command's standard output with newlines not stripped

Ruby Exceptions (Begin, Raise, Rescue, Retry, Ensure, End)

  • Exceptions package error info into an Exception object that propagates up the calling stack until it finds code that handles that type of exception
  • Exceptions hierarchy of built-in child exception classes may be used (see page 146 text)
  • Custom exceptions by subclassing built-in class
  • Exception handling by enclosing code that could raise exception in begin/end block with rescue clauses handling exception types
@toggle = 0

begin
  @toggle += 1
  p "Toggle is: #{@toggle}"
  # Multiple exceptions to catch in process
  # Note: `@toggle == 1 || 2` would always be truthy, so compare both values explicitly
  if @toggle == 1 || @toggle == 2 then
    p "Processing #{@toggle}"
    raise Exception
  elsif @toggle == 3 then
    p "Processing #{@toggle}"
    raise StandardError
  elsif @toggle == 4 then
    p "Processing #{@toggle}"
    raise "Bad error 4"
  elsif @toggle == 5 then
    p "Processing #{@toggle}"
    # Caller method produces stack traces and removes two routines by passing subset of call stack to new exception
    raise ArgumentError, "Bad number 5", caller[1..-1]
  elsif @toggle > 5 then p "Done #{@toggle}"
  else p "Done #{@toggle}"
  end
# Ruby compares raised exception against each `rescue` clause in turn 
# until match is found using `parameter === $!`	(or superclass) since exceptions are classes
# that are kinds of Module that has a `===` method returning true if match is descendant. 
# Default match is StandardError to handle an error
rescue StandardError => err # Local variable declared to receive matched exception (instead of $!)
  p "Error: #{err.message}" # Interpolate; String + Exception raises TypeError
rescue Exception
  p "Exception is #{$!}" # Reference to Exception object in global variable $!
  # Add code here to assist in filtering exceptions
  # Pass exceptions unable to handle to higher-levels in error processing hierarchy
  if @toggle == 0 then
    p "Found 0"
    raise if !@toggle.nil? # Intercept and re-raise current exception (or RuntimeError if no current exception)
  elsif @toggle == 1 || @toggle == 2 then
    @toggle = rand(2..10)
    retry
  end
# Guaranteed to run code
ensure
  p "Finished"
end
  • System Errors
  • Raised when call to OS returns error code and wrapped in Ruby object i.e. Errno:: (list of codes on Unix man errno)
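
A minimal sketch of a custom exception created by subclassing a built-in class, combined with retry and ensure. The RetryableError class and its attempt attribute are hypothetical:

```ruby
# Hypothetical custom exception carrying extra information
class RetryableError < StandardError
  attr_reader :attempt

  def initialize(message, attempt)
    super(message)
    @attempt = attempt
  end
end

attempts = 0
log = []

begin
  attempts += 1
  # Fail on the first two attempts, succeed on the third
  raise RetryableError.new("flaky call failed", attempts) if attempts < 3
  log << "succeeded on attempt #{attempts}"
rescue RetryableError => err
  log << "#{err.class}: #{err.message} (attempt #{err.attempt})"
  retry    # re-run the begin block from the top
ensure
  log << "finished"
end

puts log
```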

Ruby Exceptions (Catch, Throw)

  • Escape nested code during normal processing
def prompt_user(prompt)
  p prompt
  res = readline.chomp
  throw :requested_exit if res == "q"
  res
end

catch :requested_exit do
  name = prompt_user("Name: ")
  age = prompt_user("Age: ")
end

Ruby Network

  • Socket Library (TCP, UDP, SOCKS, Unix Sockets, writing servers)

  • Example: Connect as a client to a Local Web Server using HTTP

require 'socket'
client = TCPSocket.open('127.0.0.1', 'www')
client.send("OPTIONS /~LS/ HTTP/1.0\n\n", 0) # 0 is standard packet
puts client.readlines
client.close
  • Example: Read HTML and list Images at a Webpage using library open-uri instead of net/http
  • URI.open method (plain open on Ruby older than 3.0) recognizes http:// and ftp:// URLs in filename
  • handles redirects automatically (except https)
  • error handling
  • reports “Not Found” errors (i.e. 404)
require 'open-uri'
URI.open('https://pragprog.com') do |f| # Kernel#open no longer accepts URLs on Ruby 3+
	puts f.read.scan(/<img alt=".*?" src="(.*?)"/m).uniq[0,3]
end
  • Example: Parse HTML using library open-uri leveraging RegEx
require 'open-uri'
page = URI.open('http://www.google.com').read # Kernel#open no longer accepts URLs on Ruby 3+
if page =~ %r{<title>(.*?)</title>}m        # RegEx with forward slash
  puts "Title is #{$1.inspect}"
end
  • Example: Parse HTML using Nokogiri
require 'open-uri'
require 'nokogiri'
doc = Nokogiri::HTML(URI.open("http://www.google.com/")) # Kernel#open no longer accepts URLs on Ruby 3+
puts "Page title is " + doc.xpath("//title").inner_html

Ruby Fibers, Multi-Threading, Multi-Processes

  • Simple asymmetric (plain old) Coroutine mechanism with Fibers
  • https://ruby-doc.org/core-2.2.0/Fiber.html
  • Programmer schedules (instead of VM) to create a code block that may be paused and resumed like threads
  • Lightweight primitive for concurrency
  • Controlling app generates Fibers, suspends them at points and passes control back to resume
  • Previously we would use regular block yield
  • Allows writing programs that appear to use threads without the complexity of threading
  • A Fiber that has transferred control to another Fiber cannot yield or be resumed until control is transferred back

  • Example: Read File, Find Words, and Count Words
words = Fiber.new do
  File.foreach("testfile") do |line|
    line.scan(/\w+/) do |word| 
      Fiber.yield word.downcase
    end 
  end
  nil
end

counts = Hash.new(0)
while word = words.resume
  counts[word] += 1
end
counts.keys.sort.each {|k| print "#{k}:#{counts[k]} "}
  • Full Fibers mechanism by extending Fiber class with Fiber Library
  • Co-routine control structure provides full symmetrical co-routines for decoupling
  • Control when scheduled. Initial Fiber state is suspended. Runs block when resumed until block finishes or hits a Fiber.yield and yields back to code that resumed it.
  • Fiber objects given instance methods transfer (allows transfer control to other fibers) and alive? method
  • Fiber class given current singleton method
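
A minimal sketch of the Fiber lifecycle described above (created suspended, run on resume, paused at Fiber.yield). The steps array and the symbols passed around are illustrative only:

```ruby
require 'fiber'   # needed for alive? on older Rubies; harmless on newer ones

steps = []
fiber = Fiber.new do
  steps << :first
  received = Fiber.yield(:paused)   # suspend; :paused goes to the resumer
  steps << received                 # value passed to the second resume
  :done                             # block value is returned by the last resume
end

fiber.alive?                     # => true (created suspended; nothing has run yet)
first  = fiber.resume            # => :paused
second = fiber.resume(:second)   # => :done
fiber.alive?                     # => false
steps                            # => [:first, :second]
```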

  • (DEPRECATED) Continuations Library (like threads)
  • continuation records state of app (current binding) between requests and resumes from it in future
  • continuation objects generated by callcc method of the Continuation Library
  • continuation objects hold return address and execution context allowing non-local return to end of callcc block from elsewhere in app

Multi-threading to split tasks in same program (decouple execution)

  • Risk: Ruby extension libraries are mostly not thread safe (since written for old threading model)
  • Mitigation: Ruby compromises by using native OS threads using only a single thread at a time (never runs Ruby code truly concurrently)
  • Possible: Thread running some Ruby code on say Thread #1 while I/O on another Thread #2
  • Note: Post-Ruby 1.9 threading performed by the operating system and multiple processors (and multi-cores of single processor)
  • Note: Pre-Ruby 1.9 known as Green Threads (threads switched within interpreter)

  • Thread.new(param) { |val| ... } is passed a param and given code block to run in new thread it creates.
  • Pass param to the block as val so that it is a local variable of the thread, since we do not want other threads to use the same variable
  • print "#{val}\n" syntax should be used inside thread block (instead of puts to prevent interleaving)
  • Call Thread#join to block calling thread until requested thread completes and before program terminates (since all threads are otherwise killed regardless of state when program terminates)
  • Thread timeout param to Thread#join to not block forever; it returns nil if the timeout expires before thread terminates
  • Call Thread#value to block calling thread until requested thread returns value of last statement executed by it
  • Call Thread.current to show current thread
  • Call Thread.list to list all thread objects
  • Call Thread#status or Thread#alive? to show thread status
  • Call Thread#priority= to change thread priority
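
A minimal sketch of the calls listed above. The :input key and the multiplication are arbitrary choices for illustration:

```ruby
threads = [1, 2, 3].map do |n|
  # n is passed in as a parameter so each thread gets its own local val
  Thread.new(n) do |val|
    Thread.current[:input] = val   # thread-local value, readable from outside
    val * 10                       # last expression becomes the thread's value
  end
end

threads.each(&:join)              # wait for every thread to finish
threads.map(&:value)              # => [10, 20, 30]
threads.map { |t| t[:input] }     # => [1, 2, 3]
threads.map(&:alive?)             # => [false, false, false]
```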

  • Control Thread Scheduler (Advanced)
  • Call Thread#run to run specific thread
  • Call Thread.stop to stop current thread
  • Call Thread.pass to deschedule current thread allowing others to run

  • Share local thread variables with other threads by treating Thread object as a hash and access variables by name
    • Write local variables by indexing current thread object for sharing Thread.current[:myvar] = myvar_value
    • Race condition possible if multiple threads set this variable
    • Race condition avoided by synchronising access to shared resources (i.e. variables being used)
    • Joining to thread that has raised an exception raises the exception in the thread performing the joining
  • Threads and Exceptions
  • abort_on_exception flag (disabled default) determines how deal with thread raising an unhandled exception. Default is to kill thread but without calling join the exception is not logged
  • use $DEBUG flag of interpreter (with -d) to kill main thread on unhandled exceptions

  • Mutex (prevent Race Conditions)
  • Create synchronised regions of code that only one thread may enter at a time (to prevent race conditions)
  • Analogy - Pass to control access queue for threads trying to access resource. Mutex locks so only one thread at a time.
  • synchronize provides shortcut by automatically locking mutex, running block, and unlocking mutex. Mutex unlocks even if exception thrown when locked
  • try_lock claims the mutex only if it is currently unlocked and returns false otherwise, so the current thread is not suspended waiting for it (i.e. print a status message instead)
  • sleep to temporarily unlock a mutex that you are holding

  • Pattern #1
mutex = Mutex.new
...
mutex.lock
...
mutex.unlock
  • Pattern #1 - Shortcut using synchronize
mutex = Mutex.new
...
mutex.synchronize { ... }
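
A concrete sketch of the synchronize pattern, using a shared counter as the resource. The thread and iteration counts are arbitrary:

```ruby
counter = 0
mutex = Mutex.new

threads = 4.times.map do
  Thread.new do
    1000.times do
      # Only one thread at a time may run this block
      mutex.synchronize { counter += 1 }
    end
  end
end
threads.each(&:join)

counter                     # => 4000
got_lock = mutex.try_lock   # => true (mutex was unlocked, so it is now held)
mutex.locked?               # => true
mutex.unlock
```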

Multi-processes spawning and management to split tasks across different programs

  • Multi-core CPU processing
  • Run separate process not written in Ruby

  • New Process Spawning
  • system approach runs a command in a subprocess, but the command's output goes to the program's own output echo 'system("ls")' | ruby
  • Backticks approach executes command in subprocess and captures its standard output
  • IO.popen method runs a command as a subprocess and connects that subprocess's STDIN and STDOUT to a Ruby IO object: write to the IO object to send to the subprocess's STDIN, and read from it to receive its STDOUT. The PID of the process may be obtained with IO#pid method
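
A minimal sketch of the three approaches, assuming a Unix-like system where the echo and cat commands are available:

```ruby
# system: output goes to this program's STDOUT; returns true on success
ok = system("echo from-system")

# Backticks: capture the subprocess's standard output as a String
captured = `echo from-backticks`   # => "from-backticks\n"

# IO.popen with "r+": write to the child's STDIN, read its STDOUT
child_pid = nil
piped = IO.popen("cat", "r+") do |io|
  child_pid = io.pid       # PID of the subprocess
  io.write("from-popen")
  io.close_write           # send EOF so cat finishes
  io.read
end
piped                      # => "from-popen"
```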

Performance Testing

  • Ruby Profiler require 'profiler' or -r profile

Ruby Gems

  • RubyGems can be libraries for use in app code.
  • RubyGems can install utility programs for invoking from command line (or wrappers around libraries in the gem) but they are not versioned
  • Show where Ruby is installed
  • ruby -e 'puts $:' prints Ruby's load path ($LOAD_PATH), which reveals where Ruby and its libraries are installed

  • Find Gems
  • gem RubyGems tool is built in to Ruby so gems may be used as libraries
  • gem query --details --remote --name-matches <gem_name> to search remote central gem repo for a Gem matching RegEx
  • gem list --details --remote --all <gem_name> to show list of available versions of gem
  • gem list shows gems on local box

  • RubyGems RDoc Documentation (gems, dependencies, descriptions)
  • gem environment gemdir to show documentation main directory, documentation is located under subdirectory /doc
  • gem server to run server with docs, and go to http://localhost:8808/

  • Use specific version of gem using multiple predicates
  • gem 'builder', '> 1.0', '< 1.5.6'; require 'builder' # major.minor.patch_level
  • ~> is boxed (pessimistic) version operator for version >= specified and < the version obtained by dropping the last segment and increasing the one before it by 1 (i.e. '~> 1.5.6' allows >= 1.5.6 and < 1.6)
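
A small sketch that checks these version predicates with the Gem::Requirement and Gem::Version classes that ship with RubyGems. The version numbers are arbitrary:

```ruby
require 'rubygems'   # already loaded by default in modern Ruby

boxed = Gem::Requirement.new('~> 1.5.6')
boxed.satisfied_by?(Gem::Version.new('1.5.9'))   # => true
boxed.satisfied_by?(Gem::Version.new('1.6.0'))   # => false

range = Gem::Requirement.new('> 1.0', '< 1.5.6')
range.satisfied_by?(Gem::Version.new('1.5.0'))   # => true
range.satisfied_by?(Gem::Version.new('1.5.6'))   # => false
```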

Rake

  • Automation tool for task management
  • Package code as a Rake “task” (code for rake to execute)
  • Rake searches for a Rakefile containing definitions for tasks to run
  • desc rake method provides single line of documentation for the task
  • task rake method defines a rake task, named by its parameter (a symbol), that can be executed from the command line
  • “Compose” a task using Rake that depends on other tasks to be run first (pass the task method a Ruby hash whose key is the task name and whose value is an array of prerequisite tasks)
  • Rakefiles may contain ruby methods to eliminate duplication i.e. def ... end
  • Rake task named default is run by rake with no parameters
  • rake -T - Find tasks implemented by Rakefile with a description
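
A minimal Rakefile sketch of the points above. The task names, the RUN_LOG constant and the record helper are hypothetical placeholders, and the task bodies do no real work:

```ruby
require 'rake'   # only needed when loaded as a plain script; the rake command loads it

RUN_LOG = []

# Plain Ruby method shared by the tasks to avoid duplication
def record(step)
  RUN_LOG << step
end

desc "Compile the project (placeholder)"
task :build do
  record(:build)
end

desc "Run the tests (placeholder)"
task :test do
  record(:test)
end

# Composed task: the hash maps the task name to its prerequisites
desc "Build, then test"
task :default => [:build, :test] do
  record(:default)
end

# From the command line: `rake` runs default, `rake -T` lists described tasks
```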

  • TODO - try use Rake to rebuild a file when another changes, and to package gems

Sake library

  • Allows Rake tasks to be available regardless of current directory

Thor library

  • Write Ruby command-line tools (CLT)

Juwelier (prev Jeweler) & Bundler library

  • Juwelier library creates new project skeleton using below layout guidelines and includes Rake tasks to help create and manage project as a gem
  • Bundler utility manages gems used by any Ruby app

Ruby Extension Creation

Ruby Namespaces (partitioning names)

  • Preventing naming clashes (using Classes and Modules to partition names in programs)
  • Organising source files
  • :: Double Colon is Ruby Namespace resolution operator (i.e. <class_or_module>::<constant_in_class_or_module>)
  • Note: Class and Module names are just Constants

Ruby Source File (organisation) & Package Development (by package developers)

  • Split source code into separate part and files so automated tests can load source code files without the program itself running
  • Store source code at specific locations in the file system
  • Ruby conventions of the RubyGems system suggest:
  • Split source code into one CLI file and the rest in various library files
main/             # top-level module (prevents polluting top-level Ruby namespace with our class names)
  bin/            # cli
    main
  doc/            # documentation
  ext/            # C-language extensions
  lib/            # lib files (comprises only classes that are used by the cli or other users)
    main/
      options.rb  # i.e. Main::Options is self-contained with minimal dependencies and testable
  test/           # test files
GemSpec           # info about project for RubyGems (lowercase gem name, x.y.z version)
LICENSE           # license distributed under
README            # what it does

Ruby Distribution and Installation of Code (GemSpec for RubyGems)

  • RubyGems package management system (Gems) - dependency and installation management
  • GemSpec file is information for RubyGems about project that isn’t in directory structure
  • Create packaged .gem file for distribution when GemSpec file is complete gem build ___.gemspec
  • Install the gem sudo gem install pkg/___-x.y.z.gem
  • Show details about installed gem gem list ___ -d
  • Share gem publicly from RubyGems main public server repo https://rubygems.org/
  • Push gem to RubyGems gem push ___-x.y.z.gem
  • Search for gem on RubyGems gem search -r ___
  • Install remote RubyGem gem install ___
  • Show details about all installed gems gem list --details
  • Avoid other users having to use same flags as you to run code
  • Avoid other users having to use library files located in random directories (not a standard Ruby location)

  • Want standard installation structure
  • Installation script distributed with code that copies app components to appropriate directories of target system

Ruby Encodings

  • Allow code to work overseas where different character encodings are used so Ruby knows how to interpret characters in the file

  • Notify Ruby when a source file is not written in 7-bit ASCII (encoding is an attribute of the source file)
  • If first line of file is a comment (or second line if shebang on first line) then Ruby scans for string coding: (i.e. # encoding: utf-8)
  • If start of file has UTF-8 Byte Order Mark (BOM) \xEF\xBB\xBF it assumes file is UTF-8
  • Return encoding of current source file with __ENCODING__
  • Show encoding of a string literal in code i.e. str = "cat"; str.encoding.name # => "UTF-8"
  • Create arbitrary UTF-8 encoded string literals using \u escape (i.e. a = "\u{3c0}" # => π; a.bytes.to_a; a.bytesize)
  • Data streams containing binary data using ASCII-8BIT (dangerous encoding for source files)
  • Convert a string between different encodings and supply placeholder char to use when no direct translation possible
  • Convert from binary data without changing byte contents of string using say "\xc3\xa9".force_encoding("utf-8")
  • Multiple encodings
# encoding: utf-8
utf_string = "olé"
iso_string = utf_string.encode("iso-8859-1", :undef => :replace, :replace => "???")
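
A short sketch of the \u escape, force_encoding and encode calls mentioned above. Escapes are used instead of literal non-ASCII characters so the snippet does not depend on the source file's encoding:

```ruby
pi = "\u{3c0}"        # Greek small letter pi
pi.encoding.name      # => "UTF-8"
pi.bytes              # => [207, 128]
pi.bytesize           # => 2
pi.length             # => 1

raw = "\xc3\xa9".b                          # two bytes tagged as binary data
e_acute = raw.dup.force_encoding("utf-8")   # same bytes, now read as UTF-8
e_acute == "\u00e9"                         # => true

iso = "ol\u00e9".encode("iso-8859-1")
iso.bytesize          # => 3 (one byte per character in ISO-8859-1)

# pi has no ISO-8859-1 equivalent, so the placeholder is used instead
placeholder = pi.encode("iso-8859-1", undef: :replace, replace: "?")   # => "?"
```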

Ruby I/O Encoding and Transcoding

  • Ruby programs run with concept of Default External Encoding (default external encoding used by I/O objects unless overridden upon I/O object creation)

  • Example:
  • Default External Encoding is UTF-8 on OSX stored in ENV["LANG"] printenv LANG # => en_AU.UTF-8 or echo $LANG. All file I/O is UTF-8 by default unless overridden
  • View current Default External Encoding ruby -e 'p Encoding.default_external.name' # => "UTF-8"
  • View current Default Internal Encoding ruby -e 'p Encoding.default_internal' # => nil
  • Query External Encoding of I/O object using f = File.open("/myfile"); p "#{f.external_encoding}"; l = f.gets; p "#{l.encoding}"

  • Force External Encoding when open I/O object by overriding mode string f = File.open("/myfile", "r:ascii")
  • Force Ruby to Transcode (change the encoding of) data read/written by putting two encoding names in the mode string (i.e. <external_encoding>:<internal_encoding>): data is transcoded from the first to the second on read and the other way on write, and strings are tagged with the correct encoding (i.e. f = File.open("/myfile", "r:iso-8859-1:utf-8"))
  • Open file with Binary data in ruby using binary flag to select 8-bit clean ASCII-8BIT encoding, using binary as alias for the encoding f = File.open("/myfile", "rb:binary")
  • Set Default External Encoding of I/O objects: ruby -E utf-8 -e 'p Encoding.default_external.name' # => "UTF-8"
  • Set Default External Transcoding (<default_external>:<default_internal>)of I/O objects: ruby -E sjis:iso-8859-1 -e 'p Encoding.default_internal.name'
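
A minimal sketch of external encoding versus transcoding in the mode string, using a temporary file so that it is self-contained. The file name prefix and sample text are arbitrary:

```ruby
require 'tempfile'

tagged = transcoded = nil

Tempfile.create("latin1") do |tmp|
  # Write raw ISO-8859-1 bytes: the sample text is 3 bytes in that encoding
  tmp.binmode
  tmp.write("ol\u00e9".encode("iso-8859-1"))
  tmp.flush

  # External encoding only: bytes untouched, strings tagged ISO-8859-1
  tagged = File.open(tmp.path, "r:iso-8859-1", &:read)

  # External:internal: data is transcoded to UTF-8 on read
  transcoded = File.open(tmp.path, "r:iso-8859-1:utf-8", &:read)
end

tagged.encoding.name       # => "ISO-8859-1"
tagged.bytesize            # => 3
transcoded.encoding.name   # => "UTF-8"
transcoded.bytesize        # => 4
```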

Ruby Web Dev

  • Embedded Ruby (eRuby aka .erb) in HTML is equivalent to ASP, JSP, or PHP tools but instead using Ruby
<% ... %>     # execute only
<%= ... %>    # execute and replace with output
<%# ... %>    # comment
%             # just Ruby code
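
A minimal sketch of these tags using the ERB class from the standard library. The templates and the name variable are made up for illustration:

```ruby
require 'erb'

# <%= %> inserts a value and <%# %> is dropped from the output
inline = ERB.new("<p>Hello, <%= name.capitalize %>!</p><%# greeting %>")
html = inline.result_with_hash(name: "luke")   # => "<p>Hello, Luke!</p>"

# <% %> executes code without inserting its value
loop_out = ERB.new("<% 2.times do %>x<% end %>").result   # => "xx"

# Lines starting with % are plain Ruby when the "%" trim mode is enabled
percent = ERB.new("% 2.times do\nhi\n% end\n", trim_mode: "%")
lines = percent.result                         # => "hi\nhi\n"
```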

Ruby Frameworks

  • Build own framework on top of Rack independent of underlying web server (i.e. Sinatra, Grape )

Ruby Duck Typing

  • “Duck Typing” in Ruby is defining the “Type” of an object by what it can do, not by its class
  • Convenient for testing (instead of type checking in code)
  • More flexible methods that accept variety of inputs that work as long as inputs support operators used in method to achieve output
  • Example: Avoid garbage collection delays by breaking results into manageable tasks and combining at the end

  • Dynamic Typing Systems - Ruby uses this so not required to declare types of variables or methods
  • Static Typing Systems - explicitly declare typing (note: Static is not the same as Strong typing; Ruby is dynamically but strongly typed)
  • Not required when use common sense (“type safety” often illusionary)
  • Use short methods so variable scope is limited, and use testing to prevent error propagation
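
A minimal sketch of duck typing: the method below never checks the class of its target, only that it responds to <<. The append_names method name is hypothetical:

```ruby
require 'stringio'

# Works with any target that responds to <<
def append_names(target, names)
  names.each { |name| target << name }
  target
end

as_string = append_names(+"", %w[a b])   # => "ab"        (String)
as_array  = append_names([], %w[a b])    # => ["a", "b"]  (Array)

io = StringIO.new
append_names(io, %w[a b])
io.string                                # => "ab"        (IO-like object)
```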

Conversion Functions

  • See page 350 (i.e. to_ary, to_enum, to_io, to_open, to_path, to_proc, to_regexp, etc)

  • Ruby Conversion Types:
  • Loose coercion
  • Strict coercion
  • Numeric coercion

Ruby Type Coercions

  • Shorthand equivalencies (the & tells Ruby to pass the object to the map method as a block; if it is not already a Proc, Ruby coerces it into one by sending it the to_proc message)
  • Disadvantage is that shorthand takes longer in benchmarks than explicitly coded block, so use expanded version for performance critical code
[1,2,nil].any?(&:nil?) is shorthand for:
[1,2,nil].map(&:nil?).any?, which is shorthand for:
[1,2,nil].map{|e|e.nil?}.any?

Numeric Coercion (Double Dispatch)

  • coerce method checks and guarantees receiver and parameter is of Class that will result in same Class so may be operated on
  • Receiver calls coerce method of its param to generate array.
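
A minimal sketch of double dispatch through coerce. The Meters class is hypothetical; the built-in call on the last lines shows what coerce returns for core numeric types:

```ruby
class Meters
  attr_reader :value

  def initialize(value)
    @value = value
  end

  def *(other)
    factor = other.is_a?(Meters) ? other.value : other
    Meters.new(value * factor)
  end

  # Called by Integer#* when it does not know how to multiply by a Meters.
  # Returns [converted_argument, self] so the operation can be retried
  # with two objects of the same class.
  def coerce(number)
    [Meters.new(number), self]
  end
end

(Meters.new(3) * 2).value   # => 6
(2 * Meters.new(3)).value   # => 6 (Integer#* calls Meters#coerce, then retries)

1.coerce(2.5)               # => [2.5, 1.0] (built-in: both become Floats)
```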

Metaprogramming

  • Metaprogramming Dfn: Writing code that writes code. Building layers of abstractions.
  • Metaprogramming Ruby Goal: Create new DSL (expressing concepts needed to solve a problem) abstractions integrated into host language
  • Metaprogramming Techniques: Create Metaprogramming Idioms that simplify code using underlying Ruby principles
  • Higher levels of abstraction (i.e. Ruby) allow coding closer to target domain of application development

  • Ruby object components
  • Flags
  • Instance variables (state)
  • Ruby Class (object of class “Class”) with Method definitions and superclass reference

  • Current object self
  • Method calls without explicit “receiver” uses self as receiver (with method lookup process up class inheritance hierarchy chain up to Object, which mixes in module Kernel)

  • Creating a Singleton method causes Ruby to generate an anonymous Singleton Class between the object itself and its original class (which becomes the superclass of the Singleton Class)
singleton = class << Kernel
    def self.blah; p "blah"; end
    self
end
singleton.singleton_methods # => [:blah, :constants, :nesting]
singleton.inspect # => "#<Class:Kernel>"
singleton.blah.is_a? Kernel # => true
singleton.instance_of? Kernel # => false 
singleton.instance_of? Object # => false
singleton.instance_of? Class # => true
  • Access class-level instance variables by invoking attr_accessor (defines setters/getters) in the singleton class
class MyClass
  @var = 1
  class << self
    attr_accessor :var
  end
end
puts "Original value = #{MyClass.var}" # => 1
MyClass.var = 2
puts "New value = #{MyClass.var}" # => 2
  • Modules behave like Classes due to side-effect of Ruby
  • See page 367 of book for diagram
  • Ruby creates a new anonymous “proxy” class object when a module is included in a class #1, and makes it a superclass of class #1, and sets the superclass of the anonymous class to be the original superclass of class #1
  • Changes to the module are therefore reflected down the inheritance chain

  • prepend is similar to include but methods in prepended module take precedence over those in host class
  • Disadvantage is that these are global changes and may break existing code. “Refinements” are required
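
A minimal sketch of prepend. The Loud module and Greeter class are hypothetical names:

```ruby
module Loud
  # Found before Greeter#greet in method lookup; super reaches the host class
  def greet
    super.upcase + "!"
  end
end

class Greeter
  prepend Loud

  def greet
    "hello"
  end
end

Greeter.new.greet            # => "HELLO!"
Greeter.ancestors.first(2)   # => [Loud, Greeter]
```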

  • “Refinements” are defined in a module refine block that packages a set of changes activated with using to classes that use it without affecting other code outside the source file
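
A minimal sketch of a refinement. The Shout and Announcer modules are hypothetical, and using is called inside a module body here so the refinement stays active only within that body:

```ruby
module Shout
  refine String do
    def shout
      upcase + "!"
    end
  end
end

module Announcer
  using Shout   # refinement is active only until the end of this module body

  def self.announce(text)
    text.shout
  end
end

Announcer.announce("hi")   # => "HI!"

begin
  "hi".shout               # String is unchanged outside Announcer
rescue NoMethodError => e
  puts e.message
end
```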

  • extend (equivalent to self.extend) adds the instance methods of a module to the singleton class of a particular object, so they are found in that object's method lookup chain
  • Example adding extend at class level to make modules methods class methods of particular class
module MyClass
 def my_method
  "#{self} hi!"
 end
end
class MyClass2
  extend MyClass
end
MyClass2.my_method # => "MyClass2 hi!"
  • Macros are methods that expand into something bigger (i.e. has_many)

  • Singleton class hierarchy parallels the Regular class hierarchy

  • Module has an instance method attr_accessor that is available in all Module and Class definitions

  • Kernel methods may be called without a receiver (i.e. in IRB, local_variables is a Kernel method)

up to page 372

  • TODO - Project
  • Refinements (extend libraries without breaking)
  • Thor (create command line library http://whatisthor.com/)
    • Example: http://willschenk.com/making-a-command-line-utility-with-gems-and-thor/
  • GemSet (create using Bundler or Juwelier https://github.com/flajann2/juwelier and submit to https://rubygems.org)

Chapter 2 - Dashing App Ruby Guides

  • - DASHING > RUBY > GUIDES > ASSIGNMENT

  • Scope
    • Local Variables
      • View local_variables
      • Blocks create a New Scope, but loops do not
      • Calling an undefined Variable blah. Ruby assumes desire to call Method with that name.
      • Calling an existing Variable blah when Method with same name exists. Ruby assumes Local Variable.
        • Call Variable with blah
        • Call Method with self.blah (explicit receiver) or with parenthesis blah()
      • Expressions should be in Order (i.e. ensure not calling non-existent Method)
    • Instance Variables
      • Shared across Methods of same Object
      • Classes have Instance Variables since Classes are Objects
      • @ is prefix for Instance Variables (otherwise are Local Variables)
      • initialize is the built-in (private) Instance Method, called by new, that is used to Initialize Instance Variables
      • nil value for Uninitialized Instance Variables
      • Getter Method used to Access value Set by initialize Method only for same Object
    • Class Variables
      • Shared between Class, Subclasses, and associated Instances
      • @@ is prefix for Class Variables
      • Update Class Variable (stored in Parent) using Setter in Parent or Child Object shares update.
    • Global Variables
      • Shared everywhere
      • $ is prefixed for Global Variables
      • nil value for Uninitialised Global Variable
  • Assignment (aka Setter) Method
    • Methods behave like Assignment for Class or Instance Variables
    • def x=(x) syntax
    • Receiver Class or Instance Variable required otherwise assumes assigning to local variable (i.e. within Setter use syntax @value or self.value to set receiver)
  • Attribute Accessor (Shorthand to Getter and Setter)
    • attr_accessor :value
  • Variable Info
    • .inspect
    • .methods
    • .class
    • .class.ancestors
    • .yaml
  • Assignment (Shorthand)
    • Shortcuts
      • += same as _ = _ + _
      • ||= (i.e. a || a = 0) makes assignment only if value is nil or false
      • &&= (i.e. a && a = 0) makes assignment only if value was NOT nil or false
    • Array Assignment
      • Implicit Array Assignment a = 1, 2, 3 or [1, 2, 3]
    • Multiple Assignment
      • a, b = 1, 2; p a: a, b: b # {:a=>1, :b=>2}
    • Splat Assignment (only one allowed)
        a, *b = 1, 2, 3 	# p a: a, b: b # prints {:a=>1, :b=>[2, 3]}
      
    • Decomposition of Array Assignment
        a, (b, *c), *d, (e, f) = 1, [2, 3, 4], 5, 6, [7, 8]
      
        p a: a, b: b, c: c, d: d, e: e, f: f
        # prints {:a=>1, :b=>2, :c=>[3, 4], :d=>[5, 6], :e=>7, :f=>8}
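
A minimal sketch of two points from this chapter: a setter needs an explicit receiver (otherwise Ruby assigns a local variable), and the ||= and &&= shortcuts. The Counter class and its method names are hypothetical:

```ruby
class Counter
  attr_accessor :value

  def initialize
    @value = 0
  end

  def broken_reset
    value = 100        # creates a local variable; the setter is NOT called
    value
  end

  def reset
    self.value = 100   # explicit receiver, so the value= setter is called
  end
end

c = Counter.new
c.broken_reset
after_broken = c.value   # => 0
c.reset
after_reset = c.value    # => 100

a = nil
a ||= 5        # a is nil, so assign             => 5
a ||= 9        # a is already truthy, unchanged  => 5
a &&= a + 1    # a is truthy, so assign          => 6
b = nil
b &&= 1        # b is nil, so no assignment      => nil
```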
      
Written on October 7, 2016