29 Apr 2009

Single quote, backslash and string substitution in Ruby

Problem Definition

Perform a gsub on the string Ruby's gem to get the result Ruby\'s gem

Solution

str2 = "Ruby's gem" 
puts str2.gsub(/'/,"\\\\'") 

Analysis

If you look at the solution, you will notice that there are four backslashes in the substitution string. The question is why four of them are needed.

Use Double backslash in substitution string

If you need a backslash in the substitution string, a single backslash like \ will not work. You need to use a double backslash like this: \\.

str1 = 'foo bar baz'
puts str1.gsub(/bar/,'\')

#results in syntax error
/Users/nkumar/Desktop/untitled.rb:17: syntax error, unexpected tIDENTIFIER, expecting ')'
puts str1.gsub(/bar/,'hello')
                           ^
/Users/nkumar/Desktop/untitled.rb:21: syntax error, unexpected $undefined, expecting $end
puts str1.gsub(/bar/,"\\")

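The single-backslash version does not even parse: inside a single-quoted literal, \' is an escaped quote, so the string never terminates. As stated, you need a double backslash.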

str1 = 'foo bar baz'
puts str1.gsub(/bar/,'\\') #=> foo \ baz
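It can help to check how many characters a literal actually stores before gsub ever sees it. The following is a minimal sketch; the variable names single and double are only illustrative and do not come from the original example.

```ruby
# Each literal is shown next to the characters Ruby actually stores.
single = '\\'      # single-quoted: the two typed characters become one backslash
double = "\\\\'"   # double-quoted: two backslashes followed by a single quote

puts single.length # 1
puts double.length # 3
puts double        # \\'
```

So the four backslashes typed in the solution reach gsub as only two.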

Pre and Post match

In the substitution string, \' is treated as the post-match (the text after the match) and \` is treated as the pre-match (the text before the match).

str1 = 'foo bar baz'
puts str1.gsub(/bar/,"\\'") #=>foo  baz baz

As mentioned, the pattern \' is treated as the post-match. However, a backslash must be doubled in the substitution string, which is why you see \\' in the example above. Here the text after the match of 'bar' is ' baz', and that is the post-match. Replacing 'bar' with ' baz' gives the final result 'foo  baz baz'.

Same goes for the pre-match.

str1 = 'foo bar baz'
puts str1.gsub(/bar/,"\\`") #=> foo foo  baz
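To see where those results come from, the pre-match and post-match can be read directly from a MatchData object and compared with what gsub produces. This is a small sketch using the same string; the variable names are illustrative.

```ruby
str1 = 'foo bar baz'

# MatchData exposes the text before and after the match.
match = str1.match(/bar/)
pre   = match.pre_match   # text before "bar"
post  = match.post_match  # text after "bar"

# gsub expands \' and \` in the substitution string to the same values.
with_post = str1.gsub(/bar/, "\\'")
with_pre  = str1.gsub(/bar/, "\\`")

p pre        # "foo "
p post       # " baz"
p with_post  # "foo  baz baz"
p with_pre   # "foo foo  baz"
```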

Coming back to the main issue of why four backslashes are needed

str2 = "Ruby's gem" # substitute ' with \'
puts str2.gsub(/'/,"\\\\'") #=> Ruby\'s gem

A person might start with something like this if asked to replace ' with \'.

str2.gsub(/'/,"\'") #=>Ruby's gem

Since a backslash must be doubled in the substitution string, it should be changed to:

str2.gsub(/'/,"\\'") #=> Rubys gems gem

Look at the substitution string. It is \', which has the special meaning of post-match. That is why the result is the string Rubys gems gem: s gem is the post-match.

To tell Ruby not to use the special meaning, the backslash itself has to be escaped one more time, so gsub must receive \\' (two backslashes and a quote). It then reads \\ as one literal backslash followed by a plain '. However, as you saw before, each backslash must be doubled in the string literal, so those two backslashes have to be written as four. This results in:

str2.gsub(/'/,"\\\\'")
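As a cross-check, the block form of gsub inserts the block's return value as is, without expanding sequences such as \' in it, so only the string literal's own escaping is needed. The sketch below compares the two approaches; the variable names are illustrative, and the block form is shown as an alternative, not as part of the original solution.

```ruby
str2 = "Ruby's gem"

# String form: four backslashes in the literal, two reach gsub,
# and gsub turns them into one literal backslash.
escaped_with_string = str2.gsub(/'/, "\\\\'")

# Block form: the returned string is used verbatim,
# so two backslashes in the literal (one stored) are enough.
escaped_with_block = str2.gsub(/'/) { "\\'" }

puts escaped_with_string  # Ruby\'s gem
puts escaped_with_block   # Ruby\'s gem
```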