class RubyPants
RubyPants – SmartyPants ported to Ruby
Synopsis
RubyPants is a Ruby port of the smart-quotes library SmartyPants.
The original “SmartyPants” is a free web publishing plug-in for Movable Type, Blosxom, and BBEdit that easily translates plain ASCII punctuation characters into “smart” typographic punctuation HTML entities.
Description
RubyPants can perform the following transformations:
- Straight quotes (" and ') into “curly” quote HTML entities
- Backticks-style quotes (``like this'') into “curly” quote HTML entities
- Dashes (-- and ---) into en- and em-dash entities
- Three consecutive dots (... or . . .) into an ellipsis entity
This means you can write, edit, and save your posts using plain old ASCII straight quotes, plain dashes, and plain dots, but your published posts (and final HTML output) will appear with smart quotes, em-dashes, and proper ellipses.
RubyPants does not modify characters within <pre>, <code>, <kbd>, <math> or <script> tag blocks. Typically, these tags are used to display text where smart quotes and other “smart punctuation” would not be appropriate, such as source code or example markup.
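The idea of leaving protected blocks alone can be sketched in a few lines. This is a simplified, self-contained illustration and not the RubyPants implementation: the names `PROTECTED_TAG` and `transform_outside_protected` are hypothetical, and the sketch splits on tags with `String#split` rather than using the library's tokenizer.

```ruby
# Hypothetical helper, not part of RubyPants: applies the given block
# only to text outside <pre>, <code>, <kbd>, <math> and <script>.
PROTECTED_TAG = %r{\A<(/?)(?:pre|code|kbd|math|script)[\s>]}

def transform_outside_protected(html)
  protected_block = false
  # split with a capture group keeps the tags in the resulting array.
  html.split(/(<[^>]*>)/).map { |piece|
    if piece =~ /\A<[^>]*>\z/
      # An opening tag switches protection on, a closing tag off.
      protected_block = ($1 != "/") if piece =~ PROTECTED_TAG
      piece
    elsif protected_block
      piece                 # inside a protected block: leave untouched
    else
      yield piece           # ordinary text: hand it to the caller
    end
  }.join
end

html = "<p>a -- b</p><code>x -- y</code>"
puts transform_outside_protected(html) { |text| text.gsub("--", "&#8212;") }
# prints: <p>a &#8212; b</p><code>x -- y</code>
```

Like a single boolean flag in general, this sketch does not track nesting depth, so a protected tag nested inside another one ends protection at the first closing tag.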
Backslash Escapes
If you need to use literal straight quotes (or plain hyphens and periods), RubyPants accepts the following backslash escape sequences to force non-smart punctuation. It does so by transforming the escape sequence into a decimal-encoded HTML entity:
\\ \" \' \. \- \`
This is useful, for example, when you want to use straight quotes as foot and inch marks: 6'2" tall; a 17" iMac. (Use 6\'2\" and 17\", respectively.)
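As a rough illustration of the escape mechanism, the following self-contained sketch maps each escape sequence to its decimal entity. The names `ESCAPE_ENTITIES` and `escape_to_entities` are hypothetical and exist only for this example; the sketch also works in a single pass, which is an assumption of the example rather than a description of the library.

```ruby
# Hypothetical helper, not part of RubyPants: maps each escapable
# character to its decimal-encoded HTML entity.
ESCAPE_ENTITIES = {
  '\\' => '&#92;',
  '"'  => '&#34;',
  "'"  => '&#39;',
  '.'  => '&#46;',
  '-'  => '&#45;',
  '`'  => '&#96;'
}.freeze

# Replace a backslash followed by an escapable character with the
# matching entity.
def escape_to_entities(text)
  text.gsub(/\\([\\"'.\-`])/) { ESCAPE_ENTITIES[$1] }
end

puts escape_to_entities(%q{6\'2\" tall; a 17\" iMac})
# prints: 6&#39;2&#34; tall; a 17&#34; iMac
```

Because the quotes are now entities rather than literal quote characters, a later quote-curling step has nothing left to curl.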
Algorithmic Shortcomings
One situation in which quotes will get curled the wrong way is when apostrophes are used at the start of leading contractions. For example:
'Twas the night before Christmas.
In the case above, RubyPants will turn the apostrophe into an opening single-quote, when in fact it should be a closing one. I don't think this problem can be solved in the general case; every word processor I've tried gets this wrong as well. In such cases, it's best to use the proper HTML entity for closing single-quotes (&#8217;) by hand.
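The shortcoming can be reproduced with a deliberately simplified stand-in for the opening-quote rule. The method `naive_open_single_quotes` is hypothetical and much cruder than the real quote logic; it only shows why a leading apostrophe looks like an opening quote to a context-based heuristic.

```ruby
# Simplified stand-in (an assumption for illustration): a ' at the
# start of the text or after whitespace, followed by a word
# character, is treated as an opening single quote.
def naive_open_single_quotes(text)
  text.gsub(/(\A|\s)'(?=\w)/, '\1&#8216;')
end

puts naive_open_single_quotes("'Twas the night before Christmas.")
# prints: &#8216;Twas the night before Christmas.
# The apostrophe became an opening quote, although a closing one
# (&#8217;) would be typographically correct.

# Workaround from the text: write the closing entity by hand.
puts "&#8217;Twas the night before Christmas."
```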
Bugs
To file bug reports or feature requests (other than for the shortcoming described above), please send email to: chneukirchen@gmail.com
If the bug involves quotes being curled the wrong way, please send example text to illustrate.
Authors
John Gruber did all of the hard work of writing this software in Perl for Movable Type and almost all of this useful documentation. Chad Miller ported it to Python to use with Pyblosxom.
Christian Neukirchen provided the Ruby port, as a general-purpose library that follows the *Cloth API.
Copyright and License
SmartyPants license:
Copyright © 2003 John Gruber (daringfireball.net) All rights reserved.
Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
- Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
- Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
- Neither the name “SmartyPants” nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission.
This software is provided by the copyright holders and contributors “as is” and any express or implied warranties, including, but not limited to, the implied warranties of merchantability and fitness for a particular purpose are disclaimed. In no event shall the copyright owner or contributors be liable for any direct, indirect, incidental, special, exemplary, or consequential damages (including, but not limited to, procurement of substitute goods or services; loss of use, data, or profits; or business interruption) however caused and on any theory of liability, whether in contract, strict liability, or tort (including negligence or otherwise) arising in any way out of the use of this software, even if advised of the possibility of such damage.
RubyPants license
RubyPants is a derivative work of SmartyPants and smartypants.py.
Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
- Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
- Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
This software is provided by the copyright holders and contributors “as is” and any express or implied warranties, including, but not limited to, the implied warranties of merchantability and fitness for a particular purpose are disclaimed. In no event shall the copyright owner or contributors be liable for any direct, indirect, incidental, special, exemplary, or consequential damages (including, but not limited to, procurement of substitute goods or services; loss of use, data, or profits; or business interruption) however caused and on any theory of liability, whether in contract, strict liability, or tort (including negligence or otherwise) arising in any way out of the use of this software, even if advised of the possibility of such damage.
Links
- John Gruber
- SmartyPants
- Chad Miller
- Christian Neukirchen
Constants
- VERSION
Public Class Methods
Create a new RubyPants instance with the text in string.
Allowed elements in the options array:
- 0: do nothing
- 1: enable all, using only em-dash shortcuts
- 2: enable all, using old school en- and em-dash shortcuts (default)
- 3: enable all, using inverted old school en- and em-dash shortcuts
- -1: stupefy (translate HTML entities to their ASCII counterparts)
If you don't like any of these defaults, you can pass symbols to change RubyPants' behavior:
- :quotes: quotes
- :backticks: backtick quotes (``double'' only)
- :allbackticks: backtick quotes (``double'' and `single')
- :dashes: dashes
- :oldschool: old school dashes
- :inverted: inverted old school dashes
- :ellipses: ellipses
- :convertquotes: convert &quot; entities to " for Dreamweaver users
- :stupefy: translate RubyPants HTML entities to their ASCII counterparts
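How numeric presets and symbols combine can be sketched as a stand-alone function. The helper `resolve_options` and its returned hash keys are hypothetical; the branch order follows the to_html source listed on this page, where a numeric preset takes precedence and symbols are only consulted when no preset is present.

```ruby
# Hypothetical helper (not part of RubyPants): resolves an options
# array into a hash of flags, following the branch order of to_html.
def resolve_options(options)
  options = [*options]
  everything = { quotes: true, backticks: true, ellipses: true }

  if options.include?(0)
    {}                                     # do nothing
  elsif options.include?(1)
    everything.merge(dashes: :normal)
  elsif options.include?(2)
    everything.merge(dashes: :oldschool)
  elsif options.include?(3)
    everything.merge(dashes: :inverted)
  elsif options.include?(-1)
    { stupefy: true }
  else
    # Symbols are only consulted when no numeric preset is present.
    dashes = nil
    dashes = :normal    if options.include?(:dashes)
    dashes = :oldschool if options.include?(:oldschool)
    dashes = :inverted  if options.include?(:inverted)
    {
      quotes:         options.include?(:quotes),
      backticks:      options.include?(:allbackticks) ? :both : options.include?(:backticks),
      dashes:         dashes,
      ellipses:       options.include?(:ellipses),
      convert_quotes: options.include?(:convertquotes),
      stupefy:        options.include?(:stupefy)
    }
  end
end

p resolve_options([2])
p resolve_options([:quotes, :dashes])
```

One consequence worth noting: since the constructor defaults to `[2]`, symbols take effect only when they are passed instead of a preset, as in `RubyPants.new(text, [:quotes, :dashes])`.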
# File rubypants.rb, line 207
def initialize(string, options=[2])
  super string
  @options = [*options]
end
Public Instance Methods
Apply SmartyPants transformations.
# File rubypants.rb, line 213
def to_html
  do_quotes = do_backticks = do_dashes = do_ellipses = do_stupefy = nil
  convert_quotes = false

  if @options.include? 0
    # Do nothing.
    return self
  elsif @options.include? 1
    # Do everything, turn all options on.
    do_quotes = do_backticks = do_ellipses = true
    do_dashes = :normal
  elsif @options.include? 2
    # Do everything, turn all options on, use old school dash shorthand.
    do_quotes = do_backticks = do_ellipses = true
    do_dashes = :oldschool
  elsif @options.include? 3
    # Do everything, turn all options on, use inverted old school
    # dash shorthand.
    do_quotes = do_backticks = do_ellipses = true
    do_dashes = :inverted
  elsif @options.include?(-1)
    do_stupefy = true
  else
    do_quotes =                @options.include? :quotes
    do_backticks =             @options.include? :backticks
    do_backticks = :both    if @options.include? :allbackticks
    do_dashes = :normal     if @options.include? :dashes
    do_dashes = :oldschool  if @options.include? :oldschool
    do_dashes = :inverted   if @options.include? :inverted
    do_ellipses =              @options.include? :ellipses
    convert_quotes =           @options.include? :convertquotes
    do_stupefy =               @options.include? :stupefy
  end

  # Parse the HTML
  tokens = tokenize

  # Keep track of when we're inside <pre> or <code> tags.
  in_pre = false

  # The result is accumulated here.
  result = ""

  # This is a cheat, used to get some context for one-character
  # tokens that consist of just a quote char. What we do is remember
  # the last character of the previous text token, to use as context
  # to curl single-character quote tokens correctly.
  prev_token_last_char = nil

  tokens.each { |token|
    if token.first == :tag
      result << token[1]
      if token[1] =~ %r!<(/?)(?:pre|code|kbd|script|math)[\s>]!
        in_pre = ($1 != "/")  # Opening or closing tag?
      end
    else
      t = token[1]

      # Remember last char of this token before processing.
      last_char = t[-1].chr

      unless in_pre
        t = process_escapes t

        t.gsub!(/&quot;/, '"')  if convert_quotes

        if do_dashes
          t = educate_dashes t            if do_dashes == :normal
          t = educate_dashes_oldschool t  if do_dashes == :oldschool
          t = educate_dashes_inverted t   if do_dashes == :inverted
        end

        t = educate_ellipses t  if do_ellipses

        # Note: backticks need to be processed before quotes.
        if do_backticks
          t = educate_backticks t
          t = educate_single_backticks t  if do_backticks == :both
        end

        if do_quotes
          if t == "'"
            # Special case: single-character ' token
            if prev_token_last_char =~ /\S/
              t = "&#8217;"
            else
              t = "&#8216;"
            end
          elsif t == '"'
            # Special case: single-character " token
            if prev_token_last_char =~ /\S/
              t = "&#8221;"
            else
              t = "&#8220;"
            end
          else
            # Normal case:
            t = educate_quotes t
          end
        end

        t = stupefy_entities t  if do_stupefy
      end

      prev_token_last_char = last_char
      result << t
    end
  }

  # Done
  result
end
Protected Instance Methods
Return the string, with ``backticks''-style double quotes translated into HTML curly quote entities.
# File rubypants.rb, line 384
def educate_backticks(str)
  str.gsub("``", '&#8220;').gsub("''", '&#8221;')
end
Return the string, with each instance of “--” translated to an em-dash HTML entity.
# File rubypants.rb, line 347
def educate_dashes(str)
  str.gsub(/--/, '&#8212;')
end
Return the string, with each instance of “--” translated to an em-dash HTML entity, and each “---” translated to an en-dash HTML entity. Two reasons why: First, unlike the en- and em-dash syntax supported by educate_dashes_oldschool, it's compatible with existing entries written before SmartyPants 1.1, back when “--” was only used for em-dashes. Second, em-dashes are more common than en-dashes, and so it sort of makes sense that the shortcut should be shorter to type. (Thanks to Aaron Swartz for the idea.)
# File rubypants.rb, line 369
def educate_dashes_inverted(str)
  str.gsub(/---/, '&#8211;').gsub(/--/, '&#8212;')
end
Return the string, with each instance of “--” translated to an en-dash HTML entity, and each “---” translated to an em-dash HTML entity.
# File rubypants.rb, line 355
def educate_dashes_oldschool(str)
  str.gsub(/---/, '&#8212;').gsub(/--/, '&#8211;')
end
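The three dash schemes differ only in which entity each shortcut produces, and the two-shortcut schemes depend on replacing the longer pattern first. The following self-contained comparison uses illustrative names (`EN_DASH`, `EM_DASH`, `dashes_normal`, `dashes_oldschool`, `dashes_inverted`) that are not part of the library.

```ruby
# Stand-alone versions of the three dash schemes (illustrative names).
EN_DASH = '&#8211;'
EM_DASH = '&#8212;'

# Only "--" is a shortcut; it becomes an em-dash.
def dashes_normal(text)
  text.gsub("--", EM_DASH)
end

# "---" is replaced first, otherwise it would be consumed as "--"
# plus a stray "-".
def dashes_oldschool(text)
  text.gsub("---", EM_DASH).gsub("--", EN_DASH)
end

def dashes_inverted(text)
  text.gsub("---", EN_DASH).gsub("--", EM_DASH)
end

sample = "pages 1--9 --- roughly"
puts dashes_normal(sample)     # pages 1&#8212;9 &#8212;- roughly
puts dashes_oldschool(sample)  # pages 1&#8211;9 &#8212; roughly
puts dashes_inverted(sample)   # pages 1&#8212;9 &#8211; roughly
```

The first output line shows why a text written with "---" in mind should not be processed with the em-dash-only scheme: the third hyphen is left behind.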
Return the string, with each instance of “...” translated to an ellipsis HTML entity. Also converts the case where there are spaces between the dots.
# File rubypants.rb, line 377
def educate_ellipses(str)
  str.gsub('...', '&#8230;').gsub('. . .', '&#8230;')
end
Return the string, with “educated” curly quote HTML entities.
# File rubypants.rb, line 397
def educate_quotes(str)
  punct_class = '[!"#\$\%\'()*+,\-.\/:;<=>?\@\[\\\\\]\^_`{|}~]'

  str = str.dup

  # Special case if the very first character is a quote followed by
  # punctuation at a non-word-break. Close the quotes by brute
  # force:
  str.gsub!(/^'(?=#{punct_class}\B)/, '&#8217;')
  str.gsub!(/^"(?=#{punct_class}\B)/, '&#8221;')

  # Special case for double sets of quotes, e.g.:
  #   <p>He said, "'Quoted' words in a larger quote."</p>
  str.gsub!(/"'(?=\w)/, '&#8220;&#8216;')
  str.gsub!(/'"(?=\w)/, '&#8216;&#8220;')

  # Special case for decade abbreviations (the '80s):
  str.gsub!(/'(?=\d\ds)/, '&#8217;')

  close_class = %![^\ \t\r\n\\[\{\(\-]!
  dec_dashes = '&#8211;|&#8212;'

  # Get most opening single quotes:
  str.gsub!(/(\s|&nbsp;|--|&[mn]dash;|#{dec_dashes}|&#x201[34];)'(?=\w)/,
            '\1&#8216;')
  # Single closing quotes:
  str.gsub!(/(#{close_class})'/, '\1&#8217;')
  str.gsub!(/'(\s|s\b|$)/, '&#8217;\1')
  # Any remaining single quotes should be opening ones:
  str.gsub!(/'/, '&#8216;')

  # Get most opening double quotes:
  str.gsub!(/(\s|&nbsp;|--|&[mn]dash;|#{dec_dashes}|&#x201[34];)"(?=\w)/,
            '\1&#8220;')
  # Double closing quotes:
  str.gsub!(/(#{close_class})"/, '\1&#8221;')
  str.gsub!(/"(\s|s\b|$)/, '&#8221;\1')
  # Any remaining quotes should be opening ones:
  str.gsub!(/"/, '&#8220;')

  str
end
Return the string, with `backticks'-style single quotes translated into HTML curly quote entities.
# File rubypants.rb, line 391
def educate_single_backticks(str)
  str.gsub("`", '&#8216;').gsub("'", '&#8217;')
end
Return the string after processing the following backslash escape sequences. This is useful if you want to force a “dumb” quote or other character to appear.
Escaped are:
\\ \" \' \. \- \`
# File rubypants.rb, line 335
def process_escapes(str)
  # Each escape sequence becomes a decimal-encoded HTML entity.
  # '\\\\' is a two-character string: an escaped backslash.
  str.gsub('\\\\', '&#92;').
    gsub('\"', '&#34;').
    gsub("\\\'", '&#39;').
    gsub('\.', '&#46;').
    gsub('\-', '&#45;').
    gsub('\`', '&#96;')
end
Return the string, with each RubyPants HTML entity translated to its ASCII counterpart.
Note: This is not reversible (but exactly the same as in SmartyPants)
# File rubypants.rb, line 445
def stupefy_entities(str)
  str.
    gsub(/&#8211;/, '-').      # en-dash
    gsub(/&#8212;/, '--').     # em-dash
    gsub(/&#8216;/, "'").      # open single quote
    gsub(/&#8217;/, "'").      # close single quote
    gsub(/&#8220;/, '"').      # open double quote
    gsub(/&#8221;/, '"').      # close double quote
    gsub(/&#8230;/, '...')     # ellipsis
end
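Why the translation is not reversible can be seen in a small stand-alone sketch. `STUPEFY_MAP` and `stupefy` are hypothetical names for this example; the sketch uses a single hash-based `gsub` instead of a chain of calls, which is a design choice of the example only.

```ruby
# Hypothetical table: entity => ASCII counterpart.
STUPEFY_MAP = {
  '&#8211;' => '-',    # en-dash
  '&#8212;' => '--',   # em-dash
  '&#8216;' => "'",    # open single quote
  '&#8217;' => "'",    # close single quote
  '&#8220;' => '"',    # open double quote
  '&#8221;' => '"',    # close double quote
  '&#8230;' => '...'   # ellipsis
}.freeze

# Replace every known entity with its ASCII counterpart in one pass.
def stupefy(text)
  text.gsub(Regexp.union(STUPEFY_MAP.keys), STUPEFY_MAP)
end

puts stupefy("&#8216;a&#8217; &#8211; &#8220;b&#8221;&#8230;")
# prints: 'a' - "b"...
```

Opening and closing quotes collapse to the same character, and an en-dash becomes a single hyphen, so the original entities cannot be recovered from the output alone.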
Return an array of the tokens comprising the string. Each token is either a tag (possibly with nested tags contained therein, such as <a href="<MTFoo>">), or a run of text between tags.
Each element of the array is a two-element array; the first is either :tag
or :text; the second is the actual value.
Based on the _tokenize() subroutine from Brad Choate's MTRegex plugin. <www.bradchoate.com/past/mtregex.php>
This is actually the easier variant using tag_soup, as used by Chad Miller in the Python port of SmartyPants.
# File rubypants.rb, line 471
def tokenize
  tag_soup = /([^<]*)(<[^>]*>)/

  tokens = []

  prev_end = 0
  scan(tag_soup) {
    tokens << [:text, $1]  if $1 != ""
    tokens << [:tag, $2]

    prev_end = $~.end(0)
  }

  if prev_end < size
    tokens << [:text, self[prev_end..-1]]
  end

  tokens
end
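To experiment with the token format outside the class, the same approach can be written as a stand-alone function. `tokenize_html` is a hypothetical adaptation that takes the HTML as an argument instead of operating on self; the regular expression is the one from the listing.

```ruby
# Stand-alone adaptation of the tokenizer for experimentation.
def tokenize_html(html)
  tokens = []
  prev_end = 0
  html.scan(/([^<]*)(<[^>]*>)/) do
    tokens << [:text, $1] unless $1.empty?
    tokens << [:tag, $2]
    prev_end = $~.end(0)   # remember where the last match stopped
  end
  # Anything after the last tag is trailing text.
  tokens << [:text, html[prev_end..-1]] if prev_end < html.size
  tokens
end

p tokenize_html("a<b>c")
# prints: [[:text, "a"], [:tag, "<b>"], [:text, "c"]]

p tokenize_html('<a href="<MTFoo>">')
# prints: [[:tag, "<a href=\"<MTFoo>"], [:text, "\">"]]
```

The second call shows a limit of this simple pattern: a tag is taken to end at the first `>`, so a tag nested inside an attribute value is split, with the remainder reported as text.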