Using Ruby 1.9 Ripper

posted: July 5th, 2009 · by: Sven

in: Programming · 16 comments

Ripper is an s-expression based Ruby parser shipped with Ruby 1.9. There’s no documentation out there, so trying to figure out what’s going on can be quite hard. I thought the following notes might help you get started.

While Ripper parses your code it continuously fires events (or “calls callbacks”) whenever it finds something interesting. There are two types of events: scanner (lexer) events and parser events.

The scanner basically goes through the code from left to right, character by character. When it finds known things (such as a keyword, whitespace or a semicolon) it fires a corresponding event that you can react to. The parser works on a higher level and watches for known Ruby constructs (such as a symbol, a method call or a class definition) and also fires events.

You can check the available events by outputting Ripper::SCANNER_EVENTS and Ripper::PARSER_EVENTS.
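As a quick way to get an overview, the following sketch prints how many events there are and checks for the two events used in the examples of this post. It also looks at Ripper::PARSER_EVENT_TABLE, which maps each parser event to the number of arguments its callback receives.

```ruby
require 'ripper'

# Both constants are arrays of symbols, without the "on_" prefix.
puts "scanner events: #{Ripper::SCANNER_EVENTS.size}"
puts "parser events:  #{Ripper::PARSER_EVENTS.size}"

# The events used in the examples of this post are among them.
p Ripper::SCANNER_EVENTS.include?(:int)
p Ripper::PARSER_EVENTS.include?(:binary)

# PARSER_EVENT_TABLE maps a parser event to the arity of its callback,
# e.g. on_binary(left, operator, right) takes three arguments.
p Ripper::PARSER_EVENT_TABLE[:binary]
```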

You can respond to these events by simply defining methods named :"on_#{event_name}" (omitting the @ character for scanner events). As long as you do not mess this up (which you might want to do) the parser always passes the results from the last inner parser events to the current parser event. E.g.:

require 'ripper'

class DemoBuilder < Ripper::SexpBuilder
  def on_int(token) # scanner event
    super.tap { |result| p result }
  end

  def on_binary(left, operator, right) # parser event
    super.tap { |result| p result }
  end
end

src = "1 + 1"
DemoBuilder.new(src).parse

This outputs:

[:@int, "1", [1, 0]]
[:@int, "1", [1, 4]]
[:binary, [:@int, "1", [1, 0]], :+, [:@int, "1", [1, 4]]]

When a scanner event is fired you can check the current position by calling lineno and column on the parser (Ripper::SexpBuilder does exactly that and appends the position to every scanner token), which allows for tracking detailed positioning information. Positions are given as [row, column] with the row being 1-based and the column 0-based. On parser-level events the current position is not very useful (and not passed to your event callbacks) because parser events are fired when the parser recognizes a known Ruby construct as completed - i.e. at the end of the construct.

Scanner events are fired “just so”, i.e. the scanner finds something and calls your callback method. Their return values might or might not be passed on to parser events. Parser events, on the other hand, build a meaningful tree and their return values are always passed to the next (outer) event. You can generally think of events as being fired “from the inside out”, starting with low-level scanner events.

You can examine the hierarchy of these events by doing:
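Here is a minimal sketch of tracking positions yourself. It subclasses plain Ripper (not SexpBuilder) and reads the position through Ripper's lineno and column methods; the class name PositionTracker is made up for this example.

```ruby
require 'ripper'

# Hypothetical helper class: remembers where every integer literal starts.
class PositionTracker < Ripper
  attr_reader :positions

  def initialize(src)
    super
    @positions = []
  end

  # Scanner callback: only receives the token itself, so the position
  # is read from the parser via lineno (1-based) and column (0-based).
  def on_int(token)
    @positions << [token, lineno, column]
    token
  end
end

tracker = PositionTracker.new("1 +\n  22")
tracker.parse
p tracker.positions
```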

require "pp"
src = "1 + 1"
pp Ripper::SexpBuilder.new(src).parse

will output:


[:program,
 [:stmts_add,
  [:stmts_new],
  [:binary, [:@int, "1", [1, 0]], :+, [:@int, "1", [1, 4]]]]]

You can think of this as a nested method call where the first element of each array is the method name and the rest are the arguments. In the example above there would be 6 method calls. The first :@int call would receive the arguments "1" and [1, 0]; the :binary call would receive the results of the two :@int calls with the :+ operator in between. Of the other calls only :stmts_new would not receive any arguments: :stmts_add receives the results of :stmts_new and :binary, and :program receives the result of :stmts_add.

When executed the (theoretical) interpreter would first evaluate the innermost arguments, right? That’s exactly what Ripper does, too. It will first fire the first @int event, then the second one and then pass the return values of these two events (together with the :+ operator token) to the next outer method, which is the :binary event in this case.
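To make the “nested method call” picture concrete, here is a small sketch that walks the s-expression and lists the calls in the order such an interpreter would complete them: arguments first, then the call itself. The helper name fire_order is made up for this example. Note that this is the idealized order derived from the tree, not a recording of what Ripper actually does, so details like the exact moment :stmts_new fires may differ.

```ruby
require 'ripper'

# Hypothetical helper: lists the "method names" of an s-expression in the
# order a nested method call would complete (innermost arguments first).
def fire_order(node, order = [])
  # Only arrays that start with a symbol are calls; [1, 0] is a position.
  return order unless node.is_a?(Array) && node.first.is_a?(Symbol)

  node.drop(1).each { |child| fire_order(child, order) }
  order << node.first
end

sexp = Ripper::SexpBuilder.new("1 + 1").parse
calls = fire_order(sexp)
p calls
puts "#{calls.size} calls"
```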

(“Theoretical” of course refers to these particular s-expressions. There are languages that are very much based on exactly this concept, such as Lisp.)

As you can see, even though the scanner fires events on whitespace there aren’t any whitespace characters passed to any of the parser callbacks. I don’t know if there’s anything else happening to these but of course you can define callbacks for the different kinds of whitespace and do something useful with them. The same is true for comments and quite a few other things that don’t make a semantic difference in Ruby (such as parentheses for method calls etc.).
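As a sketch of what such callbacks could look like, the following builder collects whitespace and comments on the side while the regular s-expression is built. The class name TriviaCollector is made up for this example; on_sp and on_comment are the regular scanner callbacks for the sp and comment events.

```ruby
require 'ripper'

# Hypothetical helper class: keeps the tokens that never show up in the tree.
class TriviaCollector < Ripper::SexpBuilder
  attr_reader :trivia

  def initialize(src)
    super
    @trivia = []
  end

  def on_sp(token) # whitespace (spaces and tabs)
    super.tap { |result| @trivia << result }
  end

  def on_comment(token)
    super.tap { |result| @trivia << result }
  end
end

collector = TriviaCollector.new("1 + 1 # sum")
sexp = collector.parse

p collector.trivia                  # whitespace and comment tokens
p sexp.flatten.include?(:@sp)       # is whitespace part of the tree?
```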

To examine all events in the order they are actually fired you can use the event log that ships with Ripper2Ruby:


src = "1 + 1"
Ripper::EventLog.out(src)

will output:


@int                1
@sp                 " "
@op                 +
@sp                 " "
@int                1
binary
stmts_new
stmts_add
program
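If you do not have Ripper2Ruby at hand, a rough stand-in for such an event log can be sketched with plain Ripper by defining a callback for every known event. This is not how Ripper::EventLog is implemented (I am only assuming it does something along these lines); the class name EventRecorder is made up for this example.

```ruby
require 'ripper'

# Hypothetical helper class: records every event in the order it fires.
class EventRecorder < Ripper
  attr_reader :events

  def initialize(src)
    super
    @events = []
  end

  # Scanner callbacks receive the token; return it unchanged.
  Ripper::SCANNER_EVENTS.each do |name|
    define_method(:"on_#{name}") do |token|
      @events << [:"@#{name}", token]
      token
    end
  end

  # Parser callbacks receive the results of inner events; only log the name.
  Ripper::PARSER_EVENTS.each do |name|
    define_method(:"on_#{name}") do |*args|
      @events << [name]
      args.first
    end
  end
end

recorder = EventRecorder.new("1 + 1")
recorder.parse
recorder.events.each do |name, token|
  puts token ? format("%-20s%s", name, token.inspect) : name.to_s
end
```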

I’m not an expert here but Ripper’s s-expressions and events seemed to make more sense to me than ParseTree’s stuff. Ripper still doesn’t seem to be completely consistent though.

E.g. for word lists (i.e. Arrays that are defined using %w() syntax) there are different events fired depending on whether you have %w() or %W().

src = '%W(foo bar)'
pp Ripper::SexpBuilder.new(src).parse

outputs:


[:program,
 [:stmts_add,
  [:stmts_new],
  [:words_add,
   [:words_add,
    [:words_new],
    [:word_add, [:word_new], [:@tstring_content, "foo", [1, 3]]]],
   [:word_add, [:word_new], [:@tstring_content, "bar", [1, 7]]]]]]

But on the other hand:


src = '%w(foo bar)'
pp Ripper::SexpBuilder.new(src).parse

outputs:


[:program,
 [:stmts_add,
  [:stmts_new],
  [:qwords_add,
   [:qwords_add, [:qwords_new], [:@tstring_content, "foo", [1, 3]]],
   [:@tstring_content, "bar", [1, 7]]]]]

As you can see, for qwords (i.e. the non-interpolating version) the counterparts of the :word_add and :word_new events seem to be missing: the string content is passed to :qwords_add directly. I can’t see any good reason for this.

Also, Ripper seems to get the method call operator wrong when you use "::":
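The difference can also be checked programmatically. This sketch collects all symbols from both s-expressions (for these two inputs the only symbols in the tree are event names) and prints what each side has that the other lacks. The helper name event_names is made up for this example, and the exact event sets may vary between Ruby versions.

```ruby
require 'ripper'

# Hypothetical helper: all distinct event names in the s-expression for src.
def event_names(src)
  Ripper::SexpBuilder.new(src).parse.flatten.grep(Symbol).uniq
end

words  = event_names('%W(foo bar)')
qwords = event_names('%w(foo bar)')

puts "only for %W: #{(words - qwords).inspect}"
puts "only for %w: #{(qwords - words).inspect}"
```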


src = "A::b()"
pp Ripper::SexpBuilder.new(src).parse

outputs:


[:program,
 [:stmts_add,
  [:stmts_new],
  [:method_add_arg,
   [:call,
    [:var_ref, [:@const, "A", [1, 0]]],
    :".",
    [:@ident, "b", [1, 3]]],
   [:arg_paren, nil]]]]

Watch the period which should be a :"::" symbol.

In quite a few situations I’ve found the events ambiguous or not explicit. E.g. for the closing parenthesis in a word list like %w(foo bar) Ripper fires a :@tstring_end event - which is the same event it fires for the closing parenthesis of a String as in %(foobar).

It gets really weird when you try to build something from the events that Ripper fires for Heredocs or even stacked Heredocs combined with method calls on the Heredoc opener token - maybe the weirdest Ruby construct anyway. In general though this stuff is fun to work with and quite obvious once you get the idea :)
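A quick way to see this is Ripper.lex, which returns the scanner tokens of a piece of code without building a tree. The sketch below just looks at the last token of each example; the helper name last_event is made up.

```ruby
require 'ripper'

# Hypothetical helper: name of the last scanner event fired for src.
def last_event(src)
  # Each entry returned by Ripper.lex starts with [[row, column], event, token].
  Ripper.lex(src).last[1]
end

p last_event('%w(foo bar)')
p last_event('%(foobar)')
```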

16 Comments

  1. Ryan Davis said July 6th, 2009 at 11:17 AM  

    Ripper is a lot more like rubyparser than it is like ParseTree in that it is actually parsing the input. The inconsistencies you find are a result of that. ParseTree simply converts MRI’s internal nodes to ruby objects so there is a lot of artifacts and cruft in there that we often may not care about. rubyparser tried very hard to be as compatible with ParseTree as possible, to provide a migration path for dependencies of ParseTree. An unfortunate result is that it has the same cruft.

  2. Sven said July 7th, 2009 at 10:29 AM  

    Awesome, thanks for clearing this up, Ryan!

    In case you have any input on improving Ripper2Ruby, please let me know :)

  3. Magnus Holm said July 8th, 2009 at 12:58 PM  

    It would be very cool if we could convert the Ripper-output to UnifiedRuby. It basically cleans up ParseTree’s mess and gives a very nice and consistent AST. As far as I know, that’s the same stuff which RubyParser gives me, and I’ve never found that to have the “same cruft” as raw ParseTree…

artweb design
Sven Fuchs
Grünberger Str. 65
10245 Berlin, Germany


http://www.artweb-design.de

Fon +49 (30) 47 98 69 96
Fax +49 (30) 47 98 69 97