High Notes

High Notes  //  Thoughts on programming languages, software development, entrepreneurship and any other things that I find interesting.

Nov 19 / 9:53pm

:Treetop => "Parser in Ruby"

I built a parser a few days ago using Treetop. Treetop is an awesome Ruby gem for writing parsers in Ruby. The syntax is very clean and elegant, and you don't need to handle countless auto-generated files. Treetop grabs the grammar file on the fly and automagically loads the classes that you need. Pretty neat.

Once you parse text with your parser, Treetop returns a syntax tree. To save you from navigating the tree by hand, Treetop allows you to add methods to your grammar, which you can then call on the tree's nodes. Here's an excerpt of the grammar I'm working on:

grammar RepositoryProtocol
  rule message
    credential / response / request / reply / challenge
    {
      def node_type
        "message"
      end
    }
  end

  rule credential
    credential_hdr public_key certificates credential_end
    {
      def pk
        public_key.data
      end

      def certs
        certificates.certs
      end

      def node_type
        "credential"
      end
    }
  end

  # obviously more stuff goes down here...
end

 

Now let's talk about the dark side of this tool. There are five problems that I found while using it:

1- Lack of documentation: You can find pretty simplistic examples of grammars on the tool's homepage and in its GitHub repository. Unfortunately, if you need something more complex than a calculator, you're pretty much on your own. If you google for a while you will find some more complex grammars out there, but the truth is that the learning curve is a bit steep. For instance, it took me quite a while to realize that syntax nodes only provide two methods to query a node: elements (an array of child nodes) and text_value (the text contained by the node). Even at this point I'm still not 100% sure if these are the only methods available. Also, if your rule is very simple (I mean it lacks or's, *'s, +'s, etc.), then Treetop will add accessor methods named after the child rules that your rule references.

2- If you name a rule as something that Treetop uses internally, you're screwed. This really sucks. There's no error message, no list of reserved words, nothing. So if a rule fails to work for no logical reason, try renaming it.

3- Nodes don't have a way to identify themselves. There's no node_type method to call to figure out which kind of node you're looking at. If you really need to do that, then you'll have to manually add a node_type method to each rule.

4- No easy way to ignore white spaces. You have to provide a rule in your grammar to handle white spaces, tabs, \r's and \n's.

5- And finally, this problem really took me a while to figure out. Treetop doesn't construct the syntax tree by following your grammar literally. Take for example the following rules:

  rule clause
    implication / single_atom
    {
      def import(rsa_key)
        internal_import(rsa_key)
      end

      def node_type
        "clause"
      end
    }
  end

  rule single_atom
    literal "."
    {
      def internal_import(rsa_key)
        "#{literal.import(rsa_key)}."
      end

      def imported_clause?
        literal.imported_clause?
      end

      def node_type
        "single_atom"
      end
    }
  end

  rule implication
    literal ":-" literal other_literals "."
    {
      def import(rsa_key)
        elements.inject("") do |imported_string, e|
          imported_string << if e.respond_to? "import"
            e.import(rsa_key)
          else
            e.text_value
          end
        end
      end

      def node_type
        "implication"
      end
    }
  end

  rule other_literals
    ("," literal)*
    {
      def import(rsa_key)
        elements.inject("") do |imported_clause, e|
          imported_clause << ",#{e.elements.last.import(rsa_key)}"
        end
      end
    }
  end

  rule literal
    ( "says(" iden "," predicate ")" space ) / predicate
    {
      def import(rsa_key)
        "says(#{rsa_key}, #{text_value})"
      end

      def node_type
        "literal"
      end
    }
  end

Following the grammar of the previous example, I would expect the syntax tree for an implication clause to be generated like this:

But instead I get a tree like this one:

Somehow Treetop decides that clause and implication should be the same node, and adds some methods from clause and others from implication to it. For that reason I use the strange internal_import for some paths, and on others I call import directly. So don't expect the syntax tree to be an exact representation of your grammar; I guess Treetop might be attempting some tree optimizations.
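Since the shape of the tree can't be trusted, one defensive option is to walk it generically, the same way the implication rule above does with respond_to?. The sketch below is plain Ruby with no Treetop involved: FakeNode and FakeLiteral are made-up stand-ins that only mimic elements and text_value, and import_tree is a helper name invented for this illustration.

```ruby
# Made-up stand-in for a syntax node. It mimics only the two query methods
# discussed above (elements and text_value); it is not Treetop's node class.
class FakeNode
  attr_reader :text_value, :elements

  def initialize(text_value, elements = nil)
    @text_value = text_value
    @elements = elements
  end
end

# A node that knows how to import itself, like the literal rule.
class FakeLiteral < FakeNode
  def import(rsa_key)
    "says(#{rsa_key}, #{text_value})"
  end
end

# Walk the tree without assuming which level carries the import method:
# call it when the node has it, descend when the node has children,
# and fall back to the raw text otherwise.
def import_tree(node, rsa_key)
  return node.import(rsa_key) if node.respond_to?(:import)

  children = node.elements || [] # guard in case a leaf has no elements array
  return node.text_value if children.empty?

  children.map { |child| import_tree(child, rsa_key) }.join
end

head = FakeLiteral.new("a(x)")
body = FakeLiteral.new("b(x)")
clause = FakeNode.new("a(x):-b(x).", [head, FakeNode.new(":-"), body, FakeNode.new(".")])

puts import_tree(clause, "KEY")
```

The walk doesn't care whether a method ended up on the clause node or the implication node, which is the only reason to prefer it over calling import on a specific rule.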

Even with all its problems, I sincerely consider this tool much more intuitive than ANTLR. The syntax is clean, and if you don't try fancy tricks like writing a language translator you should be safe. So if you need a parser on your next Ruby project, make sure to give Treetop a try.

Oct 29 / 1:24pm

Been busy lately.

I haven't been posting on this blog as much as I would like. Graduate school is keeping me pretty busy. Right now I'm taking a Computer Security course that's taking more time than I would really like. Although at this point I'm feeling a bit regretful, I've been doing some cool stuff: from code injection on broken C applications to exploiting buffer overflows, remote attacks on web servers, using clusters to break passwords by brute force, up to using Prolog for weird ways of defining security policies.

I've also started my final project. I chose to build a solution for a problem I had while integrating a Rails application with ActiveMQ. You see, in corporate software, there are these things they call "Enterprise Integration Servers". These things are the key part of an SOA architecture. Although I'm not a big fan of SOA, I can see that there's a lot of value in the so-called Enterprise Integration Patterns. These patterns are well-defined ways for applications to communicate with each other. On the Rails side of the world, you communicate between applications using REST. That works great, but when you need to decouple one application from another (specifically the application that sends the resource), or you need some fancy sort of queueing, you end up repeating yourself. Sure, you can use ActiveMQ with the wonderful ActiveMessaging plugin, but if you're like me, you will not like the idea of having to run this Java server. REST is awesome; Rails applications should not rely on STOMP, JMS or other weird things to do their job.

So my final project will be a RESTful messaging server that will implement the design patterns that are described here. I already built a prototype using Rails and the delayed_job plugin. I'll be posting more about this project later on.

Anyways, as the old saying goes, "don't let school get in the way of your education". I'm gonna try to keep posting more often...

Sep 25 / 7:06pm

Burndown now has comments!

We're proud to introduce a new feature on Burndown: comments! Project management shouldn't only be about charts; communication is more important.

Now users have pictures. One important note: the pictures of the users are grabbed from Gravatar using your email address. So if you already have a gravatar (if you use GitHub, for example) your picture will be displayed right away. If not, you can easily create a Gravatar account and upload a picture for yourself.

Taking advantage of these changes, we also reorganized the way Burndown is configured. Previously we had both an Account and a Subscription tab. The two tabs were confusing because both words have a similar meaning. Instead we have renamed the Account tab to Settings, and we have moved the User setup to its own tab. Now things are clearer... three tabs: one for users, another for settings (Basecamp connectivity, domain name, etc.) and another one for your subscription details (the plan you're on).

Now all the discussion regarding the state of the burndown can happen right where you see it. We're very excited about these new features and we hope you love them. We think they'll transform the way you use Burndown.

Aug 25 / 1:25pm

Schedule tasks Whenever

If you need to automate some sort of task in your Rails app, I certainly recommend the "whenever" gem. It's pretty awesome how it translates beautiful Ruby code into the cryptic cron format. Ryan Bates made a screencast a few months ago explaining how to work with it. The screencast can be summarized as:

  • Install the gem
  • Set up your application for whenever by running "wheneverize .".
  • Edit your schedule.rb file with your tasks.
  • Modify your deploy.rb file to update Cron on your server.

The format of the schedule.rb file is very simple. Here's an example of what I did:

every 1.day, :at => "4am" do
  dump_path = "/home/user/mysqldump/dump#{Date.today.to_s}"
  command "sudo mysqldump appdatabase > #{dump_path}.sql"
  command "tar -zcvf #{dump_path}.tar.gz #{dump_path}.sql"
  command "rm #{dump_path}.sql"

  runner "Storage.store_dump '#{dump_path}.tar.gz'"
end

You basically call the "every" method with parameters regarding time and a block of ruby code with what it's supposed to do. Inside the block you can use 2 methods:
  • command: Similar to the "run" method in Capistrano. Lets you run something in the console. For example, I call mysqldump to make a database backup, then I call tar to compress the backup and finally I delete the uncompressed backup.
  • runner: Schedule ruby code to run in production mode (you can change that). In this case I'm using a storage class that I wrote to upload the backup to Amazon S3.

After editing the deploy.rb file to call the "whenever --update_crontab" command, I checked out what Cron had scheduled (by running "crontab -l" on the server). And as it always happens when I'm trying something new, there was a problem.

You see all those strings that use the "dump_path" variable? Well, the date that got imprinted as part of the file name is the date on which the "whenever --update_crontab" command was run. So all my backups will carry that same date, no matter when they actually run... Whenever doesn't magically run the Ruby block you pass to the "every" method each time the job fires. Instead, it evaluates the block once and translates the result into cron syntax.
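To see why the date gets frozen, here's a minimal sketch in plain Ruby. FakeSchedule is a made-up stand-in, not Whenever's real implementation, and the deploy date is arbitrary; the only thing it assumes is the behavior described above, where the block runs once while the crontab is generated and only the resulting strings are kept.

```ruby
require "date"

# Made-up stand-in for a schedule DSL: the block runs once, immediately,
# and only the resulting command strings are stored.
class FakeSchedule
  attr_reader :jobs

  def initialize
    @jobs = []
  end

  def every(_interval)
    yield
  end

  def command(string)
    @jobs << string
  end
end

# Pretend the crontab is generated on this (arbitrary) deploy day.
deploy_day = Date.new(2000, 1, 1)

schedule = FakeSchedule.new
schedule.every(:day) do
  # Interpolated right now, while the schedule is being built.
  dump_path = "/home/user/mysqldump/dump#{deploy_day}"
  schedule.command "sudo mysqldump appdatabase > #{dump_path}.sql"
end

# The stored job already contains the deploy date as a literal string.
puts schedule.jobs.first
```

Anything that depends on the current date has to be computed when the job actually runs, for example inside the code that runner calls or in the shell command itself.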

Got to go fix that right now...

NOTE:
After fixing the problem with the string, I realized that placing those 4 commands in the same block translates into 4 cron jobs with exactly the same start time. Since cron forks a new process for each job, the tar command will try to create the archive before the dump command has finished. To solve this you should have 4 separate schedule blocks, each one starting at a slightly different time.
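A possible alternative to staggering the start times (a sketch, not something tried in this post): since command just takes a string, the three shell steps could be joined with the shell's && so that they run one after another inside a single cron job. The chained helper is a made-up name, and the runner step would still need its own job.

```ruby
# Join shell steps with && so each one runs only if the previous one succeeded.
def chained(*steps)
  steps.join(" && ")
end

# Date part left out on purpose; see the frozen-date problem above.
dump_path = "/home/user/mysqldump/dump"

backup = chained(
  "sudo mysqldump appdatabase > #{dump_path}.sql",
  "tar -zcvf #{dump_path}.tar.gz #{dump_path}.sql",
  "rm #{dump_path}.sql"
)

puts backup
```

The resulting string would be passed to a single command call, so the ordering is handled by the shell instead of by guessing how long the dump takes.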

Aug 3 / 7:36pm

Stay on the same line!

Have you ever wanted to place a link aligned to the right on the same line where you already have text? In Burndown, each chart has a lower section with three links. The first link is for editing the title and description, the second link is for deleting the chart and the last one is a handy link to the Basecamp project from which the data is being pulled. Since the "Go to Basecamp" link is not really related to the first two links, I decided to place it apart from them.

The "float: right;" style should be enough to do the trick I'm looking for, and indeed it works great on Safari. On the other hand, Firefox and IE display the things in a different way:


As you can see, the link is on the right but not on the same line. I don't know why, but Firefox and IE will always display an element floating to the right on a separate line, unless you define the element as the first one. To fix the issue I had to modify the partial from this:

    <div class="meta">
      <%= edit_or_unarchive(burndown) %> |
      <%= link_to_remote 'Delete' ... %>
      <div style="display: inline;">
        <%= image_tag 'whitespinner.gif', :style => "display: none;" %>
      </div>
      <%= goto_basecamp(burndown) %>
    </div>

To this:

    <div class="meta">
      <%= goto_basecamp(burndown) %>
      <%= edit_or_unarchive(burndown) %> |
      <%= link_to_remote 'Delete' ... %>
      <div style="display: inline;">
        <%= image_tag 'whitespinner.gif', :style => "display: none;" %>
      </div>
    </div>

Now the footer of the charts looks like this:

I don't know which browser is displaying things incorrectly. Safari or Firefox?

Jul 27 / 1:49pm

How a good thing can turn into a bad one?

A month ago I launched burndown (www.burndowngraph.com), a service that takes Basecamp's milestones and to-do lists and creates burndown charts. Thanks to the little push given by 37signals, I got a lot of visits over the last few weeks. I got new users on a daily basis, which is great, but the service was getting slower over time. This weekend I made the decision to upgrade my hosting service.


Things were working as expected. The upgrade seemed to be smooth and all tests were running and passing. Taking advantage of the downtime caused by the upgrade, I decided to remove the requirement of billing information for creating an account. 20% of the people who decided to sign up abandoned the signup process when they were asked to add their billing information. Fair enough, let's remove this thing for free trials, and make trials free without any commitment. I made the change, pushed it to the new server and verified that all the tests were running.

This morning I woke up and saw that there had been no new subscriptions during the whole weekend... strange. I started navigating the site and it was down. Apache was eating all the memory of the server. The server's memory was increased and Apache was up and running fine. But still a few pages were down. Checking the log, I found that the change I introduced to remove the billing information requirement from trials screwed up all the pages that were accessed using "www" as their subdomain. Which means that it was impossible to create support tickets and, even worse than that, it was impossible to create a new account.

The bug was fixed. The system is running fine now. I sincerely apologize to anyone who was trying to create a new account and, even worse, tried to create a support ticket. I especially apologize to whoever was on IP addresses 65.55.106.135, 77.211.144.202, and 65.55.106.204. Man, you guys really tried hard to get that support ticket going. I don't know who you are, but all I can say is that I'm sorry.

Lessons learned. 
  1. Be really careful with application.rb: all controllers depend on it.
  2. Test driven development isn't flawless... The tests were running in the Rails test environment, and the URLs of the requests in the test environment are quite different from those that occur in the production environment. I need to write tests that simulate the real URLs and I need to do exploratory testing after a deployment.
  3. Don't push a functionality update at the same time that I'm upgrading the server. Let's just handle one source of issues at a time.

Jul 6 / 10:51am

Include or Extend?

A few days ago I was trying to move class methods to a module. The code to add the module to my ActiveRecord model looked more or less like this.


require "concerns/finances.rb"

class Subscription < ActiveRecord::Base
  include Finances
  ...

I was watching a Railscast about how to make a gem, and I saw how Ryan added class methods that were defined in a module. The solution was simple... call extend instead of include... /facepalm

require "concerns/finances.rb"

class Subscription < ActiveRecord::Base
  extend Finances
  ...


So, when working with modules in Ruby, you have 3 options:
  • Include the module. All the methods are going to be instance methods.
  • Extend the class with the module. All the methods are going to be class methods.
  • Use module functions. Use the module_function modifier in the module. To call the method use Module.method_name.
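The three options can be sketched side by side in plain Ruby. Finances and Subscription come from the snippets above, but here Subscription is a plain class rather than an ActiveRecord model so the example runs on its own; Invoice, Helpers, monthly_total and format_money are made-up names.

```ruby
module Finances
  def monthly_total
    "total from #{self.class}"
  end
end

module Helpers
  # Every method defined after this line is callable as Helpers.method_name.
  module_function

  def format_money(amount)
    format("$%.2f", amount)
  end
end

class Invoice
  include Finances # monthly_total becomes an instance method
end

class Subscription
  extend Finances # monthly_total becomes a class method
end

puts Invoice.new.respond_to?(:monthly_total)  # true
puts Invoice.respond_to?(:monthly_total)      # false
puts Subscription.respond_to?(:monthly_total) # true
puts Helpers.format_money(9.5)                # $9.50
```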

I guess I need to review my ruby book again.

Jun 26 / 9:52am

No Silver Bullet for the Diseconomy of Scale

This summer I'm taking a software project management course, and we use a textbook that's quite old. According to my professor, the principles still apply today. An interesting problem discussed by the book is the "Diseconomy of Scale": as the code base grows, each new increment of code costs more than the previous one did. To explain the problem the author uses this formula:

Effort = Personnel * Environment * Quality * (Size ^ Process)

The first three variables are multiplying factors. Personnel describes the skill of your development team. Environment describes the tools that you use to improve your productivity (visual designers, code generators, etc.). And Quality describes how important quality is for the success of the product.

The interesting part of the formula is the one inside the parentheses. Size is a measurement of how big the software is going to be. It's measured either in lines of code or in function points. Process is a factor that describes how much distraction and resistance the team experiences from the artifacts required by the software process that they're following (UP, XP, Scrum, etc.).

According to the author, the diseconomy of scale happens because of the overhead of the process. He recommends using automated tools to reduce the amount of code that needs to be written, and software processes that provide as little resistance as possible.

The book was printed in 1998, and at that time he predicted that in 10 years the diseconomy of scale would disappear. More than 10 years have passed and we still have the problem... so what's wrong with this model?
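To make the exponent's effect concrete, here's a small sketch of the formula in Ruby. All the numbers are made up for illustration (the three factors are set to 1.0 and the Process exponent to 1.2); the book's calibrated values are not reproduced here.

```ruby
# Effort = Personnel * Environment * Quality * (Size ^ Process)
def effort(personnel:, environment:, quality:, size:, process:)
  personnel * environment * quality * (size**process)
end

# Made-up factors: neutral multipliers and a process exponent above 1.
factors = { personnel: 1.0, environment: 1.0, quality: 1.0, process: 1.2 }

small = effort(size: 10_000, **factors)
large = effort(size: 20_000, **factors)

# How much more effort does twice the code take?
ratio = large / small
puts ratio.round(2)
```

With any exponent above 1 the ratio comes out above 2, so doubling the size more than doubles the effort. That is the diseconomy of scale in one line; with an exponent of exactly 1 the effort would simply double.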

No Silver Bullet
In my opinion the formula has the following problems:
  • It values the process over the people. The process is important for developing software, but the skills of the people involved determine the outcome far more than the steps of a process. There are enormous differences in the amount of working code that a skilled and experienced developer can produce. The bottom line is that software development is still an art. There are engineering aspects to it, but there is still craftsmanship involved.
  • The author compares software with factories. He states that in a factory the first products are more expensive, but eventually the factory gets to a point where the cost of each product becomes fixed. He says that the first lines of code should be expensive, but as more lines of code are written the developers gain experience and the cost of each line gets cheaper. That's completely wrong. Writing code is not like laying bricks with cement. It's more like playing Jenga: the higher the tower gets, the harder it is to keep everything stable.
  • It ignores software complexity. The formula assumes that every line of code is equal to any other. I don't think so. Have you ever coded a compiler? The lexical analyzer is dead simple, but the semantic checker... that's hard. Some parts of software are complex, others aren't. Usually as you add more functionality to a project, it gets more complex. Complexity is something very abstract and hard to measure.

In my opinion the formula should look more like:

Effort = Personnel * Environment * Quality * Process * (Size ^ Complexity)

Royce's silver bullet was code automation and component reuse. Automation and code reuse have helped a lot, but software projects still always require customization, and it's usually hard to integrate different components. Too much plugin integration and you'll feel like you're trying to bring a dead body to life.

A Viable Option?
If the complexity is bound to the amount of code, why not try something as revolutionary as... writing less code? Less code means less complexity, less component integration and fewer headaches. Sometimes you're not in a position to decide whether a feature makes it to development, but if you are... try to question whether something is really required. Does your client need streaming for online help, or will simple embedded tutorials be enough? Do you need to code a full-blown blogging app for your page, or will a simple Posterous blog be enough? Reduce the amount of code and you'll reduce the cost. It's simple.

Jun 11 / 11:35pm

Class Methods in Modules

After a 4 hour delay on my first connection and a 3 hour delay on the second one, I decided to add a tiny bit of functionality to my application. My application has a Subscription model that holds the payment-related information. What I wanted to add was a simple email report that shows me how much money I am supposed to receive next month from each of the plans.

To make that report happen, I added a few class methods to the Subscription model. I noticed that this model already had a lot of other stuff, and since these methods were related I decided to create a separate module for them. The first time that I saw someone modularize an ActiveRecord model was in a video from RailsConf Europe. In that video David Heinemeier Hansson showed some of Basecamp's source code, and they had different modules for some of the models to separate the logically related methods and make the model code feel cleaner.

After separating the methods into modules, my tests were broken. After a lot of struggling with the code I figured out something... class methods that you define in a module with "def self." don't travel with it. Ruby won't complain about it, but those methods stay on the module itself, so you won't be able to call them on the class that includes the module.

Let's take a look at the wrong code.

module SalesReports
  def self.active(plan = nil)
    if plan.blank?
      Subscription.all :conditions => [ "verified = ? and canceled = ?" , true, false]
    else
      Subscription.all :conditions => [ "plan = ? and verified = ? and canceled = ? " ,
                                        plan, true, false]
    end
  end
end

Since sometimes people just see code and copy paste it to their app... I have a disclaimer:
THIS CODE IS WRONG, DON'T COPY PASTE IT AND EXPECT IT TO WORK.

How did I fix it? It turns out that modules have a built-in method that works similarly to ActiveRecord's magical validation declarations. The method is called module_function. By adding "module_function :active" after the method definition and removing the self from the active method, SalesReports.active is now a module-level method.
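Here's a stripped-down sketch of the before and after. BrokenReports is a made-up name for the first version, the method bodies return a placeholder string instead of querying the database, and Subscription is a plain class here instead of an ActiveRecord model, so the snippet runs without Rails.

```ruby
module BrokenReports
  # Defined on the module object itself; include does not copy it anywhere.
  def self.active
    "active subscriptions"
  end
end

module SalesReports
  def active
    "active subscriptions"
  end
  # Makes active callable as SalesReports.active.
  module_function :active
end

class Subscription
  include BrokenReports
end

puts BrokenReports.respond_to?(:active) # true: the module has the method
puts Subscription.respond_to?(:active)  # false: the including class does not
puts SalesReports.active                # active subscriptions
```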

What did I learn today?
1- American Airlines Suck.
2- Methods defined with self in a module don't become class methods of the class that includes it.
3- You can instead create module methods with the module_function thing.

Jun 1 / 11:49am

How to do SCRUM with Basecamp?

About 2 years ago we were sitting at the office trying to figure out which tool to use for project management. I mean, being agile is about being simple, but index cards were starting to get messy. Bad handwriting was starting to make things unclear (Is that an "r" or an "f"?). Finally, the decision to look for a tool was made when one of our project managers dropped a cup of coffee over a deck of story cards.


There were many tools out there that could help us out, and we tried several of them. There was the good old FogBugz, which seemed more focused on keeping track of problems than on helping us organize the project. We also considered another tool, whose name I can't remember, which allowed us to define workflows. The customization was so hard (and it required us to install some weird ActiveX control) that we decided to skip it. And finally we had all these "agile" tools that require some sort of "certification" given by some consulting firm. I don't know about you, but certifications don't sound really agile to me.

Of all the tools we tried, the one that gave us the best results was Basecamp. These are the two reasons that we believe make Basecamp great for agile project management:
  1. Simplicity: There is no need for training when you start using Basecamp. In essence Basecamp is a set of message boards, blogging tools, calendars and to-do list trackers. All these tools combined in an intuitive way is what Basecamp is all about. If your team can't figure out how to use Basecamp... you're in trouble. Being agile is about using the simplest tools available.
  2. Interactions instead of Process: The purpose of Basecamp is to promote and enhance collaboration within a team. Basecamp doesn't try to force you into a specific workflow as most project management tools do. You can pretty much follow whatever process or workflow you want, and still, Basecamp is going to be helpful. Guess what the first line of the Agile Manifesto is? "Individuals and interactions over processes and tools." So instead of thinking "Is ticket 4137 assigned to David or to Paula?", you will see the discussion of a real issue between your team members.

But since Basecamp has no process or workflow embedded in it, how can we adapt it to an agile methodology? Before explaining how we did it, let me talk a little bit about the software development method that we used at our office. We used a mixture of Scrum and XP. We basically had our product backlog based on a set of user stories. We tried to measure our team's velocity and, based on that, to estimate how many iterations we would require to finish a project. We used a sprint burndown graph that we updated daily to keep track of how we were doing with respect to our plan. If we ever saw that we were not going to be able to finish all the work we thought we could, then we talked with our customers and explained the reasons.

Ok Ok, Now Here is How
Our process was more or less the following:
  1. Make a mind map to identify the project vision, goals, risks and stakeholders. You know what a mind map is, right? This step is really important, so make sure to get it right. Consult with your customer and make sure that you share the same vision. How did we use Basecamp here? By keeping track of the communication with the customer. Once you have your vision written, you can set it up as a Basecamp Announcement, so that it is shown to all the project members every time they log on.
  2. Have one or more Story Writing Workshops. Here is where the user comes with his big Christmas list that includes all his unicorns and care bears. Everything the user wants must be taken into consideration, but you need to make sure that the user is able to formulate his dreams in the following format: As a "role" I want to be able to "user wish" so that "business value". If the user is not able to fill out all the blanks, then it means that either you can't identify anyone who is going to perform the task, or the task has no value for the business. I find that Basecamp writeboards are really good for managing your stories (and even better, if you have a Backpack account you can create a note for each user story and move them like a stack of cards between different Backpack pages). Index cards can be used as a fast way of writing the stories, but if you need to share them with a lot of people, it's better to have them online.
  3. Estimate Story Points. You need to decide how hard each story is in comparison to the others. We used a limited set of possible values for this task: 1 - 2 - 3 - 5 - 8 - 13. No other value is allowed. If you need something smaller, then you need to merge the story with another one; if you need something bigger, then you need to split it. Again, if you need to do this in a distributed environment, Basecamp can be very helpful to keep track of the discussions.
  4. Plan Your Iterations. Make sure that the customer understands that the date you will be giving is only an estimate. Be careful of setting expectations that you won't be able to meet. Usually at this point your customer will want to cut out things that have very low business value (probably the unicorns and care bears will have to wait again). Once you have a delivery date for each iteration, create a milestone for each iteration in Basecamp.
  5. Define the Tasks for the First Iteration. Take all the stories that are going to be developed in the first iteration and think of all the tasks that are going to be required for each one. Remember to consider the tests, the coding and the research. Only at this point can you start talking about how many hours a task might take you. In Basecamp you are going to create a to-do list for each user story. Each of the tasks is going to be a to-do item in your to-do list. Make sure to assign each to-do list to the milestone that represents the iteration in which it's supposed to be delivered. If you figure out at the end of this step that you have space for more work (it almost never happens) or that you're scheduling more work than you're able to finish in one iteration, move user stories (and I mean the complete story, not only a few tasks) between iterations. Remember to talk to your customer about any changes you're making to the expected delivery date of each user story.
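The arithmetic behind steps 3 and 4 can be sketched in a few lines of Ruby. The backlog values and the velocity of 12 are made up, and iterations_needed is a helper name invented for this illustration; only the 1 - 2 - 3 - 5 - 8 - 13 scale comes from the process above.

```ruby
# The only story point values allowed in step 3.
ALLOWED_POINTS = [1, 2, 3, 5, 8, 13].freeze

# Number of iterations needed for a backlog, given the team's velocity
# (story points completed per iteration). Partial iterations round up.
def iterations_needed(story_points, velocity)
  invalid = story_points - ALLOWED_POINTS
  raise ArgumentError, "not on the scale: #{invalid.inspect}" unless invalid.empty?

  (story_points.sum / velocity.to_f).ceil
end

backlog = [5, 3, 8, 13, 2, 1, 5, 8] # made-up estimates, 45 points in total
puts iterations_needed(backlog, 12) # 45 / 12 = 3.75, so 4 iterations
```

Each of those iterations then becomes one milestone in Basecamp, and the result is still only an estimate: the velocity has to be measured again after every iteration.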

So that's what you do to plan a project in Basecamp using a mixture of Scrum and XP. So far we've only discussed the planning of the first iteration. The next step is:

Keeping Track Of Your Project
At this point your job is to keep track of how your team is advancing through the to-do items you created previously. As usual, you will discover new tasks that you didn't think of beforehand. If this happens, make sure to add them to the to-do list that represents their story. The ability to post comments on the to-do items works great when you need to record any discussion or insights your team has about a task.

One thing I forgot to mention previously is that when we create a to-do item, we assign an estimated amount of time. Since there's no way to add time estimates for each task we simply add the duration estimate as a string at the end of the task. For example a task would say something like:

Refactor Rick's messy authentication class. 2h

One important thing is to remember to re-estimate the tasks daily. That will keep your expected remaining time realistic. Usually there are two or three tasks that take less than expected, but there's always one that can take 10 times longer.

Some people like to use the time tracking features of Basecamp to keep track of their tasks. Although time tracking works fine for tracking how long a task took, it doesn't tell you what your customers would probably be more interested in: how long will it take you to finish? For this reason we think that the labels on the to-do items are by far more important than the time tracking capabilities that Basecamp offers (this is because of our business model: we charge a rate based on the number of iterations that the project will take to finish, and we don't count the exact number of hours we spend on each task in order to charge).
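Since the estimate lives at the end of the to-do text, the remaining work can be added up with a small parser. This is only a sketch of the idea, not the code behind our tool: the regular expression and the estimated_hours helper are assumptions made for illustration, and every to-do except the first one is made up.

```ruby
# Matches a trailing estimate such as "2h" or "1.5h" at the end of a to-do.
ESTIMATE = /(\d+(?:\.\d+)?)h\s*\z/

# Hours estimated for a to-do item, or 0.0 when it carries no estimate.
def estimated_hours(todo)
  match = ESTIMATE.match(todo)
  match ? match[1].to_f : 0.0
end

todos = [
  "Refactor Rick's messy authentication class. 2h",
  "Write tests for the login form. 1.5h",     # made-up item
  "Ask the customer about the report layout"  # made-up item, no estimate
]

remaining = todos.sum { |todo| estimated_hours(todo) }
puts remaining # 3.5
```

Recording that total once a day gives the points of a sprint burndown chart.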

We usually don't assign the tasks to anyone until we do our daily stand-up meeting. During the meeting each developer picks up a story (or set of tasks) they would like to work on and by assigning the tasks to themselves, they commit to complete the work.

That's it!
Once you're about to finish an iteration, you can print your milestones, and message boards. You will get a professional looking document that will tell you what happened during the iteration (works great to impress your customers during iteration reviews).  Once done with an iteration, do the same thing for the following ones.

We're Missing Just 1 Little Thing...
The only thing that Basecamp doesn't offer is the ability to generate a sprint burndown chart. For that purpose, we used to keep a copy of the task list in a spreadsheet. From the spreadsheet we could then generate the burndown chart. Keeping the list in the spreadsheet and in Basecamp at the same time was a real pain in the rear, and for that reason we built a tool that uses Basecamp's API to generate the burndown charts for us.

We liked the tool so much that I figured we should actually start selling it. In a few weeks we're going to be done with the subscription management process. Here's a link to it if you want to learn more: http://www.burndowngraph.com
