Programming Language Performance Part VI:F# and Ruby

This is part of a occasionally updated series on various programming languages. It should not be interpreted as a benchmark, but rather as a casual look at various programming languages and how they might be used by somebody for a practical purpose.
Currently, there are Articles written regarding Python,C#, Java and VB6 (merged for some reason),Scala,F# & Ruby,Perl, Delphi, PHP,C++,Haskell,D, VB.NET, QuickBASIC, and Rust

In order to compare various languages, I will be implementing a anagram search program in a variety of languages and testing and analyzing the resulting code. This entry describes the F# and Ruby entrants into that discussion. A naive anagram search program will typically have dismal performance, because the “obvious” brute force approach of looking at every word and comparing it to every other word is a dismal one. The more accepted algorithm is to use a Dictionary or Hashmap, and map lists of words to their sorted versions, so ‘mead’ and ‘made’ both sort to ‘adem’ and will be present in a list in the Hashmap with that key. A psuedocode representation:

Loop through all words. For each word:
1. create a new word by sorting the letters of the word.
2. use the sorted word as a key into a dictionary/hashmap structure, which indexes Lists. Add this word (the normal unsorted one) to that list.
When complete, anagrams are stored in the hashmap structure indexed by their sorted version. Each value is a list, and those that contain more than one element are word anagrams of one another.

The basic idea is to have a program that isn’t too simple, but also not something too complicated. I’ll also admit that I chose a problem that could easily leverage the functional constructs of many languages. Each entry in this series will cover an implementation in a different language and an analysis of such.

My proceeding adventure in Scala took me back to my native roost- at least partially. I went back to .NET, but this time in the form of F#. Again, I’ve hardly touched the language, like Scala; but projects such as this post make one feel adventurous. This is the code I ended up with:

open System
open System.IO
open System.Collections.Generic 
open System.Linq 
let dict = "D:\\dict.txt"
let fstream = new StreamReader(new FileStream(dict,FileMode.Open))
let anagrams = new Dictionary<String,List<String>>()

[<EntryPoint>]
let main (args : string[]) =

    let start = DateTime.Now
    let grablines = fstream.ReadToEnd().Split('\n')
    for grabline in grablines do
        
        let grabchars = grabline.ToCharArray().ToList()
        grabchars.Sort()
        let sorted = new String(grabchars.ToArray())
        if not (anagrams.ContainsKey(sorted)) then
            anagrams.Add(sorted,new List<String>())
        
        anagrams.Item(sorted).Add(grabline)
        
      
        
    let totaltime = DateTime.Now - start
    Console.WriteLine("Execution time:" + totaltime.TotalSeconds.ToString() + " seconds");
    Console.ReadKey()
    0

open System

open System.IO

open System.Collections.Generic

open System.Linq

let dict = "D:\\dict.txt"

let fstream = new StreamReader(new FileStream(dict,FileMode.Open))

let anagrams = new Dictionary<String,List<String>>()

[<EntryPoint>]

let main (args : string[]) =

let start = DateTime.Now

let grablines = fstream.ReadToEnd().Split('\n')

for grabline in grablines do

let grabchars = grabline.ToCharArray().ToList()

grabchars.Sort()

let sorted = new String(grabchars.ToArray())

if not (anagrams.ContainsKey(sorted)) then

anagrams.Add(sorted,new List<String>())

anagrams.Item(sorted).Add(grabline)

let totaltime = DateTime.Now - start

Console.WriteLine("Execution time:" + totaltime.TotalSeconds.ToString() + " seconds");

Console.ReadKey()

My initial run took almost 10 seconds. Then I remembered the C# test had changed compile settings to meet a end user environment; I changed the target platform to “Any CPU” and set it to the release build. The resulting program finished around half a second each time- even faster than the C# implementation, despite using much of the same library code (For the Dictionary, List, etc.). Not to mention this is my first iteration, and I’m a relative beginner to F#. I’m fairly sure this could be made faster and/or shorter using the myriad of functional constructs available. F# is definitely a promising language; and my quick implementation that I wrote here is faster than my C# implementation- and I would usually typify myself as a C# programmer.

Ruby

Moving right along, I decided to target a language almost universally panned for it’s “slowness”- Ruby. This is the code sample that I ended up with.

require 'time'
startt = Time.now
anagrams=Hash.new(nil)
File.foreach("D:\\dict.txt") do |word|
    if not word === "" then
		sorted = word.split('').sort.join()
		if anagrams[sorted]==nil
		    anagrams[sorted] = [word]
		else
			anagrams[sorted] << word
		end
	end
end
result = Time.now-startt
puts result

require 'time'

startt = Time.now

anagrams=Hash.new(nil)

File.foreach("D:\\dict.txt") do |word|

if not word === "" then

sorted = word.split('').sort.join()

if anagrams[sorted]==nil

anagrams[sorted] = [word]

else

anagrams[sorted] << word

end

result = Time.now-startt

puts result

Honestly, I rather like this one. It’s short, and with a little Ruby expertise, pretty easy to read. It took around 10 seconds each execution. Ruby’s rap for bad performance could be argued as being well-deserved; however, there is something to be said for the brevity and easy reading that the code provides to the mildly initiated viewer, and it certainly lives up to it’s reputation as being Rapid to work with; I went from knowing almost nothing abotu ruby to creating the above implementation in the span of an hour or so. I’m particularly fond of my use of word.split('').sort.join to turn what is often a complete sort and conversion implementation in the other languages (change to a char array, sort that, turn back into string) into a single function call. Ruby is clearly a language designed for programmers first. Considering future Ruby implementations could include compilers or a full on JIT or bytecode implementation (the current implementations are “pure” interpreters), I see no reason to pan Ruby now based simply on how it is currently implemented. The time for judging a language on it’s inherent execution model passed before the end of the Cold War.

Have something to say about this post? Comment!

BASeCamp Programming Blog

Multi-lingual when he wants to be.

Programming Language Performance Part VI:F# and Ruby

Ruby