Marco Mastrodonato ...at work: 2011

Tuesday, 13 December 2011

Leonardo: update to version 1.9

I have just released version 1.9, which enhances and completes resource management.

If you use the --under option to nest the new resource under an existing one, you must edit config/routes.rb afterwards, because the generator writes the nested routes as a new block that overrides the original routes.

For example, if we create the product resource, leonardo creates these routes:

resources :products do
  post :select,           :on => :collection
  post :edit_multiple,    :on => :collection
  put  :update_multiple,  :on => :collection
  put  :create_multiple,  :on => :collection
end

If we then create the comment resource under product, we get:

resources :products do
  resources :comments do
    post :select,           :on => :collection
    post :edit_multiple,    :on => :collection
    put  :update_multiple,  :on => :collection
    put  :create_multiple,  :on => :collection
  end
end

resources :products do
  post :select,           :on => :collection
  post :edit_multiple,    :on => :collection
  put  :update_multiple,  :on => :collection
  put  :create_multiple,  :on => :collection
end

You have to merge the two blocks by hand to get something like this:

resources :products do
  post :select,           :on => :collection
  post :edit_multiple,    :on => :collection
  put  :update_multiple,  :on => :collection
  put  :create_multiple,  :on => :collection
  resources :comments do
    post :select,           :on => :collection
    post :edit_multiple,    :on => :collection
    put  :update_multiple,  :on => :collection
    put  :create_multiple,  :on => :collection
  end
end

Moreover, the resource list shows a link between the nested resource and its parent, and that link uses the parent's name field. If the parent has no such field, either replace it with another field or add a name method to the model:

class Product < ActiveRecord::Base

  def name
    self.title
  end
end
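
The fallback described above can be sketched in plain Ruby, without Rails. The Product class below is only a stand-in for the ActiveRecord model, and display_label is a hypothetical helper that is not part of leonardo; it just illustrates using name when it exists and another field otherwise.

```ruby
# Plain-Ruby stand-in for the ActiveRecord model (assumption: no Rails here).
class Product
  attr_reader :title

  def initialize(title)
    @title = title
  end

  # Expose the title under the name field that the list view uses.
  def name
    title
  end
end

# Hypothetical helper: prefer name, fall back to another field.
def display_label(record, fallback = :title)
  record.respond_to?(:name) ? record.name : record.public_send(fallback)
end

puts display_label(Product.new("Blue chair"))
```

In a real Rails model, alias_attribute :name, :title should give a similar result with less code.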

Monday, 10 October 2011

Leonardo: Update to version 1.8

With version 1.8 I added sortable columns, following this approach:
http://railscasts.com/episodes/228-sortable-table-columns

I also added, by default, auto-submit to the search form on the index view. The form is submitted when the value changes in fields such as select or radio buttons, that is, those with a clear-cut change of value; text boxes are excluded so as not to burden the database. You can enable or disable this behaviour by adding or removing the autosubmit class:

<%= f.collection_select :category_id, Category.all, :id, :name, {:prompt => true}, {:class => 'autosubmit'} %>

In the custom.js file (loaded from application.js) I added a bind (or, more precisely, a live):

$('.autosubmit').live('change', function() {
  setTimeout("$('#"+this.id+"').parents('form:first').submit();", 300);
  return false;
});

I use a timeout so that the submit runs after any pending updates, usually hidden fields coupled with autocompletion, which leonardo does not handle yet but soon will.

With version 1.7, released in late September, I introduced the leospace option:

rails g leosca decode name:string --leospace=admin

I fixed the code for compatibility with Ruby 1.9.2 which has a slightly different String class.
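
As a small illustration of that difference, and assuming it is the String.any? change listed in the 1.8.2 changelog entry: in Ruby 1.8 String mixed in Enumerable, so any? was available, while from 1.9 it is not. The sketch below shows checks that behave the same on both versions.

```ruby
# In Ruby 1.9 and later String is not Enumerable, so String#any? is gone.
value = "leonardo"

puts value.respond_to?(:any?)  # false on Ruby 1.9 and later

# Portable checks that behave the same on 1.8 and 1.9:
puts value.size > 0
puts !value.empty?
```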

Development users are no longer created by a migration: since they are seed data, they belong in db/seeds.rb, which is loaded by rake db:seed. This is handy with db:reset or db:schema:load, which previously deleted the users.

The leosca command with --leospace=admin creates the resource "decode" outside any namespace, and only its management under the namespace given in the leospace parameter.

The original scaffold offers something similar. Take this example:

rails g scaffold admin/decode name:string

This creates both the resource and its management in the admin namespace.

Let's examine the differences.
Both create paths in the admin namespace:
/admin/decodes
/admin/decodes/:id
etc.

With the scaffold, the resource is called Admin::Decode and the table admin_decodes.

With leonardo, the resource is called only Decode and it is not under a namespace. The table is called decodes.
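
The naming difference can be sketched with a tiny plain-Ruby function. table_name_for is a hypothetical helper written for this example, not a Rails or leonardo API, and it deliberately simplifies the Rails convention (lower-case each namespace part, join with underscores, append "s").

```ruby
# Simplified sketch of the Rails naming convention (assumption: the plural
# is formed by appending "s", which is enough for this example).
def table_name_for(class_name)
  class_name.split("::").map { |part| part.downcase }.join("_") + "s"
end

puts table_name_for("Admin::Decode")  # scaffold admin/decode
puts table_name_for("Decode")         # leosca decode --leospace=admin
```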

This option meets the need to create different interfaces for the same resource. Of course you can still use the original leonardo approach:

rails g leosca admin/decode name:string

Details of the latest updates:

1.8.3 (November 8th, 2011) Marco Mastrodonato

Controller/sort_column => Fixed an issue with multiple tables
Dev users are now added by rake db:seed, no more by migration

1.8.2 (October 10th, 2011) Marco Mastrodonato

List: Added id as first column
Replaced String.any? with String.size>0 for ruby 1.9.2 support

1.8.1 (October 10th, 2011) Marco Mastrodonato

Updated rspec to pass index view test with two new controller helpers
Dates managed with datepicker are now in year-month-day format and work with sqlite3
Added autosubmit class by default. A new function in custom.js lets you autosubmit searches when a filter field changes

1.8.0 (October 7th, 2011) Marco Mastrodonato

Added sortable columns

1.7.2 (October 5th, 2011) Marco Mastrodonato

Updated formtastic support to version 2.0
Updated template.rb

1.7.1 (October 3rd, 2011) Marco Mastrodonato

Fixed a layout issue

1.7.0 (September 28th, 2011) Marco Mastrodonato

New feature to add the management of a resource within a namespace. Unlike the original way (rails g leosca admin/resource ..., which remains available), only the management is under the namespace, not the resource itself, so you can create one for several kinds of users. Try it by adding the new option, for example: --leospace=admin
Huge code improvements

Thursday, 21 April 2011

Huge files on Windows Server 2008 R2 64bit

In the previous article I discussed a simple read-and-write test on three files with sizes between 330 MB and 2.6 GB on a standard PC running Windows XP. Now let's see what changes on a virtual server, a kind of production system that a growing number of companies use.

The languages examined are:

Ruby 1.8.6 p383 (2009-08-04) [i386-mingw32]
Ruby 1.8.7 p334 (2011-02-18) [i386-mingw32]
Ruby 1.9.2 p180 (2011-02-18) [i386-mingw32]
jruby 1.6.1 (ruby-1.8.7-p330) (2011-04-12) (Java HotSpot(TM) 64-Bit Server VM 1.6.0_23) [Windows Server 2008 R2-amd64-java]
IronRuby 1.1.3.0 (ruby-1.9.2) on .NET 4.0.30319.225
Python 2.7.1 32bit
Python 2.7.1 64bit
Python 3.2.0 32bit
Python 3.2.0 64bit
Php 5.3.6 vc9 unsafe thread
Lua 5.1.4 40
C# 32bit on .NET 2.0.50727.4927
C# 64bit on .NET 2.0.50727.4927
C# 32bit on .NET 4.0.30319.1
C# 64bit on .NET 4.0.30319.1

Only Python provides x64 installation packages, and I took the opportunity to compare them with the 32-bit versions. The differences would perhaps be more relevant with math operations than with IO, but this paves the way for the next comparison.
The Ruby 1.8.6 build is mingw32 rather than mswin32 as in the previous test. IronRuby is now the latest 1.1.3, which supports Ruby 1.9.2 rather than 1.8.6 as the version in the previous test did; both, however, share the same .NET framework and the same IO section.
This time I also added C# to the comparison. I compiled four versions, combining the two platforms, x86 and x64, with two framework versions, 3.5 and 4. .NET Framework 2.0, 3.0 and 3.5 use the same CLR version.
Credit to IronRuby, first in the group and even ahead of C#, a compiled language with which it shares a lot. It is true that this test does not require much computing power, but it is certainly a curious result.

Here is a summary of memory usage:

Lua 5.1.4	0.7 MB
Php 5.3.6	2.2 MB
Python 2.7.1 32bit	2.5 MB
Python 3.2.0 32bit	3.7 MB
Python 2.7.1 64bit	4 MB
Python 3.2.0 64bit	5.5 MB
Ruby 1.9.2p180	4-6 MB
Ruby 1.8.6p383	4-9 MB
Ruby 1.8.7p334	4-9 MB
C# 32bit on .NET 2.0.50727.4927	7 MB
C# 32bit on .NET 4.0.30319.1	7 MB
C# 64bit on .NET 2.0.50727.4927	9 MB
C# 64bit on .NET 4.0.30319.1	9 MB
IronRuby 1.1.3.0 on .NET 4.0.30319.225	11 MB
jruby 1.6.1 (JVM 64-Bit Server 1.6.0_23)	jruby 1 MB + java 200 MB

This is the C# code that I compiled with Visual Studio 2010:

using System;
using System.IO;

namespace Split
{
    class Program
    {

        /// <summary>
        /// To split a file into n output files
        /// </summary>
        /// <param name="args">Filename and records number to split</param>
        static void Main(string[] args)
        {
            string strInput = args[0];
            string strOutput = "out_{0:000}.txt";
            Int32 nrec_to_split = Convert.ToInt32(args[1]);

            DateTime t1 = DateTime.Now;
            Console.WriteLine("C# {1} Started at {0:R}, please wait...", t1, System.Environment.Version);

            StreamReader sr;
            StreamWriter sw = null;
            sr = new StreamReader(strInput);
            Int16 nsplit = 0;
            Int64 nrec = 0;
            while (sr.Peek() >= 0)
            {
                if (nrec % nrec_to_split == 0)
                {
                    ++nsplit;
                    if (sw != null) sw.Close();
                    sw = new StreamWriter(String.Format(strOutput, nsplit));
                }
                sw.WriteLine(sr.ReadLine());
                ++nrec;
            }

            // Close the last output file so its buffer is flushed, then the input.
            if (sw != null) sw.Close();
            sr.Close();

            Console.WriteLine("Ended at {0:R}", DateTime.Now);
            Console.WriteLine("Elapsed time {0}", DateTime.Now - t1);
        }

    }
}

Tuesday, 19 April 2011

Ruby python php lua at work with huge files

Let's see how well the IO section of some of the most popular scripting languages performs. The exercise consists of reading several large input files sequentially and splitting each one into smaller files.

The languages under consideration are:

Ruby 1.8.6 p287 (2008-08-11) [i386-mswin32]
Ruby 1.8.7 p334 (2011-02-18) [i386-mingw32]
Ruby 1.9.2 p180 (2011-02-18) [i386-mingw32]
jruby 1.5.1 (ruby 1.8.7 patch 249) (Java HotSpot(TM) Client VM 1.6.0_14) [x86-java]
jruby 1.5.1 (ruby 1.8.7 patch 249) (Java HotSpot(TM) Client VM 1.6.0_24) [x86-java]
jruby 1.6.1 (ruby-1.8.7-p330) (Java HotSpot(TM) Client VM 1.6.0_24) [Windows XP-x86-java]
IronRuby 1.1.0.0 on .NET 4.0.30319.225
Python 2.6.2
Python 2.7.1
Python 3.2.0
Php 5.3.6 vc9 unsafe thread
Lua 5.1.4 40

We start by creating the three input files needed for the test:

ruby new.rb input1.txt 185000 1799 => 330Mb
ruby new.rb input2.txt 500000 1799 => 880Mb
ruby new.rb input3.txt 1500000 1799 => 2,6Gb
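
As a rough sanity check on those sizes, the expected byte count is the number of lines times the record size plus the line terminator. The sketch below assumes a two-byte terminator (CR+LF, as text-mode writes on Windows usually produce); expected_bytes is a hypothetical helper written for this example, and the sizes quoted above are rounded.

```ruby
# Expected size of a generated file: every record is followed by a line
# terminator (assumption: 2 bytes, CR+LF, for text-mode writes on Windows).
def expected_bytes(lines, record_size, terminator_bytes = 2)
  lines * (record_size + terminator_bytes)
end

[185_000, 500_000, 1_500_000].each do |lines|
  puts "#{lines} lines => #{expected_bytes(lines, 1799)} bytes"
end
```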

These measurements were made on a PC with an Intel Core2 Duo E7300 CPU at 2.66 GHz, 3.25 GB of RAM, Windows XP Professional 32bit, and an ST3250310AS Barracuda 7200.10 SATA 3.0Gb/s 250 GB hard disk.
Soon I will also run the test on a Windows Server 2008 R2 64bit machine on VMware (Xeon X7460, dual core at 2.66 GHz, 2 GB of RAM, SCSI disks).
I defragmented the disk before and after creating the three input files. If the timings are erratic, the disk probably needs defragmenting, or something is slowing the system down, such as the antivirus, which must be disabled.
For every file I ran six benchmarks and, given the poor performance of the IO system, dropped the three worst. Of course, I removed the output files before each test.
The graphs speak for themselves.
Just one comment about Ruby 1.9.2: it has obvious IO problems, and these results are not in line with the overall performance of the language, which previous tests of mine showed to be very good.
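
The selection of runs described above (six runs per file, the three worst dropped) can be sketched as follows. best_runs_mean is a hypothetical helper, the timings are made up, and averaging the remaining runs is an assumption of this sketch, since the article only says that the worst runs were dropped.

```ruby
# Keep the best `keep` runs out of the measured ones and average them.
# Averaging is an assumption of this sketch, not stated in the article.
def best_runs_mean(timings, keep = 3)
  best = timings.sort.first(keep)
  best.inject(0.0) { |sum, t| sum + t } / best.size
end

# Made-up timings in seconds, for illustration only.
sample = [12.4, 11.9, 15.2, 12.1, 18.7, 12.0]
puts best_runs_mean(sample)
```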

These are the scripts that I wrote:

# Written by Marco Mastrodonato on 19/04/2011
# Script to split a file into n output files
# Example:
# ruby split.rb par1 par2
# par1 => name [default => input1.txt]
# par2 => record number that determines the number of output files [default => 1650]

strinput = ARGV[0] || 'input1.txt'
nrec_to_split = ARGV[1] ? ARGV[1].to_i : 1650

unless File.exist? strinput
	puts "File #{strinput} doesn't exist!"
	exit 1
end

stroutput = "out_%03d.txt"

t1= Time.now
puts "Ruby #{RUBY_VERSION} #{strinput} started at #{t1}, wait please..."

File.open(strinput, "r") do |f|
	nsplit = 0
	nrec = 0
	fileoutput = nil
	
	while line = f.gets
		if nrec % nrec_to_split == 0
			nsplit += 1
			fileoutput.close if fileoutput
			fileoutput = File.open(stroutput % nsplit, 'w')
		end
		fileoutput.write line
		nrec += 1
	end
	
	fileoutput.close if fileoutput
end

puts "Ended at #{Time.now}"
puts "Elapsed time #{Time.now - t1}"
exit 0
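
For comparison, here is a more compact variant of the same split, as a sketch only: it is not the script that was benchmarked, it needs Ruby 1.9 or later, and it keeps one whole chunk of lines in memory at a time, so its timings would differ. split_file and the temporary demo file are hypothetical names introduced for this example.

```ruby
require "tmpdir"

# Split `path` into files of at most `chunk` lines each, written to `dir`.
# Returns the number of files written.
def split_file(path, chunk, dir)
  count = 0
  File.foreach(path).each_slice(chunk).with_index(1) do |lines, index|
    File.open(File.join(dir, format("out_%03d.txt", index)), "w") do |out|
      lines.each { |line| out.write(line) }
    end
    count = index
  end
  count
end

# Tiny demonstration on a temporary 10-line file: 4 + 4 + 2 lines.
Dir.mktmpdir do |dir|
  input = File.join(dir, "input.txt")
  File.open(input, "w") { |f| 10.times { |i| f.puts "record #{i}" } }
  puts split_file(input, 4, dir)
end
```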

# Written by Marco Mastrodonato on 19/04/2011
# Script to split a file into n output files
# Example:
# python split.py par1 par2
# par1 => name [default => input1.txt]
# par2 => record number that determines the number of output files [default => 1650]

from time import time, gmtime, strftime
import sys

try:
	strinput = sys.argv[1]
except:
	strinput = 'input1.txt'

stroutput = "out_%03d.txt"

try:
	nrec_to_split = int(sys.argv[2])
except:
	nrec_to_split = 1650

t1 = time()
print(sys.version)
print(strftime("Started at %a, %d %b %Y %H:%M:%S +0000, wait please...", gmtime()))

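nrec = 0
nsplit = 0
fileoutput = None

fileinput = open(strinput, "r")
for line in fileinput:
	if nrec % nrec_to_split == 0:
		# Close the previous chunk, if any, before opening the next one
		if fileoutput is not None:
			fileoutput.close()
		nsplit += 1
		fileoutput = open(stroutput % nsplit, "w")
	fileoutput.write(line)
	nrec += 1
# fileoutput stays None when the input file is empty
if fileoutput is not None:
	fileoutput.close()
fileinput.close()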

print(strftime("Ended at %a, %d %b %Y %H:%M:%S +0000", gmtime()))
print("Elapsed time %f" %(time() - t1))

<?php
// par1 => name [default => input1.txt]
// par2 => record number that determines the number of output files [default => 1650]

$strinput = isset($argv[1]) ? $argv[1] : 'input1.txt';
$nrec_to_split = isset($argv[2]) ? $argv[2] : 1650;
$stroutput = 'out_%03d.txt';

$t1 = microtime_float();
echo "Php ".phpversion()." started at ".date('D, d M Y H:i:s T').", wait please...\n";

$nsplit = 0;
$nrec = 0;
$fileinput=fopen($strinput,"r");

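// Loop on fgets rather than feof, so that the failed read at end of file
// does not open a spurious empty output file
while (($buffer = fgets($fileinput)) !== false) {
	if ($nrec % $nrec_to_split == 0) {
		++$nsplit;
		if (isset($fileoutput)) fclose($fileoutput);
		$fileoutput = fopen(sprintf($stroutput, $nsplit), 'w');
	}
	fwrite($fileoutput, $buffer);
	++$nrec;
}

if (isset($fileoutput)) fclose($fileoutput);
fclose($fileinput);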

echo "Ended at ".date('D, d M Y H:i:s T')."\n"; 
echo "Elapsed time ".(microtime_float() - $t1)."\n";


function microtime_float() {
	list($usec, $sec) = explode(" ", microtime());
	return ((float)$usec + (float)$sec);
}

?>

--[[
Written by Marco Mastrodonato on 19/04/2011
Script to split a file into n output files
Example:
lua split.lua par1 par2
par1 => name [default => input1.txt]
par2 => record number that determines the number of output files [default => 1650]
--]]
strinput = arg and arg[1] or "input1.txt"
stroutput = "out_%03d.txt"
nrec_to_split = arg and arg[2] and tonumber(arg[2]) or 1650

local t1 = os.clock()
print(_VERSION .. " started at " .. os.date("%a, %d %b %Y %H:%M:%S +0000"))

nsplit = 0
nrec = 0
for line in io.lines(strinput) do
  if nrec % nrec_to_split == 0 then
    if fileOut ~= nil then io.close(fileOut) end
    nsplit = nsplit + 1
    fileOut = io.open(string.format(stroutput, nsplit) , 'w')
  end
  fileOut:write (line .. '\n')
  nrec = nrec + 1
end

io.close(fileOut)

print("Ended at " .. os.date("%a, %d %b %Y %H:%M:%S +0000"))
print(string.format("Elapsed time: %.2f\n", os.clock() - t1))

To create the files I've used this simple ruby script:

# Example:
# ruby new.rb [NAME] [LINES] [RECORD SIZE]

stroutput = ARGV[0] || 'input1.txt'
num = ARGV[1] ? ARGV[1].to_i : 185000
size = ARGV[2] ? ARGV[2].to_i : 1799

if File.exist? stroutput
	puts "File #{stroutput} already exists!" 
	exit 1
end

t1= Time.now
puts "Ruby #{RUBY_VERSION} #{stroutput} started at #{t1}, wait please..."

line = "*" * size

File.open(stroutput, "w") do |f|
	num.times do
		f.puts line
	end
end

puts "Ended at #{Time.now}"
puts "Elapsed time #{Time.now - t1}"
exit 0