Thursday, 21 April 2011

Huge files on windows server 2008 R2 64bit

In the previous article I discussed the simple operation read and write of three files with sizes between 330Mb to 2.6 GB on a standard PC with Windows XP. Now we see the changes to a virtual server, a production system that involves an ever increasing number of companies.

The languages ​​examined are:

Ruby 1.8.6 p383 (2009-08-04) [i386-mingw32]
Ruby 1.8.7 p334 (2011-02-18) [i386-mingw32]
Ruby 1.9.2 p180 (2011-02-18) [i386-mingw32]
jruby 1.6.1 (ruby-1.8.7-p330) (2011-04-12) (Java HotSpot(TM) 64-Bit Server VM 1.6.0_23) [Windows Server 2008 R2-amd64-java]
IronRuby 1.1.3.0 (ruby-1.9.2) on .NET 4.0.30319.225
Python 2.7.1 32bit
Python 2.7.1 64bit
Python 3.2.0 32bit
Python 3.2.0 64bit
Php 5.3.6 vc9 unsafe thread
Lua 5.1.4 40
C# 32bit on .NET 2.0.50727.4927
C# 64bit on .NET 2.0.50727.4927
C# 32bit on .NET 4.0.30319.1
C# 64bit on .NET 4.0.30319.1

Only python provides x64 installation packages and I took the opportunity to compare them with the 32-bit versions. Perhaps the differences will be more relevant with math operations instead of IO, but I opened the way for the next comparison.
The Ruby version is the 1.8.6 mingw32 and not the mswin32 as in the previous test. IronRuby instead is the latest 1.1.3 which support ruby 1.9.2 and not 1.8.6 as the version of the previous test with which, however, shares the same framework. net and the same IO section.
This time I also added C # in comparison. I have compiled four different versions for the platform, x86 and x64, and also for the framework, 3.5 and 4. .NET Framework 2.0, 3.0 and 3.5 uses the same CLR version.
Honor to IronRuby, the first of the group that is even above than C #, a compiled language and with which it shares a lot. It 's true that this test does not require very high computing power but it is certainly a curious result.

And here also a summary on the memory usage:

Lua 5.1.4 0,7mb
Php 5.3.6 2,2mb
Python 2.7.1 32bit 2,5mb
Python 3.2.0 32bit 3,7mb
Python 2.7.1 64bit 4mb
Python 3.2.0 64bit 5,5mb
Ruby 1.9.2p180 4-6mb
Ruby 1.8.6p383 4-9mb
Ruby 1.8.7p334 4-9mb
C# 32bit on .NET 2.0.50727.4927 7mb
C# 32bit on .NET 4.0.30319.1 7mb
C# 64bit on .NET 2.0.50727.4927 9mb
C# 64bit on .NET 4.0.30319.1 9mb
IronRuby 1.1.3.0 on .NET 4.0.30319.225 11mb
jruby 1.6.1 (JVM 64-Bit Server 1.6.0_23) jruby 1mb + java 200mb






This is the C# code that I compiled with Visual Studio 2010:

using System;
using System.IO;

namespace Split
{
    class Program
    {

        ///

        /// To split a file into n output files
        /// 

        ///
Filename and records number to split
        static void Main(string[] args)
        {
            string strInput = args[0];
            string strOutput = "out_{0:000}.txt";
            Int32 nrec_to_split = Convert.ToInt32(args[1]);

            DateTime t1 = DateTime.Now;
            Console.WriteLine("C# {1} Started at {0:R}, please wait...", t1, System.Environment.Version);

            StreamReader sr;
            StreamWriter sw = null;
            sr = new StreamReader(strInput);
            Int16 nsplit = 0;
            Int64 nrec = 0;
            while (sr.Peek() >= 0)
            {
                if (nrec % nrec_to_split == 0)
                {
                    ++nsplit;
                    if (sw != null) sw.Close();
                    sw = new StreamWriter(String.Format(strOutput, nsplit));
                }
                sw.WriteLine(sr.ReadLine());
                ++nrec;
            }

            Console.WriteLine("Ended at {0:R}, please wait...", DateTime.Now);
            Console.WriteLine("Elapsed time {0}", DateTime.Now - t1);
        }

    }
}

No comments:

Post a Comment