Wednesday, March 28, 2012

DataTable vs List


  Recently just make in depth research about DataTable. DataTable is easy to use variable which allows user to pool data into it with consistence format, embed with other properties that allow us to do sorting, search etc. Currently I am researching on high performance application, which need to store large amount of data into memory. Soon I hit a bottleneck with DataTable. When inserting more than 5 million of records into DataTable, my desktop starts to struggle, even with 5G Ram So I need to find an alternative.

  The main purpose of my tool  is analyzing large amount of data. It uses to compare ID and do heavy calculation on selected field. After do some research, I found list can on par with DataTable’s needed function, at the same time increase efficiency.
So lets prove it!

DataTable button click event
private void button3_Click(object sender, EventArgs e)
        {
            Stopwatch stopwatch = new Stopwatch();
            stopwatch.Start();
            DataTable dt = new DataTable();
            dt.Columns.Add("a");
            dt.Columns.Add("b");
            dt.Columns.Add("c");
            dt.Columns.Add("d");
            dt.Columns.Add("e");
            long i = 0;
            while (i < 1000000)
            {
                dt.Rows.Add("a" + i.ToString(),"b" + i.ToString(), "c" + i.ToString(), "d" + i.ToString(), "e" + i.ToString());
                i++;
            }
            stopwatch.Stop();
            MessageBox.Show(stopwatch.Elapsed.TotalMilliseconds.ToString());
        }

List button click event
private void button4_Click(object sender, EventArgs e)
        {
            Stopwatch stopwatch = new Stopwatch();
            stopwatch.Start();
            List<List<lst>> lists = new List<List<lst>>();
            long i = 0;
            while (i < 1000000)
            {
                List<lst> list = new List<lst>();
                lst superlist = new lst
                {
                    field1 = "a" + i.ToString(),
                    field2 = "b" + i.ToString(),
                    field3 = "c" + i.ToString(),
                    field4 = "d" + i.ToString(),
                    field5 = "e" + i.ToString()
                };
                lists.Add(list);
                i++;
            }
            stopwatch.Stop();
            MessageBox.Show(stopwatch.Elapsed.TotalMilliseconds.ToString());
        }
        class lst
        {
            public string field1 { get; set; }
            public string field2 { get; set; }
            public string field3 { get; set; }
            public string field4 { get; set; }
            public string field5 { get; set; }         
        }

Benchmark result:
datatable memory usage


list memory usage


time taken in milisecond, for DataTable

time taken in milisecond, for List

List is 4X faster then datatable
DataTable uses 6X more memory then list

Conclusion:
there are some performance tweak can be done on source code. 



18 comments:

Unknown said...

thanks,
i like it.
and i am try it.

Anonymous said...

Thanks alot

Anonymous said...

Thanks alot

Sri said...

Nice article. Very useful for me.

David said...

I like your example, so I decided to make a sample project. Unless I am missing something in the code, you are never adding "lst superlist" to a collection.

Shouldn't you have a line as follows "list.Add(superlist);"

Anonymous said...

nice
4

Anonymous said...

Your sample is wrong, check your lists. Nice catch David.

Anonymous said...

Copied from my Linqpad:

DataTable
void Main()
{
Stopwatch stopwatch = new Stopwatch();
stopwatch.Start();
DataTable dt = new DataTable();
dt.Columns.Add("a");
dt.Columns.Add("b");
dt.Columns.Add("c");
dt.Columns.Add("d");
dt.Columns.Add("e");

for(int i = 0; i < 1000000; i++)
{
dt.Rows.Add("a" + i.ToString(),"b" + i.ToString(), "c" + i.ToString(), "d" + i.ToString(), "e" + i.ToString());
}
stopwatch.Stop();
stopwatch.Elapsed.TotalMilliseconds.ToString().Dump();
dt.Rows[0].Dump(); // Just to show that there is content here
}

Lists
void Main()
{
Stopwatch stopwatch = new Stopwatch();
stopwatch.Start();
var lists = new List();

for(var i = 0; i < 1000000; i++)
{
TestList l = new TestList
{
field1 = "a" + i.ToString(),
field2 = "b" + i.ToString(),
field3 = "c" + i.ToString(),
field4 = "d" + i.ToString(),
field5 = "e" + i.ToString()
};
lists.Add(l);
}
stopwatch.Stop();
stopwatch.Elapsed.TotalMilliseconds.ToString().Dump();
lists.Take(1).Dump(); // Just to show that there is content here unlike example above
}

class TestList
{
public string field1 { get; set; }
public string field2 { get; set; }
public string field3 { get; set; }
public string field4 { get; set; }
public string field5 { get; set; }
}


Your conclusion is however still mostly correct, lists do seem to be quicker and more memory efficient. Thanks for sharing.

Selva.G said...

Its really help me to understand :) thanks a lot :)

Selva.G said...
This comment has been removed by the author.
Anonymous said...

Genius!

raynnowui21 said...

After I originally commented I clicked the -Notify me when new feedback are added- checkbox and now every time a remark is added I get 4 emails with the identical comment. Is there any method you possibly can take away me from that service? Thanks! online casinos for us players

yaklibber924 said...

After examine a few of the weblog posts in your web site now, and I truly like your way of blogging. I bookmarked it to my bookmark website record and might be checking back soon. Pls take a look at my website online as nicely and let me know what you think. online casino games

Maf said...

You are a Hero! Thanks dude!

ajeet said...

helpful

Anonymous said...

Helpul hopefully you continue this

Anonymous said...

I already knew that the list have better performance than datatable, but i like your example... Very helpful.

Anonymous said...

This is a very good tip. you can also visit my article at ...

sagilas.blogspot.com