Wednesday, March 28, 2012

DataTable vs List


  Recently just make in depth research about DataTable. DataTable is easy to use variable which allows user to pool data into it with consistence format, embed with other properties that allow us to do sorting, search etc. Currently I am researching on high performance application, which need to store large amount of data into memory. Soon I hit a bottleneck with DataTable. When inserting more than 5 million of records into DataTable, my desktop starts to struggle, even with 5G Ram So I need to find an alternative.

  The main purpose of my tool  is analyzing large amount of data. It uses to compare ID and do heavy calculation on selected field. After do some research, I found list can on par with DataTable’s needed function, at the same time increase efficiency.
So lets prove it!

DataTable button click event
private void button3_Click(object sender, EventArgs e)
        {
            Stopwatch stopwatch = new Stopwatch();
            stopwatch.Start();
            DataTable dt = new DataTable();
            dt.Columns.Add("a");
            dt.Columns.Add("b");
            dt.Columns.Add("c");
            dt.Columns.Add("d");
            dt.Columns.Add("e");
            long i = 0;
            while (i < 1000000)
            {
                dt.Rows.Add("a" + i.ToString(),"b" + i.ToString(), "c" + i.ToString(), "d" + i.ToString(), "e" + i.ToString());
                i++;
            }
            stopwatch.Stop();
            MessageBox.Show(stopwatch.Elapsed.TotalMilliseconds.ToString());
        }

List button click event
private void button4_Click(object sender, EventArgs e)
        {
            Stopwatch stopwatch = new Stopwatch();
            stopwatch.Start();
            List<List<lst>> lists = new List<List<lst>>();
            long i = 0;
            while (i < 1000000)
            {
                List<lst> list = new List<lst>();
                lst superlist = new lst
                {
                    field1 = "a" + i.ToString(),
                    field2 = "b" + i.ToString(),
                    field3 = "c" + i.ToString(),
                    field4 = "d" + i.ToString(),
                    field5 = "e" + i.ToString()
                };
                lists.Add(list);
                i++;
            }
            stopwatch.Stop();
            MessageBox.Show(stopwatch.Elapsed.TotalMilliseconds.ToString());
        }
        class lst
        {
            public string field1 { get; set; }
            public string field2 { get; set; }
            public string field3 { get; set; }
            public string field4 { get; set; }
            public string field5 { get; set; }         
        }

Benchmark result:
datatable memory usage


list memory usage


time taken in milisecond, for DataTable

time taken in milisecond, for List

List is 4X faster then datatable
DataTable uses 6X more memory then list

Conclusion:
there are some performance tweak can be done on source code.