Wednesday, November 16, 2011

C# 3.0 for Beginners - Learning LINQ - An Overview


LINQ was introduced by Microsoft with the objective of reducing the complexity of accessing and integrating information. With the LINQ project, Microsoft has added query facilities to the .NET Framework that apply to all sources of information, not just relational or XML data. Every day, programmers write code that accesses a data source using loops, conditional constructs and so on. The same logic can be written using query expressions that take far less code. LINQ makes it possible to write readable, elegant code. The examples that follow show how easy LINQ code is to understand.
LINQ defines a set of standard query operators that you can use for traversal, filter and projection operations. These standard operators can be applied to any IEnumerable<T>-based information source. The set of standard query operators can be augmented with new domain-specific operators that are better suited to the target domain or technology. The LINQ project itself uses this extensibility to provide implementations that work over both XML (LINQ to XML) and SQL (LINQ to SQL) data. Let's write some code to understand the query operators in more detail:
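using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

// Wrapper class so that the method compiles; the class name is arbitrary.
public static class ProcessQueries
{
    /// <summary>
    /// Using standard query operators.
    /// </summary>
    public static void GetIExplorer()
    {
        // 1. Data source: a snapshot of the processes running on this machine
        Process[] processes = Process.GetProcesses();

        // 2. Query creation: describes what to fetch, but does not run yet
        IEnumerable<int> query = from p in processes
                                 where p.ProcessName.ToLower().Equals("iexplore")
                                 select p.Id;

        // 3. Query execution: the query runs as the loop enumerates it
        foreach (int pid in query)
        {
            Console.WriteLine("Process Id : " + pid);
        }
    }
}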
All LINQ query operations consist of three distinct actions:

  1. Identify the data source
  2. Query creation
  3. Query execution
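
The three steps can be seen in isolation with an in-memory array. The sketch below is a minimal illustration and not part of the original program; the class name DeferredDemo, the helper Run and the sample numbers are made up for the example. It also shows that creating a query does not run it: an element changed after step 2 is still picked up when the query executes in step 3.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredDemo
{
    // Hypothetical helper: performs the three steps and returns the results.
    public static List<int> Run()
    {
        // 1. Data source
        int[] numbers = { 1, 2, 3, 4, 5, 6 };

        // 2. Query creation (nothing is evaluated yet)
        IEnumerable<int> evens = from n in numbers
                                 where n % 2 == 0
                                 select n;

        // Change the source after the query has been created.
        numbers[0] = 8;

        // 3. Query execution: the array is read now, so the new value 8 is seen.
        return evens.ToList();
    }

    public static void Main()
    {
        foreach (int n in Run())
        {
            Console.WriteLine(n); // prints 8, 2, 4, 6 (one per line)
        }
    }
}
```

Because the query is only a description of the work, the same query variable can be enumerated again later and will reflect the state of the source at that time.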

Calling GetIExplorer() prints the process IDs of the currently running Internet Explorer processes. The heart of the method is the following statement:

IEnumerable<int> query = from p in processes
                         where p.ProcessName.ToLower().Equals("iexplore")
                         select p.Id;
The expression on the right-hand side of this statement is called the query expression. It is assigned to the local variable 'query'. Note that the variable holds the query itself, not its results; the results are produced only when the query is enumerated in the foreach loop. The query expression operates on one or more information sources by applying query operators from the standard set or from a domain-specific set. Here we have used two standard query operators: where and select.
The from clause names the list of processes as the data source and introduces the range variable p. That list becomes the input to the where operator, which filters it and keeps only the elements that satisfy the specified condition. The select operator then processes the remaining elements and determines what information is returned for each one; here, that is the process ID.
The process query can also be written using explicit method syntax, as shown below:
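
The flow through from, where and select can be traced on a small in-memory list whose contents are known in advance. This is only a sketch: the SampleProcess class, the PipelineDemo class, the FindIds helper and the sample names and IDs are invented for illustration, and no real processes are inspected.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for System.Diagnostics.Process with only two properties.
public class SampleProcess
{
    public string Name { get; set; }
    public int Id { get; set; }
}

public static class PipelineDemo
{
    // Hypothetical helper: returns the Ids of the entries whose name matches.
    public static List<int> FindIds(IEnumerable<SampleProcess> source, string name)
    {
        IEnumerable<int> query = from p in source                     // from: supplies each element in turn
                                 where p.Name.ToLower().Equals(name)  // where: keeps only matching elements
                                 select p.Id;                         // select: projects each kept element to its Id
        return query.ToList();
    }

    public static void Main()
    {
        var sample = new List<SampleProcess>
        {
            new SampleProcess { Name = "iexplore", Id = 101 },
            new SampleProcess { Name = "notepad",  Id = 202 },
            new SampleProcess { Name = "iEXPLORE", Id = 303 }
        };

        foreach (int id in FindIds(sample, "iexplore"))
        {
            Console.WriteLine("Process Id : " + id); // prints 101, then 303
        }
    }
}
```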

IEnumerable<int> query = Process.GetProcesses()
               .Where(s => s.ProcessName.ToLower().Equals("iexplore"))
               .Select(s => s.Id);
This form of query is called a method-based query, and the arguments passed to the query operators are called lambda expressions. In this form, each query operator is called as a method and the calls are chained together using dot notation. I will deal with lambda expressions in upcoming posts.
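
To check that the two forms are interchangeable, the sketch below runs both over the same in-memory array and compares the results. The class name SyntaxComparison, its two helper methods and the sample names are assumptions made for the example. The C# compiler translates a query expression into the corresponding Where and Select method calls, so both versions should yield the same sequence.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SyntaxComparison
{
    // Query-expression form.
    public static IEnumerable<int> QuerySyntax(string[] names)
    {
        return from n in names
               where n.StartsWith("i")
               select n.Length;
    }

    // Method-based form of the same query, using lambda expressions.
    public static IEnumerable<int> MethodSyntax(string[] names)
    {
        return names.Where(n => n.StartsWith("i"))
                    .Select(n => n.Length);
    }

    public static void Main()
    {
        string[] names = { "iexplore", "notepad", "idle", "calc" };

        // Both queries produce the lengths 8 and 4, in that order.
        bool same = QuerySyntax(names).SequenceEqual(MethodSyntax(names));
        Console.WriteLine(same); // True
    }
}
```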