Monday, July 12, 2010

You Can Actually Cast To Anonymous Types

Just a quick note, one can really cast to anonymous types!

There's a very interesting quote in the C# specifications regarding anonymous types:

"Within the same program, two anonymous object initializers that specify a sequence of properties of the same names and types in the same order will produce instances of the same anonymous type"


This means this:
var x = new { Name = "Jon Doe", Position = "Manager" };

and this:
var y = new { Name = "Chuck Norris", Position = "The One" };

will produce instances of the same type, hence x.GetType() and y.GetType() will yield the same result.
Now how to cast? For example, if you have a method like the following:

object Test()
{
    return new { Name = "XYZ", Position = "ABC" };
}

One way to cast the return of this method is as follows:
var ret = Test();
var retCasted = Cast(ret, new { Name = "", Position = "" });
Console.WriteLine("Name: {0} -- Position: {1}", retCasted.Name, retCasted.Position);

Here's my Cast method:
T Cast<T>(object o, T example)
{
    // the example argument is never used; it exists only so the compiler can infer T
    return (T)o;
}
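
If you want the call site to read a bit more naturally, the same trick works as an extension method. A minimal sketch, with names of my own invention:

public static class ObjectExtensions
{
    // the example argument only carries the anonymous type for inference
    public static T CastByExample<T>(this object o, T example)
    {
        return (T)o;
    }
}

Usage would then look like:
var casted = Test().CastByExample(new { Name = "", Position = "" });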

OK, just that quick tip!

Tuesday, July 6, 2010

List.Find(Predicate <T>) Considered Harmful

Hold on, it's not goto!

I dare say that every program ever written on this entire planet needed some sort of search functionality, and if it didn't, it's probably because it's too lame and frickin' useless.

Today I was working on a piece of code, part of which is concerned with finding items in a generic List<T>. The code I wrote was something like this:

var product = ProductsCollection.Find(p => p.Price > 500);

Doesn't that look concise and elegant? To me it does. The problem, however, is that this code is not SAFE (yeah, I was surprised too).

When I ran this code I got a System.NullReferenceException. WTF is that? ProductsCollection was a totally valid reference; it was just empty.

OK, wait a second, why should searching an empty list throw an exception? The expected result should be null or something, but not an exception. After thinking about it a little, I thought: oh, what if the list contains value types? In that case null is not a valid return value, so an exception made sense to me.

Here I thought I really got it and understood that List.Find will throw a null reference exception when called on an empty list. I was totally wrong, if you must know. The call simply returns default(T), which is null for reference types. The exception was actually thrown when I used the return value of the Find!

Now I wondered: "OK, what if I'm searching a list that contains only value types and the item I'm searching for doesn't exist in that list? What would the result be in that case? Well, let's try it out":

List<int> evens = new List<int> { 0, 2, 4, 6, 8, 10};
var evenGreaterThan10 = evens.Find(e => e > 10);
Console.WriteLine(evenGreaterThan10);

What the search returned in this case is the value 0, yes, zero! Nothing in the list is greater than 10, so Find just gives you default(T). This can lead to some really nasty bugs. Don't curse, this is frickin' documented, you lazy sloths.

Important Note:
When searching a list containing value types, make sure the default value for the type does not satisfy the search predicate. Otherwise, there is no way to distinguish between a default value indicating that no match was found and a list element that happens to have the default value
for the type. If the default value satisfies the search predicate, use the FindIndex method instead.
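
For instance, a quick sketch of what the docs suggest, where -1 unambiguously means "no match":

List<int> evens = new List<int> { 0, 2, 4, 6, 8, 10 };
int index = evens.FindIndex(e => e > 10);
if (index == -1)
    Console.WriteLine("None found");
else
    Console.WriteLine(evens[index]);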


Trying to be safe, you could simply check after the Find to see if the returned value is what you actually asked for, something like this:

List<int> evens = new List<int> { 0, 2, 4, 6, 8, 10};
var evenGreaterThan10 = evens.Find(e => e > 10);
if(evenGreaterThan10 > 10)
{
   // valid value
}
else 
{
   // none found    
}

I'm not sure how you feel about that, but me, I really hate it! So what I ended up doing is something similar to the well-known TryParse style: I overloaded the Find method with an extension method that allows usage like this:

List<int> evens = new List<int> { 0, 2, 4, 6, 8, 10};
int i;
if(evens.Find(e => e > 6, out i))
  Console.WriteLine(i);
else 
  Console.WriteLine("None found");

The extension method is as simple as:

public static bool Find<T>(this List<T> list, Predicate<T> predicate, out T output)
{
    output = list.Find(predicate);
    // if nothing matched, output is default(T); testing the predicate against
    // that default is exactly what backfires in the edit below
    return predicate(output);
}

EDIT:
As reported by Kevin Hall in the comments, this method has a serious bug. Consider the following code segment:
List<int> x = new List<int> { -10, -8, -6, -4 };
int myResult = -9999;
bool resultFound = x.Find(e => e > -3, out myResult);

In this case resultFound would be true and myResult would be zero! (Find returns default(int), which is 0, and 0 > -3 satisfies the predicate.) A better way to do this is to make use of the FindIndex method, like so:

public static bool Find<T>(this List<T> list, Predicate<T> predicate, out T output)
{
  int index = list.FindIndex(predicate);
  if (index != -1)
  {
    output = list[index];
    return true;
  }
  output = default(T);
  return false;
}
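
Re-running Kevin's case against this version now behaves as expected:

List<int> x = new List<int> { -10, -8, -6, -4 };
int myResult;
bool resultFound = x.Find(e => e > -3, out myResult);
// resultFound is now false, and myResult is default(int), i.e. 0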

Thanks Kevin for pointing that out.

I'm much happier now, I can drop the suicide thought for a while! What do you think dear reader?

Tuesday, March 23, 2010

Usage Of The "is" Operator Should Be Handled With Care

C# provides means to explicitly cast from one type to another. If you want to cast from float to int you can use the (int) cast operator to achieve that. This operator simply tells the compiler: I know you don't like this, but please let the runtime try the cast. The operation can either succeed or result in a System.InvalidCastException being thrown.

In addition to this, you can overload the explicit cast operator in case you want the cast to happen by your own defined rules. OK, let's see an example of this. Suppose we have two types, Human and Employee, and in our very unfair world, an Employee is not a Human! The layout of these two classes might look like this:

public class Human
{
    public string Name { get; set; }
}

public class Employee
{
    public string Name { get; set; }

    public string Job { get; set; }

    public override string ToString()
    {
        return string.Format("Employee: {0} is {1}", Name, Job);
    }
}

Now let's say you want to support an explicit custom conversion from Humans to Employees. For a fictitious rule, let's say that every human is an unemployed employee. The Human class, after adding the conversion operator, should look like this:
public class Human
{
    public string Name { get; set; }

    public static explicit operator Employee(Human h)
    {
        return new Employee()
        {
            Name = h.Name,
            Job = "Happily Unemployed"
        };
    }
}
 

You can now try to cast your Humans to Employees and see if the cast really applies your rules. Here's how I might cast one of our humans to an employee:
Human h = new Human { Name = "John" };
Employee s = (Employee)h;
Console.WriteLine(s);

If you run this code you should see the output on the console screen saying:

Employee: John is Happily Unemployed.

Now let's see how this plays with the famous "is" operator. The is operator is a binary operator with a boolean return type. It checks whether the left-hand operand is actually of the type named by the right-hand operand; by "of the type" I mean an instance of that same class, an instance of a derived class, or, when the right-hand operand is an interface, an instance of a class that implements it.

Here's a simple example to see this operator in action:

bool shouldBeTrue = "Hello" is string; // true
bool shouldBeTrueToo = "Hello" is object; // true
bool shouldBeFalse =  "Hello" is ICollection; // false

Now, the interesting part:
bool shouldBeWhat = new Employee() is Human; // ?? guess guess

Pause a minute and think about the above statement. What should the value of "shouldBeWhat" be? True or False? ...

OK, the value of the boolean variable "shouldBeWhat" will actually be false. Yes, Employees are not Humans! Even though you have provided an explicit cast rule that, by the virtue of its existence, states that humans can be employees. "can be" doesn't equal "is", does it? So yeah, the is operator doesn't take your explicit casting operators into account. That's the first gotcha!

The second point I want to mention is that the "is" operator works by actually performing a cast. Yes, it casts, and if the cast succeeds it returns true; otherwise it returns false. A typical usage of the is operator probably looks like this:

if (h is Employee)
{
    var x = ((Employee)h).Job;
}

This should look familiar: a typical pattern when using the is operator is to check first whether a variable is of a given type, and if it is, cast it to that type and use it. How many casts does the above code segment contain? 2 is the answer! Yes, two. One is obvious in the statement
var x = ((Employee)h).Job;
and the other one is, yeah, you guessed it, the cast performed by the is operator. This is not ideal, as it adds the overhead of a second cast that is not necessary. So, what should you do to avoid that second cast?

Use "as" instead of "is":

The "as" operator allows you to do safe casts and aslo avoid the probability of throwing any InvalidCastException, by assigning null to the variable if the cast failed. The following segment is semantically equivalent to the previous segment, but is considered faster and safer:
// note: "as" only considers reference (and boxing) conversions, never
// user-defined ones, so this compiles only when h is typed as object
var x = h as Employee;
if (x != null)
{
    string job = x.Job;
}

The above segment is faster because it includes only one cast.

Conclusion:
There's not much to say here. Just handle the is operator with care, and where possible avoid it in favor of the better alternative, the "as" operator.

Sunday, March 14, 2010

New, But Not So Obvious, Features in .NET 4.0

.NET 4.0 recently came out with a lot of new, cool, and somewhat game-changing features. These include language features like the famous dynamic keyword in C# and the whole dynamic dispatching story made possible by the DLR (Dynamic Language Runtime). There are also additions at the library level; the parallel extensions are an obvious example.

In this post I will mention some of the new additions that are not so widely publicized.

First, the additions to the string class:
The BCL guys are still working on the core. Apparently they are trying to reduce the number of extension methods you need to write as a complement to the very core, widely used classes; pretty much everyone carries a StringExtensions and a DateTimeExtensions dll around.

string.IsNullOrWhiteSpace()
The old string.IsNullOrEmpty() checks whether a string variable is null or equal to the empty string ("", or string.Empty). In many cases a string that contains only white space is also considered empty, so people used to write the following check over and over again:
if (!string.IsNullOrEmpty(s) && s.Trim() != string.Empty)
{
    // do my job;
}
Now string.IsNullOrWhiteSpace() is designed to save you that extra Trim call.
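So the dance above collapses to a single call; a quick sketch:

if (!string.IsNullOrWhiteSpace(s))
{
    // do my job;
}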

string.Join()
Prior to .NET 4.0, string.Join accepted two parameters, a separator and an array of strings, and produced a single string containing the array's elements separated by the separator. The problem with this: if you have two separate strings and you want to join them together, you have to create an array, insert those two strings into it, then call string.Join passing in your separator of choice and the array. A call would look like this:
string first = "first";
string second = "second";
string[] sequence = { first, second };
string joined = string.Join(" - ", sequence);

New overloads have been added to string.Join. One of them accepts a params array of objects and calls ToString on each automatically, so you don't have to do the extra step of creating an array to hold your string values. Now the call is simplified:
string joined = string.Join(" - ", first, second);

Modern collections support in string methods
If you examine the old overloads of string methods that accept a collection, you'll find that the only collection they accept is an array. With .NET 4.0 things are different: these methods now accept IEnumerable<T>, and therefore any IList<T> or ICollection<T> as well (yeah, I know, this should have been possible since .NET 2.0). With this support, statements like this are possible:
string joined = string.Join(" - ", stringList.Where(s => s.Length > 3));

Second, Lazy<T>

Lazy loading is a technique in which expensive objects are created and initialized on demand. Most ORMs nowadays follow this technique when fetching data from the database. Lazy<T> is a new type introduced in .NET 4.0 that enables you to create your instances lazily, and to check whether an instance has been created yet without accidentally creating it. For example, say I have a class named ExpensiveObject, like so:

class ExpensiveObject
{
    public ExpensiveObject() { }

    public ExpensiveObject(string connection)
    {
        Console.WriteLine("Constructing expensive object");
        Connection = connection;
    }

    public string Connection { get; set; }
}
Here's how I would lazily create an instance of this class:
Lazy<ExpensiveObject> a = new Lazy<ExpensiveObject>();

To check whether the object has already been created, you can use the IsValueCreated boolean property on the Lazy<T> object:
Console.WriteLine(a.IsValueCreated);
This should print False. If I access the underlying expensive object (through the Value property on Lazy<T>) and then check IsValueCreated, the result should be true:
string  dummy = a.Value.Connection; 
Console.WriteLine(a.IsValueCreated);

This statement should print True on the console window.

Note: In the above example we created the ExpensiveObject instance using its default constructor. You can create it using a custom constructor by passing in a Func<T> (i.e. any method that returns an expensive object):
Lazy<ExpensiveObject> custom = new Lazy<ExpensiveObject>(() => new ExpensiveObject("My Connection"));
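
The custom-constructor version behaves the same way on first access; a quick sketch of what you'd see:

Console.WriteLine(custom.IsValueCreated);   // False: nothing constructed yet
Console.WriteLine(custom.Value.Connection); // runs the Func, then prints "My Connection"
Console.WriteLine(custom.IsValueCreated);   // True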

Hope this helps!

Monday, March 8, 2010

Programmers And The Value Of Hard Focus

There was a very interesting discussion on reddit a few days ago. Some fresh programmer asked how many hours of pure work programmers put in per day.


By pure work he meant the number of hours spent on a task (coding, designing, testing, etc.) that was probably assigned to him by one of his superiors. Other activities like personal research, blogging, etc. are not counted.

The question sounded pretty interesting to me, and the answers turned out to be even more interesting.

Most of the answers implied that programmers put in less than four hours of pure work per day. That's half the assumed time (8 hours at most companies).

So why do programmers work fewer hours than professionals in most other careers? An immediate answer that might come to mind: "well, that's normal, because they are lazy sloths."


False Assumption:
 If I don't work HARD I will eventually fail to finish my work, and will certainly feel incompetent.

There's more to programming than HARD work:
For me, it doesn't matter how many hours I work during the day. What matters most is the number of tasks I manage to finish. That is the only productivity metric for me, and it's not related to the number of working hours; it's related to the number of hours (or minutes) I manage to keep my focus hard.

The value of Hard Focus: 
Haruki Murakami, in his excellent memoir (What I Talk About When I Talk About Running), noted:
If I'm asked what the next most important quality is for a novelist, that's easy too: focus, the ability to concentrate all your limited talents on whatever's critical at the moment. Without that you can't accomplish anything of value.

The key is to concentrate all your mental abilities on one specific task and forget about everything else as long as you're working on that one task. This is very important; in fact, a lot of programming techniques and tools are proven useful and highly adopted precisely because they let you forget about other details and focus on your specific task.

Luckily enough, Murakami also pointed out that "Fortunately [sustaining focus for a long period of time] can be acquired and sharpened through training."

Pomodoro
The Pomodoro Technique is designed to address this very problem. It helps you stay focused by working in short stretches, named pomodoros, each 25 minutes long, during which you work on only one task. While working on that one task you are not allowed to think or worry about anything else. This really helps you channel all your energy into attacking one problem.

Conclusion: 
The number of working hours is not that crucial in CS-related careers, or in any other career that requires a non-trivial amount of creativity. What matters most is to keep your mind clear, break your problem into smaller problems, and focus on those smaller problems one at a time.

Now I see (and hopefully you do too, dear reader) what every GTD book meant when it loudly screamed in readers' faces: "WORK SMART NOT HARD"!

What do you think dear reader? 

Tuesday, March 2, 2010

Routing in ASP.NET 4.0 : A Sneak Peek

Routing

One of the cool features of ASP.NET MVC is the ability to provide clean, extension-less, SEO/user-friendly URLs. This is accomplished by using the new routing system in ASP.NET.
 
Before ASP.NET 4.0, people used to get these clean URLs using a technique called URL rewriting. The technique got the job done, but unfortunately it was somewhat complicated and involved third-party components.

Now with ASP.NET 4.0, and with the addition of the new routing system, we can get clean URLs in Web Forms, like those we get with MVC.

To do this we need to first define our routes, and second, register them in the current RouteTable when the application starts.

For example, let's assume we have a page that should display the details of a certain product given its Id. The URL for this action should look something like "MySite.Com/Products/1". Routes have to be registered in Application_Start in Global.asax. The code should look something like the following:

void Application_Start(object sender, EventArgs e)
{
    RegisterRoutes(RouteTable.Routes);
}

void RegisterRoutes(RouteCollection routes)
{
    routes.MapPageRoute(
        "ProductDetails",
        "Products/{id}",
        "~/ProductDetails.aspx");
}

I'm leveraging the new MapPageRoute method on the RouteCollection class inside System.Web.Routing. The first argument is the name of the route, in our case "ProductDetails". The second parameter is the URL pattern for this route, which in this case is the string "Products" followed by the Id of the product. The {id} placeholder is the name of the actual parameter added to the RouteData collection; it can then be accessed through the page's RouteData property, like so:
int id = Convert.ToInt32(RouteData.Values["id"]);

RouteData is a shortcut property for RequestContext.RouteData. The above statement is equivalent to this one:
int id = Convert.ToInt32(HttpContext.Current.Request.RequestContext.RouteData.Values["id"]);


Note:
This is a very trivial example; the routing system is pretty powerful and lets you do a lot of neat stuff. For example, you can generate URLs out of routing values, and you can add regular-expression constraints on your route parameters to control which route matches which request.
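
A minimal sketch of such a constraint, using the longer MapPageRoute overload (here {id} must be all digits, so a URL like "Products/abc" would no longer match this route):

routes.MapPageRoute(
    "ProductDetails",
    "Products/{id}",
    "~/ProductDetails.aspx",
    false,                                        // checkPhysicalUrlAccess
    null,                                         // no default values
    new RouteValueDictionary { { "id", @"\d+" } });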

Wednesday, February 3, 2010

Why Is This Not Possible in C# Generics?


A colleague pointed me to a strange case in C# (not so sure if this is actually strange, though).
Suppose you have a class Employee. If you want to create a generic List<> of type Employee, you can simply do:
List<Employee> x = new List<Employee>();
I understand that I need to pass the Employee type to the generic list so that it knows the required type information about Employee and generates methods that return and accept parameters compatible with Employee.
Now my question is, why isn't it possible to do the following?
Employee x = new Employee();
List<typeof(x)> list = new List<typeof(x)>();
Shouldn't this supply the information List<> requires in order to create a list? In other words, the type of x, which is Employee, is now passed as a generic type parameter to List<>, which (as I used to believe) is the same as passing the type name itself (in this case Employee).
I wondered, so I posted a question on Stackoverflow.
If you're too lazy to check the question, here's my selected answer, and boy, it was submitted by Jon Skeet himself:
typeof(x) doesn't work in C# either to get the type of a variable - it returns a Type reference for the type name, e.g. typeof(string) will return a reference to the Type object associated with the System.String type. That's equivalent to using String.class in Java. (Note that again, that's applying .class to a type name, not a variable name.)
Java generics don't support anything like your final statement either. If you believe they do, please give a sample :)
What you can do in C# is use type inference to do what you want:
public static List<T> CreateListForSampleType<T>(T sample)
{
    return new List<T>();
}
...
Employee x = new Employee();
var list = CreateListForSampleType(x);
Note that there's no reason why C# couldn't be extended to allow something like typeof(variablename) or List<typeof(variablename)> - it's all compile-time type information, after all. However, I can't see that it would meet the team's requirements for usefulness... there are other far more useful features I'd like to see first :)


Thursday, January 21, 2010

A Misleading Name of Computer Science Concept: Dynamic Programming

Have you heard of these words before?
Probably yes!
First let's make a clear (well, not so clear) difference clearer. Dynamic Programming is not about dynamic typing. Dynamic typing is a property of a particular programming language: languages like Lisp, Python, and Ruby are dynamically typed, while languages like C/C++, Java, and C# are statically typed. The difference lies in whether the compiler checks certain things (types, overload resolution, etc.) before the program runs.
Dynamic Programming, on the other hand, is something very different. It's an old method for solving problems, based on dividing them into smaller, easier-to-solve subproblems.

It's mainly focused on addressing these two issues:

  1. Overlapping subproblems 
  2. Optimal Substructures
I think the best way to explain these two fuzzy concepts is with examples. To start, let's see an example of a famous dynamic programming technique called Memoization. Assume we need to implement a function that calculates Fibonacci numbers. I know this is easy, but let's look at this first implementation:
static int FibClassic(int n, ref int numberOfStepsTaken)
{
    numberOfStepsTaken += 1;
    if (n <= 1)
        return 1;
    Console.WriteLine("FibClassic called with: {0}", n);
    return FibClassic(n - 1, ref numberOfStepsTaken) + FibClassic(n - 2, ref numberOfStepsTaken);
}
Note: in the above code I used a counter variable to count the number of times the method executes. I also print the input the method is called with each time, just to give you a hint of how dividing a problem can cause the same subproblem to be computed more than once (subproblem overlapping, remember!). Now let's run the code with the number 6 as input and see what happens:
int z = 0;
int x = FibClassic(6, ref z);
Console.WriteLine("{0}: {1}", x, z);

And here's the result of this call:

As you can see, the method was called first with input 6, which is the initial input we passed in, then 5, 4, 3, 2, 2, ... oops!! Can you spot it? The method is being called on the same input more than once! Think about it for a while; pretty logical, yeah!? Our strategy is based on dividing the problem into simpler subproblems. For example, to get the 6th item in the Fibonacci sequence we divide the problem into two smaller problems, getting the 5th item and the 4th item and adding them together; then to get the 5th, you get the 4th and 3rd, etc. The next figure shows how this works.



As you can notice, the overlapping happens when solving one part of the problem includes solving another part of it. We can take advantage of this: simply memoize the solution of the overlapped subproblem, and each time we need that result we don't have to compute it again; we just supply it from wherever we stored it. Here's the modified method:
static int FibFast(int n, ref int numberOfStepsTaken, Dictionary<int, int> store)
{
    numberOfStepsTaken += 1;
    if (n <= 1)
        return 1;
    if (!store.ContainsKey(n))
        store[n] = FibFast(n - 1, ref numberOfStepsTaken, store) + FibFast(n - 2, ref numberOfStepsTaken, store);
    return store[n];
}
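
To reproduce the numbers below, a quick sketch of the call (Dictionary<int, int> lives in System.Collections.Generic):

int steps = 0;
var store = new Dictionary<int, int>();
int result = FibFast(30, ref steps, store);
Console.WriteLine("{0}: {1}", result, steps); // prints 1346269: 59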

If you run this method on the same input (6) you get 13 as the result (the same old result), but the number of calls is 11, about half the number the first method takes. That doesn't seem like a huge enhancement, so let's see how the two methods behave for bigger numbers. Running the first method on input 30 we get the result 1346269 after 2692537 calls. Running the second method on the same input (30) we get the same result, 1346269, after only 59 calls! That's a HUGE difference!

Now back to the core question: what does all this have to do with the term Dynamic Programming?
Actually, it's a very misleading term. Historically it was invented by a mathematician called Bellman, who at the time was being paid by the US defense department to work on something else. He didn't want them to know what he was really doing, so he made up a name that he was sure gave no clue about what it actually meant. Now we have to live with it forever :)

The Zen Of Python

I found these fantastic guiding principles of the design of Python, published here by Tim Peters, and once I saw them I was like, man! I ought to share these. So look them up on the site, or if you don't want to leave programming for COWARDS just yet, I'm quoting them here for you ;) :

The Zen of Python


Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!

Those quotes are pure gold. One could think of writing them down on a board and keeping that board hung in front of one's eyes, right above the computer screen!

Awesome, awesome stuff!

Tuesday, January 19, 2010

Into Git and Loving It

As Git is gaining popularity day by day, and as everyone I know likes it, I thought... well, let's give it a try.
Git is a powerful source control system. It's free and open source. It's pretty simple to use, it's intended to handle both small and large projects, and it's so freakingly FAST.



Unlike SVN and TFS, Git is not centralized. This means that every Git clone is a fully fledged repository with all versions and history information available.

Git is written in C (well, mostly) and is available for many platforms. If you are a Linux guy you can download Git from here, and if you are a Windows fellow then download msysgit.

This post is meant to be a tutorial introduction to Git. Together we will walk through creating a repository, adding initial files to it, committing those files, changing them, creating branches, viewing differences, resolving conflicts, merging, and finally pushing to a central repository.

Now to get started, let's assume you want to create a simple project which is basically an HTML file, a bunch of JavaScript files, and a CSS file. Pick whatever directory you wish to create your initial files in. After you're done, right-click the directory containing your files and select "Git Bash Here" (assuming you have already installed msysgit). A command window will show up. To initialize your repository, enter:
git init
This will create the repository for you (and will also create a default "master" branch). Now enter:
git add -A
to add all the files in this directory to the repository. You can select specific files by specifying the required file name like so:
git add filename.css
The selected files are now tracked by Git in your repository. To commit those files to the repository (the LOCAL repository, remember?):
git commit -a
This will open vim for you to enter a commit message. Enter your message and quit vi, then Git will commit your changes to the DB.

Note: 
If you're not familiar with vim or vi: to quit the editor saving the changes, type ":wq" (that's colon wq), or ":q!" to quit without saving. Or you can pass -m to the git commit command followed by the message you want in double quotes, like so:
git commit -a -m "my initial commit"

Now try editing some files and then, to view the changes, type:
git diff
This will show the changes between your uncommitted version and the last committed version. If you want to see the history of your changes at any time, use:
git whatchanged
or use:
git whatchanged -p
to see the complete differences at each change.

The very cool thing about Git is how easily it lets you create new branches. For example, if you want to write an experimental function inside a class file or a JavaScript file you're working on, but you want it kept away from the master branch, you can easily create a new branch, check it out, and edit your file; all the changes will be completely isolated from the master branch. To create a new branch:
git branch testBranch
git branch
The first command will create testBranch for you, and the second will list all the available branches in your repository (so far there should be only testBranch and the master branch "master").

Note: The selected branch has an * displayed before it. 


Now enter:
git checkout testBranch
to switch to the newly created branch. Try editing some files, and show the diffs:
git diff
You should now see the changes you made. Commit those changes, then switch back to the master branch:
git commit -a
git checkout master
If you look at the files you just edited on testBranch (after switching to the master branch), you will notice that the changes you made on testBranch are gone. However, if you switch to testBranch again, you will see your changes there. If you're happy with the changes you made on testBranch, you can merge them into the master branch with the following command:
git merge testBranch
If there are no conflicts, you're all set. If there are conflicts, you will have to resolve them manually and then commit the files; Git will show you which files have conflicts.
Now, if you are done with the test branch and want to delete it, use:
git branch -d testBranch
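
Finally, the push I promised at the top. A minimal sketch, assuming a hypothetical remote repository URL; substitute your own:

git remote add origin git@example.com:myproject.git
git push origin master

The first command registers the central repository under the name "origin"; the second uploads your master branch commits to it.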

If you used to be a TFS and VSS guy like myself, take a deep sigh of relief and enjoy Git.
I will try to blog more on Git in upcoming posts, stay tuned!

Saturday, January 16, 2010

The Selected ORM and Isolation Framework

A while ago I posted two little polls about your favorite isolation framework and ORM. The results of the two polls were as follows:

Which ORM?
1- NHibernate 49% (44 votes)
2- LLBLGen 34% (31 votes)
3- Entity Framework 6% (6 votes)
4- LINQ to SQL 6% (6 votes)
5- SubSonic 2% (2 votes)

Which Isolation Framework?
1- Typemock Isolator 50% (10 votes)
2- Moq 45% (9 votes)
3- NMock2 5% (1 vote)
4- Rhino Mocks 0% (0 votes)
5- Stubs 0% (0 votes)

Myself, I voted for NHibernate and Moq, though I know all of those tools are purely awesome.


So, dear reader, do you have any other suggestions for an ORM or an Isolation Framework?

Thursday, January 14, 2010

A Quick Tip: The Different Classes Of Algorithms

Algorithms are the heart of computer science. They are the thoughts, the ideas, and the most fun part of this industry. Scientists commonly categorize algorithms by how their running time grows with input size; four famous classes are Logarithmic, Linear, Quadratic, and Exponential. Let's look at those briefly:
1- Logarithmic Algorithms:
This type is the fastest of the four classes. Its run curve looks something like this:




Here x is the number of items to be processed by the algorithm, and y is the time taken. As you can see from the figure, the time taken increases slowly as the number of items to be processed increases. Binary search is a perfect example of a logarithmic algorithm. If you recall, binary search divides the array into two halves and excludes one half at each step. Here's an example implementation of binary search in C#:
// note: assumes the list is sorted in ascending order
bool BSearch(int[] list, int item, int first, int last)
{
    if (last - first < 2)
        return list[first] == item || list[last] == item;

    int mid = (first + last) / 2;
    if (list[mid] == item)
        return true;

    if (list[mid] > item)
        return BSearch(list, item, first, mid - 1);

    return BSearch(list, item, mid + 1, last);
}
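
A quick usage sketch (remember the list must already be sorted):

int[] sorted = { 1, 3, 5, 7, 9, 11 };
bool found = BSearch(sorted, 7, 0, sorted.Length - 1); // true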

2- Linear Algorithms:
These run in time linear in the number of input items. The curve looks like the following:



Linear search is a typical example of this, where one traverses the array (or whatever data structure) item by item. An implementation of linear search looks like this:

bool LinearSearch(int[] list, int item)
{
    for (int i = 0; i < list.Length; i++)
        if (list[i] == item)
            return true;
    return false;
}
3- Quadratic Algorithms:
These have a running time that grows with the square of the input size, which means that while processing 2 items the algorithm does 4 steps, 3 items take 9 steps, 4 items take 16 steps, etc. The next figure shows what a quadratic curve might look like:


A famous example of an algorithm with quadratic growth is Selection Sort:

void SelectionSort(int[] list)
{
    int i, j;
    int min, temp;

    for (i = 0; i < list.Length - 1; i++)
    {
        // find the index of the smallest remaining item
        min = i;
        for (j = i + 1; j < list.Length; j++)
        {
            if (list[j] < list[min])
            {
                min = j;
            }
        }

        // swap it into position i
        temp = list[i];
        list[i] = list[min];
        list[min] = temp;
    }
}

4- Exponential Algorithms:
This is the super slow one of the four. Its running time grows exponentially (that is, for 2 items it takes 2^2 steps, for 3 items 2^3, for 4 items 2^4, etc.).
Again, algorithms are the key to computer science and the most fun part of a programmer's job. Choose your algorithms carefully and always try to improve.

Hope this helps.

Tips For Web Developers: Minify Javascript Using Google's Closure Compiler

A faster web site is a goal for every developer. We spend a lot of time optimizing server code and processes, parallelizing stuff, indexing, and trying to enhance database performance. The goal behind all these complicated actions is the ultimate goal: A Faster Site.
These actions are necessary and handy for decreasing your site's response time. However, there is a lot of stuff that we developers usually ignore: the optimizations that need to happen on the client side. These client-side optimizations are as important as (probably more important than) the server-side ones.
There are a lot of techniques to optimize client-side performance. These techniques include:


  • Making Less HTTP Requests
  • Optimizing JavaScript
  • Optimizing CSS


And many more. For full details about all the possible techniques, check out Google's Page Speed initiative.
In this post I will focus on optimizing JavaScript by minifying js files. For this I leverage the Closure Compiler, a very awesome JavaScript optimization tool developed by Google.
Now suppose that I have an html document that looks like the following:




This document does a very small job, actually. A user keys his name into the textbox and clicks Salute Me, and we display a simple alert saying hello to that user.

Below it are two buttons, Hide and Show. As their names imply, the Hide button hides the area containing the label, textbox, and Salute Me button, and the Show button brings it back.

To do this I'm gonna make use of jQuery 1.3.2. Here's the code needed to achieve the required functionality:
/*
 The following code is not part of the jQuery framework
*/
$(document).ready(function() { 
 $('#hiFiver').click(function() { 
  var userName = $('#txtName').val();
  alert ("Hello " + userName); 
 });
 
 $('#showButton').click(function() { 
  $('#hidden').show();
 });
 
 $('#hideButton').click(function() { 
  $('#hidden').hide(); 
 });
});
For the sake of this demo I will append my code at the end of the actual jQuery file itself. Now my app is working as required, and my JavaScript file size is 122 KB.

Now let's run the Closure Compiler and try to minify this.
The Closure Compiler is a Java application, which means you will need the JRE (Java Runtime Environment) to run it. You can download the compiler from here.

Now I've got my files (sample.html, script.js, compiler.jar) in one directory. To run the compiler, launch your terminal and enter the following command:

 java -jar compiler.jar --js script.js --js_output_file scriptMini.js

Check your directory. You should find a new file named "scriptMini.js". The new script file is 55 KB in size, which is less than half the size of the original file. To make sure it's working, change the script src attribute in the sample document to point to the new file. You should see that the app still functions properly.
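
If you want to squeeze the file even further, the compiler also supports more aggressive compilation levels. A sketch (be warned that ADVANCED_OPTIMIZATIONS renames and removes symbols aggressively, so it can break scripts that weren't written with it in mind):

java -jar compiler.jar --compilation_level ADVANCED_OPTIMIZATIONS --js script.js --js_output_file scriptMini.js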

If you examine the newly created file, you will find that it doesn't contain any comments or spaces, all the optional semicolons are removed, and variable names are changed to shorter ones (usually one character long).

The Closure Compiler is a fascinating tool; it's really handy and easy to use.
From now on you should develop the habit of always minifying your JavaScript files.

Wednesday, January 13, 2010

Why Is O(n) Pronounced "Big Oh" of n, And Why Is "Geek" So Close To "Greek"?



This one is really short. Today some guy on the internet asked why the asymptotic form O(n) is pronounced "Big Oh". I instantly answered, "because we use the capital letter O to write it".

That answer is wrong; the letter used is actually the Greek letter Omicron.
So that's why it's pronounced Big Oh!




And the same reason applies to Big Theta and Big Omega, which use the Greek letters Theta and Omega shown in the picture above (the second and third figures, respectively). It's all coming from the Greeks, buddy!
And probably this is also why the word Geek is so close to the word Greek in spelling. I'm not sure about that one, though.


Remove Duplicate Items From an Array - A Classic Puzzle

Today I came across a kinda cool problem. A friend of mine, who is also a colleague at the same office, was trying to find a duplicate item in an array of positive integers. The array has n items, all unique except for one item. We need to come up with an algorithm that tells us which item is duplicated, in a maximum time of O(n), where n is the length of the array.

 To do that, let's first come up with some (less efficient) working algorithms. Here's the first one:

We loop over the array starting from the item at index 0, and for each item we traverse the remaining part of the array (i+1 : n-1) searching for a match. The C# code for this algorithm looks like the following:
static int FirstDuplicate(int[] arr)
{
    for (int i = 0; i < arr.Length - 1; i++)
    {
        for (int j = i + 1; j < arr.Length; j++)
        {
            if (arr[i] == arr[j])
                return arr[i];
        }
    }
    return -1;
}
As you can see, this algorithm is pretty bad. For each item in the array, an inner loop is initiated to linearly look for that specific element in the rest of the array. If you do the math, you'll find that this algorithm runs in O(n²) order of growth.

One way to improve this is to use an extra HashSet to store the items, and look each item up in the HashSet as we go. This counts as an improvement because lookup in a HashSet is really fast.
Here's the code in C#:

static int FirstDuplicateWithHashSet(int[] arr)
{
    HashSet<int> hashSet = new HashSet<int>();
    for (int i = 0; i < arr.Length; i++)
    {
        if (hashSet.Contains(arr[i]))
            return arr[i];
        hashSet.Add(arr[i]);
    }
    return -1; // no duplicate found
}
This is pretty good, but still not guaranteed O(n): the fast lookup relies on hashing, which is only O(1) on average.
The next algorithm is quite tricky. The idea is simply to create a second array and store each element of the first array at the index equal to its value in the second array. For example, take a list of 5 items [2, 4, 5, 2, 6], where the item 2 is duplicated (at index 0 and index 3) and the maximum value is 6. To find the duplicate, we create a second array of length maxVal + 1 (7 in our case), so every value can be used directly as an index. Then we loop over the first array: we take the first item (2 in our case) and store it at index 2 of the second array, then take the second item (which is 4) and store it at index 4, and so on. Each time, before storing an item in the second array, we check whether that slot already holds the value; if it does, then this item is the duplicate. Here's how the code looks:

static int FirstDuplicate(int[] arr, int maxVal)
{
    int[] temp = new int[maxVal + 1];
    for (int i = 0; i < arr.Length; i++)
    {
        // slot already holds this value? then it's the duplicate
        // (works because the items are positive, so the default 0 never matches)
        if (temp[arr[i]] == arr[i])
            return arr[i];
        temp[arr[i]] = arr[i];
    }
    return -1; // no duplicate found
}
Note: The method expects the maximum value as an input. If you can't get the maximum value, you could create the temp array with a size equal to int.MaxValue (which is not a good idea).

This algorithm is probably not realistic, but it does run in O(n) time.
One more trick to add here: if you know the range of the items in the array (e.g. from 1 to 10), you can take the sum of the numbers in the array and subtract from it the sum of the numbers from 1 to 10; the difference is the duplicated value.
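
A minimal sketch of that last trick, assuming the array holds every value from 1 to 10 exactly once plus one extra duplicate:

static int DuplicateBySum(int[] arr)
{
    int expected = 10 * 11 / 2; // the sum of 1..10 is 55
    int actual = 0;
    foreach (int x in arr)
        actual += x;
    return actual - expected;   // the surplus is the duplicated value
}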

That was a quick tip that I thought was cool and wanted to share with ya! So what do you think, dear fellows? Do you know of any possibly better algorithms? Do you suggest any optimizations to the current ones?

Wednesday, January 6, 2010

Scalability Tip In ASP.NET And The MachineKey Element

Yesterday I came across a pretty annoying problem with a web application I'm working on nowadays.
In short, the app is an ASP.NET MVC 1 app that uses forms authentication to handle user logins.
The app was working just fine, but when I started to scale it out and deploy it to more servers in the server farm, the weird behavior started to show up. When a user logs in to the application, it performs the operation successfully, logs the user in, and takes him to his personalized page. That sounds normal; however, if the user navigates to another part of the application (or just refreshes the current page), the application no longer recognizes him as logged in! If he refreshes two or three times, the app sees him as logged in again; a few more refreshes and he's no longer logged in, and so on.

Similar Problem: 
I encountered a similar problem before, but that one was caused by accidental session expiration, which happened because I was keeping SessionState InProc, which (as you may have guessed) stores it in the server's memory (i.e. it is not shared between the servers in the web farm).

This Problem: 
This problem is different because I'm not using SessionState at all. I'm just using cookies, and you know that cookies are stored on the client and sent to the server with every request, so they are sent to all servers (i.e. all servers should be able to read the cookie and determine whether the user is logged in).

How Cookies Are Written: 
This got me thinking: the problem must be with the cookie itself. It seems like some servers can read the cookie successfully and some can't. Why would that be?!!

Different Encryption/Decryption Key/Algorithm:
Aha... cookies are encrypted before they are written on the client and decrypted before they are read again by the server. The server that served the login request encrypted the cookie first (using an auto-generated encryption key and its chosen encryption algorithm). So apparently the chosen encryption keys and/or algorithms were different across the servers!

The Solution: 
The solution is quite simple, actually. All I need to do is ensure that all the servers use the same encryption algorithms and keys.
This can be done by explicitly specifying the keys and algorithms in web.config, inside the machineKey tag.
It should look something like this:
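
A sketch of the element with placeholder values (generate your own keys; never copy real ones from a blog post):

<machineKey
    validationKey="[128 hexadecimal characters]"
    decryptionKey="[64 hexadecimal characters]"
    validation="SHA1"
    decryption="AES" />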



PS: The key lengths depend on the algorithms selected:

  • For SHA1, set the validationKey to 64 bytes (128 hexadecimal characters).
  • For AES, set the decryptionKey to 32 bytes (64 hexadecimal characters).
  • For 3DES, set the decryptionKey to 24 bytes (48 hexadecimal characters).


The keys can be generated whatever way you like. Here's a simple function for generating these keys:

static string GenerateKey(int requiredLength)
{
    byte[] buffer = new byte[requiredLength / 2];
    RNGCryptoServiceProvider rng = new RNGCryptoServiceProvider();
    rng.GetBytes(buffer);

    // each random byte becomes two hexadecimal characters
    StringBuilder sb = new StringBuilder(requiredLength);
    foreach (byte t in buffer)
        sb.Append(string.Format("{0:X2}", t));
    return sb.ToString();
}
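
For example, to generate keys matching the lengths listed above:

string validationKey = GenerateKey(128); // 64 random bytes for SHA1 validation
string decryptionKey = GenerateKey(64);  // 32 random bytes for AES decryption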





For more information about the tag see here, and here for how to configure it; and if you don't mind scrolling a full page, see this for recommendations about deploying to server farms.

Hope this helps.

Mix 2010 and My Chosen Sessions

Mix 2010 is open to the public!



Microsoft is using a different strategy for a major conference (Mix) this year. Developers and designers can submit their own sessions, and those sessions are voted on by the community. The chosen sessions will be included in the conference.
Here's a list of all the sessions available for voting. The voting started yesterday, Jan 5th, and will last for 10 days. The selected sessions will be announced Jan 18th.
Go ahead and vote for your session of choice. Too many sessions out there? Would you like me to recommend some?
OK, I will...
Here are the sessions that I voted for:

Go ahead now, visit VisitMix.com and vote!