Comparing Uppercase and Lowercase in SQL Server

Occasionally it is necessary to really tell whether or not “THIS” is equal to “this” or “This”, including the cases.  For me, this came up when trying to assess whether a name in the database was already in mixed case or not.  If it was upper, I needed to convert it to mixed case, but if it was already in mixed case, I wanted to leave it alone.

If you just do a straight compare, case is not taken into account.  So, we need to somehow compare the actual character values of these.  It turns out that if we cast them to varbinary values, the uppercase letters do get different values than the lowercase letters that we can compare and determine the difference.

So, in my case, I decided that if the first character was in lower case or the second character was in upper case, I was probably dealing with a name that was in either all caps or all lowercase.  This is how I tested for that:

SELECT *
FROM Person
WHERE CAST(SUBSTRING(LastName, 1,1) as varbinary) = CAST(SUBSTRING(LOWER(LastName), 1,1) as varbinary)
	OR CAST(SUBSTRING(LastName, 2,1) as varbinary) = CAST(SUBSTRING(UPPER(LastName), 2,1) as varbinary)

Basically, it checks to see if the varbinary representation of the first character is equal to the varbinary representation of the uppercase value of that character OR if the varbinary representation of the second character is equal to the varbinary representation of the lowercase of that character.

This allowed me to focus in on converting only the names that were not already in mixed case/camel case.

This also works well in a function:

CREATE FUNCTION [dbo].[IsUpperCase]
(
	-- Add the parameters for the function here
	@Char char
)

RETURNS BIT WITH SCHEMABINDING
AS
BEGIN

	DECLARE @IsUpperCase BIT

	if (CAST(@Char AS VARBINARY) = CAST(UPPER(@Char) AS VARBINARY))
		SELECT @IsUpperCase = CAST(1 AS BIT)
	ELSE
		SELECT @IsUpperCase = CAST(0 AS BIT)

	RETURN @IsUpperCase
END

CREATE FUNCTION [dbo].[IsLowerCase]
(
	-- Add the parameters for the function here
	@Char char
)

RETURNS BIT WITH SCHEMABINDING
AS
BEGIN

	DECLARE @IsLowerCase BIT

	if (CAST(@Char AS VARBINARY) = CAST(LOWER(@Char) AS VARBINARY))
		SELECT @IsLowerCase = CAST(1 AS BIT)
	ELSE
		SELECT @IsLowerCase = CAST(0 AS BIT)

	RETURN @IsLowerCase
END

Unexpected benefits of Precompilation of LINQ

Add to FacebookAdd to DiggAdd to Del.icio.usAdd to StumbleuponAdd to RedditAdd to BlinklistAdd to TwitterAdd to TechnoratiAdd to Yahoo BuzzAdd to Newsvine

I once had a manager who told me – I can solve any maintenance problem by adding a layer of abstraction.  I can solve any performance problem by removing a layer of abstraction.

I think LINQ to SQL is a wonderful way to abstract the persistence layer elegant, easy to use, easy to manipulate, and easy to maintain lines of code.  Instead of writing SQL which amounts to “how to retrieve” the data – you manipulate an expression tree that gets closer to specifying “what data I want”.  The upside of this is huge – you can change the expression tree at any level of your code, and let .NET decide how to best write the SQL at the last possible moment – which effectively gits rid of dealing with intermediate results and inefficiently written SQL.  Unfortunately, this abstraction does indeed cause a performance hit – the translation/compilation of the tree to SQL – and it’s probably much bigger than you would think.  See http://peterkellner.net/2009/05/06/linq-to-sql-slow-performance-compilequery-critical/ to see what I mean.  In my analysis (using ANTS Profiler), when using uncompiled LINQ – the performance hit is usually about 80% compilation and only 20% retrieving the data!  Thankfully, .NET does allow you to precompile a LINQ query and save the compilation to use over and over again.

Your natural tendency when hearing those kind of numbers might be to precompile every single LINQ query you write.  There’s a big downside to doing that, though – you lose the ability to manipulate the compiled query in other parts of your code.  Another downside is that the precompilation code itself is fairly ugly and hard to read/maintain.

I’m a big believer in avoiding “premature optimization”.  What happens if you precompile everything, and in a version or two Microsoft resolves the issue and caches compilations for you behind the scenes?  You have written a ton of ugly code that breaks a major benefit of LINQ to SQL and is totally unnecessary.

Instead, I recommend you go after the low hanging fruit first – precompile the most frequently accessed queries in your application and the ones that gain no benefit from manipulating the expression tree.  In the applications I work on – there is a perfect case that fits both of these – the “get” method that returns the LINQ object representation of a single row in the database.  These are hit quite often – and there is absolutely no case where the expression tree is further refined.

The old way it was written:

	public static Item Get(int itemid) {
		return (from i in DataContext.Items
			where i.ItemID == itemid
		select i).First();
	}

The new way with Precompiled LINQ:

	private static Func<ModelDataContext, int, Item>
		GetQuery = CompiledQuery.Compile(
			(ModelDataContext DataContext, int itemid) =>

				(from i in DataContext.Items
					where i.ItemID == itemid
				select i).First()

				);

	public static Item Get(int itemid) {
		return GetQuery.Invoke(DataContext,itemid);
	}

Applying this fairly simple change to the application, I’d estimate we got 80%+ of the benefits of compiled LINQ, at the expense of a few extra lines of code per object/table and absolutely no loss of the tree manipulation.

Internationalization Data Storage in .NET (Part 1)

Add to FacebookAdd to DiggAdd to Del.icio.usAdd to StumbleuponAdd to RedditAdd to BlinklistAdd to TwitterAdd to TechnoratiAdd to Yahoo BuzzAdd to Newsvine

.NET has many built in structures for setting up your web application to support multiple languages. Most of which reside in the System.Globalization namespace. Within the System.Globalization namespace you get access to the CultureInfo class. This class defines how datatypes such as Dates, Currency and other culture related objects are displayed.

For example if the CultureInfo object is set to “mx-es” (spanish) the date would be displayed dd/mm/yyyy instead of mm/dd/yyyy. You can manage all the output settings by setting their respective properties in the CultureInfo class.

Now that we have a very basic idea of what the built in .NET tools provide us in terms of displaying internationalized data lets take a look at the solutions we have for storing globalized data.

So lets say we have a database table called Product with a field called “Description” we want to internationalize:

ProductID Name Description Date Added
1 Vista If you own or run a small business, you’ll want Windows Vista Business. 6/22/2009 3:57:07 PM

To store the multiple langages we could add a columns for each language to the table structure and subsequently update all related dependencies. Storing values in this manner is something we want to avoid here at LANIT.

Another option would be to build a look up table that stores translated values and replaces our description with a reference. This method would allow for efficiently searching and indexing but would increase the complexity of our database structure.

In our situation we decided to store every language into an XML document. The benefits of doing it this way is it allows us to add infinitely many translated versions of the product name without affecting are database structure. The draw back of using this method is the inability to efficiently search through the database on the internationalized column(s).

Let’s say we wanted to store English/French/Spanish/German translations of our product description we could do so by generating an XML file that looks similar to this.

<?xml version="1.0" encoding="utf-8" ?>
<localizedString>
<value language="en-US"> If you own or run a small business, you'll want Windows Vista Business.</value>
<value language="es-MX">Si usted es dueño o ejecutar un pequeño negocio, usted desea que Windows Vista Business.</value>
<value language="fr-FR">Si vous êtes propriétaire ou de lancer une petite entreprise, vous souhaitez que Windows Vista Business.  </value>
<value language="de-DE">Wenn Sie selbst oder ein kleines Unternehmen, Sie wollen Windows Vista Business.</value>
</localizedString>

Each value element represents a language and has an attribute called language which is set to a culture name. A list of culture names can be found here MSDN Culture Names

Updated Table Data:

ProductID Name Description Date Added
1 Vista <?xml version=”1.0″ encoding=”utf-8″ ?> 6/22/2009 3:57:07 PM

This provides us with a data storing model that will fit very easily into the current database structure and also provide us the ability to scale to as many languages as we see fit in the future.

That is it for part 1 . In part 2 I will discuss building a class/datatype that will get,set and build the above XML document for storing our translated values.

Avoid static variables in ASP.NET

Add to FacebookAdd to DiggAdd to Del.icio.usAdd to StumbleuponAdd to RedditAdd to BlinklistAdd to TwitterAdd to TechnoratiAdd to Yahoo BuzzAdd to Newsvine

Occasionally, I like to write static methods on classes. They are useful whenever the method loosely relates to to the class – but it doesn’t involve a specific instance of the class.

A good example of a static method in the .NET library is the int.Parse(string val) method. This method basically reads a string and tries to return an integer representation. The method logically fits with the int Class (I suppose a string could have a ToInt() instance method…but that would be overwhelming as you added ToFloat(), ToDecimal(), To…()) – but when you run it, there’s no reason to have an instance of int already. You run it to create a new instance of int.

Commonly, I follow similar logic in the business layer of my webapps. I write a static Add(param1,….,paramx) method that returns an instance of the class after it is persisted to the database. I could accomplish this other ways, but I think the resulting application code reads better like:

      User.Add(username,password);

Than:

      new User(username,password).Add();

or, worse:

      DataContext.Users.InsertOnSubmit(new User(username,password));
      DataContext.SubmitChanges();

The issue with static methods in your business logic is that you often need a common object to talk to the database through: a SqlConnection, a TableAdapter, A LINQ DataContext, etc. I could certainly create those objects locally inside each static method, but that’s time consuming and hard to maintain. I want instead to define a common property (in the business class or a parent class of it) that lazy-initializes and returns an instance of the object when I need it. The problem is that the property must also be static for a static method to have access to it.

The easiest way to accomplish this is something like:

      private static ModelDataContext dataContext=null;
      protected static ModelDataContext DataContext
      {
            get
            {
                 if(dataContext==null)
                     dataContext = new ModelDataContext();
                 return dataContext;
             }
       }

The tricky thing is that this will probably work in development and testing, until you get some load on your website. Then, you’ll start seeing all kinds of weird issues that may not even point to this code as the problem.

Why is this an issue? It’s all about how static variables are scoped inside a ASP.NET application. Most web programmers think of each page in their application as its own program. You have to manually share data between pages when you need to. So, they incorrectly assume that static variables are somehow tied to a web request or page in their application. This is totally wrong.

Really, your whole website is a single application, which spawns threads to deal with requests, and requests are dealt with by the code on the appropriate page. Think of a page in your webapp as a method inside of one big application, and not an application of its own – a method which is called by the url your visitor requests.

Why does this matter? Because static variables are not tied to any specific instance of any specific class, they must be created in the entire application’s scope. Effectively, ASP.NET static variables are the same as the global variables that all your programming teachers warned you about.

That means that, for the property above, every single request/page/user of your website will reuse the first created instance of DataContext created. That’s bad for several reasons. LINQ DataContexts cache some of the data and changes you make – you can quickly eat up memory if each instance isn’t disposed fairly quickly. TableAdapters hold open SQLConnections for reuse – so if you use enough classes of TableAdapters, you can have enough different static vars to tie up all of your db connections. Because requests can happen simultaneously, you can also end up with lots of locking/competing accesses to the variable. Etc.

What should you do about it? In my case, I take advantage of static properties that reference collections that are scoped to some appropriately short-lived object for my storage. For instance, System.Web.HttpContext.Current.Items:

      protected static ModelDataContext DataContext
      {
            get
            {
                 if(System.Web.HttpContext.Current.Items["ModelDataContext"]==null)
                     System.Web.HttpContext.Current.Items["ModelDataContext"] = new ModelDataContext();
                 return (ModelDataContext)System.Web.HttpContext.Current.Items["ModelDataContext"];
             }
       }

In this case, each instance of DataContext will automatically be disposed for each hit on the site – and DataContexts will never be shared between two users accessing the site simultaneously. You could also use collections like ViewState, Session, Cache, etc (and I’ve tried several of these). For my purposes, the HttpContext.Items collection scopes my objects for exactly where I want them to be accessible and exactly how long I want them to be alive.

Using Custom Properties Inside LINQ to SQL Queries

Add to FacebookAdd to DiggAdd to Del.icio.usAdd to StumbleuponAdd to RedditAdd to BlinklistAdd to TwitterAdd to TechnoratiAdd to Yahoo BuzzAdd to Newsvine

One thing that initially caused me some trouble with Linq to SQL was that properties
and functions defined on the object cannot be translated to SQL. For example, this code throws an
“The member ‘Table1.DisplayName’ has no supported translation to SQL.” exception when executed.

public string DisplayName
   {
      get
      {
         return this.NameFirst + " " + this.NameLast;
      }
}

public static IQueryable<Table1> GetAll()
{
   return (from t in DataContext.Table1s
             select t).OrderBy(t => t.DisplayName);
}

One simple solution is to just define your sort option each time it needs used.

public static IQueryable<Table1> GetAll()
{
   return (from t in DataContext.Table1s
             select t).OrderBy(t => t.NameFirst + " " + t.NameLast);
}

This option is pretty tedious to write and maintain, especially if you’re going to be using the function often.

Another option is to define a System.Linq.Expression, which is a lambda expression that can be converted to SQL.

static Expression<Func<Table1, string>> DisplayNameExpr = t => t.NameFirst + " " + t.NameLast;

public static IQueryable<Table1> GetAll()
{
   return (from t in DataContext.Table1s
             select t).OrderBy(Table1.DisplayNameExpr);
}

Note that if you do choose this way, the “DisplayName” is essentially defined in two different places. To solve this problem, use the expression to define the property.

{
   get
   {
      var nameFunc = DisplayNameExpr.Compile(); // compile the expression into a Function
      return nameFunc(this); // call the function using the current object
   }
}

Note this method can also be used to define functions that accept arguments,  so the following code also works as expected.

static Expression<Func<Table1, bool>> HasReq(int numReq)
{
   return (t => (t.IntItem + t.AnotherInt) > numReq);
}

public static IQueryable<Table1> GetWithReq(int req)
{
   return (from t in DataContext.Table1s
             select t).Where(Table1.HasReq(req));
}