String.ToLower and the dotless i

13 Aug 2012 by Dennis Geerling

Category:

    Development

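A project I am working on uses C# code that converts FieldName to lower case with FieldName.ToLower() and compares the result against lowercase string literals in a switch statement. This code works perfectly except when users use the Turkish regional settings. Then the code suddenly stops working when FieldName = “Issued”.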
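A minimal sketch of this kind of comparison is shown here. The class name DotlessIDemo and the helper Classify are hypothetical and stand in for the project's real code. The culture is passed explicitly so the behavior can be reproduced on any machine, whereas a parameterless ToLower() would pick up the current culture from the regional settings.

```csharp
using System;
using System.Globalization;

public static class DotlessIDemo
{
    // Hypothetical stand-in for the project's comparison code.
    // fieldName.ToLower() without an argument uses the current culture;
    // passing the culture explicitly makes the demo reproducible.
    public static string Classify(string fieldName, CultureInfo culture)
    {
        switch (fieldName.ToLower(culture))
        {
            case "issued":
                return "matched";
            default:
                return "not matched";
        }
    }

    public static void Main()
    {
        // Same input, two different cultures.
        Console.WriteLine(Classify("Issued", new CultureInfo("en-US")));
        Console.WriteLine(Classify("Issued", new CultureInfo("tr-TR")));
    }
}
```

On a system where Turkish culture data is available, the second call should fall through to the default branch, because the Turkish casing rules lowercase I to the dotless ı.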

The case “issued”: section is never hit, so I start troubleshooting. That is weird: FieldName converted to lower case is ıssued, but that doesn’t equal issued. What is going on here?

Let’s have a look at the character codes for these two strings. Wait a second. The first character has code 305? It turns out that the Turkish alphabet distinguishes between a dotted and a dotless i. Code 305 (U+0131) is a lowercase dotless i. Strictly speaking it is a Unicode code point rather than an ASCII code, since ASCII stops at 127. The ordinary dotted i is code 105.
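One way to inspect the character codes is sketched below. The class CodePointDemo and the helper CodeUnits are hypothetical names, and the tr-TR culture is passed explicitly instead of relying on the machine's regional settings.

```csharp
using System;
using System.Globalization;
using System.Linq;

public static class CodePointDemo
{
    // Returns the UTF-16 code unit values of a string as
    // space-separated decimal numbers.
    public static string CodeUnits(string s)
    {
        return string.Join(" ",
            s.Select(c => ((int)c).ToString(CultureInfo.InvariantCulture)));
    }

    public static void Main()
    {
        // Lowercase with Turkish casing rules, then dump the codes.
        string lowered = "Issued".ToLower(new CultureInfo("tr-TR"));
        Console.WriteLine(CodeUnits(lowered));

        // The literal used in the case label, for comparison.
        Console.WriteLine(CodeUnits("issued"));
    }
}
```

For the plain literal issued the codes are 105 115 115 117 101 100. With Turkish casing rules in effect, the first value of the lowercased string should be 305 instead of 105.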

Looking at the character codes for both words makes it clear why the case section was never hit. The fix was fairly easy at that point: pass CultureInfo.InvariantCulture to ToLower, as in FieldName.ToLower(CultureInfo.InvariantCulture). The InvariantCulture parameter essentially tells the ToLower function to ignore language/country/region culture information when converting to lower case. The documentation says the following about the InvariantCulture property:

The InvariantCulture property represents neither a neutral nor a specific culture. It represents a third type of culture that is culture-insensitive. It is associated with the English language but not with a country or region. Your applications can use this property with almost any method in the System.Globalization namespace that requires a culture. However, an application should use the invariant culture only for processes that require culture-independent results, such as formatting and parsing data that is persisted to a file. In other cases, it produces results that might be linguistically incorrect or culturally inappropriate.
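The fix described above can be sketched as follows. The class InvariantFixDemo and the helper Classify are hypothetical names, and Main switches the thread to tr-TR only to simulate Turkish regional settings.

```csharp
using System;
using System.Globalization;
using System.Threading;

public static class InvariantFixDemo
{
    // Hypothetical stand-in for the fixed comparison code:
    // lowercasing with the invariant culture ignores regional settings.
    public static string Classify(string fieldName)
    {
        switch (fieldName.ToLower(CultureInfo.InvariantCulture))
        {
            case "issued":
                return "matched";
            default:
                return "not matched";
        }
    }

    public static void Main()
    {
        // Simulate a user running with Turkish regional settings.
        Thread.CurrentThread.CurrentCulture = new CultureInfo("tr-TR");

        // The invariant lowercase of "Issued" is "issued" with a dotted i.
        Console.WriteLine(Classify("Issued"));
    }
}
```

The built-in fieldName.ToLowerInvariant() is a shorthand that applies the same invariant casing rules.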

If you zoom in on the result returned by FieldName.ToLower(), you will see that the i actually lacks a dot. I overlooked that in my troubleshooting and got stuck wondering why issued would not equal issued.

More information about the dotless i:

    http://en.wikipedia.org/wiki/Dotted_and_dotless_I
    http://webdesign.about.com/od/localization/l/blhtmlcodes-tr.htm

Tags:
    c#
    development
    i
    turkish regional settings

Dennis is a troubleshooting guru and has over 10 years of EUC experience. Dennis is a proud Dutchman, but is currently living the American dream in sunny California.
