Removing accents from certain characters
Problem
I have a method that I am using to remove accents from certain characters. The problem is the massive slew of characters I am expected to work with. I have to, basically, remove accents from all Latin characters that fit within the 26 English Latin characters (A through Z). Performance is a very large requirement. It has to be lightning fast, as I have to run this on every character within a string, and process many large strings at a time.
Currently, I use a gigantic switch statement to detect what character it is, and return the appropriate A through Z "naked" character, while preserving case.
As of now, my switch looks something like the following:
switch (input)
{
    case 'À': // 0192
    case 'Á': // 0193
    case 'Â': // 0194
    case 'Ã': // 0195
    case 'Ä': // 0196
    case 'Å': // 0197
    case 'Ā': // 0256
    case 'Ă': // 0258
    case 'Ą': // 0260
        return 'A';
    case 'Ç': // 0199
    case 'Ć': // 0262
    case 'Ĉ': // 0264
    case 'Ċ': // 0266
    case 'Č': // 0268
        return 'C';
    case 'Ď': // 0270
    case 'Đ': // 0272
        return 'D';
    // Other upper case characters
    case 'à': // 0224
    case 'á': // 0225
    case 'â': // 0226
    case 'ã': // 0227
    case 'ä': // 0228
    case 'å': // 0229
    case 'ā': // 0257
    case 'ă': // 0259
    case 'ą': // 0261
        return 'a';
    case 'ç': // 0231
    case 'ć': // 0263
    case 'ĉ': // 0265
    case 'ċ': // 0267
    case 'č': // 0269
        return 'c';
    case 'ď': // 0271
    case 'đ': // 0273
        return 'd'; // lower case, so that case is preserved
    // Other lower case characters
    default:
        return input;
}
As you can probably imagine, this method is over 200 lines, and this is the only thing it does.
private char RemoveAccent(char input)
{
    switch (input)
    {
        // You saw all the case statements
    }
}
Literally, that is it. My questions come down to the following, and this is more of a question of performance/better ways of handling the situation.
Solution
This appears to be a duplicate of this question. The link suggests using .NET's String.Normalize. If it's too slow, you could simply create an associative array (e.g., a Dictionary<char, char> that maps each accented character to its plain one) for constant-time lookup. This is going to be large, too, but I would think it's probably easier to maintain.
Context
StackExchange Code Review Q#93438, answer score: 11
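The two approaches suggested in the answer above could look roughly like the following. This is a minimal sketch, not the answerer's code: the class name AccentRemover, the method RemoveAccents, and the abbreviated Map table are all hypothetical, and the table shows only a handful of the entries a full version would need. Note that String.Normalize only strips accents that Unicode represents as combining marks; a character such as 'Đ' has no canonical decomposition, so it passes through the Normalize-based method unchanged and would still need a table entry.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Text;

public static class AccentRemover
{
    // Hypothetical, abbreviated lookup table (assumption: a real table would
    // carry one entry per character handled by the original switch).
    private static readonly Dictionary<char, char> Map = new Dictionary<char, char>
    {
        { 'À', 'A' }, { 'Á', 'A' }, { 'Â', 'A' }, { 'Ç', 'C' }, { 'Đ', 'D' },
        { 'à', 'a' }, { 'á', 'a' }, { 'â', 'a' }, { 'ç', 'c' }, { 'đ', 'd' },
    };

    // Dictionary approach: one hash lookup per character; characters that are
    // not in the table are returned unchanged.
    public static char RemoveAccent(char input)
    {
        return Map.TryGetValue(input, out char plain) ? plain : input;
    }

    // String.Normalize approach: decompose each character into its base
    // letter plus combining marks (FormD), drop the marks, then recompose.
    public static string RemoveAccents(string input)
    {
        string decomposed = input.Normalize(NormalizationForm.FormD);
        var builder = new StringBuilder(decomposed.Length);
        foreach (char c in decomposed)
        {
            if (CharUnicodeInfo.GetUnicodeCategory(c) != UnicodeCategory.NonSpacingMark)
            {
                builder.Append(c);
            }
        }
        return builder.ToString().Normalize(NormalizationForm.FormC);
    }
}
```

Which of the two is faster for a given workload is something to measure rather than assume. The dictionary keeps the per-character signature of the original RemoveAccent method, while the Normalize version works on whole strings and needs no hand-maintained table for characters that decompose.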