pattern swift Minor
Hex String to Bytes (NSData)
hex nsdata string bytes
Problem
I'm trying to convert this Objective-C code (originally found in this Stack Overflow question), which turns an NSString into NSData, to Swift:

```
- (NSData *)dataFromHexString {
    const char *chars = [self UTF8String];
    int i = 0, len = self.length;
    NSMutableData *data = [NSMutableData dataWithCapacity:len / 2];
    char byteChars[3] = {'\0','\0','\0'};
    unsigned long wholeByte;
    while (i < len) {
        byteChars[0] = chars[i++];
        byteChars[1] = chars[i++];
        wholeByte = strtoul(byteChars, NULL, 16);
        [data appendBytes:&wholeByte length:1];
    }
    return data;
}
```

My first pass looked like this:

```
func hexStringToBytes(hexString: String) -> NSData? {
    guard let chars = hexString.cStringUsingEncoding(NSUTF8StringEncoding) else { return nil}
    var i = 0
    let length = hexString.characters.count
    let data = NSMutableData(capacity: length/2)
    var byteChars: [CChar] = [0, 0, 0]
    var wholeByte = CUnsignedLong()
    while i < length {
        byteChars[0] = chars[i++]
        byteChars[1] = chars[i++]
        i+=1
        wholeByte = strtoul(byteChars, nil, 16)
        data?.appendBytes(&wholeByte, length: 1)
    }
    return data
}
```

I realized I could optimize this further, since ++ is deprecated and will be removed in Swift 3:

```
func hexStringToBytes(hexString: String) -> NSData? {
    guard let chars = hexString.cStringUsingEncoding(NSUTF8StringEncoding) else { return nil}
    var i = 0
    let length = hexString.characters.count
    let data = NSMutableData(capacity: length/2)
    var byteChars: [CChar] = [0, 0, 0]
    var wholeByte: CUnsignedLong = 0
    while i < length {
        byteChars[0] = chars[i]
        i+=1
        byteChars[1] = chars[i]
        i+=1
        wholeByte = strtoul(byteChars, nil, 16)
        data?.appendBytes(&wholeByte, length: 1)
    }
    return data
}
```

And then as an extension on String:

```
extension String {
    func dataFromHexString() -> NSData?
```
Solution
First note that your code does not detect invalid input data.
For example, the string "XX" is just converted to a zero byte. Detecting invalid input with strtoul() is a bit tricky; there is an alternative suggestion below.

```
let data = NSMutableData(capacity: length/2)
```

creates an optional NSMutableData. If that fails, then the optional chaining

```
data?.appendBytes(&wholeByte, length: 1)
```

simply does nothing, i.e. the error is ignored. It is better to check for success immediately:

```
guard let data = NSMutableData(capacity: length/2) else { return nil }
```

How can I optimize the ugly i+=1 lines?

By using stride:

```
func dataFromHexString() -> NSData? {
    guard let chars = cStringUsingEncoding(NSUTF8StringEncoding) else { return nil}
    let length = characters.count
    guard let data = NSMutableData(capacity: length/2) else { return nil }
    var byteChars: [CChar] = [0, 0, 0]
    var wholeByte: CUnsignedLong = 0
    for i in 0.stride(to: length, by: 2) {
        byteChars[0] = chars[i]
        byteChars[1] = chars[i + 1]
        wholeByte = strtoul(byteChars, nil, 16)
        data.appendBytes(&wholeByte, length: 1)
    }
    return data
}
```

This does not change the performance. The time to convert a 512,000
character string is 0.0140 sec on my computer (test code at the end).
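To illustrate the earlier remark that detecting invalid input with strtoul() is tricky, here is a minimal sketch in current Swift syntax (Swift 5, not the Swift 2 dialect used in the rest of this answer). The helper name parseHexByte is made up for illustration, and it assumes that a byte is valid only if it is written as exactly two hex digits. strtoul() on its own also accepts leading whitespace and a sign, so checking the end pointer is not enough; the first character has to be checked as well.

```swift
import Foundation

// Hypothetical helper, not part of the answer: parses a two-character
// string as one hex byte with strtoul() and returns nil for anything else.
func parseHexByte(_ pair: String) -> UInt8? {
    let utf8 = Array(pair.utf8)
    guard utf8.count == 2 else { return nil }
    // strtoul() silently skips leading whitespace and accepts "+" or "-",
    // so those have to be rejected before calling it.
    let skipped: Set<UInt8> = [0x20, 0x09, 0x0A, 0x0B, 0x0C, 0x0D, 0x2B, 0x2D]
    guard !skipped.contains(utf8[0]) else { return nil }
    return pair.withCString { base -> UInt8? in
        var end: UnsafeMutablePointer<CChar>? = nil
        let value = strtoul(base, &end, 16)
        // Accept only if both characters were consumed as hex digits.
        guard let stop = end, UnsafePointer(stop) == base + 2 else { return nil }
        return UInt8(value)
    }
}
```

With this sketch, parseHexByte("ff") yields 255, while "XX", " 1", "-1" and "0x" all yield nil.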
Is cStringUsingEncoding(NSUTF8StringEncoding) the correct way to get the [CChar] from the string?

That is fine as far as I can see. There is also

```
self.withCString {
    // $0 is a pointer to the NUL-terminated UTF-8 string
}
```

which I personally prefer, but I could not detect a difference in
the performance.
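As a small illustration of what withCString hands to its closure, the following sketch (current Swift 5 syntax, with a made-up helper name cStringBytes) walks the pointer until it reaches the terminating NUL and collects the UTF-8 bytes on the way. It is only meant to show the shape of the API, not a recommended way to obtain the bytes.

```swift
// Hypothetical helper: copies the UTF-8 bytes that withCString exposes,
// stopping at the terminating NUL byte.
func cStringBytes(_ s: String) -> [UInt8] {
    return s.withCString { ptr -> [UInt8] in
        var bytes: [UInt8] = []
        var p = ptr
        while p.pointee != 0 {
            // CChar may be signed, so convert by bit pattern, not by value.
            bytes.append(UInt8(truncatingIfNeeded: p.pointee))
            p += 1
        }
        return bytes
    }
}
```

The pointer is only valid inside the closure, which is why the bytes are copied into an array before returning.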
Since this is almost a direct translation from Objective-C, is there any way to make this more "Swifty," say by using map or stride, without sacrificing speed?

stride() is already used for the iteration. I do not see a use case for map() here.

But the performance can be improved considerably.
As observed here
and here, accessing the UTF-16 view
of a Swift string is very fast. This leads to the following implementation:
```
func dataFromHexString() -> NSData? {
    let utf16 = self.utf16
    guard let data = NSMutableData(capacity: utf16.count/2) else { return nil }
    var byteChars: [CChar] = [0, 0, 0]
    var wholeByte: CUnsignedLong = 0
    var i = utf16.startIndex
    while i != utf16.endIndex {
        byteChars[0] = CChar(truncatingBitPattern: utf16[i])
        byteChars[1] = CChar(truncatingBitPattern: utf16[i.advancedBy(1, limit: utf16.endIndex)])
        wholeByte = strtoul(byteChars, nil, 16)
        data.appendBytes(&wholeByte, length: 1)
        i = i.advancedBy(2, limit: utf16.endIndex)
    }
    return data
}
```

which converts the 512,000 character string in 0.00185 sec.
Note that invalid input is still not detected.
We can still make it faster by converting the UTF-16 code points
"manually" instead of using strtoul(). This is more code,
but again faster, and also detects all kinds of invalid input:
```
func dataFromHexString() -> NSData? {
    // Convert 0 ... 9, a ... f, A ...F to their decimal value,
    // return nil for all other input characters
    func decodeNibble(u: UInt16) -> UInt8? {
        switch(u) {
        case 0x30 ... 0x39:
            return UInt8(u - 0x30)
        case 0x41 ... 0x46:
            return UInt8(u - 0x41 + 10)
        case 0x61 ... 0x66:
            return UInt8(u - 0x61 + 10)
        default:
            return nil
        }
    }
    let utf16 = self.utf16
    guard let data = NSMutableData(capacity: utf16.count/2) else {
        return nil
    }
    var i = utf16.startIndex
    while i != utf16.endIndex {
        guard let
            hi = decodeNibble(utf16[i]),
            lo = decodeNibble(utf16[i.advancedBy(1, limit: utf16.endIndex)])
        else {
            return nil
        }
        var value = hi << 4 + lo
        data.appendBytes(&value, length: 1)
        i = i.advancedBy(2, limit: utf16.endIndex)
    }
    return data
}
```

The time to convert the 512,000 character string is now
0.0008 seconds. This is more than 17 times faster than the original code.
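For readers on a current toolchain, the same nibble-decoding idea might look roughly like this in Swift 5. This is a sketch under assumptions, not the code that was benchmarked above: the method name hexDecodedData is made up, it returns Data instead of NSData, and it treats an odd number of digits as invalid input by checking the length up front.

```swift
import Foundation

extension String {
    // Hypothetical Swift 5 counterpart of dataFromHexString().
    func hexDecodedData() -> Data? {
        // Map one UTF-16 code unit to its 4-bit value, nil if not a hex digit.
        func decodeNibble(_ u: UInt16) -> UInt8? {
            switch u {
            case 0x30...0x39: return UInt8(u - 0x30)       // "0" ... "9"
            case 0x41...0x46: return UInt8(u - 0x41 + 10)  // "A" ... "F"
            case 0x61...0x66: return UInt8(u - 0x61 + 10)  // "a" ... "f"
            default: return nil
            }
        }
        let units = Array(self.utf16)
        // Assumption: an odd number of digits counts as invalid input.
        guard units.count % 2 == 0 else { return nil }
        var data = Data(capacity: units.count / 2)
        for i in stride(from: 0, to: units.count, by: 2) {
            guard let hi = decodeNibble(units[i]),
                  let lo = decodeNibble(units[i + 1]) else { return nil }
            data.append(hi << 4 | lo)
        }
        return data
    }
}
```

In Swift the shift operators bind more tightly than + and |, so hi << 4 | lo here, like hi << 4 + lo in the code above, combines the high nibble with the low one as intended.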
Test code:

```
let s1 = (0 ... 255).map { String(format:"%02x", $0) }.joinWithSeparator("")
let str = Repeat(count: 100, repeatedValue: s1).joinWithSeparator("")
print(str.characters.count) // 51200
let start = NSDate()
if let data = str.dataFromHexString() {
    let duration = NSDate().timeIntervalSinceDate(start)
    print(duration)
} else {
    print("failed")
}
```

The tests were done on a MacBook, with the program compiled in
Release mode.
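The measurement setup can be approximated in current Swift (Swift 5) along the following lines. This is only a sketch: elapsedSeconds is a made-up helper, the workload inside the closure is a stand-in that should be replaced by the conversion being measured, and the repeat count of 100 is taken from the test code as written.

```swift
import Foundation

// 256 two-digit groups "00" ... "ff", i.e. 512 characters.
let group = (0...255).map { String(format: "%02x", UInt32($0)) }.joined()
// Same repeat count as in the test code.
let input = String(repeating: group, count: 100)

// Hypothetical helper: runs `body` once and returns the elapsed seconds.
func elapsedSeconds(_ body: () -> Void) -> TimeInterval {
    let start = Date()
    body()
    return Date().timeIntervalSince(start)
}

let seconds = elapsedSeconds {
    // Stand-in workload; replace with a call to the conversion under test.
    _ = input.utf16.reduce(0) { $0 &+ Int($1) }
}
print(input.count, seconds)
```

A single run like this is noisy, so repeating the measurement and compiling with optimizations gives more meaningful numbers.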
Context
StackExchange Code Review Q#135424, answer score: 7