principlesqlMinor
Suggest better design pattern for reverse DNS lookups
Viewed 0 times
reversedesignbetterlookupsfordnssuggestpattern
Problem
I'm working on a project where I need to look up the hostnames associated with IP addresses who were logged making HTTP requests. The lookups currently happen as part of a daily ETL job. The current method is to use a scalar CLR function (similar code to this is posted a number of places on the web, posted below with my revisions; I'm not sure who the original author was):
I'm not a C# developer so the quality of the CLR code is likely not great.
Then I call the function like this after loading new rows into the dimension:
This method works but is very slow, for at leas
using System.Data.SqlTypes;
using System.Net;
using System.Security;
using System.Text.RegularExpressions;
using Microsoft.SqlServer.Server;
public partial class udfn_GetHostName
{
[Microsoft.SqlServer.Server.SqlFunction]
public static string udfn_GetHostname(string IPAddr)
{
try
{
/*
Using deprecated method intentionally.
GetHostEntry() is now recommended.
But it does some irritating things like returning an error if a PTR
record points to a name that doesn't have an A record.
*/
IPHostEntry IpEntry = Dns.GetHostByAddress(IPAddr);
// Test whether the record returned has at least one alphabetic character
// If it does, then it's a name
// Otherwise the DNS server might have returned the IP address
Match match = Regex.Match(IpEntry.HostName.ToString(), @"[a-zA-Z]+");
if (match.Success)
{
return IpEntry.HostName.ToString();
}
else
{
return "None";
}
}
catch(Exception ex)
{
return "Failed";
//return ex.Message.ToString();
}
}
}I'm not a C# developer so the quality of the CLR code is likely not great.
Then I call the function like this after loading new rows into the dimension:
-- Update only rows that we just inserted
UPDATE DIM.Network_Addresses
SET reverse_dns = dbo.[udfn_GetHostname](client_ip)
WHERE reverse_dns IS NULL
AND is_current = 1
AND created_date = (SELECT MAX(created_date) FROM DIM.API_Network_Address);This method works but is very slow, for at leas
Solution
Do not do it in SQL. From the few things that can be categorized as 'stupid use of SQLCLR', making expensive lengthy network calls ranks as #1. At the very very very very least, make sure the CLR code calls
The proper way to handle this is to use an external process, place the IPs to resolve in a queue, dequeue in batches and resolve several (tens) of IPs in parallel using asynchronous I/O, eg. the non-obsolete
Thread.BeginThreadAffinity() before it goes into waiting for the intertubez to respond (DNS lookup and reverse lookup included).The proper way to handle this is to use an external process, place the IPs to resolve in a queue, dequeue in batches and resolve several (tens) of IPs in parallel using asynchronous I/O, eg. the non-obsolete
Dns.BeginGetHostEntry().Context
StackExchange Database Administrators Q#53483, answer score: 8
Revisions (0)
No revisions yet.