principlesqlMinor
What is the best index strategy or query SELECT when performing a search/lookup BETWEEN IP address (IPv4 and IPv6) ranges?
Viewed 0 times
performingtheaddresswhatsearchqueryselectipv4rangeslookup
Problem
Question: Is there a better indexing strategy or query SELECT that I can use for looking up one large data set against another large data set? Or, should I look at placing the lookup dimension table in memory (all 125 GB of it)?
Server Configuration:
Aug 22 2017 17:04:49
Copyright (C) 2017 Microsoft Corporation
Standard Edition (64-bit) on Windows Server 2016 Standard 10.0 (Build 14393: ) (Hypervisor)
IIS Exchange Server log table Schema:
```
CREATE TABLE [FWY].ExchangeServerLogTest NOT NULL,
[SourceFileName] varchar NOT NULL,
[SourceServer] varchar NOT NULL,
[SourceService] varchar NOT NULL,
[EventOccuranceTs] [datetime] NOT NULL,
[ServiceType] varchar NOT NULL,
[UserNameType] varchar NOT NULL,
[DomainId] varchar NULL,
[DomainName] varchar NULL,
[UserNameToLookup] varchar NOT NULL,
[UserAgent] varchar NULL,
[OutsideProtocolId] varchar NOT NULL,
[OutsideIp] varchar NULL,
[OutsideIpHex] varbinary NULL,
[InsideProtocolId] varchar NOT NULL,
[InsideIp] varchar NULL,
[InsideIpHex] varbinary NULL,
[DeviceId] varchar NULL,
[DeviceType] [va
Server Configuration:
- The server is a virtual server running on top of VMWare, so additional hardware can be added in the background without having to reinstall the operating system
- Microsoft SQL Server 2017 (RTM) - 14.0.1000.169 (X64)
Aug 22 2017 17:04:49
Copyright (C) 2017 Microsoft Corporation
Standard Edition (64-bit) on Windows Server 2016 Standard 10.0 (Build 14393: ) (Hypervisor)
- Note: I was previously on 2014 Enterprise - I have inquired why I was placed on Standard.
- There is only one instance that is running 2 databases: mine and the DBAs
- 2 File groups, with 1 file each: PRIMARY (system tables : not-default) and SECONDARY (non-system tables : default). The SECONDARY was meant to be scalable to hold more files once more CPUs were added. When the file group was initially created the server only had 2 CPUs
- 8 GB memory
- 500 GB disk storage (ISCSI SAN)
- 4 CPUs (Intel I assume)
IIS Exchange Server log table Schema:
```
CREATE TABLE [FWY].ExchangeServerLogTest NOT NULL,
[SourceFileName] varchar NOT NULL,
[SourceServer] varchar NOT NULL,
[SourceService] varchar NOT NULL,
[EventOccuranceTs] [datetime] NOT NULL,
[ServiceType] varchar NOT NULL,
[UserNameType] varchar NOT NULL,
[DomainId] varchar NULL,
[DomainName] varchar NULL,
[UserNameToLookup] varchar NOT NULL,
[UserAgent] varchar NULL,
[OutsideProtocolId] varchar NOT NULL,
[OutsideIp] varchar NULL,
[OutsideIpHex] varbinary NULL,
[InsideProtocolId] varchar NOT NULL,
[InsideIp] varchar NULL,
[InsideIpHex] varbinary NULL,
[DeviceId] varchar NULL,
[DeviceType] [va
Solution
-
First, I suggest you add two separate indexes, on
and try the queries again. Your 4-column index is not good for the "Outside" query as the columns appear in the 2nd and 4th position and only slightly good for the "inside" query (1st and 3rd). Plus, these 2 indexes will be half in size (20 bytes vs 40 bytes per row).
-
Second, a minor improvement. Since you only have two options for the
It won't be a huge saving, but for big tables, it would help. For the not so big, 200M rows x 3 bytes = 600MB save, for every index where the columns appear. I'm not entirely sure about space use of indexes
Smaller indexes, smaller size on disk and more important, less memory and more probable to stay in memory, especially with the low RAM server you have.
-
Third, 8GB sounds like a very small amount of RAM these days, especially when you have tables of this size. RAM is cheap (at least until you pass the 128GB Standard/Enterprise threshold and then you have the bigger licence charge).
First, I suggest you add two separate indexes, on
(InsideProtocolKey, InsideIpHex) INCLUDE (RowKey)
(OutsideProtocolKey, OutsideIpHex) INCLUDE (RowKey)and try the queries again. Your 4-column index is not good for the "Outside" query as the columns appear in the 2nd and 4th position and only slightly good for the "inside" query (1st and 3rd). Plus, these 2 indexes will be half in size (20 bytes vs 40 bytes per row).
-
Second, a minor improvement. Since you only have two options for the
ProtocolKey column (and its variations, Inside/Outside), you could conevert (all of them) from int (4 bytes) to tinyint (1 byte) or even to bit (1 bit) and save 3 bytes per row (or 3 + 7/8). It won't be a huge saving, but for big tables, it would help. For the not so big, 200M rows x 3 bytes = 600MB save, for every index where the columns appear. I'm not entirely sure about space use of indexes
bit columns but surely the save would be either the same as with tinyint (600MB) or more (up to 775MB) for the same table size. Still, and I mention this again, for every index that uses the column. Smaller indexes, smaller size on disk and more important, less memory and more probable to stay in memory, especially with the low RAM server you have.
-
Third, 8GB sounds like a very small amount of RAM these days, especially when you have tables of this size. RAM is cheap (at least until you pass the 128GB Standard/Enterprise threshold and then you have the bigger licence charge).
Code Snippets
(InsideProtocolKey, InsideIpHex) INCLUDE (RowKey)
(OutsideProtocolKey, OutsideIpHex) INCLUDE (RowKey)Context
StackExchange Database Administrators Q#231689, answer score: 4
Revisions (0)
No revisions yet.