patternsqlMinor
Frequency distributions segmented by a field
Viewed 0 times
frequencyfielddistributionssegmented
Problem
Some background:
I have a sample population data file. Each record in the data file has a frequency weight (FIELD NAME:
Challenge:
I want to further segregate these frequencies based on other values. Say a frequency distribution of weights for different household sizes. One way to achieve this is to use different where conditions in the statement above:
But is there any way to create a single table with all of the different segments? Something like this:
A fancier solution I am looking for: A stored procedure where I can just specify the variable (say household size or household income) and the code should be able to generate separate frequency distributions for all distinct values for that variable as a view/table.
Any thoughts folks?
I have a sample population data file. Each record in the data file has a frequency weight (FIELD NAME:
wgt) indicating how many times the records needs to be replicated to get the true population. The data is set up in Microsoft SQL 2008 R2. It is usually a good practice to generate frequency distribution of the weights to see if one particular type of record is being over-represented in any way - thus help identify trends/ outliers. This task is straightforward in SQL:SELECT wgt, COUNT(*) FROM tablename
GROUP BY wgtChallenge:
I want to further segregate these frequencies based on other values. Say a frequency distribution of weights for different household sizes. One way to achieve this is to use different where conditions in the statement above:
SELECT wgt, COUNT(*) FROM tablename
WHERE household_size=x --x being the desired segment
GROUP BY wgtBut is there any way to create a single table with all of the different segments? Something like this:
WGT | SIZE1 SIZE2 SIZE3 SIZE4
--------------------------------------------------
1 | 2,034 1,025 502 234
2 | 215 253 142 23
3 | 31 25 21 34
4 | 7 1 3 7
5 | 5 NULL 2 5
6 | 1 1 NULL NULL
7 | NULL 1 NULL NULLA fancier solution I am looking for: A stored procedure where I can just specify the variable (say household size or household income) and the code should be able to generate separate frequency distributions for all distinct values for that variable as a view/table.
Any thoughts folks?
Solution
For the first challenge, are you looking for something like this?
Now for the fancier challenge, where @column will be your sp parameter:
This part selectes the distinct values of the tablename's column received into a cursor
Next we will iterate over the distinct values to dynamically build up a query that will look just like the one at the top of my post, so here is the basis for it:
start the select:
the iterative part which dynamically adds the successive
finish the select:
SELECT wgt, SUM(Case when household_size=1 then 1 else 0 end) AS SIZE1
, SUM(Case when household_size=2 then 1 else 0 end) AS SIZE2
, SUM(Case when household_size=3 then 1 else 0 end) AS SIZE3
, SUM(Case when household_size=4 then 1 else 0 end) AS SIZE4
FROM tablename
GROUP BY wgtNow for the fancier challenge, where @column will be your sp parameter:
DECLARE @column VARCHAR(100);
SET @column = 'household_size';
DECLARE @main_SQL VARCHAR(4000);
SET @main_SQL =
'DECLARE dynamic_cursor CURSOR FOR
SELECT DISTINCT ' + @column + '
FROM tablename
ORDER BY ' + @column + '
--Now we will be using a cursor to iterate over
DECLARE @value INT;
OPEN dynamic_cursor
FETCH NEXT FROM dynamic_cursor INTO @value
DECLARE @sql VARCHAR(2000);
SET @sql = ''SELECT wgt'';
WHILE @@FETCH_STATUS = 0
BEGIN
SET @sql = @sql + '', SUM(Case when ' + @column + '='' + CAST(@value AS VARCHAR) + '' then 1 else 0 end) AS SIZE'' + CAST(@value AS VARCHAR)
FETCH NEXT FROM dynamic_cursor INTO @value
END
SET @sql = @sql + '' FROM tablename GROUP BY wgt''
close dynamic_cursor
DEALLOCATE dynamic_cursor
EXEC(@sql) '
EXEC(@main_SQL)This part selectes the distinct values of the tablename's column received into a cursor
DECLARE dynamic_cursor CURSOR FOR
SELECT DISTINCT ' + @column + '
FROM tablename
ORDER BY ' + @column + 'Next we will iterate over the distinct values to dynamically build up a query that will look just like the one at the top of my post, so here is the basis for it:
start the select:
SET @sql = ''SELECT wgt''the iterative part which dynamically adds the successive
, SUM(Case when household_size=n then 1 else 0 end) AS SIZEn to the select (note that @value will be the distinct values that we got selected into the cursor):SET @sql = @sql + '', SUM(Case when ' + @column + '='' + CAST(@value AS VARCHAR) + '' then 1 else 0 end) AS SIZE'' + CAST(@value AS VARCHAR)finish the select:
SET @sql = @sql + '' FROM tablename GROUP BY wgt''Code Snippets
SELECT wgt, SUM(Case when household_size=1 then 1 else 0 end) AS SIZE1
, SUM(Case when household_size=2 then 1 else 0 end) AS SIZE2
, SUM(Case when household_size=3 then 1 else 0 end) AS SIZE3
, SUM(Case when household_size=4 then 1 else 0 end) AS SIZE4
FROM tablename
GROUP BY wgtDECLARE @column VARCHAR(100);
SET @column = 'household_size';
DECLARE @main_SQL VARCHAR(4000);
SET @main_SQL =
'DECLARE dynamic_cursor CURSOR FOR
SELECT DISTINCT ' + @column + '
FROM tablename
ORDER BY ' + @column + '
--Now we will be using a cursor to iterate over
DECLARE @value INT;
OPEN dynamic_cursor
FETCH NEXT FROM dynamic_cursor INTO @value
DECLARE @sql VARCHAR(2000);
SET @sql = ''SELECT wgt'';
WHILE @@FETCH_STATUS = 0
BEGIN
SET @sql = @sql + '', SUM(Case when ' + @column + '='' + CAST(@value AS VARCHAR) + '' then 1 else 0 end) AS SIZE'' + CAST(@value AS VARCHAR)
FETCH NEXT FROM dynamic_cursor INTO @value
END
SET @sql = @sql + '' FROM tablename GROUP BY wgt''
close dynamic_cursor
DEALLOCATE dynamic_cursor
EXEC(@sql) '
EXEC(@main_SQL)DECLARE dynamic_cursor CURSOR FOR
SELECT DISTINCT ' + @column + '
FROM tablename
ORDER BY ' + @column + 'SET @sql = ''SELECT wgt''SET @sql = @sql + '', SUM(Case when ' + @column + '='' + CAST(@value AS VARCHAR) + '' then 1 else 0 end) AS SIZE'' + CAST(@value AS VARCHAR)Context
StackExchange Database Administrators Q#44346, answer score: 3
Revisions (0)
No revisions yet.