snippetMinor
How can I restructure matrices to have non-zero elements close to the diagonal?
Viewed 0 times
canclosethehowelementsrestructurenondiagonalmatriceszero
Problem
I have a matrix $C \in \mathbb{N}^{n \times n}$. Semantically, it is a confusion matrix where the element $c_{ij}$ denotes how often members of class $i$ are predicted by a given classifier as members of class $j$.
The order of elements does not matter, but $c_{ii}$ has to be the correct predction of class $i$. So for any given matrix $C$ you can swap columns if you swap the same rows.
How can I order the classes $1, \dots, n$ so that the biggest elements are close to the diagonal?
I thought one might pose this as an optimization problem, e.g. minimize
$$\sum_{i = 1}^n \sum_{j=1}^n C_{ij} \cdot {|i-j|}$$
How could I minimize this?
Code and example
I've already visualized a $369 \times 369$ matrix without any optimization. The confuscation matrix as JSON file and the code are here.
Without modification, you get a score of
This looks as if there could be some improvement. A quick first thought was to just randomly swap rows. Letting this run for $10^4$ steps (~5 minutes) leads to a score of
Doing this, I realized that my score might also need some improvement. I instead of moving elements to the diagonal, it would be nice if big blocks within the matrix would only contain zeros.
The total number of possibilities to arrange the items in $C$ is equal to the number of permutations of a list of length $n$ and hence it is $n!$. Hence for $369$ classes it is already $369! \approx 10^{788}$ - too much to brute force.
See also
The order of elements does not matter, but $c_{ii}$ has to be the correct predction of class $i$. So for any given matrix $C$ you can swap columns if you swap the same rows.
How can I order the classes $1, \dots, n$ so that the biggest elements are close to the diagonal?
I thought one might pose this as an optimization problem, e.g. minimize
$$\sum_{i = 1}^n \sum_{j=1}^n C_{ij} \cdot {|i-j|}$$
How could I minimize this?
Code and example
I've already visualized a $369 \times 369$ matrix without any optimization. The confuscation matrix as JSON file and the code are here.
Without modification, you get a score of
303535 for:This looks as if there could be some improvement. A quick first thought was to just randomly swap rows. Letting this run for $10^4$ steps (~5 minutes) leads to a score of
82552 and a visualization which looks a bit cleaner:Doing this, I realized that my score might also need some improvement. I instead of moving elements to the diagonal, it would be nice if big blocks within the matrix would only contain zeros.
The total number of possibilities to arrange the items in $C$ is equal to the number of permutations of a list of length $n$ and hence it is $n!$. Hence for $369$ classes it is already $369! \approx 10^{788}$ - too much to brute force.
See also
- My similar question on datascience.SE
Solution
The random swapping approach (simulated annealing with extremely low temperature) yields to a score of
which corresponds to the symbol classes
```
['\blacktriangleright', '\nvDash', '\AE', '7', '\guillemotleft', '\perp',
'\bot', '\therefore', '\boxtimes', '\vdots', '\Leftarrow',
'\ddots', '\because', '\iddots',
'\multimap', '\L', '+', '\nearrow', '\boxplus', 'F', 'E', 'Q', '\checked', '\checkmark', '\rceil', 'h', '\uparrow',
'\div', '\doteq', '\longmapsto', '\mapsto', '\pitchfork', '\boxdot',
'\varsubsetneq', '\subsetneq', '\nsubseteq', '\nRightarrow', '\gtrless', '\triangleq', '\parr', '\Sigma', '\sum', 'k', '\sharp', '\#', '\sun', '\ast', '\star', '\not\equiv', '\neq', '\rightleftarrows',
'\rightleftharpoons', '\rightrightarrows', '\Longrightarrow', '\Rightarrow', '\pm',
'\dots', '\dotsc', '\aa', ']', '\ohm', '\Omega',
'\exists', '\ni', '3', '\}', '\pounds', '\mathscr{L}', '\mathcal{L}',
'\twoheadrightarrow', '\shortrightarrow', '\rightarrow', '\longrightarrow',
'\nrightarrow', '\rightharpoonup', '\hookrightarrow', '\cong', '\equiv',
'\Xi', '\diamondsuit', '\lozenge', '\diamond', 'V', '\vee', 'v', '2', '\sphericalangle', '\simeq', '\approx', '\gtrsim', '\nu', '\forall', '\triangleright',
'D', '\mathcal{D}', '\mathscr{D}',
'\barwedge', '\sqrt{}', '\iota', 'L', '\lfloor', '\{', '\coprod', '\amalg', '\sqcup', '\mp', '\between', '\mathds{E}', '\mathcal{F}', '\mathscr{F}', 'J', 'f', '\fint', '\S', '\mathsection', '\Im', '\nexists', '\mathfrak{X}', '\hbar', '\dag', '\nmid', '\vdash', '\notin', '\lightning', '\xi', '\zeta', '\mathcal{E}', '\varepsilon', '\epsilon', '\in', '\mathscr{E}', 'n', 'e', '\varrho', '\eta', '\mathscr{S}', '\int', '\square', '\sqcap', '\prod', '\Pi', '\pi', '\models', '\vDash', 'P', '\mathcal{P}', '\rho', '[', '\lceil', '\llbracket', '\Gamma', 'r', 'c', 'C', '\subset',
'\mathcal{C}', '\Lambda', '\wedge', '\mathds{C}', '\varpropto', '\infty', '\propto', '\alpha', '\ae', 'p', '\mathds{P}', '\gamma', '\mathscr{P}',
'\wp', '\mathscr{C}', '\varphi', '\triangledown', '\nabla', '\diameter',
'\o', '\varnothing', '\emptyset', '\O', '\phi', '\Phi', '\tau',
'\mathcal{T}', '\top', 'T', '\oint', '\celsius', '\%', '\rrbracket', '\frown', '\cap', '\curvearrowright', '\clubsuit', '\psi', '\Psi', '\mathbb{1}', '\mathds{1}', '1', '\trianglelefteq', '\ell', '\lambda',
'\kappa', '\varkappa', '\mathcal{X}', '\chi', '\vartriangle', '\Delta',
'\triangle', 'x', '\times', 'X', '\aleph', '\mathscr{H}', '\mathcal{H}',
'\rtimes', 'H', '\$', '\vartheta', '\astrosun', '\odot',
'\|', '\parallel', '\angle', '/', '\prime', '\circledcirc', '8', '\leftmoon',
'\ltimes', '\prec', '\preceq', '\sqsubseteq', '\subseteq', '\female',
'\venus', '\omega', 'j', '\setminus', '\backslash', '\Bowtie', '\bowtie',
'a', '6', '\delta', '\partial', 'd', '\supseteq', '\succ', '\succeq',
'\geq', '\geqslant', '\searrow', '\leq', '\leqslant', '\lesssim',
'\preccurlyeq', '\circledR', '\sigma', '\mars', '\male', '\mathbb{H}',
'\mathfrak{A}',
'\mathcal{N}', 'N', 'b', '\flat', '\mathds{N}', '\mathbb{N}', '\mu', '\mathcal{M}', 'M',
'\circledast', '\otimes',
'\oplus', '\ss', '\beta', '\mathcal{B}', 'B', '\mathfrak{S
64496 (20 minutes or so with Python and seed 0, ~60s with C++ and playing with seeds by a friend -.-). The permutation is[213, 201, 367, 34, 368, 174, 249, 193, 159, 275, 225, 276, 194, 300, 191, 362, 113, 230, 158, 5, 4, 16, 352, 126, 265, 49, 224, 139, 187, 221, 228, 192, 156, 205, 204, 203, 241, 208, 214, 166, 67, 40, 52, 283, 124, 354, 133, 152, 173, 206, 235, 231, 237, 223, 217, 138, 118, 277, 361, 269, 344, 98, 258, 251, 30, 119, 122, 339, 309, 240, 245, 26, 226, 242, 232, 218, 110, 172, 86, 282, 297, 137, 21, 146, 62, 29, 293, 189, 171, 210, 84, 250, 136, 3, 304, 335, 154, 292, 78, 11, 266, 116, 164, 129, 148, 144, 195, 327, 306, 337, 9, 47, 168, 120, 128, 259, 261, 323, 254, 121, 200, 183, 256, 246, 85, 72, 305, 77, 76, 255, 336, 55, 46, 89, 73, 341, 100, 294, 145, 163, 87, 37, 185, 199, 15, 313, 88, 268, 264, 273, 69, 59, 44, 2, 106, 303, 82, 149, 326, 197, 279, 111, 38, 366, 57, 329, 68, 340, 257, 334, 93, 295, 286, 353, 365, 298, 285, 364, 91, 92, 90, 316, 252, 19, 165, 342, 125, 274, 176, 143, 239, 288, 95, 96, 324, 325, 28, 212, 253, 81, 79, 80, 318, 94, 299, 71, 291, 64, 132, 23, 278, 338, 308, 160, 7, 115, 247, 347, 147, 271, 188, 281, 272, 280, 155, 35, 349, 157, 177, 180, 202, 108, 350, 345, 97, 51, 141, 284, 355, 179, 42, 33, 70, 99, 45, 109, 178, 181, 103, 207, 220, 102, 211, 209, 196, 127, 41, 346, 351, 359, 320, 311, 13, 43, 287, 328, 357, 83, 310, 12, 161, 135, 131, 360, 39, 302, 1, 322, 123, 167, 6, 117, 31, 330, 356, 74, 75, 151, 104, 262, 289, 296, 140, 101, 236, 312, 343, 150, 348, 56, 234, 27, 14, 114, 331, 260, 314, 17, 54, 321, 105, 263, 130, 20, 186, 190, 63, 22, 162, 134, 363, 333, 301, 0, 169, 170, 175, 10, 112, 317, 61, 50, 248, 18, 315, 60, 32, 222, 244, 290, 48, 36, 307, 184, 58, 233, 238, 229, 227, 219, 153, 66, 319, 332, 358, 25, 53, 270, 182, 8, 216, 65, 24, 267, 107, 215, 142, 198, 243]which corresponds to the symbol classes
```
['\blacktriangleright', '\nvDash', '\AE', '7', '\guillemotleft', '\perp',
'\bot', '\therefore', '\boxtimes', '\vdots', '\Leftarrow',
'\ddots', '\because', '\iddots',
'\multimap', '\L', '+', '\nearrow', '\boxplus', 'F', 'E', 'Q', '\checked', '\checkmark', '\rceil', 'h', '\uparrow',
'\div', '\doteq', '\longmapsto', '\mapsto', '\pitchfork', '\boxdot',
'\varsubsetneq', '\subsetneq', '\nsubseteq', '\nRightarrow', '\gtrless', '\triangleq', '\parr', '\Sigma', '\sum', 'k', '\sharp', '\#', '\sun', '\ast', '\star', '\not\equiv', '\neq', '\rightleftarrows',
'\rightleftharpoons', '\rightrightarrows', '\Longrightarrow', '\Rightarrow', '\pm',
'\dots', '\dotsc', '\aa', ']', '\ohm', '\Omega',
'\exists', '\ni', '3', '\}', '\pounds', '\mathscr{L}', '\mathcal{L}',
'\twoheadrightarrow', '\shortrightarrow', '\rightarrow', '\longrightarrow',
'\nrightarrow', '\rightharpoonup', '\hookrightarrow', '\cong', '\equiv',
'\Xi', '\diamondsuit', '\lozenge', '\diamond', 'V', '\vee', 'v', '2', '\sphericalangle', '\simeq', '\approx', '\gtrsim', '\nu', '\forall', '\triangleright',
'D', '\mathcal{D}', '\mathscr{D}',
'\barwedge', '\sqrt{}', '\iota', 'L', '\lfloor', '\{', '\coprod', '\amalg', '\sqcup', '\mp', '\between', '\mathds{E}', '\mathcal{F}', '\mathscr{F}', 'J', 'f', '\fint', '\S', '\mathsection', '\Im', '\nexists', '\mathfrak{X}', '\hbar', '\dag', '\nmid', '\vdash', '\notin', '\lightning', '\xi', '\zeta', '\mathcal{E}', '\varepsilon', '\epsilon', '\in', '\mathscr{E}', 'n', 'e', '\varrho', '\eta', '\mathscr{S}', '\int', '\square', '\sqcap', '\prod', '\Pi', '\pi', '\models', '\vDash', 'P', '\mathcal{P}', '\rho', '[', '\lceil', '\llbracket', '\Gamma', 'r', 'c', 'C', '\subset',
'\mathcal{C}', '\Lambda', '\wedge', '\mathds{C}', '\varpropto', '\infty', '\propto', '\alpha', '\ae', 'p', '\mathds{P}', '\gamma', '\mathscr{P}',
'\wp', '\mathscr{C}', '\varphi', '\triangledown', '\nabla', '\diameter',
'\o', '\varnothing', '\emptyset', '\O', '\phi', '\Phi', '\tau',
'\mathcal{T}', '\top', 'T', '\oint', '\celsius', '\%', '\rrbracket', '\frown', '\cap', '\curvearrowright', '\clubsuit', '\psi', '\Psi', '\mathbb{1}', '\mathds{1}', '1', '\trianglelefteq', '\ell', '\lambda',
'\kappa', '\varkappa', '\mathcal{X}', '\chi', '\vartriangle', '\Delta',
'\triangle', 'x', '\times', 'X', '\aleph', '\mathscr{H}', '\mathcal{H}',
'\rtimes', 'H', '\$', '\vartheta', '\astrosun', '\odot',
'\|', '\parallel', '\angle', '/', '\prime', '\circledcirc', '8', '\leftmoon',
'\ltimes', '\prec', '\preceq', '\sqsubseteq', '\subseteq', '\female',
'\venus', '\omega', 'j', '\setminus', '\backslash', '\Bowtie', '\bowtie',
'a', '6', '\delta', '\partial', 'd', '\supseteq', '\succ', '\succeq',
'\geq', '\geqslant', '\searrow', '\leq', '\leqslant', '\lesssim',
'\preccurlyeq', '\circledR', '\sigma', '\mars', '\male', '\mathbb{H}',
'\mathfrak{A}',
'\mathcal{N}', 'N', 'b', '\flat', '\mathds{N}', '\mathbb{N}', '\mu', '\mathcal{M}', 'M',
'\circledast', '\otimes',
'\oplus', '\ss', '\beta', '\mathcal{B}', 'B', '\mathfrak{S
Code Snippets
[213, 201, 367, 34, 368, 174, 249, 193, 159, 275, 225, 276, 194, 300, 191, 362, 113, 230, 158, 5, 4, 16, 352, 126, 265, 49, 224, 139, 187, 221, 228, 192, 156, 205, 204, 203, 241, 208, 214, 166, 67, 40, 52, 283, 124, 354, 133, 152, 173, 206, 235, 231, 237, 223, 217, 138, 118, 277, 361, 269, 344, 98, 258, 251, 30, 119, 122, 339, 309, 240, 245, 26, 226, 242, 232, 218, 110, 172, 86, 282, 297, 137, 21, 146, 62, 29, 293, 189, 171, 210, 84, 250, 136, 3, 304, 335, 154, 292, 78, 11, 266, 116, 164, 129, 148, 144, 195, 327, 306, 337, 9, 47, 168, 120, 128, 259, 261, 323, 254, 121, 200, 183, 256, 246, 85, 72, 305, 77, 76, 255, 336, 55, 46, 89, 73, 341, 100, 294, 145, 163, 87, 37, 185, 199, 15, 313, 88, 268, 264, 273, 69, 59, 44, 2, 106, 303, 82, 149, 326, 197, 279, 111, 38, 366, 57, 329, 68, 340, 257, 334, 93, 295, 286, 353, 365, 298, 285, 364, 91, 92, 90, 316, 252, 19, 165, 342, 125, 274, 176, 143, 239, 288, 95, 96, 324, 325, 28, 212, 253, 81, 79, 80, 318, 94, 299, 71, 291, 64, 132, 23, 278, 338, 308, 160, 7, 115, 247, 347, 147, 271, 188, 281, 272, 280, 155, 35, 349, 157, 177, 180, 202, 108, 350, 345, 97, 51, 141, 284, 355, 179, 42, 33, 70, 99, 45, 109, 178, 181, 103, 207, 220, 102, 211, 209, 196, 127, 41, 346, 351, 359, 320, 311, 13, 43, 287, 328, 357, 83, 310, 12, 161, 135, 131, 360, 39, 302, 1, 322, 123, 167, 6, 117, 31, 330, 356, 74, 75, 151, 104, 262, 289, 296, 140, 101, 236, 312, 343, 150, 348, 56, 234, 27, 14, 114, 331, 260, 314, 17, 54, 321, 105, 263, 130, 20, 186, 190, 63, 22, 162, 134, 363, 333, 301, 0, 169, 170, 175, 10, 112, 317, 61, 50, 248, 18, 315, 60, 32, 222, 244, 290, 48, 36, 307, 184, 58, 233, 238, 229, 227, 219, 153, 66, 319, 332, 358, 25, 53, 270, 182, 8, 216, 65, 24, 267, 107, 215, 142, 198, 243]['\blacktriangleright', '\nvDash', '\AE', '7', '\guillemotleft', '\perp',
'\bot', '\therefore', '\boxtimes', '\vdots', '\Leftarrow',
'\ddots', '\because', '\iddots',
'\multimap', '\L', '+', '\nearrow', '\boxplus', 'F', 'E', 'Q', '\checked', '\checkmark', '\rceil', 'h', '\uparrow',
'\div', '\doteq', '\longmapsto', '\mapsto', '\pitchfork', '\boxdot',
'\varsubsetneq', '\subsetneq', '\nsubseteq', '\nRightarrow', '\gtrless', '\triangleq', '\parr', '\Sigma', '\sum', 'k', '\sharp', '\#', '\sun', '\ast', '\star', '\not\equiv', '\neq', '\rightleftarrows',
'\rightleftharpoons', '\rightrightarrows', '\Longrightarrow', '\Rightarrow', '\pm',
'\dots', '\dotsc', '\aa', ']', '\ohm', '\Omega',
'\exists', '\ni', '3', '\}', '\pounds', '\mathscr{L}', '\mathcal{L}',
'\twoheadrightarrow', '\shortrightarrow', '\rightarrow', '\longrightarrow',
'\nrightarrow', '\rightharpoonup', '\hookrightarrow', '\cong', '\equiv',
'\Xi', '\diamondsuit', '\lozenge', '\diamond', 'V', '\vee', 'v', '2', '\sphericalangle', '\simeq', '\approx', '\gtrsim', '\nu', '\forall', '\triangleright',
'D', '\mathcal{D}', '\mathscr{D}',
'\barwedge', '\sqrt{}', '\iota', 'L', '\lfloor', '\{', '\coprod', '\amalg', '\sqcup', '\mp', '\between', '\mathds{E}', '\mathcal{F}', '\mathscr{F}', 'J', 'f', '\fint', '\S', '\mathsection', '\Im', '\nexists', '\mathfrak{X}', '\hbar', '\dag', '\nmid', '\vdash', '\notin', '\lightning', '\xi', '\zeta', '\mathcal{E}', '\varepsilon', '\epsilon', '\in', '\mathscr{E}', 'n', 'e', '\varrho', '\eta', '\mathscr{S}', '\int', '\square', '\sqcap', '\prod', '\Pi', '\pi', '\models', '\vDash', 'P', '\mathcal{P}', '\rho', '[', '\lceil', '\llbracket', '\Gamma', 'r', 'c', 'C', '\subset',
'\mathcal{C}', '\Lambda', '\wedge', '\mathds{C}', '\varpropto', '\infty', '\propto', '\alpha', '\ae', 'p', '\mathds{P}', '\gamma', '\mathscr{P}',
'\wp', '\mathscr{C}', '\varphi', '\triangledown', '\nabla', '\diameter',
'\o', '\varnothing', '\emptyset', '\O', '\phi', '\Phi', '\tau',
'\mathcal{T}', '\top', 'T', '\oint', '\celsius', '\%', '\rrbracket', '\frown', '\cap', '\curvearrowright', '\clubsuit', '\psi', '\Psi', '\mathbb{1}', '\mathds{1}', '1', '\trianglelefteq', '\ell', '\lambda',
'\kappa', '\varkappa', '\mathcal{X}', '\chi', '\vartriangle', '\Delta',
'\triangle', 'x', '\times', 'X', '\aleph', '\mathscr{H}', '\mathcal{H}',
'\rtimes', 'H', '\$', '\vartheta', '\astrosun', '\odot',
'\|', '\parallel', '\angle', '/', '\prime', '\circledcirc', '8', '\leftmoon',
'\ltimes', '\prec', '\preceq', '\sqsubseteq', '\subseteq', '\female',
'\venus', '\omega', 'j', '\setminus', '\backslash', '\Bowtie', '\bowtie',
'a', '6', '\delta', '\partial', 'd', '\supseteq', '\succ', '\succeq',
'\geq', '\geqslant', '\searrow', '\leq', '\leqslant', '\lesssim',
'\preccurlyeq', '\circledR', '\sigma', '\mars', '\male', '\mathbb{H}',
'\mathfrak{A}',
'\mathcal{N}', 'N', 'b', '\flat', '\mathds{N}', '\mathbb{N}', '\mu', '\mathcal{M}', 'M',
'\circledast', '\otimes',
'\oplus', '\ss', '\beta', '\mathcal{B}', 'B', '\mathfrak{S}', '\&',
'\with',Context
StackExchange Computer Science Q#70627, answer score: 2
Revisions (0)
No revisions yet.