Difference between revisions of "Relational Topic Models"
From CSWiki
Revision as of 18:36, 7 April 2008
Choosing the sparsity parameter
On the senate dataset, running spectral clustering for various values of K gives the following:
K | False positives | False negatives |
---|---|---|
5 | .606 | .058 |
10 | .354 | .078 |
15 | .126 | .078 |
20 | .193 | .094 |
25 | .157 | .107 |
30 | .135 | .114 |
Even with 30 topics, this would imply that we are failing to observe at least roughly 15% of the true links. Since spectral clustering is likely overfitting in this case, a reasonable compromise across all values of K might be 25%. However, since we would expect the true K to be small for this dataset, 50% might be a better estimate.
--Jcone 18:27, 7 April 2008 (EDT)
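
As a rough sanity check on the estimates above, the false-positive column of the table can be summarized numerically. This is only a sketch: it assumes the false-positive rate is the quantity being used as a proxy for unobserved links, and the `summarize` helper and the `max_k=10` cutoff for "small K" are hypothetical choices made for illustration, not something stated in the note.

```python
from statistics import mean, median

# Rates copied from the table above: K -> (false positives, false negatives)
rates = {
    5: (0.606, 0.058),
    10: (0.354, 0.078),
    15: (0.126, 0.078),
    20: (0.193, 0.094),
    25: (0.157, 0.107),
    30: (0.135, 0.114),
}


def summarize(rates, max_k=None):
    """Summarize false-positive rates, optionally restricted to K <= max_k."""
    fps = [fp for k, (fp, _fn) in sorted(rates.items())
           if max_k is None or k <= max_k]
    return {
        "min": min(fps),
        "mean": mean(fps),
        "median": median(fps),
        "max": max(fps),
    }


all_k = summarize(rates)                # every K in the table
small_k = summarize(rates, max_k=10)    # assumed cutoff for "small K"

print("all K:   ", {name: round(v, 3) for name, v in all_k.items()})
print("K <= 10: ", {name: round(v, 3) for name, v in small_k.items()})
```

Under these assumptions, the mean over all K comes out near 26% and the mean over K of 10 or less near 48%, which is in the same range as the 25% and 50% figures suggested above. Whether those figures were actually derived this way is not stated.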