Revision as of 16:34, 26 January 2007
Machine Learning Reading Group
Welcome to the wiki of the machine learning reading group.
Mailing list
We maintain an announcement/discussion list for the reading group. You may sign up for the list here.
Schedule (Spring 2007)
Schedule (Fall 2006)
Our weekly meetings are Fridays, 3pm to 5pm, in CS 402.
Scheduled readings:
- 5 October 2006 (THURSDAY 4:30PM this week)
- Leader: Zafer
- Paper: Horst, R., Thoai, N. V. 1999. DC Programming: Overview. J. Optim. Theory Appl.
- (If you need a review: Hindi, H. 2004. A Tutorial on Convex Optimization. American Control Conference.)
- (Application for the interested: Argyriou A. et al. 2006. A DC-Programming Algorithm for Kernel Selection. ICML.)
- (Another application: Ellis, S. and Nayakkankuppam, V. Phylogenetic Analysis Via DC Programming.)
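For a sense of what the DC readings are about: minimizing f(x) = g(x) - h(x) with g and h convex is commonly attacked by repeatedly replacing h with its tangent at the current point and minimizing the resulting convex surrogate (the DCA/CCCP-style scheme surveyed in the overview paper). A toy one-dimensional sketch, with the made-up choice g(x) = x^4 and h(x) = x^2:

```python
# DCA-style iteration for f(x) = g(x) - h(x) with the invented choice
# g(x) = x**4 and h(x) = x**2 (both convex, so f is a DC function).
# Linearizing h at x_k gives the convex surrogate x**4 - 2*x_k*x.

def dca_step(x_k):
    # Exact minimizer of x**4 - 2*x_k*x: set 4*x**3 - 2*x_k = 0.
    return (x_k / 2.0) ** (1.0 / 3.0)

x = 1.0
for _ in range(100):
    x = dca_step(x)

# The iterates approach 1/sqrt(2), a stationary point of f(x) = x^4 - x^2.
print(x)
```

Each surrogate is an upper bound on f that touches it at the current iterate, so the objective never increases along the way.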
- 13 October 2006
- Leader: Miro
- Paper:
- 20 October 2006
- Leader: Florian
- Paper: N Meinshausen and P Bühlmann, High Dimensional Graphs and Variable Selection With the Lasso, Annals of Statistics 34(3), 1436-1462
- Background reading: D Heckerman, DM Chickering, C Meek, R Rounthwaite, C Kadie, Dependency Networks for Inference, Collaborative Filtering, and Data Visualization, JMLR, 1(Oct):49-75, 2000
- 27 October 2006
- Leader:
- Paper:
- 3 November 2006
- Leader:
- Paper:
- 10 November 2006
- "Leader": Jordan
- Paper: Abney, S. 2004. Understanding the Yarowsky Algorithm. Comput. Linguist. 30, 3 (Sep. 2004), 365-395.
- Background (the original paper): Yarowsky, D. 1995. Unsupervised word sense disambiguation rivaling supervised methods. In Proceedings of the 33rd Annual Meeting of the Association for Computational Linguistics (Cambridge, Massachusetts, June 26-30, 1995), 189-196.
- Background (a good overview of the problem area): Ide, N. and Véronis, J. 1998. Introduction to the special issue on word sense disambiguation: the state of the art. Comput. Linguist. 24, 1 (Mar. 1998), 2-40.
- 17 November 2006
- Boss: Berk Kapicioglu
- Paper: Weighted One-Against-All
- Nostalgia: Rifkin and Klautau. "In Defense of One-Vs-All Classification." Journal of Machine Learning Research, Volume 5, pp. 101-141, 2004.
- 1 December 2006
- Leader: Jonathan Chang
- Paper: Snow, Jurafsky, and Ng. "Semantic Taxonomy Induction from Heterogenous Evidence." Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the ACL, pages 801-808, July 2006.
- Some useful (perhaps) background on the technique: Snow, Jurafsky, and Ng. "Learning Syntactic Patterns for Automatic Hypernym Discovery."
- Background on what they're trying to do: Miller. "WordNet: A Lexical Database for English." Communications of the ACM, Volume 38, Issue 11 (November 1995), Pages 39-41.
- 15 December 2006
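The Yarowsky readings (10 November) concern self-training: seed a classifier with a few labeled examples, label the unlabeled data, keep only the confident decisions, and retrain. A minimal sketch of that loop; the "plant" word-sense data, the feature-vote scorer, and the confidence margin below are all invented for illustration and are a crude stand-in for Yarowsky's decision lists:

```python
from collections import defaultdict

def train(labeled):
    # Per-feature vote counts: a crude stand-in for a decision list.
    votes = defaultdict(lambda: defaultdict(int))
    for feats, label in labeled:
        for f in feats:
            votes[f][label] += 1
    return votes

def predict(votes, feats, margin):
    # Return a label only if it beats the runner-up by `margin` votes;
    # otherwise abstain (None), as in confidence-thresholded self-training.
    score = defaultdict(int)
    for f in feats:
        for label, count in votes[f].items():
            score[label] += count
    if not score:
        return None
    best = max(score, key=score.get)
    runner_up = max((v for l, v in score.items() if l != best), default=0)
    return best if score[best] - runner_up >= margin else None

# Invented toy contexts for the ambiguous word "plant".
labeled = [({"factory", "plant"}, "industrial"),
           ({"leaf", "plant"}, "botanical"),
           ({"factory", "worker"}, "industrial")]
unlabeled = [{"factory", "shift"}, {"leaf", "green"}, {"plant"}]

for _ in range(5):  # a few self-training rounds
    votes = train(labeled)
    newly, remaining = [], []
    for feats in unlabeled:
        label = predict(votes, feats, margin=1)
        if label is None:
            remaining.append(feats)
        else:
            newly.append((feats, label))
    if not newly:
        break  # no confident labels left; stop
    labeled += newly
    unlabeled = remaining
```

The genuinely ambiguous context {"plant"} is never labeled: its two senses stay tied, so the classifier abstains every round.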
Proposed Topics and Papers
Please add further topics, suggest papers for particular topics, etc. here.
- Lasso
- The original paper: Robert Tibshirani. Regression shrinkage and selection via the Lasso. J. R. Statist. Soc. B, 58(1):267-288, 1996.
- The Lasso Page
- Generalization properties:
- Sparse approximation:
- David Donoho. For most large underdetermined systems of linear equations, the minimal l1-norm near-solution approximates the sparsest near-solution, Tech. Report, August 2004.
- David Donoho. For most large underdetermined systems of linear equations, the minimal l1-norm solution is also the sparsest solution, Tech. Report, September 2004.
- Model selection: P. Bühlmann and B. Yu. Boosting, Model Selection, Lasso and Nonnegative Garrote. Tech. Report, 2005.
- Relatives:
- RODEO
- Least angle regression
- Hui Zou and Trevor Hastie. Regularization and Variable Selection via the Elastic Net. J. R. Statist. Soc. B, 2005 + Addendum
- Su-In Lee, Honglak Lee, Pieter Abbeel and Andrew Y. Ng. Efficient L1 regularized logistic regression. In Proceedings of the Twenty-First National Conference on Artificial Intelligence (AAAI-06), 2006.
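One concrete way to see the shrinkage-and-selection behavior these papers study: in the special case of an orthonormal design, the lasso solution is coordinatewise soft-thresholding of the ordinary least-squares coefficients (a standard fact from Tibshirani's paper). A short Python sketch of that special case:

```python
def soft_threshold(z, lam):
    """Soft-thresholding operator S(z, lam): shrink z toward zero by lam,
    and set it exactly to zero when |z| <= lam."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0

def lasso_orthonormal(beta_ols, lam):
    # For an orthonormal design, the lasso estimate is just S applied
    # coordinatewise to the OLS coefficients: small coefficients become
    # exactly zero, which is the variable-selection effect.
    return [soft_threshold(b, lam) for b in beta_ols]

print(lasso_orthonormal([3.0, -0.5, 1.2, 0.1], lam=1.0))
```

For general designs no closed form exists, but the same operator reappears inside coordinate-descent solvers for the lasso.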
- Optimization
- Language Applications
- Music
- Markov Decision Processes (MDPs) and Reinforcement Learning
- State-space abstraction/aggregation in MDPs
- The E^3 algorithm (Kearns and Singh)
- Inverse reinforcement learning
- Active Learning
- Deep Neural Networks
- Factor Graphs
Algorithms that must deal with complicated global functions of many variables often exploit the manner in which the given functions factor as a product of "local" functions, each of which depends on a subset of the variables.
- Tutorial: Factor Graphs and the Sum-Product Algorithm, F.R. Kschischang, B.J. Frey, H.-A. Loeliger, IEEE Transactions on Information Theory, Vol 47, No 2, Feb 2001.
- Application: Physical Network Models, C-H Yeang, T Ideker, T Jaakkola, Journal of Computational Biology, Vol 11, No 2-3, 2004
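A small illustration of the message-passing idea in the tutorial: on a chain x1 - x2 - x3, the marginal of x3 can be computed by summing out one variable at a time (passing "messages" along the chain) instead of enumerating the full joint. The factor values below are made up:

```python
# Sum-product on the chain x1 - x2 - x3 with invented binary factors.
f1  = {0: 0.4, 1: 0.6}
f12 = {(0, 0): 0.9, (0, 1): 0.1, (1, 0): 0.3, (1, 1): 0.7}
f23 = {(0, 0): 0.5, (0, 1): 0.5, (1, 0): 0.2, (1, 1): 0.8}

# Message from the x1 end of the chain to x2: sum x1 out locally.
msg = {x2: sum(f1[x1] * f12[(x1, x2)] for x1 in (0, 1)) for x2 in (0, 1)}

# Marginal of x3 via the message: work grows linearly with chain length.
marginal = {x3: sum(msg[x2] * f23[(x2, x3)] for x2 in (0, 1)) for x3 in (0, 1)}

# Brute force over the full joint for comparison: work is exponential
# in the number of variables, but the answer is identical.
brute = {x3: sum(f1[x1] * f12[(x1, x2)] * f23[(x2, x3)]
                 for x1 in (0, 1) for x2 in (0, 1)) for x3 in (0, 1)}
```

On trees the same local summations give exact marginals for every variable with two sweeps of messages; that is the sum-product algorithm of the tutorial.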
- Cross-validation
- Kohavi, R. 1995. "A study of cross-validation and bootstrap for accuracy estimation and model selection." Proceedings of the International Joint Conferences on Artificial Intelligence (IJCAI).
- "Do 10-fold CV instead of leave-one-out," at least when doing model selection
- Rivals, I., and L. Personnaz. 1999. "On cross-validation for model selection." Neural Computation 11 (4).
- "Do statistical tests instead of leave-one-out," at least when doing model selection
- Elisseeff, A., and M. Pontil. 2003. "Leave-one-out error and stability of learning algorithms with applications." Advances in Learning Theory: Methods, Models and Applications, NATO Science Series III: Computer and Systems Sciences, Vol. 190, J. Suykens et al. Eds.
- Derives sufficient conditions for the LOO error to approach the generalization error for fixed algorithms (including kernel methods). A more theoretical (and maybe less practical) paper.
- see also Evgeniou, T., M. Pontil, and A. Elisseeff. 2004. "Leave-one-out error, stability, and generalization of voting combinations of classifiers." Machine Learning 55:(1): 71-97. (I haven't read this but it's very related.)
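For reference, the fold construction that both papers take for granted: split the n examples into k disjoint folds and hold each fold out once as the test set (k = 10 gives 10-fold CV, k = n gives leave-one-out). A stdlib-only sketch; the helper name is hypothetical:

```python
def k_fold_indices(n, k):
    """Split indices 0..n-1 into k disjoint folds (round-robin) and
    return (train, test) index lists, one pair per fold. Each example
    is held out exactly once; k = n recovers leave-one-out."""
    folds = [list(range(i, n, k)) for i in range(k)]
    splits = []
    for i in range(k):
        held_out = set(folds[i])
        train = [j for j in range(n) if j not in held_out]
        splits.append((train, folds[i]))
    return splits

splits = k_fold_indices(20, 10)  # ten train/test splits of 20 examples
```

In practice the data should be shuffled (or stratified by class) before assigning folds, a point both papers' experiments depend on.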
Participants
(Participants, please add your name to the list below.)
Faculty
- David Blei
- Rob Schapire
PostDocs
- Edo Airoldi, LSI & CS
- Florian Markowetz, LSI
Students
- Zafer Barutcuoglu, CS
- Jordan Boyd-Graber, CS
- Joseph Calandrino, CS
- Jonathan Chang, EE
- Miroslav Dudik, CS
- Rebecca Fiebrink, CS
- Berk Kapicioglu, CS
- Umar Syed, CS