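Original: http://chentingpc.me/article/?id=616
Topic modeling is a rather fascinating thing. I had heard of it before but had not realized how important it is. Only after a pointer from Tang (唐总) did I take a serious second look. It can be called one of the foundations of text mining (a fairly advanced foundation?). The input to the problem is a set of documents, the output is a set of topics in a low-dimensional space, and the algorithms are unsupervised. The rough line of development is LSI -> pLSI -> LDA -> various LDA variants. pLSI and LDA are both generative models, and LDA's way of looking at text in particular is a remarkable idea. The idea behind LDA is simple, but the probabilistic derivations using EM, Gibbs sampling, and so on are not so simple to learn (I had not fully worked through that part when writing this post; Tang says topic modeling is something you learn in one month, or in two or three months. Really? I don't know how high a bar he had in mind when he said that).
I have been studying LDA closely for two or three days now, and tonight I also ran Mallet, so I have an intuitive feel for it. Below is a list of introductory material (all of these articles can be downloaded freely online, so attaching them here should not count as infringement):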
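To make the "generative model" view concrete, here is a minimal sketch of LDA's generative story only: draw a word distribution for each topic, draw a topic mixture for each document, then draw every word by first picking a topic. It does not perform inference (the EM/Gibbs part mentioned above). The function names, the toy vocabulary, and the hyperparameter values `alpha` and `beta` are all assumptions chosen for illustration, not something taken from the papers listed here.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw one Dirichlet sample by normalizing independent Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def generate_corpus(num_docs, doc_len, vocab, num_topics,
                    alpha=0.5, beta=0.5, seed=0):
    """Generate documents following the LDA generative story (hypothetical helper)."""
    rng = random.Random(seed)
    # Each topic is a probability distribution over the vocabulary.
    topics = [sample_dirichlet([beta] * len(vocab), rng)
              for _ in range(num_topics)]
    docs, mixtures = [], []
    for _ in range(num_docs):
        # Each document has its own mixture over topics.
        theta = sample_dirichlet([alpha] * num_topics, rng)
        words = []
        for _ in range(doc_len):
            # Pick a topic for this word position, then a word from that topic.
            z = rng.choices(range(num_topics), weights=theta)[0]
            words.append(rng.choices(vocab, weights=topics[z])[0])
        docs.append(words)
        mixtures.append(theta)
    return docs, mixtures, topics


if __name__ == "__main__":
    toy_vocab = ["gene", "dna", "ball", "team", "vote", "law"]  # assumed toy data
    docs, mixtures, _ = generate_corpus(3, 8, toy_vocab, num_topics=2)
    for words, theta in zip(docs, mixtures):
        print([round(p, 2) for p in theta], " ".join(words))
```

Fitting a topic model runs this story in reverse: given only the documents, it estimates the per-document mixtures and per-topic word distributions that most plausibly produced them.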
Survey
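- The Topic modeling page on David M. Blei's homepage, which has a lot of material (from tutorials to implementations)
- The Development of Topic Models in Natural Language Processing (自然语言处理中主题模型的发展)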
- Probabilistic Topic Models.pdf
- Introduction to Probabilistic Topic Models.pdf
Specific
- LSI : Latent semantic indexing a probabilistic analysis.pdf
- pLSI : Probabilistic Latent Semantic Indexing.pdf
- LDA : Latent Dirichlet Allocation.pdf
Video Lecture
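- A very good lecture by D. Blei. Because of my connection speed I could only view the slides, not the video, but it is without doubt a good lecture (LDA was proposed by D. Blei and colleagues in 2003).
- Another lecture by D. Blei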
Open Source
Derived (not recommended for newcomers)
- dynamic LDA : dynamic_topic_models.pdf
- The Author-Topic Model for Authors and Documents
- Correlated Topic Models
- Automatic Labeling of Multinomial Topic Models