This second edition textbook covers a coherently organized framework for text analytics, which integrates material drawn from the intersecting topics of information retrieval, machine learning, and natural language processing. Particular importance is placed on deep learning methods. The chapters of this book span three broad categories:1. Basic algorithms: Chapters 1 through 7 discuss the classical algorithms for text analytics such as preprocessing, similarity computation, topic modeling, matrix factorization, clustering, classification, regression, and ensemble analysis.
2. Domain-sensitive learning and information retrieval: Chapters 8 and 9 discuss learning models in heterogeneous settings such as a combination of text with multimedia or Web links. The problem of information retrieval and Web search is also discussed in the context of its relationship with ranking and machine learning methods.
3. Natural language processing: Chapters 10 through 16 discuss various sequence-centric and natural language applications, such as feature engineering, neural language models, deep learning, transformers, pre-trained language models, text summarization, information extraction, knowledge graphs, question answering, opinion mining, text segmentation, and event detection.
Compared to the first edition, this second edition textbook (which targets mostly advanced level students majoring in computer science and math) has substantially more material on deep learning and natural language processing. Significant focus is placed on topics like transformers, pre-trained language models, knowledge graphs, and question answering.