▶Book Description
Natural Language Processing (NLP) allows you to take any sentence and identify patterns, special names, company names, and more. The second edition of Natural Language Processing with Java teaches you how to perform language analysis with the help of Java libraries, while constantly gaining insights from the outcomes.
You’ll start by understanding how NLP and its various concepts work. Having got to grips with the basics, you’ll explore important tools and libraries in Java for NLP, such as CoreNLP, OpenNLP, Neuroph, and Mallet. You’ll then start performing NLP on different inputs and tasks, such as tokenization, model training, parts-of-speech tagging, and parse trees. You’ll learn about statistical machine translation, summarization, dialog systems, complex searches, supervised and unsupervised NLP, and more.
By the end of this book, you’ll have learned more about NLP, neural networks, and various other trained models in Java for enhancing the performance of NLP applications.
▶What You Will Learn
⦁ Understand basic NLP tasks and how they relate to one another
⦁ Discover and use the available tokenization engines
⦁ Apply search techniques to find people, as well as things, within a document
⦁ Construct solutions to identify parts of speech within sentences
⦁ Use parsers to extract relationships between elements of a document
⦁ Identify topics in a set of documents
⦁ Explore topic modeling from a document
▶Key Features
⦁ Use deep learning and NLP techniques in Java to discover hidden insights in text
⦁ Work with popular Java libraries such as CoreNLP, OpenNLP, and Mallet
⦁ Explore machine translation, identifying parts of speech, and topic modeling
▶Who This Book Is For
Natural Language Processing with Java is for you if you are a data analyst, data scientist, or machine learning engineer who wants to extract information from a language using Java. Knowledge of Java programming is needed, while a basic understanding of statistics will be useful but not mandatory.
▶What This Book Covers
⦁ Chapter 1, Introduction to NLP, explains the importance and uses of NLP. The NLP techniques used in this chapter are explained with simple examples illustrating their use.
⦁ Chapter 2, Finding Parts of Text, focuses primarily on tokenization. This is the first step in more advanced NLP tasks. Both core Java and Java NLP tokenization APIs are illustrated.
⦁ Chapter 3, Finding Sentences, shows that sentence boundary disambiguation is an important NLP task. This step is a precursor for many other downstream NLP tasks in which text elements should not be split across sentence boundaries. This includes ensuring that all phrases are in one sentence and supporting parts-of-speech analysis.
⦁ Chapter 4, Finding People and Things, covers what is commonly referred to as Named Entity Recognition (NER). This task is concerned with identifying people, places, and similar entities in text. This technique is a preliminary step for processing queries and searches.
⦁ Chapter 5, Detecting Parts of Speech, shows you how to detect parts of speech, which are grammatical elements of text, such as nouns and verbs. Identifying these elements is a significant step in determining the meaning of text and detecting relationships within text.
⦁ Chapter 6, Representing Text with Features, explains how text is represented using n-grams and outlines the role they play in revealing context.
⦁ Chapter 7, Information Retrieval, deals with processing the huge amount of data uncovered in information retrieval and finding the relevant information using various approaches, such as Boolean retrieval, dictionaries, and tolerant retrieval.
⦁ Chapter 8, Classifying Texts and Documents, shows that classifying text is useful for tasks such as spam detection and sentiment analysis. The NLP techniques that support this process are investigated and illustrated.
⦁ Chapter 9, Topic Modeling, discusses the basics of topic modeling using a document that contains some text.
⦁ Chapter 10, Using Parsers to Extract Relationships, demonstrates parse trees. A parse tree is used for many purposes, including information extraction, because it holds information about the relationships between the elements of a text. An example implementing a simple query is presented to illustrate this process.
⦁ Chapter 11, Combined Pipeline, addresses several issues surrounding the use of combinations of techniques that solve NLP problems.
⦁ Chapter 12, Creating a ChatBot, looks at different types of chatbots and develops a simple appointment-booking chatbot.
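To give a flavor of the tokenization and n-gram topics listed for Chapters 2 and 6, here is a minimal sketch in core Java. It is not taken from the book: the class name NGramSketch, the helper methods tokenize and ngrams, and the sample sentence are all hypothetical, and the regex-based splitting is a deliberately naive stand-in for the tokenizers that libraries such as OpenNLP or CoreNLP provide.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration only; not code from the book.
public class NGramSketch {

    // Naive tokenizer: lowercases the text and splits on any run of
    // characters that are neither letters nor digits.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String piece : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!piece.isEmpty()) { // split can yield a leading empty string
                tokens.add(piece);
            }
        }
        return tokens;
    }

    // Builds word n-grams by sliding a window of n tokens over the list.
    // Returns an empty list when there are fewer than n tokens.
    static List<String> ngrams(List<String> tokens, int n) {
        List<String> result = new ArrayList<>();
        for (int i = 0; i + n <= tokens.size(); i++) {
            result.add(String.join(" ", tokens.subList(i, i + n)));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> tokens = tokenize("He lives in New York.");
        System.out.println(tokens);            // the individual word tokens
        System.out.println(ngrams(tokens, 2)); // the bigrams over those tokens
    }
}
```

Bigrams such as "new york" keep together words that a single-token view would separate, which is one way n-grams help reveal context.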