Natural Language Processing
Welcome! This course is designed to introduce you to some of the problems and solutions of NLP, and their relation to linguistics and statistics. You need to know how to program (e.g., 600.120) and use common data structures (600.226). Some prior familiarity with automata (600.271) and probability (550.310) is also helpful. By the end you should agree (I hope!) that language is subtle and interesting, feel some ownership over some of NLP's formal and statistical techniques, and be able to understand research papers in the field.
Course catalog entry: This course is an in-depth overview of techniques for processing human language. How should linguistic structure and meaning be represented? What algorithms can recover them from text? And crucially, how can we build statistical models to choose among the many legal answers? The course covers methods for trees (parsing and semantic interpretation), sequences (finite-state transduction such as morphology), and words (sense and phrase induction), with applications to practical engineering tasks such as information retrieval and extraction, text classification, part-of-speech tagging, speech recognition and machine translation. There are a number of structured but challenging programming assignments. Prerequisite: 600.226. [Eisner, Applications, Fall] 3 credits
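To give a concrete taste of the statistical side of the course, here is a minimal sketch of a bigram language model with add-one smoothing, in the spirit of the "Language models" and "Smoothing n-grams" lectures scheduled below. The toy corpus and the function name `p` are invented for illustration only; the actual assignments have their own data and specifications.

```python
# Illustrative sketch (not course-provided code): a bigram language model
# with add-one (Laplace) smoothing over a tiny invented corpus.
from collections import Counter

corpus = [["the", "cat", "sat"], ["the", "cat", "ran"], ["a", "dog", "sat"]]

bigrams = Counter()   # counts of (previous word, current word) pairs
contexts = Counter()  # counts of each word appearing as a context
vocab = set()
for sent in corpus:
    tokens = ["<s>"] + sent + ["</s>"]  # sentence-boundary markers
    vocab.update(tokens)
    for prev, cur in zip(tokens, tokens[1:]):
        bigrams[(prev, cur)] += 1
        contexts[prev] += 1

def p(cur, prev):
    """Add-one smoothed estimate of P(cur | prev)."""
    return (bigrams[(prev, cur)] + 1) / (contexts[prev] + len(vocab))

print(p("cat", "the"))  # relatively high: "the cat" was seen twice
print(p("dog", "the"))  # low but nonzero, thanks to smoothing
```

Add-one smoothing is the crudest method discussed in class; its only virtue here is that unseen bigrams get nonzero probability rather than zeroing out an entire sentence.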
| | |
| --- | --- |
| Lectures: | MTW 2-3 pm, Shaffer 301 |
| Prof: | Jason Eisner |
| TA: | Omar Zaidan ( ozaidan at cs dot jhu dot edu ) |
| CA: | Asheesh Laroia ( asheesh at asheesh dot org ); Constantinos Michael ( stvorph at gmail dot com ) |
| Office hrs: | For Prof: MT 3-4pm, or by appt, in NEB 324A. For TA: T 4-5:30 and W 12:15-1:45, in the undergrad lab (NEB 225) |
| Mailing list: | |
| Web page: | http://cs.jhu.edu/~jason/465 |
| Textbook: | Jurafsky & Martin (required; the online partial draft of the next edition wants your comments!). Manning & Schütze (recommended; the online PDF version is accessible for free from within JHU) |
| Policies: | Grading: homework 45%, participation 10%, midterm 15%, final 30%. Submission: via this web form. Lateness: floating late days policy. Cheating: here's what it means. Intellectual engagement: much encouraged. Announcements: read the mailing list and this page! |
| Related course sites: | |
Warning: For future lectures and assignments, the links below take you to last year's versions, which are subject to change.
| Week | Monday | Tuesday | Wednesday | Readings |
| --- | --- | --- | --- | --- |
| 9/11 | Introduction (ppt) | Assignment 1 given: Designing CFGs. Chomsky hierarchy (ppt) | Language models (ppt) | J&M chapters 1, 13, 6.2; for assignment, J&M 9 (or M&S 3) |
| 9/18 | Probability concepts (ppt); Bayes' Theorem (ppt) | Smoothing n-grams (ppt) | Postponed this material till 9/27: Human sentence processing (ppt) | M&S chapters 2, 6 |
| 9/25 | Assignment 2 given: Using n-Grams. Limitations of CFG | Improving CFG with features (ppt) | Skipped this material since we were behind: Extending CFG (summary (ppt)) | J&M 11.1-11.4 |
| 10/2 | No class (Yom Kippur); but could have a Q&A / homework help session with the TA | Context-free parsing (ppt) | Context-free parsing, continued | J&M 10 |
| 10/9 | Earley's algorithm (ppt) | Probabilistic parsing (ppt) | Parsing tricks (ppt). Assignment 2 due on Friday | J&M 12 (or M&S 11.1-11.3 and 12.1.1-12.1.5) |
| 10/16 | No class (fall break) | Semantics (ppt) | Semantics, continued | J&M 14-15; also this web page, up to but not including the "denotational semantics" section |
| 10/23 | Midterm exam | Assignment 3 given: Parsing and Semantics. Finite-state functions (ppt) | Finite-state implementation (ppt) | chap. 2 of xfst book draft (only accessible from CS research and undergrad networks; don't distribute) |
| 10/30 | Programming with Regexps (ppt) | Noisy Channels and FSTs (ppt) | Morphology and Phonology (ppt) | chap. 3 of xfst book draft; perhaps also this paper |
| 11/6 | Finite-state parsing | Finite-state tagging (ppt) | HMMs | J&M 8 or M&S 10 |
| 11/13 | Assignment 3 due. Assignment 4 given: Finite-State Grammars. Forward-backward algorithm (Excel spreadsheet; Viterbi version; lesson plan) | Forward-backward, continued | Expectation Maximization (ppt) | Allen pp. 195-208 (handout); M&S 11 |
| 11/20 | Grouping words (ppt; Excel spreadsheet) | More on learning (ppt). Assignment 4 due | Assignment 5 given: Training an HMM. No class (Thanksgiving coming) | M&S 14 |
| 11/27 | Splitting words (ppt) | Words vs. senses in IR (ppt) | Final FSM Examples (ppt) | M&S 7, 5, 15.2, 15.4 (since J&M 16-17 covers only some of this) |
| 12/4 | Machine Translation (ppt) | Text categorization (ppt) | Maximum entropy (ppt) | M&S 13, 16 |
| 12/11 | Assignment 5 due. Current and Future Research (ppt) | Sun 12/17 is the absolute deadline for late assignments | Final exam: Thu 12/21, 9am-noon | |