EXTRACTIVE SUMMARIZATION John Cadigan, David Ellison, and Ethan Roday

Approach
1. Preprocessing and data cleanup
2. Vectorization
3. K-means
4. Information ordering with the experts system
5. CLASSY-style content realization

System diagram:
• Inputs: raw input docset (XML), converted to <DOCSET_ID>/*.txt (plain-text representation); AQUAINT corpus; GloVe vectors
• Term weighting: compute idf scores for all terms in AQUAINT
• LDA: compute LDA topic models over AQUAINT
• Preprocessing: sentence splitting, tokenization, “junk” removal
• Term weighting: compute document-level tf-idf
• Sentence vectorization: compute tf-idf weighted average GloVe vectors; compute LDA topic weights
• Content selection: k-means clustering on sentence vectors
• Information ordering: four-expert panel
• Content realization: CLASSY-style scrubbing, untokenization
• Output: summaries

Preprocessing

• Simple regex substitutions to remove non-content
• Remove things like:
  • “ARVADA, Colo. (AP) –” at beginning of article
  • With photo.
  • By John T. McQuiston
  • QUESTIONS OR RERUNS: …
  • The late-night supervisor is…
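The junk examples above could be handled with a couple of regular expressions. This is a minimal sketch, not the authors' code: the patterns `DATELINE` and `JUNK_LINE` and the function `clean_article` are hypothetical names written only to cover the examples listed on the slide.

```python
import re

# Hypothetical patterns modeled on the slide's examples; the actual
# regexes used in the system are not shown in the slides.
# Dateline such as "ARVADA, Colo. (AP) -- " at the start of a line.
DATELINE = re.compile(
    r"^[A-Z][A-Z .'-]+(?:, [A-Z][A-Za-z.]+)? \([A-Z]+\) (?:--|[-\u2013\u2014])\s*"
)
# Whole lines that carry no content: photo notes, bylines, desk notes.
JUNK_LINE = re.compile(
    r"^(?:With photo\.?"
    r"|By [A-Z][\w.'-]*(?: [A-Z][\w.'-]*)*"
    r"|QUESTIONS OR RERUNS:.*"
    r"|The late-night supervisor is.*)$"
)


def clean_article(text: str) -> str:
    """Drop junk lines and strip a leading dateline from each kept line."""
    kept = []
    for line in text.splitlines():
        line = line.strip()
        if not line or JUNK_LINE.match(line):
            continue
        kept.append(DATELINE.sub("", line))
    return "\n".join(kept)
```

The byline pattern only fires when every word after "By" is capitalized, so an ordinary sentence such as "By Monday the town had changed." is left alone.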

Content Selection

Vectorization Changes
• GloVe vectors are now tf-idf weighted averages
  • Previously: unweighted averages
• tf-idf is computed over the entire AQUAINT corpus
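A tf-idf weighted average of word vectors might look like the sketch below. The slides do not give the tf or idf scheme (both are listed later as tunable), so raw term counts within the sentence, a smoothed log idf, and a fallback idf of 1.0 for unseen terms are all assumptions here. The names `idf_table` and `sentence_vector` are illustrative.

```python
import math
from collections import Counter


def idf_table(corpus_docs):
    """Smoothed idf over a background corpus given as a list of token lists."""
    n_docs = len(corpus_docs)
    df = Counter(tok for doc in corpus_docs for tok in set(doc))
    return {tok: math.log((1 + n_docs) / (1 + count)) + 1.0
            for tok, count in df.items()}


def sentence_vector(tokens, embeddings, idf, dim):
    """tf-idf weighted average of word vectors; out-of-vocabulary tokens are skipped."""
    tf = Counter(tokens)  # assumption: raw counts within the sentence
    total = [0.0] * dim
    weight_sum = 0.0
    for tok, count in tf.items():
        if tok not in embeddings:
            continue
        weight = count * idf.get(tok, 1.0)  # assumption: idf 1.0 if unseen
        weight_sum += weight
        for i, x in enumerate(embeddings[tok]):
            total[i] += weight * x
    if weight_sum == 0.0:
        return total  # all-zero vector when no token has an embedding
    return [x / weight_sum for x in total]
```

With idf weights of 3.0 for one word and 1.0 for another, the resulting vector sits three quarters of the way toward the rarer word, which is the intended effect of the weighting compared with an unweighted mean.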

K-means and information ordering
• K-means centroids were not ordered
• Tried:
  • Most similar to other centroids
  • Most contained sentences
• In this release, we used information ordering on top sentences with a cutoff totaling 100+ words
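One way to read the selection step is sketched below: given cluster labels and centroids from any k-means implementation, take the sentence nearest each centroid, visit clusters largest first (the "most contained sentences" heuristic), and stop once the running total reaches the word cutoff. This interpretation of the cutoff, and the name `select_candidates`, are assumptions; the selected sentences would then be handed to the information-ordering step.

```python
import math
from collections import Counter


def select_candidates(sentences, vectors, labels, centroids, min_words=100):
    """Return indices of one representative sentence per cluster,
    largest clusters first, until the total word count reaches min_words."""
    sizes = Counter(labels)
    picked = []
    total_words = 0
    for cluster, _size in sizes.most_common():
        members = [i for i, lab in enumerate(labels) if lab == cluster]
        # Representative = member closest to its cluster centroid.
        best = min(members,
                   key=lambda i: math.dist(vectors[i], centroids[cluster]))
        picked.append(best)
        total_words += len(sentences[best].split())
        if total_words >= min_words:
            break
    return picked
```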

Content Realization

• Implemented CLASSY-style cleanup heuristics (from 2006 paper):
  • Remove bylines, etc. (this is always done in preprocessing)
  • Remove adverbs, limited list of conjunctions at BOS
  • Remove ages (“Bill, 50, ate.” → “Bill ate.”)
  • Remove relative clause attributions (“Bill, who already ate, ate again” → “Bill ate again.”)
  • Remove attributions, as long as it isn’t a direct quotation (“Bill said he already ate.” → “He already ate.”)
• Untokenize sentences before presentation of summaries
  • did n’t → didn’t, …
  • untokenize punctuation
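Two of the trimming heuristics and the untokenization step can be approximated with regular expressions. This is a rough sketch under stated assumptions: the patterns `AGE` and `REL_CLAUSE` and the functions `trim` and `untokenize` are hypothetical, they operate on plain strings rather than parses, and attribution removal and adverb removal are left out because they need more than a regex to do safely.

```python
import re

# Hypothetical patterns; the real system's rules are not shown in the slides.
AGE = re.compile(r"(?<=\w), \d{1,3},(?= )")   # "Bill, 50, ate."
REL_CLAUSE = re.compile(r", who [^,]+,")      # "Bill, who already ate, ate again."
CLITIC = re.compile(r" (n't|'s|'re|'ve|'ll|'d|'m)\b")
SPACE_BEFORE_PUNCT = re.compile(r" ([,.;:!?])")


def trim(sentence: str) -> str:
    """Drop 'who ...' relative clauses and parenthetical ages."""
    sentence = REL_CLAUSE.sub("", sentence)
    return AGE.sub("", sentence)


def untokenize(text: str) -> str:
    """Undo Penn Treebank-style tokenization for display."""
    text = text.replace("-LRB- ", "(").replace(" -RRB-", ")")
    text = CLITIC.sub(r"\1", text)            # "did n't" -> "didn't"
    return SPACE_BEFORE_PUNCT.sub(r"\1", text)
```

Pattern-based trimming of this kind is fragile around quotations and comma placement, so any real use would need guards for quoted material.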

CLASSY 2006 Sentence Trimming configurations

Content Realization: some issues
• Possible quote mangling
• Oddly placed commas
• Too-aggressive adverb removal?
  • “Physically, it’s the same town it was Monday.”
  • “…the Guinean capital of Conakry was unexpectedly closed Monday…”
• District Attorney Robert Johnson plans to meet with the Diallos shortly before 2 p.m., when the grand jury indictments are scheduled to be unsealed in open court.
• Police officers have rarely been convicted for killings that occurred while they were on duty.
• How quickly did he fall?

Quantitative Results

D4: Evaltest
Metric    Precision  Recall  F-Score
ROUGE-1   0.260      0.239   0.248
ROUGE-2   0.059      0.055   0.057

D4: Devtest
Metric    Precision  Recall  F-Score
ROUGE-1   0.241      0.212   0.225
ROUGE-2   0.052      0.046   0.049
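For readers unfamiliar with the three columns, the sketch below shows how ROUGE-N precision, recall, and F-score relate for one candidate and one reference. It is a simplified illustration, not the official ROUGE toolkit: there is no stemming, no stopword handling, and no multi-reference averaging, so it will not reproduce the scores reported here. The function names are illustrative.

```python
from collections import Counter


def ngrams(tokens, n):
    """Multiset of n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(candidate, reference, n=1):
    """Return (precision, recall, f_score) from clipped n-gram overlap."""
    cand, ref = ngrams(candidate, n), ngrams(reference, n)
    overlap = sum((cand & ref).values())  # clipped counts
    precision = overlap / max(sum(cand.values()), 1)
    recall = overlap / max(sum(ref.values()), 1)
    f_score = (2 * precision * recall / (precision + recall)) if overlap else 0.0
    return precision, recall, f_score
```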

Game of Qualitative Results
Best, worst, and mediocre

MEDIOCRE
ROUGE-1: 0.16461
ROUGE-2: 0.04603

But for now, for the next several weeks, people seem able only to get through the worst of it, to handle the realization that some people are not coming back and that yes, things like this do happen here. Students returned to classes Thursday at Chatfield High School, but the bloodbath at rival Columbine High haunted the halls. Investigators, spending the day at the memorial service, were to resume their work this morning, conducting more interviews and eyeing the possibility of additional suspects in Tuesday's massacre. Team members decided they wanted to play out the rest of the season.

• Really long, non-specific first sentence
• Variation of themes

WORST (D1030)
ROUGE-1: 0.09921
ROUGE-2: 0.01210

the current regulations have created a quagmire of consumer confusion and set up potential health crises that even industry officials say could hurt producers as well as users of herbal products. ``The main thing you want is someone who knows enough to keep you out of trouble,'' said Dr. John B. Neeld Jr., president of the American Society of Anesthesiologists. While over-the-counter drugs are subject to Food and Drug Administration regulation, herbal supplements are assumed safe unless proved otherwise. If the products were safe, companies could say what they wished, so long as they did not claim their products could prevent, treat or cure disease.

• News-speak (not newspeak)
  • “quagmire”
  • “hurt producers”
• Not that bad
• It’s about FDA regulations

BEST
ROUGE-1: 0.41803
ROUGE-2: 0.17917

An Indonesian minister, Aburizal Bakrie, claimed last month the flow was a ``natural disaster'' unrelated to the drilling activities of a company, Lapindo Brantas Inc, which belongs to a group controlled by his family. President Susilo Bambang Yudhoyono has ordered Lapindo to pay 3.8 trillion rupiah -LRB- 420 million dollars -RRB- in compensation and costs related to the mud flow. A gas well near Surabaya in East Java has spewed steaming mud since May last year, submerging villages, factories and fields and forcing more than 15,000 people to flee their homes.

• All themes:
  • Money
  • Disaster
  • Government
• Could improve ordering

Discussion

The good:
• Content being selected is mostly relevant
• Topicality has improved over time
The bad:
• Lack of thematic cohesion seems to predominate
  • Possibly a drawback of k-means

Parameter tuning matters:
• tf scheme, idf scheme, GloVe weight, LDA weight, k

Worst and best devtest:

Devtest worst
Metric    Precision  Recall  F-Score
ROUGE-1   0.183      0.149   0.163
ROUGE-2   0.035      0.028   0.031

Devtest best
Metric    Precision  Recall  F-Score
ROUGE-1   0.241      0.212   0.225
ROUGE-2   0.052      0.046   0.049
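The five parameters named above could be explored with an exhaustive grid search, sketched here. The slides do not say how tuning was done or which values were tried, so the search space `GRID` is entirely hypothetical, and `evaluate` stands in for whatever runs the pipeline and returns a devtest score such as ROUGE-1 F.

```python
from itertools import product

# Hypothetical search space: the slides name the parameters, not their values.
GRID = {
    "tf_scheme": ["raw", "log"],
    "idf_scheme": ["plain", "smooth"],
    "glove_weight": [0.5, 1.0],
    "lda_weight": [0.0, 0.5],
    "k": [5, 10],
}


def grid_search(evaluate, grid=GRID):
    """Try every combination in grid; return (best_score, best_config).

    evaluate is a caller-supplied function mapping a config dict to a score.
    Ties keep the first configuration encountered.
    """
    names = list(grid)
    best = None
    for values in product(*(grid[name] for name in names)):
        config = dict(zip(names, values))
        score = evaluate(config)
        if best is None or score > best[0]:
            best = (score, config)
    return best
```

With two values per parameter this is 32 pipeline runs. The gap between the worst and best devtest rows above suggests why a sweep of this kind is worth the cost.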