Usenet Training, 5-7 May 2004, Disney’s Coronado Springs Resort


Introductions
• Highwinds Software
  – Purveyors of Cyclone, Typhoon, Twister, Tornado, and Hurricane
  – Long history of high-performance USENET software
• Josh Gagliardi
  – Employee #2 at Highwinds Software
  – Currently a member of the Highwinds Software Technology Board

Agenda
• Day One
  – USENET/NNTP Introduction
  – Servers Introduction
• Day Two
  – Building an Enterprise Network
  – Highwinds APIs
  – Highwinds Terminology and Advanced Usage
• Day Three
  – Server Administration
  – Disaster Recovery
  – Customer-Driven Discussions

Day One
• Introduction to USENET and NNTP

QUIZ
• Rate yourself from 1 to 6
• Have you ever read, posted to, or used USENET?
• Have you ever installed or managed news server software?
• Do you know what the Path header line is for?
• Do you know how moderated groups work?
• What is “The Freenix Top 1000”?
• What are the big six hierarchies?

What are your problems?
• New news administrator
• No machines
• No bandwidth
• No money
• Growing too rapidly, not enough space
• Too much traffic, we need more capacity
• Too much traffic, we want it to stop
• “Well… my wife/husband left me...”

Communications Technology Lifecycle
• Phase I: Nerd Toy
• Phase II: Pornography
• Phase III: Business Use
• Phase IV: Mainstream Use
• Predecessors: Cable & Satellite TV, VCRs, the Web
• Followers: Web Phones, Camera Phones, Video Phones

What does News look like?


How do you interact with News?
• Newsgroups
• Reading
• Posting

Usenet saved this dog!


How much News is there?
• 1985: One person can read every message in every group and still get work done.
• 1986: RFC 977 approved
• 1990: One group, rec.arts.startrek, is hard to keep up with. Following ten groups is easy.
• 1997: Cyclone released
• 2000: 250K articles/day, 7 Gb in traffic
• One Year Ago: 1-2 M articles/day, 200 Gb

TODAY
• 2-4 M articles/day
• 1.1 Tb daily traffic
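To put the daily volume in network terms, it can be converted to a sustained line rate. The following is a minimal sketch, assuming that "Tb" on the slide means terabytes (10^12 bytes); if it means terabits, divide the result by 8. The function name is illustrative only.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def sustained_mbit_per_s(bytes_per_day: float) -> float:
    """Average line rate in Mbit/s needed to carry a daily byte volume."""
    return bytes_per_day * 8 / SECONDS_PER_DAY / 1e6


# Assumption: 1.1 "Tb" per day is read as 1.1e12 bytes per day.
full_feed = sustained_mbit_per_s(1.1e12)
print(f"{full_feed:.0f} Mbit/s sustained for one full inbound feed")
```

This is an average over 24 hours; it says nothing about peaks, and every outbound full feed adds the same amount again.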

Why Is News Important?
• Network traffic is huge
• Machines are costly
• Mistakes are costly
• Users are dedicated: news makes the phone ring!

How does News get around?
[Diagram: Usenet = ?]

Inside Usenet: “Flood Fill”
[Diagram labels: Readers, Transit Servers, Readers]

Transit Servers
• Communicate with other servers
• Manage article propagation
• Cyclone is a transit server
• Highwinds Innovation

Reader Servers
[Diagram labels: Readers, Transit Server, Reader Server]
• Communicate with newsreaders and with transit servers
• Manage article organization
• Serve as injection point for news
• Typhoon, Twister, and Tornado are reader servers

Newsreaders
• Communicate with reader servers
• Present a coherent view of the news stream to an individual user
• Manage creation of news articles
• Mozilla, Outlook Express, and Agent are newsreaders
• Special-Purpose Readers

Inside Usenet
[Diagram labels: Readers, Transit Servers, Readers]

Is News the same as Mail?
• No: Mail is delivered point-to-point while news is broadcast.
• No: Each user pays for the storage of his own mail.
• No: If everyone voiced their opinions to 50,000 people with email, the mail server might well fill or fail.

Is News the same as Chat?
• No: If you go away on vacation, you can catch up on your news.
• No: Your comments are reliably gatewayed to machines around the world. People don’t have to be able to connect to your chat server to talk to you.
• No: You can use attachments in newsgroups where such behavior is welcome.

How quickly does News get around?
• Before Cyclone, it could take hours and sometimes days for articles to make it to every leaf node.
• Now, articles typically arrive at well-connected sites within a few seconds and at less well-connected sites within minutes.

What is Next?
• Discussion Message Boards
• Replicated Discussion Message Boards
• Archived Instant Messages
• Archived Mailing Lists
• Hubs for Peer-to-Peer
• Numbering Servers
• Archive Servers
• ?????

Other Servers
Two other servers, both Highwinds inventions:
• NUMBERING SERVER - Hurricane
• ARCHIVE SERVER - Tornado BE
[Diagram labels: Hurricane, Tornado BE]

Anatomy of an Article
• Governed by RFC 850, since superseded by RFC 1036 (same header-and-body message format as Internet mail, RFC 822)
• Header + Body
[Diagram labels: Header, Body]

Article Anatomy - Gory Details
• Know your separators!
• CR = ^M = \r
• LF = ^J = \n

HeaderName: Value \r\n
SEPARATOR \r\n
Body
TERMINATOR \r\n
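The layout above can be sketched in a few lines. This is a minimal illustration, assuming the separator is an empty line and the terminator is a line holding a single dot, as in NNTP; it also applies NNTP dot-stuffing to body lines that begin with a dot. Folded (multi-line) headers are not handled, and the function names are made up for the example.

```python
CRLF = "\r\n"


def build_article(headers: dict[str, str], body_lines: list[str]) -> str:
    """Serialize headers and body the way they travel over NNTP."""
    head = "".join(f"{name}: {value}{CRLF}" for name, value in headers.items())
    # Dot-stuffing: a body line starting with "." gets an extra "." so it
    # cannot be mistaken for the terminator line.
    stuffed = ["." + line if line.startswith(".") else line for line in body_lines]
    body = "".join(line + CRLF for line in stuffed)
    return head + CRLF + body + "." + CRLF


def split_article(wire: str) -> tuple[dict[str, str], list[str]]:
    """Inverse of build_article for unfolded headers."""
    terminator = "." + CRLF
    if not wire.endswith(CRLF + terminator):
        raise ValueError("article is not terminated by a lone dot line")
    wire = wire[: -len(terminator)]
    head, _, body = wire.partition(CRLF + CRLF)
    headers = {}
    for line in head.split(CRLF):
        name, _, value = line.partition(": ")
        headers[name] = value
    lines = body.split(CRLF)[:-1] if body else []
    # Undo dot-stuffing: drop the leading dot that the sender added.
    return headers, [line[1:] if line.startswith(".") else line for line in lines]


wire = build_article({"From": "a@example.invalid", "Subject": "test"},
                     ["hello", ".signature follows"])
print(repr(wire))
```

Sending bare LF instead of CRLF, or forgetting the terminator, is the kind of mistake the "know your separators" bullet warns about.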

Article Headers • QUIZ


Article Headers, Continued...
• The basics:
  – From
  – Subject
  – Newsgroups
• Path
• Message-ID
• Xref
• Date
• NNTP-Posting-Host
• NNTP-Posting-Date
• X-Trace
• References

An Article

Path: ndnws01.ne.mediaone.net!chnws05.ne.mediaone.net!24.128.1.91!chnws02.mediaone.net!192.148.253.68!netnews.com!newshub.northeast.verio.net!nuq-peer.news.verio.net!dfw-artgen.news.verio.net!ord-read.news.verio.net.POSTED!not-for-mail
From: "Henry C. Barta" <hbarta@miles.wwa.com>
Subject: Re: old IBM thinkpads and linux?
Newsgroups: comp.os.linux.portable
References: <19990728232115.18753.00003295@ng-ba1.aol.com>
Message-ID: <hStB3.112$DY.3777@ord-read.news.verio.net>
Date: Wed, 08 Sep 1999 13:52:13 GMT

I ran Linux on a 750Cs - 33 MHz 486, 12 MB Ram and 360 MB hardrive. I was able to install X and run that. It was pretty OK
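The Path header reads right to left: each server that handles the article prepends its own name and a "!", so the leftmost entry is the most recent hop. A feeder can use it to avoid offering an article to a peer that has already handled it. Below is a minimal sketch of that idea; the function names are illustrative, and the sample path is a shortened one built from hop names that appear in the article above.

```python
def path_hops(path: str) -> list[str]:
    """Split a Path header value into hops, most recent first."""
    return [hop for hop in path.split("!") if hop]


def peer_already_saw(path: str, peer_name: str) -> bool:
    """True if the peer's name is already in the Path, so offering is pointless."""
    return peer_name.lower() in (hop.lower() for hop in path_hops(path))


def prepend_self(path: str, my_name: str) -> str:
    """What a server does to the Path before passing the article on."""
    return f"{my_name}!{path}"


sample = "netnews.com!newshub.northeast.verio.net!nuq-peer.news.verio.net!not-for-mail"
print(path_hops(sample)[0])                    # most recent hop
print(peer_already_saw(sample, "netnews.com"))
print(prepend_self(sample, "news.example.invalid"))
```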

A Day in the Life of an Article
• Creation
• Posting
• Feeding / Transit - By Message-ID
• Storage - By Group and Number
• Reading
• Expiration

Article Creation and Posting
• Minimal requirements on the user
• The First Time Is Always Special: POST vs. IHAVE
• Post Filtering
  – SPAM prevention
  – Accountability
• Moderated Groups
• Upstream

Article Feeding and Transit
• Each server offers articles to the other servers it knows about, its “peers”
• During feeding, Message-ID matters more than Group/Article Number
• Servers offer articles with IHAVE
• Servers refuse already-seen articles
• The “history”
• IHAVE is chatty...
• CHECK / TAKETHIS
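The offer-and-refuse cycle is what makes flood fill terminate. The simulation below is a minimal sketch with a made-up four-server topology: every server keeps a history of Message-IDs it has seen, accepts an article only once, and offers it to all peers except the one it came from. It models the idea only, not any real server's history implementation.

```python
from collections import deque


def flood_fill(peers: dict[str, list[str]], origin: str, message_id: str):
    """Propagate one article; return (servers in acceptance order, refused offers)."""
    history = {name: set() for name in peers}   # per-server set of seen Message-IDs
    history[origin].add(message_id)
    accepted = [origin]
    refused = 0
    offers = deque((origin, peer) for peer in peers[origin])
    while offers:
        sender, receiver = offers.popleft()
        if message_id in history[receiver]:
            refused += 1                        # already seen: the duplicate stops here
            continue
        history[receiver].add(message_id)
        accepted.append(receiver)
        # Pass it on to every peer except the one that just sent it.
        offers.extend((receiver, p) for p in peers[receiver] if p != sender)
    return accepted, refused


# Hypothetical topology for illustration.
topology = {"a": ["b", "c"], "b": ["a", "c", "d"], "c": ["a", "b", "d"], "d": ["b", "c"]}
accepted, refused = flood_fill(topology, "a", "<1@example.invalid>")
print(accepted, refused)
```

Every server ends up with exactly one copy, at the cost of some refused offers; a cheap way to refuse is why the offer step matters so much for performance.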

Article Feeding II: NNTP STREAMING


Article Feeding III: Adaptive Streaming
• Key Question: How many CHECKs should you send?
• Multiple modes, appropriate for all different load conditions
• Optimized for “getting articles into” another server
• Key Cyclone differentiator
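To make the "how many CHECKs" question concrete, here is a small simulation of a streaming sender that keeps a fixed number of CHECK commands outstanding before reading a reply. It is only a sketch: the fixed `window` parameter and the simulated peer are assumptions for illustration, and it is not Cyclone's adaptive algorithm, which the slides do not describe. The reply codes are the standard streaming ones (238 = send it, 438 = not wanted).

```python
from collections import deque


def stream_feed(message_ids, peer_has, window=3):
    """Return the commands a sender would issue, in order, for one feed run."""
    sent = []
    pending = deque()                 # CHECKs still waiting for a reply
    queue = deque(message_ids)
    while queue or pending:
        # Fill the pipeline: send CHECKs without waiting for each answer.
        while queue and len(pending) < window:
            mid = queue.popleft()
            sent.append(f"CHECK {mid}")
            pending.append(mid)
        # Read one reply from the (simulated) peer.
        mid = pending.popleft()
        code = 438 if mid in peer_has else 238
        if code == 238:
            sent.append(f"TAKETHIS {mid}")   # the article follows immediately
    return sent


ids = ["<1@x>", "<2@x>", "<3@x>", "<4@x>"]
commands = stream_feed(ids, peer_has={"<2@x>"}, window=2)
print("\n".join(commands))
```

A larger window hides more round-trip latency but commits the sender to more offers before it learns how the peer is responding, which is the trade-off an adaptive scheme has to manage.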

Article Storage
• In the Highwinds servers, articles are stored in Spools.
• Articles are indexed for retrieval by readers
• The Active File and the Overview Database create the “illusion” of groups.
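The "illusion" of groups can be shown with a toy store: the article is kept once, keyed by Message-ID, while per-group counters and indexes hand out article numbers that point back at it. This is a minimal in-memory sketch under that assumption; the class and attribute names are invented and do not reflect the Highwinds on-disk formats.

```python
class TinyNewsStore:
    def __init__(self):
        self.history = {}    # Message-ID -> article text (one stored copy)
        self.active = {}     # group -> highest article number assigned so far
        self.overview = {}   # group -> {article number: Message-ID}

    def store(self, message_id, newsgroups, text):
        """Store once, number per group; return an Xref-style string or None if seen."""
        if message_id in self.history:
            return None
        self.history[message_id] = text
        xref = []
        for group in newsgroups:
            number = self.active.get(group, 0) + 1
            self.active[group] = number
            self.overview.setdefault(group, {})[number] = message_id
            xref.append(f"{group}:{number}")
        return " ".join(xref)

    def article(self, group, number):
        """What a reader asks for: group plus number, resolved via Message-ID."""
        return self.history[self.overview[group][number]]


store = TinyNewsStore()
store.store("<1@example.invalid>", ["comp.os.linux.portable"], "first")
xref = store.store("<2@example.invalid>", ["comp.os.linux.portable", "misc.test"], "second")
print(xref)
```

A crossposted article costs one copy of the body and one index entry per group.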

Space Utilization
• Spools are always full
• Articles expire based on SPACE, not TIME
• You decide how much space to allocate, by hierarchy or article size
• New data overwrites old data, as needed, and without any pause for cleanup
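Space-based expiration can be modeled as a fixed-size buffer in which storing a new article pushes out the oldest ones. The sketch below captures only that policy; a real spool would overwrite blocks in place on disk, and the class name and byte sizes here are assumptions for illustration.

```python
from collections import deque


class CyclicSpool:
    def __init__(self, capacity_bytes: int):
        self.capacity = capacity_bytes
        self.used = 0
        self.articles = deque()   # (message_id, size), oldest first

    def store(self, message_id: str, size: int) -> list[str]:
        """Store an article; return the Message-IDs that were overwritten."""
        if size > self.capacity:
            raise ValueError("article is larger than the whole spool")
        expired = []
        # Make room by dropping the oldest articles; no separate expire pass.
        while self.used + size > self.capacity:
            old_id, old_size = self.articles.popleft()
            self.used -= old_size
            expired.append(old_id)
        self.articles.append((message_id, size))
        self.used += size
        return expired


spool = CyclicSpool(100)
spool.store("<a@x>", 40)
spool.store("<b@x>", 40)
overwritten = spool.store("<c@x>", 40)
print(overwritten, spool.used)
```

Retention time is then a consequence of spool size divided by incoming volume, not a setting.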

Data Storage Philosophy
• Customized Data Structures
• Maximize Utilization
• Maximize Locality
• Avoid inode hits
• Allocate space at install time
• RESULT: Block-Oriented Storage

Reading Articles
• LIST ACTIVE: What groups exist?
• GROUP: Select a group for reading
• XOVER: What articles exist?
• ARTICLE: Finally, read an article
• All of these commands depend on article numbers.
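The dependence on article numbers shows up as soon as a client parses the server's replies. The sketch below parses a GROUP status line ("211 count first last group") and one tab-separated XOVER line, then builds the next commands from the numbers it found. It uses no network connection; the sample reply lines and all values in them are hypothetical.

```python
from typing import NamedTuple


class GroupStatus(NamedTuple):
    count: int
    first: int
    last: int
    name: str


def parse_group_response(line: str) -> GroupStatus:
    """Parse the reply to GROUP: '211 count first last groupname'."""
    code, count, first, last, name = line.split()[:5]
    if code != "211":
        raise ValueError(f"group not selected: {line!r}")
    return GroupStatus(int(count), int(first), int(last), name)


# Field order of the standard overview line returned by XOVER.
OVERVIEW_FIELDS = ("number", "subject", "from", "date",
                   "message-id", "references", "bytes", "lines")


def parse_xover_line(line: str) -> dict[str, str]:
    return dict(zip(OVERVIEW_FIELDS, line.rstrip("\r\n").split("\t")))


def next_commands(status: GroupStatus) -> list[str]:
    """Commands a simple reader might send after GROUP succeeds."""
    return [f"XOVER {status.first}-{status.last}", f"ARTICLE {status.last}"]


status = parse_group_response("211 3 101 103 comp.os.linux.portable")
entry = parse_xover_line("103\tRe: old IBM thinkpads and linux?\tuser@example.invalid\t"
                         "Wed, 08 Sep 1999 13:52:13 GMT\t<m1@example.invalid>\t"
                         "<m0@example.invalid>\t1024\t12\r\n")
print(next_commands(status), entry["subject"])
```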

How to Write a News Reader


How to Write a News Reader II


Structure of the Highwinds Servers
• Installation uses a hellishly complicated object-oriented GUI-driven tool called... tar.
• Three configuration files control the server behavior:
  – start.conf for command-line arguments
  – <server>.conf for storage declarations and whole-server parameters
  – feeds.conf for controlling how the server communicates with other servers and with users

start.conf
• Contains parameters most likely to change during server tuning
• Is called by the bin/start script

<server>.conf
• Contains mostly parameters having to do with storage
• Declares the filesystem paths for storage objects
• SPOOLS
• OVERVIEWS
• HISTORY
• ACTIVE FILE
• OVERVIEW CACHES

feeds.conf
• Allows you to tune server-server and server-browser communication in excruciating detail
• Virtual Servering - the server appears different to different users
• Fine-grained feeding control (Cyclone)
• Virtual-servering primitives: IncomingHostnames, rate limiting, groups visible
• Cyclone feeding primitives: backlogs, retry and failover settings

Virtual Servering on Steroids: Authentication Programs
• Ultra-fine control
• Server-spawned program
• Multiple instances allowed
• Full real-time override of all virtual server parameters
• Intercepts allowed at connect
• Can force users to authenticate with username/password

Death to Spammers: SPAM filtering
• Server-spawned slave program
• With -fastfilter, parallel filters and shared-memory communication
• Program allows or denies each incoming article
• The post filter can do spam filtering for locally-sourced articles
  – Delayed acknowledgement
  – False acknowledgement

Details of Content Control
• Subscription
• FilterSubscription
• Globs
• Used for spools, overview caches, and virtual servering
• Examples:
  – Subscription *
  – FilterSubscription special.*
  – Subscription basic.*
  – FilterSubscription !basic.*
• CROSSPOSTING
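Glob-based group selection, and why crossposting complicates it, can be illustrated with Python's `fnmatch` as a stand-in matcher. This is a sketch under stated assumptions: patterns are evaluated in order, the last matching pattern wins, and a leading "!" negates. The slides do not spell out the exact Highwinds semantics, so neither those rules nor the "any group" versus "all groups" choice below should be read as the servers' actual behavior.

```python
from fnmatch import fnmatchcase


def group_matches(group: str, patterns: list[str]) -> bool:
    """Assumed rule: last matching pattern wins; '!' prefix means exclude."""
    verdict = False
    for pattern in patterns:
        negated = pattern.startswith("!")
        if fnmatchcase(group, pattern.lstrip("!")):
            verdict = not negated
    return verdict


def article_matches(newsgroups: str, patterns: list[str], require_all: bool = False) -> bool:
    """Apply the patterns to every group in a (possibly crossposted) Newsgroups header."""
    groups = [g.strip() for g in newsgroups.split(",") if g.strip()]
    results = [group_matches(g, patterns) for g in groups]
    check = all if require_all else any
    return bool(results) and check(results)


crosspost = "basic.news,special.offers"
print(article_matches(crosspost, ["special.*"]))                    # any group matches
print(article_matches(crosspost, ["special.*"], require_all=True))  # every group must match
```

The two calls give different answers for the same crossposted article, which is why a content policy has to state which rule applies.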

Logfiles and Log Rotation
• Many logfiles available:
  – article log
    • 0x, 0i, 0s, etc.
  – Stats.in
  – Stats.out
  – Stats.reader
  – Stats.group
• Logs are buffered
• bin/statsnow
• Rotation Recipe
  – mv log log.old
  – bin/statsnow
  – sleep 5
  – gzip log.old

FINAL ADVICE
• READ the config files. They contain great examples.
• Use bin/validate before bin/restart.
• Decide your storage policies early.
• Get lots of spindles working for you.
• Check the syslog!