Resource Sharing Across Users in Server Clusters
Optimizing and Scaling Enterprise Applications
Krithi Ramamritham, IIT Bombay
krithi@iitb.ac.in

Enterprise (Information) Systems
• Any kind of computing system that is of "enterprise class"
  – offering high quality of service
  – dealing with large volumes of data
  – capable of supporting a large organization ("an enterprise")
• Enterprise Information Systems
  – provide a technology platform that enables organisations to integrate and coordinate their business processes
  – provide a single system that is central to the organisation
  – ensure that information can be shared across all functional levels and management hierarchies
  – help eliminate the problem of information fragmentation caused by multiple information systems in an organisation

Optimizing and Scaling Enterprise Applications…
• Enterprise applications are constructed using a multi-tier architecture for simplified development and maintenance
[Figure: Request, Web Server Layer, Application Server Layer, Content Database]
• Considerable time and money is invested in the server infrastructure
• A significant amount of developer time is being spent to optimize Web applications

WS vs. AS
• Web servers
  – Do well-defined and quantifiable local work
    • e.g., processing HTTP headers, serving static content
• Application servers
  – Run multi-layer programs
    • e.g., scripts involving calls to backends

Inside the Application Layer
3-tier model
• PRESENTATION: JSP, ASP
• BUSINESS LOGIC: Servlets, COM+, EJB
• DATA CONNECTOR: JDBC, ODBC
• ADDT'L SERVICES: Commerce, Content Mgt., Personalization
• Back ends: Databases, Legacy Systems
[Figure labels: HTML, Objects, RowSet]

Inside the Application Layer…
[Figure: code blocks in the PRESENTATION, BUSINESS LOGIC, and DATA CONNECTOR (JDBC, ODBC) tiers; ADDT'L SERVICES (Commerce, Content Mgt., Personalization); Databases and Legacy Systems]
1. JSP invokes a Servlet
2. Servlet contacts CMS
3. CMS requests data
4. DBMS calls storage system

Performance and Scalability Issues
• Computationally-intensive logic executed at multiple tiers
• Cross-tier communication
• Object instantiation and cleanup processing
• External I/O calls
• Database connection pool latencies
• Content conversion and formatting

Optimizing the Application Layer: Traditional Means
• Optimize each tier independently:
  – Presentation-level caches built inside application server processes
  – Main memory database employed over persistent DBMS
  – Persistent object storage techniques employed inside content management systems
  – … and so on
[Figure: PRESENTATION (JSP, ASP), BUSINESS LOGIC (Servlets, COM+, EJB), DATA CONNECTOR (JDBC, ODBC), and ADDT'L SERVICES, each with local cache and optimization code]

Query result caching
• Many application server products offer this feature
  – mitigates only local database access latency
  – only a subset of query results may be reused in page generation
  – page fragments may not all be from databases

Middle tier database caching
• Caching database tables in main memory
  – Oracle 9i Cache
  – Main-memory databases, e.g., TimesTen
• Drawbacks:
  – mitigates only database access latency
  – caching at table granularity results in poor cache utilization
  – main-memory databases are difficult to integrate and maintain and can be expensive

Page Level Caching
• Dynamically generated HTML pages are cached
  + Can completely offload work from web/app server
  – Low reusability for highly personalized web pages
  – URL may not uniquely identify a page, increasing the risk of delivering incorrect pages
  – Often introduces excessive invalidations, e.g., even if a single element on the page changes
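The risk that a URL does not uniquely identify a page can be illustrated with a minimal sketch. Everything here is hypothetical (the `render_home` generator, the `/home` URL, and the cache dictionaries); it is not taken from any product discussed in the slides.

```python
from typing import Dict


def render_home(user: str) -> str:
    # Hypothetical personalized page generator.
    return f"<h1>Welcome back, {user}</h1>"


def get_page(cache: Dict[str, str], key: str, user: str) -> str:
    # Serve from the page cache when the key is present; otherwise render and store.
    if key not in cache:
        cache[key] = render_home(user)
    return cache[key]


# Keyed by URL alone: the second user receives the first user's page.
url_cache: Dict[str, str] = {}
alice_page = get_page(url_cache, "/home", "alice")
bob_page = get_page(url_cache, "/home", "bob")

# Keyed by URL plus user: pages are correct, but nothing is reused across users.
per_user_cache: Dict[str, str] = {}
alice_ok = get_page(per_user_cache, "/home?user=alice", "alice")
bob_ok = get_page(per_user_cache, "/home?user=bob", "bob")
```

The second variant shows the trade-off named on the slide: making the key specific enough to be correct removes most of the reuse for personalized pages.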

Optimizing the Application Layer: Issues
• Traditional techniques impact specific components within the application, but not the entire application
  – No mitigation of component-to-component interaction latencies
  – Different synchronization and invalidation policies risk data integrity
  – Each optimization scheme consumes programmer time for development and maintenance

Key ideas
• Re-use program results to eliminate redundant work
• Facilitate single-point, architecture-wide optimization
• Apply to both programmatic objects and result fragments

Optimizing the Application Layer
[Figure: a single cache alongside PRESENTATION (JSP, ASP), BUSINESS LOGIC (Servlets, COM+, EJB), DATA CONNECTOR (JDBC, ODBC), ADDT'L SERVICES (Commerce, Content Mgt., Personalization), Databases, and Legacy Systems]
• Enables the results of programs to be re-used.

Usually…
[Figure: code blocks in the PRESENTATION, BUSINESS LOGIC, and DATA CONNECTOR (JDBC, ODBC) tiers; ADDT'L SERVICES (Commerce, Content Mgt., Personalization); Databases and Legacy Systems]
1. JSP invokes a Servlet
2. Servlet contacts CMS
3. CMS requests data
4. DBMS calls storage system
Plus, at each step there are communication delays and logic processing delays.

With Our Solution…
[Figure: Appl. Programming Interface; Chutney tags around code blocks in the PRESENTATION, BUSINESS LOGIC, and DATA CONNECTOR (JDBC, ODBC) tiers; real-time storage engine holding Function, Parameter(s), Result entries]
• Tags trigger calls to the storage engine.
• When the Result of a Function with a specific Parameter set is already known (and up-to-date), the work normally necessary to produce that Result is bypassed.
• A stored Result can be any program output, but is most commonly an HTML fragment or a programmatic object.
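The idea of bypassing work when a Function's Result for a given Parameter set is already known can be sketched as follows. This is only an illustration of the lookup, not the Chutney API: `ResultStore`, `fetch_or_compute`, and `render_product_fragment` are hypothetical names, and parameters are assumed to be hashable.

```python
from typing import Any, Callable, Dict, Hashable, Tuple


class ResultStore:
    """Hypothetical stand-in for a storage engine: (function, parameters) -> result."""

    def __init__(self) -> None:
        self._results: Dict[Tuple[str, Tuple[Hashable, ...]], Any] = {}
        self.hits = 0
        self.misses = 0

    def fetch_or_compute(self, name: str, func: Callable[..., Any], *params: Hashable) -> Any:
        key = (name, params)
        if key in self._results:
            # Result already known: skip the work that would normally produce it.
            self.hits += 1
            return self._results[key]
        self.misses += 1
        result = func(*params)
        self._results[key] = result
        return result

    def invalidate(self, name: str, *params: Hashable) -> None:
        # Drop a result that is no longer up to date.
        self._results.pop((name, params), None)


def render_product_fragment(product_id: int) -> str:
    # Stands in for the expensive JSP -> Servlet -> CMS -> DBMS chain.
    return f"<div class='product'>Product {product_id}</div>"


store = ResultStore()
first = store.fetch_or_compute("product_fragment", render_product_fragment, 42)
second = store.fetch_or_compute("product_fragment", render_product_fragment, 42)
```

The second call returns the stored fragment without invoking `render_product_fragment` again; calling `invalidate` would force the next request to recompute it.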

Cache Management
• A critical aspect of any caching solution
• Support novel cache management strategies:
  – Prediction-based cache replacement
  – Observation-based cache invalidation

Cache Replacement
• Prediction-based replacement
  – fragments having lowest probability of access replaced
  – Least-Likely-to-be-Used (LLU)
  – Access probabilities based on:
    • Current user navigational patterns over site graph (in the form of clickstreams)
    • Historical user navigational patterns over site graph (in the form of association rules)
[Figure: Site Graph with nodes News, Sports, Hockey, and below Hockey: Schedules, Scores, Players, Teams]
• Example access probabilities given the path (News, Sports, Hockey):
  – Scores = 55%
  – Schedules = 20%
  – Players = 15%
  – Teams = 10% (LLU)
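A minimal sketch of the LLU choice, using the access probabilities from the slide's example. How the probabilities are predicted (from clickstreams and association rules) is not modeled here; the function name and dictionary are assumptions made for illustration.

```python
from typing import Dict


def least_likely_to_be_used(access_probability: Dict[str, float]) -> str:
    """Return the cached fragment with the lowest predicted access probability."""
    return min(access_probability, key=lambda name: access_probability[name])


# Predicted probabilities for a user who has navigated (News, Sports, Hockey).
predicted = {"Schedules": 0.20, "Players": 0.15, "Teams": 0.10, "Scores": 0.55}
victim = least_likely_to_be_used(predicted)
print(victim)  # Teams
```

When the cache is full, the fragment returned here would be the one replaced.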

Cache Invalidation
Need to support common cache invalidation techniques:
  – Time-based: Each cache element assigned a TTL
  – Event-based: Updates to the database send an invalidation message to the cache
  – On demand: Manual invalidation of selected elements
  – …
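The three common techniques can be sketched in one small class. This is an illustrative sketch under stated assumptions: `TtlCache`, its method names, and the injectable `clock` (used so expiry can be shown without waiting) are hypothetical, and the database update notification is simulated by a direct method call.

```python
import time
from typing import Any, Callable, Dict, Iterable, Optional, Tuple


class TtlCache:
    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        # key -> (value, time at which the entry expires)
        self._entries: Dict[str, Tuple[Any, float]] = {}

    def put(self, key: str, value: Any, ttl_seconds: float) -> None:
        self._entries[key] = (value, self._clock() + ttl_seconds)

    def get(self, key: str) -> Optional[Any]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            # Time-based: the element has outlived its TTL.
            del self._entries[key]
            return None
        return value

    def invalidate(self, key: str) -> None:
        # On demand: manual invalidation of a selected element.
        self._entries.pop(key, None)

    def on_database_update(self, affected_keys: Iterable[str]) -> None:
        # Event-based: an update message names the elements it makes stale.
        for key in affected_keys:
            self.invalidate(key)


now = [0.0]  # fake clock so the example is deterministic
cache = TtlCache(clock=lambda: now[0])
cache.put("price:42", "$10", ttl_seconds=30)
fresh = cache.get("price:42")
now[0] = 31.0
expired = cache.get("price:42")
```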

Cache Invalidation…
• Other invalidation techniques supported:
  – Observation-based:
    • User-initiated updates are observed in scripts; each such update sends an invalidation message to the cache
    • Most appropriate for auction sites, online trading sites
    • Invalidation does not require communication with the databases
  – Keyword-based:
    • Elements can be associated with keywords; e.g., a retailer may wish to invalidate all "seasonal" items
  – Regular expression-based:
    • Elements can be invalidated based on regular expression matching
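Keyword-based and regular expression-based invalidation can be sketched as below. The class, its method names, and the example keys are hypothetical, and matching the regular expression against the element key (rather than, say, its content) is an assumption made for this sketch.

```python
import re
from typing import Dict, Iterable, Optional, Set


class TaggedCache:
    def __init__(self) -> None:
        self._values: Dict[str, str] = {}
        self._keywords: Dict[str, Set[str]] = {}

    def put(self, key: str, value: str, keywords: Iterable[str] = ()) -> None:
        self._values[key] = value
        self._keywords[key] = set(keywords)

    def get(self, key: str) -> Optional[str]:
        return self._values.get(key)

    def _remove(self, keys: Iterable[str]) -> int:
        removed = 0
        for key in list(keys):
            if self._values.pop(key, None) is not None:
                removed += 1
            self._keywords.pop(key, None)
        return removed

    def invalidate_keyword(self, keyword: str) -> int:
        # Keyword-based: drop every element associated with the keyword.
        return self._remove(k for k, words in list(self._keywords.items()) if keyword in words)

    def invalidate_matching(self, pattern: str) -> int:
        # Regular expression-based: drop every element whose key matches.
        regex = re.compile(pattern)
        return self._remove(k for k in list(self._values) if regex.search(k))


cache = TaggedCache()
cache.put("/catalog/sweaters", "<div>Sweaters</div>", keywords=["seasonal"])
cache.put("/catalog/socks", "<div>Socks</div>")
cache.put("/news/sports/hockey/scores", "<div>Scores</div>")

seasonal_removed = cache.invalidate_keyword("seasonal")
sports_removed = cache.invalidate_matching(r"^/news/sports/")
```

After both calls only the `/catalog/socks` element remains cached.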

Other Fragment Level Caching…
• App servers (e.g., BEA's WebLogic, IBM's WebSphere) cache fragments produced by JSP scripts
  + can offload presentation layer tasks
  – runs in the application server process space => competes for server resources
  – application server cluster => multiple cache instances, duplication of content, additional synchronization overhead
[Figure: Application Server Cluster]

Performance Study…
Test Site
  – Fictitious online retail site, allows browsing of product catalog
  – Pages generated using JSP scripts
  – Site content stored in Oracle database
  – Database schema based on Dublin Core Metadata Open Standard
  – Contains 200,000 products and 44,000 categories
  – Each page consists of 3 components, each involving a database call

Performance Study…
Test Setup
  – Content Database Server: Oracle 8.1.6
  – Web/Application Server: WebLogic 6.0 running on a cluster of 2 machines
  – Server machines: 1 GB RAM, dual P III 933 MHz processors, running Windows 2K Advanced Server

Testing Methodology…
• Baseline Parameters:
  – Cache Size, i.e., percentage of fragments that fit into cache: 75%
  – Cache replacement policy: LLU
• User load is varied by sending requests from client machines running RadView's WebLoad
• Simulated users navigate site according to Zipf 80-20 distribution (i.e., 80% of users follow 20% of navigation links)

Performance Impact
• 80% faster response times through existing application infrastructure
Source: Fortune 100 client results

Chutney Throughput Impact
• 250% increase in transaction rates
Source: Fortune 100 client results

Broad Interoperability
• Java-based: JSP, Servlets, EJB, BEA WebLogic, IBM WebSphere, iPlanet, Broadvision, etc.
• Microsoft-based: ASP, COM, IIS, MS Transaction Server, etc.
• Other: ColdFusion, Perl, etc.
[Figure: Presentation, Business Logic, and Data tiers of each platform, and the Chutney cache]
Multi-server, heterogeneous environments can interface with a single storage engine.

Alternative: CDNs
• Content Distribution Networks, e.g., Akamai
[Figure: Sources, Repositories, Clients; labeled "Push Based Core Infrastructure"]

Request Distribution within Clusters
• Maximizing affinity
• Exploit application characteristics

Summary
• Bottlenecks persist throughout multi-tier architectures
• Traditional optimization approaches focus on individual components, not the entire application
• Need a solution which optimizes every tier of a web application, globally