MOLECULAR BIOLOGY & PATHOLOGY IN EPIDEMIOLOGY Jian Yu Rao, M.D. Associate Professor of Pathology and Epidemiology, UCLA
Molecular Biology - Outline • Introduction • Basic Principles of Molecular Biology • Core Techniques of Molecular Biology • High Throughput Technologies • Epigenetics – DNA Methylation
INTRODUCTION • 1953 - Discovery of the DNA double helix (Crick & Watson) • 1960s - DNA transcription mechanism • 1970s - Recombinant DNA technology • 1980s - PCR • 1990s - Human Genome Project/DNA chips • 2000s - Genome Wide Association (GWA) studies
Basic Principles of Molecular Biology • DNA structure – 4 bases (nucleotides): 2 pyrimidines, thymine (T) and cytosine (C), and 2 purines, adenine (A) and guanine (G). – The bases form a double helix by base pairing through hydrogen bonds (A to T and G to C), with a backbone consisting of sugars and phosphates. – The strands have polarity (5' to 3') and are complementary and antiparallel to each other.
– Genetic information is organized linearly: • A codon is the basic unit: 3 consecutive nucleotides that specify a single amino acid. • A gene is a segment of DNA (multiple linearly linked codons) that specifies a protein, e.g., 5'-CCT GGT CCT CTG ACT GCT-3'. • A chromosome contains hundreds to thousands of genes and is the smallest replicating unit (humans have 46 chromosomes). • The genome is the entire set of genetic information that an organism contains.
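As a minimal sketch of the base-pairing rule and codon reading described above, the snippet below builds the complementary strand and translates the example sequence from the slide. The function names and the compact codon-table construction are illustrative choices, not part of the lecture; the table itself is the standard genetic code.

```python
# Toy illustration of base pairing and codon translation (standard genetic code).
BASES = "TCAG"
# Amino acids for the 64 codons in TCAG order (first, second, third base); '*' = stop.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the complementary strand, written 5' to 3'."""
    return "".join(PAIR[base] for base in reversed(seq))


def translate(coding_seq):
    """Translate a coding-strand DNA sequence codon by codon."""
    seq = coding_seq.replace(" ", "").upper()
    usable = len(seq) - len(seq) % 3          # ignore a trailing partial codon
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    return "".join(CODON_TABLE[codon] for codon in codons)


example = "CCT GGT CCT CTG ACT GCT"            # sequence shown on the slide
print(translate(example))                       # PGPLTA
print(reverse_complement(example.replace(" ", "")))  # AGCAGTCAGAGGACCAGG
```

Reading the slide's sequence as the coding strand is an assumption; if it were the template strand, the reverse complement would be translated instead.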
Basic Principles of Molecular Biology (Cont.) • Gene structure – A gene is composed of an upstream 5' regulatory region (TATA box or CAAT box), several exons (expressed gene sequences), and intervening introns (nonexpressed sequences). – Early estimates put the total at about 100,000 genes in a mammalian genome; sequencing of the human genome has since revised this to roughly 20,000-25,000 protein-coding genes. – Less than 30% of the genome is ever transcribed into RNA, and only a fraction of that is translated into protein.
– More than 70% of the entire genome is not transcribed and is composed of many stretches of repetitive sequences, with repeat units ranging from 5-10 bp to 5000-6000 bp. Species-specific types of repeats, such as Alu sequences, are useful as markers for identifying genes transferred between species. – A gene family is a group of closely linked genes that code for structurally and functionally related proteins.
Basic Principles of Molecular Biology (Cont.) • Gene transcription (DNA to mRNA) – mRNA (messenger RNA) is the template for protein synthesis. – The whole gene (exons and introns) is transcribed, but only the exon sequences are retained in the mature mRNA. – Transcription begins with the binding of RNA polymerase II to the initiation site. This process requires transcription factors, proteins that recognize the region of DNA to be transcribed.
– A “primary transcript” spanning from the initiation site to a termination site (including all the exons and introns) is produced first. A cap (methylated G) is then added at the 5' end and a poly(A) tail at the 3' end, and the introns are removed by several steps of splicing. – The mature mRNA is then exported from the nucleus to the cytoplasm through the nuclear pores for translation.
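The processing steps above (primary transcript, capping, polyadenylation, splicing) can be sketched as simple string operations. This is a toy model under stated assumptions: the gene sequence and exon coordinates are made up, the input is taken to be the coding strand, and the cap is represented by the text marker "m7G-" purely for display.

```python
def mature_mrna(gene_dna, exons, poly_a_length=20):
    """Join exon segments of the coding strand and write them as RNA.

    exons: list of (start, end) pairs, 0-based, end exclusive (hypothetical layout).
    The 5' cap is shown as the text marker "m7G-"; a real cap is not a letter.
    """
    primary_transcript = gene_dna.upper().replace("T", "U")   # DNA to RNA alphabet
    spliced = "".join(primary_transcript[start:end] for start, end in exons)  # drop introns
    return "m7G-" + spliced + "A" * poly_a_length


# Made-up gene: exon 1 (6 nt) + intron (12 nt) + exon 2 (6 nt)
gene = "ATGGCC" + "GTAAGTTTTCAG" + "AAATGA"
print(mature_mrna(gene, [(0, 6), (18, 24)], poly_a_length=5))
# m7G-AUGGCCAAAUGAAAAAA
```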
Basic Principles of Molecular Biology (Cont.) • Translation (mRNA to protein) – Translation takes place in the cytoplasm, on ribosomes. – Proteins are further modified by post-translational modification steps, including proteolytic cleavage, addition of carbohydrate or lipid moieties, and modification of amino acids. • Gene expression in a cell is influenced by both the micro (surrounding cells, tissue, organ) and macro (endocrine and paracrine) environments.
Core Techniques • Restriction Endonucleases – Enzymes found in bacteria that cleave DNA at precise sequences. – Named after the organism of origin (e.g., EcoRI is from the E. coli R strain). – The size of the fragments produced is a function of the number of bases in the restriction site (e.g., 4-cutters cut DNA into smaller fragments, while 8-cutters produce gene-sized DNA fragments).
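The link between site length and fragment size can be made concrete. Assuming random DNA with equal base frequencies (a simplification), an n-base site occurs on average once every 4^n bases. The sketch below also cuts a made-up sequence at EcoRI sites (GAATTC, cut after the G); the function names are illustrative.

```python
def expected_fragment_length(site_length):
    """Average spacing between sites in random DNA with equal base frequencies."""
    return 4 ** site_length


def digest(seq, site="GAATTC", cut_offset=1):
    """Cut seq at every occurrence of site; cut_offset=1 models EcoRI (G^AATTC)."""
    fragments, start = [], 0
    pos = seq.find(site)
    while pos != -1:
        fragments.append(seq[start:pos + cut_offset])
        start = pos + cut_offset
        pos = seq.find(site, pos + 1)
    fragments.append(seq[start:])
    return fragments


print(expected_fragment_length(4), expected_fragment_length(6), expected_fragment_length(8))
# 256 4096 65536
print(digest("AAGAATTCTTTTGAATTCGG"))
# ['AAG', 'AATTCTTTTG', 'AATTCGG']
```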
Core Techniques (Cont.) • Hybridization – Based on the property of DNA base pairing (A to T and G to C). – The principle is the recognition of a complementary sequence (the gene to be detected) by a short sequence (the probe). • The two strands of the target DNA first need to be separated into single strands by melting, followed by annealing (re-forming a double strand) after the probe is added.
– Annealing depends on several factors, including DNA concentration, time, temperature, and salt concentration. The stringency of annealing is a function of temperature and salt concentration. – Examples: • Dot or slot blot • In situ hybridization (FISH, gene or chromosome) • Northern or Southern blot – Requires prior knowledge of the DNA sequence to be “fished” out.
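To illustrate why annealing temperature depends on the probe, here is a rough sketch using the Wallace rule of thumb for short oligonucleotides (about 2 °C per A/T and 4 °C per G/C). This rule is not from the slides and ignores salt and probe concentration, so treat the result as a coarse estimate only.

```python
def wallace_tm(probe):
    """Rough melting temperature (deg C) of a short oligonucleotide probe."""
    probe = probe.upper()
    at = probe.count("A") + probe.count("T")
    gc = probe.count("G") + probe.count("C")
    return 2 * at + 4 * gc


print(wallace_tm("CCTGGTCCTCTGACTGCT"))  # 58
```

A GC-rich probe gives a higher estimate than an AT-rich probe of the same length, which is why hybridization temperature is chosen per probe.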
Core Techniques (Cont.) • Electrophoresis – A technique to separate nucleic acids and proteins by size and charge. – All electrophoretic techniques are carried out using a supporting gel of controlled pore size. – Most separations are by molecular size (large molecules stay near the origin, small ones migrate farther), while the charge drives the actual migration of the molecules. • Polyacrylamide - for small molecules (short DNA fragments, proteins) • Agarose - for large molecules (DNA/RNA) • Urea and SDS - denaturants; SDS gives proteins a uniform negative charge so that they separate by size
– Procedure: • Make the gel and buffers (loading and running buffers) • Load the sample into the well • Apply voltage (100 to 1000s of volts, depending on the size of the gel) • Visualize and detect (stain the gel, or transfer the molecules onto a membrane)
Core Techniques (Cont.) • Southern blot - for DNA (RFLP) • Northern blot - for RNA • Western blot - for protein
Core Techniques (Cont.) • Isolation of DNA and RNA – A pure source of DNA or RNA is crucial for accurate analysis. – Purity is indicated by the ratio of OD readings (OD 260 vs OD 280, which measure nucleic acids vs protein, respectively). – RNA is much less stable than DNA because RNases are widely present. – The major method for DNA isolation is phenol-chloroform extraction (phenol dissociates DNA from protein, whereas chloroform promotes protein denaturation). After separation by centrifugation, the DNA is in the upper (aqueous) phase.
– The major method for mRNA isolation is a modified phenol-chloroform method that requires inhibition of RNases with guanidinium and a purification step using either oligo(dT) chromatography or beads. – The source of DNA can be small amounts of any fresh or archived material (paraffin blocks, trace amounts of old blood, saliva, etc.), while mRNA usually requires large amounts of fresh or immediately frozen samples.
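The OD 260/280 purity check mentioned above amounts to a simple ratio. The sketch below assumes the commonly used conventions that a ratio near 1.8 indicates reasonably pure DNA (near 2.0 for RNA) and that an OD 260 of 1.0 corresponds to about 50 µg/ml of double-stranded DNA; neither figure is stated in the slides, and the readings are made up.

```python
def nucleic_acid_purity(od260, od280):
    """Return the OD260/OD280 ratio used as a purity indicator."""
    return od260 / od280


def dsdna_concentration_ug_per_ml(od260, dilution_factor=1.0):
    """Estimate dsDNA concentration, assuming OD260 of 1.0 is about 50 ug/ml."""
    return od260 * 50.0 * dilution_factor


ratio = nucleic_acid_purity(0.45, 0.25)          # made-up readings
print(round(ratio, 2))                           # 1.8
print(round(dsdna_concentration_ug_per_ml(0.45, dilution_factor=10), 1))  # 225.0
```

A ratio well below 1.8 would suggest protein (or phenol) carry-over and a repeat of the extraction.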
Core Techniques (Cont.) • PCR (Polymerase Chain Reaction) – Revolutionized the detection of nucleic acids (DNA and RNA); also useful for cloning and site-directed mutagenesis. – The principle: by cycling the temperature through denaturation (95 °C), annealing (about 50 °C), and extension (about 70 °C), a target DNA molecule is replicated exponentially. – Requires primers, DNA polymerase, deoxynucleoside triphosphates (dNTPs), and magnesium ions.
– Limitations of PCR: • Primer selectivity • Primer dimer formation • Contamination • Nonspecific priming • Temperature design for GC-rich or AT-rich genes (incomplete melting or incomplete annealing, respectively). – In epidemiological studies it is used to detect the presence/absence of genes (DNA or RNA), to measure gene levels, or to detect specific mutations, etc.
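The exponential nature of PCR can be written as N = N0 × (1 + E)^n, where n is the number of cycles and E is the per-cycle efficiency (1.0 for perfect doubling). The sketch below is a minimal illustration; the efficiency parameter and its example value are assumptions for demonstration, not figures from the slides.

```python
def pcr_copies(initial_copies, cycles, efficiency=1.0):
    """Copies after a number of cycles; efficiency=1.0 means perfect doubling."""
    return initial_copies * (1.0 + efficiency) ** cycles


print(pcr_copies(1, 30))                         # 1073741824.0
print(pcr_copies(1, 30, efficiency=0.9) < pcr_copies(1, 30))  # True
```

The same arithmetic explains why contamination is listed as a limitation: a single stray template molecule is amplified about a billion-fold in 30 ideal cycles.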
Core Techniques (Cont.) • Examples of Variant PCR – LCR (for detection of point mutations) – Competitive PCR (for quantification of DNA copy number) – RT-PCR (for mRNA detection and quantification) – SSCP (for screening of gene mutations) – In situ PCR – TRAP (for telomerase activity detection) – Real-Time PCR
Core Techniques (Cont.) • Monoclonal Antibodies – Immunoglobulins that recognize only one specific antigenic determinant (epitope). – Developed by various techniques, e.g., hybridoma, phage display, etc. – Used in molecular epidemiological studies to detect protein products (such as oncogene products, growth factors, receptors, etc.) in a highly specific and often quantitative manner by various methods such as ELISA, EIA, immunohistochemistry, immunocytochemistry, etc.
– All these methods use the same basic principle, i.e., the antigen-antibody reaction. They can be either direct (without an amplification step) or indirect (with amplification steps), followed by a detection step (enzymatic color reaction or fluorescence). • 3-step immunofluorescence to detect a tumor-specific antigen, M344 – Step 1: Incubate cells with McAb (mouse anti-human) against M344 – Step 2: Incubate with biotinylated goat (or rabbit) anti-mouse IgG (amplification) – Step 3: Incubate with streptavidin-Texas Red (amplification/detection)
QFIA Biomarker Profile • G-actin: Texas Red-conjugated DNase I • M344: FITC (or rhodamine), 3-step immunofluorescence • DNA: Hoechst or DAPI
Core Techniques (Cont.) • RFLP - Microsatellite marker - SNP – RFLP is a method to detect alterations (mutations) of one specific gene. – Microsatellite markers are simple tandem repeat polymorphisms at multiple loci, which replaced RFLP as markers for disease. – SNPs are single nucleotide variants found across the entire genome; they are therefore much more powerful and may replace microsatellite markers or RFLP as markers of disease • More prevalent in the genome than microsatellites • Some SNPs located in genes directly affect protein structure or expression levels • More stably inherited • Better for high throughput analysis
SNPs - Definition “Single base pair positions in genomic DNA at which different sequence alternatives (alleles) exist in normal individuals in some population, wherein the least frequent allele has an abundance of 1% or greater” (Brookes, Gene, 1999).
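The 1% criterion in this definition can be checked directly from genotype counts. The following is a minimal sketch with made-up counts for a biallelic site (alleles labeled A and B for illustration); the function names are hypothetical.

```python
def allele_frequencies(n_aa, n_ab, n_bb):
    """Allele frequencies of A and B from genotype counts (AA, AB, BB)."""
    total_alleles = 2 * (n_aa + n_ab + n_bb)
    freq_a = (2 * n_aa + n_ab) / total_alleles
    return freq_a, 1.0 - freq_a


def is_snp(n_aa, n_ab, n_bb, threshold=0.01):
    """Apply the 1%-or-greater least-frequent-allele criterion from the definition."""
    return min(allele_frequencies(n_aa, n_ab, n_bb)) >= threshold


print(is_snp(810, 180, 10))   # True  (least frequent allele at about 10%)
print(is_snp(999, 1, 0))      # False (least frequent allele at 0.05%)
```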
How to Define SNPs? Conventional way: • develop sequence tagged sites (STS) • identify DNA sequence variants • estimate allele frequencies of the marker • place the marker in the human genome • obtain the DNA sequence More powerful – Genome Wide Association (GWA) studies
Genome Wide Association (GWA) Study • Helps to identify genetic susceptibility markers for cancer – Prostate: Chromosome 8q24 (Gudmundsson et al., Nature Genetics, 2007; Yeager et al., Nature Genetics, 2007) – Lung: Chromosome 15q25 (nicotinic acetylcholine receptor subunits) (Hung et al., Nature, 2008; Amos et al., Nature Genetics, 2008; Thorgeirsson et al., Nature, 2008) • Genes identified at these loci may also be targets for chemopreventive drug development
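At its core, a GWA study repeats a simple case-control comparison for each SNP. The sketch below shows one such test on a 2x2 table of allele counts, reporting the odds ratio and the Pearson chi-square statistic (1 degree of freedom). The counts are made up, and real analyses add quality control, covariate adjustment, and correction for the very large number of tests, none of which is modeled here.

```python
def allelic_association(case_risk, case_other, control_risk, control_other):
    """Odds ratio and Pearson chi-square (1 df) for a 2x2 allele-count table."""
    a, b, c, d = case_risk, case_other, control_risk, control_other
    n = a + b + c + d
    odds_ratio = (a * d) / (b * c)
    chi_square = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    return odds_ratio, chi_square


# Made-up allele counts: 1000 case alleles, 1000 control alleles
odds_ratio, chi_square = allelic_association(300, 700, 200, 800)
print(round(odds_ratio, 2), round(chi_square, 2))   # 1.71 26.67
```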
High Throughput Techniques • Microarray technology – DNA chips • cDNA array format • in situ synthesized oligonucleotide format (Affymetrix) – Proteomics – Tissue arrays • These are powerful, high throughput methods to study gene expression, but they are not the answers themselves • Individual targets/patterns identified need to be validated • In epidemiological studies, these methods can be used to identify specific exposure-induced molecular changes, for individual risk assessment, etc.
An example of our 9000-gene mouse arrays using differential expression analysis with Cy3 and Cy5 fluorescent dyes.
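Differential expression on a two-color array is usually summarized as a log ratio of the two dye intensities per spot. The sketch below is illustrative only: the intensities are made up, the assignment of Cy5 to the test sample and Cy3 to the reference is an assumption, and the 2-fold cutoff is an arbitrary example threshold. Real pipelines also normalize for dye bias and background.

```python
import math


def log2_ratio(cy5_intensity, cy3_intensity, pseudocount=1.0):
    """log2(Cy5/Cy3) for one spot; the pseudocount avoids division by zero."""
    return math.log2((cy5_intensity + pseudocount) / (cy3_intensity + pseudocount))


def call_genes(spots, fold_threshold=2.0):
    """Label each gene 'up', 'down', or 'unchanged' at a given fold change."""
    cutoff = math.log2(fold_threshold)
    calls = {}
    for gene, (cy5, cy3) in spots.items():
        ratio = log2_ratio(cy5, cy3)
        if ratio >= cutoff:
            calls[gene] = "up"
        elif ratio <= -cutoff:
            calls[gene] = "down"
        else:
            calls[gene] = "unchanged"
    return calls


spots = {"geneA": (4095, 1023), "geneB": (500, 510), "geneC": (99, 799)}  # made-up intensities
print(call_genes(spots))
# {'geneA': 'up', 'geneB': 'unchanged', 'geneC': 'down'}
```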
Proteomics • Examines protein-level expression in a high throughput manner • Used to identify protein markers/patterns associated with disease/function • Different formats: – SELDI-TOF (surface-enhanced laser desorption/ionization time-of-flight): protein-chip arrays, a mass analyzer, and data-analysis software – 2D PAGE coupled with MALDI-TOF (matrix-assisted laser desorption/ionization time-of-flight) – Antibody-based formats
[Fig 1: 2D gel images. Panels: A, GTE (20 µg/ml); B, GTE (40 µg/ml); shown at 24 hr and 48 hr, with (+) and without (-) GTE. Axes: pI (3.5 to 9.5) and MW (kDa). Numbered spots mark individual proteins.]
Tissue Array • Provides a new high-throughput tool for the study of gene dosage and protein expression patterns in a large number of individual tissues, for rapid and comprehensive molecular profiling of cancer and other diseases without exhausting limited tissue resources. • A typical example of a tissue array application is searching for oncogene amplifications in vast tumor tissue panels. Large-scale studies involving tumors of differing stages and grades are necessary to validate putative markers more efficiently and ultimately to correlate genotypes with phenotypes. • Also applicable to any medical research discipline in which paraffin-embedded tissues are used, including structural, developmental, and metabolic studies.
[Figure: Bladder tissue array, gelsolin and H&E staining]
DNA Methylation • DNA methylation plays an important role in normal cellular processes, including X chromosome inactivation, imprinting control, and transcriptional regulation of genes • It is predominantly found on cytosine residues in CpG dinucleotides, including those in CpG islands, producing 5-methylcytosine • CpG islands are frequently located in or around transcription start sites
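A CpG island is usually recognized computationally from sequence composition. The sketch below computes GC content and the observed/expected CpG ratio, then applies commonly cited thresholds (length of at least 200 bp, GC content of at least 50%, observed/expected of at least 0.6). These thresholds are an assumption brought in for illustration and are not given in the slides; the test sequences are artificial.

```python
def cpg_stats(seq):
    """Return (GC fraction, observed/expected CpG ratio) for a DNA sequence."""
    seq = seq.upper()
    n = len(seq)
    c, g = seq.count("C"), seq.count("G")
    cpg = seq.count("CG")
    gc_fraction = (c + g) / n
    obs_exp = (cpg * n) / (c * g) if c and g else 0.0
    return gc_fraction, obs_exp


def looks_like_cpg_island(seq, min_length=200, min_gc=0.5, min_obs_exp=0.6):
    """Apply commonly cited length, GC-content, and CpG obs/exp thresholds."""
    gc_fraction, obs_exp = cpg_stats(seq)
    return len(seq) >= min_length and gc_fraction >= min_gc and obs_exp >= min_obs_exp


print(looks_like_cpg_island("CG" * 100))      # True
print(looks_like_cpg_island("ATTA" * 50))     # False
```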
DNA Methylation (Cont'd) • Aberrant DNA methylation is one of the most common features of human neoplasia • Two major potential mechanisms by which aberrant DNA methylation contributes to carcinogenesis: – Point mutation: C to T transition (e.g., the p53 gene) – Silencing of tumor suppressor genes (e.g., the p16 gene) Source: Royal Society of Chemistry
Promoter-Region Methylation • Promoter-region CpG island methylation: – Is rare in normal cells – Occurs in virtually every type of human neoplasm – Is associated with inappropriate transcriptional silencing – Is an early event in tumor progression • In tumor suppressor genes: – Most tumor suppressor genes are under-methylated in normal cells but methylated in tumor cells. – Methylation is often correlated with a decreased level of gene expression and can be found in premalignant lesions.
DNA methyltransferases • DNMTs catalyze the transfer of a methyl group (CH3) from S-adenosylmethionine (SAM) to the carbon-5 position of cytosine, producing 5-methylcytosine • Several DNA methyltransferases have been discovered, including DNMT1, DNMT3a, and DNMT3b
Pathology - Objective • To learn basic histopathological terminology. • To know different types of tumor.
What is the difference between “tumor” and “cancer”? • Tumor – either benign or malignant • Cancer – malignant
Classification of Tumors - Based on histological origin (epithelial, mesenchymal, etc.) - Based on biological behavior (benign vs malignant)
PATHOLOGICAL REPORT • Tumor histological type. • Tumor stage. • Tumor grade. • Other features (size, % necrosis, lymphovascular invasion…)
CANCER HISTOLOGICAL TYPE • Three Major Categories: – Epithelial – “Carcinoma” – Mesenchymal – “Sarcoma” – Hematopoietic – “Leukemia/Lymphoma” • Other Minor Categories: – Nevocytic – “Melanoma” – Germ cell – Teratoma, Seminoma, Yolk sac tumor, Choriocarcinoma, etc. – Endocrine/Neuro – Carcinoid, Insulinoma, Small cell carcinoma, etc.
CARCINOMA • Squamous – Squamous Cell Carcinoma. • Glandular – Adenocarcinoma. • Transitional – Transitional Cell Carcinoma. • Small cell – Small Cell Carcinoma.
SARCOMA • Muscle – Smooth muscle: Leiomyosarcoma – Skeletal muscle: Rhabdomyosarcoma • Fat – Liposarcoma • Skeleton – Osteosarcoma • Cartilage – Chondrosarcoma
Classification of tumors according to their morphologic features (histology) • Morphologic classification refers to the histologic classification made by a pathologist based on microscopic examination.
Benign vs Malignant Tumor • The main distinctions between benign and malignant tumors are: – A malignant tumor has invasive and metastatic potential, whereas a benign tumor does not. – A malignant tumor has features of abnormal cellular differentiation, whereas a benign tumor usually does not.
Why is histologic classification important in cancer epidemiology? • Cancer is not ONE disease • Different cancer types of the same organ may have different exposure etiology, pathogenesis, and behavior, i.e., HETEROGENEITY
Carcinoma • Carcinoma (cancer of the epithelium): 85% of cancers • Epithelium is the term applied to the cells that cover the external surface of the body or that line the internal cavities, plus those cells derived from the linings that form glands.
Why are most common cancers of epithelial origin? • These cells are the first point of contact of the body with environmental substances, either directly (squamous cells) or indirectly (glandular cells). • Epithelial cells usually have a fast turnover rate, i.e., fast cell division, and their DNA can be damaged by carcinogens more often than that of non-dividing cells.
Carcinoma: Squamous cell • Originates from stratified squamous epithelium of the skin, mouth, esophagus, and vagina, as well as from areas of squamous metaplasia, as in the bronchi or squamocolumnar junction of the uterine cervix. SCC is marked by the production of keratin.
Skin Cancer
Squamous Cell Carcinoma
Carcinoma, Transitional Cell • Transitional cell carcinoma - arises from the transitional cell epithelium of the urinary tract, such as the bladder.
A transitional cell carcinoma of the urothelium is shown here at low power to reveal the frond-like papillary projections of the tumor above the surface to the left. It is differentiated enough to resemble urothelium, but is a mass. No invasion to the right is seen at this point.
TCC at high power
Carcinoma: Adenocarcinoma • Adenocarcinoma - is carcinoma of glandular epithelium and includes malignant tumors of the gastrointestinal mucosa, endometrium, and pancreas; and is often associated with desmoplasia, tumor-induced proliferation of non-neoplastic fibrous connective tissue, particularly in adenocarcinoma of the breast, pancreas, and prostate.
Prostate Ca Ovarian Ca
Sarcoma • Sarcoma is a malignant tumor of mesenchymal origin • Sarcoma is often used with a prefix that denotes the tissue of origin of the tumor, as in osteosarcoma (bone), leiomyosarcoma (smooth muscle), rhabdomyosarcoma (skeletal muscle), and liposarcoma (fatty tissue).
Classification of tumor according to stage
Stage • Is a clinical assessment of the degree of localization or spread of the tumor. • Generally correlates better with prognosis than does histopathologic grading. • Is exemplified by the generalized TNM system, which evaluates size and extent of tumor (T), lymph node involvement (N), and metastasis (M). • Different staging systems exist (WHO, TNM, etc.), sometimes oriented toward specific tumors, e.g., the Dukes system for colorectal carcinomas.
Classification of Tumor according to its differentiation (grade)
Grade of Disease • Grading is the histopathologic evaluation of the lesion based on the degree of cellular differentiation and nuclear features: – Well differentiated (Grade I) - closely resembles normal tissue/cells – Moderately differentiated (Grade II) - less resemblance to normal tissue/cells – Undifferentiated (Grade III) - has lost resemblance to normal tissue/cells
Gleason's breakthrough was to develop a reproducible description of the glandular architecture, to which one assigns a score from 1 to 5. The pathologist looks for a major pattern and a minor pattern to give a Gleason sum between 2 and 10. On the left is a picture adapted from Gleason's 1977 article demonstrating the changes in gland pattern as one goes from grade 1 to grade 5 cancer. The glands in grade 1 cancer are small and round. Grade 5 cancer is hardly forming glands at all.
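The scoring arithmetic described above is simple enough to sketch: each of the major and minor patterns receives a grade from 1 to 5, and the two are added to give a sum from 2 to 10. The function name is hypothetical and the example grades are arbitrary.

```python
def gleason_sum(major_pattern, minor_pattern):
    """Add the major and minor pattern grades (each 1 to 5) into a sum of 2 to 10."""
    for grade in (major_pattern, minor_pattern):
        if grade not in range(1, 6):
            raise ValueError("Gleason pattern grades run from 1 to 5")
    return major_pattern + minor_pattern


print(gleason_sum(3, 4))   # 7
```

Keeping the two pattern grades alongside the sum is a reasonable design choice in a study database, since the same sum can arise from different major/minor combinations.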
Gleason Grade 1 Prostate Cancer
At right is Gleason 3 CaP. The glands are irregularly shaped. They are mixed in with some normal glands. This tumor is infiltrating the prostate. At higher magnification, there are nests of glands with no intervening stroma. This is characteristic of higher grade CaP.
Here is Gleason 5, or poorly differentiated cancer. You can see that it is invading the seminal vesicle (stage T4). The cells are not organized into glands, but are infiltrating the prostate as cords.
Precursors from intraepithelial neoplasia (IN) to carcinoma in situ (CIS)
NORMAL • CIN 1 (LGSIL) • CIN 2 (HGSIL) • CIN 3 (HGSIL)
Important for Epidemiologists • Study the natural history of disease progression • Study genetic/environmental factors associated with disease progression • Develop tools for risk assessment/early detection • Identify targets for chemoprevention
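[Diagram: timeline from Birth, through Exposure to Carcinogen and Precancerous Intraepithelial Lesions (PIN, CIN, PaIN, ...), through an Additional Molecular Event, to Cancer. Marker classes shown along the timeline: Genetic Susceptibility Markers, Markers for Exposure, Markers of Effect, Surrogate End Point Markers, Tumor Markers. Intervention shown: CHEMOPREVENTION.]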
SUMMARY • The key is to understand tumor heterogeneity: – Human cancer is not one disease, but many different types of disease (disease heterogeneity). – The same type/stage/grade of tumor may behave differently in different people (behavior heterogeneity). – Even within the same tumor, there may be different histological appearances and molecular expressions/changes (expression heterogeneity). • As epidemiologists, we should know the basic features of the disease and design studies accordingly
Thank You!