# Organizing Geographic Data Module 4 ESRI Virtual Campus

• Slides: 79

Organizing Geographic Data Module 4 ESRI Virtual Campus Learning ArcGIS Desktop Training Course ESRI ArcGIS Statewide Curriculum

Introduction • First step in creating geographic data – Identifying features, events, and phenomena and associating them with a location • Geographic data – Recorded information about the earth's surface and the objects found on it, associated with a geographic location • In this module, you will learn more about geographic data: how it is organized and stored in a GIS, and how it can be assembled into a useful GIS database.

Learning Objectives • Describe two common data models used to represent geographic data • List different geographic data formats • Determine the data source of a layer in ArcMap • Identify data formats in ArcCatalog • Create a geodatabase • Add data from different formats to a geodatabase

Exploring Geographic Data • Before working with data in a GIS – Data must be in a digital format • How to translate real-world features into digital features? – Data model • Data model – Defines how to abstract real-world features into a format that can be understood by a computer. • Two main data models used to represent features

Geographic Data Models • Two common data models used to represent geographic data – Vector data model – Raster data model

Vector Data Model • Based on the assumption that the earth's surface is composed of discrete objects – Trees, rivers, lakes, etc. • Objects are represented as – Point, line, and polygon features – With well-defined boundaries • Feature boundaries are defined by – x, y coordinate pairs • Which reference a location in the real world

X, Y Coordinates • Pair of values that represents – Distance from an origin (0, 0) – Along two axes • Horizontal axis (x) representing east-west • Vertical axis (y) representing north-south • On a map – x, y coordinates represent features at the locations where they are found on the earth's spherical surface

Vector Data Model • Points – Defined by single x, y coordinate pair • Lines – Defined by two or more x, y coordinate pairs • Polygons – Defined by lines that close to form the polygon boundaries

Vector Data Model • Every feature is assigned a unique numerical identifier – Stored with feature record in attribute table
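The vector rules on the preceding slides (a point is one x, y pair, a line is two or more pairs, a polygon is a closed boundary, and every feature gets a unique numeric identifier) can be illustrated with a small sketch. This is plain Python rather than any ArcGIS API; the names `Feature`, `make_feature`, and the ID counter are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass, field
from itertools import count
from typing import Dict, List, Tuple

Coord = Tuple[float, float]  # one (x, y) coordinate pair

_ids = count(1)  # hypothetical generator of unique numeric identifiers


@dataclass
class Feature:
    feature_id: int
    geometry_type: str  # "point", "line", or "polygon"
    coords: List[Coord]
    attributes: Dict[str, object] = field(default_factory=dict)


def make_feature(geometry_type, coords, **attributes):
    """Build a feature, enforcing the coordinate rules for its geometry type."""
    coords = [tuple(c) for c in coords]
    if geometry_type == "point":
        if len(coords) != 1:
            raise ValueError("a point needs exactly one x, y pair")
    elif geometry_type == "line":
        if len(coords) < 2:
            raise ValueError("a line needs two or more x, y pairs")
    elif geometry_type == "polygon":
        # Close the boundary if the caller left it open.
        if coords and coords[0] != coords[-1]:
            coords.append(coords[0])
        if len(coords) < 4:
            raise ValueError("a polygon needs at least three distinct x, y pairs")
    else:
        raise ValueError(f"unknown geometry type: {geometry_type}")
    return Feature(next(_ids), geometry_type, coords, attributes)
```

For example, `make_feature("polygon", [(0, 0), (4, 0), (4, 3)], name="lake")` returns a feature whose boundary has been closed back to `(0, 0)` and whose `feature_id` differs from that of every other feature created in the same session.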

Raster Data Model • Earth is represented as a grid of equally sized cells. – An individual cell represents a portion of the earth • such as a square meter or a square mile • Only one x, y coordinate pair is normally present – Called the origin – Used to define the location of every cell • Each cell's location is defined in relation to the origin

Raster Data Model • Each raster cell is assigned a numeric value – The value can represent any kind of information about that geographic location • Elevation measurement in meters • Code number that specifies type of vegetation

Raster Data Model • Represents geographic data – Elevation • Rows and columns of equally sized cells • One corner must be defined by – x, y coordinate pair
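Because only the origin is stored, every cell's location has to be derived from the origin, the cell size, and the cell's row and column. The sketch below shows that arithmetic. It assumes the origin is the lower-left corner and that row 0 is the bottom row; real raster formats differ on both conventions, and the class name `RasterGrid` is hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class RasterGrid:
    origin_x: float   # x of the lower-left corner (assumed convention)
    origin_y: float   # y of the lower-left corner (assumed convention)
    cell_size: float  # width and height of each square cell
    n_rows: int
    n_cols: int

    def _check(self, row: int, col: int) -> None:
        if not (0 <= row < self.n_rows and 0 <= col < self.n_cols):
            raise IndexError("cell is outside the grid")

    def cell_center(self, row: int, col: int) -> Tuple[float, float]:
        """Return the x, y coordinates of a cell's center."""
        self._check(row, col)
        x = self.origin_x + (col + 0.5) * self.cell_size
        y = self.origin_y + (row + 0.5) * self.cell_size
        return (x, y)

    def cell_at(self, x: float, y: float) -> Tuple[int, int]:
        """Return the (row, col) of the cell that contains the point x, y."""
        col = int((x - self.origin_x) // self.cell_size)
        row = int((y - self.origin_y) // self.cell_size)
        self._check(row, col)
        return (row, col)
```

With a grid whose origin is (100, 200) and whose cells are 10 units wide, the cell in row 0, column 0 is centered at (105, 205), and no coordinates other than the origin need to be stored.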

Which Data Model? • Vector data model – To represent features that have discrete boundaries – A building • Polygon feature • x, y coordinates recorded for its corners – More accurate • Raster data model – To represent discrete features as well. – A building • Group of connected cells with same value – Code value for building. – Less storage space

Which Data Model? • Vector data model – Represents geographic features with exactly defined boundaries • Raster data model – Represents them as cells of the same value • Shapes of the raster building and road – Don't match the real-world shapes as closely

Which Data Model? • Raster data model – Very useful for representing continuous geographic data • Don't have well-defined boundaries • Usually change gradually across a given area – e.g., elevation, precipitation, and temperature – When used to represent continuous data – Each cell value is a measure of the phenomenon being modeled • An elevation raster – Each cell value represents the elevation of a particular area. – Commonly used for spatial analysis and modeling

Organizing Vector Data • Feature – Basic unit of vector data • Feature class – Basic storage unit for features – A collection of features that • Share same geometry type and same attributes • Are located within a common geographic extent – Examples • All customer locations for a group of business franchises = point feature class named "Customers" • All roads in a city = line feature class named "Roads" • Areas in a city = polygon feature class called "Zoning"

Organizing Vector Data • All features in a feature class – Have the same geometry type – Have the same attributes – Are located within a common geographic extent

Organizing Vector Data • Three common data formats that use feature classes – Geodatabase – Coverage – Shapefile

Geodatabase • Data storage format introduced with ArcGIS® software • Relational database – Composed of various tables that organize data and are linked to one another • Think of a geodatabase as – A container for storing geographic data. • Geographic data stored in a geodatabase may be – Collection of vector feature classes • Point, line, polygon, or annotation – Raster datasets – Tables

Geodatabase • Basic components are – Tables – Feature classes – Raster datasets • Has many powerful capabilities for modeling real-world objects

How a Geodatabase Organizes Data • Feature classes can be – Stand-alone – Organized into components called feature datasets • Feature dataset – Stores feature classes that have the same coordinate system – Not all feature classes have to have the same geometry type – Can store feature classes with different geometry types • e.g., a feature dataset representing sewers may store – A line feature class representing mains – A point feature class representing valves

How a Geodatabase Organizes Data • Feature classes grouped into a feature dataset – Normally have spatial relationships to one another • e.g., might be adjacent, intersect, or coincide with each other • Feature class tables – Store feature geometry and attribute information

How a Geodatabase Organizes Data • Some attributes are automatically created and maintained by the geodatabase – Line feature classes • Calculates length of each feature and stores data in a field called Shape_Length – Polygon feature classes • Calculates perimeter and area of each feature and stores in fields called Shape_Length and Shape_Area
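To make the Shape_Length and Shape_Area fields concrete, here is a minimal sketch of how such values can be computed from a feature's x, y coordinate pairs. It assumes planar (projected) coordinates and uses the standard shoelace formula for area; it is not the geodatabase's own implementation, and the function names are hypothetical.

```python
import math


def shape_length(coords):
    """Sum of straight-line segment lengths along a list of (x, y) pairs.

    For a closed polygon boundary (first pair == last pair) this is the perimeter.
    """
    return sum(math.dist(a, b) for a, b in zip(coords, coords[1:]))


def shape_area(ring):
    """Area enclosed by a closed boundary, using the shoelace formula."""
    twice_area = sum(
        x1 * y2 - x2 * y1
        for (x1, y1), (x2, y2) in zip(ring, ring[1:])
    )
    return abs(twice_area) / 2
```

For a 4-by-3 rectangle given as `[(0, 0), (4, 0), (4, 3), (0, 3), (0, 0)]`, `shape_length` returns 14.0 and `shape_area` returns 12.0, in the units of the coordinate system.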

How a Geodatabase Organizes Data • Nonspatial tables – Geodatabase tables that contain only feature attributes, with no geometry • Feature attributes stored outside the feature class table in a separate table – Used for database efficiency • To speed up data queries and feature draw time

How a Geodatabase Organizes Data • A geodatabase can contain – Stand-alone feature classes – Feature classes grouped into feature datasets – Raster datasets – Nonspatial tables

Coverages • File-based data format native to ESRI's ArcInfo® Workstation software • Conceptually, coverages can be thought of as a combination of other vector data formats you have learned about. • Like a feature class, coverages have a geometry type of point, line, or polygon. • And, also like a feature class, a coverage represents a single thematic layer, such as schools, streets, or land use, – in which all features have the same attributes and are located within a common geographic area. • On the other hand, coverages are like a geodatabase feature dataset because they store a set of spatially related feature classes. – Point, line, and polygon coverages each contain a different set of feature classes that, together, define their features.

Coverages • Geometry type of the coverage – Determines which feature classes it will store

Main Coverage Feature Classes • Point feature class – Stores the point features of a point coverage • Arc feature class – Stores the line features of a line coverage or the polygon boundaries of a polygon coverage • Polygon feature class – Stores polygon features of a polygon coverage • Tic feature class – Stores geographic control points that represent known real-world coordinates – Used to reference coverage features to the real world – All coverages have a tic feature class • Label feature class – Stores points in center of each polygon of a polygon coverage – Used to place feature labels

More About Coverages • Coverages can contain many more types of feature classes – Annotation – Routes – Regions – For more information refer to the ArcGIS Desktop Help • (Contents tab -> Data support in ArcGIS -> Coverages)

Coverages • Attributes and spatial relationships associated with a coverage feature class – Stored in INFO-format tables • INFO tables – Stored in a folder called info • Which is stored with the other coverage files in a workspace folder. – Even if there is more than one coverage in a workspace folder, there is always only one info folder that contains the INFO tables for all the coverages in that workspace.

Coverages • How coverages are displayed in Windows Explorer. • Coverages workspace folder contains 3 coverages: – Landuse – Schools – Streets • The info folder contains the INFO tables associated with each of those coverages

Coverages • Always use ArcCatalog to manage coverages • The info folder associated with a coverage is not shown in ArcCatalog™ – It is visible in your operating system's file manager (e.g., Windows Explorer) • Never move, copy, rename, or delete a coverage using your operating system's file manager – Connection between coverage feature classes and info folder could become broken or corrupted

Shapefiles • File-based data format – Native to ArcView® 3.x software • A shapefile is composed of at least 3 files, and as many as 8 – Each file has the shapefile name and an extension • .shp, .shx, .dbf – Information stored allows features and attribute table to be displayed

Shapefile Files • ShapefileName.dbf – dBASE-format table that stores feature attributes • ShapefileName.shp – Stores feature geometry • ShapefileName.shx – Stores the index of the feature geometry

Additional Shapefile Extensions • ShapefileName.ain and ShapefileName.aih – Attribute index files • ShapefileName.sbn and ShapefileName.sbx – Spatial index files • ShapefileName.prj – Coordinate system file
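Since a shapefile is really a set of files that share a base name, it can be useful to check which of them are present before copying or sharing the data outside ArcCatalog. The following sketch uses only the Python standard library and the extensions listed on the slides above; the function name `inspect_shapefile` and the returned dictionary layout are hypothetical.

```python
from pathlib import Path

# The three files every shapefile needs, per the slides above.
REQUIRED = {
    ".shp": "feature geometry",
    ".shx": "index of the feature geometry",
    ".dbf": "dBASE-format attribute table",
}
# Additional files that may accompany a shapefile.
OPTIONAL = {
    ".sbn": "spatial index",
    ".sbx": "spatial index",
    ".ain": "attribute index",
    ".aih": "attribute index",
    ".prj": "coordinate system",
}


def inspect_shapefile(shp_path):
    """Report which component files exist next to the given .shp path."""
    shp_path = Path(shp_path)
    present, missing_required = [], []
    for ext in list(REQUIRED) + list(OPTIONAL):
        if shp_path.with_suffix(ext).exists():
            present.append(ext)
        elif ext in REQUIRED:
            missing_required.append(ext)
    return {"present": present, "missing_required": missing_required}
```

A non-empty `missing_required` list means the attribute table or geometry cannot be displayed, which is the typical result of moving only the `.shp` file with a file manager.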

Shapefiles • In ArcCatalog – Only the .shp file is shown • All of a shapefile's component files can be viewed in Windows Explorer • Always use ArcCatalog to manage shapefiles • ArcCatalog handles all of a shapefile's component files when renaming, moving, copying, or deleting

Shapefiles • Files associated with a shapefile are visible in Windows Explorer – Named CensusBlocks

Shapefiles • Common data format – ArcPad® software – Global positioning system (GPS) applications

Raster Data Formats • Two common data formats based on the raster data model are – Grids – Images

Grids • Used to store both – Discrete features • Buildings, roads, and parcels – Continuous phenomena • Elevation, temperature, and precipitation • Cells – Basic unit of raster data model – Store information about what things are like at a particular location on the earth's surface. – Depending on the type of data being stored, values can be either • Integers (whole numbers) • Floating points (numbers with decimals) • Two types of grids – One stores integers – One stores floating points

Discrete Grids • Contain cells whose values are integers – Often code numbers for a particular category • Cells can have the same value – e.g., land use • Each land use type is coded by a different integer • But many cells may have the same code. • Have an attribute table that stores cell values and their associated attributes

Continuous Grids • Continuous grid – Used to represent continuous phenomena – Cell values are floating points – Each cell can have a different floating point value • e.g., elevation – One cell might store an elevation value of 564.3 meters – While the cell to the left might store an elevation value of 565.1 meters – Continuous grids don't have an attribute table

Grids • Discrete grids represent discrete features such as land use categories with integer values • Continuous grids represent continuous phenomena such as elevation with floating point values
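The difference between the two grid types, and the reason only discrete grids carry an attribute table, can be sketched in a few lines. The example below stores a grid as a list of rows and builds one table record per distinct integer code. The field names `VALUE`, `COUNT`, and `LABEL` and both function names are assumptions chosen for illustration, not the INFO table layout.

```python
from collections import Counter


def is_discrete(grid):
    """True when every cell holds an integer code."""
    return all(isinstance(v, int) for row in grid for v in row)


def build_attribute_table(grid, labels):
    """Return one record per distinct cell value of a discrete grid.

    `labels` maps an integer code to its category name (e.g. a land use type).
    """
    if not is_discrete(grid):
        raise ValueError("continuous (floating point) grids have no attribute table")
    counts = Counter(v for row in grid for v in row)
    return [
        {"VALUE": value, "COUNT": counts[value], "LABEL": labels.get(value, "")}
        for value in sorted(counts)
    ]
```

A land use grid with three codes produces three records no matter how many cells it has, whereas an elevation grid could need a record for nearly every cell, which is why a per-value table only makes sense for discrete data.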

Grids • Attribute tables of discrete grids – INFO format • Same format as coverage feature class attribute tables – Stored within an info folder • Which is stored at the same level as the grid in a workspace folder – One info folder for all the grids in a workspace folder – To avoid breaking or corrupting the connection between grid files and the info folder, always use ArcCatalog to move, copy, rename, and delete grids

Grids • Grids workspace folder contains 2 grids – Soils – Vegetation • Attribute tables for both grids – Stored in info folder • Auxiliary files link grids and attribute tables – soils.aux – vegetation.aux

Images • Collective term for rasters whose pixels – Store brightness values of reflected visible light or other types of electromagnetic radiation • Emitted heat (infrared) • Ultraviolet (UV) • Commonly used in GIS – Aerial photos – Satellite images – Scanned paper maps

Images • Can be displayed as – Layers in a map – Attributes for vector features • e.g., a real estate company might include photos of available houses as an attribute of a homes layer • To be displayed as a layer – Must be referenced to real-world locations • e.g., an aerial photo as it comes from the camera – Just a static picture – No geographic information – Photo may contain distortion and scale variations • To display properly with other map layers – Photo must be assigned a coordinate system – Some of its pixels must be linked to known geographic coordinates

Images • Raster images – e.g., aerial photographs and scanned maps – Can be referenced to real-world locations – Then displayed as a layer in a GIS map

Image File Formats • Differ in the type of compression used to reduce the file size • Some supported by ArcGIS software – .tif (Tagged Image File Format) – .sid (LizardTech MrSID) – .img (ERDAS Imagine) – .jpg (Joint Photographic Experts Group)

Exercise • Explore Geographic Data

Organizing Data into a Geodatabase • Geographic data can be stored in a variety of formats. • To assemble geographic data into a GIS database – Create a collection of folders containing data stored in different formats • e.g., shapefiles and coverages – Or create a geodatabase

Advantages of Geodatabase • All data is stored in one central location – Helps maintain an overview of data holdings and more easily locate data – Promotes faster and more accurate data entry and editing • Can set up rules for a geodatabase feature class that say only certain values are valid for a particular attribute, and create relationships among feature classes so that when a feature in one feature class is updated, related features in other feature classes update
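The idea of a rule that allows only certain values for an attribute can be sketched as a simple lookup. This is an illustration in plain Python, not the geodatabase's own rule mechanism; the function name `invalid_values` and the sample `ZONE` field and its values are hypothetical.

```python
def invalid_values(row, domains):
    """Return {field: value} for every attribute whose value breaks its rule.

    `domains` maps a field name to the set of values allowed in that field.
    Fields without a rule, or missing from the row, are not checked.
    """
    return {
        name: row[name]
        for name, allowed in domains.items()
        if name in row and row[name] not in allowed
    }
```

An edit would be accepted only when `invalid_values` returns an empty dictionary, which is how such rules catch data entry errors at the moment they are made.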

Advantages of Geodatabase • Automatically calculates and maintains geometric values for – Line and polygon feature classes • Length, perimeter, and area – Extremely valuable if doing analyses that rely on these measures

Types of Geodatabases • File geodatabases • Personal geodatabases for Microsoft® Access™ • ArcSDE geodatabases

Types of Geodatabases • Type depends on – What the data will be used for – Structure and workflows of the organization • Small workgroup, and data edited by a single user – File geodatabase most suitable • File geodatabase – Can handle very large datasets with very fast performance – Storage capacity virtually unlimited – Requires less disk space than other file formats – Recommended data format for ArcGIS • Personal geodatabase for Access – Also designed for small workgroups with a single editor – Uses Microsoft Access data format with the .mdb file extension

Types of Geodatabases • Large organization – e.g., a city government – Geodatabase may store inventory of all data available for administration of the city. – Different departments will use data and multiple people will need to access and edit the data at the same time. In this case • ArcSDE geodatabase is probably the best choice. • ArcSDE geodatabases – Can support increasing numbers of concurrent users and editors – Require a relational database management system such as DB2, Oracle, or SQL Server and ESRI's ArcSDE® technology

Types of Geodatabases • File geodatabases, personal geodatabases for Access, and ArcSDE geodatabases store the same basic elements – Feature classes (stand-alone or in feature datasets) – Raster datasets – Nonspatial tables

Designing a Geodatabase • Before looking for data – Must know what data to look for – Requires some thinking, planning and design • Before building a geodatabase – Identify all data it will store and decide on best way to structure that data – Asking the right questions in the beginning may help avoid some pitfalls that can cost time and money – Strive to organize geographic data to best serve the needs of the organization

Questions to Ask Before Building a Geodatabase • What will the geodatabase be used for? – Listing possible application scenarios will help identify the thematic data layers you need to store – e.g., suppose you will be evaluating fire hydrant coverage in a particular part of town. You might decide that you need to store fire hydrants and streets as well as buildings and parcels in the geodatabase • What data layers do you need? – After identifying possible applications, there will be a good understanding of what datasets will be needed – List them out • What attributes do you need? – Information from a GIS is only as good as the information put in – e.g., for the fire hydrant coverage project, you might need attributes such as hydrant IDs, street names, building addresses, parcel IDs, parcel owners, and parcel owner addresses

Questions to Ask Before Building a Geodatabase • At what level of detail should features be represented? – Decide on the geometry type of each data layer – Will fire hydrants be represented as points or polygons? – Will streets be represented as lines or polygons? – When making these decisions, consider how accurately (with how much detail) you need to represent features in order to perform the tasks for all the possible applications • How should attributes be stored? – Attributes can be stored along with associated features in a feature class table – Nonspatial tables can be created to store attributes – e.g., building addresses should probably be stored with the buildings feature class since they are not likely to change, but the parcel owner addresses could either be stored in the parcels feature class or in a separate nonspatial table – Storing attributes outside the feature class may make sense when they change often • Which feature classes are spatially related? – If certain feature classes are spatially related, group them into a feature dataset – e.g., store buildings and parcels in a feature dataset because the two feature classes are spatially related • Buildings are always contained within a parcel

Designing a Geodatabase • The result of the design work is a plan, or model, of the geodatabase • The design model is best illustrated in a diagram that shows – Feature classes – Feature datasets – Nonspatial tables – A list of feature attributes

Designing a Geodatabase • Diagram shows structure of planned geodatabase • Will contain – One feature dataset – Two stand-alone feature classes – A nonspatial table
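A design like the one in the diagram can also be written down as a small data structure, which makes one rule from earlier in the module easy to check: feature classes placed in a feature dataset must share its coordinate system. The sketch below is plain Python, not a geodatabase API. The class names are hypothetical, and the layer names and coordinate system labels used with it are borrowed from the fire hydrant example or made up for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FeatureClass:
    name: str
    geometry_type: str       # "point", "line", or "polygon"
    coordinate_system: str   # a label only; no projection math is done here
    attributes: List[str] = field(default_factory=list)


@dataclass
class FeatureDataset:
    name: str
    coordinate_system: str
    feature_classes: List[FeatureClass] = field(default_factory=list)

    def add(self, feature_class: FeatureClass) -> None:
        """Add a feature class, rejecting one with a different coordinate system."""
        if feature_class.coordinate_system != self.coordinate_system:
            raise ValueError(
                f"{feature_class.name} uses {feature_class.coordinate_system}, "
                f"but {self.name} requires {self.coordinate_system}"
            )
        # Geometry types may differ within one feature dataset.
        self.feature_classes.append(feature_class)
```

For instance, a `FeatureDataset("Property", "ExampleCS")` would accept `Buildings` and `Parcels` polygon feature classes that both use `"ExampleCS"`, while stand-alone feature classes such as hydrants and streets would simply be kept outside any feature dataset.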

Data Resources • A common problem in doing GIS is finding the needed data • Often, when starting a GIS project, you will have some, but not all, the data required for the project. • List of Data Marts – ESRI http://www.esri.com/data/index.html – Geography Network http://www.geographynetwork.com/ – Geospatial One-Stop http://www.geodata.gov/ – National Atlas of the United States http://nationalatlas.gov/atlasftp.html – National Park Service http://science.nature.nps.gov/nrdata/index.cfm – NOAA National Geophysical Data Center http://www.ngdc.noaa.gov/ – U.S. Geological Survey (USGS) http://www.usgs.gov/pubprod/digitaldata.html • EROS Data Center http://edc.usgs.gov/geodata/ • National Map Seamless Data Distribution Sys

Data Resources • State data clearinghouses – Alaska http://agdc.usgs.gov/ – California http://frap.cdf.ca.gov/data/frapgisdata/select.asp – Hawaii http://www.state.hi.us/dbedt/gis/index.html • Commercial data vendors – GIS Data Depot http://data.geocomm.com/ – GIS Lounge http://gislounge.com/ll/data.shtml – Map Mart http://www.mapmart.com/

Getting Data into a Geodatabase • Once you have a design for your geodatabase, you can add data to it. • There are three ways to get data into a geodatabase: – Import data – Load data – Copy data

Importing Data • Data can be imported into geodatabase feature classes and tables – Creates feature classes or tables and populates them with data at the same time – Can import multiple data files at once – Can exclude attributes from being imported – By default, the spatial reference of data imported into the geodatabase is the same as the spatial reference of the source files

Loading Data • Loading – Creating new, empty feature classes, feature datasets, and nonspatial tables, then populating them with data – Data from multiple source files can be loaded and combined into one geodatabase feature class – Can also select which features and attributes are loaded – Can populate empty feature classes and tables by creating new data in ArcMap • When loading existing data into empty feature classes – Spatial reference and the name, type, and length of the attribute fields in the source files and in the empty feature class must be the same • If some attribute fields in the new feature class don't match the ones in the source files, only matching attributes will be loaded
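The matching rule for loading (a field is loaded only when its name, type, and length agree between source and target) can be sketched as follows. Schemas are modeled here as dictionaries mapping a field name to a `(type, length)` tuple; that representation, the function names, and the choice to leave unmatched target fields as `None` are all assumptions for illustration, not the behavior of the ArcGIS data loader.

```python
def matching_fields(source_schema, target_schema):
    """Names of target fields whose (type, length) equals the source field's."""
    return [
        name
        for name, spec in target_schema.items()
        if source_schema.get(name) == spec
    ]


def load_rows(source_rows, source_schema, target_schema):
    """Copy only matching attributes into rows shaped like the target schema."""
    matched = matching_fields(source_schema, target_schema)
    loaded = []
    for row in source_rows:
        new_row = dict.fromkeys(target_schema)  # every target field starts empty
        for name in matched:
            new_row[name] = row.get(name)
        loaded.append(new_row)
    return loaded
```

If the source defines `STREET` as text of length 50 and the empty feature class defines it as text of length 30, the sketch treats the field as non-matching, so street names are not carried over even though the field names are identical.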

Copying Data • Can copy and paste if data exists in another geodatabase • When copying and pasting feature classes into a feature dataset – Make sure the spatial references of both feature datasets are the same

Understanding Field Types • When creating new, empty feature classes and tables – Important to be aware that the field type of an attribute determines what data can be stored in it.

Understanding Field Types

| Type | Stored Values | Application |
| --- | --- | --- |
| Short integer | Numbers from -32,768 to 32,767 | Numeric values without decimal places |
| Long integer | Numbers from -2,147,483,648 to 2,147,483,647 | Large numeric values without decimal places |
| Float | Approx. -3.4E38 to 1.2E38 | Numeric values with or without decimal places |
| Double | Approx. -2.2E308 to 1.8E308 | Large numeric values with or without decimal places |
| Text | Up to 64,000 characters | Text strings such as names and descriptions |
| Date | mm/dd/yyyy hh:mm:ss AM/PM | Date and time values |
| Blob (binary large object) | Varies | Images and other multimedia |
| GUID | 36-character string enclosed in curly brackets | Unique feature IDs within and across geodatabases |
| Raster | | Raster datasets as attributes |
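Choosing a field type usually comes down to checking the range of values you expect to store. The sketch below picks the smallest numeric field type for a set of whole numbers, assuming the standard 16-bit and 32-bit signed integer limits for short and long integers and falling back to Double for anything larger. The function name and the fallback choice are assumptions for illustration.

```python
SHORT_RANGE = (-32_768, 32_767)                 # 16-bit signed integer limits
LONG_RANGE = (-2_147_483_648, 2_147_483_647)    # 32-bit signed integer limits


def smallest_integer_field(values):
    """Return the smallest field type that can hold every whole number given."""
    lowest, highest = min(values), max(values)  # raises ValueError if empty
    if SHORT_RANGE[0] <= lowest and highest <= SHORT_RANGE[1]:
        return "Short integer"
    if LONG_RANGE[0] <= lowest and highest <= LONG_RANGE[1]:
        return "Long integer"
    # Too large for either integer type in this sketch.
    return "Double"
```

Parcel counts in the hundreds fit a short integer, while population figures for large cities need a long integer; checking this before creating an empty feature class avoids values that fail to load later.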

Exercise • Create a project database

Review • Geographic data – Recorded information about the earth's surface and the objects found on it, associated with a geographic location • Two models for representing real-world features in a GIS – Vector data model – Raster data model • Common file formats for storing vector data – Geodatabases – Shapefiles – Coverages • Common file formats for storing raster data – Images – Grids

Review • Three types of geodatabases – File geodatabases – Personal geodatabases for Microsoft Access – ArcSDE geodatabases • Used with relational database management systems and ArcSDE technology • Geodatabase feature classes can be – Stand-alone – Organized into larger units called feature datasets • Field type of an attribute determines the type of data that can be stored in it

Review Questions 1. When would you use the vector data model versus the raster data model? 2. In a geodatabase polygon feature class, which two fields are automatically calculated and updated? 3. List three methods of adding data to a geodatabase. 4. If you wanted to combine features from different feature classes into one feature class, which method would you use?

Review Answers 1. If you want to represent features with distinct boundaries, it's probably better to use the vector data model and store the features' x, y coordinate locations. The raster data model is better suited to representing phenomena whose boundaries change gradually across a given area. 2. In a geodatabase polygon feature class, the Shape_Length and Shape_Area fields are automatically calculated and updated. 3. Three methods of adding data to a geodatabase are importing, loading, and copying data. 4. To combine features from different feature classes into one feature class, you would load them into a new, empty geodatabase feature class.

Key Terms • Coverage – A data model for storing geographic features using ArcInfo software. A coverage stores a set of thematically associated data considered to be a unit. It usually represents a single layer, such as soils, streams, roads, or land use. In a coverage, features are stored as both primary features (points, arcs, polygons) and secondary features (tics, links, annotation). Feature attributes are described and stored independently in feature attribute tables. Coverages cannot be edited in ArcGIS. • Data model – In a general sense, an abstraction of the real world which incorporates only those properties thought to be relevant to the application at hand. It would normally define specific groups of entities, their attribute values, and the relationships between these. In GIS, it is often used to refer to the mechanistic representation and organization of spatial data; for example, the vector data model and the raster data model. A data model is independent of a computer system and its associated data structures. • Feature class – A collection of geographic features with the same geometry type (such as point, line, or polygon), the same attributes, and the same spatial reference. Feature classes can stand alone within a geodatabase or be contained within shapefiles, coverages, or other feature datasets. Feature classes allow homogeneous features to be grouped into a single unit for data storage purposes. For example, highways, primary roads, and secondary roads can be grouped into a line feature class named "roads." In a geodatabase, feature classes can also store annotation and dimensions.

Key Terms • Feature dataset – In a geodatabase, a collection of feature classes stored together that share the same spatial reference; that is, they have the same coordinate system and their features fall within a common geographic area. Feature classes with different geometry types may be stored in a feature dataset. • Geodatabase – A relational database that stores geographic data. More precisely, the geodatabase is an object-oriented data model introduced by ESRI that is used to store spatial and attribute data and the relationships that exist among them. The geodatabase provides tools for creating "smart" geographic features and enforcing database integrity. A geodatabase can store feature classes, feature datasets, nonspatial tables, and relationship classes. • Grid – (1) A raster data format that defines geographic space as an array of equally sized cells arranged in rows and columns. Each cell stores a numeric value that represents a geographic attribute (such as elevation) for that unit of space. When the grid is drawn as a map, cells are assigned colors according to their numeric values. Each grid cell is referenced by its x, y coordinate locations. (2) In cartography, any network of parallel and perpendicular lines superimposed on a map and used for reference. Grids are usually named after the map's projection; for example, Lambert grid and Transverse Mercator grid.

Key Terms • Image – A raster-based representation or description of a scene, typically produced by an optical or electronic device such as a camera or a scanning radiometer. Common examples include remotely sensed data (for example, satellite data), scanned data, and photographs. An image is stored as a raster dataset of binary or integer values that represent the intensity of reflected light, heat, sound, or any other range of values on the electromagnetic spectrum. An image may contain one or more bands. • Raster – A spatial data model that defines space as an array of equally sized cells arranged in rows and columns. Each cell contains an attribute value and location coordinates. Unlike a vector structure, which stores coordinates explicitly, raster coordinates are contained in the ordering of the matrix. Groups of cells that share the same value represent geographic features. • Shapefile – A vector data storage format for storing the location, shape, and attributes of geographic features. A shapefile is stored in a set of related files and contains one feature class.

Key Terms • Thumbnail – A snapshot describing the geographic data contained in a data source, layer, or map. A thumbnail might provide an overview of all the features in a feature class or a detailed view of the features in, and the symbology of, a layer. Thumbnails are not updated automatically; they will go out of date if features are added to a data source or if the symbology of a layer changes. Thumbnails are created in ArcCatalog. • Vector – A coordinate-based data model that represents geographic features as points, lines, or polygons. Each point feature is represented as a single coordinate pair, while line and polygon features are represented as ordered lists of vertices. Attributes are associated with each feature, as opposed to a raster data model, which associates attributes with grid cells. • x, y coordinates – A pair of values that represents the distance from an origin (0, 0) along two axes, a horizontal axis (x) representing east-west, and a vertical axis (y) representing north-south. On a map, x, y coordinates are used to represent features at the location they are found on the earth's spherical surface.