Data Sources, Data Input and Data Quality
Babu Ram Dawadi

Major data feeds to GIS
• GIS systems employ a wide range of data sources.
• Most GIS projects have had to rely almost exclusively upon data available only in printed or "paper" form.
• Much of the data available for use is still published on paper, but a great deal of information is now distributed in digital formats.

Private Suppliers (Commercial Data)
• Commercial mapmaking firms are among the largest providers, but other firms have for years supplied detailed demographic and economic information, such as data on retail trade and marketing trends.
• Some of this information can be quite expensive to purchase.
• Copyright and licensing restrictions play a role.
• Many software vendors earn a substantial income by repackaging and selling data in the proprietary formats used by their software products.

Questions regarding data sources
• Where did it come from?
• In what medium was it originally produced?
• What is the area coverage of the data?
• To what map scale was the data digitized?
• What projection, coordinate system, and datum were used in the maps?
• What was the density of observations used for its compilation?
• How accurate are positional and attribute features?
• Does the data seem logical and consistent?
• Do cartographic representations look "clean"?
• Is the data relevant to the project at hand?
• In what format is the data kept?
• How was the data checked?
• Why was the data compiled?
• What is the reliability of the provider?

Source: Map
• Marks on paper that stand for definable things on the earth's surface.
• A representation, usually on a flat surface, of the whole or a part of an area.
• Any concrete or abstract image of the distributions and features that occur on or near the surface of the earth or other bodies.
• Map resolution: refers to how accurately the location and shape of the map features can be depicted for a given map scale.

Maps gain value in three ways
• As a way of recording and storing information: governments, business, and society must store large quantities of information about the environment and the location of natural resources, capital assets, and people.
• As a means of analyzing distributions and spatial patterns: maps let us recognize spatial distributions and relationships and make it possible for us to visualize, and hence conceptualize, patterns and processes that operate through space.
• As a method of presenting information and communicating findings: maps allow us to convey information and findings that are difficult to express verbally.

Virtual Maps vs. Real Maps
• Real map: a hard copy or conventional map.
• Virtual map: information that can be converted into a real map, i.e. information on a computer screen, mental images, field information, notes, and remote sensing information.
Map Features
• Point
• Line
• Area

Elements of a Map
• Scale: the extent of the reduction necessary to put a portion of the earth's surface on a sheet of paper.
  o Numeric or ratio scale: 1:24,000 and 1/24,000 are the same; both mean that one inch on the map = 24,000 inches on the ground.
  o Verbal: 1 inch = 100 feet.
  o Graphic or bar: rake scale or some other graphical representation.
• Direction
• Explanation (legend)

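The ratio and verbal scales above are simple unit conversions, and a small sketch may make the arithmetic concrete. The function names `ground_distance` and `verbal_scale_to_ratio` are illustrative only and do not come from the slides.

```python
def ground_distance(map_distance, scale_denominator):
    """Ground distance for a map distance at ratio scale 1:scale_denominator.

    The result is in the same unit as map_distance.
    """
    return map_distance * scale_denominator


def verbal_scale_to_ratio(map_inches, ground_feet):
    """Convert a verbal scale such as '1 inch = 100 feet' to a ratio denominator."""
    return ground_feet * 12 / map_inches  # 12 inches per foot


# One inch on a 1:24,000 map, expressed in inches and then in feet.
inches_on_ground = ground_distance(1, 24_000)
feet_on_ground = inches_on_ground / 12

# The verbal scale "1 inch = 100 feet" written as a ratio denominator.
ratio = verbal_scale_to_ratio(1, 100)
print(inches_on_ground, feet_on_ground, ratio)
```

Under these definitions the verbal scale "1 inch = 100 feet" corresponds to a ratio scale of 1:1,200.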
Global Positioning System
• GPS provides specially coded satellite signals that can be processed in a GPS receiver, enabling the receiver to compute position, velocity and time.
• GPS is funded and controlled by the U.S. Department of Defense (DOD). While there are many thousands of civil users of GPS worldwide, the system was designed for and is operated by the U.S. military.
• Four GPS satellite signals are used to compute positions in three dimensions and the time offset in the receiver clock.

GPS: Space Segment
• The Space Segment of the system consists of the GPS satellites. These space vehicles (SVs) send radio signals from space.
• The nominal GPS Operational Constellation consists of 24 satellites that orbit the earth in 12 hours.
• There are often more than 24 operational satellites, as new ones are launched to replace older satellites.
• The satellite orbits repeat almost the same ground track (as the earth turns beneath them) once each day. The orbit altitude is such that the satellites repeat the same track and configuration over any point approximately every 24 hours (4 minutes earlier each day).
• There are six orbital planes, with nominally four SVs in each, equally spaced (60 degrees apart) and inclined at about fifty-five degrees with respect to the equatorial plane.
• This constellation provides the user with between five and eight SVs visible from any point on the earth.

Control Segment
• The Control Segment consists of a system of tracking stations located around the world.

User Segment
• The GPS User Segment consists of the GPS receivers and the user community.
• GPS receivers convert SV signals into position, velocity, and time estimates.
• Four satellites are required to compute the four dimensions of X, Y, Z (position) and time.

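Why four satellites are needed can be illustrated with a small solver: each measured range (a "pseudorange") mixes the true distance with the unknown receiver clock offset, so there are four unknowns (X, Y, Z and time). The sketch below is not taken from the slides; it is a plain Newton iteration on synthetic data, and the names `solve_linear` and `solve_position` as well as the satellite coordinates are made up for illustration.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def solve_linear(a, b):
    """Solve the small dense system a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def solve_position(sats, pseudoranges, iterations=10):
    """Estimate receiver (x, y, z) in metres and clock offset in seconds.

    Model assumed here: pseudorange = geometric distance + C * clock_offset.
    Starts from the earth's centre and applies Newton corrections.
    """
    x = y = z = bias = 0.0  # bias is C * clock_offset, in metres
    for _ in range(iterations):
        jac, resid = [], []
        for (sx, sy, sz), rho in zip(sats, pseudoranges):
            dist = math.sqrt((x - sx) ** 2 + (y - sy) ** 2 + (z - sz) ** 2)
            jac.append([(x - sx) / dist, (y - sy) / dist, (z - sz) / dist, 1.0])
            resid.append(rho - (dist + bias))
        dx, dy, dz, db = solve_linear(jac, resid)
        x, y, z, bias = x + dx, y + dy, z + dz, bias + db
    return (x, y, z), bias / C


# Synthetic test case: hypothetical satellite positions (metres), a receiver on
# the equator, and a 1 ms receiver clock offset.
SATS = [(26_600e3, 0.0, 0.0), (17_000e3, 20_000e3, 0.0),
        (17_000e3, -10_000e3, 17_000e3), (17_000e3, -10_000e3, -17_000e3)]
TRUE_POS, TRUE_OFFSET = (6_371e3, 0.0, 0.0), 1e-3
RANGES = [math.dist(TRUE_POS, s) + C * TRUE_OFFSET for s in SATS]
position, clock_offset = solve_position(SATS, RANGES)
print(position, clock_offset)
```

With only three satellites the 4x4 system above could not be formed, which is the practical meaning of "four satellites are required". Real receivers also correct for the error sources discussed later and usually use more than four satellites in a least-squares fit.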
GPS Data
• The GPS Navigation Message consists of time-tagged data bits marking the time of transmission of each subframe at the time they are transmitted by the SV.
• A data bit frame consists of 1500 bits divided into five 300-bit subframes. A data frame is transmitted every thirty seconds.
• Three six-second subframes contain orbital and clock data. SV clock corrections are sent in subframe one, and precise SV orbital data sets for the transmitting SV are sent in subframes two and three.
• Subframes four and five are used to transmit different pages of system data.
• An entire set of twenty-five frames (125 subframes) makes up the complete Navigation Message, which is sent over a 12.5 minute period.

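The frame figures quoted above are mutually consistent, which a few lines of arithmetic can confirm. The 50 bits-per-second rate is not stated on the slide; it simply follows from the quoted numbers.

```python
BITS_PER_SUBFRAME = 300
SUBFRAMES_PER_FRAME = 5
SECONDS_PER_SUBFRAME = 6
FRAMES_PER_MESSAGE = 25

bits_per_frame = BITS_PER_SUBFRAME * SUBFRAMES_PER_FRAME              # bits in one frame
seconds_per_frame = SUBFRAMES_PER_FRAME * SECONDS_PER_SUBFRAME        # duration of one frame
bit_rate = bits_per_frame / seconds_per_frame                         # implied bits per second
subframes_per_message = FRAMES_PER_MESSAGE * SUBFRAMES_PER_FRAME      # subframes in the full message
message_minutes = FRAMES_PER_MESSAGE * seconds_per_frame / 60         # duration of the full message

print(bits_per_frame, seconds_per_frame, bit_rate, subframes_per_message, message_minutes)
```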
• Data bit subframes (300 bits transmitted over six seconds) contain parity bits that allow for data checking and limited error correction.
• Clock data parameters describe the SV clock and its relationship to GPS time.
[Figure: ranges R1 to R4 between the GPS receiving station (Xr, Yr, Zr) and the satellites. Labels: Satellite 1 (X1, Y1, Z1), Satellite 2 (X2, Y2, Z2), Satellite 3 (X3, Y3, Z3), Time.]

Interruptions to the Satellite
Several factors can affect a satellite's performance and its job of relaying data to the receivers:
• Ionosphere and troposphere delays
• Signal multipath
• Receiver clock errors
• Orbital errors
• Number of satellites visible
• Satellite geometry/shading
• Intentional degradation of the satellite signal
Source: http://www8.garmin.com/aboutGPS/

Interruptions Continued…
• Ionosphere and troposphere: the signal slows down as it passes through the atmosphere.
• Signal multipath: the signal is affected by things in the surrounding area (such as rocky mountains or buildings).
• Receiver clock: the clocks on the receiver and the satellite do not match up.
• Number of visible satellites: the more, the better.
• Satellite geometry/shading: satellites need to be spaced properly; a line or tight group gives bad signals.
• Intentional degradation of the satellite signal: intended for military purposes, but it also affects the civilian population's use of GPS.

Other Global Navigation Satellite Systems (GNSS)
• GLONASS (Russian Federation): 24 satellites
• Galileo (European Union): 27+3 satellites
• Compass (China): 27 MEO + 3 IGSO + 5 GEO satellites
• Regional constellations:
  o Indian Regional Navigational Satellite System (IRNSS): 7 satellites
  o Quasi-Zenith Satellite System (QZSS, Japan): 4 satellites

[Figure: Satellite Navigation Orbits Comparison]

Remote Sensing
• The term remote sensing was coined by geographers in the Office of Naval Research of the United States in the 1960s to refer to the acquisition of information about an object without physical contact.
• Remote sensing is the science and art of acquiring information (spectral, spatial, temporal) about material objects, areas, or phenomena without coming into physical contact with the objects, areas, or phenomena under investigation.

• Electromagnetic waves are radiated through space. When the energy encounters an object, even a very tiny one like a molecule of air, one of three reactions occurs: the radiation will be reflected off the object, absorbed by the object, or transmitted through the object.
• In remote sensing, information transfer is accomplished by use of electromagnetic radiation (EMR). EMR is a form of energy that reveals its presence by the observable effects it produces when it strikes matter.

• The total amount of radiation that strikes an object is referred to as the incident radiation, and is equal to: reflected radiation + absorbed radiation + transmitted radiation.
• In remote sensing, we are largely concerned with REFLECTED RADIATION. This is the radiation that causes our eyes to see colors, causes infrared film to record vegetation, and allows radar images of the earth to be created.

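The energy balance above can be rearranged to give the reflected part once the other two are known. The following sketch only restates that balance; the function names and the sample numbers are hypothetical.

```python
def reflected_radiation(incident, absorbed, transmitted):
    """Reflected energy from: incident = reflected + absorbed + transmitted."""
    reflected = incident - absorbed - transmitted
    if reflected < 0:
        raise ValueError("absorbed + transmitted cannot exceed incident radiation")
    return reflected


def reflectance(incident, absorbed, transmitted):
    """Fraction of the incident radiation that is reflected (0 to 1)."""
    return reflected_radiation(incident, absorbed, transmitted) / incident


# Hypothetical surface: 100 units arrive, 30 are absorbed, 20 pass through.
print(reflected_radiation(100.0, 30.0, 20.0), reflectance(100.0, 30.0, 20.0))
```

Because the proportions differ by material and by wavelength, a table of such reflectance values per wavelength band is what lets features be told apart on an image.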
Types of Remote Sensing
With respect to the type of energy resources:
• Passive remote sensing: makes use of sensors that detect the reflected or emitted electromagnetic radiation from natural sources.
• Active remote sensing: makes use of sensors that detect reflected responses from objects that are irradiated from artificially generated energy sources, such as radar.
With respect to wavelength regions, remote sensing is classified into three types:
  o Visible and reflective infrared remote sensing
  o Thermal infrared remote sensing
  o Microwave remote sensing

[Figure: passive and active remote sensing. Labels: A. the Sun: energy source; C. target; D. sensor: receiving and/or energy source; E. transmission, reception, and pre-processing; F. processing, interpretation and analysis; G. analysis and application.]

Global Geostationary Satellites
[Figure: earth radius 6,370 km; satellite altitude 35,800 km. Coverage regions shown: N. & S. America; Eastern Pacific; Europe and Africa; C. Asia and Indian Ocean; China and Indian Ocean; Japan, Australia and W. Pacific.]

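The altitude quoted for geostationary satellites can be checked against Kepler's third law, T = 2π·sqrt(a³/μ), where a is the orbit radius measured from the earth's center. This check is not part of the slides; the gravitational parameter μ ≈ 398,600.4418 km³/s² is a standard value brought in here as an assumption, and a circular orbit is assumed.

```python
import math

MU_EARTH_KM3_S2 = 398_600.4418  # earth's gravitational parameter (assumed standard value)


def orbital_period_hours(earth_radius_km, altitude_km):
    """Period of a circular orbit at the given altitude, from Kepler's third law."""
    a = earth_radius_km + altitude_km  # orbit radius from the earth's center
    seconds = 2 * math.pi * math.sqrt(a ** 3 / MU_EARTH_KM3_S2)
    return seconds / 3600


# Figures from the slide: earth radius 6,370 km, satellite altitude 35,800 km.
period = orbital_period_hours(6_370, 35_800)
print(round(period, 2))
```

The result is a little under 24 hours, so a satellite at this altitude keeps pace with the earth's rotation and appears fixed over one region.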
Energy Interactions
• The proportions of energy reflected, absorbed, and transmitted will vary for different earth features, depending upon their material type and condition. These differences permit us to distinguish different features on an image.
• Even within a given feature type, the proportion of reflected, absorbed, and transmitted energy will vary at different wavelengths.

Spatial data input
• Direct spatial data acquisition:
  o ground-based field surveys
  o remote sensors in satellites or airplanes
• In practice, it is not always feasible to obtain spatial data using these techniques. Factors of cost and available time may be a hindrance.
• Digitizing paper maps:
  o on-tablet
  o on-screen

The vectorization process
• Vectorization is the process that attempts to distill points, lines and polygons from a scanned image.
• As scanned lines may be several pixels wide, they are often first thinned to retain only the centerline. This thinning process is also known as skeletonizing, as it removes all pixels that make the line wider than just one pixel.
• Semi-automatic vectorization proceeds by placing the mouse pointer at the start of a line to be vectorized.

[Figure: scanned image; vectorized data; after processing]

Spatial Referencing
• Geographic referencing, sometimes simply called georeferencing, is defined as the representation of the location of real-world features within the spatial framework of a particular coordinate system.
• The objective of georeferencing is to provide a rigid spatial framework by which the positions of real-world features are measured, computed, recorded, and analyzed.

Spatial reference system and frames
• The geometry and motion of objects in 3D Euclidean space are described in a reference coordinate system.
• A reference coordinate system is a coordinate system with a well-defined origin and orientation of the three orthogonal coordinate axes. We shall refer to such a system as a spatial reference system (SRS).
• Several spatial reference systems are used in the earth sciences. The most important one for the GIS community is the International Terrestrial Reference System (ITRS).
• The ITRS has its origin in the center of mass of the earth.

[Figure: (a) the ITRS and (b) the ITRF visualized as the fundamental polyhedron]

ITRS
• The ITRS is realized through the International Terrestrial Reference Frame (ITRF), a catalogue of estimated coordinates (and velocities), at a particular epoch, of several specific, identifiable points.
• These points can be thought of as defining the vertices of a fundamental polyhedron.
• Maintenance of the spatial reference frame means relating the rotated, translated and deformed polyhedron at a later epoch to the fundamental polyhedron.
• Frame maintenance is necessary because of geophysical processes that deform the earth's crust at measurable global, regional and local scales.

Spatial reference surfaces and datum
• The ITRF is sufficient for describing the geometry and behavior in time of objects of interest near and on the earth's surface in terms of a uniform triad of geocentric, Cartesian X, Y, Z coordinates and velocities.
• Then why do we also need to introduce spatial reference surfaces?
• Splitting the description of 3D location into 2D (horizontal) and 1D (height) has a long tradition in the earth sciences.

SRS & Datum…
• We humans are essentially inhabitants of 2D space. In the first instance, we have sought intuitively to describe our environment in two dimensions.
• Hence we need a simple 2D curved reference surface upon which the complex 3D earth topography can be projected for easier 2D horizontal referencing and computations.

Datum
• A datum is a set of parameters defining a coordinate system, and a set of control points whose geometric relationships are known, either through measurement or calculation (Dewhurst, 1990).
• A datum is defined by a spheroid, which approximates the shape of the Earth, and the spheroid's position relative to the center of the Earth.
• There are many spheroids representing the shape of the Earth, and many more datums based upon them.

The geoid and the vertical datum
• To describe heights, we need an imaginary surface of zero height.
• A surface where water does not flow, a level surface, is a good candidate. Each level surface is a surface of constant height.
• However, there are infinitely many level surfaces. Which one should we choose as the height reference surface?
• The most obvious choice is the level surface that most closely approximates all the earth's oceans. We call this surface the geoid.

• Every point on the geoid has the same zero height all over the world. This makes it an ideal global reference surface for heights.
• Historically, the geoid has been realized only locally, not globally. For the Netherlands and Germany, the local mean sea level is realized through the Amsterdam tide gauge (zero height).
• Consequently, there are several realizations of local mean sea level, also called local vertical datums, in the world. They are parallel to the geoid but offset by up to a couple of meters.

The ellipsoid and the horizontal datum
• The earth has been found to be slightly flattened at the poles, and the physical shape of the real earth is closely approximated by the mathematical surface of the rotational ellipsoid.
• The ellipsoid is widely used as the reference surface for horizontal coordinates (latitude and longitude).
[Figure: the geoid, an ellipsoid globally best fitting the geoid, and an ellipsoid regionally best fitting the geoid, with the region of best fit marked.]

Ellipsoid…
• The mathematical shape that is simple enough and most closely approximates the local mean sea level is the surface of an ellipsoid.
• An ellipsoid with specific dimensions, a and b being half the length of the major and minor axis respectively, is chosen to best fit the local mean sea level.
• The ellipsoid is then positioned and oriented with respect to the local mean sea level by adopting a latitude (φ), longitude (λ) and height (h) of a so-called fundamental point.

We say that a local horizontal datum is defined by:
• the dimensions (a, b) of the ellipsoid,
• the adopted geographic coordinates φ, λ and h of the fundamental point, and
• the azimuth from this point to another.
Different ellipsoids with varying position and orientation had to be adopted to best fit the local mean sea level in different countries or regions. An example is the Potsdam datum, the local horizontal datum used in Germany. The fundamental point is in Rauenberg and the underlying ellipsoid is the Bessel ellipsoid (a = 6,377,397.156 m, b = 6,356,079.175 m).

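To show how the dimensions a and b enter computations, here is a minimal sketch that derives the flattening and converts geographic coordinates (φ, λ, h) on an ellipsoid into geocentric Cartesian X, Y, Z. It uses the standard closed-form conversion; the function names are illustrative, and the Bessel semi-axes a = 6,377,397.156 m and b = 6,356,079.175 m are assumed here as the values for the Potsdam example.

```python
import math

# Assumed Bessel ellipsoid semi-axes in metres.
BESSEL_A = 6_377_397.156
BESSEL_B = 6_356_079.175


def ellipsoid_shape(a, b):
    """Return (flattening, first eccentricity squared) for semi-axes a and b."""
    flattening = (a - b) / a
    e2 = (a * a - b * b) / (a * a)
    return flattening, e2


def geodetic_to_cartesian(lat_deg, lon_deg, h, a, b):
    """Convert latitude, longitude (degrees) and ellipsoidal height to X, Y, Z."""
    _, e2 = ellipsoid_shape(a, b)
    lat, lon = math.radians(lat_deg), math.radians(lon_deg)
    # Radius of curvature in the prime vertical at this latitude.
    n = a / math.sqrt(1 - e2 * math.sin(lat) ** 2)
    x = (n + h) * math.cos(lat) * math.cos(lon)
    y = (n + h) * math.cos(lat) * math.sin(lon)
    z = (n * (1 - e2) + h) * math.sin(lat)
    return x, y, z


f, _ = ellipsoid_shape(BESSEL_A, BESSEL_B)
print(1 / f)                                                  # inverse flattening
print(geodetic_to_cartesian(0, 0, 0, BESSEL_A, BESSEL_B))     # point on the equator
print(geodetic_to_cartesian(90, 0, 0, BESSEL_A, BESSEL_B))    # north pole
```

On the equator the X coordinate equals a, and at the pole the Z coordinate equals b, which is a quick sanity check on the formulas.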
Datum Transformation
• Satellite positioning and navigation technology, now widely used around the world for spatial referencing, implies a global geocentric datum.
• Global and regional data sets nowadays almost always refer to a global geocentric datum and are useful to individual nations only if they can be reconciled with the local datum.
• Mapping organizations do not only coach the user community about the implications of the geocentric datum. They also develop tools to enable users to transform coordinates of spatial objects from the new datum to the old one.

Datum…
• This process is known as datum transformation. The tools are called datum transformation parameters.
• The good news is that a transformation from datum A to datum B is a mathematically straightforward process.
• Essentially, it is a transformation between two orthogonal Cartesian spatial reference frames, together with some elementary tools from adjustment theory.

Datum
To translate one datum to another we must know the relationship between the chosen ellipsoids in terms of position and orientation. The relationship is defined by 7 constants:
• 3: distance of the ellipsoid center from the center of the earth, along X, Y and Z
• 3: rotations around the X, Y and Z axes of the Cartesian coordinate system
• 1: scale change (S) of the survey control network

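The seven constants listed above can be applied to a geocentric Cartesian point as a translation, a small rotation and a scale change. The sketch below assumes the small-angle form with the "position vector" sign convention for rotations; the slides do not say which convention is meant, and the opposite ("coordinate frame") convention flips the rotation signs. The function name and the units (arc-seconds, parts per million) are illustrative choices.

```python
import math


def helmert_transform(point, translation, rotation_arcsec, scale_ppm):
    """Apply a 7-parameter similarity transformation to a Cartesian point.

    point, translation: (X, Y, Z) in metres
    rotation_arcsec:    small rotations about the X, Y, Z axes, in arc-seconds
    scale_ppm:          scale change in parts per million
    """
    x, y, z = point
    tx, ty, tz = translation
    rx, ry, rz = (math.radians(r / 3600) for r in rotation_arcsec)
    s = 1 + scale_ppm * 1e-6
    # Small-angle rotation matrix, position vector convention (assumption).
    xn = tx + s * (x - rz * y + ry * z)
    yn = ty + s * (rz * x + y - rx * z)
    zn = tz + s * (-ry * x + rx * y + z)
    return xn, yn, zn


# Hypothetical parameters: shift only, then scale only.
print(helmert_transform((1000.0, 2000.0, 3000.0), (10.0, -5.0, 2.0), (0, 0, 0), 0))
print(helmert_transform((1_000_000.0, 0.0, 0.0), (0, 0, 0), (0, 0, 0), 1.0))
```

Published transformation parameters always come with a stated convention and direction (from datum A to datum B), and both must be respected when applying them.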
[Figure: the three kinds of change, drawn on X, Y, Z axes: movement of points along an axis; movement of points around an axis; changing the distance between points.]

Map Projections
• A map projection is an attempt to portray the surface of the earth, or a portion of the earth, on a flat surface; it is the manner in which the spherical surface of the earth is represented on a two-dimensional surface.
• All projections distort some properties of the map (conformality, distance, direction, scale, or area). Choose a projection that will MINIMIZE distortion in your area and is best suited for your application.
• Conformality: when the scale of a map at any point on the map is the same in any direction, the projection is conformal. Meridians (lines of longitude) and parallels (lines of latitude) intersect at right angles. Shape is preserved locally on conformal maps.

Map Projections…
• Distance: a map is equidistant when it correctly portrays distances from the center of the projection to any other place on the map.
• Direction: a map preserves direction when azimuths (angles from a point on a line to another point) are portrayed correctly in all directions.
• Scale: scale is the relationship between a distance portrayed on a map and the same distance on the Earth.
• Area: when a map portrays areas over the entire map so that all mapped areas have the same proportional relationship to the areas on the Earth that they represent, the map is an equal-area map.

Classification of map projections
Map projections fall into three general classes:
• Cylindrical
• Conical
• Planar or azimuthal
Cylindrical projection: a cylinder is assumed to circumscribe a transparent globe (marked with meridians and parallels) so that the cylinder touches the equator throughout its circumference. Assuming that a light bulb is placed at the center of the globe, the graticule of the globe is projected onto the cylinder.

Cylinder: by cutting open the cylinder along a meridian and unfolding it, a rectangle-shaped cylindrical projection is obtained.

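The light-bulb construction described above has a simple closed form. With the light at the center of a globe of radius R and the cylinder touching the equator, a point at latitude φ and longitude λ lands at x = R·λ and y = R·tan φ on the unrolled cylinder (this is usually called the central cylindrical projection; the name and the function below are not from the slides). Most cylindrical projections used in practice, such as Mercator, space the parallels differently.

```python
import math


def central_cylindrical(lat_deg, lon_deg, radius=1.0):
    """Project (latitude, longitude) in degrees onto the unrolled cylinder.

    Light source at the globe's center, cylinder tangent at the equator.
    The poles cannot be shown because tan(90 degrees) is unbounded.
    """
    if abs(lat_deg) >= 90:
        raise ValueError("the poles project to infinity")
    x = radius * math.radians(lon_deg)
    y = radius * math.tan(math.radians(lat_deg))
    return x, y


for lat in (0, 30, 60, 80):
    print(lat, central_cylindrical(lat, 0))
```

The rapidly growing y values toward the poles show the distortion that comes with stretching the globe onto a cylinder.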
Conical Projection: a cone is placed over the globe in such a way that the apex of the cone is exactly over the polar axis.

Planar or Azimuthal Projection
• A plane is placed so that it touches the globe at the North or South Pole. The resulting projection is better known as the polar azimuthal projection.
• It is circular in shape, with meridians projected as straight lines radiating from the center of the circle, which is the pole.

Data precision, error and repair
• Precision refers to the level of measurement and exactness of description in a GIS database. Precise location data may measure position to a fraction of a unit.
• The level of precision required for particular applications varies greatly. Engineering projects such as road and utility construction require very precise information, measured to the millimeter or tenth of an inch.
• Highly precise data can be very difficult and costly to collect. Carefully surveyed locations needed by utility companies to record the locations of pumps, wires, pipes and transformers cost $5-20 per point to collect.

Precision, error and accuracy
• Acquired data sets must be checked for consistency and completeness. This requirement applies to the geometric and topological quality as well as the semantic quality of the data.
• There are different approaches to cleaning up data. Errors can be identified automatically, after which manual editing methods can be applied to correct them.
[Table with before-cleanup and after-cleanup illustrations. Surviving descriptions: erase duplicates or sliver lines; erase dangling objects or overshoots.]

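One of the cleanup operations named above, erasing duplicate lines, is easy to automate. The sketch below is a minimal illustration and not the procedure of any particular GIS package: it treats a segment and its reverse as the same line and compares coordinates exactly, whereas real cleanup tools normally allow a small tolerance.

```python
def remove_duplicate_segments(segments):
    """Drop repeated line segments, ignoring digitizing direction.

    Each segment is ((x1, y1), (x2, y2)). The first occurrence is kept.
    """
    seen = set()
    kept = []
    for start, end in segments:
        # Order the endpoints so that A->B and B->A produce the same key.
        key = (start, end) if start <= end else (end, start)
        if key not in seen:
            seen.add(key)
            kept.append((start, end))
    return kept


lines = [
    ((0, 0), (10, 0)),
    ((10, 0), (0, 0)),   # same line digitized in the opposite direction
    ((10, 0), (10, 5)),
    ((0, 0), (10, 0)),   # exact duplicate
]
print(remove_duplicate_segments(lines))
```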
Multiple data sources
A GIS project usually involves multiple data sets, so a next step addresses the issue of how these multiple sets relate to each other. There are three fundamental cases to be considered if we compare data sets pairwise:
• They may be about the same area, but differ in accuracy.
• They may be about the same area, but differ in choice of representation.
• They may be about adjacent areas, and have to be merged into a single data set.

Differences in accuracy
• Images come at a certain resolution, and paper maps at a certain scale. This typically results in differences in the resolution of acquired data sets.
• Due to scale differences in the sources, the resulting polygons do not perfectly coincide, and polygon boundaries cross each other.
• The integration of two vector data sets may lead to slivers.

Differences in representation
• There exist more advanced GIS applications that require the possibility of representing the same geographic phenomenon in different ways.
[Figure: multi-scale and multi-representation systems compared (object in scale i, object in scale j, object in scale k; object with multiple representation). The main difference is that multi-representation systems have a built-in understanding that different representations belong together.]

Data Transformation
• Format change: raster-to-vector and vector-to-raster conversion within the same GIS system. May also include raster-to-vector and vector-to-raster conversion of data.
• Loss of detail: especially at feature edges; vector data generally represents a feature more accurately.
• Loss of attribute data: some raster formats do not allow for multiple attributes per cell.
• Some systems use one format exclusively and provide utilities or import options to bring in the data and convert it to the needed format.

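The loss of detail and of attribute data during vector-to-raster conversion can be seen in a toy example. The sketch below rasterizes point features onto a grid that stores one value per cell; the function name, grid layout (row 0 at the southern edge) and sample features are all hypothetical.

```python
def rasterize_points(points, xmin, ymin, cell_size, ncols, nrows):
    """Burn point features (x, y, value) into a grid with one value per cell.

    Row 0 is the southernmost row. When several points fall in the same
    cell, the last one overwrites the others, so information is lost.
    """
    grid = [[None] * ncols for _ in range(nrows)]
    for x, y, value in points:
        col = int((x - xmin) // cell_size)
        row = int((y - ymin) // cell_size)
        if 0 <= col < ncols and 0 <= row < nrows:
            grid[row][col] = value
    return grid


# Two distinct vector points less than one cell apart.
features = [(1.2, 1.1, "well"), (1.8, 1.9, "pump")]
grid = rasterize_points(features, xmin=0, ymin=0, cell_size=1, ncols=3, nrows=3)
for row in reversed(grid):  # print north at the top
    print(row)
```

Both features fall into the same cell, so after conversion only one of them survives and their exact positions are replaced by the cell location. A smaller cell size reduces this effect at the cost of a larger raster.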