{"id":40,"date":"2014-01-23T22:10:51","date_gmt":"2014-01-23T22:10:51","guid":{"rendered":"http:\/\/opentextbc.ca\/natureofgeographicinformation\/?post_type=chapter&#038;p=40"},"modified":"2016-12-09T20:47:30","modified_gmt":"2016-12-09T20:47:30","slug":"1-overview","status":"publish","type":"chapter","link":"https:\/\/opentextbc.ca\/natureofgeographicinformation\/chapter\/1-overview\/","title":{"raw":"Data and Information","rendered":"Data and Information"},"content":{"raw":"<h2>1.1. Overview<\/h2>\r\nWhen I started writing this text in 1997, my office was across the street (and, fortunately, upwind) from Penn State's power plant. The energy used to heat and cool my office is still produced there by burning coal mined from nearby ridges. Combustion transforms the potential energy stored in the coal into electricity, which solves the problem of an office that would otherwise be too cold or too warm. Unfortunately, the solution itself causes another problem, namely emissions of carbon dioxide and other more noxious substances into the atmosphere. Cleaner means of generating electricity exist, of course, but they too involve transforming energy from one form to another. And cleaner methods cost more than most of us are willing or able to pay.\r\n\r\nIt seems to me that a coal-fired power plant is a pretty good analogy for a geographic information system. For that matter, GIS is comparable to any factory or machine that transforms a raw material into something more valuable. Data is grist for the GIS mill. GIS is like the machinery that transforms the data into the commodity--information--that is needed to solve problems or create opportunities. 
And the problems that the manufacturing process itself creates include uncertainties resulting from imperfections in the data, intentional or unintentional misuse of the machinery, and ethical issues related to what the information is used for, and who has access to it.\r\n\r\nThis text explores the nature of geographic information. To study the nature of something is to investigate its essential characteristics and qualities. To understand the nature of the energy produced in a coal-fired power plant, one should study the properties, morphology, and geographic distribution of coal. By the same reasoning I believe that a good approach to understanding the information produced by GIS is to investigate the properties of geographic data and the technologies and institutions that produce it.\r\n<h3>Objectives<\/h3>\r\nThe goal of Chapter 1 is to situate GIS in a larger enterprise known as Geographic Information Science and Technology (GIS&amp;T), and in what the U.S. Department of Labor calls the \"geospatial industry.\" In particular, students who successfully complete Chapter 1 should be able to:\r\n<ol>\r\n \t<li>Define a geographic information system;<\/li>\r\n \t<li>Recognize and name basic database operations from verbal descriptions;<\/li>\r\n \t<li>Recognize and name basic approaches to geographic representation from verbal descriptions;<\/li>\r\n \t<li>Identify and explain at least three distinguishing properties of geographic data; and<\/li>\r\n \t<li>Outline the kinds of questions that GIS can help answer.<\/li>\r\n<\/ol>\r\n<h2>1.2. Checklist<\/h2>\r\nThe following checklist is for Penn State students who are registered for classes in which this text, and associated quizzes and projects in the ANGEL course management system, have been assigned. 
You may find it useful to print this page out first so that you can follow along with the directions.\r\n<h3>Chapter 1 Checklist (for registered students only)<\/h3>\r\n<table summary=\"Tasks to be completed for the lesson.\"><caption>Chapter 1 Checklist<\/caption>\r\n<tbody>\r\n<tr>\r\n<th>Step<\/th>\r\n<th>Activity<\/th>\r\n<th>Access\/Directions<\/th>\r\n<\/tr>\r\n<tr>\r\n<th>1<\/th>\r\n<td><strong>Read<\/strong>\u00a0Chapter 1<\/td>\r\n<td>This is the second page of Chapter 1. Click on the links at the bottom of the page to continue or to return to the previous page, or to go to the top of the chapter. You can also navigate the text via the links in the GEOG 482 menu on the left.<\/td>\r\n<\/tr>\r\n<tr>\r\n<th>2<\/th>\r\n<td>Submit\u00a0<strong>quizzes<\/strong>\u00a0as you come across them in the chapter. Blue banners denote practice quizzes that are not graded. Red banners signal graded quizzes. (Note that Chapter 1 does not include a graded quiz.)<\/td>\r\n<td>Go to ANGEL &gt; [your course section] &gt; Lessons tab &gt; Chapter 1 folder &gt; [quiz]<\/td>\r\n<\/tr>\r\n<tr>\r\n<th>3<\/th>\r\n<td>Perform\u00a0<strong>\"Try This\" activities<\/strong>\u00a0as you come across them in the chapter. \"Try This\" activities are not graded.<\/td>\r\n<td>Instructions are provided for each activity.<\/td>\r\n<\/tr>\r\n<tr>\r\n<th>4<\/th>\r\n<td>\u00a0Read\u00a0<strong>comments and questions<\/strong>\u00a0posted by fellow students. Add comments and questions of your own, if any.<\/td>\r\n<td>\u00a0Comments and questions may be posted on any page of the text, or in a Chapter-specific discussion forum in ANGEL.<\/td>\r\n<\/tr>\r\n<\/tbody>\r\n<\/table>\r\n<h2>\u00a01.3. Data<\/h2>\r\n\"After more than 30 years, we're still confronted by the same major challenge that GIS professionals have always faced: You must have good data. And good data are expensive and difficult to create.\" (Wilson, 2001, p. 
54)\r\n\r\n<strong>Data consist of symbols that represent measurements of phenomena.\u00a0<\/strong>People create and study data as a means to help understand how natural and social systems work. Such systems can be hard to study because they're made up of many interacting phenomena that are often difficult to observe directly, and because they tend to change over time. We attempt to make systems and phenomena easier to study by measuring their characteristics at certain times. Because it's not practical to measure everything, everywhere, at all times, we measure selectively. How accurately data reflect the phenomena they represent depends on how, when, where, and what aspects of the phenomena were measured. All measurements, however, contain a certain amount of error.\r\n\r\nMeasurements of the locations and characteristics of phenomena can be represented with several different kinds of symbols. For example, pictures of the land surface, including photographs and maps, are made up of graphic symbols. Verbal descriptions of property boundaries are recorded on deeds using alphanumeric symbols. Locations determined by satellite positioning systems are reported as pairs of numbers called coordinates. As you probably know, all of these different types of data--pictures, words, and numbers--can be represented in computers in digital form. Obviously, digital data can be stored, transmitted, and processed much more efficiently than their physical counterparts that are printed on paper. These advantages set the stage for the development and widespread adoption of GIS.\r\n<h2>1.4. Information<\/h2>\r\n<strong>Information is data that have been selected or created in response to a question.<\/strong>\u00a0For example, the locations of a building and of a route are merely data until they are needed to dispatch an ambulance in response to an emergency. 
When used to inform those who need to know \"where is the emergency, and what's the fastest route between here and there?,\" the data are transformed into information. The transformation involves the ability to ask the right kind of question, and the ability to retrieve existing data--or to generate new data from the old--that help people answer the question. The more complex the question, and the more locations involved, the harder it becomes to produce timely information with paper maps alone.\r\n\r\nInterestingly, the potential value of data is not necessarily lost when they are used. Data can be transformed into information again and again, provided that the data are kept up to date. Given the rapidly increasing accessibility of computers and communications networks in the U.S. and abroad, it's not surprising that information has become a commodity, and that the ability to produce it has become a major growth industry.\r\n<h2>1.5. Information Systems<\/h2>\r\n<strong>Information systems are computer-based tools that help people transform data into information.<\/strong>\r\n\r\nAs you know, many of the problems and opportunities faced by government agencies, businesses, and other organizations are so complex, and involve so many locations, that the organizations need assistance in creating useful and timely information. That\u2019s what information systems are for.\r\n\r\nAllow me a fanciful example. Suppose that you\u2019ve launched a new business that manufactures solar-powered lawn mowers. You\u2019re planning a direct mail campaign to bring this revolutionary new product to the attention of prospective buyers. But since it\u2019s a small business, you can\u2019t afford to sponsor coast-to-coast television commercials, or to send brochures by mail to more than 100 million U.S. households. 
Instead, you plan to target the most likely customers \u2013 those who are environmentally conscious, who have higher-than-average family incomes, and who live in areas where there is enough water and sunshine to support lawns and solar power.\r\n\r\nFortunately, lots of data are available to help you define your mailing list. Household incomes are routinely reported to banks and other financial institutions when families apply for mortgages, loans, and credit cards. Personal tastes related to issues like the environment are reflected in behaviors such as magazine subscriptions and credit card purchases. Firms like Claritas amass such data, and transform it into information by creating \u201clifestyle segments\u201d \u2013 categories of households that have similar incomes and tastes. Your solar lawn mower company can purchase lifestyle segment information by 5-digit ZIP code, or even by ZIP+4 codes, which designate individual households.\r\n\r\nIt\u2019s astonishing how companies like Claritas can create valuable information from the millions upon millions of transactions that are recorded every day. Their products are made possible by the fact that the original data exist in digital form, and because the companies have developed information systems that enable them to transform the data into information that companies like yours value. The fact that lifestyle information products are often delivered by geographic areas, such as ZIP codes, speaks to the appeal of geographic information systems.\r\n<h3><strong>TRY THIS<\/strong><\/h3>\r\nTry out the demo of what Claritas used to call the \u201cYou Are Where You Live\u201d tool. The Nielsen Company has acquired Claritas and the tool is now called \u201cMyBestSegments.\u201d Point your browser to the\u00a0<a href=\"https:\/\/segmentationsolutions.nielsen.com\/mybestsegments\/\">My Best Segments<\/a>\u00a0page. 
Click the button labeled \u201cZIP Code Look-up.\u201d\r\n\r\nEnter your ZIP code, then choose a segmentation system. Do the lifestyle segments, listed on the left, seem accurate for your community? If you don\u2019t live in the United States, try Penn State\u2019s ZIP code, 16802.\r\nDoes the market segmentation match your expectations? Registered students are welcome to post comments directly to this page.\r\n<h2>1.6. Databases, Mapping, and GIS<\/h2>\r\nOne of our objectives in this first chapter is to be able to define a geographic information system. Here\u2019s a tentative definition:\u00a0<strong>A GIS is a computer-based tool used to help people transform geographic data into geographic information.<\/strong>\r\n\r\nThe definition implies that a GIS is somehow different from other information systems, and that geographic data are different from non-geographic data. Let\u2019s consider the differences next.\r\n<h2>1.7. Database Management Systems<\/h2>\r\nClaritas and similar companies use database management systems (DBMS) to create the \u201clifestyle segments\u201d that I referred to in the previous section. Basic database concepts are important since GIS incorporates much of the functionality of DBMS.\r\n\r\nDigital data are stored in computers as files. Often, data are arrayed in tabular form. For this reason, data files are often called\u00a0<strong>tables<\/strong>. A\u00a0<strong>database<\/strong>\u00a0is a collection of tables. 
Businesses and government agencies that serve large clienteles, such as telecommunications companies, airlines, credit card firms, and banks, rely on extensive databases for their billing, payroll, inventory, and marketing operations.\u00a0<strong>Database management systems<\/strong>\u00a0are information systems that people use to store, update, and analyze non-geographic databases.\r\n\r\nOften, data files are tabular in form, composed of rows and columns.\u00a0<strong>Rows<\/strong>, also known as\u00a0<strong>records<\/strong>, correspond with individual entities, such as customer accounts.\u00a0<strong>Columns<\/strong>\u00a0correspond with the various\u00a0<strong>attributes<\/strong>\u00a0associated with each entity. The attributes stored in the accounts database of a telecommunications company, for example, might include customer names, telephone numbers, addresses, current charges for local calls, long distance calls, taxes, etc.\r\n\r\n<strong>Geographic data<\/strong>\u00a0are a special case: records correspond with places, not people or accounts. Columns represent the attributes of places. The data in the following table, for example, consist of records for Pennsylvania counties. 
Columns contain selected attributes of each county, including the county\u2019s ID code, name, and 1980 population.\r\n<table summary=\"1980 populations for 15 PA counties\"><caption>1980 Population Data for PA Counties<\/caption>\r\n<thead>\r\n<tr>\r\n<th>FIPS Code<\/th>\r\n<th>County<\/th>\r\n<th>1980 Pop<\/th>\r\n<\/tr>\r\n<\/thead>\r\n<tbody>\r\n<tr>\r\n<td>42001<\/td>\r\n<td>Adams County<\/td>\r\n<td>78274<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>42003<\/td>\r\n<td>Allegheny County<\/td>\r\n<td>1336449<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>42005<\/td>\r\n<td>Armstrong County<\/td>\r\n<td>73478<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>42007<\/td>\r\n<td>Beaver County<\/td>\r\n<td>186093<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>42009<\/td>\r\n<td>Bedford County<\/td>\r\n<td>47919<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>42011<\/td>\r\n<td>Berks County<\/td>\r\n<td>336523<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>42013<\/td>\r\n<td>Blair County<\/td>\r\n<td>130542<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>42015<\/td>\r\n<td>Bradford County<\/td>\r\n<td>60967<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>42017<\/td>\r\n<td>Bucks County<\/td>\r\n<td>541174<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>42019<\/td>\r\n<td>Butler County<\/td>\r\n<td>152013<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>42021<\/td>\r\n<td>Cambria County<\/td>\r\n<td>163062<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>42023<\/td>\r\n<td>Cameron County<\/td>\r\n<td>5913<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>42025<\/td>\r\n<td>Carbon County<\/td>\r\n<td>56846<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>42027<\/td>\r\n<td>Centre County<\/td>\r\n<td>124812<\/td>\r\n<\/tr>\r\n<\/tbody>\r\n<\/table>\r\nThe contents of one file in a database.\r\n\r\nThe example is a very simple file, but many geographic attribute databases are in fact very large (the U.S. is made up of over 3,000 counties, almost 50,000 census tracts, about 43,000 five-digit ZIP code areas and many tens of thousands more ZIP+4 code areas). Large databases consist not only of lots of data, but also lots of files. 
Unlike a spreadsheet, which performs calculations only on data that are present in a single document, database management systems allow users to store data in, and retrieve data from, many separate files. For example, suppose an analyst wished to calculate population change for Pennsylvania counties between the 1980 and 1990 censuses. More than likely, 1990 population data would exist in a separate file, like so:\r\n<table summary=\"1990 populations for 15 PA counties\"><caption>1990 Population Data for PA Counties<\/caption>\r\n<thead>\r\n<tr>\r\n<th>FIPS Code<\/th>\r\n<th>1990 Pop<\/th>\r\n<\/tr>\r\n<\/thead>\r\n<tbody>\r\n<tr>\r\n<td>42001<\/td>\r\n<td>84921<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>42003<\/td>\r\n<td>1296037<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>42005<\/td>\r\n<td>73872<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>42007<\/td>\r\n<td>187009<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>42009<\/td>\r\n<td>49322<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>42011<\/td>\r\n<td>352353<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>42013<\/td>\r\n<td>131450<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>42015<\/td>\r\n<td>62352<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>42017<\/td>\r\n<td>578715<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>42019<\/td>\r\n<td>167732<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>42021<\/td>\r\n<td>158500<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>42023<\/td>\r\n<td>5745<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>42025<\/td>\r\n<td>58783<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>42027<\/td>\r\n<td>131489<\/td>\r\n<\/tr>\r\n<\/tbody>\r\n<\/table>\r\nAnother file in a database. A database management system (DBMS) can relate this file to the prior one illustrated above because they share the list of attributes called \u201cFIPS Code.\u201d\r\n\r\nIf two data files have at least one common attribute, a DBMS can combine them in a single new file. The common attribute is called a\u00a0<strong>key<\/strong>. In this example, the key was the county FIPS code (FIPS stands for Federal Information Processing Standard). 
The DBMS allows users to produce new data as well as to retrieve existing data, as suggested by the new \u201c% Change\u201d attribute in the table below.\r\n<table summary=\"Percent change in populations of 15 PA counties from 1980 to 1990\"><caption>Percent Change in Populations for PA Counties 1980-1990<\/caption>\r\n<thead>\r\n<tr>\r\n<th>FIPS<\/th>\r\n<th>County<\/th>\r\n<th>1980<\/th>\r\n<th>1990<\/th>\r\n<th>% Change<\/th>\r\n<\/tr>\r\n<\/thead>\r\n<tbody>\r\n<tr>\r\n<td>42001<\/td>\r\n<td>Adams<\/td>\r\n<td>78274<\/td>\r\n<td>84921<\/td>\r\n<td>8.5<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>42003<\/td>\r\n<td>Allegheny<\/td>\r\n<td>1336449<\/td>\r\n<td>1296037<\/td>\r\n<td>-3<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>42005<\/td>\r\n<td>Armstrong<\/td>\r\n<td>73478<\/td>\r\n<td>73872<\/td>\r\n<td>0.5<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>42007<\/td>\r\n<td>Beaver<\/td>\r\n<td>186093<\/td>\r\n<td>187009<\/td>\r\n<td>0.5<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>42009<\/td>\r\n<td>Bedford<\/td>\r\n<td>47919<\/td>\r\n<td>49322<\/td>\r\n<td>2.9<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>42011<\/td>\r\n<td>Berks<\/td>\r\n<td>336523<\/td>\r\n<td>352353<\/td>\r\n<td>4.7<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>42013<\/td>\r\n<td>Blair<\/td>\r\n<td>130542<\/td>\r\n<td>131450<\/td>\r\n<td>0.7<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>42015<\/td>\r\n<td>Bradford<\/td>\r\n<td>60967<\/td>\r\n<td>62352<\/td>\r\n<td>2.3<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>42017<\/td>\r\n<td>Bucks<\/td>\r\n<td>541174<\/td>\r\n<td>578715<\/td>\r\n<td>6.9<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>42019<\/td>\r\n<td>Butler<\/td>\r\n<td>152013<\/td>\r\n<td>167732<\/td>\r\n<td>10.3<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>42021<\/td>\r\n<td>Cambria<\/td>\r\n<td>163062<\/td>\r\n<td>158500<\/td>\r\n<td>-2.8<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>42023<\/td>\r\n<td>Cameron<\/td>\r\n<td>5913<\/td>\r\n<td>5745<\/td>\r\n<td>-2.8<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>42025<\/td>\r\n<td>Carbon<\/td>\r\n<td>56846<\/td>\r\n<td>58783<\/td>\r\n<td>3.4<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>42027<\/td>\r\n<td>Centre<\/td>\r\n<td
>124812<\/td>\r\n<td>131489<\/td>\r\n<td>5.3<\/td>\r\n<\/tr>\r\n<\/tbody>\r\n<\/table>\r\nA new file produced from the prior two files as a result of two database operations. One operation merged the contents of the two files without redundancy. A second operation produced a new attribute, \u201c% Change,\u201d by dividing the difference between \u201c1990 Pop\u201d and \u201c1980 Pop\u201d by \u201c1980 Pop\u201d and expressing the result as a percentage.\r\n\r\nDatabase management systems are valuable because they provide secure means of storing and updating data. Database administrators can protect files so that only authorized users can make changes. DBMS provide transaction management functions that allow multiple users to edit the database simultaneously. DBMS also provide sophisticated means to retrieve data that meet user-specified criteria. In other words, they enable users to select data in response to particular questions. A question that is addressed to a database through a DBMS is called a\u00a0<strong>query<\/strong>.\r\n\r\nDatabase queries include basic set operations, including union, intersection, and difference. The product of a\u00a0<strong>union<\/strong>\u00a0of two or more data files is a single file that includes all records and attributes, without redundancy. An\u00a0<strong>intersection<\/strong>\u00a0produces a data file that contains only records present in all files. A\u00a0<strong>difference<\/strong>\u00a0operation produces a data file that eliminates records that appear in both original files. (Try drawing Venn diagrams, intersecting circles that show relationships between two or more entities, to illustrate the three operations. Then compare your sketch to\u00a0<a href=\"http:\/\/opentextbc.ca\/files\/geog482\/image\/venn_diagrams%282%29.png\">the Venn diagram example<\/a>.) All operations that involve multiple data files rely on the fact that all files contain a common key. 
The key allows the database system to relate the separate files. Databases that contain numerous files that share one or more keys are called\u00a0<strong>relational databases<\/strong>. Database systems that enable users to produce information from relational databases are called\u00a0<strong>relational database management systems<\/strong>.\r\n\r\nA common use of database queries is to identify subsets of records that meet criteria established by the user. For example, a credit card company may wish to identify all accounts that are 30 days or more past due. A county tax assessor may need to list all properties not assessed within the past 10 years. Or the U.S. Census Bureau may wish to identify all addresses that need to be visited by census takers, because census questionnaires were not returned by mail. DBMS software vendors have adopted a standardized language called SQL (Structured Query Language) to pose such queries.\r\n<h3><strong>PRACTICE QUIZ<\/strong><\/h3>\r\n<h2>1.8. Mapping Systems<\/h2>\r\nGIS (geographic information systems) arose out of the need to perform spatial queries on geographic data. A spatial query requires knowledge of locations as well as attributes. For example, an environmental analyst might want to know which public drinking water sources are located within one mile of a known toxic chemical spill. Or, a planner might be called upon to identify property parcels located in areas that are subject to flooding. To accommodate geographic data and spatial queries, database management systems need to be integrated with mapping systems. Until about 1990, most maps were printed from handmade drawings or engravings. Geographic data produced by draftspersons consisted of graphic marks inscribed on paper or film. To this day, most of the lines that appear on topographic maps published by the U.S. Geological Survey were originally engraved by hand. The place names shown on the maps were affixed with tweezers, one word at a time. 
Needless to say, such maps were expensive to create and to keep up to date. Computerization of the mapmaking process had obvious appeal.\r\n\r\n<strong>Computer-aided design (CAD)<\/strong>\u00a0CAD systems were originally developed for engineers, architects, and other design professionals who needed more efficient means to create and revise precise drawings of machine parts, construction plans, and the like. In the 1980s, mapmakers began to adopt CAD in place of traditional map drafting. CAD operators encode the locations and extents of roads, streams, boundaries and other entities by tracing maps mounted on electronic drafting tables, or by key-entering location coordinates, angles, and distances. Instead of graphic features, CAD data consist of digital features, each of which is composed of a set of point locations. Calculations of distances, areas, and volumes can easily be automated once features are digitized. Unfortunately, CAD systems typically do not encode data in forms that support spatial queries. In 1988, a geographer named David Cowen illustrated the benefits and shortcomings of CAD for spatial decision making. He pointed out that a CAD system would be useful for depicting the streets, property parcel boundaries, and building footprints of a residential subdevelopment. A CAD operator could point to a particular parcel, and highlight it with a selected color or pattern. \u201cA typical CAD system,\u201d Cowen observed, \u201ccould not automatically shade each parcel based on values in an assessor\u2019s database containing information regarding ownership, usage, or value, however.\u201d A CAD system would therefore be of limited use to someone who had to make decisions about land use policy or tax assessment.\r\n\r\n<strong>Desktop mapping\u00a0\u00a0<\/strong>An evolutionary stage in the development of GIS, desktop mapping systems like Atlas*GIS combined some of the capabilities of CAD systems with rudimentary linkages between location data and attribute data. 
A desktop mapping system user could produce a map in which property parcels are automatically colored according to various categories of property values, for example. Furthermore, if property value categories were redefined, the map\u2019s appearance could be updated automatically. Some desktop mapping systems even supported simple queries that allow users to retrieve records from a single attribute file. Most real-world decisions require more sophisticated queries involving multiple data files. That\u2019s where real GIS comes in.\r\n\r\n<strong>Geographic information systems (GIS)<\/strong>\u00a0As stated earlier, information systems assist decision makers by enabling them to transform data into useful information. GIS specializes in helping users transform geographic data into geographic information. David Cowen (1988) defined GIS as a decision support tool that combines the attribute data handling capabilities of relational database management systems with the spatial data handling capabilities of CAD and desktop mapping systems. In particular, GIS enables decision makers to identify locations or routes whose attributes match multiple criteria, even though entities and attributes may be encoded in many different data files.\r\n\r\nInnovators in many fields, including engineers, computer scientists, geographers, and others, started developing digital mapping and CAD systems in the 1950s and 60s. One of the first challenges they faced was to convert the graphical data stored on paper maps into digital data that could be stored in, and processed by, digital computers. Several different approaches to representing locations and extents in digital form were developed. The two predominant representation strategies are known as \u201cvector\u201d and \u201craster.\u201d\r\n<h2>1.9. Representation Strategies for Mapping<\/h2>\r\nRecall that data consist of symbols that represent measurements. 
Digital geographic data are encoded as alphanumeric symbols that represent locations and attributes of locations measured at or near Earth\u2019s surface. No geographic data set represents every possible location, of course. The Earth is too big, and the number of unique locations is too great. In much the same way that public opinion is measured through polls, geographic data are constructed by measuring representative\u00a0<strong>samples<\/strong>\u00a0of locations. And just as serious opinion polls are based on sound principles of statistical sampling, so too do geographic data represent reality by measuring carefully chosen samples of locations. Vector and raster data are, in essence, two distinct sampling strategies.\r\n\r\nThe\u00a0<strong>vector<\/strong>\u00a0approach involves sampling locations at intervals along the length of linear entities (like roads), or around the perimeter of areal entities (like property parcels). When they are connected by lines, the sampled points form line features and polygon features that approximate the shapes of their real-world counterparts.\r\n\r\n<a href=\"http:\/\/opentextbc.ca\/files\/geog482\/file\/vector.avi\"><img class=\"aligncenter\" alt=\"Illustration of vector encoding of a reservoir and highway\" src=\"http:\/\/opentextbc.ca\/natureofgeographicinformation\/wp-content\/uploads\/sites\/19\/2014\/01\/vector.gif\" \/><\/a>\r\n\r\nTwo frames (the first and last) of an animation showing the construction of a vector representation of a reservoir and highway.\r\n<h3><strong>TRY THIS<\/strong><\/h3>\r\nClick the graphic above to download and view the animation file (vector.avi, 1.6 MB) in a separate Microsoft Media Player window.\r\n\r\nTo view the\u00a0<a href=\"http:\/\/opentextbc.ca\/files\/geog482\/file\/vector.mov\">same animation in QuickTime format (vector.mov, 1.6 MB), click here<\/a>. 
Requires the QuickTime plugin, which is available free at\u00a0<a href=\"http:\/\/www.apple.com\/quicktime\/download\/\">apple.com<\/a>.\r\n\r\nThe aerial photograph above left shows two entities, a reservoir and a highway. The graphic above right illustrates how the entities might be represented with vector data. The small squares are nodes: point locations specified by latitude and longitude coordinates. Line segments connect nodes to form line features. In this case, the line feature colored red represents the highway. Series of line segments that begin and end at the same node form polygon features. In this case, two polygons (filled with blue) represent the reservoir.\r\n\r\nThe vector data model is consistent with how surveyors measure locations at intervals as they traverse a property boundary. Computer-aided drafting (CAD) software used by surveyors, engineers, and others, stores data in vector form. CAD operators encode the locations and extents of entities by tracing maps mounted on electronic drafting tables, or by key-entering location coordinates, angles, and distances. Instead of graphic features, CAD data consist of digital features, each of which is composed of a set of point locations.\r\n\r\nThe vector strategy is well suited to mapping entities with well-defined edges, such as highways or pipelines or property parcels. Many of the features shown on paper maps, including contour lines, transportation routes, and political boundaries, can be represented effectively in digital form using the vector data model.\r\n\r\nThe\u00a0<strong>raster<\/strong>\u00a0approach involves sampling attributes at fixed intervals. 
Each sample represents one cell in a checkerboard-shaped grid.\r\n\r\n<a href=\"http:\/\/opentextbc.ca\/files\/geog482\/file\/raster.avi\"><img class=\"aligncenter\" alt=\"Illustration of raster encoding of a reservoir and highway\" src=\"http:\/\/opentextbc.ca\/natureofgeographicinformation\/wp-content\/uploads\/sites\/19\/2014\/01\/raster.gif\" \/><\/a>\r\n\r\nTwo frames (the first and last) of an animation showing the construction of a raster representation of a reservoir and highway.\r\n\r\n&nbsp;\r\n<h3><strong>TRY THIS<\/strong><\/h3>\r\nClick the graphic above to download and view the animation file (raster.avi, 0.8 MB) in a separate Microsoft Media Player window.\r\n\r\nTo view the\u00a0<a href=\"http:\/\/opentextbc.ca\/files\/geog482\/file\/raster.mov\">same animation in QuickTime format (raster.mov, 0.6 MB), click here<\/a>. Requires the QuickTime plugin, which is available free at\u00a0<a href=\"http:\/\/www.apple.com\/quicktime\/download\/\">apple.com<\/a>.\r\n\r\nThe graphic above illustrates a raster representation of the same reservoir and highway as shown in the vector representation. The area covered by the aerial photograph has been divided into a grid. Every grid cell that overlaps one of the two selected entities is encoded with an attribute that associates it with the entity it represents. Actual raster data would not consist of a picture of red and blue grid cells, of course; they would consist of a list of numbers, one number for each grid cell, each number representing an entity. For example, grid cells that represent the highway might be coded with the number \u201c1\u201d and grid cells representing the reservoir might be coded with the number \u201c2.\u201d\r\n\r\nThe raster strategy is a smart choice for representing phenomena that lack clear-cut boundaries, such as terrain elevation, vegetation, and precipitation. 
Digital airborne imaging systems, which are replacing photographic cameras as primary sources of detailed geographic data, produce raster data by scanning the Earth\u2019s surface pixel by pixel and row by row.\r\n\r\nBoth the vector and raster approaches accomplish the same thing: they allow us to caricature the Earth\u2019s surface with a limited number of locations. What distinguishes the two is the sampling strategies they embody. The vector approach is like creating a picture of a landscape with shards of stained glass cut to various shapes and sizes. The raster approach, by contrast, is more like creating a mosaic with tiles of uniform size. Neither is well suited to all applications, however. Several variations on the vector and raster themes are in use for specialized applications, and the development of new object-oriented approaches is underway.\r\n<h3><strong>PRACTICE QUIZ<\/strong><\/h3>\r\n<h2>1.10. Automated Map Analysis<\/h2>\r\nAs I mentioned earlier, the original motivation for developing computer mapping systems was to automate the map making process. Computerization has not only made map making more efficient, it has also removed some of the technological barriers that used to prevent people from making maps themselves. What used to be an arcane craft practiced by a few specialists has become a \u201ccloud\u201d application available to any networked computer user. When I first started writing this course in 1997, my example was the mapping extension included in Microsoft Excel 97, which made creating a simple map as easy as creating a graph. Ten years later, who hasn\u2019t used Google Maps or MapQuest?\r\n\r\nAs much as computerization has changed the way maps are made, it has had an even greater impact on how maps can be used. Calculations of distance, direction, and area, for example, are tedious and error-prone operations with paper maps. Given a digital map, such calculations can easily be automated. 
Those who are familiar with CAD systems know this from first-hand experience. Highway engineers, for example, rely on aerial imagery and digital mapping systems to estimate project costs by calculating the volumes of rock that need to be excavated from hillsides and filled into valleys.\r\n\r\nThe ability to automate analytical tasks does more than relieve tedium and reduce errors. It also allows us to perform tasks that would otherwise seem impractical. Suppose, for example, that you were asked to plot on a map a 100-meter-wide\u00a0<strong>buffer<\/strong>\u00a0zone surrounding a protected stream. If all you had to work with was a paper map, a ruler, and a pencil, you might have a lengthy job on your hands. You might draw lines scaled to represent 100 meters, perpendicular to the stream on both sides, at intervals that vary in frequency with the sinuosity of the stream. Then you might plot a perimeter that connects the end points of the perpendicular lines. If your task was to create hundreds of such buffer zones, you might conclude that automation is a necessity, not just a luxury.\r\n\r\n<img alt=\"Illustration showing construction of a 100-meter buffer polygon around a stream\" src=\"http:\/\/opentextbc.ca\/natureofgeographicinformation\/wp-content\/uploads\/sites\/19\/2014\/01\/buffer.gif\" \/>\r\n\r\nSurrounding a protected stream with a buffer polygon.\r\n\r\nSome tasks can be implemented equally well in either vector- or raster-oriented mapping systems. Other tasks are better suited to one representation strategy or the other. The calculation of slope, for example, or of <strong>gradient<\/strong>\u2013the direction of maximum slope along a surface\u2013is more efficiently accomplished with raster data. The slope of one raster grid cell may be calculated by comparing its elevation to the elevations of the eight cells that surround it. 
Raster data are also preferred for a procedure called\u00a0<strong>viewshed analysis<\/strong>\u00a0that predicts which portions of a landscape will be in view, or hidden from view, from a particular perspective.\r\n\r\nSome mapping systems provide ways to analyze attribute data as well as locational data. For example, the Excel mapping extension I mentioned above links the geographic data display capabilities of a mapping system with the data analysis capabilities of a spreadsheet. As you probably know, spreadsheets like Excel let users perform calculations on individual fields, columns, or entire files. A value changed in one field automatically changes values throughout the spreadsheet. Arithmetic, financial, statistical, and even certain database functions are supported. But as useful as spreadsheets are, they were not engineered to provide secure means of managing and analyzing large databases that consist of many related files, each of which is the responsibility of a different part of an organization. A spreadsheet is not a DBMS. And by the same token, a mapping system is not a GIS.\r\n<h2>1.11. Geographic Information Systems<\/h2>\r\nThe preceding discussion leads me to revise my working definition:\r\n\r\nAs I mentioned earlier, a geographer named David Cowen defined GIS as\u00a0<strong>a decision-support tool that combines the capabilities of a relational database management system with the capabilities of a mapping system<\/strong> (1988). Cowen cited an earlier study by William Carstensen (1986), who sought to establish criteria by which local governments might choose among competing GIS products. Carstensen chose site selection as an example of the kind of complex task that many organizations seek to accomplish with GIS. 
Given the necessary database, he advised local governments to expect that a fully functional GIS should be able to identify property parcels that are:\r\n<ul>\r\n \t<li>At least five acres in size;<\/li>\r\n \t<li>Vacant or for sale;<\/li>\r\n \t<li>Zoned commercial;<\/li>\r\n \t<li>Not subject to flooding;<\/li>\r\n \t<li>Located not more than one mile from a heavy duty road; and<\/li>\r\n \t<li>Situated on terrain whose maximum slope is less than ten percent.<\/li>\r\n<\/ul>\r\nThe first criterion\u2013identifying parcels five acres or more in size\u2013might require two operations. As described earlier, a mapping system ought to be able to calculate automatically the area of a parcel. Once the area is calculated and added as a new attribute into the database, an ordinary database query could produce a list of parcels that satisfy the size criterion. The parcels on the list might also be highlighted on a map, as in the example below.\r\n\r\n<img alt=\"Map of property parcels five acres or larger in Ontario California\" src=\"http:\/\/opentextbc.ca\/natureofgeographicinformation\/wp-content\/uploads\/sites\/19\/2014\/01\/ont_ca_5acres.gif\" \/>\r\n\r\nThe cartographic result of a database query identifying all property parcels greater than or equal to five acres in size. (City of Ontario, CA, GIS Department. Used by permission.)\r\n\r\nThe ownership status of individual parcels would be an attribute of a property database maintained by a local tax assessor\u2019s office. 
Parcels whose ownership status attribute value matched the criteria \u201cvacant\u201d or \u201cfor sale\u201d could be identified through another ordinary database query.\r\n\r\n<img alt=\"Map of property parcels zoned commercial in Ontario California\" src=\"http:\/\/opentextbc.ca\/natureofgeographicinformation\/wp-content\/uploads\/sites\/19\/2014\/01\/ont_ca_commercial.gif\" \/>\r\n\r\nThe cartographic result of a spatial intersection (or map overlay) operation identifying all property parcels zoned for commercial (C-1) development. (City of Ontario, CA, GIS Department. Used by permission.)\r\n\r\nCarstensen\u2019s third criterion was to determine which parcels were situated within areas zoned for commercial development. This would be simple if authorized land uses were included as an attribute in the community\u2019s property parcel database. This is unlikely to be the case, however, since zoning and taxation are the responsibilities of different agencies. Typically, parcels and land use zones exist as separate paper maps. If the maps were prepared at the same scale, and if they accounted for the shape of the Earth in the same manner, then they could be superimposed one over another on a light table. If the maps let enough light through, parcels located within commercial zones could be identified.\r\n\r\nThe GIS approach to a task like this begins by digitizing the paper maps, and by producing corresponding attribute data files. Each digital map and attribute data file is stored in the GIS separately, like separate map <strong>layers<\/strong>. A fully functional GIS would then be used to perform a\u00a0<strong>spatial intersection<\/strong>\u00a0that is analogous to the overlay of the paper maps. 
Spatial intersection, otherwise known as\u00a0<strong>map overlay<\/strong>, is one of the defining capabilities of GIS.\r\n\r\n<img alt=\"Map of property parcels within one mile buffer of a highway in Ontario California\" src=\"http:\/\/opentextbc.ca\/natureofgeographicinformation\/wp-content\/uploads\/sites\/19\/2014\/01\/ont_ca_buffer.gif\" \/>\r\n\r\nThe cartographic result of a buffer operation identifying all property parcels located within a specified distance of a specified type of highway. (City of Ontario, CA, GIS Department. Used by permission.)\r\n\r\nAnother of Carstensen\u2019s criteria was to identify parcels located within one mile of a heavy-duty highway. Such a task requires a digital map and associated attributes produced in such a way as to allow heavy-duty highways to be differentiated from other geographic entities. Once the necessary database is in place, a <strong>buffer<\/strong>\u00a0operation can be used to create a polygon feature whose perimeter surrounds all \u201cheavy duty highway\u201d features at the specified distance. A spatial intersection is then performed, isolating the parcels within the buffer from those outside the buffer.\r\n\r\nTo produce a final list of parcels that meet all the site selection criteria, the GIS analyst might perform an <strong>intersection<\/strong>\u00a0operation that creates a new file containing only those records that are present in all the other intermediate results.\r\n\r\n<img alt=\"Map showing parcels that meet all search criteria in Ontario California\" src=\"http:\/\/opentextbc.ca\/natureofgeographicinformation\/wp-content\/uploads\/sites\/19\/2014\/01\/ont_ca_final.gif\" \/>\r\n\r\nThe cartographic result of the intersection of the above three figures. Only the parcels shown in this map satisfy all of the site selection criteria. (City of Ontario, CA, GIS Department. 
Used by permission.)\r\n\r\nI created the maps shown above in 1998 using the Geographic Information Web Server of the City of Ontario, California. Although it is no longer supported, the server was one of the first of its kind to provide much of the functionality required to perform a site suitability analysis online. Today, many local governments offer similar Internet map services to current and prospective taxpayers.\r\n<h3><strong>TRY THIS<\/strong><\/h3>\r\nFind an online site selection utility similar to the one formerly provided by the City of Ontario. Registered Penn State students can post a comment to this page describing the site\u2019s functionality, and comparing it with the capabilities of the example illustrated above.\r\n<h2>1.12. Geographic Information Science and Technology<\/h2>\r\nSo far in this chapter I\u2019ve tried to make sense of GIS in relation to several information technologies, including database management, computer-aided design, and mapping systems. At this point I\u2019d like to expand the discussion to consider GIS as one element in a much larger field of study called \u201cGeographic Information Science and Technology\u201d (GIS&amp;T). 
As shown in the following illustration, GIS&amp;T encompasses three subfields:\r\n<ul>\r\n \t<li><strong>Geographic Information Science<\/strong>, the multidisciplinary research enterprise that addresses the nature of geographic information and the application of geospatial technologies to basic scientific questions;<\/li>\r\n \t<li><strong>Geospatial Technology<\/strong>, the specialized set of information technologies that support acquisition, management, analysis, and visualization of geo-referenced data, including the Global Navigation Satellite System (GPS and others), satellite, airborne, and shipboard remote sensing systems; and GIS and image analysis software tools; and<\/li>\r\n \t<li><strong>Applications of GIS&amp;T,<\/strong>\u00a0the increasingly diverse uses of geospatial technology in government, industry, and academia. This is the subfield in which most GIS professionals work.<\/li>\r\n<\/ul>\r\nArrows in the diagram below reflect relationships among the three subfields, as well as to numerous other fields, including Geography, Landscape Architecture, Computer Science, Statistics, Engineering, and many others. Each of these fields has influenced, and some have been influenced by, the development of GIS&amp;T. It is important to note that these fields and subfields do not neatly correspond with professions like GIS analyst, photogrammetrist, or land surveyor. Rather, GIS&amp;T is a\u00a0<em>nexus\u00a0<\/em>of overlapping professions that differ in backgrounds, disciplinary allegiances, and regulatory status.\r\n\r\n<img alt=\"Diagram showing components of the field of Geographic Information Science and Technology and its relations to other fields.\" src=\"http:\/\/opentextbc.ca\/natureofgeographicinformation\/wp-content\/uploads\/sites\/19\/2014\/01\/BoK2006_Fig1_Domains_18Feb.jpg\" \/>\r\n\r\nThe field of Geographic Information Science and Technology (GIS&amp;T) and its relations to other fields. 
Two-way relations that are half-dashed represent asymmetrical contributions between allied fields. (\u00a9 2006 Association of American Geographers and University Consortium for Geographic Information Science. Used by permission. All rights reserved.)\r\n\r\nThe illustration above first appeared in the\u00a0<em>Geographic Information Science and Technology Body of Knowledge\u00a0<\/em>(DiBiase, DeMers, Johnson, Kemp, Luck, Plewe, and Wentz, 2006), published by the University Consortium for Geographic Information Science (UCGIS) and the Association of American Geographers (AAG) in 2006. The\u00a0<em>Body of Knowledge<\/em>\u00a0is a community-developed inventory of the knowledge and skills that define the GIS&amp;T field.\u00a0Like the bodies of knowledge developed in Computer Science and other fields, the<em>\u00a0GIS&amp;T BoK<\/em>\u00a0represents the GIS&amp;T knowledge domain as a hierarchical list of knowledge areas, units, topics, and educational objectives. The ten knowledge areas and 73 units that make up the first edition are shown in the table below. Twenty-six \u201ccore\u201d units (those in which all graduates of a degree or certificate program should be able to demonstrate some level of mastery) are shown in bold type. Not shown are the 329 topics that make up the units, or the 1,660 education objectives by which topics are defined. These appear in the full text of the\u00a0<em>GIS&amp;T BoK.\u00a0<\/em>Unfortunately, the full text is not freely available online. An important related work produced by the U.S. Department of Labor is, however. We\u2019ll take a look at that shortly.\r\n<h3>KNOWLEDGE AREAS AND UNITS COMPRISING THE 1ST EDITION OF THE GIS&amp;T BOK<\/h3>\r\n<strong>-Knowledge Area AM. 
Analytical Methods<\/strong>\r\n-Unit AM1 Academic and analytical origins\r\n-Unit AM2 Query operations and query languages\r\n<strong>-Unit AM3 Geometric measures\r\n-Unit AM4 Basic analytical operations\r\n-Unit AM5 Basic analytical methods<\/strong>\r\n-Unit AM6 Analysis of surfaces\r\n-Unit AM7 Spatial statistics\r\n-Unit AM8 Geostatistics\r\n-Unit AM9 Spatial regression and econometrics\r\n-Unit AM10 Data mining\r\n-Unit AM11 Network analysis\r\n-Unit AM12 Optimization and location-allocation modeling\r\n\r\n<strong>-Knowledge Area CF. Conceptual Foundations<\/strong>\r\n-Unit CF1 Philosophical foundations\r\n-Unit CF2 Cognitive and social foundations\r\n<strong>\u00a0 -Unit CF3 Domains of geographic information\r\n-Unit CF4 Elements of geographic information<\/strong>\r\n-Unit CF5 Relationships\r\n-Unit CF6 Imperfections in geographic information\r\n\r\n<strong>-Knowledge Area CV. Cartography and Visualization<\/strong>\r\n-Unit CV1 History and trends\r\n<strong>-Unit CV2 Data considerations\r\n-Unit CV3 Principles of map design<\/strong>\r\n-Unit CV4 Graphic representation techniques\r\n-Unit CV5 Map production\r\n<strong>-Unit CV6 Map use and evaluation<\/strong>\r\n\r\n<strong>-Knowledge Area DA. Design Aspects<\/strong>\r\n-Unit DA1 The scope of GI S&amp;T system design\r\n-Unit DA2 Project definition\r\n-Unit DA3 Resource planning\r\n<strong>-Unit DA4 Database design<\/strong>\r\n-Unit DA5 Analysis design\r\n-Unit DA6 Application design\r\n-Unit DA7 System implementation\r\n\r\n<strong>-Knowledge Area DM. Data Modeling<\/strong>\r\n-Unit DM1 Basic storage and retrieval structures\r\n<strong>-Unit DM2 Database management systems\r\n-Unit DM3 Tessellation data models\r\n-Unit DM4 Vector and object data models<\/strong>\r\n-Unit DM5 Modeling 3D, temporal, and uncertain phenomena\r\n\r\n<strong>-Knowledge Area DN. 
Data Manipulation<\/strong>\r\n<strong>-Unit DN1 Representation transformation\r\n-Unit DN2 Generalization and aggregation<\/strong>\r\n-Unit DN3 Transaction management of geospatial data\r\n\r\n<strong>-Knowledge Area GC. Geocomputation<\/strong>\r\n-Unit GC1 Emergence of geocomputation\r\n-Unit GC2 Computational aspects and neurocomputing\r\n-Unit GC3 Cellular Automata (CA) models\r\n-Unit GC4 Heuristics\r\n-Unit GC5 Genetic algorithms (GA)\r\n-Unit GC6 Agent-based models\r\n-Unit GC7 Simulation modeling\r\n-Unit GC8 Uncertainty\r\n-Unit GC9 Fuzzy sets\r\n\r\n<strong>-Knowledge Area GD. Geospatial Data<\/strong>\r\n-<strong>Unit GD1 Earth geometry\r\n-<\/strong>Unit GD2 Land partitioning systems\r\n<strong>\u00a0 -Unit GD3 Georeferencing systems\r\n-Unit GD4 Datums\r\n-Unit GD5 Map projections\r\n-Unit GD6 Data quality\r\n-Unit GD7 Land surveying and GPS\r\n<\/strong>-Unit GD8 Digitizing\r\n-Unit GD9 Field data collection\r\n<strong>\u00a0 -Unit GD10 Aerial imaging and photogrammetry\r\n-Unit GD11 Satellite and shipboard remote sensing\r\n-Unit GD12 Metadata, standards, and infrastructures<\/strong>\r\n\r\n<strong>-Knowledge Area GS. GIS&amp;T and Society<\/strong>\r\n-Unit GS1 Legal aspects\r\n-Unit GS2 Economic aspects\r\n-Unit GS3 Use of geospatial information in the public sector\r\n-Unit GS4 Geospatial information as property\r\n-Unit GS5 Dissemination of geospatial information\r\n<strong>-Unit GS6 Ethical aspects of geospatial information and technology<\/strong>\r\n-Unit GS7 Critical GIS\r\n\r\n<strong>-Knowledge Area OI. 
Organizational and Institutional Aspects<\/strong>\r\n-Unit OI1 Origins of GI S&amp;T\r\n-Unit OI2 Managing the GI system operations and infrastructure\r\n-Unit OI3 Organizational structures and procedures\r\n-Unit OI4 GI S&amp;T workforce themes\r\n<strong>-Unit OI5 Institutional and inter-institutional aspects\r\n-Unit OI6 Coordinating organizations (national and international)<\/strong>\r\n\r\nTen knowledge areas and 73 units comprising the 1st edition of the GIS&amp;T BoK. Core units are indicated with bold type.\u00a0 (\u00a9 2006 Association of American Geographers and University Consortium for Geographic Information Science. Used by permission. All rights reserved.)\r\n\r\nNotice that the knowledge area that includes the most core units is GD: Geospatial Data. This course focuses on the sources and distinctive characteristics of geographic data. This is one part of the knowledge base that most successful geospatial professionals possess. The Department of Labor\u2019s Geospatial Technology Competency Model (GTCM) highlights this and other essential elements of the geospatial knowledge base. We\u2019ll consider it next.\r\n<h2>1.13. Geospatial Competencies and Our Curriculum<\/h2>\r\nA body of knowledge is one way to think about the GIS&amp;T field. Another way is as an industry made up of agencies and firms that produce and consume goods and services, generate sales and (sometimes) profits, and employ people. In 2003, the U.S. Department of Labor (DoL) identified \u201cgeospatial technology\u201d as one of 14 \u201chigh growth\u201d technology industries, along with biotech, nanotech, and others. However, the DoL also observed that the geospatial technology industry was ill-defined, and poorly understood by the public.\r\n\r\nSubsequent efforts by the DoL and other organizations helped to clarify the industry\u2019s nature and scope. 
Following a series of \u201croundtable\u201d discussions involving industry thought leaders, the Geospatial Information Technology Association (GITA) and the Association of American Geographers (AAG) submitted the following \u201cconsensus\u201d definition to DoL in 2006:\r\n\r\nThe geospatial industry acquires, integrates, manages, analyzes, maps, distributes, and uses geographic, temporal, and spatial information and knowledge. The industry includes basic and applied research, technology development, education, and applications to address the planning, decision making, and operational needs of people and organizations of all types.\r\n\r\nIn addition to the proposed industry definition, the GITA and AAG report recommended that DoL establish additional occupations in recognition of geospatial industry workforce activities and needs. At the time, the existing geospatial occupations included only Surveyors, Surveying Technicians, Mapping Technicians, and Cartographers and Photogrammetrists. Late in 2009, with input from the GITA, AAG, and other stakeholders, the DoL established six new geospatial occupations: Geospatial Information Scientists and Technologists, Geographic Information Systems Technicians, Remote Sensing Scientists and Technologists, Remote Sensing Technicians, Precision Agriculture Technicians, and Geodetic Surveyors.\r\n<h3><strong>TRY THIS<\/strong><\/h3>\r\nInvestigate the geospatial occupations at the\u00a0<a href=\"http:\/\/www.onetonline.org\/\">U.S. Department of Labor\u2019s\u00a0 \u201cO*Net\u201d database<\/a>. Enter \u201cgeospatial\u201d in the search field named \u201cOccupation Quick Search.\u201d Follow links to occupation descriptions. Note the estimates for 2008 employment and employment growth through 2018. 
Also note that, for some anomalous reason, the keyword \u201cgeospatial\u201d is not associated with the occupation \u201cGeodetic Surveyor.\u201d\r\n\r\n<img alt=\"Screen capture of Department of Labor's O-Net site\" src=\"http:\/\/opentextbc.ca\/natureofgeographicinformation\/wp-content\/uploads\/sites\/19\/2014\/01\/o-net.png\" \/>\r\n\r\nMeanwhile, DoL commenced a \u201ccompetency modeling\u201d initiative for high-growth industries in 2005. Their goal was to help educational institutions like ours meet the demand for qualified technology workers by identifying what workers need to know and be able to do. At DoL, a\u00a0<em>competency<\/em>\u00a0is \u201cthe capability to apply or use a set of related knowledge, skills, and abilities required to successfully perform \u2018critical work functions\u2019 or tasks in a defined work setting\u201d (Ennis 2008). A\u00a0<em>competency model<\/em>\u00a0is \u201ca collection of competencies that together define successful performance in a particular work setting.\u201d\r\n\r\nWorkforce analysts at DoL began work on a Geospatial Technology Competency Model (GTCM) in 2005. Building on their research, a panel of accomplished practitioners and educators produced a complete draft of the GTCM, which they subsequently revised in response to public comments. Published in June 2010, the GTCM identifies the competencies that characterize successful workers in the geospatial industry. In contrast to the\u00a0<em>GIS&amp;T Body of Knowledge,<\/em>\u00a0an academic project meant to define the nature and scope of the field, the GTCM is an industry specification that defines what individual workers and students should aspire to know and learn.\r\n<h3><strong>TRY THIS<\/strong><\/h3>\r\nExplore the\u00a0<a href=\"http:\/\/www.careeronestop.org\/CompetencyModel\/\">Geospatial Technology Competency Model (GTCM)<\/a>\u00a0at the U.S. Department of Labor\u2019s Competency Model Clearinghouse. 
Under \u201cIndustry Competency Models,\u201d follow the link \u201cGeospatial Technology.\u201d There, the pyramid (as shown below) is an image map which you can click to reveal the various competencies. The complete GTCM is also available as a Word doc and PDF file.\r\n\r\n&nbsp;\r\n\r\n<img alt=\"Screen capture of the Department of Labor's Geospatial Technology Competency Model site\" src=\"http:\/\/opentextbc.ca\/natureofgeographicinformation\/wp-content\/uploads\/sites\/19\/2014\/01\/gtcm.png\" \/>\r\n\r\n&nbsp;\r\n\r\nThe GTCM specifies several \u201ctiers\u201d of competencies, progressing from general to occupationally specific. Tiers 1 through 3 (the gray and red layers), called Foundation Competencies, specify general workplace behaviors and knowledge that successful workers in most industries exhibit. Tiers 4 and 5 (yellow) include the distinctive technical competencies that characterize a given industry and its three sectors: Positioning and Data Acquisition, Analysis and Modeling, and Programming and Application Development. Above Tier 5 are additional Tiers corresponding to the occupation-specific competencies and requirements that are specified in the occupation descriptions published at O*NET Online, and in a Geospatial Management Competency Model that is in development as of January, 2012.\r\n\r\nOne way educational institutions and students can use the GTCM is as a guideline for assessing how well curricula align with workforce needs. The Penn State Online GIS program conducted such an assessment in 2011. Results appear in the spreadsheet linked below.\r\n<h3><strong>TRY THIS<\/strong><\/h3>\r\nOpen the\u00a0<a href=\"http:\/\/opentextbc.ca\/natureofgeoinfo\/files\/geog482\/image\/GTCM_assessment_Penn_State_2011.xlsx\">attached Excel spreadsheet<\/a>\u00a0to see how our Penn State Online GIS curricula address workforce needs identified in the GTCM.\r\n\r\nThe sheet will open on a cover page. 
At the bottom of the sheet are\u00a0<strong>tabs<\/strong>\u00a0that correspond to Tiers 1-5 of the GTCM. Click the tabs to view the worksheet associated with the Tier you want to see.\r\n\r\nIn each Tier worksheet,\u00a0<strong>rows<\/strong>\u00a0correspond to the GTCM competencies.<strong>\u00a0Columns<\/strong>\u00a0correspond to the Penn State Online courses included in the assessment. Courses that are required for most students are highlighted light blue. Course authors and instructors were asked to state what students actually do in relation to each of the GTCM competencies. Use the\u00a0<strong>scroll bar<\/strong>\u00a0at the bottom right edge of the sheet to reveal more courses.\r\n\r\nOpen the\u00a0<a title=\"GTCM assessment of Penn State Online GIS curriculum\" href=\"http:\/\/opentextbc.ca\/natureofgeoinfo\/files\/geog482\/image\/gtcm_spreadsheet_demo.swf\">attached Flash movie<\/a>\u00a0to view a video demonstration of how to navigate the spreadsheet.\r\n\r\nBy studying this spreadsheet you\u2019ll gain insight into how individual courses, and the Penn State Online curriculum as a whole, relate to geospatial workforce needs. If you\u2019re interested in comparing ours to curricula at other institutions, ask if they\u2019ve conducted a similar assessment. If they haven\u2019t, ask why not.\r\n\r\nFinally, don\u2019t forget that you can preview much of our online courseware through our\u00a0<a href=\"http:\/\/open.ems.psu.edu\/\">Open Educational Resources initiative<\/a>.\r\n<h2>1.14. Distinguishing Properties of Geographic Data<\/h2>\r\nThe claim that geographic information science is a distinct field of study implies that spatial data are somehow special data. Goodchild (1992) points out several distinguishing properties of geographic information. I have paraphrased four such properties below. 
Understanding them, and their implications for the practice of geographic information science, is a key objective of this course.\r\n<ol>\r\n \t<li>Geographic data represent spatial locations and non-spatial attributes measured at certain times.<\/li>\r\n \t<li>Geographic space is continuous.<\/li>\r\n \t<li>Geographic space is nearly spherical.<\/li>\r\n \t<li>Geographic data tend to be spatially dependent.<\/li>\r\n<\/ol>\r\nLet\u2019s consider each of these properties next.\r\n<h2>1.15. Locations and Attributes<\/h2>\r\n<strong>Geographic data represent spatial locations and non-spatial attributes measured at certain times.<\/strong> Goodchild (1992, p. 33) observes that \u201ca spatial database has dual keys, allowing records to be accessed either by attributes or by locations.\u201d Dual keys are not unique to geographic data, but \u201cthe spatial key is distinct, as it allows operations to be defined which are not included in standard query languages.\u201d In the intervening years, software developers have created variations on SQL that incorporate spatial queries. The dynamic nature of geographic phenomena complicates the issue further, however. The need to pose spatio-temporal queries challenges geographic information scientists (GIScientists) to develop ever more sophisticated ways to represent geographic phenomena, thereby enabling analysts to interrogate their data in ever more sophisticated ways.\r\n<h2>1.16. Continuity<\/h2>\r\n<strong>Geographic space is continuous.<\/strong> Although dual keys are not unique to geographic data, one property of the spatial key is. \u201cWhat distinguishes spatial data is the fact that the spatial key is based on two continuous dimensions\u201d (Goodchild, 1992, p. 33). \u201cContinuous\u201d refers to the fact that there are no gaps in the Earth\u2019s surface. Canyons, crevasses, and even caverns notwithstanding, there is no position on or near the surface of the Earth that cannot be fixed within some sort of coordinate system grid. 
Nor is there any theoretical limit to how exactly a position can be specified. Given the precision of modern positioning technologies, the number of unique point positions that could be used to define a geographic entity is practically infinite. Because it\u2019s not possible to measure, let alone to store, manage, and process, an infinite amount of data,\u00a0<strong>all geographic data are selective, generalized, and approximate<\/strong>. Furthermore,\u00a0<strong>the larger the territory covered by a geographic database, the more generalized the database tends to be<\/strong>.\r\n\r\n&nbsp;\r\n\r\n&nbsp;\r\n\r\nGeographic data are generalized according to scale. Click on the buttons beneath the map to zoom in and out on the town of Gorham. (U.S. Geological Survey). (<strong>Note:<\/strong>\u00a0You will need to have the Adobe Flash player installed in order to complete this exercise. If you do not already have the Flash player, you can\u00a0<a href=\"http:\/\/www.adobe.com\/shockwave\/download\/index.cgi?P1_Prod_Version=ShockwaveFlash\">download it for free from Adobe<\/a>.)\r\n\r\n&nbsp;\r\n\r\nFor example, the illustration above shows a town called Gorham depicted on three different\u00a0<strong>topographic maps<\/strong>\u00a0produced by the United States Geological Survey. Gorham occupies a smaller space on the small-scale (1:250,000) map than it does at 1:62,500 or at 1:24,000. But the relative size of the feature isn\u2019t the only thing that changes. Notice that the shape of the feature that represents the town changes also, as do the number of features and the amount of detail shown within the town boundary and in the surrounding area. The name for this characteristically parallel decline in map detail and map scale is\u00a0<strong>generalization<\/strong>.\r\n\r\nIt is important to realize that generalization occurs not only on printed maps, but in digital databases as well. 
It is possible to represent phenomena with highly detailed features (whether they be made up of high-resolution raster grid cells or very many point locations) in a single\u00a0<strong>scale-independent<\/strong>\u00a0database. In practice, however, highly detailed databases are not only extremely expensive to create and maintain, but they also bog down information systems when used in analyses of large areas. For this reason, geographic databases are usually created at several scales, with different levels of detail captured for different intended uses.\r\n<h2>1.17. Nearly Spherical<\/h2>\r\n<strong>Geographic space is nearly spherical.<\/strong>\u00a0The fact that the Earth is nearly, but not quite, a sphere poses some surprisingly complex problems for those who wish to specify locations precisely.\r\n\r\n<img alt=\"World map showing the differences in elevation between a geoid and a reference ellipsoid.\" src=\"http:\/\/opentextbc.ca\/natureofgeographicinformation\/wp-content\/uploads\/sites\/19\/2014\/01\/geoid_map.jpg\" \/>\r\n\r\nDifferences in elevation between a geoid model and a reference ellipsoid. Deviations range from a high of 75 meters (colored red, over New Guinea) to a low of -104 meters (colored purple, in the Indian Ocean). (National Geodetic Survey, 1997).\r\n\r\nThe\u00a0<strong>geographic coordinate system<\/strong>\u00a0of latitude and longitude coordinates provides a means to define positions on a sphere. Inaccuracies that are unacceptable for some applications creep in, however, when we confront the Earth\u2019s \u201cactual\u201d irregular shape, which is called the\u00a0<strong>geoid<\/strong>. 
Furthermore, the calculations of angles and distances that surveyors and others need to perform routinely are cumbersome with\u00a0<strong>spherical coordinates<\/strong>.\r\n\r\nThat consideration, along with the need to depict the Earth on flat pieces of paper, compels us to transform the globe into a plane, and to specify locations in\u00a0<strong>plane coordinates<\/strong>\u00a0instead of spherical coordinates. The set of mathematical transformations by which spherical locations are converted to locations on a plane\u2013called\u00a0<strong>map projections<\/strong>\u2013all lead inevitably to one or another form of inaccuracy.\r\n\r\nAll this is trouble enough, but we encounter even more difficulties when we seek to define \u201cvertical\u201d positions (elevations) in addition to \u201chorizontal\u201d positions. Perhaps it goes without saying that an elevation is the height of a location above some\u00a0<strong>datum<\/strong>, such as mean sea level. Unfortunately, to be suitable for precise positioning, a datum must correspond closely with the Earth\u2019s actual shape. This brings us back again to the problem of the geoid.\r\n\r\nWe will consider these issues in greater depth in Chapter 2. For now, suffice it to say that geographic data are unique in having to represent phenomena that are distributed on a continuous and nearly spherical surface.\r\n<h2>1.18. Spatial Dependency<\/h2>\r\n<strong>Geographic data tend to be spatially dependent<\/strong>. Spatial dependence is \u201cthe propensity for nearby locations to influence each other and to possess similar attributes\u201d (Goodchild, 1992, p. 33). In other words, to paraphrase a famous geographer named Waldo Tobler, while everything is related to everything else, things that are close together tend to be more related than things that are far apart. 
Terrain elevations, soil types, and surface air temperatures, for instance, are more likely to be similar at points two meters apart than at points two kilometers apart. A statistical measure of the similarity of attributes of point locations is called\u00a0<strong>spatial autocorrelation<\/strong>.\r\n\r\nGiven that geographic data are expensive to create, spatial dependence turns out to be a very useful property. We can sample attributes at a limited number of locations, then estimate the attributes of intermediate locations. The process of estimating unknown values from nearby known values is called\u00a0<strong>interpolation<\/strong>. Interpolated values are reliable only to the extent that the spatial dependence of the phenomenon can be assumed. If we were unable to assume some degree of spatial dependence, it would be impossible to represent continuous geographic phenomena in digital form.\r\n<h3><strong>PRACTICE QUIZ<\/strong><\/h3>\r\n<h2>1.19. Geographic Data and Geographic Questions<\/h2>\r\nThe ultimate objective of all geospatial data and technologies, after all, is to produce knowledge. Most of us are interested in data only to the extent that they can be used to help understand the world around us, and to make better decisions. Decision-making processes vary a lot from one organization to another. In general, however, the first steps in making a decision are to articulate the questions that need to be answered, and to gather and organize the data needed to answer the questions (Nyerges &amp; Golledge, 1997).\r\n\r\nGeographic data and information technologies can be very effective in helping to answer certain kinds of questions. The expensive, long-term investments required to build and sustain GIS infrastructures can be justified only if the questions that confront an organization can be stated in terms that GIS is equipped to answer. 
As a specialist in the field, you may be expected to advise clients and colleagues on the strengths and weaknesses of GIS as a decision support tool. What follows are examples of the kinds of questions that are amenable to GIS analyses, along with questions that GIS is not so well suited to help answer.\r\n<h3>QUESTIONS CONCERNING INDIVIDUAL GEOGRAPHIC ENTITIES<\/h3>\r\nThe simplest geographic questions pertain to individual entities. Such questions include:\r\n<h4><strong>QUESTIONS ABOUT SPACE<\/strong><\/h4>\r\n<ul>\r\n \t<li>Where is the entity located?<\/li>\r\n \t<li>What is its extent?<\/li>\r\n<\/ul>\r\n<h4><strong>QUESTIONS ABOUT ATTRIBUTES<\/strong><\/h4>\r\n<ul>\r\n \t<li>What are the attributes of the entity located there?<\/li>\r\n \t<li>Do its attributes match one or more criteria?<\/li>\r\n<\/ul>\r\n<h4><strong>QUESTIONS ABOUT TIME<\/strong><\/h4>\r\n<ul>\r\n \t<li>When were the entity\u2019s location, extent, or attributes measured?<\/li>\r\n \t<li>Has the entity\u2019s location, extent, or attributes changed over time?<\/li>\r\n<\/ul>\r\nSimple questions like these can be answered effectively with a good printed map, of course. GIS becomes increasingly attractive as the number of people asking the questions grows, especially if they lack access to the required paper maps.\r\n<h3>QUESTIONS CONCERNING MULTIPLE GEOGRAPHIC ENTITIES<\/h3>\r\nHarder questions arise when we consider relationships among two or more entities. 
For instance, we can ask:\r\n<h4><strong>QUESTIONS ABOUT SPATIAL RELATIONSHIPS<\/strong><\/h4>\r\n<ul>\r\n \t<li>Do the entities contain one another?<\/li>\r\n \t<li>Do they overlap?<\/li>\r\n \t<li>Are they connected?<\/li>\r\n \t<li>Are they situated within a certain distance of one another?<\/li>\r\n \t<li>What is the best route from one entity to the others?<\/li>\r\n \t<li>Where are entities with similar attributes located?<\/li>\r\n<\/ul>\r\n<h4><strong>QUESTIONS ABOUT ATTRIBUTE RELATIONSHIPS<\/strong><\/h4>\r\n<ul>\r\n \t<li>Do the entities share attributes that match one or more criteria?<\/li>\r\n \t<li>Are the attributes of one entity influenced by changes in another entity?<\/li>\r\n<\/ul>\r\n<h4><strong>QUESTIONS ABOUT TEMPORAL RELATIONSHIPS<\/strong><\/h4>\r\n<ul>\r\n \t<li>Have the entities\u2019 locations, extents, or attributes changed over time?<\/li>\r\n<\/ul>\r\nGeographic data and information technologies are very well suited to answering moderately complex questions like these. GIS is most valuable to large organizations that need to answer such questions often.\r\n<h3>QUESTIONS THAT GIS IS NOT PARTICULARLY GOOD AT ANSWERING<\/h3>\r\nHarder still, however, are\u00a0<strong>explanatory questions<\/strong>\u2013such as\u00a0<em>why<\/em>\u00a0entities are located where they are,\u00a0<em>why<\/em>\u00a0they have the attributes they do, and\u00a0<em>why<\/em>\u00a0they have changed as they have. In addition, organizations are often concerned with\u00a0<strong>predictive questions<\/strong>\u2013such as what will happen at\u00a0<em>this<\/em>\u00a0location if thus-and-so happens at\u00a0<em>that<\/em>\u00a0location? In general, commercial GIS software packages cannot be expected to provide clear-cut answers to explanatory and predictive questions right out of the box. Typically, analysts must turn to specialized statistical packages and simulation routines. Information produced by these analytical tools may then be re-introduced into the GIS database, if necessary. 
Research and development efforts intended to more tightly couple analytical software with GIS software are underway within the GIScience community. It is important to keep in mind that decision support tools like GIS are no substitutes for human experience, insight, and judgment.\r\n\r\nAt the outset of the chapter I suggested that producing information by analyzing data is something like producing energy by burning coal. In both cases, technology is used to realize the potential value of a raw material. Also in both cases, the production process yields some undesirable by-products. Similarly, in the process of answering certain geographic questions, GIS tends to raise others, such as:\r\n<ul>\r\n \t<li>Given the intrinsic imperfections of the data, how reliable are the results of the GIS analysis?<\/li>\r\n \t<li>Does the information produced through GIS analysis tend to systematically benefit some constituent groups at the expense of others?<\/li>\r\n \t<li>Should the data used to make the decision be made public?<\/li>\r\n \t<li>Does the use of GIS affect the organization\u2019s decision-making processes in ways that are beneficial to its management, its employees, and its customers?<\/li>\r\n<\/ul>\r\nAs is the case in so many endeavors, the answer to a geographic question usually includes more questions.\r\n<h3><strong>TRY THIS<\/strong><\/h3>\r\nCan you cite an example of a \u201chard\u201d question that you and your GIS have been called upon to address? Registered Penn State students can post a comment directly to this page.\r\n<h2>1.20. Summary<\/h2>\r\nIt\u2019s a truism among specialists in geographic information that the lion\u2019s share of the cost of most GIS projects is associated with the development and maintenance of a suitable database. 
It seems appropriate, therefore, that our first course in geographic information systems should focus upon the properties of geographic data.\r\n\r\nI began this first chapter by defining data in a generic sense, as sets of symbols that represent measurements of phenomena. I suggested that data are the raw materials from which information is produced. Information systems, such as database management systems, are technologies that people use to transform data into the information needed to answer questions, and to make decisions.\r\n\r\nSpatial data are special data. They represent the locations, extents, and attributes of objects and phenomena that make up the Earth\u2019s surface at particular times. Geographic data differ from other kinds of data in that they are distributed along a continuous, nearly spherical globe. They also have the unique property that the closer two entities are located, the more likely they are to share similar attributes.\r\n\r\nGIS is a special kind of information system that combines the capabilities of database management systems with those of mapping systems. GIS is one object of study of the loosely-knit, multidisciplinary field called Geographic Information Science and Technology. GIS is also a profession\u2013one of several that make up the geospatial industry. As Yogi Berra said, \u201cIn theory, there\u2019s no difference between theory and practice. In practice there is.\u201d In the chapters and projects that follow, we\u2019ll investigate the nature of geographic information from both conceptual and practical points of view.\r\n<h3>COMMENTS AND QUESTIONS<\/h3>\r\nRegistered students are welcome to post comments, questions, and replies to questions about the text. Particularly welcome are anecdotes that relate the chapter text to your personal or professional experience. 
In addition, there are discussion forums available in the ANGEL course management system for comments and questions about topics that you may not wish to share with the whole world.\r\n\r\nTo post a comment, scroll down to the text box under \u201cPost new comment\u201d and begin typing in the text box, or you can choose to reply to an existing thread. When you are finished typing, click on either the \u201cPreview\u201d or \u201cSave\u201d button (Save will actually submit your comment). Once your comment is posted, you will be able to edit or delete it as needed. In addition, you will be able to reply to other posts at any time.\r\n\r\nNote: the first few words of each comment become its \u201ctitle\u201d in the thread.\r\n<h2>1.21. Bibliography<\/h2>\r\nCarstensen, L. W. (1986). Regional land information systems development using relational databases and geographic information systems.\u00a0<em>Proceedings of the AutoCarto<\/em>, London, 507-516.\r\n\r\nCity of Ontario, California. (n.d.).\u00a0<em>Geographic information web server<\/em>. Retrieved on July 6, 1999 from\u00a0<a href=\"http:\/\/www.ci.ontario.ca.us\/gis\/index.asp\">http:\/\/www.ci.ontario.ca.us\/gis\/index.asp<\/a>\u00a0(since retired).\r\n\r\nCowen, D. J. (1988). GIS versus CAD versus DBMS: What are the differences?\u00a0<em>Photogrammetric Engineering and Remote Sensing<\/em>\u00a054:11, 1551-1555.\r\n\r\nDiBiase, D. and twelve others (2010).\u00a0<a href=\"http:\/\/opentextbc.ca\/natureofgeographicinformation\/chapter\/files\/sites\/file\/DiBiase_etal_2010_GTCM_URISA_Journal.pdf\">The New Geospatial Technology Competency Model: Bringing workforce needs into focus<\/a>.\u00a0<em>URISA Journal<\/em>\u00a022:2, 55-72.\r\n\r\nDiBiase, D., M. DeMers, A. Johnson, K. Kemp, A. Luck, B. Plewe, and E. 
Wentz (2007).\u00a0<a href=\"http:\/\/opentextbc.ca\/natureofgeographicinformation\/chapter\/files\/sites\/file\/BoK_CaGIS_2007.pdf\">Introducing the First Edition of the\u00a0<em>GIS&amp;T Body of Knowledge<\/em><\/a>.\u00a0<em>Cartography and Geographic Information Science,<\/em>\u00a034(2), pp. 113-120. U.S. National Report to the International Cartographic Association.\r\n\r\nEnnis, M. R. (2008). Competency models: A review of the literature and the role of the employment and training administration (ETA).\u00a0<a href=\"http:\/\/www.careeronestop.org\/COMPETENCYMODEL\/info_documents\/OPDRLiteratureReview.pdf\">http:\/\/www.careeronestop.org\/COMPETENCYMODEL\/info_documents\/OPDRLiteratureReview.pdf<\/a>.\r\n\r\nGITA and AAG (2006). Defining and communicating geospatial industry workforce demand: Phase I report.\r\n\r\nGoodchild, M. (1992). Geographical information science.\u00a0<em>International Journal of Geographical Information Systems<\/em>\u00a06:1, 31-45.\r\n\r\nGoodchild, M. (1995). GIS and geographic research. In J. Pickles (Ed.),\u00a0<em>Ground truth: the social implications of geographic information systems<\/em>\u00a0(pp. of chapter). New York: Guilford.\r\n\r\nNational Decision Systems.\u00a0<em>A zip code can make your company lots of money!<\/em>\u00a0Retrieved on July 6, 1999 from\u00a0<a href=\"http:\/\/laguna.natdecsys.com\/lifequiz\">http:\/\/laguna.natdecsys.com\/lifequiz<\/a>\u00a0(since retired).\r\n\r\nNational Geodetic Survey. (1997).\u00a0<em>Image generated from 15\u2032x15\u2032 geoid undulations covering the planet Earth.\u00a0<\/em>Retrieved 1999, from\u00a0<a href=\"http:\/\/www.ngs.noaa.gov\/GEOID\/geo-index.html\">http:\/\/www.ngs.noaa.gov\/GEOID\/geo-index.html<\/a>\u00a0(since retired).\r\n\r\nNyerges, T. L. &amp; Golledge, R. G. (1997).\u00a0<em>NCGIA core curriculum in GIS<\/em>, National Center for Geographic Information and Analysis, University of California, Santa Barbara, Unit 007. 
Retrieved November 12, 1997, from\u00a0<a href=\"http:\/\/www.ncgia.ucsb.edu\/giscc\/units\/u007\/u007.html\">http:\/\/www.ncgia.ucsb.edu\/giscc\/units\/u007\/u007.html<\/a>\u00a0(since retired).\r\n\r\nUnited States Department of the Interior Geological Survey. (1977). [map]. 1:24 000. 7.5 minute series. Washington, D.C.: USDI.\r\n\r\nUnited States Geological Survey. \u201cBellefonte, PA Quadrangle\u201d (1971). [map]. 1:24 000. 7.5 minute series. Washington, D.C.: USGS.\r\n\r\nUniversity Consortium for Geographic Information Science. Retrieved April 26, 2006, from\u00a0<a href=\"http:\/\/www.ucgis.org\/\">http:\/\/www.ucgis.org<\/a>\r\n\r\nWilson, J. D. (2001). Attention data providers: A billion-dollar application awaits.\u00a0<em>GEOWorld<\/em>, February, 54.\r\n\r\nWorboys, M. F. (1995).\u00a0<em>GIS: A computing perspective<\/em>. London: Taylor and Francis.","rendered":"<h2>1.1. Overview<\/h2>\n<p>When I started writing this text in 1997, my office was across the street (and, fortunately, upwind) from Penn State&#8217;s power plant. The energy used to heat and cool my office is still produced there by burning coal mined from nearby ridges. Combustion transforms the potential energy stored in the coal into electricity, which solves the problem of an office that would otherwise be too cold or too warm. Unfortunately, the solution itself causes another problem, namely emissions of carbon dioxide and other more noxious substances into the atmosphere. Cleaner means of generating electricity exist, of course, but they too involve transforming energy from one form to another. 
And cleaner methods cost more than most of us are willing or able to pay.<\/p>\n<p>It seems to me that a coal-fired power plant is a pretty good analogy for a geographic information system. For that matter, GIS is comparable to any factory or machine that transforms a raw material into something more valuable. Data is grist for the GIS mill. GIS is like the machinery that transforms the data into the commodity&#8211;information&#8211;that is needed to solve problems or create opportunities. And the problems that the manufacturing process itself creates include uncertainties resulting from imperfections in the data, intentional or unintentional misuse of the machinery, and ethical issues related to what the information is used for, and who has access to it.<\/p>\n<p>This text explores the nature of geographic information. To study the nature of something is to investigate its essential characteristics and qualities. To understand the nature of the energy produced in a coal-fired power plant, one should study the properties, morphology, and geographic distribution of coal. By the same reasoning I believe that a good approach to understanding the information produced by GIS is to investigate the properties of geographic data and the technologies and institutions that produce it.<\/p>\n<h3>Objectives<\/h3>\n<p>The goal of Chapter 1 is to situate GIS in a larger enterprise known as Geographic Information Science and Technology (GIS&amp;T), and in what the U.S. 
Department of Labor calls the &#8220;geospatial industry.&#8221; In particular, students who successfully complete Chapter 1 should be able to:<\/p>\n<ol>\n<li>Define a geographic information system;<\/li>\n<li>Recognize and name basic database operations from verbal descriptions;<\/li>\n<li>Recognize and name basic approaches to geographic representation from verbal descriptions;<\/li>\n<li>Identify and explain at least three distinguishing properties of geographic data; and<\/li>\n<li>Outline the kinds of questions that GIS can help answer.<\/li>\n<\/ol>\n<h2>1.2. Checklist<\/h2>\n<p>The following checklist is for Penn State students who are registered for classes in which this text, and associated quizzes and projects in the ANGEL course management system, have been assigned. You may find it useful to print this page out first so that you can follow along with the directions.<\/p>\n<h3>Chapter 1 Checklist (for registered students only)<\/h3>\n<table summary=\"Tasks to be completed for the lesson.\">\n<caption>Chapter 1 Checklist<\/caption>\n<tbody>\n<tr>\n<th>Step<\/th>\n<th>Activity<\/th>\n<th>Access\/Directions<\/th>\n<\/tr>\n<tr>\n<th>1<\/th>\n<td><strong>Read<\/strong>\u00a0Chapter 1<\/td>\n<td>This is the second page of Chapter 1. Click on the links at the bottom of the page to continue or to return to the previous page, or to go to the top of the chapter. You can also navigate the text via the links in the GEOG 482 menu on the left.<\/td>\n<\/tr>\n<tr>\n<th>2<\/th>\n<td>Submit\u00a0<strong>quizzes<\/strong>\u00a0as you come across them in the chapter. Blue banners denote practice quizzes that are not graded. Red banners signal graded quizzes. (Note that Chapter 1 does not include a graded quiz.)<\/td>\n<td>Go to ANGEL &gt; [your course section] &gt; Lessons tab &gt; Chapter 1 folder &gt; [quiz]<\/td>\n<\/tr>\n<tr>\n<th>3<\/th>\n<td>Perform\u00a0<strong>&#8220;Try This&#8221; activities<\/strong>\u00a0as you come across them in the chapter. 
&#8220;Try This&#8221; activities are not graded.<\/td>\n<td>Instructions are provided for each activity.<\/td>\n<\/tr>\n<tr>\n<th>4<\/th>\n<td>\u00a0Read\u00a0<strong>comments and questions<\/strong>\u00a0posted by fellow students. Add comments and questions of your own, if any.<\/td>\n<td>\u00a0Comments and questions may be posted on any page of the text, or in a Chapter-specific discussion forum in ANGEL.<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<h2>\u00a01.3. Data<\/h2>\n<p>&#8220;After more than 30 years, we&#8217;re still confronted by the same major challenge that GIS professionals have always faced: You must have good data. And good data are expensive and difficult to create.&#8221; (Wilson, 2001, p. 54)<\/p>\n<p><strong>Data consist of symbols that represent measurements of phenomena.\u00a0<\/strong>People create and study data as a means to help understand how natural and social systems work. Such systems can be hard to study because they&#8217;re made up of many interacting phenomena that are often difficult to observe directly, and because they tend to change over time. We attempt to make systems and phenomena easier to study by measuring their characteristics at certain times. Because it&#8217;s not practical to measure everything, everywhere, at all times, we measure selectively. How accurately data reflect the phenomena they represent depends on how, when, where, and what aspects of the phenomena were measured. All measurements, however, contain a certain amount of error.<\/p>\n<p>Measurements of the locations and characteristics of phenomena can be represented with several different kinds of symbols. For example, pictures of the land surface, including photographs and maps, are made up of graphic symbols. Verbal descriptions of property boundaries are recorded on deeds using alphanumeric symbols. Locations determined by satellite positioning systems are reported as pairs of numbers called coordinates. 
As you probably know, all of these different types of data&#8211;pictures, words, and numbers&#8211;can be represented in computers in digital form. Obviously, digital data can be stored, transmitted, and processed much more efficiently than their physical counterparts that are printed on paper. These advantages set the stage for the development and widespread adoption of GIS.<\/p>\n<h2>1.4. Information<\/h2>\n<p><strong>Information is data that has been selected or created in response to a question.<\/strong>\u00a0For example, the location of a building or a route is data, until they are needed to dispatch an ambulance in response to an emergency. When used to inform those who need to know &#8220;where is the emergency, and what&#8217;s the fastest route between here and there?,&#8221; the data are transformed into information. The transformation involves the ability to ask the right kind of question, and the ability to retrieve existing data&#8211;or to generate new data from the old&#8211;that help people answer the question. The more complex the question, and the more locations involved, the harder it becomes to produce timely information with paper maps alone.<\/p>\n<p>Interestingly, the potential value of data is not necessarily lost when they are used. Data can be transformed into information again and again, provided that the data are kept up to date. Given the rapidly increasing accessibility of computers and communications networks in the U.S. and abroad, it&#8217;s not surprising that information has become a commodity, and that the ability to produce it has become a major growth industry.<\/p>\n<h2>1.5. 
Information Systems<\/h2>\n<p><strong>Information systems are computer-based tools that help people transform data into information.<\/strong><\/p>\n<p>As you know, many of the problems and opportunities faced by government agencies, businesses, and other organizations are so complex, and involve so many locations, that the organizations need assistance in creating useful and timely information. That\u2019s what information systems are for.<\/p>\n<p>Allow me a fanciful example. Suppose that you\u2019ve launched a new business that manufactures solar-powered lawn mowers. You\u2019re planning a direct mail campaign to bring this revolutionary new product to the attention of prospective buyers. But since it\u2019s a small business, you can\u2019t afford to sponsor coast-to-coast television commercials, or to send brochures by mail to more than 100 million U.S. households. Instead, you plan to target the most likely customers \u2013 those who are environmentally conscious, have higher than average family incomes, and who live in areas where there is enough water and sunshine to support lawns and solar power.<\/p>\n<p>Fortunately, lots of data are available to help you define your mailing list. Household incomes are routinely reported to banks and other financial institutions when families apply for mortgages, loans, and credit cards. Personal tastes related to issues like the environment are reflected in behaviors such as magazine subscriptions and credit card purchases. Firms like Claritas amass such data, and transform it into information by creating \u201clifestyle segments\u201d \u2013 categories of households that have similar incomes and tastes. Your solar lawnmower company can purchase lifestyle segment information by 5-digit ZIP code, or even by ZIP+4 codes, which designate individual households.<\/p>\n<p>It\u2019s astonishing how companies like Claritas can create valuable information from the millions upon millions of transactions that are recorded every day. 
Their products are made possible by the fact that the original data exist in digital form, and by the information systems the companies have developed to transform the data into information that companies like yours value. The fact that lifestyle information products are often delivered by geographic areas, such as ZIP codes, speaks to the appeal of geographic information systems.<\/p>\n<h3><strong>TRY THIS<\/strong><\/h3>\n<p>Try out the demo of what Claritas used to call the \u201cYou Are Where You Live\u201d tool. The Nielsen Company has acquired Claritas and the tool is now called \u201cMyBestSegments.\u201d Point your browser to the\u00a0<a href=\"https:\/\/segmentationsolutions.nielsen.com\/mybestsegments\/\">My Best Segments<\/a>\u00a0page. Click the button labeled \u201cZIP Code Look-up.\u201d<\/p>\n<p>Enter your ZIP code, then choose a segmentation system. Do the lifestyle segments, listed on the left, seem accurate for your community? If you don\u2019t live in the United States, try Penn State\u2019s ZIP code, 16802.<br \/>\nDoes the market segmentation match your expectations? Registered students are welcome to post comments directly to this page.<\/p>\n<h2>1.6. Databases, Mapping, and GIS<\/h2>\n<p>One of our objectives in this first chapter is to be able to define a geographic information system. Here\u2019s a tentative definition:\u00a0<strong>A GIS is a computer-based tool used to help people transform geographic data into geographic information.<\/strong><\/p>\n<p>The definition implies that a GIS is somehow different from other information systems, and that geographic data are different from non-geographic data. Let\u2019s consider the differences next.<\/p>\n<h2>1.7. Database Management Systems<\/h2>\n<p>Claritas and similar companies use database management systems (DBMS) to create the \u201clifestyle segments\u201d that I referred to in the previous section. 
Basic database concepts are important since GIS incorporates much of the functionality of DBMS.<\/p>\n<p>Digital data are stored in computers as files. Often, data are arrayed in tabular form. For this reason, data files are often called\u00a0<strong>tables<\/strong>. A\u00a0<strong>database<\/strong>\u00a0is a collection of tables. Businesses and government agencies that serve large clienteles, such as telecommunications companies, airlines, credit card firms, and banks, rely on extensive databases for their billing, payroll, inventory, and marketing operations.\u00a0<strong>Database management systems<\/strong>\u00a0are information systems that people use to store, update, and analyze non-geographic databases.<\/p>\n<p>Often, data files are tabular in form, composed of rows and columns.\u00a0<strong>Rows<\/strong>, also known as\u00a0<strong>records<\/strong>, correspond with individual entities, such as customer accounts.\u00a0<strong>Columns<\/strong>\u00a0correspond with the various\u00a0<strong>attributes<\/strong>\u00a0associated with each entity. The attributes stored in the accounts database of a telecommunications company, for example, might include customer names, telephone numbers, addresses, current charges for local calls, long distance calls, taxes, etc.<\/p>\n<p><strong>Geographic data<\/strong>\u00a0are a special case: records correspond with places, not people or accounts. Columns represent the attributes of places. The data in the following table, for example, consist of records for Pennsylvania counties. 
Columns contain selected attributes of each county, including the county\u2019s ID code, name, and 1980 population.<\/p>\n<table summary=\"1980 populations for 15 PA counties\">\n<caption>1980 Population Data for PA Counties<\/caption>\n<thead>\n<tr>\n<th>FIPS Code<\/th>\n<th>County<\/th>\n<th>1980 Pop<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>42001<\/td>\n<td>Adams County<\/td>\n<td>78274<\/td>\n<\/tr>\n<tr>\n<td>42003<\/td>\n<td>Allegheny County<\/td>\n<td>1336449<\/td>\n<\/tr>\n<tr>\n<td>42005<\/td>\n<td>Armstrong County<\/td>\n<td>73478<\/td>\n<\/tr>\n<tr>\n<td>42007<\/td>\n<td>Beaver County<\/td>\n<td>186093<\/td>\n<\/tr>\n<tr>\n<td>42009<\/td>\n<td>Bedford County<\/td>\n<td>47919<\/td>\n<\/tr>\n<tr>\n<td>42011<\/td>\n<td>Berks County<\/td>\n<td>336523<\/td>\n<\/tr>\n<tr>\n<td>42013<\/td>\n<td>Blair County<\/td>\n<td>130542<\/td>\n<\/tr>\n<tr>\n<td>42015<\/td>\n<td>Bradford County<\/td>\n<td>60967<\/td>\n<\/tr>\n<tr>\n<td>42017<\/td>\n<td>Bucks County<\/td>\n<td>541174<\/td>\n<\/tr>\n<tr>\n<td>42019<\/td>\n<td>Butler County<\/td>\n<td>152013<\/td>\n<\/tr>\n<tr>\n<td>42021<\/td>\n<td>Cambria County<\/td>\n<td>163062<\/td>\n<\/tr>\n<tr>\n<td>42023<\/td>\n<td>Cameron County<\/td>\n<td>5913<\/td>\n<\/tr>\n<tr>\n<td>42025<\/td>\n<td>Carbon County<\/td>\n<td>56846<\/td>\n<\/tr>\n<tr>\n<td>42027<\/td>\n<td>Centre County<\/td>\n<td>124812<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>The contents of one file in a database.<\/p>\n<p>The example is a very simple file, but many geographic attribute databases are in fact very large (the U.S. is made up of over 3,000 counties, almost 50,000 census tracts, about 43,000 five-digit ZIP code areas and many tens of thousands more ZIP+4 code areas). Large databases consist not only of lots of data, but also lots of files. Unlike a spreadsheet, which performs calculations only on data that are present in a single document, database management systems allow users to store data in, and retrieve data from, many separate files. 
For example, suppose an analyst wished to calculate population change for Pennsylvania counties between the 1980 and 1990 censuses. More than likely, 1990 population data would exist in a separate file, like so:<\/p>\n<table summary=\"1990 populations for 15 PA counties\">\n<caption>1990 Population Data for PA Counties<\/caption>\n<thead>\n<tr>\n<th>FIPS Code<\/th>\n<th>1990 Pop<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>42001<\/td>\n<td>84921<\/td>\n<\/tr>\n<tr>\n<td>42003<\/td>\n<td>1296037<\/td>\n<\/tr>\n<tr>\n<td>42005<\/td>\n<td>73872<\/td>\n<\/tr>\n<tr>\n<td>42007<\/td>\n<td>187009<\/td>\n<\/tr>\n<tr>\n<td>42009<\/td>\n<td>49322<\/td>\n<\/tr>\n<tr>\n<td>42011<\/td>\n<td>352353<\/td>\n<\/tr>\n<tr>\n<td>42013<\/td>\n<td>131450<\/td>\n<\/tr>\n<tr>\n<td>42015<\/td>\n<td>62352<\/td>\n<\/tr>\n<tr>\n<td>42017<\/td>\n<td>578715<\/td>\n<\/tr>\n<tr>\n<td>42019<\/td>\n<td>167732<\/td>\n<\/tr>\n<tr>\n<td>42021<\/td>\n<td>158500<\/td>\n<\/tr>\n<tr>\n<td>42023<\/td>\n<td>5745<\/td>\n<\/tr>\n<tr>\n<td>42025<\/td>\n<td>58783<\/td>\n<\/tr>\n<tr>\n<td>42027<\/td>\n<td>131489<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>Another file in a database. A database management system (DBMS) can relate this file to the prior one illustrated above because they share the list of attributes called \u201cFIPS Code.\u201d<\/p>\n<p>If two data files have at least one common attribute, a DBMS can combine them in a single new file. The common attribute is called a\u00a0<strong>key<\/strong>. In this example, the key was the county FIPS code (FIPS stands for Federal Information Processing Standard). 
The DBMS allows users to produce new data as well as to retrieve existing data, as suggested by the new \u201c% Change\u201d attribute in the table below.<\/p>\n<table summary=\"Percent change in populations of 15 PA counties from 1980 to 1990\">\n<caption>Percent Change in Populations for PA Counties 1980-1990<\/caption>\n<thead>\n<tr>\n<th>FIPS<\/th>\n<th>County<\/th>\n<th>1980<\/th>\n<th>1990<\/th>\n<th>% Change<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>42001<\/td>\n<td>Adams<\/td>\n<td>78274<\/td>\n<td>84921<\/td>\n<td>8.5<\/td>\n<\/tr>\n<tr>\n<td>42003<\/td>\n<td>Allegheny<\/td>\n<td>1336449<\/td>\n<td>1296037<\/td>\n<td>-3<\/td>\n<\/tr>\n<tr>\n<td>42005<\/td>\n<td>Armstrong<\/td>\n<td>73478<\/td>\n<td>73872<\/td>\n<td>0.5<\/td>\n<\/tr>\n<tr>\n<td>42007<\/td>\n<td>Beaver<\/td>\n<td>186093<\/td>\n<td>187009<\/td>\n<td>0.5<\/td>\n<\/tr>\n<tr>\n<td>42009<\/td>\n<td>Bedford<\/td>\n<td>47919<\/td>\n<td>49322<\/td>\n<td>2.9<\/td>\n<\/tr>\n<tr>\n<td>42011<\/td>\n<td>Berks<\/td>\n<td>336523<\/td>\n<td>352353<\/td>\n<td>4.7<\/td>\n<\/tr>\n<tr>\n<td>42013<\/td>\n<td>Blair<\/td>\n<td>130542<\/td>\n<td>131450<\/td>\n<td>0.7<\/td>\n<\/tr>\n<tr>\n<td>42015<\/td>\n<td>Bradford<\/td>\n<td>60967<\/td>\n<td>62352<\/td>\n<td>2.3<\/td>\n<\/tr>\n<tr>\n<td>42017<\/td>\n<td>Bucks<\/td>\n<td>541174<\/td>\n<td>578715<\/td>\n<td>6.9<\/td>\n<\/tr>\n<tr>\n<td>42019<\/td>\n<td>Butler<\/td>\n<td>152013<\/td>\n<td>167732<\/td>\n<td>10.3<\/td>\n<\/tr>\n<tr>\n<td>42021<\/td>\n<td>Cambria<\/td>\n<td>163062<\/td>\n<td>158500<\/td>\n<td>-2.8<\/td>\n<\/tr>\n<tr>\n<td>42023<\/td>\n<td>Cameron<\/td>\n<td>5913<\/td>\n<td>5745<\/td>\n<td>-2.8<\/td>\n<\/tr>\n<tr>\n<td>42025<\/td>\n<td>Carbon<\/td>\n<td>56846<\/td>\n<td>58783<\/td>\n<td>3.4<\/td>\n<\/tr>\n<tr>\n<td>42027<\/td>\n<td>Centre<\/td>\n<td>124812<\/td>\n<td>131489<\/td>\n<td>5.3<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>A new file produced from the prior two files as a result of two database operations. 
One operation merged the contents of the two files without redundancy. A second operation produced a new attribute\u2013\u201c% Change\u201d\u2013by dividing the difference between \u201c1990 Pop\u201d and \u201c1980 Pop\u201d by \u201c1980 Pop\u201d and expressing the result as a percentage.<\/p>\n<p>Database management systems are valuable because they provide secure means of storing and updating data. Database administrators can protect files so that only authorized users can make changes. DBMS provide transaction management functions that allow multiple users to edit the database simultaneously. In addition, DBMS provide sophisticated means to retrieve data that meet user-specified criteria. In other words, they enable users to select data in response to particular questions. A question that is addressed to a database through a DBMS is called a\u00a0<strong>query<\/strong>.<\/p>\n<p>Database queries include basic set operations such as union, intersection, and difference. The product of a <strong>union<\/strong>\u00a0of two or more data files is a single file that includes all records and attributes, without redundancy. An\u00a0<strong>intersection<\/strong>\u00a0produces a data file that contains only records present in all files. A\u00a0<strong>difference<\/strong>\u00a0operation produces a data file that omits records that appear in both original files. (Try drawing Venn diagrams\u2013intersecting circles that show relationships between two or more entities\u2013to illustrate the three operations. Then compare your sketch to\u00a0<a href=\"http:\/\/opentextbc.ca\/files\/geog482\/image\/venn_diagrams%282%29.png\">the Venn diagram example<\/a>.) All operations that involve multiple data files rely on the fact that all files contain a common key. The key allows the database system to relate the separate files. Databases that contain numerous files that share one or more keys are called\u00a0<strong>relational databases<\/strong>. 
Database systems that enable users to produce information from relational databases are called <strong>relational database management systems<\/strong>.<\/p>\n<p>A common use of database queries is to identify subsets of records that meet criteria established by the user. For example, a credit card company may wish to identify all accounts that are 30 days or more past due. A county tax assessor may need to list all properties not assessed within the past 10 years. Or the U.S. Census Bureau may wish to identify all addresses that need to be visited by census takers, because census questionnaires were not returned by mail. DBMS software vendors have adopted a standardized language called SQL (Structured Query Language) to pose such queries.<\/p>\n<h3><strong>PRACTICE QUIZ<\/strong><\/h3>\n<h2>1.8. Mapping Systems<\/h2>\n<p>GIS (geographic information systems) arose out of the need to perform spatial queries on geographic data. A spatial query requires knowledge of locations as well as attributes. For example, an environmental analyst might want to know which public drinking water sources are located within one mile of a known toxic chemical spill. Or, a planner might be called upon to identify property parcels located in areas that are subject to flooding. To accommodate geographic data and spatial queries, database management systems need to be integrated with mapping systems. Until about 1990, most maps were printed from handmade drawings or engravings. Geographic data produced by draftspersons consisted of graphic marks inscribed on paper or film. To this day, most of the lines that appear on topographic maps published by the U.S. Geological Survey were originally engraved by hand. The place names shown on the maps were affixed with tweezers, one word at a time. Needless to say, such maps were expensive to create and to keep up to date. 
Computerization of the mapmaking process had obvious appeal.<\/p>\n<p><strong>Computer-aided design (CAD)<\/strong>\u00a0CAD systems were originally developed for engineers, architects, and other design professionals who needed more efficient means to create and revise precise drawings of machine parts, construction plans, and the like. In the 1980s, mapmakers began to adopt CAD in place of traditional map drafting. CAD operators encode the locations and extents of roads, streams, boundaries and other entities by tracing maps mounted on electronic drafting tables, or by key-entering location coordinates, angles, and distances. Instead of graphic features, CAD data consist of digital features, each of which is composed of a set of point locations. Calculations of distances, areas, and volumes can easily be automated once features are digitized. Unfortunately, CAD systems typically do not encode data in forms that support spatial queries. In 1988, a geographer named David Cowen illustrated the benefits and shortcomings of CAD for spatial decision making. He pointed out that a CAD system would be useful for depicting the streets, property parcel boundaries, and building footprints of a residential subdevelopment. A CAD operator could point to a particular parcel, and highlight it with a selected color or pattern. \u201cA typical CAD system\u201d, Cowen observed, \u201ccould not automatically shade each parcel based on values in an assessor\u2019s database containing information regarding ownership, usage, or value, however.\u201d A CAD system would be of limited use to someone who had to make decisions about land use policy or tax assessment.<\/p>\n<p><strong>Desktop mapping\u00a0\u00a0<\/strong>An evolutionary stage in the development of GIS, desktop mapping systems like Atlas*GIS combined some of the capabilities of CAD systems with rudimentary linkages between location data and attribute data. 
A desktop mapping system user could produce a map in which property parcels are automatically colored according to various categories of property values, for example. Furthermore, if property value categories were redefined, the map\u2019s appearance could be updated automatically. Some desktop mapping systems even supported simple queries that allowed users to retrieve records from a single attribute file. Most real-world decisions, however, require more sophisticated queries involving multiple data files. That\u2019s where real GIS comes in.<\/p>\n<p><strong>Geographic information systems (GIS)<\/strong>\u00a0As stated earlier, information systems assist decision makers by enabling them to transform data into useful information. GIS specializes in helping users transform geographic data into geographic information. David Cowen (1988) defined GIS as a decision-support tool that combines the attribute data handling capabilities of relational database management systems with the spatial data handling capabilities of CAD and desktop mapping systems. In particular, GIS enables decision makers to identify locations or routes whose attributes match multiple criteria, even though entities and attributes may be encoded in many different data files.<\/p>\n<p>Innovators in many fields, including engineering, computer science, and geography, started developing digital mapping and CAD systems in the 1950s and 1960s. One of the first challenges they faced was to convert the graphical data stored on paper maps into digital data that could be stored in, and processed by, digital computers. Several different approaches to representing locations and extents in digital form were developed. The two predominant representation strategies are known as \u201cvector\u201d and \u201craster.\u201d<\/p>\n<h2>1.9. Representation Strategies for Mapping<\/h2>\n<p>Recall that data consist of symbols that represent measurements. 
Digital geographic data are encoded as alphanumeric symbols that represent locations and attributes of locations measured at or near Earth\u2019s surface. No geographic data set represents every possible location, of course. The Earth is too big, and the number of unique locations is too great. In much the same way that public opinion is measured through polls, geographic data are constructed by measuring representative\u00a0<strong>samples<\/strong>\u00a0of locations. And just as serious opinion polls are based on sound principles of statistical sampling, so too do geographic data represent reality by measuring carefully chosen samples of locations. The vector and raster approaches are, in essence, two distinct sampling strategies.<\/p>\n<p>The\u00a0<strong>vector<\/strong>\u00a0approach involves sampling locations at intervals along the length of linear entities (like roads), or around the perimeter of areal entities (like property parcels). When they are connected by lines, the sampled points form line features and polygon features that approximate the shapes of their real-world counterparts.<\/p>\n<p><a href=\"http:\/\/opentextbc.ca\/files\/geog482\/file\/vector.avi\"><img decoding=\"async\" class=\"aligncenter\" alt=\"Illustration of vector encoding of a reservoir and highway\" src=\"http:\/\/opentextbc.ca\/natureofgeographicinformation\/wp-content\/uploads\/sites\/19\/2014\/01\/vector.gif\" \/><\/a><\/p>\n<p>Two frames (the first and last) of an animation showing the construction of a vector representation of a reservoir and highway.<\/p>\n<h3><strong>TRY THIS<\/strong><\/h3>\n<p>Click the graphic above to download and view the animation file (vector.avi, 1.6 Mb) in a separate Microsoft Media Player window.<\/p>\n<p>To view the\u00a0<a href=\"http:\/\/opentextbc.ca\/files\/geog482\/file\/vector.mov\">same animation in QuickTime format (vector.mov, 1.6 Mb), click here<\/a>. 
Requires the QuickTime plugin, which is available free at\u00a0<a href=\"http:\/\/www.apple.com\/quicktime\/download\/\">apple.com<\/a>.<\/p>\n<p>The aerial photograph above left shows two entities, a reservoir and a highway. The graphic above right illustrates how the entities might be represented with vector data. The small squares are nodes: point locations specified by latitude and longitude coordinates. Line segments connect nodes to form line features. In this case, the line feature colored red represents the highway. Series of line segments that begin and end at the same node form polygon features. In this case, two polygons (filled with blue) represent the reservoir.<\/p>\n<p>The vector data model is consistent with how surveyors measure locations at intervals as they traverse a property boundary. Computer-aided design (CAD) software, used by surveyors, engineers, and others, stores data in vector form. CAD operators encode the locations and extents of entities by tracing maps mounted on electronic drafting tables, or by key-entering location coordinates, angles, and distances. Instead of graphic features, CAD data consist of digital features, each of which is composed of a set of point locations.<\/p>\n<p>The vector strategy is well suited to mapping entities with well-defined edges, such as highways or pipelines or property parcels. Many of the features shown on paper maps, including contour lines, transportation routes, and political boundaries, can be represented effectively in digital form using the vector data model.<\/p>\n<p>The\u00a0<strong>raster<\/strong>\u00a0approach involves sampling attributes at fixed intervals. 
Each sample represents one cell in a checkerboard-shaped grid.<\/p>\n<p><a href=\"http:\/\/opentextbc.ca\/files\/geog482\/file\/raster.avi\"><img decoding=\"async\" class=\"aligncenter\" alt=\"Illustration of raster encoding of a reservoir and highway\" src=\"http:\/\/opentextbc.ca\/natureofgeographicinformation\/wp-content\/uploads\/sites\/19\/2014\/01\/raster.gif\" \/><\/a><\/p>\n<p>Two frames (the first and last) of an animation showing the construction of a raster representation of a reservoir and highway.<\/p>\n<p>&nbsp;<\/p>\n<h3><strong>TRY THIS<\/strong><\/h3>\n<p>Click the graphic above to download and view the animation file (raster.avi, 0.8 Mb) in a separate Microsoft Media Player window.<\/p>\n<p>To view the\u00a0<a href=\"http:\/\/opentextbc.ca\/files\/geog482\/file\/raster.mov\">same animation in QuickTime format (raster.mov, 0.6 Mb), click here<\/a>. Requires the QuickTime plugin, which is available free at\u00a0<a href=\"http:\/\/www.apple.com\/quicktime\/download\/\">apple.com<\/a>.<\/p>\n<p>The graphic above illustrates a raster representation of the same reservoir and highway as shown in the vector representation. The area covered by the aerial photograph has been divided into a grid. Every grid cell that overlaps one of the two selected entities is encoded with an attribute that associates it with the entity it represents. Actual raster data would not consist of a picture of red and blue grid cells, of course; they would consist of a list of numbers, one number for each grid cell, each number representing an entity. For example, grid cells that represent the highway might be coded with the number \u201c1\u201d and grid cells representing the reservoir might be coded with the number \u201c2.\u201d<\/p>\n<p>The raster strategy is a smart choice for representing phenomena that lack clear-cut boundaries, such as terrain elevation, vegetation, and precipitation. 
Digital airborne imaging systems, which are replacing photographic cameras as primary sources of detailed geographic data, produce raster data by scanning the Earth\u2019s surface pixel by pixel and row by row.<\/p>\n<p>Both the vector and raster approaches accomplish the same thing: they allow us to caricature the Earth\u2019s surface with a limited number of locations. What distinguishes the two is the sampling strategy each embodies. The vector approach is like creating a picture of a landscape with shards of stained glass cut to various shapes and sizes. The raster approach, by contrast, is more like creating a mosaic with tiles of uniform size. Neither is well suited to all applications, however. Several variations on the vector and raster themes are in use for specialized applications, and the development of new object-oriented approaches is underway.<\/p>\n<h3><strong>PRACTICE QUIZ<\/strong><\/h3>\n<h2>1.10. Automated Map Analysis<\/h2>\n<p>As I mentioned earlier, the original motivation for developing computer mapping systems was to automate the map making process. Computerization has not only made map making more efficient, it has also removed some of the technological barriers that used to prevent people from making maps themselves. What used to be an arcane craft practiced by a few specialists has become a \u201ccloud\u201d application available to any networked computer user. When I first started writing this course in 1997, my example was the mapping extension included in Microsoft Excel 97, which made creating a simple map as easy as creating a graph. Ten years later, who hasn\u2019t used Google Maps or MapQuest?<\/p>\n<p>As much as computerization has changed the way maps are made, it has had an even greater impact on how maps can be used. Calculations of distance, direction, and area, for example, are tedious and error-prone operations with paper maps. Given a digital map, such calculations can easily be automated. 
Those who are familiar with CAD systems know this from first-hand experience. Highway engineers, for example, rely on aerial imagery and digital mapping systems to estimate project costs by calculating the volumes of rock that need to be excavated from hillsides and filled into valleys.<\/p>\n<p>The ability to automate analytical tasks does more than relieve tedium and reduce errors. It also allows us to perform tasks that would otherwise seem impractical. Suppose, for example, that you were asked to plot on a map a 100-meter-wide\u00a0<strong>buffer<\/strong>\u00a0zone surrounding a protected stream. If all you had to work with was a paper map, a ruler, and a pencil, you might have a lengthy job on your hands. You might draw lines scaled to represent 100 meters, perpendicular to the stream on both sides, at intervals that vary in frequency with the sinuosity of the stream. Then you might plot a perimeter that connects the end points of the perpendicular lines. If your task was to create hundreds of such buffer zones, you might conclude that automation is a necessity, not just a luxury.<\/p>\n<p><img decoding=\"async\" alt=\"Illustration showing construction of a 100-meter buffer polygon around a stream\" src=\"http:\/\/opentextbc.ca\/natureofgeographicinformation\/wp-content\/uploads\/sites\/19\/2014\/01\/buffer.gif\" \/><\/p>\n<p>Surrounding a protected stream with a buffer polygon.<\/p>\n<p>Some tasks can be implemented equally well in either vector- or raster-oriented mapping systems. Other tasks are better suited to one representation strategy or the other. The calculation of slope, for example, or of <strong>gradient<\/strong>\u2013the direction of maximum slope along a surface\u2013is more efficiently accomplished with raster data. The slope of one raster grid cell may be calculated by comparing its elevation to the elevations of the eight cells that surround it. 
Raster data are also preferred for a procedure called\u00a0<strong>viewshed analysis<\/strong>\u00a0that predicts which portions of a landscape will be in view, or hidden from view, from a particular perspective.<\/p>\n<p>Some mapping systems provide ways to analyze attribute data as well as locational data. For example, the Excel mapping extension I mentioned above links the geographic data display capabilities of a mapping system with the data analysis capabilities of a spreadsheet. As you probably know, spreadsheets like Excel let users perform calculations on individual fields, columns, or entire files. A value changed in one field automatically changes values throughout the spreadsheet. Arithmetic, financial, statistical, and even certain database functions are supported. But as useful as spreadsheets are, they were not engineered to provide secure means of managing and analyzing large databases that consist of many related files, each of which is the responsibility of a different part of an organization. A spreadsheet is not a DBMS. And by the same token, a mapping system is not a GIS.<\/p>\n<h2>1.11. Geographic Information Systems<\/h2>\n<p>The preceding discussion leads me to revise my working definition:<\/p>\n<p>As I mentioned earlier, a geographer named David Cowen defined GIS as\u00a0<strong>a decision-support tool that combines the capabilities of a relational database management system with the capabilities of a mapping system<\/strong> (1988). Cowen cited an earlier study by William Carstensen (1986), who sought to establish criteria by which local governments might choose among competing GIS products. Carstensen chose site selection as an example of the kind of complex task that many organizations seek to accomplish with GIS. 
Given the necessary database, he advised local governments to expect that a fully functional GIS should be able to identify property parcels that are:<\/p>\n<ul>\n<li>At least five acres in size;<\/li>\n<li>Vacant or for sale;<\/li>\n<li>Zoned commercial;<\/li>\n<li>Not subject to flooding;<\/li>\n<li>Located not more than one mile from a heavy-duty road; and<\/li>\n<li>Situated on terrain whose maximum slope is less than ten percent.<\/li>\n<\/ul>\n<p>The first criterion\u2013identifying parcels five acres or more in size\u2013might require two operations. As described earlier, a mapping system ought to be able to automatically calculate the area of a parcel. Once the area is calculated and added as a new attribute into the database, an ordinary database query could produce a list of parcels that satisfy the size criterion. The parcels on the list might also be highlighted on a map, as in the example below.<\/p>\n<p><img decoding=\"async\" alt=\"Map of property parcels five acres or larger in Ontario California\" src=\"http:\/\/opentextbc.ca\/natureofgeographicinformation\/wp-content\/uploads\/sites\/19\/2014\/01\/ont_ca_5acres.gif\" \/><\/p>\n<p>The cartographic result of a database query identifying all property parcels greater than or equal to five acres in size. (City of Ontario, CA, GIS Department. Used by permission.)<\/p>\n<p>The ownership status of individual parcels would be an attribute of a property database maintained by a local tax assessor\u2019s office. 
Parcels whose ownership status attribute value matched the criteria \u201cvacant\u201d or \u201cfor sale\u201d could be identified through another ordinary database query.<\/p>\n<p><img decoding=\"async\" alt=\"Map of property parcels zoned commercial in Ontario California\" src=\"http:\/\/opentextbc.ca\/natureofgeographicinformation\/wp-content\/uploads\/sites\/19\/2014\/01\/ont_ca_commercial.gif\" \/><\/p>\n<p>The cartographic result of a spatial intersection (or map overlay) operation identifying all property parcels zoned for commercial (C-1) development. (City of Ontario, CA, GIS Department. Used by permission.)<\/p>\n<p>Carstensen\u2019s third criterion was to determine which parcels were situated within areas zoned for commercial development. This would be simple if authorized land uses were included as an attribute in the community\u2019s property parcel database. This is unlikely to be the case, however, since zoning and taxation are the responsibilities of different agencies. Typically, parcels and land use zones exist as separate paper maps. If the maps were prepared at the same scale, and if they accounted for the shape of the Earth in the same manner, then they could be superimposed one over another on a light table. If the maps let enough light through, parcels located within commercial zones could be identified.<\/p>\n<p>The GIS approach to a task like this begins by digitizing the paper maps, and by producing corresponding attribute data files. Each digital map and attribute data file is stored in the GIS separately, like separate map <strong>layers<\/strong>. A fully functional GIS would then be used to perform a\u00a0<strong>spatial intersection<\/strong>\u00a0that is analogous to the overlay of the paper maps. 
Spatial intersection, otherwise known as\u00a0<strong>map overlay<\/strong>, is one of the defining capabilities of GIS.<\/p>\n<p><img decoding=\"async\" alt=\"Map of property parcels within one mile buffer of a highway in Ontario California\" src=\"http:\/\/opentextbc.ca\/natureofgeographicinformation\/wp-content\/uploads\/sites\/19\/2014\/01\/ont_ca_buffer.gif\" \/><\/p>\n<p>The cartographic result of a buffer operation identifying all property parcels located within a specified distance of a specified type of highway. (City of Ontario, CA, GIS Department. Used by permission.)<\/p>\n<p>Another of Carstensen\u2019s criteria was to identify parcels located within one mile of a heavy-duty highway. Such a task requires a digital map and associated attributes produced in such a way as to allow heavy-duty highways to be differentiated from other geographic entities. Once the necessary database is in place, a <strong>buffer<\/strong>\u00a0operation can be used to create a polygon feature whose perimeter surrounds all \u201cheavy duty highway\u201d features at the specified distance. A spatial intersection is then performed, isolating the parcels within the buffer from those outside the buffer.<\/p>\n<p>To produce a final list of parcels that meet all the site selection criteria, the GIS analyst might perform an <strong>intersection<\/strong>\u00a0operation that creates a new file containing only those records that are present in all the other intermediate results.<\/p>\n<p><img decoding=\"async\" alt=\"Map showing parcels that meet all search criteria in Ontario California\" src=\"http:\/\/opentextbc.ca\/natureofgeographicinformation\/wp-content\/uploads\/sites\/19\/2014\/01\/ont_ca_final.gif\" \/><\/p>\n<p>The cartographic result of the intersection of the above three figures. Only the parcels shown in this map satisfy all of the site selection criteria. (City of Ontario, CA, GIS Department. 
Used by permission.)<\/p>\n<p>I created the maps shown above in 1998 using the Geographic Information Web Server of the City of Ontario, California. Although it is no longer supported, the Ontario server was one of the first of its kind to provide much of the functionality required to perform a site suitability analysis online. Today, many local governments offer similar Internet map services to current and prospective taxpayers.<\/p>\n<h3><strong>TRY THIS<\/strong><\/h3>\n<p>Find an online site selection utility similar to the one formerly provided by the City of Ontario. Registered Penn State students can post a comment to this page describing the site\u2019s functionality, and comparing it with the capabilities of the example illustrated above.<\/p>\n<h2>1.12. Geographic Information Science and Technology<\/h2>\n<p>So far in this chapter I\u2019ve tried to make sense of GIS in relation to several information technologies, including database management, computer-aided design, and mapping systems. At this point I\u2019d like to expand the discussion to consider GIS as one element in a much larger field of study called \u201cGeographic Information Science and Technology\u201d (GIS&amp;T). 
As shown in the following illustration, GIS&amp;T encompasses three subfields:<\/p>\n<ul>\n<li><strong>Geographic Information Science<\/strong>, the multidisciplinary research enterprise that addresses the nature of geographic information and the application of geospatial technologies to basic scientific questions;<\/li>\n<li><strong>Geospatial Technology<\/strong>, the specialized set of information technologies that support acquisition, management, analysis, and visualization of geo-referenced data, including Global Navigation Satellite Systems (GPS and others); satellite, airborne, and shipboard remote sensing systems; and GIS and image analysis software tools; and<\/li>\n<li><strong>Applications of GIS&amp;T<\/strong>,\u00a0the increasingly diverse uses of geospatial technology in government, industry, and academia. This is the subfield in which most GIS professionals work.<\/li>\n<\/ul>\n<p>Arrows in the diagram below reflect relationships among the three subfields, as well as with numerous other fields, including Geography, Landscape Architecture, Computer Science, Statistics, and Engineering. Each of these fields has influenced, and some have been influenced by, the development of GIS&amp;T. It is important to note that these fields and subfields do not neatly correspond with professions like GIS analyst, photogrammetrist, or land surveyor. Rather, GIS&amp;T is a\u00a0<em>nexus<\/em>\u00a0of overlapping professions that differ in backgrounds, disciplinary allegiances, and regulatory status.<\/p>\n<p><img decoding=\"async\" alt=\"Diagram showing components of the field of Geographic Information Science and Technology and its relations to other fields.\" src=\"http:\/\/opentextbc.ca\/natureofgeographicinformation\/wp-content\/uploads\/sites\/19\/2014\/01\/BoK2006_Fig1_Domains_18Feb.jpg\" \/><\/p>\n<p>The field of Geographic Information Science and Technology (GIS&amp;T) and its relations to other fields. 
Two-way relations that are half-dashed represent asymmetrical contributions between allied fields. (\u00a9 2006 Association of American Geographers and University Consortium for Geographic Information Science. Used by permission. All rights reserved.)<\/p>\n<p>The illustration above first appeared in the\u00a0<em>Geographic Information Science and Technology Body of Knowledge<\/em>\u00a0(DiBiase, DeMers, Johnson, Kemp, Luck, Plewe, and Wentz, 2006), published by the University Consortium for Geographic Information Science (UCGIS) and the Association of American Geographers (AAG). The\u00a0<em>Body of Knowledge<\/em>\u00a0is a community-developed inventory of the knowledge and skills that define the GIS&amp;T field.\u00a0Like the bodies of knowledge developed in Computer Science and other fields, the\u00a0<em>GIS&amp;T BoK<\/em>\u00a0represents the GIS&amp;T knowledge domain as a hierarchical list of knowledge areas, units, topics, and educational objectives. The ten knowledge areas and 73 units that make up the first edition are listed below. Twenty-six \u201ccore\u201d units (those in which all graduates of a degree or certificate program should be able to demonstrate some level of mastery) are shown in bold type. Not shown are the 329 topics that make up the units, or the 1,660 educational objectives by which topics are defined. These appear in the full text of the\u00a0<em>GIS&amp;T BoK<\/em>.\u00a0Unfortunately, the full text is not freely available online. However, an important related work produced by the U.S. Department of Labor is. We\u2019ll take a look at that shortly.<\/p>\n<h3>KNOWLEDGE AREAS AND UNITS COMPRISING THE 1ST EDITION OF THE GIS&amp;T BOK<\/h3>\n<p><strong>-Knowledge Area AM. 
Analytical Methods<\/strong><br \/>\n-Unit AM1 Academic and analytical origins<br \/>\n-Unit AM2 Query operations and query languages<br \/>\n<strong>-Unit AM3 Geometric measures<br \/>\n-Unit AM4 Basic analytical operations<br \/>\n-Unit AM5 Basic analytical methods<\/strong><br \/>\n-Unit AM6 Analysis of surfaces<br \/>\n-Unit AM7 Spatial statistics<br \/>\n-Unit AM8 Geostatistics<br \/>\n-Unit AM9 Spatial regression and econometrics<br \/>\n-Unit AM10 Data mining<br \/>\n-Unit AM11 Network analysis<br \/>\n-Unit AM12 Optimization and location-allocation modeling<\/p>\n<p><strong>-Knowledge Area CF. Conceptual Foundations<\/strong><br \/>\n-Unit CF1 Philosophical foundations<br \/>\n-Unit CF2 Cognitive and social foundations<br \/>\n<strong>-Unit CF3 Domains of geographic information<br \/>\n-Unit CF4 Elements of geographic information<\/strong><br \/>\n-Unit CF5 Relationships<br \/>\n-Unit CF6 Imperfections in geographic information<\/p>\n<p><strong>-Knowledge Area CV. Cartography and Visualization<\/strong><br \/>\n-Unit CV1 History and trends<br \/>\n<strong>-Unit CV2 Data considerations<br \/>\n-Unit CV3 Principles of map design<\/strong><br \/>\n-Unit CV4 Graphic representation techniques<br \/>\n-Unit CV5 Map production<br \/>\n<strong>-Unit CV6 Map use and evaluation<\/strong><\/p>\n<p><strong>-Knowledge Area DA. Design Aspects<\/strong><br \/>\n-Unit DA1 The scope of GIS&amp;T system design<br \/>\n-Unit DA2 Project definition<br \/>\n-Unit DA3 Resource planning<br \/>\n<strong>-Unit DA4 Database design<\/strong><br \/>\n-Unit DA5 Analysis design<br \/>\n-Unit DA6 Application design<br \/>\n-Unit DA7 System implementation<\/p>\n<p><strong>-Knowledge Area DM. 
Data Modeling<\/strong><br \/>\n-Unit DM1 Basic storage and retrieval structures<br \/>\n<strong>-Unit DM2 Database management systems<br \/>\n-Unit DM3 Tessellation data models<br \/>\n-Unit DM4 Vector and object data models<\/strong><br \/>\n-Unit DM5 Modeling 3D, temporal, and uncertain phenomena<\/p>\n<p><strong>-Knowledge Area DN. Data Manipulation<\/strong><br \/>\n<strong>-Unit DN1 Representation transformation<br \/>\n-Unit DN2 Generalization and aggregation<\/strong><br \/>\n-Unit DN3 Transaction management of geospatial data<\/p>\n<p><strong>-Knowledge Area GC. Geocomputation<\/strong><br \/>\n-Unit GC1 Emergence of geocomputation<br \/>\n-Unit GC2 Computational aspects and neurocomputing<br \/>\n-Unit GC3 Cellular Automata (CA) models<br \/>\n-Unit GC4 Heuristics<br \/>\n-Unit GC5 Genetic algorithms (GA)<br \/>\n-Unit GC6 Agent-based models<br \/>\n-Unit GC7 Simulation modeling<br \/>\n-Unit GC8 Uncertainty<br \/>\n-Unit GC9 Fuzzy sets<\/p>\n<p><strong>-Knowledge Area GD. Geospatial Data<\/strong><br \/>\n<strong>-Unit GD1 Earth geometry<\/strong><br \/>\n-Unit GD2 Land partitioning systems<br \/>\n<strong>-Unit GD3 Georeferencing systems<br \/>\n-Unit GD4 Datums<br \/>\n-Unit GD5 Map projections<br \/>\n-Unit GD6 Data quality<br \/>\n-Unit GD7 Land surveying and GPS<\/strong><br \/>\n-Unit GD8 Digitizing<br \/>\n-Unit GD9 Field data collection<br \/>\n<strong>-Unit GD10 Aerial imaging and photogrammetry<br \/>\n-Unit GD11 Satellite and shipboard remote sensing<br \/>\n-Unit GD12 Metadata, standards, and infrastructures<\/strong><\/p>\n<p><strong>-Knowledge Area GS. 
GIS&amp;T and Society<\/strong><br \/>\n-Unit GS1 Legal aspects<br \/>\n-Unit GS2 Economic aspects<br \/>\n-Unit GS3 Use of geospatial information in the public sector<br \/>\n-Unit GS4 Geospatial information as property<br \/>\n-Unit GS5 Dissemination of geospatial information<br \/>\n<strong>-Unit GS6 Ethical aspects of geospatial information and technology<\/strong><br \/>\n-Unit GS7 Critical GIS<\/p>\n<p><strong>-Knowledge Area OI. Organizational and Institutional Aspects<\/strong><br \/>\n-Unit OI1 Origins of GIS&amp;T<br \/>\n-Unit OI2 Managing the GI system operations and infrastructure<br \/>\n-Unit OI3 Organizational structures and procedures<br \/>\n-Unit OI4 GIS&amp;T workforce themes<br \/>\n<strong>-Unit OI5 Institutional and inter-institutional aspects<br \/>\n-Unit OI6 Coordinating organizations (national and international)<\/strong><\/p>\n<p>Ten knowledge areas and 73 units comprising the 1st edition of the GIS&amp;T BoK. Core units are indicated with bold type. (\u00a9 2006 Association of American Geographers and University Consortium for Geographic Information Science. Used by permission. All rights reserved.)<\/p>\n<p>Notice that the knowledge area that includes the most core units is GD: Geospatial Data. This course focuses on the sources and distinctive characteristics of geographic data. This is one part of the knowledge base that most successful geospatial professionals possess. The Department of Labor\u2019s Geospatial Technology Competency Model (GTCM) highlights this and other essential elements of the geospatial knowledge base. We\u2019ll consider it next.<\/p>\n<h2>1.13. Geospatial Competencies and Our Curriculum<\/h2>\n<p>A body of knowledge is one way to think about the GIS&amp;T field. Another way is as an industry made up of agencies and firms that produce and consume goods and services, generate sales and (sometimes) profits, and employ people. In 2003, the U.S. 
Department of Labor (DoL) identified \u201cgeospatial technology\u201d as one of 14 \u201chigh growth\u201d technology industries, along with biotech, nanotech, and others. However, the DoL also observed that the geospatial technology industry was ill-defined and poorly understood by the public.<\/p>\n<p>Subsequent efforts by the DoL and other organizations helped to clarify the industry\u2019s nature and scope. Following a series of \u201croundtable\u201d discussions involving industry thought leaders, the Geospatial Information Technology Association (GITA) and the Association of American Geographers (AAG) submitted the following \u201cconsensus\u201d definition to DoL in 2006:<\/p>\n<p>The geospatial industry acquires, integrates, manages, analyzes, maps, distributes, and uses geographic, temporal, and spatial information and knowledge. The industry includes basic and applied research, technology development, education, and applications to address the planning, decision making, and operational needs of people and organizations of all types.<\/p>\n<p>In addition to the proposed industry definition, the GITA and AAG report recommended that DoL establish additional occupations in recognition of geospatial industry workforce activities and needs. At the time, the existing geospatial occupations included only Surveyors, Surveying Technicians, Mapping Technicians, and Cartographers and Photogrammetrists. Late in 2009, with input from the GITA, AAG, and other stakeholders, the DoL established six new geospatial occupations: Geospatial Information Scientists and Technologists, Geographic Information Systems Technicians, Remote Sensing Scientists and Technologists, Remote Sensing Technicians, Precision Agriculture Technicians, and Geodetic Surveyors.<\/p>\n<h3><strong>TRY THIS<\/strong><\/h3>\n<p>Investigate the geospatial occupations at the\u00a0<a href=\"http:\/\/www.onetonline.org\/\">U.S. Department of Labor\u2019s \u201cO*Net\u201d database<\/a>. 
Enter \u201cgeospatial\u201d in the search field named \u201cOccupation Quick Search.\u201d Follow links to occupation descriptions. Note the estimates for 2008 employment and employment growth through 2018. Also note that, for some anomalous reason, the keyword \u201cgeospatial\u201d is not associated with the occupation \u201cGeodetic Surveyor.\u201d<\/p>\n<p><img decoding=\"async\" alt=\"Screen capture of Department of Labor's O-Net site\" src=\"http:\/\/opentextbc.ca\/natureofgeographicinformation\/wp-content\/uploads\/sites\/19\/2014\/01\/o-net.png\" \/><\/p>\n<p>Meanwhile, DoL commenced a \u201ccompetency modeling\u201d initiative for high-growth industries in 2005. Their goal was to help educational institutions like ours meet the demand for qualified technology workers by identifying what workers need to know and be able to do. At DoL, a\u00a0<em>competency<\/em>\u00a0is \u201cthe capability to apply or use a set of related knowledge, skills, and abilities required to successfully perform \u2018critical work functions\u2019 or tasks in a defined work setting\u201d (Ennis 2008). A\u00a0<em>competency model<\/em>\u00a0is \u201ca collection of competencies that together define successful performance in a particular work setting.\u201d<\/p>\n<p>Workforce analysts at DoL began work on a Geospatial Technology Competency Model (GTCM) in 2005. Building on their research, a panel of accomplished practitioners and educators produced a complete draft of the GTCM, which they subsequently revised in response to public comments. Published in June 2010, the GTCM identifies the competencies that characterize successful workers in the geospatial industry. 
In contrast to the\u00a0<em>GIS&amp;T Body of Knowledge,<\/em>\u00a0an academic project meant to define the nature and scope of the field, the GTCM is an industry specification that defines what individual workers and students should aspire to know and learn.<\/p>\n<h3><strong>TRY THIS<\/strong><\/h3>\n<p>Explore the\u00a0<a href=\"http:\/\/www.careeronestop.org\/CompetencyModel\/\">Geospatial Technology Competency Model (GTCM)<\/a>\u00a0at the U.S. Department of Labor\u2019s Competency Model Clearinghouse. Under \u201cIndustry Competency Models,\u201d follow the link \u201cGeospatial Technology.\u201d There, the pyramid (as shown below) is an image map that you can click to reveal the various competencies. The complete GTCM is also available as a Word document and as a PDF file.<\/p>\n<p><img decoding=\"async\" alt=\"Screen capture of the Department of Labor's Geospatial Technology Competency Model site\" src=\"http:\/\/opentextbc.ca\/natureofgeographicinformation\/wp-content\/uploads\/sites\/19\/2014\/01\/gtcm.png\" \/><\/p>\n<p>The GTCM specifies several \u201ctiers\u201d of competencies, progressing from general to occupationally specific. Tiers 1 through 3 (the gray and red layers), called Foundation Competencies, specify general workplace behaviors and knowledge that successful workers in most industries exhibit. Tiers 4 and 5 (yellow) include the distinctive technical competencies that characterize a given industry and its three sectors: Positioning and Data Acquisition, Analysis and Modeling, and Programming and Application Development. 
Above Tier 5 are additional Tiers corresponding to the occupation-specific competencies and requirements that are specified in the occupation descriptions published at O*NET Online, and in a Geospatial Management Competency Model that is in development as of January, 2012.<\/p>\n<p>One way educational institutions and students can use the GTCM is as a guideline for assessing how well curricula align with workforce needs. The Penn State Online GIS program conducted such an assessment in 2011. Results appear in the spreadsheet linked below.<\/p>\n<h3><strong>TRY THIS<\/strong><\/h3>\n<p>Open the\u00a0<a href=\"http:\/\/opentextbc.ca\/natureofgeoinfo\/files\/geog482\/image\/GTCM_assessment_Penn_State_2011.xlsx\">attached Excel spreadsheet<\/a>\u00a0to see how our Penn State Online GIS curricula address workforce needs identified in the GTCM.<\/p>\n<p>The sheet will open on a cover page. At the bottom of the sheet are\u00a0<strong>tabs<\/strong>\u00a0that correspond to Tiers 1-5 of the GTCM. Click the tabs to view the worksheet associated with the Tier you want to see.<\/p>\n<p>In each Tier worksheet,\u00a0<strong>rows<\/strong>\u00a0correspond to the GTCM competencies.<strong>\u00a0Columns<\/strong>\u00a0correspond to the Penn State Online courses included in the assessment. Courses that are required for most students are highlighted light blue. Course authors and instructors were asked to state what students actually do in relation to each of the GTCM competencies. 
Use the\u00a0<strong>scroll bar<\/strong>\u00a0at the bottom right edge of the sheet to reveal more courses.<\/p>\n<p>Open the\u00a0<a title=\"GTCM assessment of Penn State Online GIS curriculum\" href=\"http:\/\/opentextbc.ca\/natureofgeoinfo\/files\/geog482\/image\/gtcm_spreadsheet_demo.swf\">attached Flash movie<\/a>\u00a0to view a video demonstration of how to navigate the spreadsheet.<\/p>\n<p>By studying this spreadsheet you\u2019ll gain insight into how individual courses, and the Penn State Online curriculum as a whole, relate to geospatial workforce needs. If you\u2019re interested in comparing ours to curricula at other institutions, ask if they\u2019ve conducted a similar assessment. If they haven\u2019t, ask why not.<\/p>\n<p>Finally, don\u2019t forget that you can preview much of our online courseware through our\u00a0<a href=\"http:\/\/open.ems.psu.edu\/\">Open Educational Resources initiative<\/a>.<\/p>\n<h2>1.14. Distinguishing Properties of Geographic Data<\/h2>\n<p>The claim that geographic information science is a distinct field of study implies that spatial data are somehow special data. Goodchild (1992) points out several distinguishing properties of geographic information. I have paraphrased four such properties below. Understanding them, and their implications for the practice of geographic information science, is a key objective of this course.<\/p>\n<ol>\n<li>Geographic data represent spatial locations and non-spatial attributes measured at certain times.<\/li>\n<li>Geographic space is continuous.<\/li>\n<li>Geographic space is nearly spherical.<\/li>\n<li>Geographic data tend to be spatially dependent.<\/li>\n<\/ol>\n<p>Let\u2019s consider each of these properties next.<\/p>\n<h2>1.15. Locations and Attributes<\/h2>\n<p><strong>Geographic data represent spatial locations and non-spatial attributes measured at certain times.<\/strong>\u00a0Goodchild (1992, p. 
33) observes that \u201ca spatial database has dual keys, allowing records to be accessed either by attributes or by locations.\u201d Dual keys are not unique to geographic data, but \u201cthe spatial key is distinct, as it allows operations to be defined which are not included in standard query languages.\u201d In the intervening years, software developers have created variations on SQL that incorporate spatial queries. The dynamic nature of geographic phenomena complicates the issue further, however. The need to pose spatio-temporal queries challenges geographic information scientists (GIScientists) to develop ever more sophisticated ways to represent geographic phenomena, thereby enabling analysts to interrogate their data in ever more sophisticated ways.<\/p>\n<h2>1.16. Continuity<\/h2>\n<p><strong>Geographic space is continuous.<\/strong>\u00a0Although dual keys are not unique to geographic data, one property of the spatial key is. \u201cWhat distinguishes spatial data is the fact that the spatial key is based on two continuous dimensions\u201d (Goodchild, 1992, p. 33). \u201cContinuous\u201d refers to the fact that there are no gaps in the Earth\u2019s surface. Canyons, crevasses, and even caverns notwithstanding, there is no position on or near the surface of the Earth that cannot be fixed within some sort of coordinate system grid. Nor is there any theoretical limit to how exactly a position can be specified. Given the precision of modern positioning technologies, the number of unique point positions that could be used to define a geographic entity is practically infinite. Because it\u2019s not possible to measure, let alone to store, manage, and process, an infinite amount of data,\u00a0<strong>all geographic data are selective, generalized, and approximate<\/strong>. 
Furthermore,\u00a0<strong>the larger the territory covered by a geographic database, the more generalized the database tends to be<\/strong>.<\/p>\n<p>Geographic data are generalized according to scale: the town of Gorham shown at three map scales. (U.S. Geological Survey)<\/p>\n<p>For example, the illustration above shows a town called Gorham depicted on three different\u00a0<strong>topographic maps<\/strong>\u00a0produced by the United States Geological Survey. Gorham occupies a smaller space on the small-scale (1:250,000) map than it does at 1:62,500 or at 1:24,000. But the relative size of the feature isn\u2019t the only thing that changes. Notice that the shape of the feature that represents the town changes also, as do the number of features and the amount of detail shown within the town boundary and in the surrounding area. The name for this characteristically parallel decline in map detail and map scale is\u00a0<strong>generalization<\/strong>.<\/p>\n<p>It is important to realize that generalization occurs not only on printed maps, but in digital databases as well. It is possible to represent phenomena with highly detailed features (whether they be made up of high-resolution raster grid cells or very many point locations) in a single\u00a0<strong>scale-independent\u00a0<\/strong>database. In practice, however, highly detailed databases are not only extremely expensive to create and maintain, but they also bog down information systems when used in analyses of large areas. 
For this reason, geographic databases are usually created at several scales, with different levels of detail captured for different intended uses.<\/p>\n<h2>1.17. Nearly Spherical<\/h2>\n<p><strong>Geographic space is nearly spherical.<\/strong>\u00a0The fact that the Earth is nearly, but not quite, a sphere poses some surprisingly complex problems for those who wish to specify locations precisely.<\/p>\n<p><img decoding=\"async\" alt=\"World map showing the differences in elevation between a geoid and a reference ellipsoid.\" src=\"http:\/\/opentextbc.ca\/natureofgeographicinformation\/wp-content\/uploads\/sites\/19\/2014\/01\/geoid_map.jpg\" \/><\/p>\n<p>Differences in elevation between a geoid model and a reference ellipsoid. Deviations range from a high of 75 meters (colored red, over New Guinea) to a low of -104 meters (colored purple, in the Indian Ocean). (National Geodetic Survey, 1997)<\/p>\n<p>The\u00a0<strong>geographic coordinate system<\/strong>\u00a0of latitude and longitude coordinates provides a means to define positions on a sphere. Inaccuracies that are unacceptable for some applications creep in, however, when we confront the Earth\u2019s \u201cactual\u201d irregular shape, which is called the\u00a0<strong>geoid<\/strong>. Furthermore, the calculations of angles and distance that surveyors and others need to perform routinely are cumbersome with\u00a0<strong>spherical coordinates<\/strong>.<\/p>\n<p>That consideration, along with the need to depict the Earth on flat pieces of paper, compels us to transform the globe into a plane, and to specify locations in\u00a0<strong>plane coordinates<\/strong>\u00a0instead of spherical coordinates. 
The set of mathematical transformations by which spherical locations are converted to locations on a plane\u2013called\u00a0<strong>map projections<\/strong>\u2013all lead inevitably to one or another form of inaccuracy.<\/p>\n<p>All this is trouble enough, but we encounter even more difficulties when we seek to define \u201cvertical\u201d positions (elevations) in addition to \u201chorizontal\u201d positions. Perhaps it goes without saying that an elevation is the height of a location above some\u00a0<strong>datum<\/strong>, such as mean sea level. Unfortunately, to be suitable for precise positioning, a datum must correspond closely with the Earth\u2019s actual shape. Which brings us back again to the problem of the geoid.<\/p>\n<p>We will consider these issues in greater depth in Chapter 2. For now, suffice it to say that geographic data are unique in having to represent phenomena that are distributed on a continuous and nearly spherical surface.<\/p>\n<h2>1.18. Spatial Dependency<\/h2>\n<p><strong>Geographic data tend to be spatially dependent<\/strong>. Spatial dependence is \u201cthe propensity for nearby locations to influence each other and to possess similar attributes\u201d (Goodchild, 1992, p.33). In other words, to paraphrase a famous geographer named Waldo Tobler, while everything is related to everything else, things that are close together tend to be more related than things that are far apart. Terrain elevations, soil types, and surface air temperatures, for instance, are more likely to be similar at points two meters apart than at points two kilometers apart. A statistical measure of the similarity of attributes of point locations is called\u00a0<strong>spatial autocorrelation<\/strong>.<\/p>\n<p>Given that geographic data are expensive to create, spatial dependence turns out to be a very useful property. We can sample attributes at a limited number of locations, then estimate the attributes of intermediate locations. 
The process of estimating unknown values from nearby known values is called\u00a0<strong>interpolation<\/strong>. Interpolated values are reliable only to the extent that the spatial dependence of the phenomenon can be assumed. If we were unable to assume some degree of spatial dependence, it would be impossible to represent continuous geographic phenomena in digital form.<\/p>\n<h2>1.19. Geographic Data and Geographic Questions<\/h2>\n<p>The ultimate objective of all geospatial data and technologies is to produce knowledge. Most of us are interested in data only to the extent that they can be used to help understand the world around us, and to make better decisions. Decision-making processes vary a lot from one organization to another. In general, however, the first steps in making a decision are to articulate the questions that need to be answered, and to gather and organize the data needed to answer the questions (Nyerges &amp; Golledge, 1997).<\/p>\n<p>Geographic data and information technologies can be very effective in helping to answer certain kinds of questions. The expensive, long-term investments required to build and sustain GIS infrastructures can be justified only if the questions that confront an organization can be stated in terms that GIS is equipped to answer. As a specialist in the field, you may be expected to advise clients and colleagues on the strengths and weaknesses of GIS as a decision support tool. Following are examples of the kinds of questions that are amenable to GIS analyses, along with questions that GIS is not so well suited to help answer.<\/p>\n<h3>QUESTIONS CONCERNING INDIVIDUAL GEOGRAPHIC ENTITIES<\/h3>\n<p>The simplest geographic questions pertain to individual entities. 
Such questions include:<\/p>\n<h4><strong>QUESTIONS ABOUT SPACE<\/strong><\/h4>\n<ul>\n<li>Where is the entity located?<\/li>\n<li>What is its extent?<\/li>\n<\/ul>\n<h4><strong>QUESTIONS ABOUT ATTRIBUTES<\/strong><\/h4>\n<ul>\n<li>What are the attributes of the entity located there?<\/li>\n<li>Do its attributes match one or more criteria?<\/li>\n<\/ul>\n<h4><strong>QUESTIONS ABOUT TIME<\/strong><\/h4>\n<ul>\n<li>When were the entity\u2019s location, extent or attributes measured?<\/li>\n<li>Has the entity\u2019s location, extent, or attributes changed over time?<\/li>\n<\/ul>\n<p>Simple questions like these can be answered effectively with a good printed map, of course. GIS becomes increasingly attractive as the number of people asking the questions grows, especially if they lack access to the required paper maps.<\/p>\n<h3>QUESTIONS CONCERNING MULTIPLE GEOGRAPHIC ENTITIES<\/h3>\n<p>Harder questions arise when we consider relationships among two or more entities. For instance, we can ask:<\/p>\n<h3>QUESTIONS ABOUT SPATIAL RELATIONSHIPS<\/h3>\n<ul>\n<li>Do the entities contain one another?<\/li>\n<li>Do they overlap?<\/li>\n<li>Are they connected?<\/li>\n<li>Are they situated within a certain distance of one another?<\/li>\n<li>What is the best route from one entity to the others?<\/li>\n<li>Where are entities with similar attributes located?<\/li>\n<\/ul>\n<h3>QUESTIONS ABOUT ATTRIBUTE RELATIONSHIPS<\/h3>\n<ul>\n<li>Do the entities share attributes that match one or more criteria?<\/li>\n<li>Are the attributes of one entity influenced by changes in another entity?<\/li>\n<\/ul>\n<h3>QUESTIONS ABOUT TEMPORAL RELATIONSHIPS<\/h3>\n<ul>\n<li>Have the entities\u2019 locations, extents, or attributes changed over time?<\/li>\n<\/ul>\n<p>Geographic data and information technologies are very well suited to answering moderately complex questions like these. 
GIS is most valuable to large organizations that need to answer such questions often.<\/p>\n<h3>QUESTIONS THAT GIS IS NOT PARTICULARLY GOOD AT ANSWERING<\/h3>\n<p>Harder still, however, are\u00a0<strong>explanatory questions<\/strong>\u2013such as\u00a0<em>why<\/em>\u00a0entities are located where they are,\u00a0<em>why<\/em>\u00a0they have the attributes they do, and\u00a0<em>why<\/em>\u00a0they have changed as they have. In addition, organizations are often concerned with\u00a0<strong>predictive questions<\/strong>\u2013such as what will happen at\u00a0<em>this<\/em>\u00a0location if thus-and-so happens at\u00a0<em>that<\/em>\u00a0location? In general, commercial GIS software packages cannot be expected to provide clear-cut answers to explanatory and predictive questions right out of the box. Typically, analysts must turn to specialized statistical packages and simulation routines. Information produced by these analytical tools may then be re-introduced into the GIS database, if necessary. Research and development efforts intended to more tightly couple analytical software with GIS software are underway within the GIScience community. It is important to keep in mind that decision support tools like GIS are no substitutes for human experience, insight, and judgment.<\/p>\n<p>At the outset of the chapter I suggested that producing information by analyzing data is something like producing energy by burning coal. In both cases, technology is used to realize the potential value of a raw material. Also in both cases, the production process yields some undesirable by-products. 
Similarly, in the process of answering certain geographic questions, GIS tends to raise others, such as:<\/p>\n<ul>\n<li>Given the intrinsic imperfections of the data, how reliable are the results of the GIS analysis?<\/li>\n<li>Does the information produced through GIS analysis tend to systematically benefit some constituent groups at the expense of others?<\/li>\n<li>Should the data used to make the decision be made public?<\/li>\n<li>Does the use of GIS affect the organization\u2019s decision-making processes in ways that are beneficial to its management, its employees, and its customers?<\/li>\n<\/ul>\n<p>As is the case in so many endeavors, the answer to a geographic question usually includes more questions.<\/p>\n<h3><strong>TRY THIS<\/strong><\/h3>\n<p>Can you cite an example of a \u201chard\u201d question that you and your GIS system have been called upon to address? Registered Penn State students can post a comment directly to this page.<\/p>\n<h2>1.20. Summary<\/h2>\n<p>It\u2019s a truism among specialists in geographic information that the lion\u2019s share of the cost of most GIS projects is associated with the development and maintenance of a suitable database. It seems appropriate, therefore, that our first course in geographic information systems should focus upon the properties of geographic data.<\/p>\n<p>I began this first chapter by defining data in a generic sense, as sets of symbols that represent measurements of phenomena. I suggested that data are the raw materials from which information is produced. Information systems, such as database management systems, are technologies that people use to transform data into the information needed to answer questions, and to make decisions.<\/p>\n<p>Spatial data are special data. They represent the locations, extents, and attributes of objects and phenomena that make up the Earth\u2019s surface at particular times. 
Geographic data differ from other kinds of data in that they are distributed along a continuous, nearly spherical globe. They also have the unique property that the closer two entities are located, the more likely they are to share similar attributes.<\/p>\n<p>GIS is a special kind of information system that combines the capabilities of database management systems with those of mapping systems. GIS is one object of study of the loosely-knit, multidisciplinary field called Geographic Information Science and Technology. GIS is also a profession\u2013one of several that make up the geospatial industry. As Yogi Berra said, \u201cIn theory, there\u2019s no difference between theory and practice. In practice there is.\u201d In the chapters and projects that follow, we\u2019ll investigate the nature of geographic information from both conceptual and practical points of view.<\/p>\n<h3>COMMENTS AND QUESTIONS<\/h3>\n<p>Registered students are welcome to post comments, questions, and replies to questions about the text. Particularly welcome are anecdotes that relate the chapter text to your personal or professional experience. In addition, there are discussion forums available in the ANGEL course management system for comments and questions about topics that you may not wish to share with the whole world.<\/p>\n<p>To post a comment, scroll down to the text box under \u201cPost new comment\u201d and begin typing in the text box, or you can choose to reply to an existing thread. When you are finished typing, click on either the \u201cPreview\u201d or \u201cSave\u201d button (Save will actually submit your comment). Once your comment is posted, you will be able to edit or delete it as needed. In addition, you will be able to reply to other posts at any time.<\/p>\n<p>Note: the first few words of each comment become its \u201ctitle\u201d in the thread.<\/p>\n<h2>1.21. Bibliography<\/h2>\n<p>Carstensen, L. W. (1986). 
Regional land information systems development using relational databases and geographic information systems.\u00a0<em>Proceedings of the AutoCarto<\/em>, London, 507-516.<\/p>\n<p>City of Ontario, California. (n.d.).\u00a0<em>Geographic information web server<\/em>. Retrieved on July 6, 1999 from\u00a0<a href=\"http:\/\/www.ci.ontario.ca.us\/gis\/index.asp\">http:\/\/www.ci.ontario.ca.us\/gis\/index.asp<\/a>\u00a0(since retired).<\/p>\n<p>Cowen, D. J. (1988). GIS versus CAD versus DBMS: What are the differences?\u00a0<em>Photogrammetric Engineering and Remote Sensing<\/em>\u00a054:11, 1551-1555.<\/p>\n<p>DiBiase, D. and twelve others (2010).\u00a0<a href=\"http:\/\/opentextbc.ca\/natureofgeographicinformation\/chapter\/files\/sites\/file\/DiBiase_etal_2010_GTCM_URISA_Journal.pdf\">The New Geospatial Technology Competency Model: Bringing workforce needs into focus<\/a>.\u00a0<em>URISA Journal<\/em>\u00a022:2, 55-72.<\/p>\n<p>DiBiase, D., M. DeMers, A. Johnson, K. Kemp, A. Luck, B. Plewe, and E. Wentz (2007).\u00a0<a href=\"http:\/\/opentextbc.ca\/natureofgeographicinformation\/chapter\/files\/sites\/file\/BoK_CaGIS_2007.pdf\">Introducing the First Edition of the\u00a0<em>GIS&amp;T Body of Knowledge<\/em><\/a>.\u00a0<em>Cartography and Geographic Information Science,<\/em>\u00a034(2), pp. 113-120. U.S. National Report to the International Cartographic Association.<\/p>\n<p>Ennis, M. R. (2008). Competency models: A review of the literature and the role of the employment and training administration (ETA).\u00a0<a href=\"http:\/\/www.careeronestop.org\/COMPETENCYMODEL\/info_documents\/OPDRLiteratureReview.pdf\">http:\/\/www.careeronestop.org\/COMPETENCYMODEL\/info_documents\/OPDRLiteratureReview.pdf<\/a>.<\/p>\n<p>GITA and AAG (2006). Defining and communicating geospatial industry workforce demand: Phase I report.<\/p>\n<p>Goodchild, M. (1992). Geographical information science.\u00a0<em>International Journal of Geographical Information Systems<\/em>\u00a06:1, 31-45.<\/p>\n<p>Goodchild, M. 
(1995). GIS and geographic research. In J. Pickles (Ed.), <em>Ground truth: the social implications of geographic information systems<\/em> (pp. of chapter). New York: Guilford.<\/p>\n<p>National Decision Systems.\u00a0<em>A zip code can make your company lots of money!<\/em>\u00a0Retrieved on July 6, 1999 from <a href=\"http:\/\/laguna.natdecsys.com\/lifequiz\">http:\/\/laguna.natdecsys.com\/lifequiz<\/a>\u00a0(since retired).<\/p>\n<p>National Geodetic Survey. (1997).\u00a0<em>Image generated from 15\u2032x15\u2032 geoid undulations covering the planet Earth.\u00a0<\/em>Retrieved 1999, from <a href=\"http:\/\/www.ngs.noaa.gov\/GEOID\/geo-index.html\">http:\/\/www.ngs.noaa.gov\/GEOID\/geo-index.html<\/a>\u00a0(since retired).<\/p>\n<p>Nyerges, T. L. &amp; Golledge, R. G. (n.d.)\u00a0<em>NCGIA core curriculum in GIS<\/em>, National Center for Geographic Information and Analysis, University of California, Santa Barbara, Unit 007. Retrieved November 12, 1997, from <a href=\"http:\/\/www.ncgia.ucsb.edu\/giscc\/units\/u007\/u007.html\">http:\/\/www.ncgia.ucsb.edu\/giscc\/units\/u007\/u007.html<\/a>\u00a0(since retired).<\/p>\n<p>United States Department of the Interior Geological Survey. (1977). [map]. 1:24 000. 7.5 minute series. Washington, D.C.: USDI.<\/p>\n<p>United States Geological Survey. \u201cBellefonte, PA Quadrangle\u201d (1971). [map]. 1:24 000. 7.5 minute series. Washington, D.C.: USGS.<\/p>\n<p>University Consortium for Geographic Information Science. Retrieved April 26, 2006, from\u00a0<a href=\"http:\/\/www.ucgis.org\/\">http:\/\/www.ucgis.org<\/a>.<\/p>\n<p>Wilson, J. D. (2001). Attention data providers: A billion-dollar application awaits.\u00a0<em>GEOWorld<\/em>, February, 54.<\/p>\n<p>Worboys, M. F. (1995).\u00a0<em>GIS: A computing perspective<\/em>. London: Taylor and Francis.<\/p>\n",
"protected":false,"author":1,"menu_order":1,"template":"","meta":{"pb_show_title":"on","pb_short_title":"","pb_subtitle":"","pb_authors":["david-dibiase"],"pb_section_license":""},"chapter-type":[],"contributor":[47],"license":[],"class_list":["post-40","chapter","type-chapter","status-publish","hentry","contributor-david-dibiase"],"part":79,"_links":{"self":[{"href":"https:\/\/opentextbc.ca\/natureofgeographicinformation\/wp-json\/pressbooks\/v2\/chapters\/40","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/opentextbc.ca\/natureofgeographicinformation\/wp-json\/pressbooks\/v2\/chapters"}],"about":[{"href":"https:\/\/opentextbc.ca\/natureofgeographicinformation\/wp-json\/wp\/v2\/types\/chapter"}],"author":[{"embeddable":true,"href":"https:\/\/opentextbc.ca\/natureofgeographicinformation\/wp-json\/wp\/v2\/users\/1"}],"version-history":[{"count":7,"href":"https:\/\/opentextbc.ca\/natureofgeographicinformation\/wp-json\/pressbooks\/v2\/chapters\/40\/revisions"}],"predecessor-version":[{"id":891,"href":"https:\/\/opentextbc.ca\/natureofgeographicinformation\/wp-json\/pressbooks\/v2\/chapters\/40\/revisions\/891"}],"part":[{"href":"https:\/\/opentextbc.ca\/natureofgeographicinformation\/wp-json\/pressbooks\/v2\/parts\/79"}],"metadata":[{"href":"https:\/\/opentextbc.ca\/natureofgeographicinformation\/wp-json\/pressbooks\/v2\/chapters\/40\/metadata\/"}],"wp:attachment":[{"href":"https:\/\/opentextbc.ca\/natureofgeographicinformation\/wp-json\/wp\/v2\/media?parent=40"}],"wp:term":[{"taxonomy":"chapter-type","embeddable":true,"href":"https:\/\/opentextbc.ca\/natureofgeographicinformation\/wp-json\/pressbooks\/v2\/chapter-type?post=40"},{"taxonomy":"contributor","embeddable":true,"href":"https:\/
\/opentextbc.ca\/natureofgeographicinformation\/wp-json\/wp\/v2\/contributor?post=40"},{"taxonomy":"license","embeddable":true,"href":"https:\/\/opentextbc.ca\/natureofgeographicinformation\/wp-json\/wp\/v2\/license?post=40"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}