COMP 8042
All work should be done individually.

Geographic Information System

Geographic information systems organize information pertaining to geographic features and provide various kinds of access to the information. A geographic feature may possess many attributes (see below). In particular, a geographic feature has a specific location. There are a number of ways to specify location. For this project, we will use latitude and longitude, which will allow us to deal with geographic features at any location on Earth. A reasonably detailed tutorial on latitude and longitude can be found on Wikipedia at http://en.wikipedia.org/wiki/Latitude and http://en.wikipedia.org/wiki/Longitude.

The GIS record files were obtained from the website for the USGS Board on Geographic Names (http://geonames.usgs.gov). The file begins with a descriptive header line, followed by a sequence of GIS records, one per line, which contain the fields provided in Table 1 in the Appendix, in the indicated order.

Notes:
• See https://geonames.usgs.gov/domestic/states fileformat.htm for the full field descriptions.
• The type specifications used here have been modified from the source (URL above) to better reflect the realities of your programming environment.
• Latitude and longitude may be expressed in DMS (degrees/minutes/seconds, 0820830W) format, or DEC (real number, -82.1417975) format. In DMS format, latitude will always be expressed using 6 digits followed by a single character specifying the hemisphere, and longitude will always be expressed using 7 digits followed by a hemisphere designator.
• Although some fields are mandatory, some may be omitted altogether. Best practice is to treat every field as if it may be left unspecified.
  Certain fields are necessary in order to index a record: the feature name and the primary latitude and primary longitude. If a record omits any of those fields, you may discard the record, or index it as far as possible.

In the GIS record file, each record will occur on a single line, and the fields will be separated by pipe (‘|’) symbols. Empty fields will be indicated by a pair of pipe symbols with no characters between them. See the posted VA Monterey.txt file for many examples.

GIS record files are guaranteed to conform to this syntax, so there is no explicit requirement that you validate the files. On the other hand, some error-checking during parsing may help you detect errors in your parsing logic.

The file can be thought of as a sequence of bytes, each at a unique offset from the beginning of the file, just like the cells of an array. So, each GIS record begins at a unique offset from the beginning of the file.

Line Termination

Each line of a text file ends with a particular marker (known as the line terminator). In MS-DOS/Windows file systems, the line terminator is a sequence of two ASCII characters (CR + LF, 0X0D0A). In Unix systems, the line terminator is a single ASCII character (LF). Other systems may use other line termination conventions.

Why should you care? Which line termination is used has an effect on the file offsets for all but the first record in the data file. As long as we're all testing with files that use the same line termination, we should all get the same file offsets. But if you change the file format (of the posted data files) to use different line termination, you will get different file offsets than are shown in the posted log files.
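To make the record syntax and the effect of line termination concrete, here is a minimal sketch of three helpers: one splits a record on pipe symbols, one converts a DMS string to signed seconds of arc, and one computes the byte offset of each line. The names splitFields, dmsToSeconds and lineOffsets are hypothetical and not part of the assignment. The sketch assumes that south and west are treated as negative, and that the stream was opened in binary mode so that a CR in front of each LF is counted.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Split one record on '|'. Two adjacent pipes yield an empty string, so the
// number of fields stays fixed even when values are missing.
std::vector<std::string> splitFields(const std::string& line) {
    std::vector<std::string> fields;
    std::size_t start = 0;
    while (true) {
        const std::size_t bar = line.find('|', start);
        if (bar == std::string::npos) {
            fields.push_back(line.substr(start));
            return fields;
        }
        fields.push_back(line.substr(start, bar - start));
        start = bar + 1;
    }
}

// Convert a DMS string such as "382812N" (6 digits + hemisphere) or
// "0820830W" (7 digits + hemisphere) to signed seconds of arc.
// Assumption of this sketch: S and W are negative. Returns false for
// anything else, including the "Unknown" placeholder.
bool dmsToSeconds(const std::string& dms, long& seconds) {
    if (dms.size() != 7 && dms.size() != 8) return false;
    const std::size_t digits = dms.size() - 1;
    for (std::size_t i = 0; i < digits; ++i)
        if (!std::isdigit(static_cast<unsigned char>(dms[i]))) return false;
    const char hemi = dms[digits];
    const bool isLat = (hemi == 'N' || hemi == 'S');
    const bool isLon = (hemi == 'E' || hemi == 'W');
    if (!isLat && !isLon) return false;
    if ((isLat && digits != 6) || (isLon && digits != 7)) return false;
    const long degrees = std::stol(dms.substr(0, digits - 4));
    const long minutes = std::stol(dms.substr(digits - 4, 2));
    const long secs = std::stol(dms.substr(digits - 2, 2));
    seconds = degrees * 3600 + minutes * 60 + secs;
    if (hemi == 'S' || hemi == 'W') seconds = -seconds;
    return true;
}

// Return the byte offset at which each line starts. With CR+LF files the CR
// stays at the end of each line read by getline, so it is counted here and
// every offset after the first is larger than in an LF-only file.
std::vector<long> lineOffsets(std::istream& in) {
    std::vector<long> offsets;
    std::string line;
    long offset = 0;
    while (std::getline(in, line)) {
        offsets.push_back(offset);
        offset += static_cast<long>(line.size()) + 1;  // +1 for the LF
    }
    return offsets;
}
```

The same two-line input gives a second-record offset of 4 with LF endings and 5 with CR+LF endings, which is why offsets only match the posted logs when the line termination is left unchanged.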
Most good text editors will tell you what line termination is used in an opened file, and also let you change the line termination scheme.

In Figure 1, note that some record fields are optional, and that when there is no given value for a field, there are still delimiter symbols for it.

Also, some of the lines are “wrapped” to fit into the text box; lines are never “wrapped” in the actual data files.

Figure 1: Sample Geographic Data Records

Assignment

You will implement a system that indexes and provides search features for a file of GIS records, as described above.

Your system will build and maintain several in-memory index data structures to support these operations:
• Importing new GIS records into the database file
• Retrieving data for all GIS records matching given geographic coordinates
• Retrieving data for all GIS records matching a given feature name and state
• Retrieving data for all GIS records that fall within a given (rectangular) geographic region
• Displaying the in-memory indices in a human-readable manner

You will implement a single software system in C++ to perform all system functions.

Program Invocation

The program will take the names of three files from the command line, like this:

./GIS

Note that this implies your main class must be named GIS, and that it must compile with a simple g++ compile command. You are encouraged to create a makefile for the project and to provide the required dependency files in your submission.

The database file should be created as an empty file; note that the specified database file may already exist, in which case the existing file should be truncated or deleted and recreated. If the command script file is not found, the program should write an error message to the console and exit.
The log file should be rewritten every time the program is run, so if the file already exists it should be truncated or deleted and recreated.

System Overview

The system will create and maintain a GIS database file that contains all the records that are imported as the program runs. The GIS database file will be empty initially. All the indexing of records will be done relative to this file.

There is no guarantee that the GIS record file will not contain two or more distinct records that have the same geographic coordinates. In fact, this is natural since the coordinates are expressed in the usual DMS system. So, we cannot treat geographic coordinates as a primary (unique) key.

The GIS records will be indexed by the Feature Name and State (abbreviation) fields. This name index will support finding offsets of GIS records that match a given feature name and state abbreviation.

The GIS records will also be indexed by geographic coordinate. This coordinate index will support finding offsets of GIS records that match a given primary latitude and primary longitude.

The system will include a buffer pool, as a front end for the GIS database file, to improve search speed. See the discussion of the buffer pool below for detailed requirements. When performing searches, retrieving a GIS record from the database file must be managed through the buffer pool. During an import operation, when records are written to the database file, the buffer pool will be bypassed, since the buffer pool would not improve performance during imports.

When searches are performed, complete GIS records will be retrieved from the GIS database file that your program maintains.
The only complete GIS records that are stored in memory at any time are those that have just been retrieved to satisfy the current search, individual GIS records created while importing data, or GIS records stored in the buffer pool.

Aside from where specific data structures are required, you may use any suitable STL library implementation you like.

Each index should have the ability to write a nicely-formatted display of itself to an output stream.

Name Index Internals

The name index will use a hash table for its physical organization. Each hash table entry will store a feature name and state abbreviation (separately or concatenated, as you like) and the file offset(s) of the matching record(s). Since each GIS record occupies one line in the file, it is a trivial matter to locate and read a record given nothing but the file offset at which the record begins.

Your table will use quadratic probing to resolve collisions, with the quadratic function (n^2 + n)/2 to compute the step size.

The hash table must use a contiguous physical structure (array). The initial size of the table will be 1024, and the table will resize itself automatically, by doubling its size whenever the table becomes 70% full.

Since the specified table sizes given above are powers of 2, an empty slot will always be found unless the table is full.

You can use your desired hash function (e.g. elfhash), and apply it to the concatenation of the feature name and state (abbreviation) field of the data records. Precisely how you form the concatenation is up to you.

You must be able to display the contents of the hash table in a readable manner.

Coordinate Index Internals

The coordinate index will use a bucket PR quadtree for the physical organization. In a bucket PR quadtree, each leaf stores up to K data objects (for some fixed value of K).
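Returning briefly to the name index: the probing and resizing rules given under Name Index Internals can be sketched as follows. This is a minimal illustration under stated assumptions, not a required design. The class name NameTable and its members are hypothetical, the key is assumed to be the feature name and state joined with a pipe, the ELF hash is used because the text names it as one acceptable choice, and the 70% check is assumed to run after each insertion.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// The classic ELF hash (one acceptable choice of hash function).
unsigned long elfHash(const std::string& key) {
    unsigned long h = 0;
    for (unsigned char c : key) {
        h = (h << 4) + c;
        const unsigned long g = h & 0xF0000000UL;
        if (g != 0) h ^= g >> 24;
        h &= ~g;
    }
    return h;
}

class NameTable {
    struct Slot {
        bool occupied;
        std::string key;
        std::vector<long> offsets;
        Slot() : occupied(false) {}
    };

public:
    explicit NameTable(std::size_t capacity = 1024)
        : slots_(capacity), used_(0), longestProbe_(0) {}

    // Record that `key` (e.g. "Monterey|VA") occurs at file offset `offset`.
    void insert(const std::string& key, long offset) {
        Slot& slot = locate(slots_, key);
        if (!slot.occupied) {
            slot.occupied = true;
            slot.key = key;
            ++used_;
        }
        slot.offsets.push_back(offset);
        if (used_ * 10 >= slots_.size() * 7) grow();  // 70% full: double
    }

    // Offsets stored for `key`, or nullptr if the key is absent.
    const std::vector<long>* find(const std::string& key) const {
        const std::size_t home = elfHash(key) % slots_.size();
        for (std::size_t n = 0; n < slots_.size(); ++n) {
            const Slot& slot = slots_[(home + (n * n + n) / 2) % slots_.size()];
            if (!slot.occupied) return nullptr;
            if (slot.key == key) return &slot.offsets;
        }
        return nullptr;
    }

    std::size_t size() const { return used_; }
    std::size_t capacity() const { return slots_.size(); }
    std::size_t longestProbe() const { return longestProbe_; }

private:
    // Probe home, home+1, home+3, home+6, ... i.e. home + (n^2 + n)/2.
    // Stops at the slot holding `key` or at the first empty slot.
    // Note: probes made while rehashing in grow() are counted as well.
    Slot& locate(std::vector<Slot>& table, const std::string& key) {
        const std::size_t home = elfHash(key) % table.size();
        for (std::size_t n = 0;; ++n) {
            Slot& slot = table[(home + (n * n + n) / 2) % table.size()];
            if (!slot.occupied || slot.key == key) {
                if (n > longestProbe_) longestProbe_ = n;
                return slot;
            }
        }
    }

    // Double the array and re-insert every occupied slot.
    void grow() {
        std::vector<Slot> bigger(slots_.size() * 2);
        for (const Slot& old : slots_) {
            if (!old.occupied) continue;
            Slot& slot = locate(bigger, old.key);
            slot.occupied = true;
            slot.key = old.key;
            slot.offsets = old.offsets;
        }
        slots_.swap(bigger);
    }

    std::vector<Slot> slots_;
    std::size_t used_;
    std::size_t longestProbe_;
};
```

Because the table size stays a power of 2, the offsets (n^2 + n)/2 visit every slot before repeating, so the unbounded loop in locate terminates as long as the table is never allowed to fill, which the 70% rule guarantees.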
Upon insertion, if the added value would fall into a leaf that is already full, then the region corresponding to the leaf will be partitioned into quadrants and the K+1 data objects will be inserted into those quadrants as appropriate. As is the case with the regular PR quadtree, this may lead to a sequence of partitioning steps, extending the relevant branch of the quadtree by multiple levels. In this project, K will probably equal 4, but I reserve the right to specify a different bucket size with little notice, so this should be easy to modify.

The index entries held in the quadtree will store a geographic coordinate and a collection of the file offsets of the matching GIS records in the database file.

Note: do not confuse the bucket size with any limit on the number of GIS records that may be associated with a single geographic coordinate. A quadtree node can contain index objects for up to K different geographic coordinates. Each such index object can contain references to an unlimited number of different GIS records.

The PR quadtree implementation should follow good design practices, and its interface should be somewhat similar to that of the BST. You are expected to implement different types for the leaf and internal nodes, with appropriate data membership for each, and an abstract base type from which they are both derived.

You must be able to display the PR quadtree in a readable manner. The display must clearly indicate the structure of the tree, the relationships between its nodes, and the data objects in the leaf nodes.

Buffer Pool Details

The buffer pool for the database file should be capable of buffering up to 15 records, and will use LRU replacement. You may use any structure you like to organize the pool slots; however, since the pool will have to deal with record replacements, some structures will be more efficient (and simpler) to use.
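As one illustration of a structure that makes LRU replacement simple, the sketch below pairs a linked list kept in MRU-to-LRU order with a map from file offset to list position. It is only a sketch under assumptions: the class name BufferPool, the Loader callback (standing in for whatever code reads a record from the database file), and the choice to store raw record strings are all hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <list>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

class BufferPool {
public:
    // Hypothetical callback that reads the record starting at a file offset.
    typedef std::function<std::string(long)> Loader;

    BufferPool(std::size_t capacity, Loader loader)
        : capacity_(capacity), loader_(loader) {
        assert(capacity_ > 0);
    }

    // Return the record at `offset`, loading it only on a miss.
    std::string get(long offset) {
        auto hit = index_.find(offset);
        if (hit != index_.end()) {
            // Hit: move the entry to the front (most recently used).
            entries_.splice(entries_.begin(), entries_, hit->second);
            return entries_.front().second;
        }
        if (entries_.size() == capacity_) {
            // Pool full: evict the entry at the back (least recently used).
            index_.erase(entries_.back().first);
            entries_.pop_back();
        }
        entries_.push_front(std::make_pair(offset, loader_(offset)));
        index_[offset] = entries_.begin();
        return entries_.front().second;
    }

    // Offsets currently buffered, listed from MRU to LRU.
    std::vector<long> offsetsMruToLru() const {
        std::vector<long> order;
        for (const auto& entry : entries_) order.push_back(entry.first);
        return order;
    }

private:
    typedef std::list<std::pair<long, std::string> > Entries;

    std::size_t capacity_;
    Loader loader_;
    Entries entries_;
    std::unordered_map<long, Entries::iterator> index_;
};
```

With this layout a hit, a miss and an eviction each take constant expected time, and the MRU-to-LRU listing is just a walk of the list. A pool for this project would be constructed with a capacity of 15.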
You may use any classes from the STL library you think are appropriate.

It is up to you to decide whether your buffer pool stores interpreted or raw data; i.e., whether the buffer pool stores GIS record objects or just strings.

You must be able to display the contents of the buffer pool, listed from MRU to LRU entry, in a readable manner. The order in which you retrieve records when servicing a multi-match search is not specified, so such searches may result in different orderings of the records within the buffer pool. That is OK.

A Note on Coordinates and Spatial Regions

It is important to remember that there are fundamental differences between the notion that a geographic feature has specific coordinates (which may be thought of as a point) and the notion that each node of the PR quadtree corresponds to a particular sub-region of the coordinate space (which may contain many geographic features).

In this assignment, coordinates of geographic features are specified as latitude/longitude pairs, and the minimum resolution is one second of arc. Thus, you may think of the geographic coordinates as being specified by a pair of integer values.

On the other hand, the boundaries of the sub-regions are determined by performing arithmetic operations, including division, starting with the values that define the boundaries of the world. Unless the dimensions of the world happen to be powers of 2, this can quickly lead to regions whose boundaries cannot be expressed exactly as integer values. You may use floating-point values or integer values to represent region boundaries when computing region boundaries during splitting and quadtree traversals. If you use integers, be careful not to unintentionally create “gaps” between regions.

Your implementation should view the boundary between regions as belonging to one of those regions.
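One possible way to avoid gaps with integer boundaries is to treat every region as half-open, so that a shared edge belongs to the region on its upper side. The sketch below assumes coordinates are stored as integer seconds of arc (x for longitude, y for latitude) and uses a hypothetical Rect type. The half-open rule is just one of the choices the text permits, and it implies that points on the world's upper edges fall outside the world unless the world rectangle is widened by one second.

```cpp
#include <cassert>

// A region covering [xlo, xhi) x [ylo, yhi), in seconds of arc.
// Low edges belong to the region, high edges to its neighbour.
struct Rect {
    long xlo, xhi, ylo, yhi;

    bool contains(long x, long y) const {
        return xlo <= x && x < xhi && ylo <= y && y < yhi;
    }

    // True when the two half-open regions share at least one point.
    bool intersects(const Rect& other) const {
        return xlo < other.xhi && other.xlo < xhi &&
               ylo < other.yhi && other.ylo < yhi;
    }

    // Quadrant numbering used by this sketch: 0 = SW, 1 = SE, 2 = NW, 3 = NE.
    // Both halves of each axis share the same integer midpoint, so the four
    // quadrants tile the parent exactly, with no gaps and no overlaps.
    Rect quadrant(int q) const {
        const long xmid = xlo + (xhi - xlo) / 2;
        const long ymid = ylo + (yhi - ylo) / 2;
        const bool east = (q == 1 || q == 3);
        const bool north = (q == 2 || q == 3);
        Rect r;
        r.xlo = east ? xmid : xlo;
        r.xhi = east ? xhi : xmid;
        r.ylo = north ? ymid : ylo;
        r.yhi = north ? yhi : ymid;
        return r;
    }

    // Index of the quadrant that holds the point (x, y).
    int quadrantOf(long x, long y) const {
        assert(contains(x, y));
        const long xmid = xlo + (xhi - xlo) / 2;
        const long ymid = ylo + (yhi - ylo) / 2;
        return (x >= xmid ? 1 : 0) + (y >= ymid ? 2 : 0);
    }
};
```

Under this convention a closed search rectangle centered at (cx, cy) with half-width w and half-height h can be written as Rect{cx - w, cx + w + 1, cy - h, cy + h + 1}, and a subtree is visited only when its region intersects that rectangle.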
The choice of a particular rule for handling this situation is left to you.

When carrying out a region search, you must determine whether the search region overlaps with the region corresponding to a subtree node before descending into that subtree. You may define a Rectangle class which could be (too) useful.

Other System Elements

There should be an overall controller that validates the command line arguments and manages the initialization of the various system components. The controller should hand off execution to a command processor that manages retrieving commands from the script file, and making the necessary calls to other components in order to carry out those commands.

Naturally, there should be a data type that models a GIS record.

There may well be additional system elements, whether data types or data structures, or system components that are not mentioned here. The fact no additional elements are explicitly identified here does not imply that you will not be expected to analyze the design issues carefully, and to perhaps include such elements.

Aside from the command-line interface, there are no specific requirements for interfaces of any of the classes that will make up your GIS; it is up to you to analyze the specification and come up with an appropriate set of classes, and to design their interfaces to facilitate the necessary interactions. It is probably worth pointing out that an index (e.g., a geographic coordinate index) should not simply be a naked container object (e.g., a quadtree); if that’s not clear to you, think more carefully about what sort of interface would be appropriate for an index, as opposed to a container.

Command File

The execution of the program will be driven by a script file. Lines beginning with a semicolon character (‘;’) are comments and should be ignored. Blank lines are possible. Each line in the command file consists of a sequence of tokens, which will be separated by single tab characters.
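The line format just described can be handled with a small tokenizer. The following is a minimal sketch; the helper names chomp, isCommentOrBlank and tokenize are hypothetical, and stripping a trailing CR is an extra precaution assumed here for scripts saved with CR+LF line termination.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Drop one trailing CR, if present (left behind by getline on CR+LF files).
std::string chomp(const std::string& line) {
    if (!line.empty() && line[line.size() - 1] == '\r')
        return line.substr(0, line.size() - 1);
    return line;
}

// Comment lines start with ';'. Blank lines carry no command.
bool isCommentOrBlank(const std::string& rawLine) {
    const std::string line = chomp(rawLine);
    return line.empty() || line[0] == ';';
}

// Split a command line on single tab characters. Spaces are preserved,
// so a token may itself contain blanks.
std::vector<std::string> tokenize(const std::string& rawLine) {
    const std::string line = chomp(rawLine);
    std::vector<std::string> tokens;
    std::size_t start = 0;
    while (start <= line.size()) {
        std::size_t tab = line.find('\t', start);
        if (tab == std::string::npos) tab = line.size();
        tokens.push_back(line.substr(start, tab - start));
        start = tab + 1;
    }
    return tokens;
}
```

A command processor would typically read the script line by line, skip lines for which isCommentOrBlank returns true after echoing comments to the log, and dispatch on the first token of everything else.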
A line terminator will immediately follow the final token on each line. The command file is guaranteed to conform to this specification, so you do not need to worry about error-checking when reading it.

The first non-comment line will specify the world boundaries to be used:

world
    This will be the first command in the file, and will occur once. It specifies the boundaries of the coordinate space to be modeled. The four parameters will be longitudes and latitudes expressed in DMS format, representing the vertical and horizontal boundaries of the coordinate space.

    It is certainly possible that the GIS record file will contain records for features that lie outside the specified coordinate space. Such records should be ignored; i.e., they will not be indexed.

Each subsequent non-comment line of the command file will specify one of the commands described below. One command is used to load records into your database from external files:

import
    Add all the valid GIS records in the specified file to the database file. This means that the records will be appended to the existing database file, and that those records will be indexed in the manner described earlier. When the import is completed, log the number of entries added to each index, and the longest probe sequence that was needed when inserting to the hash table. (A valid record is one that lies within the specified world boundaries.)

Another command requires producing a human-friendly display of the contents of an index structure:

debug [ quad | hash | pool ]
    Log the contents of the specified index structure in a fashion that makes the internal structure and contents of the index clear.
    It is not necessary to be overly verbose here, but you should include information like key values and file offsets where appropriate.

Another simply terminates execution, which is handy if you want to process only part of a command file:

quit
    Terminate program execution.

The other commands involve searches of the indexed records:

what is at
    For every GIS record in the database file that matches the given geographic coordinate, log the offset at which the record was found, and the feature name, county name, and state abbreviation. Do not log any other data from the matching records.

what is
    For every GIS record in the database file that matches the given feature name and state abbreviation, log the offset at which the record was found, and the county name, the primary latitude, and the primary longitude. Do not log any other data from the matching records.

what is in
    For every GIS record in the database file whose coordinates fall within the closed rectangle with the specified height and width, centered at the given geographic coordinate, log the offset at which the record was found, and the feature name, the state name, and the primary latitude and primary longitude. Do not log any other data from the matching records. The half-height and half-width are specified as seconds.

    The what is in command takes an optional modifier, -long, which causes the display of a long listing of the relevant records. The switch will be the first token following the name of the command. If this switch is present, then for every GIS record in the database file whose coordinates fall within the closed rectangle with the specified height and width, centered at the given geographic coordinate, log every important non-empty field, nicely formatted and labeled. See the posted log files for an example. Do not log any empty fields. The half-height and half-width are specified as seconds.

    The what is in command also takes an optional modifier, causing the search results to be filtered:

    -filter [ pop | water | structure ]
    The switch and its modifier will be the first and second tokens following the name of the command.
    If present, this causes the set of matching records to be filtered to only show those whose feature type field corresponds to the given filter specifier. See Table 2 in the Appendix for instructions on how to interpret the feature types shown above.

If a geographic coordinate is specified for a command, it will be expressed as a pair of latitude/longitude values, expressed in the same DMS format that is used in the GIS record files.

For all the commands, if a search results in displaying information about multiple records, you have to use a sort algorithm to show them in sorted order. The choice of sort algorithm, and of the feature to sort on, is up to you. Optionally, you can accept the sort algorithm name (in case you have implemented more than one) and the feature to sort on as an argument immediately after the other switch/modifier pair(s), if any. Make sure you mention the implemented sort algorithm and discuss its performance in your final report.

Sample command scripts, and corresponding log files, are provided alongside this description file. As a general rule, every command should result in some output. In particular, a descriptive message should be logged if a search yields no matching records.

Log File Description

Your output should be clear, concise, well labeled, and correct. You should begin the log with a few lines identifying yourself, and listing the names of the input files that are being used.

The remainder of the log file output should come directly from your processing of the command file. You are required to echo each comment line, and each command that you process, to the log file so that it’s easy to determine which command each section of your output corresponds to.
Each command (except for “world”) should be numbered, starting with 1, and the output from each command should be well formatted, and delimited from the output resulting from processing other commands.

Submission

For this assignment, you must submit an archive (zip or tar) file containing all the source code files for your implementation (i.e., .cpp files). Submit only the source files. Do not submit the compiled files or any object files. If you use packages in your implementation (and that’s good practice), your archive file must include the correct directory structure for those packages, and your GIS.cpp file must be in the top directory when the archive file is unpacked. Your code must be ready to compile using g++ -std=c++11, or a simple Makefile if you have a more complicated structure. Make sure no Visual Studio related dependencies or solution files are included when submitting, since I certainly will not use Visual Studio to test and grade your project.

Alongside your source files, I need a pdf file describing your solution, the general architecture of your code, and the list of data structures you have implemented or used from the STL. Run one of the scripts and put a screenshot of that run in your report, as well.

The correctness of your solution will be evaluated by executing your solution on a collection of test data files. Be sure to test your solution with all of the data sets that are posted, since I will use a variety of data sets, including at least one very large one (perhaps hundreds of thousands of records), in my evaluation.

As stated at the beginning of this description, this project must be done individually. You are not allowed to copy a single function from another person nor from the internet without citing them.
You may use any of the previously provided lab source code completed by yourself to reduce the implementation time of this assignment (don’t forget to mention which lab code you have used in your final report).

Appendix

Table 1: Geographic Data Record Format

Name                Type     Length/Decimals  Short Description
Feature ID          Integer  10               Permanent, unique feature record identifier
Feature Name        String   120              Official feature name
Feature Class       String   50               See Table 2 later in this specification
StateAlpha          String   2                The unique two letter alphabetic code for a US State
StateNumeric        String   2                The unique two number code for a US State
CountyName          String   100              The name of a county or county equivalent
CountyNumeric       String   3                The unique three number code for a county or county equivalent
PrimaryLatitudeDMS  String   7                The official feature location. DMS: degrees/minutes/seconds. DEC: decimal degrees.
    Note: Records showing “Unknown” and zeros for the latitude and longitude DMS and decimal fields, respectively, indicate that the coordinates of the feature are unknown. They are recorded in the database as zeros to satisfy the format requirements of a numerical data type. They are not errors and do not reference the actual geographic coordinates at 0 latitude, 0 longitude.
DMS                 String   7                Source coordinates of linear feature only (Class = Stream, Valley, Arroyo). DMS: degrees/minutes/seconds. DEC: decimal degrees.
    Note: Records showing “Unknown” and zeros for the latitude and longitude DMS and decimal fields, respectively, indicate that the coordinates of the feature are unknown. They are recorded in the database as zeros to satisfy the format requirements of a numerical data type.
    They are not errors and do not reference the actual geographic coordinates at 0 latitude, 0 longitude.
SourceLongitudeDMS  String   8
Elevation(meters)   Integer  5                Elevation in meters above (-below) sea level of the surface at the primary coordinates
Elevation(feet)     Integer  6                Elevation in feet above (-below) sea level of the surface at the primary coordinates
Map Name            String   100              Name of USGS base series topographic map containing the primary coordinates.
Date Created        String   Date             The date the feature was initially committed to the database.
Date Edited         String   Date             The date any attribute of an existing feature was last edited.

The Feature Class field of the GIS records may contain any of the following designators. We care about these because of the -filter switch that may be used with the what is in commands. The table below indicates which of the standard designators correspond to one of the filter specifiers.

Table 2: Feature Class Designators

Class            Type       Description
Airport          structure  Manmade facility maintained for the use of aircraft (airfield, airstrip, landing field, landing strip).
Arch                        Natural arch-like opening in a rock mass (bridge, natural bridge, sea arch).
Area                        Any one of several areally extensive natural features not included in other categories (badlands, barren, delta, fan, garden).
Arroyo           water      Watercourse or channel through which water may occasionally flow (coulee, draw, gully, wash).
Bar                         Natural accumulation of sand, gravel, or alluvium forming an underwater or exposed embankment (ledge, reef, sandbar, shoal, spit).
Basin                       Natural depression or relatively low area enclosed by higher land (amphitheater, cirque, pit, sink).
Bay              water      Indentation of a coastline or shoreline enclosing a part of a body of water; a body of water partly surrounded by land (arm, bight, cove, estuary, gulf, inlet, sound).
Beach                       The sloping shore along a body of water that is washed by waves or tides and is usually covered by sand or gravel (coast, shore, strand).
Bench                       Area of relatively level land on the flank of an elevation such as a
hill, ridge, or mountain where the slope of the land rises on one side and descends on the opposite side (level).
Bend             water      Curve in the course of a stream and (or) the land within the curve; a curve in a linear body of water (bottom, loop, meander).
Bridge           structure  Manmade structure carrying a trail, road, or other transportation system across a body of water or depression (causeway, overpass, trestle).
Building         structure  A manmade structure with walls and a roof for protection of people and (or) materials, but not including church, hospital, or school.
Canal            water      Manmade waterway used by watercraft or for drainage, irrigation, mining, or water power (ditch, lateral).
Cape                        Projection of land extending into a body of water (lea, neck, peninsula, point).
Cave                        Natural underground passageway or chamber, or a hollowed out cavity in the side of a cliff (cavern, grotto).
Cemetery                    A place or area for burying the dead (burial, burying ground, grave, memorial garden).
Census                      A statistical area delineated locally specifically for the tabulation of Census Bureau data (census designated place, census county division, unorganized territory, various types of American Indian/Alaska Native statistical areas). Distinct from Civil and Populated Place.
Channel          water      Linear deep part of a body of water through which the main volume of water flows and is frequently used as a route for watercraft (passage, reach, strait, thoroughfare, throughfare).
Church           structure  Building used for religious worship (chapel, mosque, synagogue, tabernacle, temple).
Civil                       A political division formed for administrative purposes (borough, county, incorporated place, municipio, parish, town, township). Distinct from Census and Populated Place.
Cliff                       Very steep or vertical slope (bluff, crag, head, headland, nose, palisades, precipice, promontory, rim, rimrock).
Crater                      Circular-shaped depression at the summit of a volcanic cone or one on the surface of the land caused by the impact of a meteorite; a manmade depression caused by an explosion (caldera,
lua).
Crossing                    A place where two or more routes of transportation form a junction or intersection (overpass, underpass).
Dam              structure  Water barrier or embankment built across the course of a stream or into a body of water to control and (or) impound the flow of water (breakwater, dike, jetty).
Falls            water      Perpendicular or very steep fall of water in the course of a stream (cascade, cataract, waterfall).
Flat                        Relative level area within a region of greater relief (clearing, glade, playa).
Forest                      Bounded area of woods, forest, or grassland under the administration of a political agency (see "woods") (national forest, national grasslands, State forest).
Gap                         Low point or opening between hills or mountains or in a ridge or mountain range (col, notch, pass, saddle, water gap, wind gap).
Glacier          water      Body or stream of ice moving outward and downslope from an area of accumulation; an area of relatively permanent snow or ice on the top or side of a mountain or mountainous area (icefield, ice patch, snow patch).
Gut              water      Relatively small coastal waterway connecting larger bodies of water or other waterways (creek, inlet, slough).
Harbor           water      Sheltered area of water where ships or other watercraft can anchor or dock (hono, port, roads, roadstead).
Hospital         structure  Building where the sick or injured may receive medical or surgical attention (infirmary).
Island                      Area of dry or relatively dry land surrounded by water or low wetland (archipelago, atoll, cay, hammock, hummock, isla, isle, key, moku, rock).
Isthmus                     Narrow section of land in a body of water connecting two larger land areas.
Lake             water      Natural body of inland water (backwater, lac, lagoon, laguna, pond, pool, resaca, waterhole).
Lava                        Formations resulting from the consolidation of molten rock on the surface of the Earth (kepula, lava flow).
Levee            structure  Natural or manmade embankment flanking a stream (bank, berm).
Locale                      Place at which there is or was human activity; it does not include populated places, mines, and dams (battlefield, crossroad, camp, farm,
ghost town, landing, railroad siding, ranch, ruins, site, station, windmill).
Military                    Place or facility used for various aspects of or relating to military activity.
Mine                        Place or area from which commercial minerals are or were removed from the Earth; not including oilfield (pit, quarry, shaft).
Oilfield                    Area where petroleum is or was removed from the Earth.
Park             structure  Place or area set aside for recreation or preservation of a cultural or natural resource and under some form of government administration; not including National or State forests or Reserves (national historical landmark, national park, State park, wilderness area).
Pillar                      Vertical, standing, often spire-shaped, natural rock formation (chimney, monument, pinnacle, pohaku, rock tower).
Plain                       A region of general uniform slope, comparatively level and of considerable extent (grassland, highland, kula, plateau, upland).
Populated Place  pop        Place or area with clustered or scattered buildings and a permanent human population (city, settlement, town, village). A populated place is usually not incorporated and by definition has no legal boundaries. However, a populated place may have a corresponding "civil" record, the legal boundaries of which may or may not coincide with the perceived populated place. Distinct from Census a