Explanation: CISC-235, Python/Java, C/C++, HOTNCUR|Matlab

CISC-235
Fall 2018
Assignment 2

A certain university has decided to get in on the blockbuster film game by creating a set of inter-connected movies which will be collectively called the HOTNCU (Harvard of the North Cinematic Universe). Each movie will focus on the gripping adventures of one or more super-heroes who happen to be students, staff or faculty at the mysterious University Q, situated in the far-away and not-cold-at-all land of Notario. (The University Administration is very proud of having come up with this clever disguise for the actual setting of the movies.)

The projected number of movies in the series will be at least 3000 and not more than 4000. Each movie under development has been assigned a project code name to preserve secrecy. Each code name is an 8-letter English word. A sample set of code names is provided in the file HOTNCU_potential_codenames_2018F.txt.

You have been assigned the task of creating a data structure that can
- contain 4000 items or more
- support insert and search operations
- provide access to each item with a maximum of 3 steps (that is, the maximum number of table addresses examined during an insert operation can be no more than 3). For more information on this, see below.

Your hard-earned data structures expertise has convinced you that neither a sorted array nor a binary tree can meet this requirement, so you have settled on using a hash table.

The HOTNCU Project Director was previously a Computer Science professor and she has taken an interest in your project. She has already decided that you are required to use some form of open addressing. She is aware that your table will need to be > 4000 in size but she wants you to try to minimize it.

She wants you to explore at least two forms of open addressing: quadratic probing and double hashing. For each method she wants you to experiment with different hashing functions and details of the collision resolution methods to determine a table size that lets you achieve the required performance standard.
See below for a discussion of how to compute the necessary information.

Part 1:
Formulate a hypothesis regarding which of quadratic probing or double hashing will achieve the required performance standard with smaller table sizes.

Part 2:
Decide how you will convert the code names into usable key values. This may involve converting each code name to an integer, or simply treating each code name as a bit string. You will also find a wealth of ideas on the Internet. Whatever method you decide on, explain why you chose it and remember to cite your source if it is not your own creation. You may wish to take advantage of the fact that all the code names have exactly 8 characters.

Part 3:
Implement a hash table where collisions are resolved by quadratic probing. Use a hashing function of your own choice. You must implement the algorithm yourself. Using downloaded code from external sources is not permitted – but writing your own code based on a published algorithm is fine (remember to cite your sources). You may wish to experiment with different functions to minimize the number of collisions.

Try at least three combinations of [symbol missing in source] and [symbol missing in source]:
1) [values missing in source]
2) [values missing in source] (remember that you will have to convert the result to an integer),
3) other combinations of your choice.

For each combination, find a table size that lets you achieve the requirement on maximum probe sequence length. Use experimentation to get close to the minimum table size that satisfies the requirement.

For experimentation, I suggest that you draw many random samples of 4000 strings from the provided text file, and run your hashing algorithms on the samples with a range of table sizes to determine which sizes meet the performance requirement.

A table size must be rejected if there is any code name in the sample for which the insert operation exceeds the maximum number of steps.

Part 4:
Repeat Part 3 but with Double Hashing instead of Quadratic Probing. Try at least three combinations of [symbol missing in source] and [symbol missing in source].
For each combination, find a table size that achieves the required performance.

You are free to choose any hashing functions you like for [symbol missing in source] and [symbol missing in source], but as with Part 3 you must implement them yourself.

Part 5:
Write a report according to the format specified for this course, including an analysis of your experimental results and whether or not they support your hypothesis.

Computing the Number of Steps in an Insertion
Every time your program looks at the content of a table address, that counts as a step. So if you are inserting a value and you try addresses 717, 5, and 2083 before finally inserting the value in address 3006, that counts as four steps.

Logistics:
You may complete the programming part of this assignment in Python, Java, C or C++.

You must submit your source code, properly formatted and documented. You must also submit a PDF file containing your report. All files must contain your name and student number, and must contain the following statement: “I confirm that this submission is my own work and is consistent with the Queens regulations on Academic Integrity.” Combine your files into a .zip archive for uploading.

You are required to work individually on this assignment. You may discuss the problem in general terms with others and brainstorm ideas, but you may not share code. This includes letting others read your code or your written conclusions.

The due date for this assignment is 11:59 PM, November 16, 2018.

Source: http://ass.3daixie.com/2018110830818150.html
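The step-counting rule described under "Computing the Number of Steps in an Insertion" can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration, not part of the assignment: the base-31 key conversion in `string_key`, the quadratic probe with both constants set to 1, the second hash `1 + key % (size - 1)`, and the helper names `insert` and `table_size_ok`. It is one possible way to count steps, not a reference solution.

```python
from typing import Callable, List, Optional

MAX_STEPS = 3  # limit on addresses examined per insert, as stated in the assignment

ProbeFn = Callable[[int, int, int], int]  # (key, attempt index, table size) -> address


def string_key(word: str) -> int:
    """Hypothetical key conversion: treat the word as a base-31 number."""
    value = 0
    for ch in word:
        value = value * 31 + ord(ch)
    return value


def quadratic_probe(key: int, i: int, size: int) -> int:
    """i-th address for quadratic probing; both constants are assumed to be 1."""
    return (key % size + i + i * i) % size


def double_hash_probe(key: int, i: int, size: int) -> int:
    """i-th address for double hashing; the assumed second hash is never 0."""
    step = 1 + key % (size - 1)
    return (key % size + i * step) % size


def insert(table: List[Optional[str]], word: str, probe: ProbeFn) -> int:
    """Insert word and return the number of table addresses examined."""
    key = string_key(word)
    size = len(table)
    for i in range(size):
        address = probe(key, i, size)
        # Looking at this address counts as one step, including the final
        # address where the word is stored (or found already present).
        if table[address] is None or table[address] == word:
            table[address] = word
            return i + 1
    raise RuntimeError("no free address found in the probe sequence")


def table_size_ok(words: List[str], size: int, probe: ProbeFn,
                  max_steps: int = MAX_STEPS) -> bool:
    """Reject the size as soon as any insert needs more than max_steps steps."""
    table: List[Optional[str]] = [None] * size
    return all(insert(table, w, probe) <= max_steps for w in words)
```

A candidate size could then be checked by calling `table_size_ok(sample, size, quadratic_probe)` for each random sample of 4000 code names, and repeating with `double_hash_probe` to compare the two methods.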
