WI20-Assignment4
Running Queries against complex models using the STL
Due Saturday Feb 29 at 11:15pm
Introduction
In this assignment, you will continue to develop your software design skills AND gain experience working with STL containers, iterators, and algorithms. You're going to be building a small data container, and a simple query language to retrieve results. This assignment will lead you to get some experience with the following idioms:
- Some practice with STL containers and iterators
- Practice building a complex model
- Practice using interfaces and the
adapter
design pattern - Practice using the
Command
design pattern - Practice building a "mini-language" to work with a virtual model
NOTE: We've given you some starter code. But that doesn't mean it's all complete, or won't need to be adapted to meet your needs. YOU are responsible to make changes to the given code if your solution requires it. You can't change or remove the definition of given methods, but you can add your own, and change how things are implemented. (You might not need to make any changes).
This assignment has four parts:
[图片上传失败...(image-3fe1a2-1582791266402)]
Part 1 -- Reading a JSON file
In part one -- you'll read the contents of a (complex) text file in the JSON format. Fortunately - we're providing you with tools that give you a big headstart on this step.
Part 2 -- Building a Model
(graph) of "elements" that reflects the JSON file
The JSON file you are reading describes a complicated graph of data elements (collections of key/value pairs, and simple element lists). You will use STL containers to build an in-memory representation (a model) of the hierarchy represented in the JSON file. You will build and manage that graph in a class called Model
.
Part 3 -- Processing queries
Once your model is built, it's time to ask it questions. Your model offers a set of methods that makes it possible to build simple queries:
//These indicate where to look in your model...
ModelQuery& within(const std::string &aTarget);
ModelQuery& before(const std::string &aTarget);
ModelQuery& after(const std::string &aTarget);
//These specify what you want to retrieve...
ModelQuery& all(const std::string &aTarget);
ModelQuery& first(const std::string &aTarget);
ModelQuery& last(const std::string &aTarget);
ModelQuery& nth(const std::string &aTarget, size_t anIndex);
//these statements get results (and are always the last part of a query)
size_t count(); //how MANY items are in the result set?
double sum(const std::string &aField); //what's the sum of a given key/value in the result set
double avg(const std::string &aField); //what's the sum of a given key/value in the result set
ModelCollection* get(); //get a result (one or more elements, or a value if you're targeting a key)
std::string get('keyvalue'); //get the value as a string for the named key
-- A Note about these statement types
In the list of methods (above), there are 3 types of statements:
- Those that specify where to look (
within
,before
,after
) (you may specify more than 1) - Those that specify what you want (
all
,first
,last
,nth
) - Those that produce a result (
get
,count
,avg
,sum
) -- these are always specified last!
A large variety of queries can be built by chaining various statements together. Most statements can be given in any order, but the count
, sum
, and get
statements must be the last in the query.
-- Example
To build (and run) a query, you start by asking your Model
class for a new query:
auto theQuery = theModel.query();
Then you add specific statements to the query, to express what you're looking for.
theQuery.within("sammy").after("list");
Then you tell the query the type of result you're looking for. This may be a collection of nodes in your model (get()
), or the count()
or sum()
of values in the result set.
Putting it all together, it looks like this:
size_t theCount=theQuery.within("xxx").first("chandini").after("renjei").count();
This query indicates it wants to know how how many nodes with the name "Chanhini", follow a node named ("renjei").
Part 4 -- Returning results (one or multiple items)
Once you've proceeded the query, you will return a result. The result may be a single item (like a count or a value), or it might be a collection of ModelNode
elements from your Model
. When you need to return a collection of ModelNodes
from your Model
, you'll return a ModelCollection
. This is a object (probably uses STL container internally) to hold a set of results from the query.
Understanding JSON
JSON or JavaScript Object Notation is a lightweight text-based open standard designed for human-readable data interchange. It is easy to read and write and also language independent.
- It is primarily used to transmit data between a server and web applications. Web services and APIs use JSON format to provide public data.
- It is also used for serializing and transmitting structured data over network connection.
This is what a json looks like -
{
"courseId":"ECE-141A",
"courseName":"Software Foundations - I",
"studentList":[
{
"id":"foo",
"name":"Foo Bar",
"pId":"A22222222",
"gpa":3.95,
"isGradStudent":true
},
{
"id":"bar",
"name":"Bar Foo",
"pId":"A99999999",
"gpa":3.85,
"isGradStudent":false
}
],
"schedule": {
"lectures": "Tu-Thu 9:30AM",
"discussions": "Mon 4:00PM"
}
}
Syntax
Json data can either be an Object or a List of Objects. Curly braces (
{}
) hold objects and Square brackets([]
) hold lists.In an object, data is represented as collection of key-value pairs. Each key is followed by colon(
:
) and then its valye. key-value pairs are separated by commas(,
). For example -"courseId":"ECE-141A"
is a key-value pair in the above example. It is followed by a comma(,
) and then the next pair is mentioned.key is always a string in double quotes (
""
). For example -courseId
,courseName
,studentList
,gpa
etc. are all keys in the above example.-
Json supports multiple data types for values. Value can be one of the followings -
- number - floating point numbers or Integers. For instance - value for the field
gpa
- string - a Unicode string in double quotes(
""
). For instance - value for fieldscourseId
,courseName
etc. - boolean - boolean flag (
true/false
). For instance - value for theisGradStudent
field in the student object. - json object - Value can also be another json object. For instance - value for the
schedule
key. This gives json a tree like structure and makes the storage of hierarchial data easy. - array - The value can also be an array. Square brackets(
[]
) are used to hold arrays. Values within an array are separated by commas. The value can be any valid value type(number, string, boolean, json object or even an array). For instance - the value for the keystudentList
is an array of json objects. - null - The value for a particular key can also be
null
. This is equivalent to key being not present.
- number - floating point numbers or Integers. For instance - value for the field
The flexibility in the value types allows us to store complex data structures. For example - You can have an object which contains a list of objects where each object has a list of lists.
Notice that there are no double quotes
""
for primitive values likenumber
,boolean
andnull
.true
and"true"
are different values.
Resources
There are a lot of good resources on the web if you want to read further. Sharing a few here -
- https://www.tutorialspoint.com/json/index.htm
- https://json-schema.org/understanding-json-schema/
Constructing a model from a JSON file
Using the provided JSONParser
class
Good news -- we've already built a JSONParser
class that can read the data in the given JSON file. As the JSONParser
reads your file, it will notify an object that implements the JSONListener
interface. Your JSONListener
will follow the intructions provided by the JSONParser
, and build a corresponding graph of objects in memory (using STL containers and custom classes you build) to fully represent the graph described in the JSON file.
The JSONListener
interface is shown below:
virtual bool addKeyValuePair(const std::string &aKey, const std::string &aValue)=0;
virtual bool addItem(const std::string &aValue)=0;
virtual bool openContainer(const std::string &aKey, Element aType)=0;
virtual bool closeContainer(const std::string &aKey, Element aType)=0;
Building Your In-Memory Object Graph (model)
The JSONParser
can tell you what it finds in a given JSON file, but it doesn't know how to build an object graph. You have to write code to do that yourself. Here's what you'll do:
- Your
Model
class is a subclsss of theJSONListener
interface (so you'll implement those methods) - As you get instructions from the
JSONParser
, yourJSONListener
will build an in-memory collection of items according to the instructions you receive. Build this hierarchy any way you like -- and use any of the STL classes.
Processing Queries
Once you have constructed an in-memory representation of the JSON file, it's time to answer queries about it. To do so, you'll need to construct a command-processor and accept and answer simple questions. We already created the first part of this:
- Define
CommandProcessor
interface (already done for you inJSONParser
. - Connect the
CommandProcessor
to a filestream (to read commands from a file). Each line of the command file contains one query - As the processor reads each line of the "commands" file, its
run()
method is called. - The
CommandPocessor
will interact with yourModel
object to answer the associated question and provide a response. You'll print your answer to the given output stream.
NOTE: The commands given on each line of your "commands" file may need to be reordered when you convert them to statements suitable for a query.
Your query "language"
These indicate your target location, or location relative to another thing
within(name)
The within(name)
statement indicates that results for your named element must occur within a named element (container). The within()
statement is often used in combination with other statements.
before(name)
The before(name)
statement specifies that results for your named element must occur before a named element.
after(name)
The after(name)
statement specifies that results for your named element my occur after a named element.
These indicate the target you're looking for...
all(name)
The all
statement is used when the user wants to return every element that matches the given name.
first(name)
The first()
statement limits the number of results to the first element that matches the given query inside a container. Imagine you have a complex Model
that contains N elements with the label of "foo". Using the .first()
statement, the result set would only include the first "foo" element found during iteration.
last(name)
The last()
statement limits the number of results to the last single element that matches the given query. Imagine you have a complex Model
that contains N elements with the label of "foo". Using the .last()
statement, the result set would only include the last "foo" element found during iteration.
nth(name, pos)
The nth()
statement limits the number of results to the nth single element that matches the given query. Imagine you have a complex Model
that contains N elements with the label of "foo". Using the .nth(2)
statement, the result set would only include the nth "foo" element found during iteration.
We represent the position of elements using a cardinal numbering scheme -- where the first element pos = 1.
These statements gather the results and return them to the user...
count(name)
The count()
function specifies that the result should be an integer quantity that represents the number of elements in the result set.
sum(name)
The sum()
function specifies that the result should be an integer quantity that represents the computed "sum" of element (values) in the result set that match the given fieldname.
avg(name)
The avg()
function specifies that the result should be an integer quantity that represents the computed "average" of element (values) in the result set that match the given fieldname.
get()
The get()
function is by far the most common statement. When used, it returns the collection of elements from your Model
that the given query specification. The result may contain one or more elements.
get('key')
The get(key)
function attempts to retrieve the value for a key, as specified by the given query.
Query Examples
Assuming you have a JSON file like this:
{
"root" : {
"part-a" : [
"sub1", "sub2"
],
"part-b" : [
"sub1",
{
"part-c" : {
"key1" : "value1",
"key2" : "value2"
}
}
],
"part-d" : [
"sub4"
]
}
}
And it turns into an in-memory model like this:
[图片上传失败...(image-7de4a4-1582791266402)]
The queries are applied to a specific level you are at(Starting from root). For nested queries, we can chain the keys using dot .
. We can also use star *
operator in target queries(first, before, last etc.) to return node irrespective of the name. We can assume that the queries would be executed in left-to-right order. We'll test your code only on queries which are valid or makes sense.
NOTE: Query logic always follow these rules:
- If no location is given, presume the query implies the top (root) of your model
- If the query asks for "un-named" elements (like all()) -- only search in the given container (don't search within child-nodes)
- If the query asks for a specific key (e.g. "chandhini") -- you must search in all child nodes inside the named locationl
- The asterisk (
*
) means "any" or "everything"
Here are a few sample queries:
-
first("root")
will return a collection with a singleNode
- it's key asroot
. -
within('root.sub3').get('key1')
-- get a single value for the specified key -
within("root.part-b").first("*")
. will return a node which just containssub1
. Notice how we used the.
operator to chain keys for sub-tree lookups. -
after("root").all("*")
will return an empty result since there's no node after root in the model. Notice that all the data is within root. -
.within("root").all("*")
will return a list of nodes corresponding topart-a
,part-b
,part-d
. Notice how the result is different from the first query. Since you didn't name a specific key, we only search in the given container (root
). -
.within("root").after("part-a").first("*")
will return a node corresponding topart-b
. -
.within("root").before("part-d").first("*")
will return a node corresponding topart-b
. Notice that the behavior offirst
query would differ if the prior query wasafter
(return the next node) orbefore
(return the node jus before). -
.within("root").first("part-a").all("*").count()'
will return count as2
. It first goes inside the root. then it looks for firstpart-a
beforepart-d
. This returns a result with a single node with key aspart-a
. Then you do.all("*")"
to get the child elements of the node. Now the result contains two nodes (["sub1", "sub2"]
). Finally the count of this result is returned which is2
.
Similarly, you can think of other such valid queries. Please ask your questions on SlideSpace if there's any confusion about what a particular query should return so that others can also benefit from it. It's kind of fun to see how creative you can be -- or -- try to break the system!
Returning a Query Response from your Model
Your query response will take one of three forms, depending on the query:
- A count (
integer
) - A compute result (
avg
,sum
) after analyzing a result set - A
ModelCollection
of elements as a set of results from theget()
request
If your query requires that you return multiple
ModelNodes
, then you do that by returning aModelCollection
. This object should be able to hold as manyModelNodes
from yourModel
as are necessary. Note, however, that aModelCollection
doesn't own theModelNodes
and should not delete them. They will ultimately be deleted when yourModel
gets destructed.
JSONParser Code Analysis (EXTRA CREDIT)
Write a brief analysis of the code provided in the JSONParser
class. Review for clarity of design, simplicity, correctness, coding style. Write your answer in a textfile called "code-review.txt".
Testing
We've provided a testing framework for you, much like before. To run a test, you need to give the Testing
class two files (examples provided). First -- you need a .json file that contains the data you to load into your Model
. Then you need a "commands" file. The commands file has a series of line, each of which is a text-representation of a query. Your testing class reads each line, and calls the CommandProcessor
class. The CommandProcessor
class is responsible for converting the text version of the command into a series of "query statements" that your Model
provides.
While we've given you a simple test file (test.json) and associated commands file (test.txt). After you get these to work, try making your own files:
- Using the
JSON
example above, make a commands file to try out different queries. - Try making your own JSON file (or get one from the web) -- then try making sample queries to test.
Grading
- 60% - Code that builds a correct in-memory model
- 10% - Code that converts command file (lines) to correct query statements
- 10% - Code that correctly executes every statement type
- 10% - Code that returns the correct results
- 10% - Code with no functions greater than 50 lines of code
- 10% - EXTRA CREDIT: Reasonable answer to
JSONParser
Code Review question (above)
Submitting your work
Due Saturday Feb 29 at 11:15pm
As usual, submit your code to github, and also to gradescope. Don't forget to fill out the student.json file too.
Good luck!