On this page, you will answer questions about data that interests you.
Notice that these block names include the word "table" or "record" before the second input. These expected input data types can help you avoid bugs caused by using an input that does not match what the selector expects to receive.
A table is represented in Snap! as a list of lists. If you right-click (or control-click on a Mac) a table, you can switch to "list view" and see how the data (and column headings) are stored. See examples of table view and list view.
You can see the column number by holding your mouse pointer over the letter at the top of the column in table view.
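To make the list-of-lists idea concrete, here is a minimal Python sketch of the same structure. The table contents are invented for illustration, and the helper names `headings`, `data`, and `column` only mirror the block names used in this lab; their exact behavior in the Snap! project is an assumption.

```python
# Hypothetical miniature table: the first inner list holds the column headings,
# and every later inner list is one record (one row of data).
cars_table = [
    ["Make", "Year", "City MPG"],
    ["Subaru", 2010, 19],
    ["Honda", 2011, 26],
]

def headings(table):
    """Report the first row, which holds the column headings."""
    return table[0]

def data(table):
    """Report every row except the headings."""
    return table[1:]

def column(table, n):
    """Report column n of the data rows, counting from 1 as Snap! does."""
    return [record[n - 1] for record in data(table)]

print(headings(cars_table))
print(column(cars_table, 3))
```

Because Python lists count from 0 while Snap! lists count from 1, the sketch subtracts 1 from the column number.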
You may need to use map, keep, or combine to answer your question. You learned about these higher order functions on the following pages:
map on Unit 3 Lab 2 Page 5: Transforming Every List Item.
keep on Unit 2 Lab 3 Page 5: Keeping Items from a List.
combine on Unit 2 Lab 4 Page 3: Other Mathematical Reporters.
Example questions to ask about a single column can be answered with blocks such as the average block and the minimum block.
Notice that all of these examples only require data from one column. If you want to ask a question that requires looking at another column (for example, "What's the model of the car with the highest MPG?"), you can do the Take It Further Activity below.
Filtering is a powerful technique for finding information and recognizing patterns in data, and you can filter a data set with keep. For example, filtering can help you answer questions like "What is the average city MPG for just the Subarus in this dataset?"
To answer that question, first keep all the records from cars for which the 14th field is "Subaru." Then, take column 9 of those records (the "City MPG") and find their average.
Notice that there are many digits in the answer above. How many digits are given in the table for each car's MPG? An important rule in data science is not to claim more precision in a result than is warranted by the given data, so this answer should be rounded to 19.
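The filter-then-average steps and the rounding rule can be sketched in Python as follows. The records are made up and have only three fields, so the make is field 3 and the city MPG is field 2 here (not fields 14 and 9 as in the real cars data); `field` and `average` are stand-ins for the blocks of the same name.

```python
# Hypothetical records: [model, city MPG, make]. Not the real cars dataset.
records = [
    ["Impreza", 20, "Subaru"],
    ["Outback", 19, "Subaru"],
    ["Civic", 26, "Honda"],
    ["Forester", 19, "Subaru"],
]

def field(record, n):
    """Report field n of a record, counting from 1."""
    return record[n - 1]

def average(numbers):
    """Report the arithmetic mean of a non-empty list of numbers."""
    return sum(numbers) / len(numbers)

# Step 1: keep only the records whose make is "Subaru".
subarus = [r for r in records if field(r, 3) == "Subaru"]

# Step 2: take the city MPG of those records and average them.
raw_average = average([field(r, 2) for r in subarus])

# Step 3: the MPG values are whole numbers, so report a whole number too.
rounded_average = round(raw_average)

print(raw_average)      # many digits
print(rounded_average)  # 19
```

The unrounded result has far more digits than any MPG value in the data, which is why the last step rounds it.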
In addition to the headings, data, record, field, and column blocks described in problem 1 above, your project also includes four mathematical blocks (among them average and minimum) that students import from their work in Unit 2 Lab 4 Page 2: Making a Mathematical Library. Test each of them with a simple list to make sure they behave as you expect.
Notice that the column you use to filter the data (such as year) doesn't have to be the column you are asking about (such as transmission).
Sometimes, you want to keep a subset of your data (such as "Which cars were made in 2010?"). Other times, you just want one item that matches your requirement, often because what you really want to know is whether any items match; as soon as you find one, the answer is "yes" (such as "Were any electric cars made in 2010?"). Snap! has a higher order function that works similarly to keep, but it reports only the first item that's found, so it can be faster.
Find first is a higher order function like keep, map, and combine, because it takes a function (a predicate) as input. (Find first is like item (1) of (keep).)
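The difference between the two can be sketched in Python. The functions `keep` and `find_first` below are hypothetical stand-ins for the Snap! blocks, the car records are invented, and reporting `None` when nothing matches is an assumption (Snap! may report something else).

```python
def keep(predicate, items):
    """Report every item for which the predicate is true."""
    return [item for item in items if predicate(item)]

def find_first(predicate, items):
    """Report only the first matching item, stopping the search right there."""
    for item in items:
        if predicate(item):
            return item
    return None  # assumption: the value reported when nothing matches

# Hypothetical records: [model, year]
cars = [["Fit", 2009], ["Leaf", 2010], ["Volt", 2011], ["Outback", 2010]]

def made_in_2010(car):
    return car[1] == 2010

print(keep(made_in_2010, cars))        # all matching records
print(find_first(made_in_2010, cars))  # just the first one
```

Here `find_first` stops after checking two records, while `keep` has to check all four; both agree on which record comes first.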
You can access or change data to create new information by using:
Map to transform every element of a data set (such as doubling every element in a list, or extracting the parent’s email from every student record)
Keep or find first to filter a data set (such as keeping only the positive numbers from a list, or keeping only students who signed up for band from a database of all students)
Combine to combine or compare data in some way (such as adding up a list of numbers, or finding the student who has the highest GPA)
Bar chart (which you will learn on the next page)
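For readers who know Python, the first three tools correspond roughly to the built-in `map` and `filter` and to `functools.reduce`. The sketch below works through the examples from the list; the student records are invented for illustration.

```python
from functools import reduce

numbers = [3, -1, 4, -5]

# Map: transform every element (double each number).
doubled = list(map(lambda x: 2 * x, numbers))

# Keep: filter the data set (keep only the positive numbers).
positives = list(filter(lambda x: x > 0, numbers))

# Combine: fold the whole list into one value (add up the numbers).
total = reduce(lambda a, b: a + b, numbers)

# Combine can also compare: find the student with the highest GPA.
# Hypothetical student records.
students = [
    {"name": "Ana", "gpa": 3.6},
    {"name": "Ben", "gpa": 3.9},
    {"name": "Cai", "gpa": 3.2},
]
top_student = reduce(lambda a, b: a if a["gpa"] >= b["gpa"] else b, students)

print(doubled, positives, total, top_student["name"])
```

In each case the tool takes a function as input, which is what makes it a higher order function.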
Classifying data is extracting groups of data with a common characteristic.
The by intervals of input to group should be left empty when, as in this example, the field on which you're grouping is text rather than numbers. Later on this page you'll see how to use intervals in graphing.
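Grouping on a text field can be sketched in Python as collecting records under each distinct value of that field. This is only an illustration of the idea: `group_by_field` is a hypothetical helper, the records are invented, and the exact shape of what the Snap! group block reports is not assumed here.

```python
def group_by_field(records, n):
    """Collect records into groups that share the same value in field n (1-based)."""
    groups = {}
    for record in records:
        key = record[n - 1]
        groups.setdefault(key, []).append(record)
    return groups

# Hypothetical records: [model, make]
cars = [
    ["Impreza", "Subaru"],
    ["Civic", "Honda"],
    ["Outback", "Subaru"],
]

by_make = group_by_field(cars, 2)
for make, members in by_make.items():
    print(make, len(members))
```

No interval is needed because each distinct text value simply becomes its own group.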
(… keep functions.)
The bar chart function works like the group function, but with special features for numeric data: it allows you to select upper and lower limits of the data; you can have a range of values in one bucket, such as values 6–10, values 11–15, and so on; and it sorts the groups. For example, here is the cars data grouped by city MPG (column 9):
The number in column A is the largest value included in each group. If the values aren't all integers, the next group includes anything larger. For example, the group numbered 15 includes values from 10.0001 (or anything more than 10) to exactly 15.
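The bucketing rule described above (each group is labeled by the largest value it includes) can be sketched in Python by rounding each value up to the next multiple of the interval. The interval of 5 is inferred from the example groups, the sample values are invented, and `bucket_label` and `bar_chart_counts` are hypothetical names, not the Snap! implementation.

```python
import math
from collections import Counter

def bucket_label(value, interval):
    """Report the upper bound of the bucket that contains value."""
    return math.ceil(value / interval) * interval

def bar_chart_counts(values, interval):
    """Count how many values fall in each bucket, sorted by bucket label."""
    counts = Counter(bucket_label(v, interval) for v in values)
    return sorted(counts.items())

# Hypothetical MPG values
values = [10, 10.0001, 12, 15, 15.5, 19]

print(bucket_label(10, 5))       # exactly 10 stays in the group labeled 10
print(bucket_label(10.0001, 5))  # anything more than 10 moves to the group labeled 15
print(bar_chart_counts(values, 5))
```

With an interval of 5, the group labeled 15 covers everything greater than 10 up to and including exactly 15.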
You can plot the data from bar chart to visualize them:
The mode of a data set is the value that appears most often in it.
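As a small illustration of that definition, the mode can be found in Python by counting how often each value appears. The `mode` function and the MPG values below are made up for the example.

```python
from collections import Counter

def mode(values):
    """Report the value that appears most often in a non-empty list."""
    # most_common(1) reports a one-item list holding a (value, count) pair.
    return Counter(values).most_common(1)[0][0]

# Hypothetical city MPG values
city_mpgs = [19, 22, 19, 26, 22, 19]

print(mode(city_mpgs))  # 19 appears three times, more than any other value
```

If several values tie for most frequent, this sketch reports just one of them.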