Lab 7h Fun With Higher-Order Functions

Purpose The purpose of this lab is to induce headaches with λ, apply some of our newly-found skills to a practical problem, and cut down on unneeded computation.

When working with λ and higher order operations, signatures are very important. They are there to keep you sane and give you a hint as to how to use them. Heed their wisdom, as to go forward you must go back, and to touch the light you must pass beneath the shadow.

Functions on Functions

Exercise 1 Design the function two that takes a function f : [X -> X] as input and returns another function that applies f twice in a row. That is, two returns a function which first applies f to its input, then applies f again to the output of the first application (all within one function call).

Exercise 2 Design the function three, similarly, that applies f three times in a row.

Exercise 3 Design the functions one and zero. Writing one is easy, but what about zero? What does it mean to apply a function to its argument zero times?

The functions zero, one, two, and three look curiously similar to numbers: all they do is repeat their input some number of times. Let’s call functions like these "repeaters":

; A Repeater is a [X -> X] -> [X -> X]
; that, given a unary function f, outputs a function that
; will repeatedly apply f

Exercise 4 Design a function rep->nat which consumes a Repeater as input and produces the number of times it repeats its argument. So:
(check-expect (rep->nat zero) 0)
(check-expect (rep->nat one) 1)
(check-expect (rep->nat two) 2)
(check-expect (rep->nat three) 3)
(check-expect (rep->nat (λ (f) (λ (x) ((three f) ((two f) x))))) 5)
Hint: If you have a Repeater, all you can do is give it some inputs and see what it gives you back. So your task here is simply to devise some inputs that will force it to tell you which number it represents:
; rep->nat : Repeater -> Nat
; Discover how many times the given Repeater repeats its input
(define (rep->nat rep)
((rep ?) ?))

Exercise 5 Design the function rep-add1 that increments a Repeater by 1.

Exercise 6 Design the function nat->rep that converts a natural number n to the Repeater that repeats its input n times. Use the following data definition to process natural numbers recursively.
; A Nat (natural number) is one of:
; - 0
; - (add1 Nat)

Repeaters give us an alternative representation of natural numbers, and the cool part is that we can build them with λ. Can we also do arithmetic with them using nothing more than λ?

Exercise 7 Before moving on to the next exercises, update your definition of nat->rep to not use zero or rep-add1 to get some good practice of using just λ.

Exercise 8 Design the function rep+ that takes two Repeaters and returns the Repeater that represents their sum. Don’t use rep->nat, nat->rep, or built-in arithmetic.

Exercise 9 Design the function rep* that takes two Repeaters and returns the Repeater that represents their product. (Again, no rep->nat, nat->rep, or built-in arithmetic.)
Hint It’s shorter than rep+.

Exercise 10 Design the function rep-expt that takes two Repeaters and returns the Repeater that represents their exponentiation: (rep-expt n m) = nm. (Don’t use rep->nat, nat->rep, or built-in arithmetic!)
Hint It’s shorter than rep*.

Exercise 11 Can we use repeaters to compute functions we care about?
The Fibonacci sequence is defined as:
F(0) = 0
F(1) = 1
F(n) = F(n-1) + F(n-1)

; A FibPair is a (list Nat Nat),
; where the first number is the nth (n>0) number in the Fibonacci sequence,
; and the second number is the (n-1)th number in the Fibonacci sequence

; fib : Nat -> Nat
; Compute the nth fibonacci number
(define (fib n)
(second (((nat->rep n) ?) ?)))

Finish the above code. Use the given data definition to help you. }

Making Google

This is a highly simplified account of what search engines do, and is one of the simplest metrics by which search engines rank their results. If this lab has whetted your appetite for more sophisticated techniques, consider taking an upper-level class in Information Retrieval!

Or at least some subset of it. Today in lab we will implement what Google started as: a search engine. That may sound intimidating, especially after the headache that is the first half of the lab, so let’s start off with some simple helpers.

Exercise 12 Design the function count that will take a function p of type [X -> Boolean] and a [Listof X] lox and will output how many elements x in the list return true when p is applied to them. This function should not use length or filter, as that would require building up a list, when all we really want is the length of that list.

Exercise 13 Design the function unique that will take a [Listof X] and return the same list, except if an element in the list appears later in the list, remove its first apperance. Use foldr.

; A Corpus is a [Listof Document]
; A Document is a [Listof String] where every element is an English word, all lower-case

Let’s think what makes a document relevant when someone searches the word "lackadaisical". How often it appears in the document is a good indicator of how relevant it is. In a document about the etymology of the word, for example, the word would appear quite often. In a novel, it may appear once or twice, but someone interested in the word "lackadaisical" would almost definitely not be interested in finding that novel. Thus, term frequency is a good indicator of how relevant a document is to that term.

Exercise 14 Design the function term-frequency, that takes a Document and a String and returns what percentage of the words in the document are equal to the given string. Use count.

; A Query is a [Listof String] where every element is an English word, all lower-case

Now let’s think about what makes a document irrelevant when someone searches for the query (list "a" "storm" "of" "swords"). "storm" and "swords" are certainly unique words and won’t appear in just any document, but "a" and "of" aren’t and would. Therefore, how many documents a word appears in is inversely proportional to how relevant it is to a query.

Exercise 15 Design the function idf, which is short for inverse-document-frequency, that takes a String and a Corpus and outputs the total number of documents divided by the number of documents the given word appears in... plus 1, to avoid division by zero in case the word never appears at all.

Exercise 16 Design the function tf-idf, that takes a Query, a Document and the Corpus it belongs to and outputs the sum of the product of each search term’s term-frequency in that document and its inverse-document-frequency.

Exercise 17 Design the function search-engine, that given a Corpus, will output a function that takes a Query and outputs the documents sorted in descending order by the tf-idf applied that query, document, and corpus. Use sort.

Yay! You are Google!

Exercise 18 Challenge: sort uses a function that compares two elements to each other. It has been proven that all comparison-based sorting functions will compare at least one element x to other elements more than once.
In your search-engine function, documents were compared to each other based on their tf-idf value. Therefore, for at least some documents, this value was computed more than once. That’s bad!
The problem we’re going to solve is this: sometimes, we want to sort a list of values based on some (possibly slow) computation applied to each one. We definitely don’t want to do this computation on the same element more than once.
Design sort/compute-once, which takes a [List X] lox, an [X -> Y] f, and a [Y Y -> Boolean] y-comp, and sorts lox by y-comp after f is applied to its elements. f should only be computed once on each element, the function should still return a list identical to lox (but with the order changed), and it should use sort.

← prev up next →

	General Information
	Course Contract
	Texts
	Advice
	Office Hours, EMail
	Syllabus
	Labs
	Problem Sets
	The Style

	Lab 1h BSL in Dr Racket
	Lab 2h Structures
	Lab 3h The World
	Lab 4h Lists
	Lab 5h Abstraction and natural number recursion
	Lab 6h Loops and Lambda
	Lab 7h Fun With Higher-Order Functions
	Lab 8h Documents
	Lab 9h Generative Recursion and Accumulators
	Lab 10h Chat Client
	Lab 11h Type Inference