Application: Mapped Filters*

In data processing, we are often provided with a huge set of raw data, some which may not be useful for us if viewed individually. In most cases, we would want to convert these data into linguistic expression, for instance identifying some special events. Take stock price as an example. Are you able to interpret how is a company performing just by viewing its daily stock price? Wouldn't it be more useful if we are able to identify the period when the company is doing well or badly and what is the maximum or minimum price during that period of time? So, how could we manipulate the data to produce information which are meaningful for the users and extract events from the data stream?

 In the process, filtering, sorting, summarizing and analysing would be involved.
    	    From the four built-in words which are higher-order functions (
MAP
,
FILTER
,
REDUCE
and
ZIPWITH
),
MAP
and
FILTER
are the most suitable functions for these tasks as
MAP
can process the original input and map them into information that we want while
FILTER
will be able to filter off redundant information. Unfortunately, not all processes can be done using
MAP
and
FILTER
individually. The reason is the former is a one-to-one function and will still produce useless data. As for the latter, it can only return the original data without being processed. Hence, could we create a new word
MAPF
which can do both
MAP
and
FILTER
? The answer is of course yes, by using higher order function.
	: mapf ( seq xt -- seq ) ~ { f }
		[: f >R ;] map
		['] R> filter
	; 
	

MAPF
takes a sequence and an XT as arguments. The XT is converted to a local function f, where the function takes a value as an argument and returns two items, a new value and a flag. The returned value is flagged with
TRUE
if we want to print out this value later and
FALSE
if otherwise.
MAP
will take the sequence and execute f on the items in the sequence. The flag will then be pushed into the return stack. The flag is then used by
FILTER
to filter off useless data. Now, we are able to do tasks such as finding the maximum value or average over a fixed window length, which cannot be done using
MAP
and
FILTER
seperately.

In the following examples, we will make use of 
MAPF
to process information.

Examples

Example 1: Finding average

Given a sequence s = (s0, s1, s2, ...), write a program to produce a sequence which gives the average value of every four steps in the original sequence.

Step 1: Use references to store variales

Use the word
ref
to create a reference. Using reference, we are able to store a counter and the current sum at each step.

To use
ref
,
	: use-ref 
		0 ref { count }  \ set count's initial value to 0
		count @ .  \ get count => display 0 
		1 count !  \ set count to 1
		count @    \ get count => display 1
	;
Basically,
ref
creates a closure each time it is called and put the top item on the stack into the closure's data section. In this example, when
use-ref
is run,
ref
will create a closure and insert
0
into the closure's data section. Then, we bind the closure, which is an XT, to
count
so we can access the XT and its data section.

You will learn how to implement
ref
when you come to macros.

Step 2: Calculate the average every four steps

We first bind a reference to 
COUNT
and another to
SUM
. At each step, we will add one to the counter and add the current value to the sum. So, why use references instead of declaring variables outside of
AVG
for
COUNT
and
SUM
? This is because by declaring variables, we are not able to reset
COUNT
and
SUM
back to zero everytime we are to implement this code on a new sequence.

Continuing with 
AVG
, if the counter is at four, we will then get the average and flag it with
TRUE
. Remember to reset the counter and sum at this point of time. Else, we will just return zero with
FALSE
because we do not want to print out this value later. Also take note that
AVG
only outputs an XT.

	: avg ( v -- xt ) 
		0 ref { count } 0 ref { sum }
		[:
			sum +! 1 count +! 
			count @ dup 4 = 
			if sum @ swap /. true 0 count ! 0 sum !
			else 0 false then
		;]
	;
	
Implementation
	1 2 ... avg mapf 5 take .list 
	2.5 6.5 10.5 14.5 18.5
	
Example 2: Runs

In stock market, it is important to detect a financial bubble. So, using the data on stock price, we would like to obtain runs in order to check for prolonged uptrend or downtrend.

Step 1: Define a word which detects the sign of a value

We first define a word which return +1 if the value is positive and -1 otherwise.

	: positive-negative ( n -- n ) 
		0 > if 1 else -1 then 
	; 
	
Step 2: Find the changes in the stock price

CHANGES
takes a sequence of stock prices as arguments and returns another sequence which states that the change in stock price is positive or negative.

	: changes ( seq -- seq ) 
		dup tail swap ['] - zipwith
		['] positive-negative map
	;
	
Step 3: Calculate the frequency of consecutive positives or negatives

Using 
REF
, we again bind one to a counter and one to the current state. If the current value is the same as previous, we add one to the counter and return zero. Otherwise, we return the current count and reset the counter to one. We only flag those numbers with
TRUE
if they are not zero.

	: freq ( n -- xt )
		0 ref { count } 0 ref { state }
		[: 
			dup state @ = 
			swap state !
			if state @ count +! 0
			else count @ state @ count ! then
			dup 0 = if false else true then
		;]
	;
	
Implementation
	nil 95 cons 105 cons 120 cons 110 cons 100 cons 130 cons 140 cons 150 cons 130 cons 120 cons dup .list
 	120 130 150 140 130 100 110 120 105 95 ok
	changes dup .list
	1 1 -1 -1 -1 1 1 -1 -1 ok
	freq mapf .list 
	2 -3 2 ok
	

Quiz

Question 1

Instead of producing a sequence which gives the average value of every four steps as shown in Example 1, write a program which gives the maximum value of every 10 steps of the original sequence.



Question 2

Suppose that you are an owner of a start-up company which produces tables and chairs. The profit of selling a table and chair are $100 and $80 respectively. It takes 2 hours and $10 to produce a table while it takes only 1 hour and $7 for a chair. Given that you currently have 100 hours and $500 to produce the first batch of tables and chairs, list out all the possible profits that you can attain without violating the constraints. Assume that you are given a sequence of tuples, in which each tuple shows a possible combination of numbers of tables and chairs.



Next: Decompilation