# Application: Mapped Filters*

In data processing, we are often provided with a huge set of raw data, some which may not be useful for us if viewed individually. In most cases, we would want to convert these data into linguistic expression, for instance identifying some special events. Take stock price as an example. Are you able to interpret how is a company performing just by viewing its daily stock price? Wouldn't it be more useful if we are able to identify the period when the company is doing well or badly and what is the maximum or minimum price during that period of time? So, how could we manipulate the data to produce information which are meaningful for the users and extract events from the data stream?

In the process, filtering, sorting, summarizing and analysing would be involved. From the four built-in words which are higher-order functions (MAP,

FILTER,

REDUCEand

ZIPWITH),

MAPand

FILTERare the most suitable functions for these tasks as

MAPcan process the original input and map them into information that we want while

FILTERwill be able to filter off redundant information. Unfortunately, not all processes can be done using

MAPand

FILTERindividually. The reason is the former is a one-to-one function and will still produce useless data. As for the latter, it can only return the original data without being processed. Hence, could we create a new word

MAPFwhich can do both

MAPand

FILTER? The answer is of course yes, by using higher order function.

: mapf ( seq xt -- seq ) ~ { f } [: f >R ;] map ['] R> filter ;

MAPFtakes a sequence and an XT as arguments. The XT is converted to a local function f, where the function takes a value as an argument and returns two items, a new value and a flag. The returned value is flagged with

TRUEif we want to print out this value later and

FALSEif otherwise.

MAPwill take the sequence and execute f on the items in the sequence. The flag will then be pushed into the return stack. The flag is then used by

FILTERto filter off useless data. Now, we are able to do tasks such as finding the maximum value or average over a fixed window length, which cannot be done using

MAPand

FILTERseperately. In the following examples, we will make use of

MAPFto process information.

## Examples

**Example 1: Finding average**

Given a sequence s = (s_{0}, s_{1}, s_{2}, ...),
write a program to produce a sequence which gives the average value of every four steps in the original sequence.

__Step 1: Use references to store variales__Use the word

refto create a reference. Using reference, we are able to store a counter and the current sum at each step. To use

ref,

: use-ref 0 ref { count } \ set count's initial value to 0 count @ . \ get count => display 0 1 count ! \ set count to 1 count @ \ get count => display 1 ;Basically,

refcreates a closure each time it is called and put the top item on the stack into the closure's data section. In this example, when

use-refis run,

refwill create a closure and insert

0into the closure's data section. Then, we bind the closure, which is an XT, to

countso we can access the XT and its data section. You will learn how to implement

refwhen you come to macros.

__Step 2: Calculate the average every four steps__We first bind a reference to

COUNTand another to

SUM. At each step, we will add one to the counter and add the current value to the sum. So, why use references instead of declaring variables outside of

AVGfor

COUNTand

SUM? This is because by declaring variables, we are not able to reset

COUNTand

SUMback to zero everytime we are to implement this code on a new sequence. Continuing with

AVG, if the counter is at four, we will then get the average and flag it with

TRUE. Remember to reset the counter and sum at this point of time. Else, we will just return zero with

FALSEbecause we do not want to print out this value later. Also take note that

AVGonly outputs an XT.

: avg ( v -- xt ) 0 ref { count } 0 ref { sum } [: sum +! 1 count +! count @ dup 4 = if sum @ swap /. true 0 count ! 0 sum ! else 0 false then ;] ;

__Implementation__

1 2 ... avg mapf 5 take .list 2.5 6.5 10.5 14.5 18.5

**Example 2: Runs**

In stock market, it is important to detect a financial bubble. So, using the data on stock price, we would like to obtain runs in order to check for prolonged uptrend or downtrend.

__Step 1: Define a word which detects the sign of a value__

We first define a word which return +1 if the value is positive and -1 otherwise.

: positive-negative ( n -- n ) 0 > if 1 else -1 then ;

__Step 2: Find the changes in the stock price__

CHANGEStakes a sequence of stock prices as arguments and returns another sequence which states that the change in stock price is positive or negative.

: changes ( seq -- seq ) dup tail swap ['] - zipwith ['] positive-negative map ;

__Step 3: Calculate the frequency of consecutive positives or negatives__Using

REF, we again bind one to a counter and one to the current state. If the current value is the same as previous, we add one to the counter and return zero. Otherwise, we return the current count and reset the counter to one. We only flag those numbers with

TRUEif they are not zero.

: freq ( n -- xt ) 0 ref { count } 0 ref { state } [: dup state @ = swap state ! if state @ count +! 0 else count @ state @ count ! then dup 0 = if false else true then ;] ;

__Implementation__

nil 95 cons 105 cons 120 cons 110 cons 100 cons 130 cons 140 cons 150 cons 130 cons 120 cons dup .list 120 130 150 140 130 100 110 120 105 95 ok changes dup .list 1 1 -1 -1 -1 1 1 -1 -1 ok freq mapf .list 2 -3 2 ok

## Quiz

### Question 1

Instead of producing a sequence which gives the average value of every four steps as shown in Example 1, write a program which gives the maximum value of every 10 steps of the original sequence.: avg ( v -- xt ) 0 ref { count } 0 ref { maxx } [: maxx @ max maxx ! 1 count +! count @ dup 10 = if maxx @ true 0 count ! 0 maxx ! else 0 false then ;] ;

### Question 2

Suppose that you are an owner of a start-up company which produces tables and chairs. The profit of selling a table and chair are $100 and $80 respectively. It takes 2 hours and $10 to produce a table while it takes only 1 hour and $7 for a chair. Given that you currently have 100 hours and $500 to produce the first batch of tables and chairs, list out all the possible profits that you can attain without violating the constraints. Assume that you are given a sequence of tuples, in which each tuple shows a possible combination of numbers of tables and chairs.: f ( tuple -- n b ) dup 0 @@ { tables } 1 @@ { chairs } 100 tables * 80 chairs * + tables 2 * chairs 1 * + 100 <= tables 10 * chairs 7 * + 500 <= and ;