Wednesday, April 27, 2011

"Inside" Functors -- Multiple Arguments

(The examples here work with the version of insidefunctor tagged as "v1")

Say we want to support something like

> each(x) + each(y)

If we're going to call a function on multiple arguments, each one of which might specify some new behavior, we have to resolve the conflict somehow. For a start, let's say we give each argument a "level" and call one of them the "winner".

> apply.check.functor = function(func, args) {

> if (length(args) == 0) {

> return(func())

> }

> functor.levels = lapply(args, function(x) {

> if (is.inside.functor(x)) {

> level(x)

> }

> else {

> 0

> }

> })

> winner.i = which.max(functor.levels)

> winner.arg = args[[winner.i]]

> if (!is.inside.functor(winner.arg)) {

> do.call(func, args)

> }

> else {

> apply.functor(winner.arg, func, args)

> }

> }

This means we must also modify fmap to pass on multiple arguments:

> fmap = function(func) {

> params = formals(args(func))

> new.func = function() {

> .args = as.list(environment())

> apply.check.functor(func, .args)

> }

> formals(new.func) = params

> new.func

> }

And now apply.functor.each is going to have to do the work of reconciling the possibly competing messages:

> apply.functor.each = function(inside, func, args, caller) {

> our.level = level(inside)

> args.boxed = args

> for (i in seq_along(args.boxed)) {

> arg = args.boxed[[i]]

> if (is.inside.functor(arg) && level(arg) >= our.level) {

> if (length(inside$items) != length(arg$items)) {

> stop("Axis mismatch: ", inside, " and ", arg)

> }

> }

> else {

> args.boxed[[i]] = insert.each(inside, arg)

> }

> }

> items = list()

> for (i in seq_along(inside$items)) {

> piece.args = lapply(args.boxed, function(arg) {

> arg$items[[i]]

> })

> res = apply.check.functor(func, piece.args)

> items[[i]] = res

> }

> each(items)

> }

This insert.each is new: it pulls an argument into the functor by broadcasting it along the axis being iterated over:

> insert.each = function(inside, obj) {

> each(lapply(inside$items, function(.) obj))

> }

We need to define that level method. For now just make all levels 1 until we think of a good reason to make them otherwise.

> level = function(...) {

> UseMethod("level")

> }

> level.each = function(...) {

> 1

> }

Then retrieve those functions from the last post:

> is.inside.functor = function(...) {

> UseMethod("is.inside.functor")

> }

> is.inside.functor.default = function(...) {

> F

> }

> is.inside.functor.each = function(inside) {

> T

> }

> apply.functor = function(...) {

> UseMethod("apply.functor")

> }

> each = function(arg) {

> inside = list(items = arg)

> class(inside) = "each"

> inside

> }

And see if this gives something reasonable:

> x = list(1, 2, 3)

> y = list(4, 5, 6)

> `%+%` = fmap(`+`)

> each(x) %+% each(y)

$items
$items[[1]]
[1] 5

$items[[2]]
[1] 7

$items[[3]]
[1] 9


attr(,"class")
[1] "each"

> each(x) %+% 1

$items
$items[[1]]
[1] 2

$items[[2]]
[1] 3

$items[[3]]
[1] 4


attr(,"class")
[1] "each"

Now we can almost run that code from the beginning, fmap still has a problem:

> `%:%` = fmap(`:`)

> print(`%:%`)


function () { .args = as.list(environment()) apply.check.functor(func, .args) } <environment: 0x2fdd930>

The problem is `:` does not have any formal parameters. seq will fail too because it's parameters are '...'. These can be solved, but for now define new functions

> seq. = fmap(function(a, b) {

> seq(a, b)

> })

> sum. = fmap(function(x) {

> sum(x)

> })

> sum.(seq.(1, each(x)))

$items
$items[[1]]
[1] 1

$items[[2]]
[1] 3

$items[[3]]
[1] 6


attr(,"class")
[1] "each"

This opens up a real opportunity. Languages like R and Matlab already support something very similar to each(): for numeric vectors, x + y means add up the corresponding elements. And "corresponding" means having the same sequential position.

But just because two vectors have the same length does not mean they correspond. And normally R will not check that for you. But using inside functors we can check.

Since the above functions are still rather incomplete and this is getting to be a lot of code sitting around in one place, for what follows I am going to use the package insidefunctor from https://github.com/ellbur/r-inside-functor. So let's load the package:

> rm(list = ls())

> library(insidefunctor)

In the insidefunctor package, each is slightly more generalized. Anything can be eached if it supports the methods

  • unpack(object)
  • pack(object, items)
  • make.axis(object)

The package already defines these functions for vectors and lists. Let's make a new kind of object that remembers the dimension it runs along.

> as.dimension = function(items) {

> dimension = list(items = items, id = next.dimension.id())

> class(dimension) = "dimension"

> dimension

> }

> unpack.dimension = function(dimension) {

> dimension$items

> }

> pack.dimension = function(dimension, items) {

> dimension$items = items

> dimension

> }

> make.axis.dimension = function(dimension) {

> seq = seq_along(dimension$items)

> attr(seq, "id") = dimension$id

> seq

> }

> dimension.id.counter = 0

> next.dimension.id = function() {

> dimension.id.counter <<- dimension.id.counter + 1

> dimension.id.counter

> }

Setting the 'id' attribute of the returned axis ensures that each will not let you line up two dimensions whose ids differ.

Let's check that code:

> x = as.dimension(c(1, 2, 3))

> y = as.dimension(c(4, 5, 6))

> `%+.%` = fmap(`+`)

> sq. = fmap(function(z) z^2)

> try(collect(each(x) %+.% each(x)), silent = T)

$items
$items[[1]]
[1] 2

$items[[2]]
[1] 4

$items[[3]]
[1] 6


$id
[1] 1

attr(,"class")
[1] "dimension"

> try(collect(each(x) %+.% sq.(each(x))), silent = T)

$items
$items[[1]]
[1] 2

$items[[2]]
[1] 6

$items[[3]]
[1] 12


$id
[1] 1

attr(,"class")
[1] "dimension"

> try(collect(each(x) %+.% each(y)), silent = T)

> geterrmessage()

Error in apply.functor.each(winner.arg, func, args, apply.check.functor) : Axis mismatch: 11list(items = c(1, 2, 3), id = 1)c(1, 2, 3)1:3list() and 11list(items = c(4, 5, 6), id = 2)c(4, 5, 6)1:3list()

Excellent. We can add x to itself or something calculated from itself, but we can't add x to y because we haven't told each that those variables lie along the same axis -- maybe they don't.

If we want them to correspond, we can say so explicitly.

> align = function(dim1, dim2) {

> if (length(dim1$items) != length(dim2$items)) {

> stop("Cannot align; lengths differ")

> }

> dim1$id = dim2$id

> dim1

> }

> y = align(y, x)

> try(collect(each(x) %+.% each(y)))

$items
$items[[1]]
[1] 5

$items[[2]]
[1] 7

$items[[3]]
[1] 9


$id
[1] 1

attr(,"class")
[1] "dimension"

Now at least it can't happen by accident.

No comments:

Post a Comment