Challenges and solutions for creating functions to manipulate arrays in R when the number of dimensions is unknown.
Arrays in R
Below I am creating an array of dimensions \(3 \times 2 \times 4\) with each entry containing a unique value.
, , 1
[,1] [,2]
[1,] 1 4
[2,] 2 5
[3,] 3 6
, , 2
[,1] [,2]
[1,] 7 10
[2,] 8 11
[3,] 9 12
, , 3
[,1] [,2]
[1,] 13 16
[2,] 14 17
[3,] 15 18
, , 4
[,1] [,2]
[1,] 19 22
[2,] 20 23
[3,] 21 24
class(x)
[1] "array"
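The call that created x is not shown above. The printed values are consistent with the integers 1 to 24 filled in column-major order, so a minimal sketch that reproduces the array, assuming that is how it was built, is:

```r
# Assumption: x holds 1..24 filled column-major, matching the printout above
x <- array(1:24, dim = c(3, 2, 4))

dim(x)       # extents of the three dimensions
x[2, 1, 2]   # second row, first column, second slice
```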
You can access the entry \((1, 1, 1)\), i.e. the cell at the first index of every dimension, in R by:
x[1, 1, 1]
[1] 1
If you want the entries \((i, 1, 1)\) where \(i = 1, 2, 3\) then you can leave the first subscript blank in R like below:
x[, 1, 1]
[1] 1 2 3
In the above code, the result is a vector, but if you want to keep the array structure as is then you can add drop = FALSE like below:
x[, 1, 1, drop = FALSE]
, , 1
[,1]
[1,] 1
[2,] 2
[3,] 3
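A quick way to see what drop = FALSE changes is to compare the dim attribute of the two results. This is a small sketch, assuming x was built as array(1:24, dim = c(3, 2, 4)):

```r
# Assumption: x is the 3 x 2 x 4 array of 1..24 used throughout
x <- array(1:24, dim = c(3, 2, 4))

dropped <- x[, 1, 1]                # dimensions of extent 1 are dropped
kept    <- x[, 1, 1, drop = FALSE]  # all three dimensions are retained

dim(dropped)  # NULL, because the result is a plain vector
dim(kept)     # 3 1 1
```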
If you want the entries \((1, j, k)\) where \(j = 1, 2\) and \(k = 1, 2, 3, 4\), then you can leave the last two subscripts in the square brackets blank like below:
x[1, , ]
[,1] [,2] [,3] [,4]
[1,] 1 7 13 19
[2,] 4 10 16 22
The above result isn’t actually a vector but a two-dimensional array, or more specifically it has the classes matrix and array.
class(x[1, , ])
[1] "matrix" "array"
I can modify elements in an array by using the assignment operator (<- or =) like below:
x3 <- x2 <- x
x2[1, , ] <- NA
x2
, , 1
[,1] [,2]
[1,] NA NA
[2,] 2 5
[3,] 3 6
, , 2
[,1] [,2]
[1,] NA NA
[2,] 8 11
[3,] 9 12
, , 3
[,1] [,2]
[1,] NA NA
[2,] 14 17
[3,] 15 18
, , 4
[,1] [,2]
[1,] NA NA
[2,] 20 23
[3,] 21 24
, , 1
[,1] [,2]
[1,] 1 2
[2,] 2 5
[3,] 3 6
, , 2
[,1] [,2]
[1,] 3 4
[2,] 8 11
[3,] 9 12
, , 3
[,1] [,2]
[1,] 5 6
[2,] 14 17
[3,] 15 18
, , 4
[,1] [,2]
[1,] 7 8
[2,] 20 23
[3,] 21 24
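The second printout above shows the first slice replaced by the values 1 to 8, but the call that produced it is not shown. A sketch that would give this result, assuming a vector was assigned to x3 and that x was built as array(1:24, dim = c(3, 2, 4)):

```r
# Assumption: x is the 3 x 2 x 4 array of 1..24 used throughout
x <- array(1:24, dim = c(3, 2, 4))
x3 <- x

# A vector on the right-hand side fills the selected cells in
# column-major order: (1,1,1), (1,2,1), (1,1,2), (1,2,2), ...
x3[1, , ] <- 1:8

x3[1, , ]  # the modified slice, as a 2 x 4 matrix
```

The remaining rows of x3 are untouched, which matches the second and third rows of each slice in the printout.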
Up to this point, it’s pretty straightforward. But let’s say we now create a function that returns the slice at the first index of the first dimension.
index_first <- function(x) {
  x[1, , ]
}
index_first(x)
[,1] [,2] [,3] [,4]
[1,] 1 7 13 19
[2,] 4 10 16 22
[,1] [,2]
[1,] 1 1
[2,] 1 1
[3,] 1 1
[4,] 1 1
The above function works fine for the arrays x and y. But what if the number of dimensions is different?
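To see the failure concretely, here is a minimal sketch. The definitions of y and z are not shown in the text, so both are assumptions: y is taken to be a three-dimensional array of ones (its first extent of 3 is a guess, since only the 4 x 2 result is printed above), and z a 2 x 2 x 2 x 2 array of NA, consistent with the output of modify_first(z, 3) further down.

```r
index_first <- function(x) {
  x[1, , ]
}

# Assumptions: neither definition appears in the text
y <- array(1, dim = c(3, 4, 2))      # hypothetical 3-d array of ones
z <- array(NA, dim = c(2, 2, 2, 2))  # 4-d array, inferred from later output

dim(index_first(y))  # works: three subscripts for three dimensions

# Three subscripts on a four-dimensional array is an error,
# so capture the message instead of stopping the script
msg <- tryCatch(index_first(z), error = function(e) conditionMessage(e))
msg
```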
So how do we change our function so it works for an array with any number of dimensions? This is where it gets quite challenging. And while I’m at it, let me throw in another challenge.
Suppose now I want a function that replaces the entries at the first index of the first dimension with a user-supplied value.
modify_first <- function(x, value) {
  x[1, , ] <- value
  # Return the modified array; the assignment alone would only return value invisibly
  x
}
Again this works fine until we have an array with a different number of dimensions.
modify_first(z, 1)
Error in x[1, , ] <- value: incorrect number of subscripts
So how would you modify the function so this can be generalised for arrays with a different number of dimensions?
In the first instance, it’s useful to know that the square brackets are in fact a function, so the two calls below are equivalent:
`[`(x, 1, , )
[,1] [,2] [,3] [,4]
[1,] 1 7 13 19
[2,] 4 10 16 22
x[1, , ]
[,1] [,2] [,3] [,4]
[1,] 1 7 13 19
[2,] 4 10 16 22
Assignment into an array can likewise be written as a function call, as below, where the last argument is the value that replaces the indexed entries.
`[<-`(x2, 1, , , 0)
, , 1
[,1] [,2]
[1,] 0 0
[2,] 2 5
[3,] 3 6
, , 2
[,1] [,2]
[1,] 0 0
[2,] 8 11
[3,] 9 12
, , 3
[,1] [,2]
[1,] 0 0
[2,] 14 17
[3,] 15 18
, , 4
[,1] [,2]
[1,] 0 0
[2,] 20 23
[3,] 21 24
Below is similar.
x2[1, , ] <- 0
I say similar because the above actually modifies x2 but the call before that didn’t. Below is the actual equivalent operation.
x2 <- `[<-`(x2, 1, , , 0)
Putting this together, modify_first can be generalised by building the argument list for `[<-` programmatically and passing it to do.call(), with one empty argument (generated here with bquote()) for each dimension after the first:
modify_first <- function(x, value) {
  d <- dim(x)
  # Arguments are: the array, index 1, one blank per remaining dimension, then the value
  do.call("[<-", c(list(x, 1), rep(list(bquote()), length(d) - 1), list(value)))
}
modify_first(x, 3)
, , 1
[,1] [,2]
[1,] 3 3
[2,] 2 5
[3,] 3 6
, , 2
[,1] [,2]
[1,] 3 3
[2,] 8 11
[3,] 9 12
, , 3
[,1] [,2]
[1,] 3 3
[2,] 14 17
[3,] 15 18
, , 4
[,1] [,2]
[1,] 3 3
[2,] 20 23
[3,] 21 24
modify_first(z, 3)
, , 1, 1
[,1] [,2]
[1,] 3 3
[2,] NA NA
, , 2, 1
[,1] [,2]
[1,] 3 3
[2,] NA NA
, , 1, 2
[,1] [,2]
[1,] 3 3
[2,] NA NA
, , 2, 2
[,1] [,2]
[1,] 3 3
[2,] NA NA
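The earlier index_first challenge can be handled with the same do.call() idea. The sketch below is one possible version, not necessarily the one used in edibble: the name index_first_any and its drop argument are hypothetical, and it uses TRUE subscripts (which select everything along a dimension) in place of the empty arguments produced by bquote().

```r
# Hypothetical helper: first slice along dimension 1 for any number of dimensions
index_first_any <- function(x, drop = TRUE) {
  d <- dim(x)
  stopifnot(!is.null(d))  # only meaningful for arrays and matrices
  # Subscripts are 1 for the first dimension and TRUE for every other one
  subs <- c(list(1), rep(list(TRUE), length(d) - 1))
  do.call("[", c(list(x), subs, list(drop = drop)))
}

# Assumptions: x and z as inferred from the printed output in this post
x <- array(1:24, dim = c(3, 2, 4))
z <- array(NA, dim = c(2, 2, 2, 2))

identical(index_first_any(x), x[1, , ])  # same result as the fixed-dimension call
dim(index_first_any(z))                  # 2 2 2
dim(index_first_any(x, drop = FALSE))    # 1 2 4
```

Using TRUE avoids having to construct an empty argument at all, at the cost of being slightly less literal than the blank-subscript form.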
So you might wonder when you would need such a result. I actually used this for the edibble R package to create a generalised version of a Latin square design, i.e. an array that stitches together multiple Latin squares.
set.seed(1)
edibble::latin_array(dim = c(3, 3, 3), nt = 3)
, , 1
[,1] [,2] [,3]
[1,] 1 3 2
[2,] 3 2 1
[3,] 2 1 3
, , 2
[,1] [,2] [,3]
[1,] 2 1 3
[2,] 1 3 2
[3,] 3 2 1
, , 3
[,1] [,2] [,3]
[1,] 3 2 1
[2,] 2 1 3
[3,] 1 3 2
Beyond the above, I’m not sure who needs to manipulate arrays with dynamic dimensions. If you have a use case, I’d love to know.
The above challenges are the sort of challenges that I hope to include in the Advanced R Programming unit that’s planned for Honours level in the Business Analytics major at Monash University. If you want to learn more about R as a programming language (instead of a data analysis tool) then I’d recommend the Advanced R book by Hadley Wickham.
Text and figures are licensed under Creative Commons Attribution CC BY 4.0. Source code is available at https://github.com/emitanaka/emitanaka.github.io, unless otherwise noted. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".
For attribution, please cite this work as
Tanaka (2022, Jan. 18). emi tanaka: Manipulating arrays with dynamic dimensions in R. Retrieved from https://emitanaka.org/posts/2022-01-18-manipulating-arrays-in-R/
BibTeX citation
@misc{tanaka2022manipulating,
  author = {Tanaka, Emi},
  title = {emi tanaka: Manipulating arrays with dynamic dimensions in R},
  url = {https://emitanaka.org/posts/2022-01-18-manipulating-arrays-in-R/},
  year = {2022}
}