Conversation
|
OK, I will convert to draft and you can mark it as ready for review when you feel you've supplied all the missing bits. |
|
Hi, this is ready for review. |
gwhitney
left a comment
There was a problem hiding this comment.
I can't comment on the utility/advisability of exposing these functions in the top-level user interface of mathjs,but I am assuming that's something you've already worked out with Jos. So I am explicitly not attempting to judge whether these functions should be added. Presuming they should be, here's my review of the PR to add them.
src/expression/embeddedDocs/function/matrix/broadcastMatrices.js
Outdated
Show resolved
Hide resolved
| syntax: [ | ||
| 'broadcastSizes(sizeA, sizeB)' | ||
| ], | ||
| description: 'Broadcast the sizes of matrices to a compatible size', |
There was a problem hiding this comment.
Similarly, I am concerned about at least the grammatical correctness here. It's not that any sizes are being broadcast here, per se, is it, but rather that the size resulting from a broadcast is being computed, right? So shouldn't the description be something more like "Compute the size that would result from broadcasting a list of matrices of the given sizes, if possible"? (Again, this function can throw an error if the sizes are incompatible, correct?)
This also observation also, for me, calls into question the name of the function. Would broadcastSize or sizeOfBroadcast be more descriptive, again since no sizes are being broadcast, per se?
There was a problem hiding this comment.
The terminology used by numpy is
numpy.broadcast_shapes(*args)
Broadcast the input shapes into a single shape.
I think I understand where are you coming from, because the original sizes are kept intact. But maybe it's an implicit definition, because when one adds numbers, nothing happens to the numbers, we could say is to compute the result from adding numbers.
Yes this function will throw an error for incompatible sizes.
There was a problem hiding this comment.
No, I meant that it's the matrices that are broadcast, not their sizes. This function does not actually do any broadcasting. It just computes a size, and so it should be named accordingly, I think. Your thoughts?
There was a problem hiding this comment.
Any further thoughts here? I am recommending:
- rename the function to either
sizeOfBroadcastorbroadcastSize, and - describe it as
Compute the size of a matrix that would result from broadcasting matrices of the given sizes, if they are compatible
because, as a matter of the plain meanings of the words, a matrix/array can be broadcast, but a size cannot be broadcast.
There was a problem hiding this comment.
Sorry, even though I had a response planned from the first comment I was debating myself on how to present the response.
Broadcasting, when applied to a matrix's shape, does not imply treating the shape itself as a data structure to be broadcasted. Much like the phrase 'running water' describes flow rather than the physical stride of a 'running person,' broadcasting a shape is a functional convention.
I don't know if the convention all comes from numpy, but it's a common trend stdlib: broadcast-shapes
I think the name of the function is ok as is, but it could be described as you suggest to avoid a circular definition.
There was a problem hiding this comment.
Sure, we should just strive for both the clearest name of the function and the clearest description of what it does. If you think that "broadcastSizes(s, t, u)" is clearer than "broadcastSize(s, t, u)", I am not going to quibble. But I do prefer a more explicit description, thanks.
| @@ -0,0 +1,15 @@ | |||
| export const broadcastToDocs = { | |||
| name: 'broadcastTo', | |||
There was a problem hiding this comment.
I worry about the name of this function. It seems to me that since sizes look like matrices, visually, broadcastTo([3], [2, 2]) could look like it is supposed to broadcast the first matrix to be compatible with the second, i.e. produce [3,3] rather than [3, 3; 3, 3]. I would strongly recommend considering renaming the function to broadcastToSize([3], [2, 2]) to avoid this ambiguity.
There was a problem hiding this comment.
I understand. Many of these are taken from numpy and have counterparts in jax / mlx / pytorch and maybe others.
numpy.broadcast_to(array, shape, subok=False)
Broadcast an array to a new shape.
I don't have a strong opinion on this, just please review if it makes sense to follow that convention.
There was a problem hiding this comment.
I am the one less familiar with the territory here. That's why this was couched as a suggestion. Please select the name you think is best, including leaving it be, unless @josdejong weighs in otherwise. Please just post your final decision here.
There was a problem hiding this comment.
Again, to land this PR we need a decision on the final name here. If you are on the fence, I recommend switching to broadcastToSize() in an effort to steer away from the ambiguity I raised. It seems to me a more fully specified name can't be harmful here. (And I don't think we need to worry too much about fidelity to numpy names, since after all to begin with what they call a "shape" we call a "size" so we're not adopting numpy terminology right from the start.)
There was a problem hiding this comment.
I agree about the differences between shape and size. I'm considering broadcastToSize(), will review and resolve this comment.
| export const createBroadcastMatrices = /* #__PURE__ */ factory(name, dependencies, ({ typed }) => { | ||
| /** | ||
| * Broadcast multiple matrices together. | ||
| * Return and array of matrices with the broadcasted sizes. |
There was a problem hiding this comment.
Typo: "and" -> "an"
This documentation is way too terse for someone who's not already familiar with the operation of broadcasting matrices (which is not necessarily all that common or standard) to understand what is going on. Somewhere in the documentation needs to be a careful documentation from the ground up with examples what it means to broadcast two or more matrices. That could be here, or it could be elsewhere (like in the general matrix documentation page) and then be linked to here. Such documentation might already exist, and then all you need is a link.
This documentation should say what happens with incompatible sizes.
Finally, you have "sizes" plural. But isn't it the case that there is only one common size produced by broadcasting a list of matrices?
There was a problem hiding this comment.
OK, the typo/grammar issues are gone, but so is any explanation, so far as I can see. Either the broadcasting operation needs to be described in detail here, or there needs to be a link to somewhere that it is described. The documentation cannot simply assume that the reader knows what it means to "broadcast matrices against each other." That's not a standard, well-known operation. Also, there needs to be a description of the compatibility requirements on the matrices and what happens if those requirements don't hold.
|
|
||
| export const createBroadcastSizes = /* #__PURE__ */ factory(name, dependencies, ({ typed }) => { | ||
| /** | ||
| * Calculate the broadcasted size of one or more matrices or arrays. |
There was a problem hiding this comment.
As per my comments on the internal docs, shouldn't this be something more like "Calculate the size that would result from broadcasting one or more matrices or arrays, given the sizes of the input collections."?
The same comments about having documentation on the operation of broadcasting either here or linked here apply to this function as well. Also mention of what happens with incompatible sizes.
There was a problem hiding this comment.
Still needs further editing/documentation.
test/unit-tests/utils/array.test.js
Outdated
| @@ -702,9 +702,9 @@ describe('util.array', function () { | |||
| }) | |||
|
|
|||
| it('should broadcast leave arrays as such when only one is supplied', function () { | |||
There was a problem hiding this comment.
I know you didn't create these problems, but there are typos/ungrammaticality in the labels of both this test and the following one. Please fix.
| /** | ||
| * Broadcast a matrix or array to a specified size. | ||
| * | ||
| * The input collection is conceptually expanded to match the given dimensions, |
There was a problem hiding this comment.
Maybe instead "entries of the input collection are duplicated to match the given size," ?
| * | ||
| * The input collection is conceptually expanded to match the given dimensions, | ||
| * following broadcasting rules. The returned object is a new matrix or array | ||
| * with the requested size; the original input is not modified. |
There was a problem hiding this comment.
Where do I find these "broadcasting rules"?
There was a problem hiding this comment.
Good question, I don't think broadcasting is described with specific rules, the chapter can be found at broadcasting.
The best source I've found is from numpy.
https://numpy.org/doc/stable/user/basics.broadcasting.html
there is one from octave
https://docs.octave.org/latest/Broadcasting.html#Broadcasting-1
I personally didn't know about this topic until a few years ago even after using Matlab/Octave extensively. The links I'm sharing is not an assumption of anyone's knowledge, just sharing them to try to answer the question.
I don't know what would be best, extend the chapter, have a better phrasing of "broadcasting rules" or something else.
There was a problem hiding this comment.
Good question, I don't think broadcasting is described with specific rules, the chapter can be found at broadcasting.
All good. Just make that phrase "broadcasting rules" a link to that spot in the on-line docs, and make any fixes/additions you deem valuable to that section on broadcasting (for example, at least the first example needs to be corrected, as [1,2] + 3 = [4,5], not [3,4] as shown). Then all will be well. Similaly, the doc sections in the broadcast functions themselves should link to that broadcasting link. Thanks!
Yes. I took this comment as an OK. Part of the argument is that these are exposed by numpy even if broadcasting is deeply integrated. Also during the implementation of broadcasting there were some discussions about the specific functions. I think this means it's ok, but if not please let me know. |
Yes, you convinced (an initially skeptical) Jos so all OK :) |
|
Glen, thanks for reviewing the work of David. I indeed think it's a good idea to add these functions. |
| category: 'Matrix', | ||
| syntax: [ | ||
| 'broadcastMatrices(A, B)' | ||
| ], |
There was a problem hiding this comment.
Should the syntax read 'broadcastMatrices(A, B, ...)' since any number of arguments are allowed?
| const result = M.create() | ||
| result._size = size.valueOf() | ||
| result._data = broadcastTo(M.valueOf(), size.valueOf()) | ||
| result._datatype = M.datatype() |
There was a problem hiding this comment.
We don't want to be breaking encapsulation of the internal format of matrices here. In particular, this implementation is supposed to work with an arbitrary Matrix implementation, based on its typed-function signature. Hence it can't delve into the internal fields, as they are assuming M is a DenseMatrix, which it might not be.
Therefore, this should be M.create(broadcastTo(M.valueOf(), size.valueOf()), M.datatype()). I understand your concern about unnecessarily recomputing the size. If you really want to get around that, there are some options:
- Decide that broadcasting always returns a DenseMatrix, and use DenseMatrix creation methods that take your word for the size, if there are any such methods. This plan might not be wise at a time when you/we are contemplating adding other general-purpose matrix implementations besides DenseMatrix, creating a world in which we would not want operations to capriciously convert back into DenseMatrix.
- Extend the interface of M.create() to optionally take a guaranteed size and/or other validation-short-circuiting options. That might require a number of coordinated changes in the Matrix classes.
Of course I am open to other ideas. But we don't want DenseMatrix-internals-specific code here. This observation also suggests there should be unit tests for the broadcasting functions on SparseMatrix arguments. Please add some if they aren't there.
There was a problem hiding this comment.
For this case I think it can be left as you mention M.create(broadcastTo(M.valueOf(), size.valueOf()), M.datatype()). Maybe for the future some options could be added to the create matrix to skipCloning, skipValidation, skipPreProcess, etc.
Hi this is according to discussion #3516
It's missing some tests and type checking.