Jump to content

Talk:Row- and column-major order/Archive 1

Page contents not supported in other languages.
From Wikipedia, the free encyclopedia
Archive 1

2dimensional and 3 dimensional arrays

what is the general formula for locating an item in a 2dimensional and 3dimensional array?

It depends on the row-major or column major storage method and how it is extended to higher dimensions. Math-wise, an three dimensional (x,y,z) coordinate is pretty common, and you can assume (1,2,3) refers to the x=1, y=2, z=3 point. if you were going to serialize this into a set of numbers, you might step through your whole array along x first, then y , then z, in a row-major-ish way, or along z (the right-most dimension) first, then y, then x in a column-major-ish way. If you have nx, ny, and nz slots along each of the dimensions, then row-major wise, x varies fastest, so you 'd locate the (x,y,z)th item at z*ny*nx+y*nx+x -- Column-major-wise z varies fastest and x varies slowest, so (x,y,z) would be at the z+y*nz+x*ny*nz cell. Of course there's lots of opportunity for off-by-one errors depending on whether your arrays are zero or one based. Drf5n 17:49, 25 August 2006 (UTC)

Multi-dimensional arrays

The bit on multi-dimensional arrays is a little difficult when you start thinking of what 'column' means in a three or higher dimensional tuple. For instance in a three dimensional (x,y,z) array is it clear that 'x' is the row and 'z' (definitely not 'y') is the column? Or for an n-dimensional (d1,d2,d3,...,dn) array is the row-major/column-major distinction better as something like row-major means d1 varying fastest / is the inner loop versus column-major means dn varying fastest / is the inner loop? Beyond 2d, the column-major concept seems a little fuzzy. Drf5n 18:08, 25 August 2006 (UTC)

It's defined precisely in the article. The terminology is less apt in that case, I agree, but it is totally standard. — Steven G. Johnson (talk)

What are advantages/disadvantages of row-major vs col-major?

It would be a helpful addition to this article to tell us why it's one way in C and the other in Fortran. Are they different for a reason, or was it a random decision? —The preceding unsigned comment was added by Msouth@gmail.com (talkcontribs) 17:32, 4 January 2007 (UTC).

Row major is the most common representation because it means that the rows of a 2-D array are just sub-arrays within a larger array. In C, for instance, multidimensional arrays are not defined directly, but rather as being arrays of arrays. You can't do this with a column major representation, because the inner-most array is interleaved throughout the outer array. Row major representation is also going to give you better cache locality where you are accessing each sub-array independently, such as an array of fixed-length strings.
An advantage of column major is that it may make more sense for vector operations, which is important for scientific computing. Consider the matrix row operations of adding one row to all other rows. You would proceed by multiplying each element in the first column by the first element of the row, and then the next column, and so on. Since each column is contiguous in a column major representation, you will get better cache locality, and so fewer cache misses and better performance.
Something like this needs to be in the article, but I'm not comfortable adding it without a citation, which I don't have. AaronWL 11:05, 22 March 2007 (UTC)
I think if you are making a language it's an arbitrary decision. However, it is important that a programmer using that language knows what order the language uses. When defining the array, the programmer decides how to arrange the axes (xyz, zyx, xzy, etc), and it is up to them to arrange the axes in a way that enhances performance. The programmer can't do this without knowing how the language is going to unfold that multidimensional array (ie which major-order it uses).24.124.100.8 05:08, 8 November 2007 (UTC)
In principle, there aren't really any intrinsic advantages one way or the other. As long as you know what the ordering is, you can order your operations accordingly to maximize locality. (In Aaron's example, for row-major order you would process entire rows at a time, rather than processing one column at a time in column-major, to maximize locality.)
In practice, row-major has an advantage in that human programmers tend to order their loops in the same order as the indices. That is, a programmer will tend to write:
           for i = 1 to m
              for j = 1 to n
                 do something with the array element Ai,j
rather than
           for j = 1 to n
              for i = 1 to m
                 do something with the array element Ai,j
The former loop ordering maximizes locality for row-major order, and the latter maximizes locality for column-major order. As a consequence, many Fortran programmers accidentally put their loops in the wrong order and get poor cache performance, and considerable effort was invested into developing compiler optimizations that could fix the loop ordering.
This kind of thing is discussed in lots of Fortran optimization guides precisely because it is so easy to get wrong (you don't see it to the same extent in C optimization guides). It's probably possible to track down a reputable source giving this as a motivation for preferring row-major order. — Steven G. Johnson (talk) 22:47, 11 March 2010 (UTC)

Confusing nomenclature

I would suggest that there needs to be a clarification that even though memory is stored row or column major in different languages, Matlab and Fortran still INDEX their arrays in a row-major format (i.e. a(2,3) is the element in the second row, third column of matrix a). Can anyone discuss a little on this topic? I got in a hideous argument with a coworker (who programs almost entirely in C++ and I in Fortran) over this subject, and we had to open matlab to settle the argument. We were both stumped.

Andykass (talk) 17:21, 6 May 2010 (UTC)

Row-major and column-major are descriptions of mappings from multidimensional indices to one-dimensional indices (e.g. linear memory). By itself, a multidimensional index like a(2,3) is neither "row-major" nor "column-major". — Steven G. Johnson (talk) 20:21, 6 May 2010 (UTC)
Ah, got it. Thank you. Would you think that point would be valuable on the main page, just to avoid confusion such as mine? Andykass (talk) 16:42, 7 May 2010 (UTC)

Yeah, the whole thing is confusing as is. As I understand it, C &co do this:

array[row][col]

While VB &co do this:

array(col, row)

The array is stored the same in memory, row after row, as you'd expect. And this is why e.g. in VB the Redim Preserve statement can only touch the last dimension, since the array may have to be reallocated and this may need a memcopy. —Preceding unsigned comment added by 192.87.151.1 (talk) 11:00, 11 February 2011 (UTC)

Title

The article's title is Row-major order while it isn't specific to row-major order only; it also fully describes column-major order. Wouldn't it make more sense to rename the article Row-major and column-major order? Column-major order is used by multiple popular applications/libraries/etc. and it isn't uncommon to include both alternatives in a title. ctxppc (talk) 16:51, 8 May 2013 (UTC)

Agreed. Perhaps array index ordering or something all-encompassing? —Ben FrantzDale (talk) 13:35, 9 May 2013 (UTC)
I agree with Row-major and column-major order or Row- and column-major order or similar. I don't think we should go more generic than that; other possible indexing schemes (e.g. Morton order) should have their own articles. — Steven G. Johnson (talk) 14:17, 9 May 2013 (UTC)
Moved to Row- and column-major order. — RFST (talk) 10:05, 19 November 2016 (UTC)

C example vs fortran example

If the intent is to contrast row-major and column-major, then do it in the same language. Switching language just confuses the issue, not to mention the fact that the one language has 0 based arrays and the other 1 based. The point here is to present the differences between column-major and row-major, so the clearest way to do that is to present only those differences. If you want a Fortran example, do it afterwards in both row and column major.

Besides the fact that there are vastly more C programmers than there are Fortran programmers, so I find it questionable to use Fortran as an example language altogether. It seems far more likely that a random reader would be familiar and comfortable with C than with Fortran. — Preceding unsigned comment added by 2001:8000:1434:E00:4DD3:41C0:4A5E:A94C (talk) 00:31, 27 April 2017 (UTC) e: oh, I see. It's because Fortran itself uses column-major, while C uses row-major. Well, still pretty confusing... — Preceding unsigned comment added by 2001:8000:1434:E00:4DD3:41C0:4A5E:A94C (talk) 00:34, 27 April 2017 (UTC)

My disagreement here would be that part of the issue is specifically that various languages and libraries have each adopted a particular order, and that those decisions have been influenced by other languages and libraries. In this sense, C and FORTRAN are often presented as prime examples, so a forced attempt to present the subject with a neutralised approach would even be counterfactual. Of course, in every language you can roll your own multi-dimensional arrays based on underlying linear storage, but this is not the "normal" usage, so it should definitely not replace a language-differential presentation. There is still such a thing as culture and history, also in computer science, and an encyclopedia should not be designed out of ignorance. — RFST (talk) 19:11, 11 November 2017 (UTC)

Is it possible that the Fortran table is totally wrong and actually demonstrates Row-Major order? Maybe I fried my brain trying to understand this already today and I have missed something but it seems that for example the Fortran table "access A(2,3), value a23" means row 2, column 3. This issue is confusing enough without Wikipedia having an issue like this in it. Could someone please confirm I am not going insane? — Preceding unsigned comment added by 178.19.210.162 (talk) 17:05, 20 March 2018 (UTC)

the C part of this article seems to be wrong

C does not access multi dimensional arrays in the form this page shows. it might even store how this page says, but the way to acess it is [y][x] not how is show in this example

sources: all my source urls were blacklisted... so heh, just google multidimensional arrays in C

well at least Visual Studio 2017 C++ compiler agrees with it


I think this is very confusing too, especially with the "Address" column.
The column-major table for C would tend to suggest it's possible to address multidimensional arrays in C in a column-major fashion.
Plus, the Address column adds to the confusion by suggesting (IMO) that A[0][0] and A[1][0] are contiguous in memory.
Moreover, in column-major order, I would expect A[0][1] to be a21 and A[1][0] to be a12, for example.
As far as I can see, the row-major and column-major C tables are the same, the entries are just not in the same order.
I dare not editing the article because I probably just don't understand the point it's trying to make, but we can at least agree the phrasing and overall presentation are very confusing. --Cylardor (talk) 03:29, 15 January 2020 (UTC)

Tiled/non-linear/swizzled encodings

For matrices accessed in both directions with spatial locality in both directions, one alternative to row- and column- major orders is to store the matrix in a "tiled" format, so iterating either rows or columns allows the caches to kick-in. There's a link to the Morton order page in the "see also" section, which is good (Morton order is one particular scheme for tiling; many exist). These tiled schemes are popular on GPUs to access textures, but I believe are also employed in scientific computing to optimize cache behaviour of matrices where either row-major or column-major would be suboptimal (c.f. https://en.wikipedia.org/wiki/Z-order_curve#Linear_algebra)

It might be nice to add info on this. Arzg (talk) 13:45, 9 November 2019 (UTC)

Archive 1