Rank Transformation of an Array











3 votes











Is there a built-in function that rank-transforms an array of data? By rank transformation I mean

data = {2.4,5,1,6,7,10,2}
Rank[data]={3,4,1,5,6,7,2}

where each value in data is assigned a rank from minimum to maximum: the lowest value in data is assigned rank 1, the second-lowest rank 2, etc.
Ordering does not accomplish this, as we obtain

Ordering[data]
{3,7,1,2,4,5,6}

Edit 1: As Carl pointed out, I need to say what should happen in the case of tied ranks. Ultimately, I want to use this rank transformation in the definition of Spearman's rho, where

Covariance[Transpose[{Rank[X],Rank[Y]}]][[1,2]]/(
StandardDeviation[Rank[X]]*StandardDeviation[Rank[Y]])

should equal

SpearmanRho[Transpose[{X,Y}]][[1,2]]

where X and Y are arrays of data of equal length.










Tags: functions, data














asked Nov 26 at 18:37 by tquarton (edited Nov 26 at 19:43)












  • What do you want to return when there are ties?
    – Carl Woll
    Nov 26 at 19:20

  • Ah, great question. Give me a moment to respond in this comment with an edit.
    – tquarton
    Nov 26 at 19:30

  • I've actually edited the question to address your point Carl.
    – tquarton
    Nov 26 at 19:38

  • closely related / possible duplicate: How to get the ranked order
    – kglr
    Nov 26 at 22:40




























3 Answers
5 votes, accepted answer






What about this?

Ordering[Ordering[data]]

{3, 4, 1, 5, 6, 7, 2}

Since Ordering is the bottleneck, here is a variant that needs only one call to Ordering:

Ranking[data_] := Module[{a},
  a = Range[Length[data]];    (* a = {1, 2, ..., n} *)
  a[[Ordering[data]]] = a;    (* write rank k at the position of the k-th smallest element *)
  a
]

Comparison:

data = RandomReal[{-1, 1}, 1000000];
a = Ranking[data]; // RepeatedTiming // First
b = Ordering[Ordering[data]]; // RepeatedTiming // First
a == b

0.13

0.234

True















answered Nov 26 at 19:09 by Henrik Schumacher (edited Nov 26 at 19:18)












  • Brilliant! This does it. Thanks very much.
    – tquarton
    Nov 26 at 19:13

  • You're welcome.
    – Henrik Schumacher
    Nov 26 at 19:14

  • Carl Woll brought up a great point with regards to tied rankings. I thought I'd ping you in this comment in case you wanted to address it. For my purposes, I don't believe there are ties in my dataset of interest, so your solution still holds.
    – tquarton
    Nov 26 at 19:37
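
One way to see why the double Ordering in the answer above yields ranks: Ordering[data] is the permutation that sorts data, and the ranks are its inverse permutation. The following is a minimal sketch on the question's sample data; the names order and ranks are illustrative and do not come from the original answer.

```mathematica
data = {2.4, 5, 1, 6, 7, 10, 2};

(* positions of the elements of data, listed in sorted order *)
order = Ordering[data]
(* {3, 7, 1, 2, 4, 5, 6} *)

(* taking the elements in that order sorts the data *)
data[[order]] == Sort[data]
(* True *)

(* applying Ordering again inverts the permutation, which gives the ranks *)
ranks = Ordering[order]
(* {3, 4, 1, 5, 6, 7, 2} *)

ranks == InversePermutation[order]
(* True *)

(* each element sits at its rank in the sorted list *)
Sort[data][[ranks]] == data
(* True *)
```

The Part assignment in Ranking builds the same inverse permutation directly, which would explain why it gets by with a single call to Ordering.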


















1 vote









I'll answer my own question with a hand-built function that does the job:

Rank[x_]:=Flatten[Table[Position[Sort[x], x[[i]]], {i, 1, Length[x]}]]

Ordering gives a kind of inverse of this function: it returns the positions in the unsorted data of the elements of the sorted data. Rank, by contrast, returns the position in the sorted data of each element of the unsorted data.














answered Nov 26 at 18:58 by tquarton (edited Nov 26 at 19:07)
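
A caveat for the Position-based definition above, sketched under the assumption that the input may contain repeated values: Position returns every matching position, so each tied value contributes several entries and the flattened result is longer than the input. Here positionRank is a renamed copy of the answer's function, used only for illustration.

```mathematica
(* renamed copy of the answer's definition (the name is hypothetical) *)
positionRank[x_] := Flatten[Table[Position[Sort[x], x[[i]]], {i, 1, Length[x]}]]

(* without ties, this reproduces the ranks from the question *)
positionRank[{2.4, 5, 1, 6, 7, 10, 2}]
(* {3, 4, 1, 5, 6, 7, 2} *)

(* with ties, each 2 matches positions 2 and 3 of the sorted list *)
positionRank[{1, 2, 2}]
(* {1, 2, 3, 2, 3} *)
```

Each element also triggers a full scan of the sorted list, so the cost grows at least quadratically with the length of the input; the Ordering-based approaches avoid that.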






















0 votes









Statistics`Library`GetDataRankings[{2.4, 5, 1, 6, 7, 10, 2}]

{3, 4, 1, 5, 6, 7, 2}

This gives the same result as Ordering@Ordering@#& if there are no ties in the input data.

If the input data has ties, tied values share the average of the ranks they span:

Statistics`Library`GetDataRankings[{1, 2, 2, 2, 2, 3, 3, 3, 4, 5}]

{1, 7/2, 7/2, 7/2, 7/2, 7, 7, 7, 9, 10}

It is faster than Ordering@Ordering@#& but slower than Henrik Schumacher's Ranking:

SeedRandom[1]
data = RandomReal[{-1, 1}, 1000000];
a = Ranking[data]; // RepeatedTiming // First

0.18

b = Ordering[Ordering[data]]; // RepeatedTiming // First

0.307

c = Statistics`Library`GetDataRankings[data]; // RepeatedTiming // First

0.226

a == b == c

True

A slightly faster alternative (still slower than Ranking):

ranks = Module[{r = Range@Length@#, o = Ordering@#}, Permute[r, o]] &;
d = ranks @ data; // RepeatedTiming // First

0.203

a == b == c == d

True















answered Nov 27 at 0:24 by kglr (edited Nov 27 at 0:31)
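
For the tie handling the question asks about, average ranks can also be built from documented functions. The sketch below is one possible approach, not the implementation behind GetDataRankings; averageRanks is a hypothetical helper name, and the sample list y is made up for illustration.

```mathematica
(* Hypothetical helper: ordinary ranks, with each group of tied values
   replaced by the mean of the ranks that group occupies.
   PositionIndex groups by exact identity, so 2 and 2. count as different values. *)
averageRanks[data_List] := Module[{ranks = Ordering[Ordering[data]]},
  (* overwrite the ranks of each group of equal values with their mean *)
  Scan[(ranks[[#]] = Mean[ranks[[#]]]) &, Values[PositionIndex[data]]];
  ranks
]

averageRanks[{1, 2, 2, 2, 2, 3, 3, 3, 4, 5}]
(* {1, 7/2, 7/2, 7/2, 7/2, 7, 7, 7, 9, 10} *)

(* Spearman's rho written as the Pearson correlation of the ranks *)
x = {2.4, 5, 1, 6, 7, 10, 2};
y = {1.1, 3, 0.5, 2, 8, 9, 4};
Correlation[averageRanks[x], averageRanks[y]]
SpearmanRho[x, y]
```

For data without ties, as here, the last two values should agree by the definition of Spearman's rho. With ties, agreement depends on SpearmanRho also using average ranks, which is assumed rather than verified here and is worth checking on your version.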





























