The "Analysis of Nabla scoring system" thread


The Nabla scoring system is one in which players are given a score based on how they performed compared to other people who played the same scenario on the same side as them.

For example, if you played Allies in "Station Haapsalu", then your score is compared to the scores of all the other Allied players in that scenario, and the better you did compared to them, the higher your Nabla score is. The difference between your score and the average score is the basis for your Nabla score.

This is supposed to mean that the Nabla score selects the best players in a tournament, irrespective of which side they get assigned to, and irrespective of whether the scenarios were balanced.

Sounds like a great idea. It's the basis for the ROW tournament.

Over in the ROW V thread, an interesting result was published. It showed that of the 25 slots that make up the top 5 positions across the 5 games in round one, 23 of them were occupied by Allies, based on Nabla scoring.

This can be interpreted as saying "for all 5 scenarios, the 'allied' side just happened to be assigned to the better players".

At first glance, this is too much of a coincidence to be believed.
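To put a rough number on "too much of a coincidence", here is a minimal sketch. It assumes each of the 25 slots is an independent coin flip between Allies and Axis, which is a simplification (slots within one scenario are not really independent), so treat the result as an order-of-magnitude estimate only. The function name is made up for illustration.

```python
from math import comb


def prob_at_least(k: int, n: int, p: float = 0.5) -> float:
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Chance that 23 or more of 25 top-5 slots go to the Allies
# if side had nothing to do with Nabla score (assumed independence).
p_value = prob_at_least(23, 25)
print(f"P(at least 23 of 25 slots are Allied) = {p_value:.2e}")
```

Under that assumption the chance is on the order of 1 in 100,000, so pure luck in the side assignments looks like a very unlikely explanation.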

I'd be interested to hear people's analysis of what this result means, and whether our assumptions about what Nabla scoring is doing are correct.

GaJ.

Link to comment
Share on other sites

People have started to wonder whether Nabla in fact favours the players who have the harder battle in an unbalanced scenario. "If it is hard to do well, and you manage to do well, then you get a high Nabla score".

This has some appeal. But if it is true it would mean that:

1) The Allies had the harder time in all 5 battles

_and_

2) At least 5 people assigned to Allies managed to do way better than all the others.

Is this/could this be the case?

GaJ

Link to comment
Share on other sites

I'll recap a few relevant comments from the main ROW thread:

1) Overall average score for the Allies (the average of the per-scenario averages) was 44%. While they did have it tough in two scenarios, overall the Allies scored only 6 points below an even 50%--not a large margin.

2) Intuitively, it seems that winning big in a hard battle gives you a higher score than doing so in an "easy" battle, because you can beat the average by many more percentage points. However, the formula by which the Nabla scores are derived from the "normalized difference from the mean" is not known. I can tell that it isn't at all linear, but whether it fully corrects for the hardness of a battle, I can't tell without more info or more experimentation with the numbers.

3) The sample size is really, really small.

I would really like to know more about the formula used. I thought I had figured out that it did take into account the fact that the mean may not be anywhere near 50%, but now I'm not sure. Someone who knows please weigh in...

Link to comment
Share on other sites

Oh, no offense meant by the "hysteria" comment in the other thread. It just seemed like conclusions were being jumped to that weren't necessarily implied by the data.

I obviously think it's an interesting topic as well. Coming up with a formula to pick the best players out of a tourney like this is tough. Nabla is good, but perhaps some refinement could come from this kind of discussion, and applying alternate formulas to the data.

If the degree of difficulty for a given side is not already addressed in the formula, it needs to be.

Link to comment
Share on other sites

I had made the assumption (perhaps wrongly) that "normalised difference from the mean" didn't have any implied extra calculation.

IE it is simply a division of the individual's difference from the mean by the maximum difference from the mean.

If this is really applied, then the arguments about higher Nabla scores being more likely in harder scenarios (where one player is able to achieve something others can't) don't quite hold water. That player's "difference from the mean" will be "normalised", so his outstanding achievement due to the fact that the situation was really tough is normalised away. The most outstanding player in another situation, where it was easy and all players got a raw score of >90, so only 10 points separates them, still should get the same Nabla score as the most outstanding player in a scenario where some people got 10 points and he managed 60.
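A minimal sketch of that assumed normalisation (divide each difference from the mean by the largest difference from the mean). This is only the reading described in this post, not the documented Nabla procedure, and the two score lists are invented for illustration.

```python
from statistics import mean


def max_normalised(scores):
    """Divide each difference from the mean by the largest absolute difference."""
    avg = mean(scores)
    diffs = [s - avg for s in scores]
    largest = max(abs(d) for d in diffs)
    if largest == 0:  # everyone scored the same
        return [0.0 for _ in diffs]
    return [d / largest for d in diffs]


# Hypothetical raw scores for one side of two scenarios
easy = [90, 92, 94, 96, 100]  # everyone does well, small spread
hard = [10, 15, 20, 25, 60]   # most struggle, one player stands out

print(max(max_normalised(easy)))
print(max(max_normalised(hard)))
```

Under this reading the top player in each list ends up with the same normalised value, which is the "normalised away" effect described above.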

Rambling thinking out loud!

GaJ.

Link to comment
Share on other sites

Originally posted by GreenAsJade:

This can be interpreted as saying "for all 5 scenarios, the 'allied' side just happened to be assigned to the better players".

This is impossible for several reasons.

First, the program Cpl Carrot created randomly assigned the 72 players to their respective sections. The only manipulation I did in this process was to shift a few players around based on their ROW experience, so as to ensure we had a good mix of vets and noobs across the board.

Second, assigning who played what side was also done randomly. Take a look at the spreadsheet (which I can e-mail to anyone who is interested) and note how I organized the various matchups. Group 1 / section 1 was my template. Once I made sure all 6 players in that section played each person and scenario once I copied it, then changed the names to those in G1/S2, and repeated it for the other 10 sections. There was no attempt on my part to make sure player X got to play side Y in scenario Z.

More on this in a bit...

Link to comment
Share on other sites

Originally posted by GreenAsJade:

I had made the assumption (perhaps wrongly) that "normalised difference from the mean" didn't have any implied extra calculation.

IE it is simply a division of the individual's difference from the mean by the maximum difference from the mean.

I referred back to the reference document on Nabla, and found that it said first:

"The difference from the median will then be divided by the standard deviation resulting in the 'normalized difference from the median'."

So that's the normalized difference--no mention of maximum difference from mean.

It then said:

"The normalized difference from the median is then assigned a Nabla score for the scenario. This is done with a formula created by Nabla that is at work inside the scoring program."

That formula is in the document, I just noticed, but it is way beyond my math background, or at least, way beyond what I remember. However, I don't see anything in that formula that would take this critical element into account.

As long as the average of all scores is around 50% per side, it's no problem, but the average score for the Allies was around 30% in two scenarios. I need to look back at the data, but if winning as the allies at 100% results in a higher nabla score than winning as the axis at 100%, then I think the scoring system has a problem, since the degree by which the mean varies from 50% would limit how well you could do.

Kingfish, I don't think anyone is suggesting a conspiracy--I am certainly not--or accusing you of favoring anyone with the match-ups. It would have been impossible to do so ahead of time anyway, since it is not possible to know what the mean score is going to be for a given side without the data that the tourney itself provides.

If you could shed some light on the inner workings of the formula, specifically, does it account for large variation in the mean from 50% in its weighting of the Nabla scores, it might clear some things up.

Link to comment
Share on other sites

Originally posted by Malakovski:

If you could shed some light on the inner workings of the formula, specifically, does it account for large variation in the mean from 50% in its weighting of the Nabla scores, it might clear some things up.

There isn't a formula or program per se. I'm simply using the same step-by-step process that I have used from ROW III to date. Here it is copied from the Tournament manual which is posted over at Boots and Tracks:

1. The median score for each side of all the scenarios is determined.

2. The difference between a player's score and the median score for the side he played will be determined for all scenarios.

3. The standard deviations from the median scores are determined for all scenarios. This value will always be the same for both sides of a given scenario due to #1 above.

4. The difference from the median will then be divided by the standard deviation resulting in the "normalized difference from the median".

5. The average of all a player's scores (one for each scenario) is then determined, resulting in the player's final tourney score. The high score in each section is the winner of the section.
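The five steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual spreadsheet: the spread in step 3 is taken about the median with n - 1 in the denominator, the sketch stops at the "normalized difference" (it does not attempt the further conversion formula mentioned in the reference document), and all names and scores are hypothetical.

```python
from math import sqrt
from statistics import median


def normalised_differences(scores):
    """Steps 1-4 for one side of one scenario.

    `scores` maps player name -> raw score (0-100) for that side.
    """
    values = list(scores.values())
    med = median(values)                             # step 1
    diffs = {p: s - med for p, s in scores.items()}  # step 2
    # Step 3: spread about the median (assumed n - 1 denominator)
    sd = sqrt(sum(d ** 2 for d in diffs.values()) / (len(values) - 1))
    if sd == 0:  # everyone scored the same
        return {p: 0.0 for p in scores}
    return {p: d / sd for p, d in diffs.items()}     # step 4


def tourney_score(per_scenario_values):
    """Step 5: average a player's per-scenario values."""
    return sum(per_scenario_values) / len(per_scenario_values)


# Hypothetical scores, not ROW data. Axis scores are taken as
# 100 minus the opponent's Allied score (an assumption).
allied = {"P1": 20, "P2": 30, "P3": 40, "P4": 90}
axis = {"Q1": 80, "Q2": 70, "Q3": 60, "Q4": 10}

allied_norm = normalised_differences(allied)
axis_norm = normalised_differences(axis)
print(allied_norm)
print(axis_norm)
```

With mirrored scores like these, the spread is the same for both sides, so each Allied player's value is just the negative of his opponent's.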

I'm not sure I understand exactly what the fuss is all about.

Can someone explain it to me?

Link to comment
Share on other sites

Right, did some more calcs and here's the problem I'm trying to get at. Let's look at Maleme. Note: I'm not picking on anyone in particular here, I'm just looking at high scores and comparing them. Also note I'm rounding these numbers off to the nearest whole.

WN's Allied score was 90%, or 58% higher than the average. The maximum it was possible to exceed the average for the Allies was 68%.

JonL's Axis score was 96%, or 27% higher than the average. The maximum it was possible to exceed the average for the Axis was 32%.

WN's nabla score was 2.74.

JonL's was 0.72.

WN beat the average by way more, so that's reasonable right?

Well, once we look at these scores versus what was possible, here's what you get:

WN's Allied score of 58 out of 68 possible over the mean is 85% of the possible excess.

JonL's score of 27 out of 32 possible over the mean is 87% of the possible excess.

So we see that JonL actually did slightly better, if you limit the scale to what was possible for him to do--e.g. equating an Axis score +32 over the mean with an Allied score of +68.

It seems that limiting the scores thus is the only fair thing to do, since otherwise to equal WN's nabla score, JonL would have had to score 126%.

In case it isn't obvious, I'm thinking here only of future tournaments. The scoring of this one should stand as is, since they have been published.

It seems that a much simpler system of scoring would simply be to look at each score as a percentage of the possible excess over (or deficit from) the average score.
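A rough sketch of that simpler scoring idea, assuming scores run from 0 to 100. The function names are invented, and the side averages of 32 (Allies) and 68 (Axis) are inferred from the rounded "maximum possible excess" figures above, so JonL's excess comes out as 28 here rather than the rounded 27 quoted earlier.

```python
def relative_score(score, side_average, max_score=100.0):
    """Excess over (or deficit from) the side average, as a fraction
    of the largest excess (or deficit) that was possible."""
    diff = score - side_average
    room = (max_score - side_average) if diff >= 0 else side_average
    return diff / room if room else 0.0


def score_needed(side_average, target_excess):
    """Raw score required to beat the side average by `target_excess`."""
    return side_average + target_excess


wn = relative_score(90, 32)    # Allied side, inferred average 32
jonl = relative_score(96, 68)  # Axis side, inferred average 68
needed = score_needed(68, 58)  # Axis score needed to match WN's +58

print(f"WN: {wn:.1%}, JonL: {jonl:.1%}, Axis score needed: {needed}")
```

With these rounded inputs the two players land within a couple of points of each other, and an Axis player would need a raw score above 100 to match WN's margin over his side's average.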

Link to comment
Share on other sites

Originally posted by Kingfish:

I'm not sure I understand exactly what the fuss is all about.

Can someone explain it to me?

I was composing the last post while you were writing yours, so I didn't read it until after, but I think I summarized the problem pretty well.

I'm not fussing--I think the Nabla system is really good--I just think this tourney has revealed an instance in which the fairness of the scoring is questionable, namely, when the average score in a given scenario is greatly different from 50%. It's possible for one side to exceed the average by much more and therefore get a much higher Nabla score. That introduces an imbalance into the tourney.

It's a random imbalance, since side assignments are random, but an imbalance that can have significant effects for individual players.

The example in the previous post should be illustrative of the problem.

[[edited for clarity]]

[ June 11, 2005, 06:45 PM: Message edited by: Malakovski ]

Link to comment
Share on other sites

Kingfish,

Let me join the rush to say no-one is suggesting a conspiracy! In fact the opposite: given the *sure knowledge* that the players were assigned randomly, we're looking for an explanation about the strange result of ROW V round 1.

The result is strange because the *fundamental basis* of Nabla is that it is supposed to factor out any influence of scenario balance on the results.

If players are randomly assigned sides in scenarios, you would then expect the results to show a random spread of good players across Axis/Allies.

The ROWV Round 1 results seem to show that all the good players were Allies.

That is what the fuss is all about.

Malakovski seems to have argued that

"As long as the average of all scores is around 50% per side, it (the Nabla system) has no problem."

But the whole reason for using Nabla is to allow fairness in unbalanced scenarios!

If you were having balanced scenarios, you could just use the raw scores to assess the players.

Thus it would be a great help if someone with the time and ability could assess Malakovski's reasoning & shed some light on this.

GaJ

Link to comment
Share on other sites

Originally posted by GreenAsJade:

But the whole reason for using Nabla is to allow fairness in unbalanced scenarios!

You got my argument exactly. When the average score for a given side in a given scenario is high or low, it affects the maximum possible Nabla score for that player in that scenario. That's what the test case comparing WN's and JonL's scores was all about.

WN beat the average for his side by 58, but it wasn't possible for JonL to beat the average for his side by 58. The best he could do was 32.

If you look at their scores in terms of a percentage of the maximum possible excess over their side's average, they are almost identical, with JonL actually a little higher, yet WN's Nabla score is almost four times higher than JonL's.

So JonL's drawing Axis for Maleme, though random, was a serious handicap.

In short, the nabla system does not seem to be taking scenario balance into account.

Link to comment
Share on other sites

I might be a moron here but...

Isn't the point of NABLA that by using the median score instead of the average you eliminate the problem of a noob encountering a grog? This is possible since you calculate the standard deviation for each scenario. That is the curve we are following. This in my mind suggests (and I may well be wrong) that when one has a scenario where the median is, say, 50% but a lot of people managed to get, say, 75%, you won't get the high points until you get above that limit. I suggest that that might be the effect we are seeing in the results. If so, then NABLA is the answer to all our prayers.

Link to comment
Share on other sites

When it comes to advanced math, I'm as likely to be a moron as anyone, but I think I am on to something in the above reasoning.

The median is just the middle score in an ordered list of all the scores for a side. It shouldn't differ tremendously from the average.

Standard deviation I do not have a lot of experience with, but it seems only to be saying how much the scores varied, but again based on an average (ish) value for each side.

I don't see how either of those is going to address the issue raised here--that it seems possible to get a much higher nabla score playing one side than another.

Nabla is scoring on some sort of curve around a mean/median number, but if that curve extends off the scale of possible scores (0 or 100), your score is limited from the outset.

The more the average for your side differs from 50%, the more pronounced this problem becomes.
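To illustrate that ceiling, here is a small sketch that assumes the score is built on (score - median) / standard deviation and that raw scores are capped at 0 and 100. The medians of 30 and 70 and the spread of 20 are hypothetical numbers chosen only to show the effect.

```python
def normalised_ceiling(side_median, sd, max_score=100.0):
    """Largest normalised difference a player on this side can reach."""
    return (max_score - side_median) / sd


def normalised_floor(side_median, sd, min_score=0.0):
    """Lowest normalised difference a player on this side can reach."""
    return (min_score - side_median) / sd


sd = 20.0  # hypothetical spread, taken as equal for both sides
for side, side_median in (("Allies", 30.0), ("Axis", 70.0)):
    print(side,
          "ceiling:", normalised_ceiling(side_median, sd),
          "floor:", normalised_floor(side_median, sd))
```

With these numbers the side with the low median can climb more than twice as far above its median as the other side can, while the side with the high median has correspondingly more room to fall.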

At least that's what it seems like to me, a liberal arts major who hasn't done any math more advanced than multiplication and division for ten years.

There's clearly a lot going on "under the hood" of nabla, so perhaps I'm mistaken. If I'm not, though, the system needs a little tweaking before ROW VI...

Link to comment
Share on other sites

The way I see it is like this:

We take the median. If we have a group of players, say 17, and 9 of them suck (I mean it) then the median might be say... 17%. Now assume that 3 players are great (say they get 50% out of the scenario) and the other 5 average (say 35%); then we want the NABLA to give high scores when you pass 35%, right? That is why we calculate the standard deviation: it will have the same "point sum" but will be "distributed" differently along the axis. So a maximum deviation from the median will give the same score but +-15% will not. At least that is the way it should work. I'm drunk... Sorry.

Link to comment
Share on other sites

Originally posted by Europa:

This in my mind suggests (and I may well be wrong) that when one has a scenario where the median is, say, 50% but a lot of people managed to get, say, 75%, you won't get the high points until you get above that limit.

Ahhh...I think I may see something. Or maybe I'm just getting really tired.

The above is probably true, but only within the scores for a given side, since all the Nabla scoring seems to deal only with one side at a time (if I am reading the abstract right).

But it doesn't take account of relating scores from opposite sides of the same scenario, so it seems.

Again I go back to the above example, both WN and JonL did almost as well as they possibly could for their respective sides, but the nabla scores differed by a factor of four.

Reasoning skills faltering...time for bed...

Link to comment
Share on other sites

Ah yes. The deviation from the median can have 2 causes:

1. You are great.

2. Your opponent sucks.

Or vice versa. Makes 4. Anyway, you get my point.

Bottom line:

I don't think a tournament without an everyone-meets-everyone system can eliminate that problem.

Link to comment
Share on other sites

Originally posted by Malakovski:

Originally posted by Europa:

This in my mind suggests (and I may well be wrong) that when one has a scenario where the median is, say, 50% but a lot of people managed to get, say, 75%, you won't get the high points until you get above that limit.

Ahhh...I think I may see something. Or maybe I'm just getting really tired.

The above is probably true, but only within the scores for a given side, since all the Nabla scoring seems to deal only with one side at a time (if I am reading the abstract right).

But it doesn't take account of relating scores from opposite sides of the same scenario, so it seems.

Again I go back to the above example, both WN and JonL did almost as well as they possibly could for their respective sides, but the nabla scores differed by a factor of four.

Reasoning skills faltering...time for bed...

Link to comment
Share on other sites

Originally posted by Europa:

Now if NABLA works as intended, 2 players doing as "well" as possible would get the same score. If the points aren't the same, well, then one player didn't "stand out" as much as the other from the crowd, if you catch my meaning.

Ahhh...I follow you now.

However, my question reappears at another level. Is it possible to "stand out from the crowd" to the same degree when the average for your side is 70% versus 30%?

If the formula for determining it is based on simple points over the average, the answer would still seem to be no. You can only stand out so many percentage points when you only have 30 to work with versus 70.

Again WN's and JonL's scores illustrate this. Neither of them could have done much better. Perhaps WN stood out more from the crowd, so to speak (I have no idea how to do the math to check), but surely not four times more?

That's the issue--the huge discrepancy in doing as well as you could with one side versus doing as well as you could with the other in the same scenario.

I think we need a math major to really answer this one...

Link to comment
Share on other sites

Regarding the scoring system, well I can't really comment as my stats is not the best.

It is possible that there is a mistake in the code. I have asked Kingfish to double check with his spreadsheet, but if anyone else has it they are welcome to do it as well.

The standard deviation is calculated with the following formula:

sd^2 = SUM((x - median)^2) / (n - 1)
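For anyone who wants to check the numbers by hand, here is a small sketch of that formula, reading the denominator as n - 1 (an assumption about the intended grouping). It is compared with Python's ordinary sample standard deviation, which is taken about the mean rather than the median. The scores are made up.

```python
from math import sqrt
from statistics import median, stdev


def sd_about_median(scores):
    """Square root of SUM((x - median)^2) / (n - 1)."""
    n = len(scores)
    if n < 2:
        raise ValueError("need at least two scores")
    med = median(scores)
    return sqrt(sum((x - med) ** 2 for x in scores) / (n - 1))


scores = [20, 30, 40, 90]  # hypothetical raw scores for one side

print("about the median:", sd_about_median(scores))
print("about the mean:  ", stdev(scores))
```

The spread about the median can never be smaller than the spread about the mean, so dividing by it gives slightly smaller normalised values whenever the median and the mean differ.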

Link to comment
Share on other sites

Europa, you have put your finger on the problem:

"Now if NABLA works as intended, 2 players doing as 'well' as possible would get the same score."

Mal's analysis seems to indicate that this is not the case.

The more unbalanced the scenario, the more you can stand out from the crowd if you're on the "disadvantaged" side, and the higher the Nabla score you can get.

A highly unbalanced scenario, like Wet Triangle, demonstrated this effect. Put a great player like Walpurgis on the disadvantaged side, and he can exceed the average score by a staggering amount!

GaJ

Link to comment
Share on other sites
