The "Analysis of Nabla scoring system" thread

Originally posted by dieseltaylor:

Nice to see people taking an interest : )

The thinking behind Nabla goes back to 2001 - a search on CMBB will reveal threads. The underlying philosophy is here:

http://www.cs.helsinki.fi/u/hurri/nabla-system/

You can find Nabla's 33-page manual that explains the scoring system on this page as well. It should be noted that the scoring of ROW tournaments changed with ROW III (going from the original to a "modified" method - not sure why it changed). I think the original method used a "scoring curve" that helped address some of the issues that have been raised here. The "scoring curve" is described in the manual in section 2.4 (page 11).

Originally posted by Ace Pilot:

It should be noted that the scoring of ROW tournaments changed with ROW III (going from the original to a "modified" method - not sure why it changed).

The original program was designed to be run in DOS, and that is something I know next to nothing about. I took over midway through the third ROW (or was it the second?), but couldn't find anyone to help me with the scoring. After reading the manual, and a couple of test runs, I was able to score the tournament using the modified version. I have stuck with it ever since.

In reality, the modified version is identical to the original with the exception of one step (step 6).

Here is the original:

1. It first looks for scores that do not add up to 100 due to contested/unoccupied VLs, and splits the difference between the players equally (this is for CMBO, note that in CMBB CM points always total 100). For example, a final score of 70-20 would be converted to 75-25. Scores will always add up to 100 after this adjustment. There is a very good reason for doing this involving agreements between players designed to maximize their scores. Perhaps you can figure it out?

2. The median score for each side of all the scenarios is determined.

3. The difference between a player’s score and the median score for the side he played will be determined for all scenarios.

4. The standard deviations from the median scores are determined for all scenarios. This value will always be the same for both sides of a given scenario due to step 1 above.

5. The difference from the median will then be divided by the standard deviation resulting in the “normalized difference from the median”.

6. The normalized difference from the median is then assigned a Nabla score for the scenario. This is done with a formula created by Nabla that is at work inside the scoring program.

7. The average of all a player’s Nabla scores (one for each scenario) is then determined, resulting in the player’s final tourney score. The high score in each section is the winner of the section.

You'll note that step 1 is not an issue with CMBB or CMAK, while steps 2 thru 5 can easily be done in Excel. The modified scoring version skips step 6 and goes to step 7.
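The modified procedure described above (steps 1 to 5, then straight to step 7) can be sketched in a few lines of Python. This is only an illustration, not the ROW spreadsheet or Nabla's program: the function names, the data layout, and the choice to measure the spread as the root-mean-square deviation from the median (rather than from the mean) are assumptions based on the wording of step 4.

```python
from math import sqrt
from statistics import median


def split_difference(score_a, score_b):
    """Step 1: share unclaimed points equally so the pair sums to 100."""
    missing = 100 - (score_a + score_b)
    return score_a + missing / 2, score_b + missing / 2


def normalized_differences(side_scores):
    """Steps 2-5 for one side of one scenario.

    side_scores maps player name -> adjusted score for that side.
    """
    med = median(side_scores.values())                     # step 2
    diffs = {p: s - med for p, s in side_scores.items()}   # step 3
    # Step 4 (assumed reading): RMS deviation from the median.
    spread = sqrt(sum(d * d for d in diffs.values()) / len(diffs))
    if spread == 0:
        # Everyone scored the same, so nobody is above or below the median.
        return {p: 0.0 for p in diffs}
    return {p: d / spread for p, d in diffs.items()}        # step 5


def tourney_score(per_scenario_scores):
    """Step 7: average a player's per-scenario scores."""
    return sum(per_scenario_scores) / len(per_scenario_scores)
```

With CMBB or CMAK results, `split_difference` changes nothing because the two scores already total 100.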

So the interesting thing is what step 6 did to assign a Nabla score to the scenario. Is this the area where the asymmetric curves were meant to level out to reduce outlier values?

If anyone knew for sure I suppose they would tell us. If this is the step where the jiggery-pokery took place, it would explain [wot do I know] why their stated aim of steady play outgunning a freak result was meant to cut in. ... Oh well, plenty of time before RoW6 : )

The formula created by Nabla for determining the Nabla score is complex for any non-Math major. It is also highly customizable. There are several parameters (values) that can be changed to produce the curve desired by the tournament operator. I played with these parameters for weeks, studying the resulting curves until I found out what values would limit outlier results to my satisfaction.

Having not been involved with ROW for 2.5 years, I have no idea what values are plugged into the formula; but I feel confident my original parameters are not being used. Look at the results of older ROW tourneys if they are available somewhere. I think you will see that scoring was different.

It is true that players playing a side of a scenario with a very high median cannot do much better than that median; but the steepest part of the curve should surround the median. The curve I had in place was almost flat in the region where the top 5 or so scores for a side would fall.

A score of 90 from a side of a scenario where 40 was the median would not score many more Nabla points than a score of 70 from the same side.
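To make the "flat at the top" idea concrete, here is a rough sketch. Nabla's actual formula is not reproduced in this thread, so a scaled tanh stands in for it; the function `squash`, the cap of 1.2, and the standard deviation of 15 are all made-up values for illustration only.

```python
from math import tanh


def squash(z, cap=1.2):
    """Hypothetical stand-in for step 6: close to linear around the
    median (z = 0), flattening toward +/-cap for extreme results."""
    return cap * tanh(z / cap)


MEDIAN, SD = 40, 15  # assumed values for one side of a scenario

for raw in (50, 70, 90):
    z = (raw - MEDIAN) / SD  # normalized difference from the median
    print(raw, round(z, 2), round(squash(z), 2))
```

With these assumed numbers, going from 70 to 90 adds far less after squashing than it does on the linear scale, which is the behaviour described above.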

The Nabla scoring system was a work in progress. Nabla had even mentioned that he thought the curve might need to be different for every scenario. This is, I think, what we are noticing here with these results.

Perhaps players should always play the same side in a tournament. 36 Allied players, and 36 Axis players. Then have an Allied champ, and an Axis champ. This would alleviate the problems being discussed here.

Treeburst155 out.

I just read Kingfish's last post. He's not using the actual scoring formula (step 6) created by Nabla. This is my fault. :( Perhaps I can re-familiarize myself with the nitty-gritty of the scoring system, and pass it along to Kingfish. I think the actual DOS scoring program is still available on Nabla's site. My recommended parameters for the formula are on pages 26-28 of the manual.

Treeburst155 out.

Hi Gents,

This is good stuff and something I am very interested in.

The TCP IP Tourney "Luga Breakthrough" is being run on exactly the basis described by some here.

I.e. All players stick with one side (Allied or German) all the way through the tourney.

The one change I have made is to assess, after each game, who should be playing whom based on points scored so far.

Obviously round one is random but round two is based on the scores of the players in round one (scenario one). This only works if you have tight timescales and ensure that all players are finished before the next round is issued.

What it should mean is that players are evenly matched and should ensure a close run tourney.
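A minimal sketch of the re-pairing idea above, assuming each player stays on one side and the running scores are kept in plain dictionaries. The function name and data layout are hypothetical, not taken from the Luga Breakthrough tourney itself.

```python
def pair_by_standing(allied_scores, axis_scores):
    """Pair the top Allied player with the top Axis player, and so on.

    Each argument maps player name -> points scored so far.
    Assumes both sides have the same number of players.
    """
    allied = sorted(allied_scores, key=allied_scores.get, reverse=True)
    axis = sorted(axis_scores, key=axis_scores.get, reverse=True)
    return list(zip(allied, axis))


pairs = pair_by_standing(
    {"A1": 62, "A2": 45, "A3": 71},
    {"X1": 38, "X2": 55, "X3": 29},
)
print(pairs)  # [('A3', 'X2'), ('A1', 'X1'), ('A2', 'X3')]
```

This sketch does nothing to prevent rematches; a real organizer would probably want to check for those by hand.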

In the discussions here it would not be unreasonable to rate the top Allied player and the top German player for ROW.

If one wanted a final match between them get the last game to be played from each side at the same time. This ruins FOW but is about the fairest way to get a true showdown for the two top players. The designers / playtesters would have to choose something that could work well played this way.

Just some food for thought.

H

Here are the Nabla scores from highest to lowest for "Across Moltke" as calculated by the original Nabla formula with my recommended values (Page 26-28 of Nabla manual) for the parameters. You'll see that big wins weren't as big. In fact, the corresponding big losers were punished for poor performance, rather than the winner rewarded. For example, Walpurgis beat me 91-9. His Nabla score is 1.17 while my negative score is significantly worse at -1.71.

Walpurgis_Nacht 1.17
Londoner 1.17
GreenHornet 1.08
Bigduke6 1.07
JonS 0.98
Ted 0.94
jbertles 0.91
Platehead 0.91
Sivodsi 0.83
dangerousdave 0.76
ElmarBijlsma 0.76
flammenwerfer 0.72
Flenser 0.72
Larry_Thorne 0.68
stikkypixie 0.64
Cpt_T 0.64
Raketenpanzerbuchse 0.54
MerkinMuffly 0.54
General_Colt 0.54
Sleekit 0.48
Panzertwat 0.48
Malakovski 0.48
Europa 0.48
Gtimthane 0.42
BigDog944 0.42
LT_Bull 0.36
JPS 0.36
tabpub 0.29
simovitch 0.29
dieseltaylor 0.29
kenfedoroff 0.21
Other_Means 0.21
Melnibone 0.21
KanonierReichmann 0.21
Steve_McClaire 0.13
Cuzn 0.13
Renaud -0.13
Dawg_Bonz -0.13
The_Enigma -0.21
StoneAge -0.21
JonL -0.21
JimCrowley -0.21
MickOZ -0.29
JeffWilders -0.29
Heavy_Drop -0.29
yacinator -0.36
SteveS -0.36
a1steaks -0.42
Vadr -0.42
sandy -0.48
The_Capt -0.48
Sripe -0.48
Frenchy -0.48
CombinedArms -0.54
Artavash -0.54
Andrew_Kulin -0.54
peterk -0.64
Nefarious -0.64
Paco_QNS -0.68
mPisi -0.72
FGM_Smashing -0.72
Nestor -0.76
GSX -0.76
BigMik1 -0.84
Michael_Dorosh -0.96
GreenAsJade -0.96
Foxholerob -1.00
JohnO -1.08
Sgian_Dubh -1.28
ded -1.32
Treeburst155 -1.71
Redwolf -1.71

Would the final rankings of this tournament have changed? I don't know. There probably would not have been such a large range of scores. Things would have been packed tighter.

If there is another ROW, I would be willing to take care of the scoring if organizers so desire. I would not play.

Treeburst155 out.

[ June 14, 2005, 04:19 PM: Message edited by: Treeburst155 ]

You've still given the losing side the negative amount of points of the winning side. The problem with that is that it is very unfair on the other participants of a group in which somebody like WN is playing. Notice how the big scores he racked up meant that everybody in his group scored in negatives? I really don't think that is fair that they end up with awful overall scores just because one person in their group is spectacularly good.

An improved Nabla system has to include a way of scoring points for the losing side that is not an arbitrary 'we'll make it the negative of the winner's score'.

[edited for spelling]

[ June 14, 2005, 05:09 PM: Message edited by: Sivodsi ]

Sivodsi,

Walpurgis did very well against ALL his opponents. The "curve" used awarded him LOTS of points for these performances. The one game everyone in Group 4 played against Walpurgis ruined their average Nabla score. The curve illustrated above really wouldn't do that.

Look at the difference between the highest and lowest score for the scenario above. Compare that with the actual ROW results for "Across Moltke".

Treeburst155 out.

Hi Treeburst,

I don't think anyone is arguing that rewarding a big win is wrong. The problem is that under the Nabla system, the unfavored side in an unbalanced scenario has the opportunity to rack up the points, thus depriving those playing the other side of the chance to advance to the finals.

So, to score big points in this tournament you need:

1) to play the unfavored side in the most unbalanced scenarios

2) be strong enough while your opponent is weak enough to take advantage by winning heavily.

So, by point 2) above the player should be rewarded, but 1) above is just a matter of luck which should not favor one player over another under any circumstances.

...but if HUGE wins were not heavily rewarded, as in my "Moltke" example above, a strong player would have to be consistently strong to stay ahead of the pack. Note that the top 8 or so players in my example are well within striking distance of Walpurgis. The one huge win doesn't put him way out in front.

BTW, Walpurgis won Group 4 no matter how you look at it. The gap would just have been much less.

Treeburst155 out.

TB,

What exactly is this new system you ran for Moltke? You say it is the original Nabla - do you still have the program?

Also, isn't this new system simply 'tightening up' the numbers, but keeping the same ratio? In other words, if under the old system player X had a 10 point lead over player Y, but in the new one the ratio is now 8 to 4?

Hi Kingfish,

Nabla pulled his scoring program off his site at one time. At some point, he made it available again. I just discovered and downloaded it, read my own instructions in the Nabla manual, and ran the results for Moltke. I'm fully functional with the Nabla scoring system again. :) I thought it was lost forever. Thanks, Nabla!!

The original system definitely tightens up the numbers if you use my recommended parameters. I really can't say for sure if it would change the final ranking of the players. Those who had one very bad game, or one very good game would not have their final scores affected so severely by the one performance.

Walpurgis would still win Group 4 because he was consistent throughout his games. The gap would have been much closer, with other Group 4 people finishing with higher final Nabla scores.

Treeburst155 out.

Originally posted by Kingfish:

Also, isn't this new system simply 'tightening up' the numbers, but keeping the same ratio?

I don't think so. My score under the ROW V system was 0.46, under the original nabla it's 0.48. So my score actually increased slightly, while the highest scores dropped dramatically. Also, scores are no longer inverses, so that ratio should be changing as well.

The formula is complex (i.e. I'd have to do some research just to understand what the hell it is doing exactly), but looking at how the numbers change from the ROW V scoring to that system, it appears to adjust the entire shape of the "curve" representing nabla scores as raw scores increase.

I would have to do some more tinkering, but if no formula is being applied in ROW V scoring, the nabla points should be linear with increasing raw score, up to great heights if the average for that player's side is low.

I'd like to emphasize again (for those who understandably skipped my long post above), this is not really an issue of fairness so long as all players have the same maximum potential score when all scenarios are taken into account.

If that's not a factor, then it's an issue of how you want the tourney to play. Should one big win be enough to swing a section? Or should it be a fight to the bitter end in every scenario, with a chance to battle back from a defeat?

I would also suggest considering arranging the sections a bit differently, perhaps making them larger with more than one finalist. The valid point was raised above: no one in WN's group had a prayer when pitted against one of his huge scores (and penalized by the inverse score), and that has to take away from the fun.

Malakovski,

The adjustable parameters in the command line all affect the curve. You can have an asymmetrical curve or a symmetrical one. You can flatten it out wherever you want. You can punish extremely poor performance as much as you want.

I do not have the actual formula; but Nabla has made the source code available on his site for anyone with a compiler. The formula is in there for sure. Trial and error with sample input and different parameters is all you need to analyze the curve for desirable characteristics.
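The trial-and-error approach described above might look something like the sketch below. It does not use Nabla's formula or his command-line parameters, which are not given in this thread; `asym_curve`, `cap_win`, and `cap_loss` are invented here purely to show how one could tabulate a curve for different settings.

```python
from math import tanh


def asym_curve(z, cap_win, cap_loss):
    """Hypothetical asymmetric curve: separate ceilings for results
    above the median (cap_win) and below it (cap_loss)."""
    cap = cap_win if z >= 0 else cap_loss
    return cap * tanh(z / cap)


sample_z = [-3, -2, -1, 0, 1, 2, 3]  # sample normalized differences

# Tabulate the curve for a few made-up parameter settings.
for cap_win, cap_loss in [(1.2, 1.2), (1.2, 1.8), (2.0, 2.0)]:
    row = [round(asym_curve(z, cap_win, cap_loss), 2) for z in sample_z]
    print(f"win={cap_win} loss={cap_loss}: {row}")
```

Setting `cap_loss` larger than `cap_win` punishes a heavy loss more than it rewards an equally heavy win, which is the kind of shape one would look for in the printed rows.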

I could run the entire ROW V tourney through this original system; but putting together the input file would take several hours. Also, if it changes rankings too much there could be ...um...problems.

Treeburst155 out.

Originally posted by Treeburst155:

See page 30 of the Nabla Manual for the scoring formula!!!

I saw it, but I need a math textbook to decode it!

Further to your previous post, I didn't know the system was so flexible. I thought nabla was a single, semi-rigid formula.

Was I correct in my assumption that it did not directly take into account the amount the average score for a side differed from 50%?

And no, rescoring ROW V should definitely be put off until it's all over. Really this whole discussion should have been, but once a thread gets rolling...

If a scenario is extremely unbalanced, the ability of good players to excel from the favored side would be reduced because you can't score more than 100. The only thing mitigating this problem is that the steep part of the curve is near the median. So, if the median for a side was 90 the curve would be starting to flatten by the time scores got to 100 anyway.

A good player playing the strong side in several games would have a rougher time winning the tourney; but the problem would be less severe with the original Nabla system.

The real answer is not to use extremely unbalanced scenarios; or to have an Allied champ and an Axis champ, with all players playing only one side in all games.

Also, the problem disappears if the scenario doesn't go much beyond 70-30 in balance. As long as there are 30 or so points possible above the median for a side there is room for the best to excel. More imbalance than that would tend to detract from players' enjoyment of the scenario IMO.
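The ceiling effect described above can be put in numbers. Under the modified (linear) scoring, the best possible result for a side is (100 - median) / standard deviation. The function name and the standard deviation of 15 below are assumptions for illustration, not figures from any ROW tourney.

```python
def max_normalized(median_score, sd):
    """Largest normalized difference a side can reach, given that a
    raw score cannot exceed 100."""
    return (100 - median_score) / sd


SD = 15  # assumed spread, identical for both sides of a scenario

for favored_median in (50, 70, 90):
    # Scores sum to 100, so the two medians mirror each other.
    unfavored_median = 100 - favored_median
    print(
        favored_median,
        round(max_normalized(favored_median, SD), 2),
        round(max_normalized(unfavored_median, SD), 2),
    )
```

At a 70-30 balance the favored side still has 30 points of headroom above its median, while at 90-10 it has only 10 and the unfavored side has 90.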

Treeburst155 out.

I must admit I was a bit surprised by the large differences that emerged in players' scores in this latest ROW tournament. I was used to the "old" Nabla formulae which did, as you say, produce much closer overall results between opponents. The old method tended to benefit more consistent players who battled and scrapped their way to a win, rather than rewarding the more flamboyant player who could manufacture spectacular wins but also suffered occasional losses from the risks they took. It comes down to what type of player style should be rewarded, which in my opinion depends upon the type of Nabla system used.

Regards

Jim R.

Originally posted by Treeburst155:

Perhaps players should always play the same side in a tournament. 36 Allied players, and 36 Axis players. Then have an Allied champ, and an Axis champ. This would alleviate the problems being discussed here.

Treeburst155 out.

Unfortunately this doesn't address the problems being discussed here at all.

The issue being discussed is the extra scoring range available to the "disadvantaged" people in unbalanced scenarios.

It is not the case that all scenarios will be unbalanced in favour of Axis or Allies.

GaJ

I think Treeburst's revelation of the missing "step 6" really explains how we got to where we are.

Nabla put his great mathematical mind to coming up with a scoring system that deals with unbalanced scenarios. It seemed hard to believe he would have completely missed this aspect. And indeed, it appears he had a magical means of dealing with it that didn't survive translation from his DOS program to the next tool.

Of course, we don't know how well Nabla's magic formula did deal with the situation, but since it never raised eyebrows, one can assume it was reasonable.

It would be great if someone could look at the formula and provide some interpretation of what it's doing. It might be the right starting point for "the next version".

GaJ

I don't think there are many spectacular wins that aren't related to a rookie opponent, extremely good luck, or an opponent who gives up or plays half-heartedly.

A player who often takes big risks will lose more than he wins IMO. The Nabla System was designed to reward consistently good play. I don't think the big risk-taker can be consistent.

The really great thing about Nabla's work is that all this is tweakable. My recommended parameters de-emphasize big wins. Instead, the player who loses badly is punished to a certain degree for giving up or being extremely unskilled. Also, by keeping the full range of scores tighter, no single poor performance (or good one) can have a drastic effect on a player's final Nabla score for the tourney.

Having said that, I believe the right people won their sections in this last tourney. It's just the relative placement of others that was a bit hard for people to wrap their brains around.

For example, I did fairly well in three games, but my final Nabla score was atrocious, mainly due to a huge loss in Moltke. Therefore, I feel I did better than my final score suggests. No big deal. There's no way the curve could be tweaked to make me a winner. :)

Treeburst155 out.
