Will the Rumblings of War tournament (RoW) be returning?


Interesting Sivodsi, very interesting and food for some thought ....

It's time to make contact with Jarmo Hurri, known as Nabla here, who has a PhD from the University of Helsinki (thesis: statistical properties of natural image sequences and their implications on early vision). The reasons are obvious. Or Treeburst155 (Mike Meinecke) for that matter. None of their previous emails seem to be active. Anyone here have an idea where to reach these wallahs?


So noted, Flammenwerfer. All those reporting here and RoW vets will be on the main invite list; they will be first in line for a RoW VI invite. Make sure your Battlefront forum email is active and that it can be accessed/viewed for an official invite, a few months from now.


It'll be interesting to get their take on it. Rasch is a relatively recent development, and they might not know much about it. I don't think many statistics courses at university level include anything on it. As I said before, Rasch is diametrically opposite in its approach to classical statistics: where the statistician asks "how does our model best fit the data?", Rasch looks at the data and asks "how well does this data fit the model?"


I am convinced that you guys thoroughly over-engineer this scoring mechanism.

I mentioned the bridge scoring technique before, and I still think that it will give you exactly what you want, using just rank numbers for both sides. Bridge is a serious competition with millions of players worldwide, and tournaments often involve large amounts of prize money. They wouldn't put up with an unfair system. And it has exactly the same aspects as a CM tournament: unbalanced sides, and the same scenario (hand) played by many players.


Sounds interesting. I'm unfamiliar with bridge scoring, but a quick look at the Wikipedia page leads me to believe that it's not something I would pick up quickly!

How would you apply this to a 72-person tournament?

The basic problem as I see it, is that if you divide people into pools it is very difficult (impossible) to make sure that each group is 'fairly' composed. That is, even if you have 4 extremely skilled players out of 6 in the pool, only two will go on to the finals.

Yet, another pool may have only 1 who is skillful enough to beat all the other members, but nowhere near as good as the 4 in the other group. Yet this person would make it into the finals while the two in the other group who are actually better than him would not. You might be able to get around the problem by seeding somewhat, but the problem is that so many people's CM skills are unknown relative to each other.

So you see, in this kind of setup, it becomes more a matter of luck whether you make it into the next round or not. The proposed Rasch format at least beats the pool system in this respect.

So, can your Bridge scoring system overcome such challenges?


What I am talking about is not the direct score of a played hand (which is the equivalent of the outcome of a battle), but the score in a pair tournament of bridge.

Everyone plays a number of hands (battles).

Then you compare all results of North-South pairs (allied players) and order them according to score. The worst score gets 0 points, the next worst 2 points, the next 4 points, etc. Equal scores divide the points up (that is the reason the step is an even number: that way you can always divide without fractions).

In the same way you can compare all East-West pairs (axis players).

Highest total score, over all battles played, wins.

You can still fiddle with the system regarding who plays against whom on what side in what round.
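The ranking rule just described is straightforward to implement. Here is a minimal sketch in Python (the function name and tie-handling details are my own, not from any bridge software):

```python
def matchpoints(scores):
    """Rank-based pair scoring: worst score gets 0 points, next worst 2,
    next 4, and so on. Tied scores share (average) the points of the
    rank positions they jointly occupy."""
    ordered = sorted(scores)  # worst score first
    points = {}
    i = 0
    while i < len(ordered):
        # find the group of scores tied with ordered[i]
        j = i
        while j < len(ordered) and ordered[j] == ordered[i]:
            j += 1
        # positions i..j-1 are tied; they share the average of 2*i .. 2*(j-1)
        points[ordered[i]] = sum(2 * k for k in range(i, j)) / (j - i)
        i = j
    return [points[s] for s in scores]
```

For example, with four allied results of 55, 70, 55 and 40 percent, the two tied 55s split the 2 and 4 points for an average of 3 each: `matchpoints([55, 70, 55, 40])` gives `[3.0, 6.0, 3.0, 0.0]`. Because the step is 2, ties always split to a whole number, as the post notes.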


Thanks for the clear explanation.

You can still fiddle with the system regarding who plays against who on what side in what round.

Hmm yes, and how do you do the fiddling, eh? That brings you back to the same problem inherent in the pools system: Who plays who with what side?

The main problem is that this system doesn't take into account the skill involved in getting a certain score in a certain scenario. A person can show greater skill in losing an unbalanced scenario by a narrow margin than a person who pulls off a comfortable win when all the odds are in his favor. In this system it looks like all scenarios need to be of approximately equal difficulty. It also seems that you must keep axis players separate from allied players.

Still, it has the advantage of being an easy system to understand and put into action.


Hmm yes, and how do you do the fiddling, eh? That brings you back to the same problem inherent in the pools system: Who plays who with what side?

There are many solutions to that problem, but apart from rounds and pools there is the Swiss system, which couples a player to the next best that he hasn't played yet. Ties are decided by adding the scores of the previous opponents.

After a few rounds the best players bubble to the top and then fight it out amongst themselves.
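The Swiss pairing idea described above can be sketched as follows (a naive greedy version; real Swiss systems also handle odd fields, byes and side balance, and the player names and data layout here are invented for illustration):

```python
def swiss_pairings(scores, played):
    """scores: {player: points so far}; played: set of frozenset({a, b})
    pairs that have already met. Pair each player, best first, with the
    best-scoring opponent they have not played yet. Players with no fresh
    opponent left are simply skipped in this sketch."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    pairings, used = [], set()
    for p in ranked:
        if p in used:
            continue
        for q in ranked:
            if q != p and q not in used and frozenset({p, q}) not in played:
                pairings.append((p, q))
                used.update({p, q})
                break
    return pairings
```

For example, with scores `{"A": 3, "B": 2, "C": 2, "D": 0}` and A having already played B, the leader A is coupled to the next best available player C, leaving B to play D. After a few rounds this is exactly the "best bubble to the top" effect the post describes; the tie-break by summed opponents' scores would then be applied to the final ranking, not to the pairing.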

The main problem is that this system doesn't take into account the skill involved in getting a certain score in a certain scenario. A person can show greater skill in losing an unbalanced scenario by a narrow margin than a person who pulls off a comfortable win when all the odds are in his favor.

There is no scoring system that will discover this effect.

That is just the price for playing a game with random effects.

You can only hope that it averages out.

In this system it looks like all scenarios need to be of approximately equal difficulty. It also seems that you must keep axis players separate from allied players.

No, on the contrary, difficult or easy, the system just values them all the same.

And there is no need to play the same side each time, as each scenario results in a score for everyone who played it, a score that has the same weight.

Still, it has the advantage of being an easy system to understand and put into action.

And it is used every day by thousands of players. Millions each weekend.


I am convinced that you guys thoroughly over-engineer this scoring mechanism.
Fear not, nothing will be changed without Nabla's input. What is bandied about are some ideas, and things will become clearer once we have the working Nabla executables and specifically the Nabla Scoring Curve parameters.

It may be, after seeing all the results coming in for a particular scenario/battle, that the Nabla curve will be adjusted for that specific scenario, given that the battle produced great variances in results.

The Nabla curve could then, for a specific battle, be adjusted so as not to punish extreme losses too harshly (a flattened curve) once a certain threshold has been reached, and vice versa. This is in contrast to having a single Nabla scoring curve for all the scenarios in the tourney.

If we can get Nabla's take on this idea mathematically -- which he himself suggested -- we might have an adjusted Nabla scoring method that will cover most eventualities, given the relative robustness of the scoring method already in use.

I know too little to allow myself a considered opinion on this exact matter without referring to the experts in stats/maths to test our assertions/premises.


There is no scoring system that will discover this effect.

Well, this is what Rasch statistics does. You enter the various facets of the data; in CMBN's case that would be: scenario, side, player score. You centre the facets that you are not interested in on 0. This lets the facet that you are interested in, in this case 'player', float free.

In effect you are asking, "Given the side and the scenario being played, what is this player's ability?"

Thus, all players' abilities are put on the same scale, irrespective of which side is being played or the difficulty of the scenario.

Yes, the math is complex, and not being a mathematician, I can't explain how it works in more than general terms. I am merely a user of the software, 'Facets', and I use it as part of my job and the research I do for my PhD, which is in the field of language assessment.

Fortunately, all the math goes on under the hood of the software. What I can do, and have done, is encode the data, run it in the software and interpret the results.
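For the curious, the core of the model being described is a simple logistic function: in the many-facet form, the probability of a good result depends on player ability minus scenario difficulty minus a side effect, all on the same logit scale. This is only an illustrative sketch of the model's shape, not of what the Facets software actually computes when it estimates the facets from data:

```python
import math

def rasch_p(ability, scenario_difficulty, side_effect):
    """Many-facet Rasch sketch: probability of a 'success' given the
    facets, all measured in logits on a common scale."""
    logit = ability - scenario_difficulty - side_effect
    return 1.0 / (1.0 + math.exp(-logit))
```

A player whose ability exactly matches the combined scenario and side difficulty has a 50% chance of success; centring the scenario and side facets on 0, as described above, is what leaves ability free to float on that common scale.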

And it is used every day by thousands of players. Millions each weekend.

Yes, it is a card game, where each 'scenario' is based on a selection of cards that should be random. I'm not sure whether the human-made scenarios in CMBN could be considered exactly analogous to that.


Yes, what I said, over-engineered :-).


So you think it can be done because you don't understand the mathematics?

That is also called mumbo-jumbo.

I say it can't be done: a special score can be the result of tactical genius or a fluke of luck, and no algorithm can discover that.


I think you are taking an extremely simple game - bridge - and saying that its scoring system is suitable. There are only 52 cards and after the bidding only 26 are unknown so the play given identical hands between many pairs is likely to be very similar. Subject of course to any silliness in the bidding.

In any decent RoW scenario there is the complete unknown from a possibly inadequate briefing, multiple units, different terrain features, and different results possible each time two tanks meet. A trump Ace always beats a King; not so in the CM world.

Therefore the bridge scoring system may be adequate for bridge, as the variables are actually quite limited and the range of results likewise. And I suspect that a much larger number of hands are played: typically 28 or more in a session. In serious bridge play they may well play over three days and at least 64 hands. So over that many games a point difference may well emerge. In five rounds in RoW I doubt that one would feel particularly secure that the results were fair.

Anyway given the simpler game and the larger number of events to let people show how good they are how does duplicate bridge resolve it:

Matchpoint scoring

The most common form of pairs scoring is by matchpoints. (See Bridge scoring for the scoring method of individual deals.) On each board, a partnership scores two matchpoints for each other partnership that scored fewer points with the same cards, and one point for each other partnership that scored the same number of points. Thus, every board is weighted equally, with the best result earning 100 percent of the matchpoints available, and the worst earning no matchpoints; the opponents receive the complement score, e.g. an 80% score for a N-S pair implies a 20% score for their E-W opponents. Colloquially, a maximum matchpoints score on a board is known as a "top", and a zero score is a "bottom". The terms "high board" and "low board" are also used.

Note 1: in the United States, scoring is one point for each pair beaten, and ½ point for each pair tied.

Note 2: The rule of 2 matchpoints for each pair beaten is easy to apply in practice: if the board is played n times, the top result achieves 2*n-2 matchpoints, the next 2*n-4, down to zero. When there are several identical results, they receive the average. However, complications occur if not every board is played the same number of times, or when an "adjusted" (director-awarded) score occurs. These cases can result in non-integer matchpoint scores - see Neuberg formula.

These matchpoints are added across all the hands that a pair plays to determine the winner. Scores are usually given as percentages of a theoretical maximum: 100% would mean that the partnership achieved the best score on every single hand. In practice, a result of 60% or 65% is likely to win the tournament or come close. In a Mitchell movement the overall scores are usually compared separately for North-South pairs and for East-West pairs, so that there is one winner in each group (unless arrow-switching has been applied).

Historical Note: At some time in the past, both North-South and East-West might be awarded the same matchpoint score. Using this arrangement, the lowest East-West Score would be the winner.

In Board-a-match team game, the matchpoints are calculated using a similar principle. Since there are only two teams involved, the only possible results are 2 (won), 1 (tied) and 0 (lost) points per board.

IMP scoring

In IMP (International Match Points) scoring, every individual score is subtracted from another score, and the difference is converted to IMPs, using the standard IMP table below.[3] The purpose of the IMP table, which has sublinear dependency on differences, is to reduce results occurring from huge score differences ("swings").

The score that is being compared against can be obtained in the following ways:

  • In team events, it is the score from the other table;
  • In pair events, it can be:
    • The datum score, most often calculated as the average score on board, excluding a number of top and bottom results. Sometimes, the median score is used instead.
    • In "cross-imps" or "Calcutta" scoring, every score on board is compared against every other score (sometimes excluding top and bottom results) and IMPs summed up (and possibly averaged, to reduce "inflation").
    • Example of averaged cross-IMP scoring: Assume that you are one of five pairs who play Board 2 as N/S (you are vulnerable). On this board you bid and make 4 Spades, scoring +620, while the other four N/S pairs score -100, -100, -300, and +650. To figure out your averaged cross-IMP score, you would create a table like the Cross-IMP example table below: after writing in the scores of you and your opponents, you would then fill in the Score Delta row by subtracting each of the opponents' scores from yours. Then, you would fill in the IMPs Gained row by converting each of the score deltas into IMPs via the standard IMP table (e.g. 720 equates to 12 IMPs, because it falls in the range of 600 to 740 in the IMP table). You then add up those numbers and get 37 IMPs. To turn the 37 IMPs into an averaged cross-IMP score, you simply divide that number by the number of competitors (37 IMPs divided by 4 competitors) to arrive at 9.25 as your averaged cross-IMP score.

Cross-IMP example table

Our score on Board 2: 620
Scores of the four other North/South pairs: -100, -100, -300, 650
Score deltas: 720, 720, 920, -30
IMPs gained: 12, 12, 14, -1

IMP table

Point difference / IMPs:

0-10: 0          370-420: 9       1750-1990: 18
20-40: 1         430-490: 10      2000-2240: 19
50-80: 2         500-590: 11      2250-2490: 20
90-120: 3        600-740: 12      2500-2990: 21
130-160: 4       750-890: 13      3000-3490: 22
170-210: 5       900-1090: 14     3500-3990: 23
220-260: 6       1100-1290: 15    4000 or more: 24
270-310: 7       1300-1490: 16
320-360: 8       1500-1740: 17
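The IMP table is just a lookup on the absolute point difference. As a sketch (the threshold values come from the quoted table; the function and variable names are my own), the cross-IMP example above can be reproduced like this:

```python
# Thresholds from the standard IMP table: (minimum point difference, IMPs)
IMP_STEPS = [(0, 0), (20, 1), (50, 2), (90, 3), (130, 4), (170, 5),
             (220, 6), (270, 7), (320, 8), (370, 9), (430, 10), (500, 11),
             (600, 12), (750, 13), (900, 14), (1100, 15), (1300, 16),
             (1500, 17), (1750, 18), (2000, 19), (2250, 20), (2500, 21),
             (3000, 22), (3500, 23), (4000, 24)]

def imps(diff):
    """Convert a signed point difference to signed IMPs via the table."""
    imp = max(i for lo, i in IMP_STEPS if abs(diff) >= lo)
    return imp if diff >= 0 else -imp

# Cross-IMP example from the text: our +620 against -100, -100, -300, +650
others = [-100, -100, -300, 650]
gained = [imps(620 - s) for s in others]
average = sum(gained) / len(others)
```

Running this gives `gained = [12, 12, 14, -1]` and `average = 9.25`, matching the worked example in the quoted article.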

Please see the Wiki piece for how this is laid out - I cannot spend the time doing it here : )

http://en.wikipedia.org/wiki/Duplicate_bridge

Scoring and tactics

The type of scoring significantly affects a pair's (team's) tactics. For example, at matchpoints, making one more overtrick than everybody else on a board gives the same result (the top) as making a slam that nobody else bid, whereas at IMP scoring, the difference comes down to 1 IMP (30 points) in the first case, but 11 or 13 IMPs (500 or 750 points) in the second case. In general, matchpoint scoring requires a more "vivid" and risk-taking approach, while IMP scoring requires a more cautious approach (sometimes referred to as "cowardly" by those who dislike it). The main features of the tactics are:

  • Matchpoints
    • Overtricks are important
    • Safety play is often neglected in the hunt for overtricks
    • Thin games and slams are avoided
    • Sacrifices are more frequent; e.g. going down 500 points on a doubled contract is a good result if the opponents can score 620 points for a game.
    • Doubles are more frequent, as they increase the score for the penalty. For example, "the magic 200" refers to the situation when a pair beats the vulnerable opponents one trick doubled — the obtained score of 200 will likely outscore all partial contracts played on other tables.
    • Playing in higher-scoring denominations (notrump or major suits) is important, as it may lead to an extra 10 or 20 points.
    • Due to the above, it is often unclear to the defence, and even to the offense, what their goals are.[4] Thus mastering matchpoint play requires additional skills (sometimes referred to as "not bridge" by those who dislike it) beyond those required for playing IMPs.

  • IMPs

    • Overtricks are not important, as it's not worth the risk of losing e.g. game bonus (300-500 points = 8-11 IMPs) for a potential 1-IMP gain for an overtrick
    • Safety play is very important, for the same reason
    • Thin games and slams are often bid. Bidding a game with 40 percent probability of success vulnerable and 45 percent nonvulnerable, or a small slam with 50 percent probability, is worth the risk, and anything over that increases the probability of a positive IMP score in the long run.
    • Sacrifices are less frequent, as they may be risky.
    • Doubles are less frequent, as they may be risky. Often, when an opponents' contract is doubled, it turns declarer's attention to the bad lie of cards, and may induce him to take a successful line of play that he wouldn't take otherwise.
    • The contract itself sets a clear goal for both the defence and declarer, frequently allowing a deeper level of counter-plays between them.

Somehow I thought it would be simpler : ). Funnily enough I recognise some of the Nabla thinking in the article. Perhaps Nabla is actually a bridge-derived but tweaked system!


If you say bridge is a simple game then you don't know what you are talking about.

There are 52! / (13!)^4 different possible deals, roughly 5 x 10^28 (more than 2^95), which is infinite for all practical purposes.
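That count is easy to verify with a one-liner (a quick sanity check, not from the original post):

```python
import math

# Number of distinct bridge deals: 52 cards split into four 13-card hands
deals = math.factorial(52) // math.factorial(13) ** 4
# deals is about 5.36e28, which lies between 2**95 and 2**96
```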

Anyway, then you continue to quote the match score for bridge, which is not what I am talking about. Match score is used in a knock out system where teams (of two pairs) play a match of many hands against each other. Again you show that you do not know what you are talking about.

I was talking about pair score, WHICH I MENTIONED EXPLICITLY, which is suitable for a tournament where many pairs play the same hand, like in the ROW tournament.

Whatever the conditions of a game, if it is a hand of cards, or a CM scenario, for scoring purposes you want to know if somebody did better than someone else. That is what pair score gives you.


Erik. I am surprised at your answer as you do not appear to have comprehended what I have written.

But firstly, the possible permutations in a pack are completely irrelevant to the position where 4 people sit down and play pre-arranged hands. They only have 13 cards each to play and there is no getting away from that position. And given that they have to follow suit if they can, and apply actual card-playing sense, there is an even smaller chance of variation.

Now to get to this "pair score". I read and comprehended what you said.

Everyone plays a number of hands (battles)

Then you compare all results of North-South pairs (allied players) and order them according to score. Worst score gets 0 points, next worst 2 points, next 4 points etc. Equal scores divide up (that is the reason the base score is an even number, that way you can always divide without fractions).

In the same way you can compare all East-West pairs (axis players). Highest total score - over all battles played - wins. You can still fiddle with the system regarding who plays against who on what side in what round.

However you did not extend your example to the required conclusion. Seventy-two people play, so there are 36 scores to rank on each side; assuming they all recorded different scores, the chief Axis and chief Allied player could each receive 70 points. Now the remaining games may be quite drawish, or with small variation, so that all the games bunch up.

Now you state that all players who draw share the points. I am not clear what this means when you have 36 players: if the top five have 70, 68, 66, 64 and 62, and the next ten all have an identical percentage score, how does that work out? And what happens to those beneath them?

I have two theories how it might work, but rather than work them out perhaps you would just tell us.

Obviously if you play a rabbit and score the full 70 points then, with only say 5 rounds, that score should immediately mean you are favourite to go through to any next session of the best. The average score is 35, so you should end on 4*35 + 70 = 210.

If B is a better player than A but fights better enemy players he may well score averagely throughout the five rounds, giving 175. Allowing the fortunate player twice as many points as the average player seems far too extreme; however I am waiting to see how you suggest tied scores are dealt with.

Also bear in mind that outliers more often show a mismatch of player skill than anything else. The second most common cause is a scenario with a make-or-break situation; it may be as simple as a tank duel, which will tend to swing the scenario heavily one way or the other. And of course there is always the chance of an outright freak result - where someone discovered tanks could fall off a bridge onto a road below and then drive around the enemy back line. : )


So you think it can be done because you don't understand the mathematics?

That is also called mumbo-jumbo.

I say it can't be done: a special score can be the result of tactical genius or a fluke of luck, and no algorithm can discover that.

Actually you can, and it forms one of the principles that underlie Rasch.

Rasch works from probability, working from expected and unexpected results. If you get a string of good results from someone, then it is reasonable to assume that this is because the person has a high degree of skill. However, if one person gets a string of low results, and then one high one, then is it not reasonable to assume that this was a 'lucky' result?

The person who gets the lucky result is still credited with it, but Rasch will signal it by calling it a 'misfitting' result, because it is outside expectation.

As I said before, I am no mathematician, but I do understand how this extremely useful method of statistical analysis works.
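A toy illustration of how such flagging works (my own simplification; the Facets software reports more elaborate infit/outfit statistics): for a single win/loss outcome, a standardized residual compares the observed result with the probability the model expected, and a large value marks a surprising, i.e. 'misfitting', result.

```python
import math

def standardized_residual(observed, p_expected):
    """observed: 0 or 1; p_expected: the model's probability of a 1.
    A large |z| (say, above 2) flags a surprising, 'misfitting' result."""
    variance = p_expected * (1 - p_expected)
    return (observed - p_expected) / math.sqrt(variance)
```

A player the model rates at only a 10% chance of winning who suddenly wins gets z = (1 - 0.1) / sqrt(0.1 * 0.9) = 3.0: the win is still credited, but flagged as outside expectation, exactly as described above.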


I have been reading up on Wiki and elsewhere on Rasch. However there are some quite complicated (to me) exceptions. Judging by the literature and caveats it does look like one could use a week thinking about it.

This seems to link to a freebie

http://onbiostatistics.blogspot.com/2010/01/rasch-analysis.html

to here

http://www.estat.us/id111.html


For the RoW data you should really use this free software: Minifac

rather than Winsteps, because the RoW data has more than two levels of data (scenario, player, side).

Edit to add: besides, if you do it in Minifac first, it has a function that automatically produces a Winsteps control file.

The free version has all the features of the full version (which I have) but is limited to 2000 data points, and I'm pretty sure that this is sufficient to run RoW.

If you're interested I'm happy to send you the control file I did for the last RoW.

