Maastrictian

Members
  • Posts

    374

Posts posted by Maastrictian

  1. I'm definitely enjoying the game as well, especially after being away from the CM series for years.

    I may need to be stripped of my grog credentials, but I am also really enjoying real time play. If anything, it adds a sense of realism to me. I never liked the perfectly coordinated movements that I could achieve in WeGo. It's also refreshingly hard! I ignore a squad for a minute while I'm directing something else and find that they have been decimated.

    What I don't enjoy is the pathing issues, which are really bad. Even micromanaging movement doesn't seem to prevent Strykers taking joyrides down IED-filled alleys when they could have just taken the straight main road like I asked them to. This is the big "must resolve" issue for me, and if it's not corrected this will be the last CM game I purchase :(

    But I'd rather have the game now than be waiting on bug fixes. Now to figure out the (very complex) scenario editor...

    --Chris

  2. A little off topic for CMAK, but I'm looking for information about the 21st Panzer's attack between Juno and Sword beaches on D-Day.

    Any info the grogs could spew would be much appreciated. Even better would be pointers to some books on the subject.

    I'm currently working from Six Armies in Normandy (John Keegan) which mentions the attack, and its lack of success, but doesn't really say much else.

    Thanks much!

    --Chris

  3. Michael Emrys --

    I *think* the AC's wheel is on the ground, but it's a few meters ahead of that crater, which makes things look odd.

    Monty --

    The Maastrictian epoch in geology is named after a rock formation near that town. It's the last epoch of the Cretaceous, the last time period in which the dinosaurs lived. So my nickname comes from my interest in the extinction of the dinosaurs, and, kind of, from that town.

    --Chris

  4. Here are some photos I "found" in the attic. Funny... I didn't know I had relatives in the USSR in the 40s....

    OfficerandRiver.jpg

    A Soviet officer oversees an opposed river crossing in the summer of '44.

    IS2.jpg

    An IS2 in heavy action in the Courland Peninsula late in the war.

    T34andAC.jpg

    A T34 drives past a knocked out armored car outside Kharkov in '43.

    Incidentally -- Some Photoshop filters I've found useful for... um... "improving" these attic photos include motion blur (3 or 4 pixels) and add noise (3%).

    It also helps to erase little bits out of the edge of the picture after you place it in the frame. You can also rotate the entire image before you crop it to put it in the frame; photographers under fire have a hard time holding the camera perfectly level.

    Finally, I tend to think that photos from camera level 1 look best as that is the highest a real photographer could have gotten.

    --Chris

    [ August 16, 2003, 10:11 PM: Message edited by: Maastrictian ]

  5. Here are Treeburst155's latest numbers (the 600 trial)

    Observed AI hits: 228

    Observed Human hits: 196

    Expected: 212

    Chi Square: 2.4150943

    P: 0.1202
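
    As a minimal sketch of how figures like these can be reproduced, the snippet below applies the hits-only chi-square used in this thread: the expected count is the mean of the two observed hit counts, and the P value is the upper tail of a chi-square distribution with one degree of freedom. The function names are illustrative and not from the thread; the one-degree-of-freedom tail is computed with the standard identity P = erfc(sqrt(x / 2)).

```python
import math


def chi_square_equal_split(observed_a, observed_b):
    """Chi-square for two counts that are expected to be equal."""
    expected = (observed_a + observed_b) / 2
    return ((observed_a - expected) ** 2 + (observed_b - expected) ** 2) / expected


def p_value_1df(chi_square):
    """Upper-tail probability of a chi-square value with one degree of freedom."""
    return math.erfc(math.sqrt(chi_square / 2))


# Observed first-round hits from the 600-trial numbers above.
chi_sq = chi_square_equal_split(228, 196)
p = p_value_1df(chi_sq)
print(f"Chi Square: {chi_sq:.7f}  P: {p:.4f}")
```

    Note that this calculation looks only at hits. It does not use the number of misses, so it is not the same thing as a full two by two table test.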

    So still not significant :) . But getting closer. I'd be very curious to see your scenario, Treeburst155, so that maybe we can see what difference there is, if any, between yours and Cameroon's. After 300 trials Cameroon's seems to show no bias or hint of bias, but with 600 trials I have to defer to your work at this point.

    Regarding Steve's post: Yeah, as has been said, no one thinks that BTS is lying to us; if there is a difference it is due to a bug, nothing more. On the other hand, if at the end of the day we find no discrepancy then that will certainly silence BTS's detractors. Think the AI cheats? Well, look at this thread where more than 1500 tests were run (and counting).

    --Chris

  6. Ok, more data.

    This is using Cameroon's scenario he posted a while back (page 3 or 4??). This is testing first hit when the German side is controlled by the AI or a human. No FOW is used, and only the Germans have any ammo, and it's all AP or HE.

    Run   AI hits   Human hits
      1      5          7
      2      4          5
      3      7          6
      4      5          8
      5      4          5
      6      7          5
      7      4          7
      8      5          4
      9      2          2
     10      5          4
     11      8          5
     12      3          6
     13      6          8
     14      6          6
     15      5          6
     16      6          3
     17      5          5
     18      4          7
     19      3          8
     20      4          5

    So that's 200 trials on each side. Total results are 98 hits for the AI and 112 hits for the Human. Combining this with Cameroon's data with 100 trials on each side gives us:

    Expected = 157

    Observed AI hits = 153

    Observed Human hits = 161

    Chi Square = 0.2038

    P = 0.6516
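
    The pooled figures above can be re-derived from the per-run counts. The sketch below is only an illustration: it sums the 20 runs listed in this post, adds Cameroon's totals, and repeats the same hits-only chi-square. Cameroon's 55 AI hits and 49 human hits are not stated in this post; they are the values implied by the combined totals (153 - 98 and 161 - 112).

```python
import math

# Per-run first-hit counts from this post, in posting order.
ai_hits = [5, 4, 7, 5, 4, 7, 4, 5, 2, 5, 8, 3, 6, 6, 5, 6, 5, 4, 3, 4]
human_hits = [7, 5, 6, 8, 5, 5, 7, 4, 2, 4, 5, 6, 8, 6, 6, 3, 5, 7, 8, 5]

# Cameroon's totals, inferred from the combined figures (an assumption,
# not a number quoted in this post).
cameroon_ai, cameroon_human = 55, 49

total_ai = sum(ai_hits) + cameroon_ai
total_human = sum(human_hits) + cameroon_human

# Expected count under "no difference" is the mean of the two totals.
expected = (total_ai + total_human) / 2
chi_sq = ((total_ai - expected) ** 2 + (total_human - expected) ** 2) / expected

# Upper tail of the chi-square distribution with one degree of freedom.
p = math.erfc(math.sqrt(chi_sq / 2))

print(f"AI {total_ai}, Human {total_human}, Expected {expected}")
print(f"Chi Square = {chi_sq:.4f}, P = {p:.4f}")
```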

    So this shows no significant results. More importantly this shows the Human edging out the AI, which is something I don't believe we've seen before, and sets my mind at ease to some extent.

    Later today or tonight I'll combine the data Warren got with my scenario with the data I got with it before and see what the combined results give us.

    --Chris

  7. Originally posted by Treeburst155:

    Playing the firing Germans in my test above 200 times, I scored 73 first round hits.

    Letting the AI play the firing Germans there were 82 first round hits out of 200 tests.

    Very interesting.

    Rather than simply declare that this is or is not significant (sorry Sgt. Emren, but rather than simply making a categorical statement it's best to use statistical analysis) I will take a look with chi square.

    Expected = 77.5

    Observed 1 = 82

    Observed 2 = 73

    Chi Square = ((82 - 77.5)^2/77.5) + ((73 - 77.5)^2/77.5) = .522

    P (one degree of freedom) = .470

    So this is not significant.

    Incidentally, if you do about 1500 of these tests per side and see the same ratio then P will be just under .05, and therefore significant.
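
    To see how the number of trials affects significance, here is a rough sketch that holds the observed hit rates fixed (82/200 for the AI, 73/200 for the human) and scales the number of trials per side under the same hits-only chi-square. The function names and the simple linear search are illustrative choices, not anything from the thread.

```python
import math


def p_value_1df(chi_square):
    """Upper-tail probability of a chi-square value with one degree of freedom."""
    return math.erfc(math.sqrt(chi_square / 2))


def chi_square_at(trials_per_side, rate_a=82 / 200, rate_b=73 / 200):
    """Hits-only chi-square if the same hit rates held over more trials."""
    hits_a = trials_per_side * rate_a
    hits_b = trials_per_side * rate_b
    expected = (hits_a + hits_b) / 2
    return ((hits_a - expected) ** 2 + (hits_b - expected) ** 2) / expected


def trials_needed(alpha=0.05):
    """Smallest number of trials per side at which P drops below alpha."""
    n = 200
    while p_value_1df(chi_square_at(n)) >= alpha:
        n += 1
    return n


for n in (200, 1000, trials_needed()):
    chi_sq = chi_square_at(n)
    print(f"{n} trials per side: Chi Square = {chi_sq:.3f}, P = {p_value_1df(chi_sq):.3f}")
```

    Because the statistic grows in direct proportion to the number of trials when the ratio stays the same, a small but real difference will eventually show up as significant, while a difference that is just noise should shrink as more trials are added.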

    The weird thing about all these tests is that we always see some small advantage to the AI. I have no good explanation for this, nor have we definitively seen this to a significant P value, but that is why I'm still conducting tests and am interested in seeing others' tests. It's just so odd.

    --Chris

  8. It seems that no one is taking up my suggestion of everyone testing on the same scenario. That's fine, I forgive you all ;)

    But, I will request the following. When you post data please post it in as detailed a fashion as possible and give us the following:

    1) Results of each trial

    2) FOW setting used

    3) Targeting orders including any cover arcs

    4) Details of how you came to any P values or Chi Square values, if you provide them.

    5) If the tests are independent (separate firing lanes, no movement)

    6) If you eliminated special rounds (non HE / AP)

    7) What you think you are testing :)

    8) What vehicles are being tested.

    9) A link to the scenario.

    Wow, I'm sure I've forgotten something.

    I'm happy to provide hosting space for scenarios if people don't have them.

    [Edit: Cameroon provided a link and sent me his scenario. I just completely missed it till now, sorry! Here is another link to his scenario, and another link to mine for anyone who wishes to replicate either test.

    Maastrictian

    Cameroon

    ]

    I never mentioned what FOW I was using in my test above, it was extreme.

    I strongly agree that there could be a random number problem. Any problem that there may be could crop up in any number of situations, so testing first shot percentages, time to fire percentages, and kill percentages are all valuable. Just keep in mind that because someone has ruled out one of those (and I don't currently think we have enough data to rule out any of those yet) does not mean the others may not be broken.

    Ok, I'm going to bed. I will get back to this when I can, hopefully tomorrow night.

    Thanks to all who are doing tests.

    --Chris

    [ October 31, 2002, 12:06 AM: Message edited by: Maastrictian ]

  9. Originally posted by Lt. Kije:

    2. Maastrician offers data from his test bed, but there is not widespread agreement on whether it supports Theory A (the AI does nothing we cannot, nor does it do anything we do with superhuman effectiveness) or Theory B (the AI is doing something different; it has some advantage over humans). It seems a much clearer analytic framework will be needed to make sense of data coming from the test bed. Perhaps I need to set up a simple chi-square two by two table for people to place their data into? It won't help if 100 people submit 100 idiosyncratic data reports.

    I'm happy to do the math if people will just post their data.

    -- Lt. Kije

    Scorekeeper and Historian

    :) Thanks for summing up things so far. That really clarifies the thread.

    Pascal DI FOLCO

    Maastrician you did a very good job . Can't you rerun the H vs AI test with targeting orders issued ?

    I'd love to but I don't have time right now. I will tomorrow.

    Cameroon

    I used identical units and units that couldn't fire back to test the first shot hit percentage. That was the original remark that WP made, that there was an "enhancement" for the first shot for the AI. I believe that my results and tests indicate that this is not true.

    Your tests look very good. I agree with your conclusions after doing the chi square, at least for your 100 trials. Don't forget that 100 trials is too low, as even Madmatt says.

    Chi Square works out as:

    Chi Square = ((55-52)^2/52) + ((49-52)^2/52) = .346

    Which gives a P (for one degree of freedom) = .556

    Which looks really good and suggests there is no difference between AI and Human performance.
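
    For comparison, the two by two table that Lt. Kije mentions can be sketched as follows. This is not the calculation used above: it counts misses as well as hits, and uses the standard Pearson formula for a 2x2 table without a continuity correction. It assumes 100 trials per side for Cameroon's data, as discussed in this thread; the function name is illustrative.

```python
import math


def chi_square_2x2(hits_a, misses_a, hits_b, misses_b):
    """Pearson chi-square for a 2x2 table, without continuity correction."""
    n = hits_a + misses_a + hits_b + misses_b
    cross = hits_a * misses_b - misses_a * hits_b
    denominator = (
        (hits_a + misses_a)
        * (hits_b + misses_b)
        * (hits_a + hits_b)
        * (misses_a + misses_b)
    )
    return n * cross ** 2 / denominator


trials = 100  # assumed trials per side
ai_hits, human_hits = 55, 49

chi_sq = chi_square_2x2(ai_hits, trials - ai_hits, human_hits, trials - human_hits)
# Upper tail of the chi-square distribution with one degree of freedom.
p = math.erfc(math.sqrt(chi_sq / 2))
print(f"Chi Square = {chi_sq:.3f}, P = {p:.3f}")
```

    The two approaches give different numbers for the same data, so results should only be pooled or compared when everyone states which calculation they used.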

    Can you e-mail me (dinosaur@noct.net) your scenario or simply put it online so others can exhaustively test it?

    I also want to point out that even if there is no difference between the to-hit chance of a human controlled tank and an AI controlled one, there still may be a difference in target acquisition speed or some other factor. This is a very good test, but I also think "to the death" tests should be done as well, as there are other places the AI could have an advantage.

    --Chris

  10. Originally posted by Cameroon:

    Ok, I see what you're saying, but I still believe that using the same vehicle will result in more 'sound' results. Controlling as many variables as possible :)

    That works for me. It will probably quiet some of the detractors who want one more thing to pick at too :) . If you have time to make the scenario I'll run it until the cows come home, and I hope others will too.

    I'd encourage you to:

    1) Use as many tanks as possible. On the order of 50.

    2) Make sure none of them have smoke or T or any other "special" ammo.

    3) Put all of them in rough ground so they don't move.

    4) Remove any flags so those aren't influencing the battle somehow

    5) Use T-34s as they don't have smoke dischargers.

    6) Add a bunch of units away from the battle so global morale plays a minimal role. Pillboxes are good as they have very few polygons.

    If you need hosting space, or a mirror I can put it up on my site. E-mail it to me at dinosaur@noct.net.

    --Chris

  11. Originally posted by Cameroon:

    One flaw that I see in the majority of these test scenarios (I commented on it above, but in an edit so it may be missed), is that the German vehicles have better optics.

    That's why captured vehicles should be used. :)

    It actually will not matter because we are looking at how the AI and a human perform in the exact same situation. We could look at T-26s vs. King Tigers and still get comparable results.

    (ok, ok, we couldn't. Really both tanks have to be able to kill each other to get meaningful results.)

    --Chris

  12. I, also, am quite studiously not claiming anything :) In fact, based on the 176 tests I've run I'd say any difference is pretty small, if present at all.

    Warren -- try right clicking on the link I gave and choosing "save as". That works for me at least.

    Cameroon -- You are welcome to do whatever tests you want and post the results, but I think we will all get farther if we all concentrate on using the same testing scenario, varying only by the targeting orders we give. That way we can combine all our results and see what we get, rather than having 20 different tests testing different things, none of which is significant by itself.

    I strongly agree with Madmatt's comment that we need more than 100 tests to get anywhere, preferably, in my mind, more than 1000. I can perform that many, but it will take me a week. It will be much quicker if everyone simply runs my scenario once or twice.

    (note, I'm not at all hung up about whose scenario gets run. It's just that I'm the only one who's posted a scenario for others to run at this point.)

    --Chris

  13. Originally posted by Cameroon:

    That is why, when we give a manual targeting order, the discrepancy goes away. Or when we do a hotseat game, since we can leave it all up to the TacAI.

    It should be noted that Warren's initial numbers that show a discrepancy are *with* a targeting order for the human player.

    I encourage you to download my scenario, run some tests with it using targeting orders and see what you get. I've spent an hour today on this issue, I don't have any more time to devote. But if you think we've forgotten something when doing our tests then do them yourself and tell us what you get!

    --Chris

  14. Originally posted by Warren Peace:

    Maastrician:

    Your results seem consistent with my initial observations. But you have extended them with the all important hot seat experiment.

    Seems to support an AI advantage.

    Warren

    Actually, my results do not support yours, as I find no significant difference when using the average of the AI vs. human battles as the expected result. The only significant finding I made was looking at the AI's advantage over a human vs. human battle.

    I'm very interested to see others test using the same battle and posting their results. If everyone who has posted to this thread runs the battle once as the Allies, once as the Axis and once as hotseat we will have destroyed more tanks than the Germans had at Kursk :) And with a few thousand trials we will certainly be able to push the p values into the thousandths place... if that is where they want to go.

    I also want to make clear to the general board here what I (at least) mean by "AI cheating". There is no reason to assume, nor am I assuming, that BTS (BFC, whatever) has made any conscious decisions about the performance of the AI vs. the performance of a human player. The differences that are being seen (or are not being seen in some cases) could easily be the result of a minor programming error. I pursue this not because I want to show that BTS is a liar or something silly like that but because I want to improve the game.

    --Chris

  15. I've collected some data as well. I set up 44 firing lanes each 20 meters wide. Lanes were separated by tall pines so each test was independent. At the end of each I put a T-34/85 model 1944 and a Stug IIIG (middle). The tanks were 752 meters apart and were in rough, so they were immobilized. Further, each tank had only AP and HE rounds; I manually removed smoke and T rounds. Finally, I threw in a bunch of pillboxes for both sides such that nobody could see the pillboxes; this may reduce the effects of Global Morale. I issued no orders to my troops. I've put the scenario up online here. I encourage everyone to take a look and to do your own tests with it and post the results here so we have as much data as possible.

    I ran 2 tests in each of the following situations:

    Human = Allies. Axis lose 11, Allies lose 33 and Axis lose 12, Allies lose 35

    Human = Axis. Axis lose 18, Allies lose 28 and Axis lose 17, Allies lose 30.

    (that's a total of 88 duels playing each side, 176 trials total)

    I also ran one test in each of the following situations:

    Human = both (hotseat). Allies process turns. Axis lose 14, Allies lose 33

    Human = both (hotseat). Axis process turns. Axis lose 10, Allies lose 35

    So, I'll calculate Chi Square in two different ways.

    First I'll calculate the Chi Square using the average of all Human vs. AI trials as the Expected value. So:

    Allied Expected losses = 63

    Axis Expected losses = 29

    When AI is playing Axis, observed Axis losses = 23

    When AI is playing Axis, observed Allied losses = 68

    When AI is playing Allies, observed Axis losses = 35

    When AI is playing Allies, observed Allied losses = 58

    So Chi Square of AI's performance difference from the Expected performance = 1.638

    And Chi Square of Human's performance difference from the Expected performance = 1.638

    So p = .201 in both cases. In other words *not* significant.

    Now I will make similar calculations, but for the expected value I will use the average of both trials that were played hotseat. This should, theoretically, eliminate all variables except the AI and Human issue. I'm playing the exact same scenario with the exact same orders (none) so the only variable is AI control of troops.

    Allied Expected losses = 68

    Axis Expected losses = 24

    When AI is playing Axis, observed Axis losses = 23

    When AI is playing Axis, observed Allied losses = 68

    When AI is playing Allies, observed Axis losses = 35

    When AI is playing Allies, observed Allied losses = 58

    So Chi Square of AI's performance difference from the Expected performance = 5.041

    And Chi Square of Human's performance difference from the Expected performance = 1.512

    So p = .219 in the Human's case. In other words, a human vs. the AI seems to take the same losses as a human vs. another human. But p = .025 in the AI's case. In other words, the AI seems to have the advantage over a human, as opposed to a human vs. human situation.
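
    The two sets of figures above can be reproduced with the sketch below. It rests on one reading of the post that happens to match the posted values: "performance" is taken to mean the losses a controller inflicts on its opponent, so the AI's counts are the Allied losses when the AI plays Axis and the Axis losses when the AI plays Allies. The variable names are illustrative, and one degree of freedom is used only because that is what the posted P values correspond to.

```python
import math


def chi_square(observed, expected):
    """Sum of (observed - expected)^2 / expected over matching categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def p_value_1df(chi_sq):
    """Upper-tail probability of a chi-square value with one degree of freedom."""
    return math.erfc(math.sqrt(chi_sq / 2))


# Losses inflicted, ordered as (Allied losses, Axis losses).
# Assumed reading: a controller's "performance" is the losses it inflicts.
inflicted = {
    "AI": (68, 35),     # AI as Axis kills 68 Allies; AI as Allies kills 35 Axis
    "Human": (58, 23),  # Human as Axis kills 58 Allies; Human as Allies kills 23 Axis
}

# Expected (Allied losses, Axis losses) under each baseline from the post.
baselines = {
    "Human vs. AI average": (63, 29),
    "Hotseat": (68, 24),
}

results = {}
for baseline_name, expected in baselines.items():
    for controller, observed in inflicted.items():
        chi_sq = chi_square(observed, expected)
        results[(baseline_name, controller)] = (chi_sq, p_value_1df(chi_sq))
        print(f"{baseline_name}, {controller}: "
              f"Chi Square = {chi_sq:.3f}, p = {results[(baseline_name, controller)][1]:.3f}")
```

    Only the hotseat baseline separates the two controllers; against the Human vs. AI average the two chi-square values are identical by construction, since that baseline is the midpoint of the two observed results.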

    Comments to others

    -----------------

    Mannheim Tanker -- You said I've confused significance with the p value. I don't believe this is the case. According to the website I reference above (http://faculty.vassar.edu/lowry/webtext.html) they are the same. Can you explain what you believe the difference to be and how one converts from one value to the other?

    L.Tankersley -- I get a p value of .416 with your data. This is pretty close to what you got, but I'm curious what the difference is.

    Finally let me encourage everyone to run a set of tests with the scenario I link to. The more data we have the more accurate our conclusions will be. It only takes 15 minutes :)

    --Chris
