A bit of bad TCP/IP news


Guest Big Time Software


Well, you guys probably already thought of this, but what the heck....

Does the C compiler you use generate inline FP instructions, or does it call (or INT to) library routines? If library routines are used, then you could modify them alone to force uniform "rounding" without needing to alter the entire CM code base.

...or...

You might force this situation by setting the compiler to use software FP and then substituting your (homemade) hardware FP routines for the software library at link time (or DLL load, etc.)

Dan

CM "floats," but the results are real.

[This message has been edited by Dr Dan (edited 11-07-2000).]



Software FP calculations are very slow, so that's probably not an option.

The idea of masking the least significant digit(s) in a defined way doesn't work either, because rounding differences can cause more significant bits to flip too (0.9999 to 1.0000, for example).

You should consider that there really isn't a quick solution to this problem.

Dschugaschwili


> Originally posted by Dschugaschwili:
>
> Software FP calculations are very slow, so that's probably not an option.
>
> The idea of masking the least significant digit(s) in a defined way doesn't work either, because rounding differences can cause more significant bits to flip too (0.9999 to 1.0000, for example).
>
> You should consider that there really isn't a quick solution to this problem.
>
> Dschugaschwili

I did NOT suggest using software FP, but rather forcing the compiler to generate code using the software FP option, replacing the software FP routines with specially coded HARDWARE ones when the program is linked.

There are a number of ways to force the desired rounding effect, several of which have been mentioned earlier.

Actually, this may or may not be a quick fix depending on the information available to the programmers regarding the FP libraries, etc. For example, some compiler manufacturers supply library source with their distributions, making the task fairly easy, if a bit tedious.

Dan

CM is captivating.


OK. I wasn't thinking (sheepish grin). Here's a quote from an Intel manual that might actually contribute to the discussion:

"The rounding control (RC) field of the FPU control register (bits 10 and 11) controls how the results of floating-point instructions are rounded. Four rounding modes are supported: round to nearest, round up, round down, and round toward zero. Round to nearest is the default rounding mode and is suitable for most applications. It provides the most accurate and statistically unbiased estimate of the true result."

I can only assume that the other CPU manufacturers provide similar configurable options for their FPUs. (It strikes me that maybe one of the ways AMD gets more speed out of their FPU is to use a quicker rounding mode as the default???)


Just out of curiosity... how many decimals are we talking about here?

Are we talking about 0.01 or 0.0000000000001 before there is any difference?

A long time since I did any programming... thank god.

And why isn't it possible to round to, say, 5 decimals? Then the results would surely be the same even if the computations used 100 decimals...?

It must be the same if the turret rotates at 1.00001, even though it "should have" rotated at 1.0000100000000000000000001... or 1.0001000000000000000002 (on another system)?

Anyway, I think we would all rather have a game that is slower and correct than one that is slightly faster and incorrect.

Thanks for keeping us updated.


> Originally posted by Big Time Software:
>
> degreesK, the problem was first discovered between an Intel PIII and an AMD chip. If you know of some way to synch the two, please give Charles an email about this (charles@battlefront.com).

The project I've been working on for the last year has been using the method of synchronisation you were trying.

Nothing special needs to be done to sync AMD with Intel, in my experience. I just double-checked using a chaotic function on both an Intel Celeron 400 and an AMD Thunderbird 800. Results will vary depending on whether you use float or double, optimisations on or off, the _controlfp(...) rounding or precision settings, etc.

The results are always consistent between CPUs, however; i.e., both the Thunderbird and the Celeron give the same result using an executable compiled with the same options.

This is under Windows 2000 using Visual C++ 6.


Steve of BTS, you are correct. I spent about 30 seconds rattling around those memories and came up with the following:

Aces Over Europe did its math in integers. This created occasional problems in the flight model. I remember writing up a bug that involved the plane flipping 90 degrees at the top of a loop in some circumstances. It turned out to be a rounding error due to the integer math.

The other game was a sports game, and had no need for anything more complex than integer math.

Sorry for the confusion; my mind is going as I get further into management and away from the work.

Thog


> Originally posted by Buzzer:
>
> And why isn't it possible to round to, say, 5 decimals? Then the results would surely be the same even if the computations used 100 decimals...?

Yes, by this method it is possible to get the same results, but it is probably too hard to add the rounding code in all the necessary places.

The solution would be to use double precision floating point (I don't think that double precision is any slower than single precision), and when comparing values, add a very small number to one of them.

For example, 1.00001 is less than 1.00002 but if you add 0.0001 to the first, then it is greater. I believe that when CM calculates if we have a penetration or not, it eventually does a comparison like this.

Again, it may be too hard to add code like this in all the right places.


Fuerte:

> For example, 1.00001 is less than 1.00002 but if you add 0.0001 to the first, then it is greater. I believe that when CM calculates if we have a penetration or not, it eventually does a comparison like this.

No, the whole point of the original system was to have no comparisons between systems. Each would do its thing all on its own; that is where the speed came from. So if one machine turns up 1.00001 and the other comes up with 1.00002, there is no way for the first machine to know that it must have 0.00001 added in order to be the same as the other.

The problem is very basic here. Generally, computations do not need to be predictable and exact to so many places right of the decimal point. But the real problem here is that we cannot ensure that each machine comes up with exactly the same numerical values.

Something like the SETI program is, if I am guessing correctly, a sort of distributed processing system. One machine creates some values and sends them to the host. The host is NOT independently processing the same information on its own, and therefore takes the guest's data "as is" and uses that. This is sorta the way PBEM works, and it is now how TCP/IP works for CM.

Yup, things have been totally recoded to work more like PBEM. We are testing it internally for at least the next week. Around Friday or so I will give you guys a new update on when we expect to release it.

Thanks,

Steve


Looking forward to it!

------------------

"I may disagree with what you have to say, but I shall defend, to the death, your right to say it."


> Originally posted by Big Time Software:
>
> So if one machine turns up 1.00001 and the other comes up with 1.00002, there is no way for the first machine to know that it must have 0.00001 added in order to be the same as the other.

You didn't quite get what I meant, but it really doesn't matter. I'm sure that the new TCP/IP play, which resembles the current PBEM play, is the correct way of doing it.

But if you want to know what I meant:

If BOTH computers add 0.00001 to 1.000001 before comparing it to 1.000002, then BOTH computers think that the first number is greater, and we have a penetration (or something) on BOTH machines.

After thinking a bit, I think that my logic is flawed. The only way of doing the comparison right is to round all numbers to some precision before the comparison, as someone suggested.

[This message has been edited by Fuerte (edited 11-11-2000).]


> Originally posted by Fuerte:
>
> The only way of doing the comparison right is to round all numbers to some precision before the comparison, as someone suggested.

That's not likely to work either, as the deviations due to machine differences can propagate up if you do multiple calculations with a number.

I'm coming into this a little late (was out of town without a laptop), but I find the whole thing very interesting. It's going to result in some entertaining trips to the library...

------------------

Slayer of the Original Cesspool Thread.


OK, I am just a little confused. I play tons of games online, and as I understand it you can play just about anything online. How is this TCP/IP problem different from that in any other game?

Number crunching happens in every game, and if you have an RTS, or any real-time game, how is the "floating point" or "double" problem which was described any different from that in any other game?

I say: if you have to, copy. We won't tell anybody. Borg never tell. =)


This may have been answered earlier in the thread, but exactly how much slower is the new code compared to the defective one as far as game play is concerned?

Thanks in advance.

------------------

The counter-revolution,

people smiling through their tears.

Who can give them back their lives, and all those wasted years.


OK once again I am confused...

We have a hypothetical situation here: an infantry squad runs from point A to point B in a CM scenario.

IFF computer user X

AND

computer user Y

both start the same infantry squad in the same scenario at point A and both end at point B, are you telling me that these two infantry squads might actually wind up at different points on the map?

now:

IFF computer user X and computer user Y have different CPUs which do the number crunching, will this make a difference?

now:

IFF computer user X and computer user Y are playing an online game, and computer user X has a Sherman that, according to user X's CPU's calculations, destroys a Tiger, that Tiger may not actually be destroyed on user Y's screen, correct? And if it is not destroyed on user Y's machine, all the tangent calculations related to the un-destroyed Tiger rapidly diverge from user X's calculations?

now:

IFF user X and user Y do not share information, and their CPUs do the calculations on their own but do not arrive at the same conclusions, then what we actually have happening is two different games...?

(And we only want one game)

now:

IFF user X and user Y are playing a PBEM game, and user X sends user Y his/her file, then user Y's CPU does the number crunching while user X's CPU does nothing? Now: user X completes his part of turn 1 of the PBEM game and moves an inf squad from point A to point B on his map. When this file is sent to user Y, is user Y's CPU given the value for the point B at which the infantry squad ended up?

Or does user Y's CPU do its own calculation, disregarding what user X's CPU came up with (or would have come up with) as point B, and then use this calculation as the actual game data from which it makes the movie that we see...?

so:

user X's CPU computes a different point B at which the infantry squad ends up, and therefore this squad is out of view of a 75mm inf gun, so the 75mm inf gun does not open fire, and therefore remains unseen by the four M4 Shermans on a hill which would immediately open fire on it. It is easy to see how much effect a minor calculation can have on the outcome of a battle.

however:

Now, my question is this:

If one computer makes different computations than another, why not let each computer independently make its own calculations for speed's sake; then, when it comes time to combine the two calculations to make the movie, simply use ONE of the two computers to make the movie? That movie should be the same, and both user X and user Y will get the same output on their screens. (???)


> Originally posted by Ghengis Jim:
>
> OK once again I am confused...
>
> If one computer makes different computations than another, why not let each computer independently make its own calculations for speed's sake; then, when it comes time to combine the two calculations to make the movie, simply use ONE of the two computers to make the movie? That movie should be the same, and both user X and user Y will get the same output on their screens. (???)

My impression from the little that I've read is that the initial plan was to pass only the orders files (where essentially no calculations are made (actually, LOS and arty time-to-delivery get calculated, but those would get passed in the orders file anyway)).

Then someone would generate a seed for the random number generator, which would be synched between the two machines. Since the code is the same and the inputs (orders and random numbers) are the same, the outputs should be the same, right? Unfortunately, floating point processors are prone to small errors (well described in the Sun URL above) that differ from processor to processor, depending on internal architecture. I wouldn't be surprised if different releases of the allegedly same processor had minor variations (or major ones: the Pentium FP bug). In many cases this doesn't matter, especially in the 7th or 8th or 15th decimal place, but when comparing numbers (as CM must frequently do to determine things like LOS, hits, whether a squad breaks, etc.), a tiny difference in the numbers can make the difference between a hit and a miss.

There are ways to deal with this, but they may involve substantial recoding (or at least careful code substitution) that isn't worth the effort relative to the payoff. That is, BTS could get the patch out in a couple more weeks if they do TCP/IP much like PBEM, where all the computation for generating movies is done on one machine, since that code is very solid already. Or they could mess around with a lot of floating point stuff, probably take a speed hit since the FP might be forced into software, introduce a few small and hard-to-find bugs, and delay the release of the patch several months, by which point everyone will have optical fiber to their homes and it won't matter whether you're sending 300K or 30K; the only result would be that TCP/IP was delayed to spare users a few bps of bandwidth.

------------------

Slayer of the Original Cesspool Thread.


> Originally posted by chrisl:
>
> That's not likely to work either, as the deviations due to machine differences can propagate up if you do multiple calculations with a number.

It would work. If all calculations were done with double precision floating point (15 significant digits) and all comparisons with single precision (7 digits, rounded correctly), for example, then it would work.

This is all theoretical; of course, there is no reason to do it this way.

