
A bit of bad TCP/IP news


Guest Big Time Software


<BLOCKQUOTE>quote:</font><HR>Originally posted by aka_tom_w:

Is there any chance these suggestions could be part of the solution for the simultaneous crunch, on one common compiler?

Fuerte sounds like he may know about these things.<HR></BLOCKQUOTE>

I'm not really an expert, but considering that the Seti@Home client works on several different platforms (even Unix/Linux), it should be possible to use floating point routines that work exactly the same on all platforms.


Fuerte wrote:

]From the numbers you gave it seems that you are using single precision (32-bit) floating point numbers. If that is so, then the fix is easy: use double precision (64-bit).

Doesn't work. You don't need to have many floating point operations before even double precision numbers start to deviate.

The suggestion of using BCDs could work, or using bignums. (I know that for some people those terms mean the same, but I like to differentiate them). The problem here is that floating point computations are done by hardware, and simulating them in software causes a pretty large slowdown. Also, plugging a bignum library into an existing codebase can be a very large effort. (Been there, should do that).
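A minimal sketch of the software-arithmetic idea, assuming Python's standard `fractions` module as a stand-in for a bignum library (the function names are made up for illustration). Exact rationals never round, so every machine gets the same answer, at the cost of doing the math in software:

```python
from fractions import Fraction


def accumulate_float(step: float, count: int) -> float:
    """Add `step` to a running total `count` times using hardware floats."""
    total = 0.0
    for _ in range(count):
        total += step
    return total


def accumulate_exact(step: Fraction, count: int) -> Fraction:
    """Same loop, but with exact rational arithmetic done in software."""
    total = Fraction(0)
    for _ in range(count):
        total += step
    return total


float_total = accumulate_float(0.1, 10)
exact_total = accumulate_exact(Fraction(1, 10), 10)

print(float_total == 1.0)  # False: each float addition rounds a little
print(exact_total == 1)    # True: no rounding ever happens
```

The exact version is much slower per operation, which is the slowdown mentioned above.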

- Tommi


<BLOCKQUOTE>quote:</font><HR>P.S. Charles and I did the first test. Unfortunately, we both have the same CPU so this problem wasn't noticed until we widened the testing group.<HR></BLOCKQUOTE>

There's the solution, we all need to buy the same CPU that Steve and Charles are using. Problem solved.

------------------

...This is Romeo-Foxtrot, shall we dance?...


<BLOCKQUOTE>quote:</font><HR>Originally posted by Kingfish:

There's the solution, we all need to buy the same CPU that Steve and Charles are using. Problem solved.

<HR></BLOCKQUOTE>

Or perhaps release a special edition of the TCP/IP patch. Costs $1000, but includes a computer built to a common CPU standard! ;)

However, BTS has once again shown that customer relations is well supported by providing us with this latest update!

Keep up the excellent work guys (*Sigh*, yes even you, Madmatt).

Mace


Guest Madmatt

All I know is that in all my test battles I was kicking ass left and right! The other guys kept saying "Hey, I Killed that Tiger!" and I would reply with "It's FOW baby, SUCK IT UP!".

Once I compared saved game files, though, it became apparent that luck was not with me so much.... :(

One other thing to consider, though. Normal QB files are not that large, and even at 28.8k modem speeds the time to transfer the battles will not be too bad. With Operations, though, you could be looking at a 1 meg file, and that will mean a delay of several minutes.
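A back-of-the-envelope check of that figure, assuming "1 meg" means 10**6 bytes and that the modem sustains its nominal 28,800 bits per second with no protocol overhead and no compression (all of these are assumptions, not measured values):

```python
def transfer_seconds(size_bytes: int, line_bits_per_second: int) -> float:
    """Ideal transfer time: payload bits divided by the raw line rate."""
    return size_bytes * 8 / line_bits_per_second


operation_file_bytes = 1_000_000  # assumed size of a large Operations file
modem_bps = 28_800                # nominal 28.8k modem rate

seconds = transfer_seconds(operation_file_bytes, modem_bps)
print(f"{seconds / 60:.1f} minutes")  # prints "4.6 minutes"
```

Real modem links rarely hit their nominal rate, so the actual wait would likely be longer.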

Also, while we appreciate all the techies coming out from under rocks and caves, there really is no solution to this issue. Implementing any other math technique (integers, shared data key sets, etc.) would require a whole overhaul of the existing code structure, and that is NOT an option. The system that Charles had set up had the beauty of incorporating the current code resolution schemes on each system, but as indicated above it simply will not work properly due to slight variations in the floating point math computations of different chips.

The new method that Charles is working on plans to let the faster system handle the number crunching, which will help a little, but you still have to send the data back and forth, and that is where the delay will be encountered. As soon as we get a chance to test the newest version we will report back and give you some more info to chew on.
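A rough sketch of that one-machine-crunches arrangement, with made-up names throughout (`resolve_turn`, the order strings, and the JSON wire format are illustrative assumptions, not how CM actually does it). Only the host does any math; the other side just decodes what it receives, so the two copies cannot drift apart:

```python
import hashlib
import json
import random


def resolve_turn(orders: list[str], seed: int) -> dict:
    """Hypothetical stand-in for the turn crunch; runs on the host only."""
    rng = random.Random(seed)
    return {order: round(rng.random(), 6) for order in orders}


def encode_results(results: dict) -> bytes:
    """Serialize results in a stable key order so both sides hash the same bytes."""
    return json.dumps(results, sort_keys=True).encode("utf-8")


def digest(payload: bytes) -> str:
    return hashlib.sha256(payload).hexdigest()


# Host: crunch the turn and ship the outcome.
orders = ["tiger_fire", "sherman_move"]
payload = encode_results(resolve_turn(orders, seed=42))
host_digest = digest(payload)

# Client: no floating point work at all, only decoding.
client_results = json.loads(payload.decode("utf-8"))
client_digest = digest(encode_results(client_results))

print(client_digest == host_digest)  # True: both sides hold identical results
```

The price is the one described above: the whole result set has to travel over the wire every turn.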

Madmatt

[This message has been edited by Madmatt (edited 11-06-2000).]


I think they should go back to the reliable PBEM-style method even if it means a turn takes slightly longer. We wouldn't want to find out a month from now that there was something unforeseen and have to go through all the waiting again. And even if it worked perfectly (which we might not know right off), some questions still might be raised in an online game if a glitch occurred. The PBEM way will at least rule out the TCP/IP play method as the cause.

------------------

Thanks for Athskin!


Guest Big Time Software

Aaronb,

Thanks for trying to help, but as I said... a solution is not to be found. The sync of the randoms isn't the problem; it's the internal floating point math. It simply is not the same from chip to chip.

In other words, one chip says 2 + 2 = 4.000000001 and the next one says 2 + 2 = 3.99999999.
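The 2 + 2 figures are shorthand; small whole numbers are stored exactly in binary floating point. The discrepancies show up with fractional values, where each intermediate result has to be rounded. A small sketch of how the same three numbers can give two different totals depending only on where the rounding happens:

```python
# Same three values, same additions, different grouping.
left_to_right = (0.1 + 0.2) + 0.3
right_to_left = 0.1 + (0.2 + 0.3)

print(left_to_right == right_to_left)  # False
print(left_to_right - right_to_left)   # a tiny, nonzero difference
```

This runs on one machine, so it is only an illustration of the mechanism: anything that changes how intermediate results are rounded can change the final bits.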

Fuerte, double precision would be very slow and it is still likely to have the same problems as we have now.

Sigh... no easy solution except to go with the slower method that works no matter what :(

Steve


Guest AggroMann

Thank god you found this problem! I'd much rather play a slower TCP/IP game than a TCP/IP game that is inaccurate.

------------------

AGGRO-MANN


<BLOCKQUOTE>quote:</font><HR>Originally posted by Big Time Software:

Aaronb,

Thanks for trying to help, but as I said... a solution is not to be found. The sync of the randoms isn't the problem; it's the internal floating point math. It simply is not the same from chip to chip.

In other words, one chip says 2 + 2 = 4.000000001 and the next one says 2 + 2 = 3.99999999.

Fuerte, double precision would be very slow and it is still likely to have the same problems as we have now.

Sigh... no easy solution except to go with the slower method that works no matter what :(

Steve<HR></BLOCKQUOTE>

Ok

This one has me thinking....

(dangerous, I know :) )

Does anyone else on the board know how other programs or institutions deal with this?

I know this is crazy, but doesn't NASA have to sync computers on the ground and on the space shuttle? (Doh! I guess they just buy the same CPU.)

I am aware that floating point math problems have been an issue in the past with computers used for scientific work that demands precision and accuracy, BUT I would NEVER have guessed that this ever-so-minor deviation would affect combat results in CM. It is hard for me to imagine that Charles and Steve have had to literally invent the wheel on this issue. Do no other multiplayer games have this problem? Is CM really so groundbreaking that it has now run up against this minor deviation in the way different CPUs handle math, and that it has crippled the simultaneous multiplayer crunch?

And no other software or video game company has tried this before?

I am in awe!

Totally, because that sure must have been one MOTHER of a groundbreaking attempt to do something REALLY new and different with that simultaneous multiplayer crunch proposal. :)

Anyway, thanks for all the detailed updates to all of us, and especially to the computer geeks here. I really appreciate all the news, even if it is not so good.

-tom w


Guest Andrew Hedges

<BLOCKQUOTE>quote:</font><HR>Originally posted by aka_tom_w:

It is hard for me to imagine that Charles and Steve have had to literally invent the wheel on this issue. Do no other multiplayer games have this problem? Is CM really so groundbreaking that it has now run up against this minor deviation in the way different CPUs handle math, and that it has crippled the simultaneous multiplayer crunch?

And no other software or video game company has tried this before?

-tom w<HR></BLOCKQUOTE>

I'm not a programmer, so this example may have little to do with the current problem and more to do with z-tex globulets :) -- but...the Baldur's Gate manual (I never played multiplayer, but I read about it there) mentioned that sometimes people playing multiplayer would see different things on the screen, or have slightly different encounters, even though everything was supposed to be happening simultaneously. I think the manual explained this by saying that it happened because the connection was "asynchronous." Although it could just as well have said that it happened because the connection used manna, for all it meant to me.

Anyway, that sort of sounds like this problem, although I think BG used a master/slave arrangement. According to the manual, the occasionally different encounters wouldn't affect the outcome; I'm still not sure what they meant by that.


Guest AL the red

<BLOCKQUOTE>quote:</font><HR>Originally posted by Big Time Software:

The problem has to do with different families of CPUs, and perhaps even different speed ranges within a family of chips. What happens is that each CPU has a slightly different way of rounding floating point numbers. Even if a value is off by 1/1000th, it can cause the game to "diverge", and thus each player will see a different game play out than the other guy.

[This message has been edited by Big Time Software (edited 11-06-2000).]<HR></BLOCKQUOTE>
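A toy illustration of how a tiny rounding difference turns into a different game, using a hypothetical `shot_hits` check (not CM's actual combat code). Both "machines" use the same synced random roll; only the last bit of the computed hit chance differs, and that is enough to flip the outcome:

```python
def shot_hits(hit_chance: float, roll: float) -> bool:
    """Hypothetical combat check: the shot hits if the roll is under the chance."""
    return roll < hit_chance


roll = 0.3  # the same synced random number on both machines

# Two ways of arriving at "30%" that differ only in the final bit.
chance_on_machine_a = 0.1 + 0.2  # rounds to slightly above 0.3
chance_on_machine_b = 0.3

machine_a_hit = shot_hits(chance_on_machine_a, roll)
machine_b_hit = shot_hits(chance_on_machine_b, roll)

print(machine_a_hit, machine_b_hit)  # True False
```

Once one machine has a dead Tiger and the other a live one, every later calculation starts from a different state, so the two games only drift further apart.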

Yeah, I thought you'd have problems with that. ;)


I would have thought that using the floating-point control register to explicitly set the rounding mode would have solved those problems. I'm disturbed that it doesn't work, as I was hoping to use a similar method in a program that I'm working on.
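Python has no standard way to touch the hardware control register (in C the usual route is `fesetround` from `<fenv.h>`), so the sketch below uses the software `decimal` context purely as an analogy for what pinning the rounding mode means. The helper name `two_thirds` is made up:

```python
from decimal import Decimal, ROUND_DOWN, ROUND_HALF_EVEN, localcontext


def two_thirds(rounding: str) -> Decimal:
    """Compute 2/3 to six significant digits under an explicit rounding mode."""
    with localcontext() as ctx:
        ctx.prec = 6
        ctx.rounding = rounding
        return Decimal(2) / Decimal(3)


nearest = two_thirds(ROUND_HALF_EVEN)
truncated = two_thirds(ROUND_DOWN)

print(nearest)    # 0.666667
print(truncated)  # 0.666666
```

Fixing the mode on both sides removes this one source of disagreement; the sketch says nothing about whether that alone is enough for a given program.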

------------------

----

To download my scenarios: go to http://www3.telus.net/pop_n_fresh/combatmiss/index.htm

[This message has been edited by Disaster@work (edited 11-06-2000).]


Guest Duri Price

Well, many many moons ago I tested on Aces over Europe, and the mission recorder used the same random seed locking/synching they describe. Though this did actually lead to a bug that was caused by the time of day (a tester's wet dream), it generally worked quite well. Save the seed, record the JS input, use that as a multiplier. You could send it from one machine to the next and it would work on both.
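A minimal sketch of that save-the-seed, record-the-input scheme. The `replay` function, the dice roll, and the integer-only state are illustrative assumptions, not the actual recorder:

```python
import random


def replay(seed: int, recorded_inputs: list[int]) -> list[int]:
    """Re-run a recording: same seed plus same inputs gives the same history."""
    rng = random.Random(seed)
    state = 0
    history = []
    for stick in recorded_inputs:
        # The recorded input scales a seeded random roll; integers only,
        # so there is no floating point rounding to disagree about.
        state += stick * rng.randint(1, 6)
        history.append(state)
    return history


recording = {"seed": 1944, "inputs": [1, -1, 0, 2, 1]}

machine_a = replay(recording["seed"], recording["inputs"])
machine_b = replay(recording["seed"], recording["inputs"])

print(machine_a == machine_b)  # True
```

This only stays in sync as long as every step of the simulation is itself reproducible, which is exactly where floating point can get in the way.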

About four years ago I pried my copy from the bottom of a box and installed it. After getting it to run on a P166 (which was a challenge), I tried the recorded missions that came with the game. They worked. Recorded on a 386, they worked on P166.

Soooo.... dunno. I know I worked on one other game that used a similar approach for modem/network game play. We had problems, but it was because most of the data on individual objects was being calculated on each machine separately and then transmitted to the other. Then the turn would execute in realtime, using a synched seed, and it would perform the same. The problem was that one datafield was being truncated and would turn over midway through the game, but only if you used a specific object on a full moon and sacrificed your own virginity. Still, we tracked it down and killed it, and it worked from there on out. Don't ask me why they exchanged datasets instead of executing them separately with the same seed as they did in turn execution; I just don't know. This was about five years ago.

Soooo... again, dunno. To my knowledge it's been done at least twice in years past. Maybe there's something I'm unaware of in how it was done, or maybe it's the floating point resolution, or it could be the flying squirrel again. I believe BTS when they say they're beating their heads against a wall. It may be something they could get a workaround going for, but if so, it might take -much- more effort than it's worth.

Thog


There could be a silver lining to this.

Does this different floating point computation thingy mean that some processors give you a little bit of an advantage over others when you generate the turn? Sort of like using a set of dice that tend to roll low in Squad Leader, where the good results for the attacker, IIRC, are at the low-numbered end of the combat resolution tables, then slipping the other guy a set of dice that roll high.

If this is true (or even makes sense) it could mean that some processors have a disadvantage and that I am not as big of a tactical dufus as I thought. I'm just using the wrong proc. It's been my Athy all along!

It could lead to a whole new generation of excuses! "Err...I would have kicked your butt, but my machine dropped a decimal place and my Tiger imploded."

What is an unfortunate setback for TCP/IP could have a profound positive effect on the morale of my tactically feeble brethren everywhere.


I think the variance is likely between PowerPC and AMD/Intel. I know it is possible to do synchronization between AMD and Intel x86 platforms. PowerPC must do something different. IEEE-754 doesn't specify the internal precision of operations, only the precision of fetch/store and the rounding mode. Therefore, two implementations could both be IEEE-754 compliant and still give different results.
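A small sketch of the internal-precision effect, simulated on one machine. It assumes the stored format is single precision and uses `struct` to round a Python float (a double) down to it; the helper `to_single` is made up for this example. One path rounds after every operation, the other keeps a wider intermediate and rounds once at the end:

```python
import struct


def to_single(value: float) -> float:
    """Round a double to the nearest single-precision value and back."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


start = 16777216.0  # 2**24, where single precision stops representing odd integers

# "Narrow" chip: every intermediate result is rounded to single precision.
narrow = to_single(to_single(start + 1.0) + 1.0)

# "Wide" chip: intermediates stay in double, only the stored result is rounded.
wide = to_single(start + 1.0 + 1.0)

print(narrow)  # 16777216.0
print(wide)    # 16777218.0
```

Both paths round correctly at every step they perform, yet they store different answers.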


Although it would probably be too much of a hassle, would it be possible to have a "fast" and a "compatible" mode of TCP/IP play?

It would probably only work if it could be determined which CPUs vary from each other in their floating point rounding (e.g., a Pentium stepping 7 has a different FP rounding from a Pentium 2 stepping 3). Quite possibly more work than would be desired, but it would be nice to have the option to play with the "fast TCP/IP".
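Instead of keeping a table of CPU steppings, one could imagine each machine running a short floating point self-test at connect time and comparing fingerprints. A rough sketch, with made-up names (`float_fingerprint`, `choose_mode`) and an arbitrary probe calculation:

```python
import hashlib
import math
import struct


def float_fingerprint() -> str:
    """Hash the exact bit patterns produced by a fixed chain of float operations."""
    samples = []
    x = 0.1
    for i in range(1, 50):
        x = math.sqrt(x * i + 0.3) / 1.7
        samples.append(x)
    raw = b"".join(struct.pack("<d", sample) for sample in samples)
    return hashlib.sha256(raw).hexdigest()


def choose_mode(local_fingerprint: str, remote_fingerprint: str) -> str:
    """Use the fast mode only when both machines produced identical bits."""
    return "fast" if local_fingerprint == remote_fingerprint else "compatible"


mine = float_fingerprint()
print(choose_mode(mine, mine))             # fast
print(choose_mode(mine, "something else")) # compatible
```

A matching fingerprint is only evidence that the two machines agree on the operations probed, not proof that they agree on every calculation the game performs.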

Again, thanks to BTS for keeping us posted on developments; it really is appreciated.
