Ticket #1136 (closed bug: fixed)
FD_SET crash in multiplayer: 2.3 beta 3 (during checkSockets(), aka select(2))
| Reported by: | anonymous | Owned by: | |
|---|---|---|---|
| Priority: | normal | Milestone: | unspecified |
| Component: | other | Version: | unspecified |
| Keywords: | crash multiplayer | Cc: | |
| Blocked By: | | Blocking: | |
| Operating System: | GNU/Linux | | |
Description
I'm not actually sure what my last action was before this crash. Gentoo GNU/Linux; the game was compiled from source in my home directory and was not installed with "make install".
Attachments
Change History
comment:1 Changed 2 years ago by anonymous
Another crash, and I don't know if it is connected to this one. Here is the console output:
FIRST CRASH
error |01:13:17: [recvDroidMove] Packet from 0 refers to non-existent droid 217, [Human : p0]
error |01:14:22: [recvGroupOrder] Packet from 4 refers to non-existent droid 164876, [Human : p4]
error |01:14:22: [recvDroidInfo] Packet from 4 refers to non-existent droid 164876, [Human : p4]
error |01:14:50: [recvGroupOrder] Packet from 2 refers to non-existent droid 165554, [Human : p2]
error |01:14:50: [recvDroidInfo] Packet from 2 refers to non-existent droid 165554, [Human : p2]
error |01:14:51: [recvGroupOrder] Packet from 2 refers to non-existent droid 165642, [Human : p2]
error |01:14:51: [recvDroidInfo] Packet from 2 refers to non-existent droid 165642, [Human : p2]
error |01:17:04: [recvDroidMove] Packet from 4 refers to non-existent droid 163684, [Human : p4]
error |01:23:28: [recvGroupOrder] Packet from 3 refers to non-existent droid 166907, [Human : p3]
error |01:23:37: [recvGroupOrder] Packet from 3 refers to non-existent droid 166283, [Human : p3]
error |01:23:48: [recvGroupOrder] Packet from 3 refers to non-existent droid 166651, [Human : p3]
error |01:23:48: [recvGroupOrder] Packet from 3 refers to non-existent droid 165915, [Human : p3]
error |01:23:48: [recvDroidMove] Packet from 3 refers to non-existent droid 165915, [Human : p3]
error |01:23:55: [recvDroidInfo] Packet from 3 refers to non-existent droid 166251, [Human : p3]
error |01:24:08: [recvDroidMove] Packet from 2 refers to non-existent droid 167346, [Human : p2]
error |01:28:44: [recvDroidInfo] Packet from 4 refers to non-existent droid 167892, [Human : p4]
error |01:29:51: [recvDroidInfo] Packet from 4 refers to non-existent droid 167004, [Human : p4]
error |01:31:27: [recvDroidInfo] Packet from 3 refers to non-existent droid 168747, [Human : p3]
error |01:31:27: [recvDroidMove] Packet from 3 refers to non-existent droid 168747, [Human : p3]
error |01:31:27: [recvDroidInfo] Packet from 3 refers to non-existent droid 168747, [Human : p3]
error |01:31:30: [recvDroidInfo] Packet from 3 refers to non-existent droid 169123, [Human : p3]
error |01:31:30: [recvDroidMove] Packet from 3 refers to non-existent droid 169123, [Human : p3]
error |01:31:30: [recvDroidInfo] Packet from 3 refers to non-existent droid 169123, [Human : p3]
last message repeated 2 times
error |01:31:30: [recvDroidMove] Packet from 3 refers to non-existent droid 169123, [Human : p3]
error |01:31:30: [recvDroidInfo] Packet from 3 refers to non-existent droid 169123, [Human : p3]
error |01:31:33: [recvDroidInfo] Packet from 3 refers to non-existent droid 169587, [Human : p3]
error |01:31:33: [recvDroidMove] Packet from 3 refers to non-existent droid 169587, [Human : p3]
error |01:31:33: [recvDroidInfo] Packet from 3 refers to non-existent droid 169587, [Human : p3]
last message repeated 2 times
error |01:31:33: [recvDroidMove] Packet from 3 refers to non-existent droid 169587, [Human : p3]
error |01:31:33: [recvDroidInfo] Packet from 3 refers to non-existent droid 169587, [Human : p3]
last message repeated 2 times
last message repeated 1 times (total 3 repeats)
error |01:31:39: [recvDroidMove] Packet from 3 refers to non-existent droid 168947, [Human : p3]
error |01:31:39: [recvDroidInfo] Packet from 3 refers to non-existent droid 168947, [Human : p3]
error |01:31:43: [recvDroidMove] Packet from 3 refers to non-existent droid 168835, [Human : p3]
error |01:31:43: [recvDroidInfo] Packet from 3 refers to non-existent droid 168835, [Human : p3]
last message repeated 2 times
last message repeated 1 times (total 3 repeats)
error |01:31:47: [recvDroidInfo] Packet from 3 refers to non-existent droid 168931, [Human : p3]
error |01:31:53: [recvDroidInfo] Packet from 3 refers to non-existent droid 168859, [Human : p3]
error |01:33:43: [recvGroupOrder] Packet from 4 refers to non-existent droid 169348, [Human : p4]
error |01:33:43: [recvDroidInfo] Packet from 4 refers to non-existent droid 169348, [Human : p4]
error |01:35:01: [recvDroidMove] Packet from 2 refers to non-existent droid 168762, [Human : p2]
error |01:35:10: [NETbcast] Failed to send message: Bad file descriptor
error |01:35:10: [socketClose] Failed to close socket: Bad file descriptor
Saved dump file to '/tmp/warzone2100.gdmp-jP0KV3'
If you create a bugreport regarding this crash, please include this file.
Segmentation fault.
SECOND CRASH
error |02:01:06: [recvDroidMove] Packet from 3 refers to non-existent droid 164483, [Human : p3]
error |02:02:20: [recvGroupOrder] Packet from 1 refers to non-existent droid 171001, [Human : p1]
error |02:02:25: [recvDroidInfo] Packet from 1 refers to non-existent droid 171497, [Human : p1]
error |02:02:25: [recvDroidInfo] Packet from 1 refers to non-existent droid 171033, [Human : p1]
error |02:02:25: [recvDroidInfo] Packet from 1 refers to non-existent droid 171497, [Human : p1]
error |02:02:25: [recvDroidInfo] Packet from 1 refers to non-existent droid 171033, [Human : p1]
error |02:04:01: [recvGroupOrder] Packet from 1 refers to non-existent droid 172665, [Human : p1]
error |02:04:01: [recvDroidInfo] Packet from 1 refers to non-existent droid 172665, [Human : p1]
error |02:04:34: [NETbcast] Failed to send message: Connection reset by peer
Saved dump file to '/tmp/warzone2100.gdmp-glK2dX'
If you create a bugreport regarding this crash, please include this file.
Segmentation fault.
comment:2 Changed 2 years ago by anonymous
Both times it happened when i was attacking and therefore moving units frequently.
comment:3 Changed 2 years ago by anonymous
And this time it happened very quickly, right at the beginning of the game.
error |02:18:02: [NETbcast] Failed to send message: Bad file descriptor
error |02:18:02: [socketClose] Failed to close socket: Bad file descriptor
Saved dump file to '/tmp/warzone2100.gdmp-ZfsHtR'
If you create a bugreport regarding this crash, please include this file.
Ошибка сегментирования
comment:4 Changed 2 years ago by anonymous
Ошибка сегментирования = Segmentation fault (Russian locale here ;)
comment:5 Changed 2 years ago by hao
The host says his game hung/crashed the 2nd and 3rd times. But another player said he just saw the host leaving while the game was still responding.
comment:6 Changed 2 years ago by hao
More crashes. It was an 8-player map, but only 5 humans and 1 AI were present.
error |02:37:53: [NETbcast] Failed to send message: Bad file descriptor
error |02:37:53: [socketClose] Failed to close socket: Bad file descriptor
Saved dump file to '/tmp/warzone2100.gdmp-X16BJB'
If you create a bugreport regarding this crash, please include this file.
Segmentation fault.
comment:7 Changed 2 years ago by hao
LC_ALL="en_US.UTF-8" ~/Downloads/warzone2100-2.3_beta3/src/warzone2100
error |02:55:44: [NETbcast] Failed to send message: Socket operation on non-socket
error |02:55:44: [socketClose] Failed to close socket: Bad file descriptor
Saved dump file to '/tmp/warzone2100.gdmp-Q3JYzW'
If you create a bugreport regarding this crash, please include this file.
This happened just as all my opponents had left: I saw the last one leave, and then it crashed.
comment:9 Changed 2 years ago by Cyp
Don't know if it's the same crash exactly, but mine keeps crashing at netplay.c:642.
Program terminated with signal 11, Segmentation fault.
[New process 5186]
[New process 5191]
[New process 5193]
[New process 5192]
#0 0x000000000059cf9b in checkSockets (set=0x2a0d820, timeout=0) at netplay.c:642
642 netplay.c: No such file or directory.
in netplay.c
(gdb) bt full
#0 0x000000000059cf9b in checkSockets (set=0x2a0d820, timeout=0) at netplay.c:642
tv = {tv_sec = 0, tv_usec = 0}
ret = <value optimized out>
fds = {fds_bits = {0 <repeats 16 times>}}
count = 1
i = 0
maxfd = 43880960
__FUNCTION__ = "checkSockets"
#1 0x000000000059fe20 in NETrecv (type=0x7fff01f32f3f "") at netplay.c:2592
received = 685640
size = 1
current = 0
__FUNCTION__ = "NETrecv"
#2 0x000000000051a8c0 in recvMessage () at multiplay.c:610
type = 0 '\0'
__FUNCTION__ = "recvMessage"
#3 0x000000000051b6c1 in multiPlayerLoop () at multiplay.c:297
joinCount = 0 '\0'
__FUNCTION__ = "multiPlayerLoop"
#4 0x00000000004f2db5 in gameLoop () at loop.c:266
psCurr = <value optimized out>
psNext = <value optimized out>
psCBuilding = <value optimized out>
psNBuilding = <value optimized out>
psCFeat = <value optimized out>
psNFeat = <value optimized out>
i = <value optimized out>
widgval = <value optimized out>
quitting = <value optimized out>
intRetVal = INT_INTERCEPT
clearMode = <value optimized out>
__FUNCTION__ = "gameLoop"
#5 0x00000000004f4e7d in main (argc=<value optimized out>, argv=<value optimized out>) at main.c:659
__FUNCTION__ = "main"
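Editor's note on the backtrace above: `maxfd = 43880960` is far outside what an `fd_set` can hold (POSIX leaves `FD_SET` undefined for descriptors below 0 or at or above `FD_SETSIZE`), which would be consistent with a crash inside `checkSockets()`. The following is only a minimal sketch of a range guard, not the fix that was committed for this ticket; `safe_fd_set` is a hypothetical helper name that does not appear in the Warzone 2100 source.

```c
#include <stdbool.h>
#include <sys/select.h>

/* Hypothetical helper (not from the Warzone 2100 source).
 * Adds fd to the set only if fd_set can represent it, and keeps
 * *maxfd up to date for the later select(maxfd + 1, ...) call.
 * Returns false, leaving the set untouched, for an invalid descriptor. */
static bool safe_fd_set(int fd, fd_set *fds, int *maxfd)
{
    if (fd < 0 || fd >= FD_SETSIZE)
    {
        return false; /* stale or garbage descriptor: skip it */
    }

    FD_SET(fd, fds);
    if (fd > *maxfd)
    {
        *maxfd = fd;
    }
    return true;
}
```

A caller would initialise the set with `FD_ZERO` and `maxfd` with -1, call the helper once per socket, and treat a `false` return as a sign that the socket entry is no longer valid.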
comment:10 Changed 2 years ago by hao
And this time I was hosting, and it crashed again. I was moving the map with right mouse button on the minimap.
error |04:02:36: [socketListen] Failed to set up IPv6 socket for listening on port 2100: Address already in use
error |04:02:36: [NEThostGame] Cannot connect to master self: Address already in use
error |04:02:43: [socketListen] Failed to set up IPv6 socket for listening on port 2100: Address already in use
error |04:02:43: [NEThostGame] Cannot connect to master self: Address already in use
error |04:02:55: [socketListen] Failed to set up IPv6 socket for listening on port 2100: Address already in use
error |04:02:55: [NEThostGame] Cannot connect to master self: Address already in use
error |04:03:00: [socketListen] Failed to set up IPv6 socket for listening on port 2100: Address already in use
error |04:03:00: [NEThostGame] Cannot connect to master self: Address already in use
error |04:03:07: [readLobbyResponse] Lobby error (406): Game unreachable, failed to open a connection to: [87.255.17.177]:2100
error |04:03:07: [socketAccept] accept failed: Invalid argument
last message repeated 2 times
last message repeated 2 times (total 4 repeats)
last message repeated 4 times (total 8 repeats)
last message repeated 8 times (total 16 repeats)
last message repeated 16 times (total 32 repeats)
last message repeated 32 times (total 64 repeats)
last message repeated 22 times (total 86 repeats)
error |04:24:37: [recvDroidMove] Packet from 2 refers to non-existent droid 162002, [Human : p2]
info |04:25:21: [seq_Play] unable to open 'sequences/victory.ogg' for playback
error |04:25:36: [readLobbyResponse] Lobby error (406): Game unreachable, failed to open a connection to: [87.255.17.177]:2100
error |04:25:36: [socketAccept] accept failed: Invalid argument
last message repeated 2 times
last message repeated 2 times (total 4 repeats)
last message repeated 4 times (total 8 repeats)
last message repeated 8 times (total 16 repeats)
last message repeated 16 times (total 32 repeats)
last message repeated 30 times (total 62 repeats)
error |04:48:16: [NETbcast] Failed to send message: Connection reset by peer
>-(/-/Saved dump file to '/tmp/warzone2100.gdmp-GF1Qdm'
If you create a bugreport regarding this crash, please include this file.
Segmentation fault
comment:11 Changed 2 years ago by hao
I have reason to believe that these crashes are caused by someone leaving the game.
comment:12 Changed 2 years ago by hao
At least it almost always crashes when someone is leaving. I don't remember many "[...] has left" messages in 2.3 beta 3 AND 4. <--- the problem still exists!
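Editor's note: the repeated "[socketClose] Failed to close socket: Bad file descriptor" lines in the logs above are what `close(2)` reports when it is handed a descriptor that is already closed or otherwise invalid. Whether that is what happens when a player leaves is not established in this ticket. As a sketch of one common defensive pattern only, the code below closes a descriptor at most once and then marks it invalid; `struct conn`, `conn_close` and `INVALID_FD` are hypothetical names, not taken from the game's netplay code.

```c
#include <stddef.h>
#include <unistd.h>

#define INVALID_FD (-1) /* hypothetical sentinel for "no descriptor" */

/* Hypothetical connection record holding one descriptor. */
struct conn
{
    int fd;
};

/* Close the descriptor at most once. After the call, c->fd is
 * INVALID_FD, so a second call (for example from a duplicate
 * "player left" code path) is a harmless no-op instead of a
 * close() on a stale number. Returns the result of close(),
 * or 0 if there was nothing to close. */
static int conn_close(struct conn *c)
{
    int rc;

    if (c == NULL || c->fd == INVALID_FD)
    {
        return 0;
    }

    rc = close(c->fd);
    c->fd = INVALID_FD;
    return rc;
}
```

Code that later builds an `fd_set` can then skip any record whose `fd` equals `INVALID_FD` instead of passing a stale number to `FD_SET`.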
comment:13 Changed 2 years ago by Giel
- Summary changed from crash in multiplayer: 2.3 beta 3 to FD_SET crash in multiplayer: 2.3 beta 3 (during checkSockets(), aka select(2))
I'm hoping this patch will aid in nailing down the cause. At the very least it should make the backtraces more useful.
lib/netplay/netplay.c
diff --git a/lib/netplay/netplay.c b/lib/netplay/netplay.c
index 2b3a074..ae71917 100644
--- a/lib/netplay/netplay.c
+++ b/lib/netplay/netplay.c
@@ -739,7 +739,11 @@ static int checkSockets(const SocketSet* set, unsigned int timeout)
     for (i = 0; i < set->len; ++i)
     {
         if (set->fds[i])
-            FD_SET(set->fds[i]->fd[SOCK_CONNECTION], &fds);
+        {
+            const SOCKET fd = set->fds[i]->fd[SOCK_CONNECTION];
+
+            FD_SET(fd, &fds);
+        }
     }
 
     ret = select(maxfd + 1, &fds, NULL, NULL, &tv);
comment:14 Changed 2 years ago by Giel
comment:15 Changed 2 years ago by Giel
comment:18 Changed 2 years ago by Git SVN Gateway <gateway@…>
(In [401723958cb61f2110e28ecf20b9622b2c84d3c9]) Make backtraces regarding FD_SET related crashes (see #1136) more useful
git-svn-id: https://warzone2100.svn.sourceforge.net/svnroot/warzone2100/trunk@9950 4a71c877-e1ca-e34f-864e-861f7616d084
comment:16 Changed 2 years ago by Per
- Status changed from new to closed
- Resolution set to fixed
(In [9954]) 2.3: Fix SIGBUS errors when linux box is hosting and players disconnect. Patch reviewed by Buginator and Giel. Closes ticket:1136
