00:02:02 well, if I run the GC first thing in my image, it breaks already. ~
00:03:19 even w/o parameter tweak, (ccl::gc) hoses my image.
00:03:38 whatever gc-unsafe operation happened, it did before the image was saved.
00:03:42 :-~
00:26:13 -!- alms_ [~alms_@209-6-130-32.c3-0.bkl-ubr1.sbo-bkl.ma.cable.rcn.com] has quit [Quit: alms_]
00:27:23 alms_ [~alms_@209-6-130-32.c3-0.bkl-ubr1.sbo-bkl.ma.cable.rcn.com] has joined #ccl
00:38:53 -!- PuffTheMagic [uid3325@gateway/web/irccloud.com/x-amtifxrjqrsmhfka] has quit [Read error: Operation timed out]
00:39:19 PuffTheMagic [uid3325@gateway/web/irccloud.com/x-kvctxjynbktghrps] has joined #ccl
01:05:55 Might try enabling GC checks during compilation/loading.
01:16:33 -!- sellout- [~Adium@c-98-245-92-119.hsd1.co.comcast.net] has quit [Read error: Connection reset by peer]
01:17:23 sellout- [~Adium@c-98-245-92-119.hsd1.co.comcast.net] has joined #ccl
01:41:20 -!- segv- [~mb@dslb-188-102-168-176.pools.arcor-ip.net] has quit [Remote host closed the connection]
02:02:00 interestingly, I tried: (1) dumping the image without running the pre-dump hooks (2) running the hooks one by one in the new session and see where it breaks -- it didn't break.
02:02:26 next I can try having a (break) before dump, and see what causes the break (if anything)
02:02:52 I will clean up some of the redundant crap in that part of our code, and try again later tonight.
02:03:04 so... there's something fishy at work.
02:03:49 I don't believe we run any FFI or multithreaded code before this corruption happens.
02:04:16 -!- DataLinkDroid [~DataLinkD@1.146.119.134] has quit [Quit: Bye]
02:05:14 The integrity-checking stuff that I mentioned earlier can be really helpful. It'll often notice problems long before they manifest themselves as lisp errors.
02:11:37 ok
02:11:41 thanks a lot
03:33:07 erikc [~erikc@CPE00222d53fe78-CM00222d53fe75.cpe.net.cable.rogers.com] has joined #ccl
04:32:59 -!- erikc [~erikc@CPE00222d53fe78-CM00222d53fe75.cpe.net.cable.rogers.com] has quit [Quit: erikc]
05:46:51 meh. Some twiddling I did made the problem disappear... only to happen at a different place later. Also a corruption in a slot typechecker, though.
05:47:34 Vector at 0x302004a3a8dd has bogus header: 0x0
05:51:29 if I run a (ccl:gc) in a fresh image from slime, I get a connection closed.
05:53:18 that's at the slime repl. In the inferior lisp buffer, (ccl:gc) works
05:59:22 and all the lisps in my test farm eventually succumbed to this untimely death :-/
06:10:52 -!- Fare [fare@nat/google/x-jspzlbgpuuuqslvb] has quit [Quit: Leaving]
06:41:04 -!- gz [~gz@setf.clozure.com] has quit [Quit: Movin' on]
06:41:32 gz [~gz@setf.clozure.com] has joined #ccl
07:07:19 -!- rme [~rme@50.43.190.179] has quit [Quit: rme]
07:58:46 DataLinkDroid [~DataLinkD@CPE-121-217-7-89.lnse1.cht.bigpond.net.au] has joined #ccl
10:39:40 -!- |3b|` is now known as |3b|
11:22:29 hydan [90a0e235@gateway/web/freenode/ip.144.160.226.53] has joined #ccl
11:30:05 -!- DataLinkDroid [~DataLinkD@CPE-121-217-7-89.lnse1.cht.bigpond.net.au] has quit [Quit: Bye]
12:48:07 Fare [fare@nat/google/x-ykxnduexfmcgirpu] has joined #ccl
12:49:36 interestingly, I have another image where I seem to be able to trigger the failure deterministically
12:50:15 (ccl:gc) is OK when we start, OK when we start the server, and dies when we run that test
12:54:14 I'll try to dichotomy the test into locating a triggering event
12:54:25 (also bug in a CTYPE, but in a different class)
13:21:25 interesting... the saved image doesn't bug out on a gc, but the type is already hosed on that particular class
13:30:54 problem being, since the corruption already happened, it's too late to dichotomize when it happened.
13:32:55 billstclair [~billstcla@unaffiliated/billstclair] has joined #ccl
13:33:43 let's try wholly disabling all our CLOS optimizations and see if that help
13:34:11 when rearranging the code, I may have enabled a bad optimization or changed some variable driving it
13:48:55 is there a way to test all these type predicates and find whether one is bad?
13:49:06 I'll find it
14:02:02 so I'll try funcalling all of them
14:57:31 Disabling the CLOS optimizations doesn't help
14:58:18 triggering the bug isn't fully deterministic (or deterministic based on factors modified by my twiddling around)
14:58:43 but I can see how type-predicate slots are getting corrupted
14:59:16 I can't swear that they are the first thing to be corrupted, of course, but they are getting corrupted.
15:00:15 not always for the same classes, but some are more likely than others to be corrupted
15:01:17 erikc [~erikc@209.20.28.194] has joined #ccl
15:15:26 -!- sellout- [~Adium@c-98-245-92-119.hsd1.co.comcast.net] has quit [Quit: Leaving.]
15:18:22 Blkt [~user@82.84.159.26] has joined #ccl
15:54:31 rme [~rme@50.43.190.179] has joined #ccl
16:34:38 -!- hydan [90a0e235@gateway/web/freenode/ip.144.160.226.53] has quit [Ping timeout: 245 seconds]
18:43:23 -!- Fare [fare@nat/google/x-ykxnduexfmcgirpu] has quit [Quit: Leaving]
18:55:05 Fare [fare@nat/google/x-nxrqgopafodzvroi] has joined #ccl
19:52:24 -!- Fare [fare@nat/google/x-nxrqgopafodzvroi] has quit [Ping timeout: 264 seconds]
20:44:52 dioxirane [~ln@unaffiliated/dioxirane] has joined #ccl
20:55:10 -!- dioxirane [~ln@unaffiliated/dioxirane] has quit [Quit: leaving]
21:07:00 dioxirane [~dioxirane@unaffiliated/dioxirane] has joined #ccl
21:36:36 -!- erikc [~erikc@209.20.28.194] has quit [Quit: erikc]
21:37:01 -!- dioxirane [~dioxirane@unaffiliated/dioxirane] has quit [Quit: leaving]
22:18:59 DataLinkDroid [~DataLinkD@1.149.190.29] has joined #ccl
22:49:56 -!- dmiles_afk [~dmiles@c-71-237-234-93.hsd1.or.comcast.net] has quit [Ping timeout: 256 seconds]
23:08:11 erikc [~erikc@CPE00222d53fe78-CM00222d53fe75.cpe.net.cable.rogers.com] has joined #ccl
23:17:38 sellout- [~Adium@75-147-19-61-NewEngland.hfc.comcastbusiness.net] has joined #ccl
23:45:24 -!- sellout- [~Adium@75-147-19-61-NewEngland.hfc.comcastbusiness.net] has quit [Read error: Connection reset by peer]
23:46:33 sellout- [~Adium@75-147-19-61-NewEngland.hfc.comcastbusiness.net] has joined #ccl