00:10:18 -!- huangjs [~huangjs@190.8.100.83] has quit [Read error: Connection reset by peer]
00:50:20 echo-area [~user@182.92.247.2] has joined #sbcl
01:57:41 kanru`` [~user@61-228-148-2.dynamic.hinet.net] has joined #sbcl
02:03:15 gko [~gko@114-34-168-13.HINET-IP.hinet.net] has joined #sbcl
02:18:49 -!- attila_lendvai [~attila_le@unaffiliated/attila-lendvai/x-3126965] has quit [Quit: Leaving.]
02:41:47 huangjs [~huangjs@190.8.100.83] has joined #sbcl
03:02:15 -!- kanru`` [~user@61-228-148-2.dynamic.hinet.net] has quit [Ping timeout: 260 seconds]
03:58:24 -!- specbot [~specbot@pppoe.178-66-8-192.dynamic.avangarddsl.ru] has quit [Disconnected by services]
03:59:10 specbot [~specbot@pppoe.178-66-69-164.dynamic.avangarddsl.ru] has joined #sbcl
04:01:42 -!- stassats` [~stassats@wikipedia/stassats] has quit [Ping timeout: 252 seconds]
04:01:52 -!- stassats [~stassats@wikipedia/stassats] has quit [Ping timeout: 272 seconds]
04:32:05 stassats [~stassats@wikipedia/stassats] has joined #sbcl
04:50:15 -!- echo-area [~user@182.92.247.2] has quit [Read error: Connection reset by peer]
05:13:18 angavrilov [~angavrilo@217.71.227.190] has joined #sbcl
06:18:45 tcr [~tcr@84-72-21-32.dclient.hispeed.ch] has joined #sbcl
07:07:12 -!- huangjs [~huangjs@190.8.100.83] has quit [Remote host closed the connection]
07:14:32 -!- tcr [~tcr@84-72-21-32.dclient.hispeed.ch] has quit [Ping timeout: 246 seconds]
07:15:38 tcr [~tcr@84-72-21-32.dclient.hispeed.ch] has joined #sbcl
07:29:27 -!- tcr [~tcr@84-72-21-32.dclient.hispeed.ch] has quit [Quit: Leaving.]
07:40:34 nikodemus [~nikodemus@87-95-9-21.bb.dnainternet.fi] has joined #sbcl
07:40:34 -!- ChanServ has set mode +o nikodemus
07:41:39 good morning
07:42:04 nikodemus: morning, any progress on that gc thing?
07:42:16 stassats: i was just about to start looking at it
07:42:32 do you have a good test-case?
07:42:45 ok, i have to run (sb-ext:gc :full t) periodically now
07:42:58 i'll try to sketch something
07:43:20 thanks
07:44:04 i was (am!) sort of hoping yesterday's size_t -> ssize_t fix would have taken care of this as well :)
07:50:51 it did not
07:58:55 just running (defun test (n) (loop repeat n count (evenp (length (make-list 100000))))) repeatedly with some high n, like 10000, does it for me
08:00:08 and bytes-consed-between-gcs is 400M
08:14:14 sdemarre [~serge@91.176.60.75] has joined #sbcl
08:23:59 homie` [~levgue@xdsl-78-35-190-135.netcologne.de] has joined #sbcl
08:26:48 -!- homie [~levgue@xdsl-84-44-179-12.netcologne.de] has quit [Ping timeout: 252 seconds]
08:29:30 thanks
09:07:22 hm, can't replicate on OS X, i'll need to head to the office to test on linux
09:30:18 -!- nikodemus [~nikodemus@87-95-9-21.bb.dnainternet.fi] has quit [Quit: This computer has gone to sleep]
10:41:34 nikodemus [~nikodemus@cs27100107.pp.htv.fi] has joined #sbcl
10:41:34 -!- ChanServ has set mode +o nikodemus
10:51:44 just for kicks i tried it on CCL, the RES memory stays low and constant, even better than what we had before the regression
10:55:00 attila_lendvai [~attila_le@178-164-242-247.pool.digikabel.hu] has joined #sbcl
10:55:00 -!- attila_lendvai [~attila_le@178-164-242-247.pool.digikabel.hu] has quit [Changing host]
10:55:00 attila_lendvai [~attila_le@unaffiliated/attila-lendvai/x-3126965] has joined #sbcl
10:56:15 *stassats* got a hunch and eagerly awaits the rebuild
11:02:43 stassats: http://paste.lisp.org/display/129200
11:02:54 does that show it for you?
11:03:26 RES stays below 500m for me
11:03:46 hm. x86 or x86-64?
11:03:53 the latter, of course
11:04:35 does it make RES climb for you?
11:05:08 it does
11:06:05 strange. i've had it running for 5min now, and RES hovers between 400m and 480m or so
11:06:12 and below 500m isn't really a good indicator, before it was just around 100M
11:06:23 it just passed 1024M here
11:06:36 Linux solipsist 3.0.0-17-generic #30-Ubuntu SMP Thu Mar 8 20:45:39 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux
11:06:55 if i leave bytes-consed-between-gcs out, it stays around 100m
11:07:32 i don't think that linux matters here, since i inserted a print statement into where it releases pages, and it never executes
11:07:38 SBCL 1.0.56.51-1a104ef
11:07:43 so, the logic is borked somewhere
11:08:12 i believe that, but something funky is going on since i can't reproduce it
11:08:33 *nikodemus* calls out for volunteers with fresh sbcls to run that test
11:08:49 what's your dynamic-space-size?
11:09:51 8G
11:10:07 let me try with that
11:10:21 4G on the laptop, same thing
11:10:31 bytes-consed-between-gcs is untouched, computed from that
11:11:48 ok, now i see it going up!
11:12:21 time to instrument the runtime
11:13:20 good to hear
11:16:40 the release happens here: https://github.com/sbcl/sbcl/blob/master/src/runtime/gencgc.c#L3859
11:17:03 so, apparently, generations never get too old for it to be triggered
11:17:48 yeah, it never goes beyond gen 1 in that loop
11:18:34 did you bisect this to the "more aggressive" commit, or was that an assumption?
11:18:38 what if the memory was released every time? how big a hit would it be
11:19:25 i checked the commit before it, it was fine, and the commit before the tune up, which didn't work
11:19:32 ok
11:19:34 didn't do the proper bisection
11:20:08 i suspect it might not cost too much if we did release every time, but i want to figure out what causes the regression first
11:20:24 is zeroing pages the only way to release memory?
11:20:43 i wonder how ccl does it
11:21:35 so the commit before "gencgc: reclaim space more aggressively" was fine?
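[editor's note: the reproduction discussed above, collected into one runnable sketch. The TEST definition (07:58) and the 400M bytes-consed-between-gcs figure (08:00) come straight from the conversation; running it in an endless loop and watching top(1) is illustrative, not something the log prescribes.]

```lisp
;; Reproduction sketch for the RES-growth regression discussed above.
(setf (sb-ext:bytes-consed-between-gcs) (* 400 1024 1024)) ; "400M"

(defun test (n)
  ;; cons 100k-element lists that become garbage immediately
  (loop repeat n
        count (evenp (length (make-list 100000)))))

;; Run with some high n and watch RES in top(1); on affected builds
;; resident memory climbs instead of being handed back to the OS:
;; (loop (test 10000))
```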
11:21:42 yes
11:22:34 and the one before "tune the recent "more aggressive GC" changes" was bad
11:22:58 there's a possibility that something broke it in-between, but those two seem like the ones to cause it
11:24:50 and i remember that oom killed my sbcl on april 16, so there were no commits touching gencgc before those dates
11:28:38 well, i could just test it in the same time it took to type all this out
11:30:19 i'm bisecting while reading code and thinking
11:31:12 i think i have a clue
11:31:18 *nikodemus* gets a fresh tree
11:31:20 ok, the commit before it is right, the commit itself is wrong
11:32:18 the commit messes up the logic surrounding
11:32:19 gen_to_wp = gen;
11:32:52 -!- attila_lendvai [~attila_le@unaffiliated/attila-lendvai/x-3126965] has quit [Quit: Leaving.]
11:40:09 always reclaiming pages would make sense for people on rented servers where memory is limited
11:40:48 but even on a regular desktop, allowing the OS to use all free memory never hurts
11:41:48 except that it makes nursery collections more expensive
11:42:33 doing it for all non-nursery collections might be OK, though
11:43:28 i usually have several sbcls running, 100M here, 100M there, and you don't have enough memory for caches and stuff
11:43:41 (i guess i should stop complaining and install 32G of memory)
11:44:02 or else i won't be able to justify it
11:56:57 saschakb [~saschakb@p4FEA0AC8.dip0.t-ipconnect.de] has joined #sbcl
12:04:28 -!- froydnj_ [nfroyd@people.mozilla.com] has quit [Ping timeout: 272 seconds]
12:20:34 wtf? i instrumented all the behavioural changes from those commits, but none of them occur
12:20:46 so how does it change things?
12:21:04 *nikodemus* senses the presence of something deeply bogus
12:31:12 -!- homie` [~levgue@xdsl-78-35-190-135.netcologne.de] has quit [Quit: ERC Version 5.3 (IRC client for Emacs)]
12:32:51 -!- nikodemus [~nikodemus@cs27100107.pp.htv.fi] has quit [Ping timeout: 244 seconds]
12:33:25 nikodemus [~nikodemus@dsl-hkibrasgw4-fe53dc00-32.dhcp.inet.fi] has joined #sbcl
12:33:25 -!- ChanServ has set mode +o nikodemus
12:34:26 OOM...
12:40:33 froydnj [nfroyd@people.mozilla.com] has joined #sbcl
12:43:05 homie [~levgue@xdsl-78-35-190-135.netcologne.de] has joined #sbcl
13:10:50 Kryztof [~user@81.174.155.115] has joined #sbcl
13:10:55 stassats: i need another test-case to find the regression
13:11:25 eg. 26d8f7707843aba4ba2a071b3b2d4c91e8c0d798 already shows RES going over 500m using the test-case i have
13:12:22 (i'll fix the issue as shown by this test-case anyway, but i still want to figure out what i broke...)
13:20:50 saschakb_ [~saschakb@p4FEA0AC8.dip0.t-ipconnect.de] has joined #sbcl
13:39:17 -!- gko [~gko@114-34-168-13.HINET-IP.hinet.net] has quit [Ping timeout: 265 seconds]
13:52:20 milanj [~milanj_@93-87-100-199.dynamic.isp.telekom.rs] has joined #sbcl
13:52:41 as i thought: remapping on every GC is clearly more expensive
13:53:37 slows down my benchmark from 320k iterations/sec to 270k/s
13:55:18 kwmiebach [~kwmiebach@164-177-155-66.static.cloud-ips.co.uk] has joined #sbcl
14:08:56 gko [~gko@114-34-168-13.HINET-IP.hinet.net] has joined #sbcl
14:08:57 -!- ASau` [~user@95-25-227-191.broadband.corbina.ru] has quit [Ping timeout: 244 seconds]
14:12:31 ASau` [~user@95-25-227-191.broadband.corbina.ru] has joined #sbcl
14:19:27 LiamH [~healy@pool-74-96-18-66.washdc.east.verizon.net] has joined #sbcl
14:20:20 stassats: doing it for every third nursery collection is better, but still a 3% performance hit
14:22:15 stassats: did you have another test case? something that doesn't just cons things that are immediately garbage?
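[editor's note: a rough way to reproduce iterations-per-second figures like the 320k/s vs 270k/s quoted above. This harness is a sketch only; the function name and structure are not from the log, and the actual benchmark used in the channel is not shown.]

```lisp
;; Call FN N times and return approximate calls per second.
;; Uses only standard CL timing facilities; the MAX guards against a
;; zero elapsed time on very fast runs.
(defun iterations-per-second (n fn)
  (let ((start (get-internal-real-time)))
    (dotimes (i n)
      (funcall fn))
    (float (/ n (max 1/1000
                     (/ (- (get-internal-real-time) start)
                        internal-time-units-per-second))))))

;; e.g. (iterations-per-second 100000 (lambda () (make-list 100)))
```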
14:45:03 -!- tsuru` [~charlie@adsl-98-87-45-242.bna.bellsouth.net] has quit [Read error: Connection reset by peer]
14:48:42 ok. i think the way to fix this is to keep a count of pages eligible for handing back to the OS, and trigger on that count directly. "size of GC", "N GCs", "bytes released" are all usable heuristics, but not ideal
14:48:55 i don't think keeping such a count should be too expensive
14:54:26 -!- saschakb_ [~saschakb@p4FEA0AC8.dip0.t-ipconnect.de] has quit [Remote host closed the connection]
15:04:05 leuler [~user@p5490505E.dip.t-dialin.net] has joined #sbcl
15:23:07 -!- milanj [~milanj_@93-87-100-199.dynamic.isp.telekom.rs] has quit [Quit: Leaving]
15:26:36 lichtblau [~user@port-92-195-61-68.dynamic.qsc.de] has joined #sbcl
15:29:49 huangjs [~huangjs@190.8.100.83] has joined #sbcl
15:42:08 -!- scymtym [~user@2001:638:504:2093:226:b9ff:fe7d:3e1f] has quit [Ping timeout: 245 seconds]
15:45:45 Quadrescence [~quad@unaffiliated/quadrescence] has joined #sbcl
15:50:11 -!- Quadrescence [~quad@unaffiliated/quadrescence] has quit [Read error: Operation timed out]
15:52:14 -!- nikodemus [~nikodemus@dsl-hkibrasgw4-fe53dc00-32.dhcp.inet.fi] has quit [Ping timeout: 255 seconds]
16:05:27 nikodemus [~nikodemus@cs78186070.pp.htv.fi] has joined #sbcl
16:05:27 -!- ChanServ has set mode +o nikodemus
16:29:29 gabnet [~gabnet@ACaen-257-1-73-141.w86-220.abo.wanadoo.fr] has joined #sbcl
16:54:44 -!- gabnet [~gabnet@ACaen-257-1-73-141.w86-220.abo.wanadoo.fr] has quit [Quit: Quitte]
17:11:17 -!- LiamH [~healy@pool-74-96-18-66.washdc.east.verizon.net] has quit [Ping timeout: 246 seconds]
17:46:47 LiamH [~healy@pool-74-96-18-66.washdc.east.verizon.net] has joined #sbcl
17:57:54 -!- homie [~levgue@xdsl-78-35-190-135.netcologne.de] has quit [Read error: Connection reset by peer]
17:58:31 homie [~levgue@xdsl-78-35-190-135.netcologne.de] has joined #sbcl
18:08:26 nikodemus: my original test case is (mapcar 'require '(closer-mop cl-ppcre cxml cxml-stp closure-html drakma named-readtables iterate cffi trivial-garbage bordeaux-threads chipz trivial-gray-streams conium prepl osicat command-line-arguments cl-pdf cl-typesetting postmodern alexandria csv-parser ironclad cl-json ht-simple-ajax hunchentoot local-time vecto simple-date cl-who cl-jpeg salza2))
18:28:45 thanks
18:29:07 i'll give that a whirl tomorrow
18:31:19 well, just compile as many systems as you have
18:43:15 is there a way to allocate a lisp object in foreign memory so it isn't garbage collected?
18:43:51 huangjs: use FFI
18:44:35 huangjs: if you're content with simple-arrays there's https://github.com/sionescu/static-vectors/
18:46:02 stassats: well, i want to store things like structs and conses as well
18:46:16 why do you want to do that?
18:46:38 stassats: shared memory with other sbcl processes
18:57:06 stassats: i'm trying to load-balance incoming web requests across several sbcl processes, to increase GC scalability. the problem is that to share the data between processes i need to modify my code to use the FFI (it's short work though :)... i'm wondering if there's an easier way to hack into the allocator and GC.
19:00:14 huangjs: use a DB
19:00:58 -!- Kryztof [~user@81.174.155.115] has quit [Read error: No route to host]
19:01:05 fe[nl]ix: i'm trying to do something with ultimate latency (~ns) :)
19:18:52 huangjs: (1) use stack allocation where possible (2) save a core. objects loaded from the core file aren't moved, and if you have multiple instances they automatically share that memory
19:19:15 with CoW?
19:19:20 yeah
19:19:26 nikodemus: thanks for the advice :) i'm using both techniques
19:19:37 i thought huangjs wanted to use it for communication
19:19:37 nikodemus: but i need some writable data to share
19:19:41 yeah
19:20:03 -!- angavrilov [~angavrilo@217.71.227.190] has quit [Ping timeout: 245 seconds]
19:21:02 huangjs: then if at all possible, represent the data in unboxed form, in vectors of unsigned-byte 8 or similar -- those you can easily share, or use shared mmaps, etc
19:21:21 so the area is PROT_READ | PROT_WRITE
19:21:48 huangjs: if you need to refer to lisp objects from such unboxed storage, have a unique id per object, and store that id in the unboxed storage
19:21:50 nikodemus: the problem is structs...
19:23:19 (one of the things that kills GC performance is long-lived objects that keep being modified -- no problem if you just have a few of them, but massive simple-vectors that keep getting rewritten are poison)
19:25:17 the problem with having shared memory between two processes is that if you put (cons foo bar) in there, then both FOO and BAR need to be moved there as well... and if you do (setf (car that-cons) quux) then QUUX needs to be migrated there too
19:26:21 nikodemus: yeah, you're right
19:26:54 but if you can figure out a packed/unboxed representation, things become very easy
19:27:07 so byte arrays are the way to go
19:27:21 extra benefit: modifying long-lived unboxed objects doesn't hurt GC at all
19:27:59 huangjs: as long as array-element-type is not t, it's unboxed in sbcl
19:28:02 why is rewriting simple-vectors poison?
19:28:55 imagine you have a simple-vector taking up, say, 100 pages, that's survived all the way to the oldest generation
19:29:10 initially it starts out write-protected
19:29:32 ah, i didn't know that
19:29:34 then you write to it, and that page is unprotected
19:30:12 now every GC -- even nursery collections -- needs to scan all those unprotected pages, in case they've gained pointers to young objects
19:30:45 only when that old generation is collected does it get write-protected again -- because then we know that it won't have any references to young objects
19:30:48 this only occurs if it's a simple-vector (simple-array t), right?
19:31:06 any object capable of containing pointers, really
19:31:24 but long-lived hash-tables and simple-vectors are typical culprits
19:31:26 i see.
19:32:26 there's always going to be /some/ old generation scavenging going on when doing a nursery collection, but if you keep dirtying old objects, it can make for pretty expensive nursery collections
19:32:43 easy way to tell how badly you're being hurt by this:
19:32:55 first run (gc) to clean up the nursery
19:33:12 then run (time (gc)) to see how long an empty nursery collection takes
19:33:27 then run (gc :full t) to get everything write-protected again
19:33:31 yeah, some nursery collection latencies are about 200ms ~ 500ms in my app, too long for a web server. i'll check that
19:33:46 then run (time (gc)) to see how much things improve
19:34:26 so i had a plan to disable gc for every 10 requests and do a manual gc :), the load balancer only needs to be aware of which sbcl process is available :)
19:34:57 thanks for the good advice!
19:35:08 don't disable it, just put bytes-consed-between-gcs high
19:35:17 without-gcing is evil
19:35:23 ok
19:36:53 it really bugs me that we're at the point where the best we can do is offer workarounds.
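[editor's note: the four-step write-barrier diagnostic nikodemus described above, collected into one snippet. GC is exported from SB-EXT (and visible in CL-USER at the SBCL REPL); the explicit sb-ext: prefixes are added here for clarity.]

```lisp
;; Compare the two (time (gc)) results: a large drop after the full GC
;; suggests nursery collections were paying to re-scan dirtied old pages.
(sb-ext:gc)            ; 1. clean up the nursery
(time (sb-ext:gc))     ; 2. cost of an "empty" nursery collection
(sb-ext:gc :full t)    ; 3. a full GC write-protects old generations again
(time (sb-ext:gc))     ; 4. should now be noticeably cheaper
```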
we'd really need one or two new GCs for different domains -- one that doesn't stop the world, at least, and maybe an incremental one as well
19:37:54 i was asking about hugepage support in sbcl yesterday; so keeping the oldest gen in small pages makes more sense?
19:38:17 incremental sounds very attractive.
19:38:17 i missed that
19:38:36 incremental doesn't typically have great throughput, though
19:39:01 but there's at least one realtime gc for java whose paper looked interesting
19:39:05 hairy, but... interesting
19:39:31 yeah, i was reading pkhuong's blog yesterday... and the blog of Vitaly Mayatskikh, just got curious :)
19:39:44 the DB idea might work out as well, actually, if it allows you to write modifications to young objects, serialize them back to disk (or memcached, or whatever), and let them become garbage
19:39:48 about the TLB cache improvement using huge pages
19:40:43 paul also has some tentative code for software write barriers, which could help with the cost of modifications to old objects
19:41:08 we should clone paul, really
19:41:18 :)
19:42:17 i moved from Japan to Chile 3 weeks ago btw, now back to lisp hacking. and now more time on irc and the mailing list :)
19:42:53 were you working at MSI?
19:42:57 yeah
19:43:05 what's up in chile?
19:43:41 i'm doing a startup with friends, there's a platform for startups here.
19:44:59 neat :)
19:47:00 -!- nikodemus [~nikodemus@cs78186070.pp.htv.fi] has quit [Quit: This computer has gone to sleep]
20:14:06 -!- slyrus_ [~chatzilla@99-28-161-110.lightspeed.miamfl.sbcglobal.net] has quit [Ping timeout: 252 seconds]
20:55:48 nikodemus [~nikodemus@cs78186070.pp.htv.fi] has joined #sbcl
20:55:48 -!- ChanServ has set mode +o nikodemus
20:58:26 attila_lendvai [~attila_le@178-164-242-247.pool.digikabel.hu] has joined #sbcl
20:58:26 -!- attila_lendvai [~attila_le@178-164-242-247.pool.digikabel.hu] has quit [Changing host]
20:58:27 attila_lendvai [~attila_le@unaffiliated/attila-lendvai/x-3126965] has joined #sbcl
20:58:47 -!- sdemarre [~serge@91.176.60.75] has quit [Ping timeout: 246 seconds]
21:19:29 homie` [~levgue@xdsl-84-44-153-69.netcologne.de] has joined #sbcl
21:22:25 -!- homie [~levgue@xdsl-78-35-190-135.netcologne.de] has quit [Ping timeout: 246 seconds]
21:39:31 scymtym [~user@2001:638:504:2093:226:b9ff:fe7d:3e1f] has joined #sbcl
21:44:20 -!- scymtym [~user@2001:638:504:2093:226:b9ff:fe7d:3e1f] has quit [Read error: Connection reset by peer]
21:44:35 scymtym [~user@2001:638:504:2093:226:b9ff:fe7d:3e1f] has joined #sbcl
21:50:53 -!- homie` [~levgue@xdsl-84-44-153-69.netcologne.de] has quit [Quit: ERC Version 5.3 (IRC client for Emacs)]
21:52:43 -!- leuler [~user@p5490505E.dip.t-dialin.net] has quit [Quit: ERC Version 5.1.2 $Revision: 1.796.2.6 $ (IRC client for Emacs)]
21:57:01 homie [~levgue@xdsl-84-44-153-69.netcologne.de] has joined #sbcl
21:57:35 -!- LiamH [~healy@pool-74-96-18-66.washdc.east.verizon.net] has quit [Ping timeout: 246 seconds]
22:13:51 LiamH [~healy@vpn219118.nrl.navy.mil] has joined #sbcl
22:18:29 -!- LiamH [~healy@vpn219118.nrl.navy.mil] has quit [Ping timeout: 248 seconds]
22:20:17 prxq [~mommer@mnhm-5f75cb68.pool.mediaWays.net] has joined #sbcl
22:22:51 nikodemus: incremental gc could be another kickstarter project
22:29:28 the gc has so many implicit assumptions, it's not easy to touch it
22:29:57 maybe it should be written in a clearer manner
22:30:06 or i just can't read C
22:30:55 the lack of a REPL, arglists, and M-. certainly doesn't help
22:31:17 a gc that plays better with concurrency would be great
22:31:32 a precise gc wouldn't hurt either
22:34:18 hi
22:35:16 I'm running sbcl 1.0.56, and I've got a problem with defpackage: on 1.0.55 functions in the package I wrote are found with find-symbol, on 1.0.56 not all symbols are found
22:36:15 LiamH [~healy@pool-74-96-18-66.washdc.east.verizon.net] has joined #sbcl
22:36:57 i doubt it's an sbcl problem, ask in #lisp
22:52:21 -!- kwmiebach [~kwmiebach@164-177-155-66.static.cloud-ips.co.uk] has quit [Read error: Connection reset by peer]
22:52:53 -!- LiamH [~healy@pool-74-96-18-66.washdc.east.verizon.net] has quit [Ping timeout: 246 seconds]
22:53:06 kwmiebach [~kwmiebach@164-177-155-66.static.cloud-ips.co.uk] has joined #sbcl
23:09:19 -!- prxq [~mommer@mnhm-5f75cb68.pool.mediaWays.net] has quit [Quit: Leaving]
23:24:37 -!- nikodemus [~nikodemus@cs78186070.pp.htv.fi] has quit [Quit: Leaving]
23:39:26 -!- kwmiebach [~kwmiebach@164-177-155-66.static.cloud-ips.co.uk] has quit [Read error: Connection reset by peer]
23:39:46 kwmiebach [~kwmiebach@164-177-155-66.static.cloud-ips.co.uk] has joined #sbcl