00:25:01 -!- flip214 [~marek@unaffiliated/flip214] has quit [Ping timeout: 240 seconds]
00:29:38 flip214 [~marek@unaffiliated/flip214] has joined #sbcl
02:38:00 -!- loke [~elias@bb115-66-85-121.singnet.com.sg] has quit [Ping timeout: 240 seconds]
02:41:09 loke [~elias@bb115-66-85-121.singnet.com.sg] has joined #sbcl
04:12:39 -!- loke [~elias@bb115-66-85-121.singnet.com.sg] has quit [Read error: Operation timed out]
04:15:53 loke [~elias@bb115-66-85-121.singnet.com.sg] has joined #sbcl
06:24:49 sdemarre [~serge@91.176.28.90] has joined #sbcl
08:39:54 Blkt [~user@89-96-199-46.ip13.fastwebnet.it] has joined #sbcl
09:03:43 nikodemus [~nikodemus@cs181056239.pp.htv.fi] has joined #sbcl
09:03:43 -!- ChanServ has set mode +o nikodemus
09:36:54 <Blkt> good morning everyone
10:21:59 akovalen` [~anton@95.73.122.127] has joined #sbcl
10:23:28 -!- akovalenko [~anton@95.73.216.147] has quit [Ping timeout: 244 seconds]
10:45:39 -!- Blkt [~user@89-96-199-46.ip13.fastwebnet.it] has quit [Quit: brb]
10:51:18 Blkt [~user@89-96-199-46.ip13.fastwebnet.it] has joined #sbcl
11:19:07 -!- nikodemus [~nikodemus@cs181056239.pp.htv.fi] has quit [Quit: This computer has gone to sleep]
11:22:37 nikodemus [~nikodemus@cs181056239.pp.htv.fi] has joined #sbcl
11:22:37 -!- ChanServ has set mode +o nikodemus
11:51:23 -!- akovalen` is now known as akovalenko
13:09:41 -!- akovalenko [~anton@95.73.122.127] has quit [Quit: ERC Version 5.3 (IRC client for Emacs)]
13:13:15 akovalenko [~anton@95.73.122.127] has joined #sbcl
13:43:55 nyef [~nyef@c-174-63-105-188.hsd1.ma.comcast.net] has joined #sbcl
13:47:35 <Krystof> so one of the things I discovered at eclm was that there is a thriving Russian SBCL community, all of whom basically curse Western SBCL hegemonists on Russian web forums
13:47:37 <Krystof> true? :-)
13:48:24 *akovalenko* is probably a member of "Russian SBCL community", whatever it is, but never cursed Western SBCL hegemonists _publicly_ :)
13:49:25 *nyef* is ready to curse Western SBCL hegemonists. Today, mostly over dynamic-extent support.
13:50:17 <akovalenko> apparently, nyef understands what I mean...
13:50:51 <nyef> All of this cool stuff, and none of it suitable for running on non-x86oid targets.
13:51:35 <nyef> It really seems like linux on x86oids is the ONLY target that gets anything resembling real attention.
13:51:59 <nyef> (And even that is dubious at times, given the warnings emitted while compiling the runtime.)
14:03:30 lichtblau [~user@port-92-195-43-239.dynamic.qsc.de] has joined #sbcl
14:04:18 <nyef> It also seems to me that two major build-time options added to the x86-64 backend would allow for far more comprehensive testing of cross-platform-dependent behavior on such boxes.
14:04:49 <nyef> One is the alpha-model 32-bit boxed words.
14:05:22 <nyef> And the other is a partitioned register set and separating the C stack from the control stack.
14:06:20 <nyef> That might actually be three options.
14:13:27 <nikodemus> they all sound good to me
14:15:14 <nyef> Mmm. One of the attractions for me is that I have an x86-64 that builds about five times faster than my PPC.
14:16:23 <nyef> Still, a lot of work.
14:18:23 <nyef> As soon as you split the control stack, for example, you need to sort out D-X. And add a pile of stuff to cover having an actual number stack. And then sort out a lot of reader conditionals that are just plain wrong...
14:18:25 homie [~levgue@xdsl-78-35-146-223.netcologne.de] has joined #sbcl
14:45:30 <redline6561> If you could market it correctly, you might be able to get crowd funding. :) Though you might need to wait til nikodemus' patches start to land.
14:45:55 <nyef> I don't have /time/ for much of this.
14:46:23 <redline6561> Ah.
14:47:17 <nyef> And it's largely stuff to make it easier to support some of the more unusual build targets without having more developers testing on the real hardware.
14:48:39 <nyef> It wouldn't surprise me in the least to find that a fifth or more of the commits for the next release were fixing problems with unusual build targets like PPC Linux, x86oid OSX, and the like.
14:49:45 <nyef> And we know of one bug that is latent on most x86oid boxes, and actually triggered on one of mine.
14:53:54 <nyef> In lutexes, are the "gen" and "live" slots treated as boxed fixnums or as unboxed integers?
14:55:23 <nikodemus> not sure. lutexes are all gone here :)
14:56:02 <nyef> No lutexes at all?
14:56:14 <nyef> Meaning no more SLAD problems on osx?
14:56:31 <nikodemus> no lutexes at all. no SLAD trouble
14:56:36 <nyef> Sweet.
14:56:49 <nyef> That just leaves instances and bignums as my worry cases.
14:57:15 <nikodemus> no fairness either, though. my forward port of the fair spinlocks has a bug that i haven't found yet, so i'm resting that for a while in order to look at it with fresh eyes
14:57:43 <nyef> Careful. It's sometimes taken me /years/ to spot something with that approach.
14:57:52 <nyef> (Happened with the win32 port, for example.)
15:00:06 <nikodemus> remains to be seen...
15:00:47 <nikodemus> if i don't find it, happily enough there is little enough code that i can rewrite it from scratch or implement another kind of fair locks without taking forever
15:01:10 <nyef> That's good.
15:02:37 leuler [~user@p54902030.dip.t-dialin.net] has joined #sbcl
15:19:05 <nyef> Two more of those damned #!+#.(cl:if (cl:= 64 sb!vm:n-word-bits) ...) conditionals down!
15:23:26 <lichtblau> nikodemus: where can I read about the algorithm that you have implemented?
15:25:38 <nikodemus> it's the same as in linux kernel
15:26:14 <akovalenko> i.e. "use the source, Luke" :)
15:26:27 <nikodemus> https://lwn.net/Articles/267968/ # has a short piece
15:27:32 <nyef> Can anyone explain the logic involved in dx-combination-p and known-dx-combination-p?
15:27:56 <lichtblau> So there is no other published information on this approach?
15:28:44 <lichtblau> I was hoping for something with nice graphics.  And pseudo-code.  And even more graphics! :-)
15:30:51 <Krystof> Western SBCL hegemony needs no graphics
15:31:32 <nikodemus> https://github.com/nikodemus/SBCL/commit/324ca2a3225bd311f13b81bb04c9960b13bcef0e#src # has it in my words
15:32:31 <nikodemus> nyef: let me see if i can page that back in
15:32:41 <nyef> Nevermind, I think I got the critical bit.
15:32:52 <nyef> I'm a lot happier with the DX infrastructure now.
15:32:55 <lichtblau> I suppose it's likely that my understanding of what you are doing is entirely flawed, so please bear with me when I'm asking dumb questions...
15:33:36 <lichtblau> I always thought that when waiting on locks/mutexes from userland, you need to interact with the kernel's scheduler in some way or another, whereas the kernel's requirements are different (it knows what it's doing, and what it currently isn't scheduled to do), so it knows when use of spinlocks is OK, and the only remaining question are the details of that approach.
15:34:55 <nikodemus> lichtblau: i implement waiting using nanosleep: spin a bit, if the lock remains busy, start sleeping for small amounts of time, increasing length slowly to some limit
15:34:59 <lichtblau> So what 1. latency and 2. what CPU overhead can I expect when (say) using the new mutexes in favour of futexes on Linux?  There has to be a noticeable change in latency, no?
15:35:10 <akovalenko> nikodemus: PAUSE / REP NOP in a spin loop could be beneficial
15:35:52 <akovalenko> anyway, having a vop for those won't hurt (if there is no such thing yet)
15:36:14 <nikodemus> we have a vop for them. spin-loop-hint
15:36:32 <lichtblau> And you mention trying benchmarks that indicate only improvements (on Darwin); are those benchmarks available?  (Or if not, can you describe what they are testing?)
15:37:07 <nikodemus> haven't benchmarked against futexes -- and linux build will by default keep using them for sure -- but in my tests fair spinlocks + nanosleep backoff outperform darwin pthread_mutexes consistently... so clearly it isn't too terrible
15:37:24 <nikodemus> i demoed them at sbcl 10
15:37:34 <nyef> ... and have you upgraded to xcode 4 on darwin yet?
15:38:11 <nikodemus> let me see where they are...
15:38:46 <lichtblau> I'm not using Darwin at all, much less SBCL on Darwin, so I have no idea whether the existing pthread/lutex-based behaviour isn't just insanely bad. :-)  It's a comparison that doesn't mean much to me.
15:38:51 *akovalenko* used to think that MONITOR+MWAIT will stop all user-level spinlock efforts forever. Now I suspect I was wrong..
15:41:07 <nikodemus> well, if we could assume SSE3... maybe
15:41:32 <nyef> ... backend-subfeature?
15:45:30 <nikodemus> lichtblau: can't find them just now -- i haven't been running them recently so it might i don't actually have a copy on this machine. mostly they were pretty simple: make a bunch of threads and set them contending on a bnch of locks. different scenarios: no contention vs low contention vs heavy contention (many more threads than cores) and different tasks: algorithmic tasks of various duration, and some syscalls
15:45:39 <nikodemus> then all pairs on those lists, pretty much
15:46:54 antgreen [~user@bas3-toronto06-2925098629.dsl.bell.ca] has joined #sbcl
16:17:06 -!- Blkt [~user@89-96-199-46.ip13.fastwebnet.it] has quit [Quit: cya]
16:18:13 <lichtblau> I don't think I understand things. :-)  Please explain.
16:18:26 <lichtblau> Suppose thread A waits for a mutex held by thread B, which releases the mutex after 60 seconds.  During that minute, thread A would be scheduled (on average) every 2.7 ms, a total of 21 thousand times.
16:18:39 <lichtblau> Is that calculation right, or am I misunderstanding the algorithm entirely?
16:37:34 <nikodemus> you're leaving out the bit where the thread goes to sleep when the lock is busy
16:39:44 <nikodemus> (i have a version where ticket numbers are used to estimate how long to sleep, but even just the fairly stupid backoff in that version i linked above means the busy waiter is really sleeping most of the time)
16:40:57 <nikodemus> inferior to linux OS-based scheduling using futexes, obviously
16:49:17 <lichtblau> hmm, I thought I had counted the sleeping properly.  It starts at 1e4 ns, goes up exponentially to 1e7 ns, and then cycles back to 1e4 ns.  That's 4 sleeps each 1.111e7 ns, or 360 times the thread get scheduled per second, giving the numbers I quoted above.
16:49:41 <lichtblau> Those context switches (ideally between threads, possibly between processes if other stuff is going on) have got to hurt, no?
16:54:24 <nikodemus> oh, i wasn't looking at the exact times. the backoff is pretty different in my current tree even without using ticket numbers for estimates
16:54:31 Qworkescence [~quad@unaffiliated/quadrescence] has joined #sbcl
16:54:44 <lichtblau> okay...
16:54:51 <nikodemus> since you didn't refer to sleeping i though you were estimating just the number of regular scheduling
16:55:56 <lichtblau> yeah, no, I meant the number of times it would get scheduled in that situation, because each block of TRYs would fail, and it would go back to sleep before being scheduled again after sleeping.
16:56:16 <lichtblau> My aserve instances typically have 100 threads that usually idle around, and wait for notification that they suddenly ought to do something.
16:56:44 <lichtblau> Now, I wouldn't ever run aserve on SBCL at all, because its code is lousy on anything except allegro :-), but if I did, some thread would get scheduled every 0.027 ms. -- That's with numbers from your older code though, I suppose.
16:58:57 <nikodemus> yes, IIRC in my current tree the wait maxes out at 100ms at so, and doesn't shrink back down
17:00:17 <nikodemus> but using the ticket number to estimate is way better ... once i get it to work right: thread checks the number of tickets that have been released since it last woke, and uses that to estimate how long to sleep
17:01:13 <pkhuong> yeah, if you're fair, people tell me proportional backoff works better for throughput
17:01:29 <pkhuong> and that you likely want a queue lock too ;)
17:02:20 -!- nikodemus [~nikodemus@cs181056239.pp.htv.fi] has left #sbcl
17:02:31 nikodemus [~nikodemus@cs181056239.pp.htv.fi] has joined #sbcl
17:02:31 -!- ChanServ has set mode +o nikodemus
17:02:43 <nikodemus> bah, closed the wrong tab
17:07:22 <pkhuong> other option: I have some code that only handles sleeping for a while/waking up and it looks like it's safe wrt interrupts and signals.
17:07:56 <nikodemus> the event-thing?
17:08:24 <pkhuong> yeah.
17:08:34 <nikodemus> i keep meaning to read it properly
17:09:05 <nikodemus> so, without using estimates my current tree consumes around 0.3% CPU per sleeping thread. not terrible, but not great if there are 300 threads idling...
17:09:35 <pkhuong> how much address space is that?
17:09:43 <nikodemus> i'll see how low i can crank it with the proportional backoff
17:09:51 <nikodemus> depends on control-stack-size
17:10:00 <pkhuong> default settings?
17:10:17 <pkhuong> Around 3 GB? Not huge enough not to care (:
17:10:20 <nikodemus> 600mb and change
17:10:55 <pkhuong> really? with guard page, the specials table, etc?
17:10:59 <akovalenko> dynamic + alien + control stacks...
17:11:13 <akovalenko> by default, 2M each, IIRC
17:11:38 <nikodemus> control stack default to 2Mb, the rest can't more than double that i assume -- so 1.2Gb would be my conservative estimate
17:12:46 <lichtblau> being able to customize at least the control stack size is quite important for 32 bit processes; not so much for 64 bit...
17:13:05 <nikodemus> actually the option was added so it could be _increased_
17:13:26 <nikodemus> someone had a C function that wanted allocate a couple of hundred megs on stack...
17:13:46 <nikodemus> well, maybe it was only a dozen or so, but still
17:14:29 <lichtblau> I suppose SBCL's defaults are pretty reasonable in the first place, but different apps have different requirements.
17:14:36 <lichtblau> ISTR that Allegro's default thread stack size on Windows is 100 MB, so in that case a decrease was in order. :-)
17:14:56 <akovalenko> even different threads might have different requirements, actually
17:15:51 <nikodemus> very true
17:17:55 <pkhuong> nikodemus: we can count ourselves lucky. Not so long ago, Ruby's VM had such a function.
17:18:15 <pkhuong> Didn't make for happy stack scans.
17:18:51 <homie> are the thread abstractions always on low level ?
17:18:52 <nikodemus> anyways, 300 threads idling in my current setup is going to mean ~90% CPU usage, which isn't quite good enough. while i don't have a strict limit in mind as to acceptable, i do hope to bring it down a fair bit
17:19:15 <homie> or is there maybe an oop way to get threads up ?
17:19:24 <nikodemus> say 1k threads taking around 25% CPU would make me happy
17:20:10 <nikodemus> but, now i am called to go forth and hunt^Wshop for dinner
17:20:25 <antifuchs> it's dangerous to go alone, take this
17:20:37 *antifuchs* hands over a shopping bag
17:21:13 *nikodemus* takes a shopping bag
17:21:26 *nikodemus* looks inside the shopping bag
17:21:34 *nikodemus* examines the shopping bag
17:21:45 *nikodemus* searches the shopping bag
17:21:53 *nikodemus* shakes the shopping bag
17:22:00 <akovalenko> shopping bags over ip would be a useful extension to rfc 1149..
17:22:11 *nikodemus* concludes it is empty and safe to roll up
17:23:01 -!- nikodemus [~nikodemus@cs181056239.pp.htv.fi] has quit [Quit: This computer has gone to sleep]
17:24:03 prxq [~mommer@mnhm-4d013924.pool.mediaWays.net] has joined #sbcl
17:39:43 <pkhuong> we've got some messed up code corruption going on.
17:39:59 <nyef> Uh-oh. How so?
17:40:09 pchrist_ [~spirit@gentoo/developer/pchrist] has joined #sbcl
17:40:39 <pkhuong> emarsden's bug report on MOD.
17:41:16 <pkhuong> if I (disassemble (compile ...)), the disassembly at the end of codegen differs from the one on the final result
17:41:38 <nyef> Oh dear.
17:42:37 -!- pchrist [~spirit@gentoo/developer/pchrist] has quit [Ping timeout: 240 seconds]
17:43:43 -!- Qworkescence [~quad@unaffiliated/quadrescence] has quit [Quit: Leaving]
17:43:53 <pkhuong> now, if it were only the error trap disappearing, I'd suspect an issue with concatenating segments, but that's not all.
17:44:39 <nyef> I see two error traps here...
17:45:57 <nyef> Hrm. It does die that way, though.
17:47:56 <pkhuong> http://paste.lisp.org/display/125516
17:52:56 <nyef> Only differences I'm seeing here are that the assembler-routine fixups have been applied, and the function header is missing because you used disassemble instead of disassemble-code-component.
17:53:16 -!- redline6561 is now known as redline6561_nop
17:54:19 <pkhuong> nyef: argh. Then it's worse.
17:54:19 <pkhuong> ;      0F6:       E80C000000       CALL L2
17:54:20 <pkhuong> ;      0FB:       48894DF0         MOV [RBP-16], RCX
17:54:20 <pkhuong> ;      0FF:       48C745E828000000 MOV QWORD PTR [RBP-24], 40
17:54:20 <pkhuong> ;      107: L2:   8F4508           POP QWORD PTR [RBP+8]
17:56:43 <nyef> What's this?
17:57:53 <pkhuong> it's from the middle of the disassembly, after the type checks and the call to generic-<.
17:58:11 <nyef> Mmm. Something seems odd about that.
17:59:43 <pkhuong> there's an empty block in there.
18:00:32 <nyef> Hunh.
18:01:22 <pkhuong> ok. Looks like a bug in ir2 dead code elimination.
18:02:33 <pkhuong> oh wow. nice. Interaction between fall-through elimination and the insertion of trampolines in-place.
18:03:04 <nyef> Umm... why is VOP GENERIC-< loading to RCX instead of R11?
18:06:41 <pkhuong> got it. Should have thought of that one :|
18:07:16 <nyef> Hmm?
18:07:22 *nyef* is completely lost.
18:07:41 <pkhuong> we have a pass to eliminate jumps that just fall through
18:09:21 <pkhuong> and we have logic to insert local call trampolines inline.  The trampoline logic must know whether the previous block expects to fall through or not, otherwise it won't emit a jump over the one-instruction trampoline.
18:11:46 <pkhuong> all the trampolines are known very early in IR2, so we can just never insert fall through to local-call targets.
18:12:32 <pkhuong> s/insert fall through/eliminate jumps that look like fall through/
18:13:23 <nyef> Hunh. Okay, I found where the RCX comes from. Still seems weird, though.
18:17:22 <akovalenko> won't it be great to have annotations in disassembly, like ";;; yes, we mean that. that's okay, trust me."
18:18:02 <nyef> Heh.
18:18:11 <nyef> I was more thinking a map of code areas to VOPs.
18:55:23 <Krystof> that we basically have in the trace-files
18:55:49 <nyef> Sure. But we don't always have trace-files available.
19:28:49 pchrist [~spirit@gentoo/developer/pchrist] has joined #sbcl
19:31:49 -!- pchrist_ [~spirit@gentoo/developer/pchrist] has quit [Ping timeout: 258 seconds]
20:01:11 slyrus [~chatzilla@adsl-99-39-234-229.dsl.pltn13.sbcglobal.net] has joined #sbcl
20:12:50 <nyef> Hello slyrus.
20:13:03 <slyrus> hey nyef, seems like you've been busy with SBCL lately :)
20:13:17 <nyef> Yeah, flushed my patch queue and then some. (-:
20:13:22 <slyrus> nice
20:13:36 <slyrus> how's nikodemus coming along with the indiegogo-sponsored work?
20:14:00 <nyef> He said something about having a tree with no lutexes.
20:14:48 <slyrus> oh, nice
20:15:05 <nyef> Indeed. Means no more SLAD problems on OSX.
20:17:08 <nyef> Throw in FP traps and most of the building-with-xcode-4 problems fixed, and it's almost usable.
20:19:29 -!- homie [~levgue@xdsl-78-35-146-223.netcologne.de] has quit [Quit: ERC Version 5.3 (IRC client for Emacs)]
20:26:36 <nyef> What's the best feature expression to indicate conservative stack scavenging? My current best guess is (and gencgc c-stack-is-control-stack).
20:27:28 <akovalenko> the best is :conservative-stack-scavening (add this one to *features* when appropriate)
20:27:37 <nyef> Well, yes.
20:28:05 <akovalenko> nearly all complex conditions, including C ifdef stuff, are pain in SBCL..
20:28:16 <nyef> And then we'd have :partitioned-register-set to go with, I'm sure.
20:32:52 <nyef> Still, without adding more stuff to *features*, do we have any better ideas?
20:38:15 <nyef> ... Anyone have an SBCL/MIPS and want to try to cause a GC assertion failure? Stack-allocate a structure with raw slots. Under some circumstances, with various slot values and subsequent operations and whatnot, it'll blow up in GC.
20:38:21 <nyef> Almost guaranteed.
20:39:36 <akovalenko> well, I've got mipsel-linux in my router, but even if I succeed in installing real glibc and sbcl...
20:39:52 <nyef> Right, too much effort, isn't it?
20:41:50 <nyef> ... the hell? MIPS does stack-allocatable-vectors?
20:42:30 <nyef> I think I'm going to need to experiment on a MIPS SBCL at some point.
20:46:05 <akovalenko> hmm. can I have a swap file in Linux on NFS or SMB?..
20:46:19 <nyef> NFS certainly.
20:46:23 <nyef> Don't know about CIFS.
20:46:27 <akovalenko> [anti-offtopic: that's for eventually giving SBCL a try on my router]
20:47:44 <nyef> Okay, I think my time is up for today.
20:47:46 <fe[nl]ix> nyef: are you sure ? I thought that only disk-backed files can be used as swap
20:48:10 <nyef> fe[nl]ix: How else would a diskless workstation have a swap file?
20:48:14 <akovalenko> I was thinking of how good it could be to have a USB external _RAM_ (oddly, the first computers I worked with had such a thing..)
20:48:53 <nyef> Right, I'm really gone this time.
20:48:56 -!- nyef [~nyef@c-174-63-105-188.hsd1.ma.comcast.net] has quit [Quit: G'night all.]
20:49:30 <antifuchs> akovalenko: thunderbolt might bring you closer to that dream (it runs at pci express bus speeds right now; maybe something even faster comes along someday)
20:50:18 <pkhuong> akovalenko: RAM over TCP (aka memcached)
20:56:28 <foom> yea, swap on NFS works.
20:56:31 <foom> I've done it before.
20:57:09 <foom> The kernel devs hate it, because the network stack allocates memory, and having that work while you're trying to swap is complicated and sometimes broken.
20:59:19 <fe[nl]ix> wow
21:16:10 -!- leuler [~user@p54902030.dip.t-dialin.net] has quit [Quit: ERC Version 5.1.2 $Revision: 1.796.2.6 $ (IRC client for Emacs)]
21:30:55 -!- sdemarre [~serge@91.176.28.90] has quit [Ping timeout: 248 seconds]
21:39:35 -!- prxq [~mommer@mnhm-4d013924.pool.mediaWays.net] has quit [Quit: good night]
22:02:17 tsuru`` [~charlie@adsl-74-179-250-19.bna.bellsouth.net] has joined #sbcl
22:03:36 -!- tsuru` [~charlie@adsl-98-87-45-21.bna.bellsouth.net] has quit [Ping timeout: 240 seconds]
22:42:24 -!- tsuru`` is now known as tsuru
23:45:33 nyef [~nyef@64.134.124.204] has joined #sbcl
23:46:26 <nyef> Does anyone know why HPPA has support for stack-allocatable vectors?
23:47:30 <nyef> (Oh, and I'm almost certain that I can crash a MIPS instance with a carefully-crafted stack-allocated vector. I haven't found anything to say that I can't.)
23:58:26 <nyef> Hunh. HPPA got some attention in 1.0.24.x. Neat.