00:19:22 -!- loke [~elias@bb119-74-154-54.singnet.com.sg] has quit [Quit: Leaving]
01:35:42 -!- hargettp [~hargettp@pool-71-174-132-222.bstnma.east.verizon.net] has quit [Quit: Leaving...]
01:38:10 sheep [~peter@pjstirling.plus.com] has joined #sbcl
01:38:36 -!- sheep is now known as Guest63343
02:18:15 -!- cmm [~cmm@bzq-79-176-200-19.red.bezeqint.net] has quit [Ping timeout: 246 seconds]
03:40:41 -!- Guest63343 [~peter@pjstirling.plus.com] has quit [Ping timeout: 240 seconds]
03:55:37 cmm [~cmm@bzq-79-176-200-19.red.bezeqint.net] has joined #sbcl
04:37:27 andy [~user@70-56-177-37.phnx.qwest.net] has joined #sbcl
04:38:17 -!- andy [~user@70-56-177-37.phnx.qwest.net] has left #sbcl
04:41:06 quoptic [~user@70-56-177-37.phnx.qwest.net] has joined #sbcl
04:44:57 how does one use vop's in sbcl? i'm writing some code that involves heavy numeric lifting, and i'd like to take advantage of sse instructions like addpd
04:47:24 quoptic: do you have an example?
04:47:53 there's no centralised quick documentation for that, but it's not really hard either.
04:48:02 so, really, examples are probably best.
04:48:27 pkhuong: it's a macro, used like so
04:48:29 (make-stencil ex (/ (+ (* 4.0d0 (@ % %1 %2))
04:48:29 (@ % %1 (+ %2 1))
04:48:29 (@ % (+ %1 1) %2)
04:48:32 (@ % %1 (- %2 1))
04:48:36 (@ % (- %1 1) %2))
04:48:39 5.0d0))
04:48:46 what's this %, %1 %2?
04:49:09 pkhuong: indexes along the array %
04:49:25 pkhuong: was just about to post the expansion
04:49:32 please use a paste service.
04:49:51 but what sort of function do you have that you'd like to replace with an assembly sequence?
04:49:55 pkhuong: ok
04:51:17 pkhoung: http://goo.gl/XFGaA
04:52:10 pkhuong: typo, idk if you got this: http://goo.gl/XFGaA
04:53:01 so, what sort of function would you like to replace with an assembly sequence?
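[Editor's note] The stencil expression in the macro call above can be read as a plain scalar loop. The sketch below is not the actual MAKE-STENCIL expansion (the log only links to it). It assumes `%` is a two-dimensional double-float array and `%1`, `%2` are its row and column indices, as quoptic describes. The function name `stencil-scalar`, the separate destination array, and the choice to update only interior points are hypothetical.

```lisp
;; Hypothetical scalar reference for the stencil expression in the log.
;; Not the real MAKE-STENCIL expansion; edges are left untouched because
;; the log does not say how the macro handles them.
(defun stencil-scalar (src dst)
  (declare (type (simple-array double-float (* *)) src dst))
  (let ((rows (array-dimension src 0))
        (cols (array-dimension src 1)))
    (loop for i from 1 below (1- rows) do
      (loop for j from 1 below (1- cols) do
        (setf (aref dst i j)
              (/ (+ (* 4.0d0 (aref src i j))
                    (aref src i (+ j 1))
                    (aref src (+ i 1) j)
                    (aref src i (- j 1))
                    (aref src (- i 1) j))
                 5.0d0))))
    dst))
```

Each interior element is computed independently from five neighbouring reads, which is why the question that follows is about doing two adjacent columns per instruction.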
04:53:54 my question is more along the lines of how to replace the inner part of the loop with vectorized code
04:54:37 joshe [~joshe@opal.elsasser.org] has joined #sbcl
04:54:50 there's no autovectorisation. If you want vectorised code, you'll have to figure out how you want it vectorised first.
04:55:55 pkhuong: ok, so i'd like to have it vectorised so that (aref array i j) and (aref array i (+ j 1)) are computed simultaneously
04:56:43 pkhuong: well, write the assembly that way, not have it automatically vectorised
04:56:44 That's implicit if you work with SSE registers.
04:57:26 pkhuong: it is? but if i disassemble the function, i see add, not addpd
04:57:43 right.
04:57:47 It's not vectorised.
04:58:05 pkhuong: how do i work with sse registers, then?
04:58:25 You figure out what asm sequence you'd like to generate.
04:59:31 in your case, though, you could likely get away with just abusing complex arithmetic.
05:00:14 also, you can probably get a bigger speed-up than pretty much anything from multiplying by .2d0 instead of dividing by 5d0
05:01:20 pkhuong: alright, looking at this code here http://webcache.googleusercontent.com/search?q=cache:3uNIztz35jsJ:paste.lisp.org/display/115655+mandelbrot+sse+sbcl&cd=1&hl=en&ct=clnk&gl=us&source=www.google.com
05:01:29 pkhuong: paste.lisp.org is down
05:04:26 pkhuong: how does one determine the cost of a new vop?
05:06:49 you don't need to.
05:07:15 pkhuong: ok, just looked at the source, it's an estimate
05:07:23 It's only used by the codegen when multiple variants could be generated.
05:08:50 pkhuong: ok, makes sense. looking at the example, complex-single-reg was used. what is the sse equivalent?
05:09:03 you want complex-double-reg.
05:09:18 Complex arithmetic is implemented with vectorised SSE instructions on x86-64
05:10:02 since you only add values and multiply/divide by constant scalars, you could get away with punning complexes.
05:12:31 pkhuong: alright, should cover everything i'll use it for.
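[Editor's note] The two suggestions above (punning pairs of doubles as complex numbers, and multiplying by .2d0 rather than dividing by 5d0) can be sketched in portable Common Lisp. This is a minimal illustration, not pkhuong's code and not a VOP. It uses a simplified one-dimensional, three-point version of the stencil, and the names `pair-at` and `stencil-row-paired` are hypothetical. Whether a given SBCL build keeps the complex values unboxed and emits packed instructions for this should be checked with `disassemble`; the sketch only shows the arithmetic.

```lisp
;; Hypothetical helper: pack V[i] and V[i+1] into one complex double-float,
;; so one complex addition performs two double-float additions.
(declaim (inline pair-at))
(defun pair-at (v i)
  (declare (type (simple-array double-float (*)) v)
           (type fixnum i))
  (complex (aref v i) (aref v (1+ i))))

;; Simplified 1-D stencil: dst[j] = 0.2 * (4*src[j] + src[j+1] + src[j-1]),
;; computed two adjacent j at a time for the interior of SRC.
(defun stencil-row-paired (src dst)
  (declare (type (simple-array double-float (*)) src dst))
  (let ((n (length src))
        (j 1))
    (declare (type fixnum n j))
    ;; Paired part: real part holds element j, imaginary part element j+1.
    (loop while (< (+ j 2) n) do
      (let ((sum (* 0.2d0               ; multiply by reciprocal, not (/ ... 5d0)
                    (+ (* 4.0d0 (pair-at src j))
                       (pair-at src (+ j 1))
                       (pair-at src (- j 1))))))
        (setf (aref dst j) (realpart sum)
              (aref dst (+ j 1)) (imagpart sum)))
      (incf j 2))
    ;; Scalar clean-up when an odd number of interior elements is left.
    (when (< (+ j 1) n)
      (setf (aref dst j)
            (* 0.2d0 (+ (* 4.0d0 (aref src j))
                        (aref src (+ j 1))
                        (aref src (- j 1))))))
    dst))
```

Only additions and multiplications by real constants are applied to the packed values, which is what makes the punning safe: a genuine complex-by-complex multiplication would mix the two lanes. Note also that 0.2d0 is not exactly representable, so the results can differ in the last bit from dividing by 5.0d0.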
05:13:26 You still have to grab complex double float values from your complex array.
05:13:33 *from your double float array
05:27:42 quoptic: Something like
05:28:53 pkhuong: wow, thanks a lot!
05:29:42 it emits fairly naive code for the array accesses, and there's no bound checking.
05:30:00 but you're doing this for speed, and the indexing code overhead is probably noise compared to the FP arithmetic
05:30:58 you should be able to figure things out better from this smaller example.
05:31:16 pkhuong: yeah, it looks a lot more understandable.
05:31:55 pkhuong: i'll look over this later. thanks again for your help!
05:32:13 -!- quoptic [~user@70-56-177-37.phnx.qwest.net] has left #sbcl
06:27:29 flip214 [~marek@2001:858:107:1:7a2b:cbff:fed0:c11c] has joined #sbcl
06:27:29 -!- flip214 [~marek@2001:858:107:1:7a2b:cbff:fed0:c11c] has quit [Changing host]
06:27:29 flip214 [~marek@unaffiliated/flip214] has joined #sbcl
08:06:52 hlavaty [~user@91-65-223-81-dynip.superkabel.de] has joined #sbcl
09:07:51 -!- Krystof [~csr21@csrhodes.plus.com] has quit [Ping timeout: 276 seconds]
09:54:11 Krystof [~csr21@158.223.51.76] has joined #sbcl
09:54:11 -!- ChanServ has set mode +o Krystof
10:08:08 hargettp [~hargettp@pool-71-174-132-222.bstnma.east.verizon.net] has joined #sbcl
10:35:23 -!- hargettp [~hargettp@pool-71-174-132-222.bstnma.east.verizon.net] has quit [Quit: Leaving...]
10:38:10 hargettp [~hargettp@pool-71-174-132-222.bstnma.east.verizon.net] has joined #sbcl
11:27:46 -!- hargettp [~hargettp@pool-71-174-132-222.bstnma.east.verizon.net] has quit [Quit: Linkinus - http://linkinus.com]
11:41:21 hargettp [~hargettp@pool-71-174-132-222.bstnma.east.verizon.net] has joined #sbcl
12:19:09 -!- hargettp [~hargettp@pool-71-174-132-222.bstnma.east.verizon.net] has quit [Quit: Leaving...]
12:53:13 misterncw [~misterncw@82.71.241.25] has joined #sbcl
13:19:50 -!- hlavaty [~user@91-65-223-81-dynip.superkabel.de] has quit [Ping timeout: 260 seconds]
13:31:06 cmm- [~cmm@bzq-79-176-200-19.red.bezeqint.net] has joined #sbcl
13:34:52 -!- cmm [~cmm@bzq-79-176-200-19.red.bezeqint.net] has quit [*.net *.split]
15:55:23 -!- flip214 [~marek@unaffiliated/flip214] has quit [Quit: Leaving]
16:07:59 -!- misterncw [~misterncw@82.71.241.25] has quit [Remote host closed the connection]
16:58:43 homie [~levgue@xdsl-87-79-194-117.netcologne.de] has joined #sbcl
17:35:55 -!- Krystof [~csr21@158.223.51.76] has quit [Ping timeout: 260 seconds]
17:37:42 -!- pchrist [~spirit@gentoo/developer/pchrist] has quit [Quit: leaving]
17:38:18 pchrist [~spirit@gentoo/developer/pchrist] has joined #sbcl
17:50:09 Krystof [~csr21@158.223.51.76] has joined #sbcl
17:50:10 -!- ChanServ has set mode +o Krystof
18:08:39 -!- cmm- [~cmm@bzq-79-176-200-19.red.bezeqint.net] has quit [Ping timeout: 240 seconds]
18:09:48 cmm [~cmm@bzq-79-176-200-19.red.bezeqint.net] has joined #sbcl
18:16:08 -!- cmm [~cmm@bzq-79-176-200-19.red.bezeqint.net] has quit [Remote host closed the connection]
18:16:25 cmm [~cmm@bzq-79-176-200-19.red.bezeqint.net] has joined #sbcl
18:25:59 -!- Krystof [~csr21@158.223.51.76] has quit [Ping timeout: 246 seconds]
18:43:00 wow. self builds take 75% as much time in 32 bit :\ I knew there was a difference, but I never expected it to be so significant.
19:26:11 -!- cmm [~cmm@bzq-79-176-200-19.red.bezeqint.net] has quit [Ping timeout: 246 seconds]
19:27:24 cmm [~cmm@bzq-79-176-200-19.red.bezeqint.net] has joined #sbcl
19:29:22 Krystof [~csr21@csrhodes.plus.com] has joined #sbcl
19:29:22 -!- ChanServ has set mode +o Krystof
19:35:02 superjudge [~superjudg@c83-250-198-227.bredband.comhem.se] has joined #sbcl
19:48:16 -!- homie [~levgue@xdsl-87-79-194-117.netcologne.de] has quit [Quit: ERC Version 5.3 (IRC client for Emacs)]
20:17:03 -!- cmm [~cmm@bzq-79-176-200-19.red.bezeqint.net] has quit [Ping timeout: 240 seconds]
20:18:14 cmm [~cmm@bzq-79-176-200-19.red.bezeqint.net] has joined #sbcl
20:30:24 -!- superjudge [~superjudg@c83-250-198-227.bredband.comhem.se] has quit [Quit: superjudge]
21:03:56 prxq [~mommer@mnhm-4d0133cc.pool.mediaWays.net] has joined #sbcl
21:33:55 -!- cmm [~cmm@bzq-79-176-200-19.red.bezeqint.net] has quit [Ping timeout: 260 seconds]
21:44:52 cmm [~cmm@bzq-79-176-200-19.red.bezeqint.net] has joined #sbcl
22:22:30 -!- prxq [~mommer@mnhm-4d0133cc.pool.mediaWays.net] has quit [Quit: Leaving]