Meeting on irc.gnome.org:#gtk-devel
Meeting started November 23 2004 16:01 EST (21:01 UTC)
In attendance: Owen Taylor (owen), Jonathan Blandford (jrb), Manish Singh (yosh), Matthias Clasen (mclasen), Tor Lillqvist (tml), John Ehresman (jpe)
hey
I don't know if we have much to discuss today; I have been a bit distracted from gtk+ bugs for the last few days...
One thing we could conceivably discuss is my milestorming mail
(BTW, the rotated text issue that I said I'd resolve a few weeks ago one way or the other is obviously resolved in favor of not ripping PangoRenderer out)
ah, right. that is the bugzilla discussion you promised some weeks ago, right ?
you have ensured that the 2.6 release will be screenshot-worthy...
* mclasen tries to reread owen's milestone mail...evolution hangs...
I didn't write *that* much on it ;-)
tell evolution that
ok, I guess I have to declare your mail officially lost in evolution...
mclasen: mail.gnome.org
heh, yes, found it there
i should have googled to begin with...
so, I like the idea to have a somewhat more fine-grained classification a la big feature - small feature - simple fix - need analysis
but I'm not following on your number of 12 milestones... I only count 9 or 10
mclasen: The 3 at the end are abbreviations for two ... one for api, one for feature
another thing I agree on is that we shouldn't leave bugs with feature-adding patches in limbo just because the bug isn't entirely right, or we are not sure...
ah, that explains it. small/medium/big in those names is an estimate for the amount of work, right ?
Actually "Big Feature" is unlikely to appear for GTK+, though it's real for Pango (script addition, mostly)
mclasen: Yeah.
owen: cairo support, print dialog...no big features ?
mclasen: that seemed to me to be the most relevant toplevel classification when trying to figure out what to pull off the stack and do for a particular release.
mclasen: cairo support, print dialog are "big api"
mclasen: "Feature" in my scheme was features without API
oh, seems I haven't completely mastered the new system yet...
but you could conceivably have a combination of big feature/little api, no ?
mclasen: Well, the idea is "Big API" isn't "Lots of new API" but rather "Bug with API keyword that is lots of work"
as in cairo support, where it is conceivable to just add a handful of new apis...
mclasen: It's quite likely that for Big/Medium the Feature/API distinction isn't relevant
ah of course. maybe naming it Big/API would make it clearer
or Big+API
mclasen: For Small it does determine whether a backport to the stable branch is possible. But maybe it's best killed altogether. Backports to stable can always be handled piece by piece
s/piece by piece/case by case/
owen: your sample rotated text looks odd
jody: you mean the one in testgtk with a texture ?
jody: morten told me that too. The problem is that "Rotate" is in italic
mclasen: I think he means my screenshot of the rotated label
mclasen: yes
jody: Oh, the one in testgtk is, I'm pretty sure, a server bug.
jody: It worked for me for a bit, then started showing random bits of the root window background.
owen: sorry, you were right, the rotated label
oh, yes, the amount of italicness seems to vary with angle...
Was the Hello World in italics too ?
jody: I think it's just an optical illusion
The l's in hello had distinct baselines
jody: i think that's just lack of subpixel positioning
jody: What mclasen said
maybe we can get that too for 2.6, owen ? :-)
mclasen: how slow do you want to go?
mclasen: (I mean text rendering speed, not 2.6 development...)
In the end, really only 90 and 270 are interesting angles for GtkLabel
oh, we can compensate for that by speeding up utf8_validate() some more...
hand written assembly, baby!
but subpixel positioning would conceivably also improve horizontal text, no ?
mclasen: Actually, tends to go along with fuzziness.
Good fidelity to print, poor readability
http://people.redhat.com/otaylor/grid-fitting/ has screenshots
so, coming back to the milestone discussion...
I think the problem of "doesn't look like milestone to reporter" can be solved by some canned comments, which explain what the milestone means
I'm thinking - for the question of the "not sure" bugs, maybe we just need to handle them here every week, discuss them, flip a coin, make a decision
that's what I was about to propose...add a regular agenda item to discuss maybe 5 of those...
nominations could be sent in during the week, or maybe we could just always discuss the 5 oldest ones...
mclasen: I'd suggest we should probably try to discuss incoming new ones every week as well to make sure that we are making forward progress.
mclasen: I'm also wondering about the question of a REJECTED_RFE if we start getting more aggressive about rejecting marginal issues, so we can go back and revisit
(RESOLVED/WONTFIX is pretty close to REJECTED_RFE, but not exact)
(409 total RESOLVED/WONTFIX bugs for GTK+...)
owen: WONTFIX on an enhancement request is a pretty good approximation of REJECTED_RFE
tml: Any thoughts on the proposed bug process from a win32 perspective?
owen: umm, not really
oh, while we are talking about backends, we may also spend a minute on the framebuffer backend, maybe
tml: the idea is sort of that bugs when they come in can be assigned to a milestone and left there, so hopefully a little less work
mclasen: 'cvs rm' is somewhat tempting
should we declare it as officially unmaintained in 2.6 ?
I'm not even sure it compiles
mclasen: Well, doesn't at the moment, I'm sure, because I added a new backend function with the rotated text stuff
at some point it looked as if the direct-fb backend may be a suitable replacement, but that seems never-forthcoming...
owen: i can't be more specific yet, but i assume i will be able to do much more work on glib&gtk win32 stuff next year
mclasen: Yeah.
I guess the main point in having the linux-fb in CVS is the vague hope that someone will pick it up.
ok, if nobody objects, I'll probably add some form of deprecation note to the 2.6 release notes, asking for some fearless hacker to keep it alive...
owen: in your milestone proposal, we would still have per-release 2.6.something milestones, which would just be much less populated, right ?
mclasen: Yes, but the point is that 2.6.something milestones would be explicitly populated.
ok, so if anything is left on 2.6.x at release time, it goes back to the pools
mclasen: Anything that didn't get fixed in the previous milestone would get thrown back into the general pool. (Presumably because it was something someone volunteered for and didn't end up having time to work on)
I would like to try the "pool milestones" approach for the 2.8 release cycle. How do others judge it ?
deafening silence...
well, it almost has to be an improvement from what we do now :-)
ok, accepted by 0/0 voices
Another topic-- do we have any idea how much the new widgets have been tested? Should we send out an explicit poll to, say, desktop-devel-list, which asks people which of the new widgets they have used?
I have no idea, but we do have the docs with the migration chapters online, and people know that the next gnome will be based on 2.6, so they should be starting to port stuff to the new apis
I know anders was doing that in libgnomeui last week...
GtkIconView is probably the biggest concern
and gimp and nautilus have been ported to uimanager...
owen: agreed, icon view is the weak spot
i have this strange situation in which i am using a better gtk than everyone else
what is going on with that patch of Sven's?
* owen sends something to desktop-devel-list
what patch ? the gobject one ?
has he sent beer to Hamburg yet ?
heh, let me check :)
honestly, I don't know what's up with gobject patches blocking on review by tim
i should have found the bug number before i mentioned it
maybe we can make a start with the "discuss bugs" agenda item next week by discussing some gobject patches
the one to speed up big signal emitters has been made less urgent by my change to make gtkimagemenuitem/gtkbutton not connect one handler per instance, though
mclasen: It would be nice to have the bugs to discuss in the agenda for the week (people could follow up and propose others)
owen: sure. anyway, I have to pack my things now. Regarding releases: I want to do both 2.4.x and 2.5.x releases around the middle of next week...
see you
mclasen: Sounds good. later.
OK,
============= meeting adjourned =========================
Meeting ended November 23, 17:02 EST (22:02 UTC)
wait, i found the bug report
andersca: We moved the meetings up an hour
http://bugzilla.gnome.org/show_bug.cgi?id=143668
i have never seen the meeting actually adjourn before
I'd also like to see signal emitters sped up; signal emission does show up as a hotspot in profiles
carol: Yeah, that should be fairly unimportant for GIMP now with Matthias's GTK+ fix
jpe: 143668 is connections not emissions
oh, ok
jpe: I have some ideas how emission could be sped up, but they are a fair bit harder to implement
i just feel bad when i am one of the few people using something that is better
owen, ok
owen: you have put down those ideas at any place?
carol: If you want to feel better, Matthias's fix will make an even bigger difference, so there are some people out there using something even better than what you have
wasn't there a patch that bypassed the emission machinery iff there's only a single handler?
rambokid: Not recently, may have discussed them at some point with you. One thing is to make gvaluecollector.h (as used in gsignal.c anyways) directly recognize fundamental types. That's a good chunk of signal emission time
owen: i don't know that idea.
i'd appreciate if you could elaborate on that (preferably via email)
jpe: It didn't (as I recall) quite keep all the gsignal semantics the same, which was a problem
jpe: e.g., it didn't ref objects that were passed as parameters. But it's a vague memory now...
owen, do you remember if the problem could be fixed?
jpe: It didn't look like a productive line of attack to me
jpe: Basically, cluttered the caller a lot.
jpe: I think there is some fertile ground for speeding up the current mechanism before working on bypassing it
i'm not in favour of changing the GSignal semantics in any way
rambokid: If you remember ssp's patch, it didn't actually change gsignal semantics, it was more along the lines of if (g_signal_only_has_default_handler (object, signal)) { call default handler } else { g_signal_emit() }
mclasen: the attributes in *.symbols changes broke abicheck.sh
owen, I think the bug # is 121027
owen: it was for Gtk events whose emissions aren't speed critical by any measure.
rambokid, I see emissions of gtk events showing up as significant in profiles
jpe: i'd like to see that profile, and eventually reproduction directives, since that is contrary to any data i've seen so far.
couldn't something like this be done without exporting api to code outside of gsignal
jpe: No because short-circuiting demarshal/marshal is the big step
jpe: And that's very hard to do inside gobject
rambokid, it's basically when displaying a previously minimized window with a lot of sub gdk windows / gtk widgets
I'm testing on win32, but it's probably cross-platform
jpe: gcc's __builtin_apply() *almost* is good enough, but not really.
jpe: I have a small test app that is pretty similar, but it's hard to tell how much of the overhead is the signal propagation itself and not the raw stuff. Which profiler were you using?
* vektor has suspicions about signal emission speed in these cases but no hard numbers.
owen, what is __builtin_apply() missing?
I was thinking of using it or something like it
vektor, glowcode (www.glowcode.com) I'm using MS VC++ compiled code.
jpe: as i said, i need hard data, not a mere suspicion. you need to correctly tell the handler execution time apart from the handling done by the gsignal code.
I'm pretty sure it isn't legal to use from a varargs function. Also, you can't, I think, handle the fact that g_signal_emit() has a different signature than the callback - only some args are shared
I saw some signal-emission stuff with my little test app show up in 'qprof' but I don't trust it at all. If anyone has any recommendations for an application and not system-level profiler, let me know.
vektor: for the most part, the gsignal emission code is pretty straight in terms of executed instructions. i don't think there're many chances of major speedups (for the non-debugging cases that is) short of maybe doing even more inlining...
rambokid, I will freely admit I don't have hard data, but I do have a strong suspicion backed up with some profiler data
vektor: speedprof (part of memprof in GNOME CVS) is pretty reliable as far as it goes. It doesn't give you much data, but I generally trust what it gives you
owen: Thanks, I'll try it out.
owen: varargs handling in itself is a slow thing to begin with...
owen, I'll need to look at what __builtin_varargs() allows and doesn't. There is also something similar for VC++.
rambokid: OK. We're considering doing our own closure instead of using the C closure in SWT, I'm trying to understand the performance implications of that.
rambokid: It's not the slow part of args collection.... varargs is just some pointer arithmetic
vektor: you know of gcc pre/post function instrumentation? that could be of use here, i even wrote a small profiling tool once that makes use of this (and used it on gsignal code in the beginning)
rambokid: No, any references?
hmm, interesting..
rambokid: OK, I think google is giving me enough to go on, thanks.
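The "gcc pre/post function instrumentation" mentioned above presumably refers to GCC's -finstrument-functions option, which makes the compiler emit calls to two hook functions on entry to and exit from every instrumented function. The following is a minimal sketch of such hooks, not the toyprof code; the counter names and profile_report() are made up for illustration.

```c
#include <stdio.h>

/* Hypothetical bookkeeping, for illustration only. */
static unsigned long enter_count;  /* number of function entries seen */
static unsigned long exit_count;   /* number of function exits seen */
static int depth;                  /* current call depth */
static int max_depth;              /* deepest call depth seen */

/* GCC calls this on entry to every function compiled with
 * -finstrument-functions. The hook itself must not be instrumented,
 * otherwise it would recurse. */
__attribute__((no_instrument_function))
void __cyg_profile_func_enter(void *this_fn, void *call_site)
{
    (void)this_fn;
    (void)call_site;
    enter_count++;
    if (++depth > max_depth)
        max_depth = depth;
}

/* GCC calls this just before every instrumented function returns. */
__attribute__((no_instrument_function))
void __cyg_profile_func_exit(void *this_fn, void *call_site)
{
    (void)this_fn;
    (void)call_site;
    exit_count++;
    depth--;
}

/* Print the collected totals; call it at the end of the program. */
__attribute__((no_instrument_function))
void profile_report(FILE *out)
{
    fprintf(out, "entries=%lu exits=%lu max_depth=%d\n",
            enter_count, exit_count, max_depth);
}
```

A real profiler would also record timestamps per this_fn address and resolve the addresses to symbol names afterwards; the hooks only take effect for code built with gcc -finstrument-functions.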
BTW, I agree that there would need to be a very compelling reason to complicate caller code like the rejected patch did
vektor: toyprof, the code is in beats's CVS (unused by the project though), there was even a 0.1 release, announced on gtk-devel-list (with some additional remarks, that might be of help)
beast's CVS i mean
thanks
owen: what is it you want to get at then?
owen: and, varargs are expensive because you can't use the usual call optimizations with them, like passing most arguments in registers
rambokid: on x86, you *don't* do register passing in the normal case ... it's not part of the ABI
what I want to kill is all the pointer indirection and type lookups in gvaluecollector
You could imagine essentially that each signal has a bit of bytecode attached to it that directly describes how to demarshal the args into a value array
i don't have the exact details or timing stats at hand, but I convinced myself at one point that it would be a big win
owen: i can see hardcoding stuff there for fundamental types implemented by glib.
rambokid: yeah, the bytecode would have a "do the dynamic lookup for this type" escape hatch
owen: i'm not sure it'd be a big win though, because signals carry lots of non-fundamental types
rambokid: you'd have to break encapsulation a little bit internally to libgobject and make it know about GObject
owen: and you may not do COLLECT(g_type_fundamental(sigargtype)) since derived types can implement different value handling than their fundamentals (via value tables)
rambokid: That's why you "compile" the byte code when you create the signal
owen: that breakage wouldn't be the problem (just making gvaluecollector know about GObject would be enough even)
owen: i'm not sure i get you there. when you say "byte code", i think of java + interpretation. is that what you're talking about?
s/java/the java runtime model/
rambokid: What I mean by bytecode is something that says 'get an int; get a pointer and call g_object_ref_on_it(); get another int'
rambokid: A compact representation of the exact operations needed to demarshal the args for that signal that you can have a tight loop over.
owen: that sounds overly complicated to me. i'd say, it's good enough to:
a) hard code glib fundamentals so no function calls are involved in collection
b) flag signal argument types for whether those hardcoded variants may be used or whether the current collection code has to be used (that uses value tables properly)
It might be. Would have to do some code reading to remember the issues
the object ref-ing would go into the hardcoded version then. (and should become an inline atomic int operation btw)
gonna be fun for gtk#, they read the refcount variable iirc ;)
andersca: what's going to change for them if we make inc/dec of the ref-count atomic?
rambokid: isn't GAtomicInt different from gint on some architectures?
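The "bytecode" idea described in this exchange ('get an int; get a pointer and ref it; get another int', compiled once when the signal is created, with an escape hatch for types that need the dynamic lookup) could be sketched roughly as below. This is a standalone illustration, not GObject code: Obj, Value, ArgType, SignalProgram and all function names are hypothetical stand-ins for GObject, GValue, GType and the signal node.

```c
#include <stdarg.h>
#include <stddef.h>

#define MAX_ARGS 8

/* Hypothetical stand-ins for GObject and GValue. */
typedef struct { int ref_count; } Obj;
typedef struct {
    union { int i; double d; Obj *obj; void *p; } v;
} Value;

/* Declared argument types of a signal (stand-in for GType). */
typedef enum { T_INT, T_DOUBLE, T_OBJECT, T_OTHER } ArgType;

/* One opcode per argument: the exact collection step to perform. */
typedef enum { OP_INT, OP_DOUBLE, OP_OBJECT_REF, OP_DYNAMIC } CollectOp;

typedef struct {
    size_t n_args;
    CollectOp ops[MAX_ARGS];
} SignalProgram;

/* "Compile" step, done once at signal creation: the type lookup happens
 * here instead of on every emission. Types that cannot use a hardcoded
 * variant get the OP_DYNAMIC escape hatch. Returns 0 on success. */
int signal_compile(SignalProgram *prog, const ArgType *types, size_t n)
{
    if (n > MAX_ARGS)
        return -1;
    prog->n_args = n;
    for (size_t i = 0; i < n; i++) {
        switch (types[i]) {
        case T_INT:    prog->ops[i] = OP_INT;        break;
        case T_DOUBLE: prog->ops[i] = OP_DOUBLE;     break;
        case T_OBJECT: prog->ops[i] = OP_OBJECT_REF; break;
        default:       prog->ops[i] = OP_DYNAMIC;    break;
        }
    }
    return 0;
}

/* Emission-time step: a tight loop over the opcodes, with no per-argument
 * type lookup or indirect call for the hardcoded cases. */
void signal_collect(const SignalProgram *prog, Value *out, va_list ap)
{
    for (size_t i = 0; i < prog->n_args; i++) {
        switch (prog->ops[i]) {
        case OP_INT:
            out[i].v.i = va_arg(ap, int);
            break;
        case OP_DOUBLE:
            out[i].v.d = va_arg(ap, double);
            break;
        case OP_OBJECT_REF:
            out[i].v.obj = va_arg(ap, Obj *);
            if (out[i].v.obj)
                out[i].v.obj->ref_count++;  /* keep the object alive during emission */
            break;
        case OP_DYNAMIC:
            /* A real implementation would fall back to the value-table
             * based collection here; this sketch just stores the pointer. */
            out[i].v.p = va_arg(ap, void *);
            break;
        }
    }
}

/* Varargs entry point, shaped loosely like an emit call. */
void signal_collect_args(const SignalProgram *prog, Value *out, ...)
{
    va_list ap;
    va_start(ap, out);
    signal_collect(prog, out, ap);
    va_end(ap);
}
```

The simpler a)/b) alternative proposed above amounts to the same loop with a single per-argument flag (hardcoded or dynamic) instead of a separate opcode array.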
andersca: There isn't a GAtomicInt
ah, right
sorry
andersca: We had it, then realized that g_atomic_* wouldn't be useful unless it could operate on 'int' and it could every place we looked
owen: ah
andersca: implementing the atomic funcs so that they can be used for GObject ref counts was one design imperative
rambokid: yeah, although ref_count is marked private anyway
rambokid: The main blocker for that is just someone implementing a semi-realistic app benchmark and doing some tests to make sure that we aren't unexpectedly killing performance by making the refcount atomic
andersca: object bindings i think may have legitimate reasons to read it out though, so for them i'd consider it partly public ;)
(On an SMP machine, since atomic ops are basically free on UP)
we'll probably need a type bytecode interpreter like the one owen describes for the introspection support
owen: i've just seen a comparison by stefan, that acquiring and releasing locks works at essentially the same speed on an UP machine as on an SMP machine
jpe: for me, that is too vague of a statement to make sense of it.
rambokid: Not locks, atomic operations... my concern is that atomic operations require extra bus operations on an SMP machine
owen: is your concern that this *might* be a problem, or have you seen data that this *is* a problem for refcounts?
(that wasn't clear to me from your above statement)
owen: locks in absence of contention are essentially atomic ops (for the glibc fast-mutex spinlocks, that is)
rambokid: It's a statement that I'd like to see some simple benchmarking before making gobject refcounts atomic, because there it *might* be a problem
rambokid, introspection support will allow language bindings to call down to C functions without wrappers; see bug # 139486
rambokid: My expectation is that the cost is smaller than the G_IS_OBJECT() check in g_object_ref()
rambokid: But if we don't do the timing, we can't recover in the (admittedly unlikely) case that there is a problem, because it becomes part of the API
<1000000;i++) if (foo) i++ else atomic_inc(i); ?
rambokid: I'd rather see a GTK+ test... say, the good old 100x100 button grid.
jpe: i know what that bug is about. i don't think that is related to hard-coding the fundamental types in signal emissions though.
owen: not sure which test you mean
something that we did in the GTK+-1.0 days ... create 100x100 button table, show it, let it expose, profile off of that
so, in this case, just see what the effect on the time to create and expose such a table is by making refcounts atomic
owen: oh, you really think that could be significant there? i don't guess so, but would be interesting to be proven wrong... ;)
rambokid, you're probably right that they are not really related since the types of signal arguments are already known
rambokid: I don't really guess so either, but, well, it's reasonably easy to do. Except that SMP boxes are reasonably rare these days.
rambokid: for desktops, anyways.
yeah, my SMP box is also my server now. i guess i could run Xnest tests on it though... ;)
BTW, did a quick valgrind --skin=callgrind test on the button grid, and the top 4 individual functions (totalling over 10%) are in gsignal.c
owen: well, hyperthreaded machines...
yosh: I don't think they'll show any (hypothetical) atomic locking problems
yosh: Because the two "processors" share the same local cache
ah, true
owen: the 100x100 buttons? can you send that code here?
too late. got my own 100x100 button grid ready by now.
rambokid: Actually, 100x100 didn't fit on screen and stressed out valgrind, so I went a little slower
rambokid: So I went a little slower
I'm timing to g_signal_connect_after (window, "expose-event", G_CALLBACK(gtk_main_quit), NULL)
(Yes, that's evil because gtk_main_quit doesn't have a boolean return value :-)
s/slower/smaller/
g_idle_add_full (1000000, all_done_handler, NULL, NULL); that handler simply does gtk_main_quit() so you get destruction handling as well.
running the show 100x100 buttons test here takes 1.878s
actually more like 1.938s with unpatched gsignal.c
and it takes exactly the same time with atomic refcounts (on an UP)
owen: hm, i don't claim to understand all the output of callgrind_annotate, but i don't see g_signal anywhere near the top for a 2x2 button window (create, show, expose, destroy)
rambokid: Let me check my kcachegrind settings, might have not been looking at the right thing
i don't have kcachegrind installed in my chroot installment, so doubt that that is significant
i.e. just did valgrind --skin=callgrind .libs/simple and callgrind_annotate callgrind.out.23712 where i have just valgrind and valgrind-callgrind
I was looking with kcachegrind.... callgrind_annotate is a little obscure.
Looks like I wasn't looking at the right thing
But when I did, I'm still seeing gsignal at the top
Note that this is with no text in the buttons... if you put buttons in there, then freetype loading-font overhead tends to dominate (for smaller grids, anyways)
s/put buttons/put labels/
no text here either
callgrind_annotate gives me a bunch of g_signal stuff at the top... or at least some. What's at the top for you?
2,377,792 711,274 62,373 8 1,832 0 8 5 . ???:0x00018950 [/lib/tls/libc-2.3.2.so]
/usr/X11R6/lib/libX11.so.6.2, /lib/ld-2.3.2.so are the top ones
but then, i don't quite know what valgrind --skin=callgrind does anyway ;)
rambokid: Hmm, that looks like your valgrind is unhappy, really
what's "unhappy" ? ;)
Basically, it simulates the execution of your program (in the valgrind'ed jit way) and accounts each instruction reference / data reference / etc, to the calltree hierarchically
the first glib one is:
It gives somewhat distorted data because it doesn't have an accurate processor model, and doesn't see kernel time, X time, disk IO, etc, but...
260,333 92,843 35,128 22 204 18 8 7 1 gtype.c:IA__g_type_check_instance_cast [/usr/local/lib/libgobject-2.0.so.0.505.2]
(i.e. 260k instructions as opposed to 2+1+1+1 million instructions from libc and libdl)
Hmm, what I'm seeing first is gsignal.c:handler_insert [/usr/lib/libgobject-2.0.so.0.505.2] - which actually has about 1/3 of the instruction references
hm, i don't have handler_insert at all
But this is the version that doesn't have matthias's don't-use-one-GtkSettings connection per button. Should try with head
ah. then you even run into the O(connections^2) complexity of handler_insert
rambokid: Basically what I meant by "unhappy" was "not giving accurate data" - without any particular idea why. Spending that much time in libc sounds odd.
i was actually working on testing/applying neo's patch today and shoved it out of the way for the button test/this conversation
humpf. as much as i'd like to continue this, i have to go to bed to show up at university tomorrow morning.
owen: in order to run this test, i just hacked together a quick version of gobject.c to do atomic refcounting. care to review that? (have some issues there)
rambokid: Sure, you can mail it to me or whatever.
owen: thx, email sent. night.
night rambokid
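The atomic refcounting experiment discussed at the end of the log can be illustrated with a small standalone sketch. It uses C11 <stdatomic.h> as a stand-in for the g_atomic_* functions and is not the gobject.c patch mentioned above; RefObj and all function names are hypothetical.

```c
#include <stdatomic.h>

/* Hypothetical stand-in for a refcounted object. */
typedef struct {
    atomic_int ref_count;
    int finalized;       /* set once the last reference is dropped */
} RefObj;

/* A new object starts with a single reference held by the creator. */
void ref_obj_init(RefObj *o)
{
    atomic_init(&o->ref_count, 1);
    o->finalized = 0;
}

/* Take a reference: one atomic increment, no lock. */
RefObj *ref_obj_ref(RefObj *o)
{
    atomic_fetch_add(&o->ref_count, 1);
    return o;
}

/* Drop a reference. atomic_fetch_sub returns the previous value, so a
 * result of 1 means this call released the last reference. */
void ref_obj_unref(RefObj *o)
{
    if (atomic_fetch_sub(&o->ref_count, 1) == 1)
        o->finalized = 1;  /* a real object would run its finalizer here */
}

/* Read the current count, as a language binding might. */
int ref_obj_count(RefObj *o)
{
    return atomic_load(&o->ref_count);
}

/* Loop body for a crude micro-benchmark: n ref/unref pairs. Time it with
 * and without atomics to compare; the count is unchanged afterwards. */
void ref_obj_churn(RefObj *o, long n)
{
    for (long i = 0; i < n; i++) {
        ref_obj_ref(o);
        ref_obj_unref(o);
    }
}
```

As noted in the discussion, a loop like ref_obj_churn() only measures the raw operation; the 100x100 button grid was preferred as a more realistic test, and any SMP cost would only show up on a machine with more than one physical processor.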