Wednesday, April 8, 2009

Kernel Developer Round Table at LF Collab Summit

Panel: The Linux Kernel: What's Next
Moderator: Jonathan Corbet, Editor at LWN.net

Panelists:
Greg Kroah-Hartman, USB & PCI Subsystem Maintainer
Andrew Morton, Lead Kernel Developer & -mm tree Maintainer
Keith Packard, X.org Project Lead
Ted Ts'o, Chief Technology Officer, Linux Foundation

2.6.30 merge window just closed. Linus noted that about a third
of the code that went in was "crap". A lot of code went into the
staging tree. So, for Greg KH, what is the staging tree really
for?

Greg KH: the staging tree came out of the driver project which
provides a collection point for random drivers, including bad
API usage, bad code, rather crappy code. GregKH is now the
"crap" maintainer, er, ah, staging tree maintainer. So, about
130 drivers were merged into the staging tree, all experimental
code, mostly from drivers that have been out of kernel since
the 2.0 days. Slowly that code is getting cleaned up now that
it is consolidated and being evolved to the point where it can
be merged into mainline.

Some distributed filesystem work, aka Ceph, went in through the
staging tree: Why? GregKH: because the maintainer asked that
it go in through there.

For Keith: what are the current graphics things that Keith is
working on? Graphics drivers used to be done all in user mode, but the
developers have been re-educated, or have come to the awareness, that
a number of changes really need to be in the kernel to support
graphics. A number of new APIs for acceleration, video mode
configuration, and memory management are now in the kernel and can be
used by the X11
graphics drivers. Or rather, the X11 graphics drivers provide just
one of many graphics drivers based on the in-kernel support. This
makes the graphics capabilities more accessible to graphics driver
writers. There are still some problems in the 2.6.29 code base and
2.6.30 is getting better. But most of this stuff is pretty bleeding
edge and probably should have been run through staging.

Graphics are now at a much better level of support in Linux than they
have ever been. The number of supported chipsets is finally increasing
from just Intel chipsets to include a number of the ATI chipsets
more fully supported out of the box. ATI has probably put fewer
developer dollars into improving the drivers as compared to Intel,
but they are getting a fair bit of help from the community and
are communicating well with the developers. Fedora 11 has shifted
to the nouveau driver for nVidia hardware which in some cases exceeds
the capabilities of the native, binary drivers provided by nVidia.
nVidia is still not working at all well with the Linux community.

The graphics community could still use additional developers and
improved vendor support in general.

Jonathan: Is there anything we can do to make the community more open
and accessible to new developers? Keith: the Wayland project
is a new window system (not X11 based) which is using the new kernel
APIs; it would not have been possible without acceleration and basic
configuration support in the kernel. The same is true for the Cairo
project. These new APIs and kernel support should enable an increase
in the velocity of change.

Jonathan: where are filesystems going? Ext4 was just pronounced
"stable". Ted: Two community distros (Fedora and Ubuntu) will be
shipping with ext4, possibly even as the default filesystem. Ted
has been using ext4 as his primary filesystem for over 6 months now.
ext4 has also attracted new developers and that of course leads to a
few new bugs as the new developers are less familiar with the constraints
and caveats of the ext3/ext4 body of code.

Jonathan: looking beyond ext4, when will btrfs (pronounced Butter FS or
sometimes just Butter) be available (a common question for Chris Mason ;-)?
Ted: btrfs is an exciting alternative but doesn't yet compare to the
four decades of experience behind the Berkeley style ext3/4 family
of filesystems. It still has some work to be ready for production and
will probably be the follow on filesystem after ext4.

Jonathan: Are there too many filesystems? Ted: Some of the filesystems
are somewhat special purposed, e.g. for flash support or other unique
hardware configuration. However only about 7-8 filesystems make up
about 95% of the total customer base of filesystems in use today.

Andrew was key in shepherding in the a fs - but no particular insight
on what benefits it provides, although the code is very cleanly done and
appears as though it will be very well maintained.

Linux-next: is that working out well? akpm: Yes! It is doing a lot
of the work that he used to need to do for integration, testing, and
evaluation of new code. Stephen's work is helping tremendously, although
Andrew feels that the code base is not getting tested by as many
people as it should be.

Where are the biggest problems in Linus' tree coming from? Andrew:
typically they seem to be code that has skipped over linux-next and
gone straight to Linus' tree. That seems to be a bad model and more
people should be planning for including first in linux-next.

Should people be developing against linux-next? Andrew: probably not.
The code base is really not stable enough for that and the various git
trees are not really well set up for this.


Are there too many developers? Has the rate of change decreased? Andrew:
no, not really. There seems to be a trend of established developers not
always seeing the changes from new developers that make it into the kernel.
In some cases an established developer will stumble across a new
directory in the source tree and find that the code is filled with newbie
mistakes. While this has the potential to be a problem in the long term,
it seems like the openness of the tree is helping to maintain the quality and as subsystems are used and encounter bugs or problems they still get fixed by the community.

Linux-next is causing a lot of email about merge conflicts between subsystems. Is that causing a problem and is it too hard to develop in the kernel now? Answer from several: no, it seems to make it easier and points out the problems sooner.

Question from the audience: Everyone pushes new developers to get code upstream. However, many subsystems seem to meet extreme resistance when pushing code upstream, e.g. uprobes, systemtap. Andrew points out that these subsystems are impacting the very core of the kernel and thus are more heavily scrutinized. Someone suggested that the code being pushed upstream would meet less resistance if the code were cleaner and better designed. Is this something that could be better documented for existing developers? utrace probably has a better chance of getting merged now that several core kernel developers are helping shepherd the code. In many cases, core kernel code being pushed by non-core kernel developers requires a level of responsiveness that those non-core kernel developers typically do not provide. Questions about locking, API changes, overlapping capabilities, implications on other subsystems, etc. need to be answered and code review comments need to be aggressively addressed by the developer to have a chance of adoption. A "write, post, and never respond" approach will clearly never get code into the mainline/core kernel.

Some have suggested that your code should be so good that core developers want to pull your code into the kernel rather than having you push your code into the kernel.

Are there too many tracers in the kernel already? Is anyone actually using any of the tracers? Ftrace alone has probably a dozen tracers built into it. However, most of the documentation for tracing is only in the git logs for the code checkins, which is pretty pathetic.

Question from the audience: it seems that there are lots of functions duplicated in the architecture specific trees. How does one test all architectures for factoring out common code like that? Andrew: there is a linux-arch mailing list which is the contact point for all architecture maintainers to see this type of common factoring. Or, send to Andrew and he will send them out to the architecture maintainers until they stick.

Question from the audience: Things change rapidly, including drivers moving to different directories. Greg KH: the rate of change is still increasing at a linear rate. The usb subsystem changed from 2.6.10 to today, for instance. Is that rate of change going to continue? GregKH: Yes - that was a several year period of time and things change rapidly. Git logs and such help track those changes but things will continue to change.

Question from Christine Hansen: How are the highly experienced (aggregate 100 years of Linux use?) developers mentoring new developers? GregKH: I think about this a lot and we document more, we train people how to accept patches, contribute patches, etc. Christine: There is still a world-wide perception that most of the development is centered around Portland, OR, USA. At some point, all of you on the panel are going to Ascend. To a Mountain. But who will replace you? Is there anything done on the mailing list or within corporations? Andrew: Oleg Nesterov from Russia is a good example of an up and coming individual, and an example of how someone can rise from nowhere and become a proficient developer. But the community seems to evolve, including people with great capabilities who disappear to other jobs, etc. Keith: Corporations have an ability to do internal mentoring, driven in part by the corporation's need to build and develop new engineers, which helps with some of this within the ecosystem. Keith points out that the financial incentive for some programmers also helps them fit into this mentoring environment.

(My observation: corporations help make the best of any given individual, but Linux really draws extraordinary developers who often come from unexpected backgrounds. As one of my old mentors and teachers said, "Great Engineers are born, not trained. You can improve a good engineer with training and improve a great engineer with experience, but there is no replacement for a naturally excellent engineer.")

A question on interface stability: As an example, iptables as a command needs to remain relatively stable, but the kernel interfaces behind it are rarely used by anything other than a small number of applications. Therefore an interlock between the key applications and the kernel is sufficient. There are some cases where this is just a plain old hard problem, such as X11 interfaces for mode setting in kernel: when a user level application does its own mode setting, you wind up with "impossible" conditions which can lead to a kernel crash. Ideally, the APIs would be stable enough for the applications but they should not have to be locked forever, preventing new evolutions of a subsystem. Application interfaces tend to stay very stable but are handled on a case by case basis. If you can find a way for new applications to use new interfaces and old applications to continue to survive or to be updated to the new interface, we can over time migrate to a new interface and ultimately retire an API. BTW, this same level of stability is not applied to the in-kernel APIs, since all providers and consumers of an API can be updated simultaneously in the kernel source, thus obviating the need for long term compatibility interfaces. And, of course, good interface design enabling extensibility is ideal, although it is also impossible to envision all ways that an interface may need to evolve over time.

Ted: I don't know how to get the latest X server onto my distro (Greg: Get a Better Distro!) but some of these challenges are just inherent in the interlock between applications and the kernel.

Question from the audience: Where is new code coming from? Vendors, hobbyists, etc.? Andrew: Seeing a lot of involvement from vendors to support their new hardware, e.g. Texas Instruments. Greg KH: over 20% of all kernel changes still cannot be tracked back to a specific vendor. Ted: In the filesystem space, we still see a lot of people "scratching their own itch", fixing the one thing that really bothers them. That often comes from non-corporate sponsored users trying to solve problems that annoy them, including university students, home users, etc.

GregKH collects per-release contributor information and feeds that to Jonathan and the LWN site to allow open tracking of the source of contributions.
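
For the curious, here is a rough sketch of how a tally like that could be produced. This is my own illustration, not GregKH's actual tooling: it assumes you already have a list of author email addresses (for example from something like `git log --format=%ae v2.6.29..v2.6.30`) and simply counts them by email domain. The function names and the sample addresses are made up, and a domain is only a crude stand-in for an employer, which is part of why so many changes are hard to attribute.

```python
from collections import Counter


def tally_by_domain(author_emails):
    """Count commits per author email domain (hypothetical helper).

    Addresses with no usable domain are counted under "unknown".
    """
    counts = Counter()
    for email in author_emails:
        email = email.strip().lower()
        if not email:
            continue  # skip blank lines in the input
        _, sep, domain = email.rpartition("@")
        counts[domain if sep and domain else "unknown"] += 1
    return counts


def share(counts, domain):
    """Fraction of all counted commits attributed to one domain."""
    total = sum(counts.values())
    return counts[domain] / total if total else 0.0


# Made-up sample input, one address per commit.
sample = [
    "alice@example.com",
    "bob@example.com",
    "carol@vendor.example",
    "dave",  # no domain, so it lands in "unknown"
]
counts = tally_by_domain(sample)
for domain, n in counts.most_common():
    print(f"{domain}: {n} ({share(counts, domain):.0%})")
```

Real attribution is harder than this, since people contribute from personal addresses and change employers between releases.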

A good session - very well received by the audience. After this, we are all off to lunch for a while. If my battery lasts, I'll continue tracking sessions.

Final comment from Jim Zemlin (with sustained applause): the LF is awarding an "unsung hero" award. The recipient is Andrew Morton. Andrew happens to be an avid racer and the Linux Foundation has arranged for a track day for Andrew. The only condition is to not get killed at the track.

Andrew tells a story about Mark Merlin and a Ferrari where he forgot to brake on the track and had a near-miss. Oops - we hope he remembers the brake on his track day! ;)
