Articles and Papers
Below is a selection of papers that I've written (with links when available online).
"Inside the Linux 2.6 Completely Fair Scheduler"
IBM developerWorks, December 2009
The task scheduler is a key part of any operating system, and Linux® continues
to evolve and innovate in this area. In kernel 2.6.23, the Completely Fair
Scheduler (CFS) was introduced. This scheduler, instead of relying on run queues,
uses a red-black tree implementation for task management. Explore the ideas behind
CFS, its implementation, and advantages over the prior O(1) scheduler.
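As a rough illustration of the idea (not code from the article), the sketch below always runs the task that has accumulated the least weighted runtime. It swaps in a binary heap from Python's heapq for the red-black tree, and the task names, weights, and the run_fair helper are all hypothetical.

```python
import heapq

def run_fair(tasks, slices, slice_ms=10):
    """Run `slices` time slices, always picking the task with the least weighted runtime."""
    # Heap entries are (virtual_runtime, name, weight); names and weights are made up.
    heap = [(0.0, name, weight) for name, weight in tasks.items()]
    heapq.heapify(heap)
    order = []
    for _ in range(slices):
        vruntime, name, weight = heapq.heappop(heap)  # task that has run the least
        order.append(name)
        # Heavier tasks accumulate virtual runtime more slowly, so they run more often.
        heapq.heappush(heap, (vruntime + slice_ms / weight, name, weight))
    return order

order = run_fair({"editor": 2, "backup": 1}, slices=6)
print(order)
```

With these assumed weights, the task with weight 2 receives twice as many slices as the task with weight 1.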
"Linux introspection and SystemTap"
IBM developerWorks, November 2009
Modern operating system kernels provide the means for introspection, the ability to
peer dynamically within the kernel to understand its behaviors. These behaviors can
indicate problems in the kernel as well as performance bottlenecks. With this knowledge,
you can tune or modify the kernel to avoid failure conditions. Discover an open source
infrastructure called SystemTap that provides this dynamic introspection for the
Linux® kernel.
"Next-generation Linux file systems: NiLFS(2) and exofs"
IBM developerWorks, November 2009
Linux® continues to innovate in the area of file systems. It supports the largest
variety of file systems of any operating system. It also provides cutting-edge
file system technology. Two new file systems that are making their way into Linux
include the NiLFS(2) log-structured file system and the exofs object-based storage
system. Discover the purpose behind these two new file systems and the advantages
that they bring.
"Virtual appliances and the Open Virtualization Format"
IBM developerWorks, October 2009
Not only has virtualization advanced the state of the art in maximizing server
efficiency, it has also opened the door to new technologies that were not possible
before. One of these technologies is the virtual appliance, which fundamentally
changes the way software is delivered, configured, and managed. But the power
behind virtual appliances lies in the ability to freely share them among different
hypervisors. Learn the ideas and benefits behind virtual appliances, and discover a
standard solution for virtual appliance interoperability called the Open
Virtualization Format.
"Linux Virtualization and PCI Passthrough"
IBM developerWorks, October 2009
Processors have evolved to improve performance for virtualized environments, but
what about I/O aspects? Discover one such I/O performance enhancement called device
(or PCI) passthrough. This innovation improves performance of PCI devices using
hardware support from Intel (VT-d) or AMD (IOMMU).
"Meet the Extensible Messaging and Presence Protocol (XMPP)"
IBM developerWorks, September 2009
XMPP is an open protocol for XML-based communication over the Internet. Although
it is most popular as an instant-messaging protocol, you can use it as a general
messaging service, as well. Discover the ins and outs of XMPP, and learn how to
use it for simple messaging.
"Conversing through the Internet with cURL and libcurl"
IBM developerWorks, September 2009
cURL is a command-line tool that speaks a number of protocols for file transfer,
including HTTP, FTP, Secure Copy (SCP), Telnet, and others. But in addition to
conversing with endpoints over the Internet from the command line, you can also
write simple to complex programs using libcurl to automate application-layer
protocol tasks. This article introduces the cURL command-line tool, then shows
you how to build an HTTP client in C and Python using libcurl.
"Anatomy of the Linux virtual file system switch"
IBM developerWorks, August 2009
Linux is the very definition of flexibility and extensibility. Take the virtual
file system switch (VFS). You can create file systems on a variety of devices,
from traditional disks and USB flash drives to memory and other storage devices. You
can even embed a file system within the context of another file system. Discover
what makes the VFS so powerful, and learn its major interfaces and processes.
"The Blue Programming Language"
IBM developerWorks, August 2009
Languages are the means by which we express our desires to computer systems,
and, as far as I'm concerned, there's no such thing as too many. One unique
language, called Blue, is an open source object-oriented language that is
multipurpose and intuitive to use. This tip provides the foundation for Blue
and shows you how to build simple networking applications.
"Anatomy of a Linux Hypervisor"
IBM developerWorks, May 2009
One of the most important modern innovations of Linux® is its transformation
into a hypervisor (or, an operating system for other operating systems). A
number of hypervisor solutions have appeared that use Linux as the core. This
article explores the ideas behind the hypervisor and two particular hypervisors
that use Linux as the platform (KVM and Lguest).
"Linux Kernel Advances"
Life's certainties include death and taxes but also the advancement of the
GNU/Linux® operating system, and the last two kernel releases did not
disappoint. The 2.6.28 and 2.6.29 releases contain an amazing amount of new
functionality, such as a cutting-edge enterprise storage protocol, two new
file systems, WiMAX broadband networking support, and storage integrity
checking. Discover why it's time to upgrade.
"Anatomy of ext4"
IBM developerWorks, February 2009
The fourth extended file system, or ext4, is the next generation of
journaling file systems, retaining backward compatibility with the previous
file system, ext3. Although ext4 is not currently the standard, it will be
the next default file system for most Linux® distributions. Get to know ext4,
and discover why it will be your new favorite file system.
"GCC Hacks in the Linux Kernel"
IBM developerWorks, November 2008.
The Linux kernel uses several special capabilities of the GNU Compiler
Collection (GCC) suite. These capabilities range from giving you shortcuts
and simplifications to providing the compiler with hints for optimization.
Discover some of these special GCC features and learn how to use them in
the Linux kernel.
"Get to know GCC4"
IBM developerWorks, October 2008.
In the last few years, the GNU Compiler Collection (GCC) has undergone a major
transition from GCC version 3 to version 4. With GCC 4 comes a new optimization
framework (and new intermediate code representation), new target and language
support, and a variety of new attributes and options. Get to know the major new
features and their benefits.
"Cloud Computing with Linux"
IBM developerWorks, September 2008.
Cloud computing and storage convert physical resources (like processors and
storage) into scalable and shareable resources over the Internet (computing
and storage "as a service"). Although the concept is not new, server
virtualization makes it much more scalable and efficient by sharing physical
systems. Cloud computing gives users access to massive
computing and storage resources without their having to know where those
resources are or how they're configured. As you might expect, Linux plays a
huge role. Discover cloud computing, and learn why there's a penguin behind
that silver lining.
"Anatomy of Linux Dynamic Libraries"
IBM developerWorks, August 2008.
Dynamically linked shared libraries are an important aspect of GNU/Linux. They
allow executables to dynamically access external functionality at run time and
thereby reduce their overall memory footprint (by bringing functionality in when
it's needed). This article investigates the process of creating and using dynamic
libraries, provides details on the various tools for exploring them, and explores
how these libraries work under the hood.
"Anatomy of Linux Loadable Kernel Modules"
IBM developerWorks, July 2008.
Linux loadable kernel modules, introduced in version 1.2 of the kernel, are one
of the most important innovations in the Linux kernel. They provide a kernel
that is both scalable and dynamic. Discover the ideas behind loadable modules,
and learn how these independent objects dynamically become part of the Linux kernel.
"Anatomy of Linux Journaling File Systems",
IBM developerWorks, June 2008.
In recent history, journaling file systems were viewed as an oddity and thought of
primarily in terms of research. But today, a journaling file system (ext3) is the
default in Linux®. Discover the ideas behind journaling file systems, and learn how
they provide better integrity in the face of a power failure or system crash. Learn
about the various journaling file systems in use today, and peek into the next
generation of journaling file systems.
"Anatomy of Linux Flash File Systems", IBM developerWorks, May 2008.
You've probably heard of Journaling Flash File System (JFFS) and Yet Another Flash File
System (YAFFS), but do you know what it means to have a file system that assumes an
underlying flash device? This article introduces you to flash file systems for Linux,
explores how they care for their underlying consumable devices (flash parts) through wear
leveling, and identifies the various flash file systems available along with their fundamental
designs.
"Anatomy of Real-Time Linux Architectures", IBM developerWorks, April 2008.
It's not that Linux isn't fast or efficient, but in some cases fast just isn't good enough.
What's needed instead is the ability to deterministically meet scheduling deadlines with
specific tolerances. Discover the various real-time Linux alternatives and how they achieve
real time, from the early architectures that mimic virtualization solutions to the options
available today in the standard 2.6 kernel.
"Anatomy of Security-Enhanced Linux (SELinux)", IBM developerWorks, April 2008.
Linux has been described as one of the most secure operating systems available, but the
National Security Agency (NSA) has taken Linux to the next level with the introduction
of Security-Enhanced Linux (SELinux). SELinux takes the existing GNU/Linux operating
system and extends it with kernel and user-space modifications to make it bullet-proof.
If you're running a 2.6 kernel today, you might be surprised to know that you're using
SELinux right now! This article explores the ideas behind SELinux and how it's implemented.
"Explore Ubuntu Mobile and Embedded Edition", IBM developerWorks, January 2008.
Ubuntu Mobile and Embedded Edition (Part of the Ubuntu Gutsy Gibbon Linux Distribution)
is an environment designed to simplify integration with UMPC (ultra-mobile PC). Ubuntu
includes this support as part of its standard Linux environment. This tutorial introduces
you to UME, the tools provided within it (such as Moblin), and how to build a full
embedded environment for a supported UMPC.
"Application Development for the OLPC Laptop", IBM developerWorks, December 2007.
This tutorial continues from an earlier article on the OLPC (now called the XO-1).
After providing an introduction to the XO-1 and its history, the tutorial shows you
how to build a simple graphical application in Python for the XO-1. This includes an
introduction to the basic interfaces and methods for integrating with the Sugar
graphical environment.
"Anatomy of the Linux SCSI Subsystem", IBM developerWorks, November 2007.
SCSI is a collection of standards that define how to communicate with a large number of devices
(mostly storage related). This article explores SCSI and how it's implemented within the Linux
kernel. It also introduces some of the advances being made in SCSI such as SAS, FCoE and DIF.
"Anatomy of Linux Synchronization Methods", IBM developerWorks, October 2007.
This article explores the various synchronization methods available in the kernel (such as the
atomic operations, spinlocks, reader/writer locks, and kernel semaphores). It also discusses
concurrency and the reasons behind the need for the methods.
"Anatomy of the Linux file system", IBM developerWorks, October 2007.
This article explores filesystems within Linux, which are another great example of abstraction.
For example, when you perform a 'read' operation, the target could be an ext2, ext3, JFFS, or any
other type of filesystem, on any of many different types of storage media (ramdisk, USB flash
stick, SAS disk, etc.). The number of combinations is huge, but the Linux filesystem layer provides a
model abstraction for dealing with various filesystems on various media. This article
introduces the Linux filesystem layer and then explores the major structures and APIs that
implement this module.
"System Emulation with QEMU",
IBM developerWorks, September 2007.
QEMU is a platform emulator, which means that you can emulate an entire PC on another operating
system (such as Linux or Windows). This article introduces the ideas behind QEMU, discusses some
of its internals, and then demonstrates emulating another operating system on top of Linux.
"Anatomy of the Linux Networking Stack", IBM developerWorks, June 2007.
One of the greatest features of the Linux® operating system is its networking stack. It was
initially a derivative of the BSD stack and is well organized with a clean set of interfaces.
Its interfaces range from the protocol agnostic, such as the common sockets layer interface or
the device layer, to the specific interfaces of the individual networking protocols. This article
explores the structure of the Linux networking stack from the perspective of its layers and also
examines some of its major structures.
"Anatomy of the Linux Kernel", IBM developerWorks, June 2007.
The Linux® kernel is the core of a large and complex operating system, and while it's huge, it
is well organized in terms of subsystems and layers. In this article, you explore the general
structure of the Linux kernel and get to know its major subsystems and core interfaces.
"Anatomy of the Linux slab allocator", IBM developerWorks, May 2007.
An operating system commonly allocates and deallocates objects of
a fixed size. Additionally, these objects can be initialized to
a given structure. The slab allocator exploits this common-size
object behavior of an operating system, and also makes it simple to
expand or shrink the memory used by a given object pool.
The slab allocator originated in SunOS, but now
finds its home in the Linux kernel.
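To make the object-pool idea concrete, here is a toy user-space sketch (not the kernel's implementation and not code from the article). It keeps freed, already-initialized objects of one shape on a free list and hands them back on the next allocation. The ObjectCache class and its method names are hypothetical.

```python
class ObjectCache:
    """Toy cache of same-shaped objects; freed objects are kept for reuse."""

    def __init__(self, factory):
        self._factory = factory  # builds a new, initialized object
        self._free = []          # released objects that are ready for reuse
        self.created = 0         # how many objects were actually constructed

    def alloc(self):
        # Reuse an idle object when one is available; otherwise construct one.
        if self._free:
            return self._free.pop()
        self.created += 1
        return self._factory()

    def free(self, obj):
        # Keep the object around instead of discarding it.
        self._free.append(obj)

    def shrink(self):
        """Drop all idle objects and return how many were released."""
        released = len(self._free)
        self._free.clear()
        return released

cache = ObjectCache(lambda: {"pid": 0, "state": "new"})
a = cache.alloc()
cache.free(a)
b = cache.alloc()          # reuses the object that was just freed
print(b is a, cache.created)
```

The shrink method stands in for giving memory back when the pool no longer needs it.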
"Discover the Linux Kernel Virtual Machine", IBM developerWorks, May 2007.
The newcomer to Linux virtualization is the Linux Kernel Virtual
Machine, or KVM. This modification to the Linux kernel converts
it into a hypervisor, allowing it to host other operating systems
such as Linux and Windows. The Linux KVM requires a processor
with virtualization instructions, as can be found with the AMD
Pacifica or Intel VT.
"Sugar, the XO laptop, and One Laptop per Child", IBM developerWorks, April 2007.
OLPC is the One-Laptop-per-Child initiative, and its goal is to
develop a $100 laptop for children around the world (now $150).
The laptop itself is very interesting, as the laptop must be
useful in different environments than our own. But what's most
interesting about the XO laptop is that it runs GNU/Linux and
is programmable using the Python language. This article explores
the XO laptop and shows you how to build a simple activity
(application) for a virtualized XO.
"Virtualization with coLinux", IBM developerWorks, April 2007.
Cooperation is probably the last thing that comes to mind when
considering Linux and MS Windows, but that's what you get with
coLinux. coLinux is a cooperative Linux kernel that
virtualizes an entire Linux operating system on top of MS
Windows. You can get something similar with Cygwin, but coLinux
has some advantages.
"Linux and Symmetric Multiprocessing", IBM developerWorks, March 2007.
Linux and SMP are two great tastes that taste great together.
This article provides an introduction to multiprocessing (in particular
Chip-Level Multiprocessing, or CMP) and then discusses some of the SMP
features of the Linux kernel. It also briefly discusses how to
exploit SMP for user-space applications.
"Parallelize Applications for Faster Linux Booting", IBM developerWorks, March 2007.
Linux out of the box is a general solution for desktop and server
platforms. But booting Linux, especially if you're a developer
(particularly a kernel developer) can be a pain due to the time it
takes to complete. This article reviews two approaches to
parallelizing the boot process through init replacements. Initng
is a dependency-based solution: services depend upon
one another, and once one service has started, other services that
depend upon it can start. Upstart is an event-based solution
to init. When a service starts, it can send events to kick off
other services. Also explored in this article is bootchart, which
is used to visualize the Linux boot process.
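The dependency-based approach can be sketched with Python's standard graphlib module (Python 3.9 or later). This is only an illustration of the ordering idea, not how Initng itself is written; the service names and the start_waves helper are hypothetical. Each "wave" contains services whose dependencies have all started, so the services within a wave could be launched in parallel.

```python
from graphlib import TopologicalSorter

# Hypothetical services mapped to the services they depend on.
deps = {
    "syslog": set(),
    "network": set(),
    "sshd": {"network", "syslog"},
    "web": {"network"},
}

def start_waves(deps):
    """Group services into waves; everything in a wave can start in parallel."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # services with all dependencies started
        waves.append(ready)
        sorter.done(*ready)                 # mark them started, unblocking dependents
    return waves

waves = start_waves(deps)
print(waves)
```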
"Virtual Linux", IBM developerWorks, December 2006.
Linux virtualization has many solutions, from full virtualization
and para-virtualization to emulation and many others. This article
explores the various methods that are available today for Linux
virtualization, including the new kid on the block, the Kernel Virtual
Machine (KVM). Read the comments at Slashdot.
This article has been translated into Russian and Korean.
"Build a Web Spider on Linux", IBM developerWorks, November 2006.
The goal of this article is to explore the various methods for
developing web spiders on Linux. It illustrates spider
development using Python and Ruby. You can read the sordid
comments on Slashdot or Digg.
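The core step of any spider is pulling links out of a fetched page. The sketch below is not taken from the article; it shows that step with Python's standard html.parser and urllib.parse modules, working on an in-memory page so that it runs without network access. The LinkExtractor class, the sample page, and the example URLs are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

class LinkExtractor(HTMLParser):
    """Collect absolute URLs from the href attributes of <a> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page's own URL.
                self.links.append(urljoin(self.base_url, value))

page = '<a href="/a.html">A</a> <a href="http://example.org/b">B</a> <a name="x">none</a>'
extractor = LinkExtractor("http://example.com/index.html")
extractor.feed(page)
print(extractor.links)
```

A full spider would fetch each collected URL in turn and keep a set of visited pages to avoid loops.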
"Data Visualization with Linux", IBM developerWorks, November 2006.
Linux is a great platform for data manipulation and
visualization. From GNUPlot and Octave to Scilab and IBM's
OpenDX, this article covers the most useful tools, with examples presented
for each.
"Version Control for Linux", IBM developerWorks, October 2006.
One of the great aspects of Linux for developers is the wide range of
source configuration management (SCM) systems that are available.
From centralized to distributed repositories, and change-set versus
snapshot models, this article explores the major SCM architectures and
provides examples of each. Also available in Japanese.
"New to IBM Systems", IBM developerWorks, September 2006.
This article is fundamentally a marketing piece that introduces readers
to IBM servers.
"Open Source Robotics Toolkits", IBM developerWorks, September 2006.
This article reviews a number of open-source toolkits for robotics
simulation and development, from the Open Dynamics Engine (ODE)
for modelling realistic physics to TeamBots for modelling multi-agent
systems. Here's an intro from LinuxDevices, and a discussion at robots.net.
"Boost Application Performance Using Asynchronous I/O", IBM developerWorks, August 2006.
Asynchronous I/O (or AIO) is a POSIX mechanism to increase performance
of overlapped I/O applications by providing callback mechanisms for I/O
completion. This article explores the variety of I/O models
available for Linux, and then digs into the AIO model with source
demonstration. The article is now a reference on Wikipedia.
"BusyBox simplifies embedded Linux Systems", IBM developerWorks, August 2006.
BusyBox is the Swiss Army knife of Linux utilities. BusyBox is
interesting because it combines a large number of utilities into a
single binary, allowing them to share the underlying common code.
This makes it a perfect utility for embedded Linux systems.
Here's an intro at LinuxDevices.com. Also available in Chinese.
"Anatomy of the Linux Initial Ramdisk (initrd)", IBM developerWorks, July 2006.
The Initial Ramdisk (or initrd) is a temporary root filesystem in RAM
that acts as an intermediary filesystem for module loading while the
real root filesystem is not yet available. This article explores
the anatomy of the initrd, and demonstrates how you can build one from
scratch. This article has been translated into Japanese.
"Inside the Linux Scheduler", IBM developerWorks Linux Zone, June 2006.
The Linux scheduler has evolved greatly over the years, and with the
2.6 kernel, has been transformed from an O(N) (linear time) scheduler
to an O(1) (constant time) scheduler. This article discusses the
new Linux 2.6 kernel and other aspects such as SMP support and load
balancing. Here's an intro from LinuxDevices.
This article has been translated into Japanese.
"Inside the Linux boot process", IBM developerWorks Linux Zone, May 2006.
The Linux boot process is very flexible, and supports booting on a
large number of platforms from a variety of devices (hard disk, floppy,
CD-ROM, USB Flash, network, etc.). This article walks you
through the desktop x86 boot process, and also provides some information
for embedded and network booting.
"Access the Linux Kernel using the /proc filesystem", IBM developerWorks Linux Zone, March 2006.
The /proc virtual filesystem is a great way to permit communication
(configuration and monitoring) between user-space applications and the
kernel. In this article you'll learn about /proc and
explore a demonstration of a fortune cookie dispenser implemented as a
kernel module with /proc. This article has been translated into
Chinese and Japanese, and an introduction is available in Korean.
"Better networking with the Stream Control Transmission Protocol (SCTP)", IBM developerWorks, February 2006.
In this article, I review the benefits of SCTP over TCP (from
multi-streaming to multi-homing). Sample code is also presented
demonstrating the multi-streaming feature. This article was also
slashdotted. It has been translated into Chinese.
"Automate Client Management with the Service Location Protocol (SLP)", IBM developerWorks, February 2006.
Discusses the zero-configuration networking capabilities of SLP and
demonstrates its use with a simple Daytime protocol example.
This article has been translated into Chinese.
"Boost Socket Performance on Linux", IBM developerWorks Linux Zone, January 2006.
This article demonstrates four ways to boost the performance of sockets
applications, from socket buffer tuning to kernel proc filesystem
tuning. Also read the sordid Slashdot comments on this article.
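Socket buffer tuning, one of the techniques mentioned above, can be sketched with the standard socket module. The article itself may use a different language and different sizes; the tune_buffers helper and the 64 KiB request are assumptions made for illustration. The operating system is free to adjust the requested value (Linux, for example, reports a doubled figure and caps it at a system limit), so the sketch reads the sizes back rather than assuming them.

```python
import socket

def tune_buffers(sock, size):
    """Request send/receive buffers of `size` bytes and return what the OS granted."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, size)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, size)
    return (
        sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF),
        sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF),
    )

# Set the options before connecting so they apply for the whole connection.
with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
    sndbuf, rcvbuf = tune_buffers(s, 64 * 1024)

print(sndbuf, rcvbuf)
```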
"Sockets Programming in Ruby", IBM developerWorks Linux Zone, October 2005.
Tutorial exploring the Sockets API and its integration into the Ruby
object-oriented scripting language. Discusses Ruby-specific
features for sockets programming.
"Sockets Programming in Python", IBM developerWorks Linux Zone, October 2005.
Tutorial exploring the Sockets API and its integration into the Python
language. Discusses Python-specific features for sockets
programming.
"Five pitfalls of Linux sockets programming", IBM developerWorks Linux Zone, September 2005.
Discusses the development of reliable networking applications in
heterogeneous environments. Translations are available in
Japanese and Chinese.
"Visualize Function Calls with Graphviz", IBM developerWorks Linux Zone, June 2005.
Using the GNU Compiler Toolchain, and a small amount of glue code, a
dynamic graphical function call generator can be easily created.
Translations of this article are available in Japanese and Chinese.
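The last step of such a generator is turning recorded caller/callee pairs into Graphviz input. The sketch below is not the article's glue code; it assumes the call pairs have already been collected, and the to_dot helper and function names are hypothetical. It emits a graph in the DOT language that the dot tool can render.

```python
def to_dot(calls):
    """Render (caller, callee) pairs as a Graphviz DOT digraph."""
    lines = ["digraph calls {"]
    # A set removes repeated calls; sorting keeps the output stable.
    for caller, callee in sorted(set(calls)):
        lines.append(f'  "{caller}" -> "{callee}";')
    lines.append("}")
    return "\n".join(lines)

# Hypothetical trace: main calls run twice, but only one edge is drawn.
calls = [("main", "init"), ("main", "run"), ("run", "step"), ("main", "run")]
dot = to_dot(calls)
print(dot)
```

Saving the output to a file and running it through Graphviz's dot command would produce the picture.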
"GNU's C Language Extensions", C/C++ Users Journal, March 2005.
The GNU Compiler includes a variety of language extensions. This
article explores some of the more useful elements.
"Optimizing with GCC", Linux Journal, March 2005.
The GNU Compiler Collection (otherwise known as GCC)
is the de facto standard compiler for Linux and also multi-platform
embedded development. This article discusses the 3.3 GCC
optimizer and how to use it effectively to build optimized applications.
"Defensive Programming", C/C++ Users Journal, February 2005.
This article focuses on programming for reliability -- given that we
make mistakes in the development of software, how can we program in a
way that minimizes some common, and difficult-to-debug, mistakes?
"GNU Development", Circuit Cellar, January 2004.
This article provides a tour of software development with
GNU tools, including the GNU compiler
toolchain, build automation with make, and a variety of other
utilities. This article is also available on Developer::Pipelines.
"An Embeddable Lightweight XML-RPC Server", Dr. Dobb's Journal, June 2003.
The XML-RPC protocol is
explored in this article, with a simple implementation of a server in
C. The server is then demonstrated using C and Python clients.
"Personalization and Adaptive Resonance Theory", Dr. Dobb's Journal, October 2002.
This article discusses the use of the ART1 clustering algorithm for
personalization (recommendation).
"Java Mobile Agents and the Aglets SDK", Dr. Dobb's Journal, January 2002.
Demonstrates the construction of simple mobile agents (migratory
programs) in Java using IBM's Aglets SDK.
"Embed with the Mailman", Embedded Systems Programming, October 2001.
An SMTP server and client are discussed with source code suitable for use in embedded
systems. The SMTP client is discussed in applications of remote statusing (emitting data to a
remote client). The SMTP server is discussed from the perspective of command and control (sending
emails to the embedded device with commands, and receiving back
responses from the onboard SMTP client).
"An Embeddable HTTP Server", Dr. Dobb's Journal, October 2001.
An embeddable HTTP server
is presented that is suitable not only for embedded systems, but also for those
without file systems (EEPROM-based). The concept of an
application filesystem is presented along with the tools to build and
integrate it with the HTTP server.
"Embedded Linux on the PowerPC", Embedded Linux Journal, July 2001.
In this article, I explore the use of Linux (MontaVista Linux) on the
Embedded Planet RPX-Net PowerPC board.
"CPJazz -- A Software Framework for Vehicle Systems Integration and
Wireless Connectivity", SAE 2000 World Congress. Also appears in the book
"Intelligent Vehicle Systems", ISBN 0-7680-0588-4.
Discusses research in connectivity between disparate devices and
vehicle buses to disparate wireless assets in a vehicle
environment. The flexible CPJazz architecture provides a
"software bus" architecture to seamlessly integrate buses and devices
for intercommunication.
Presentations
"High Performance Networking", May 2003
I gave this presentation as a guest lecture in Sam Siewert's "Real-Time
Embedded Systems" course at Colorado University in the Spring of
2003. In this presentation, I discussed problems and solutions
for scaling TCP/IP networking to gigabit networks through a variety of
means.
Resume
Here's my resume as of Fall 2005 (pdf or html).
mtj@mtjones.com.
Last Updated November 2009.