DAS Stack: Let’s continue the journey – About the “Disk”

From a typical internal-construction perspective, disks can be broadly classified into three types: the physical hard disk, the flash-based Solid State Disk (SSD) and the virtual disk presented by storage arrays (network disks).

The physical hard disk is made up of a rigid metal platter (usually aluminum) coated with a magnetic material. The magnetic material records the data and the aluminum disk provides the rigid support.

The diagram below shows a quick view of the insides of a typical hard disk drive.

More detail at http://news.bbc.co.uk/2/hi/technology/6677545.stm

Let’s quickly touch a few areas in a word or two. For further information, http://en.wikipedia.org/wiki/Hard_disk_drive is an excellent source of detailed information.

The disk or platters here are held and rotated by the spindle. The Read/Write head moves about the surface of the disk and, of course, reads or writes the disk as it rotates. The head arm holds the Read/Write head(s) and is positioned by the voice coil actuator. The air filter keeps dust and other particles from entering the hard drive compartment.

Just for the notes, the head “flies” above the disk with a very thin gap between the head and the disk. If it ever touches the disk during a spin, the head “crashes” and the disk is essentially useless. There are some data recovery technologies which can try to recover the data that is not on the crash path. However, as far as I know, it is not practical to extract data from the tracks that are actually in the crash zone.

Disk Block Layout

This diagram shows a rough view of the disk layout. The disk itself is split into tracks and sectors. There are two possibilities with this kind of arrangement, as you might guess or know. The first is variable track length (or circumference, if you prefer). This is a typical concentric-circle division. The inner circles are smaller and hence have a smaller circumference, while the outer circles are larger with a greater circumference.

The other is fixed track length, normally achieved with a spiral instead of concentric circles. Each track is of a fixed length; near the center of the disk, a revolution covers fewer tracks than near the edge of the disk. This is the more standard industry practice. The diagram below shows a typical track and sector division. Each sector is further divided into blocks, and we have that division on display here as well.

The block is the smallest individually addressable entity on the disk. It is usually 512 or 520 bytes, with 512 being the most common. There is an Advanced Format proposal which raises the block size to 4096 bytes (4K). The host system might need some support for this version, though. More on this later.

Shown at http://computershopper.com/feature/how-it-works-platter-based-hard-drive

The drive has the disk or platter(s) (1), the spindle (2) to hold and spin them, the head arm (3) to hold the Read/Write heads, the voice coil (4) to move the head around, the heads (5), and the head landing zone (9) where the head can rest without crashing on the disk during power down. The tracks (6), sectors (8) and blocks (7) are the locations on the disk where data is stored and retrieved.

Typically, the disk is accessed as a set of serially numbered blocks – the Logical Block Address (LBA). The on-disk electronics worry about translating an LBA into a location on the disk to store or retrieve the information. This also makes the host system’s life easier, as it only needs to remember the block address as a logical block number – no need to remember on which platter, on which side, on which track and in which sector the block of our interest lives.
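
To make the translation idea concrete, here’s a minimal sketch of the classic CHS-to-LBA formula. The geometry numbers (heads per cylinder, sectors per track) are just example values; real drives report their own.

```c
#include <stdint.h>

/* Classic CHS-to-LBA translation. Sectors are historically numbered
 * from 1; heads and cylinders from 0. Geometry values are examples. */
#define HEADS_PER_CYLINDER 16u
#define SECTORS_PER_TRACK  63u

static uint64_t chs_to_lba(uint32_t cyl, uint32_t head, uint32_t sector)
{
    return ((uint64_t)cyl * HEADS_PER_CYLINDER + head)
               * SECTORS_PER_TRACK + (sector - 1);
}
```

The on-disk controller effectively runs this in reverse: given an LBA, it works out the cylinder, head and sector for us.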

Logical block addressing also helps the on-disk controller map out bad blocks without the host system having to worry about them. This is unwanted behavior in some cases, though.

For the Solid State Disks, or flash drives – there are no spindles, as you might guess. There are just a few chips under the hood storing and retrieving blocks of information, with a bit of glue logic to attach those storage chips to the bus. The blocks can again be accessed by LBA, and the controller translates that into a chip, sector/group and block address depending on the flash chip organization. Let me try to touch on SSDs in some more detail in a separate post. A typical small-scale example of an SSD is our USB pen drive or SD card.
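
Just to picture it, here is a hypothetical LBA-to-flash translation, assuming a fixed geometry of pages per block and blocks per chip. A real SSD controller adds remapping and wear leveling on top of this, so treat it as a sketch only.

```c
#include <stdint.h>

/* Hypothetical flash geometry: 64 pages per block, 1024 blocks per
 * chip. A real FTL remaps blocks and does wear leveling on top. */
#define PAGES_PER_BLOCK 64u
#define BLOCKS_PER_CHIP 1024u

struct flash_addr { uint32_t chip, block, page; };

static struct flash_addr lba_to_flash(uint64_t lba)
{
    struct flash_addr a;
    a.page  = (uint32_t)(lba % PAGES_PER_BLOCK);
    a.block = (uint32_t)((lba / PAGES_PER_BLOCK) % BLOCKS_PER_CHIP);
    a.chip  = (uint32_t)(lba / (PAGES_PER_BLOCK * BLOCKS_PER_CHIP));
    return a;
}
```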

Storage: The DAS stack

Let’s talk about a typical Direct Attached Storage stack from a server system’s perspective.

Let’s cover the storage stack in both software and hardware layers, which can later help us bring in the networked storage concepts with ease.

Let’s take a quick look at the stack for ease of understanding.

The Storage Subsystem

The storage subsystem stack diagram has software components in blue and hardware components in red. As you might expect, the two are in just the reverse order of each other. In a way, this stack might look just like a networking stack. Let’s walk through the maze.

When an application requests some operation on a file, the request is passed on to the filesystem layers. In Linux, for example, there is a Virtual File System or VFS layer which then switches to the actual filesystem driver to do the job. The filesystem then starts working its magic on its internal data structures to figure out where and how to go about serving the request. One good example is looking up an inode and its indirect blocks. Let’s save some of this for a detailed discussion later.
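
To get a feel for that switching, here is a toy sketch of a VFS-style dispatch in C: each filesystem driver fills in its own operation table and the VFS layer just calls through it. The structure names are illustrative, loosely inspired by Linux’s file_operations, not the actual kernel definitions.

```c
#include <stddef.h>
#include <sys/types.h>

/* Toy VFS dispatch: each filesystem registers an ops table at mount
 * or open time; the VFS layer only calls through the pointers. */
struct vfs_file;

struct vfs_ops {
    ssize_t (*read)(struct vfs_file *f, void *buf, size_t len);
    ssize_t (*write)(struct vfs_file *f, const void *buf, size_t len);
};

struct vfs_file {
    const struct vfs_ops *ops;  /* filled in by the actual fs driver */
    void *fs_private;           /* e.g. the in-memory inode */
};

static ssize_t vfs_read(struct vfs_file *f, void *buf, size_t len)
{
    return f->ops->read(f, buf, len);   /* switch to the real driver */
}
```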

The filesystem then asks the disk-accessing components, e.g. SATA or SCSI, about the blocks it’s interested in. Once the request arrives here, from now on we are only talking in terms of block numbers, be it LBA or some other mechanism. We don’t know anything about the filesystem at all. Give the block number and get the job done, read or write. The disk-accessing subsystem then looks at what best it can do, often rearranging the commands into a sequential operation to improve performance. Finally, it places the request packet with the HCI.
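
A crude sketch of that rearranging, assuming the pending requests are held in an array: sort them by LBA so the head can serve them in one sweep. Real IO schedulers are smarter about starvation and sweep direction, so this is illustration only.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Crude "elevator" sketch: order pending requests by LBA so the disk
 * head sweeps once. Real schedulers also guard against starvation. */
struct blk_request {
    uint64_t lba;       /* starting block */
    uint32_t nblocks;   /* length of the transfer */
    void    *buf;
};

static int by_lba(const void *a, const void *b)
{
    const struct blk_request *x = a, *y = b;
    return (x->lba > y->lba) - (x->lba < y->lba);
}

static void reorder_queue(struct blk_request *q, size_t n)
{
    qsort(q, n, sizeof *q, by_lba);
}
```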

The host controller interface (HCI), or host bus adapter (HBA) in some cases such as SCSI, takes in the packet: it takes the series of commands for the disks, the appropriate buffers and the disk ID as input, and queues up the commands for the disk along with the disk ID. The appropriate queue is then passed on as a bunch of bytes/words to the bus interface driver. In some cases the HCI is a really complex piece of software, such as for SATA, containing many layers such as a transport layer and a link layer within itself.

Finally, the data to be sent on the bus arrives at the bus drivers. The bus drivers are often simple ones that fiddle with a few flags and place the data on the bus, often via DMA, the direct memory access subsystem, and then let things go. When the DMA is done, an IRQ comes up and informs the driver that the operation is complete. The driver then returns control to the HCI driver, and the HCI driver might wait for its own IRQ telling it the command queue was sent successfully to the disk, or just return control to the disk driver. The disk driver will have to wait until the disk’s job is done and the IRQ tells it that it’s complete. Only then can control reach back to the filesystem saying that the job was well done and the day is going good.
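
Here’s a bare-bones sketch of that handshake at the bus driver level. The hardware hooks (dma_start, cpu_wait_for_interrupt) are made-up names standing in for whatever the platform actually provides.

```c
#include <stdbool.h>

/* Bare-bones DMA completion handshake. The two extern hooks are
 * hypothetical stand-ins for platform-specific routines. */
extern void dma_start(void *buf, unsigned len);   /* program the DMA engine */
extern void cpu_wait_for_interrupt(void);         /* sleep until any IRQ */

static volatile bool dma_done;

void dma_irq_handler(void)       /* installed on the DMA IRQ line */
{
    dma_done = true;             /* the transfer has completed */
}

int bus_transfer(void *buf, unsigned len)
{
    dma_done = false;
    dma_start(buf, len);
    while (!dma_done)
        cpu_wait_for_interrupt();
    return 0;                    /* control now returns to the HCI driver */
}
```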

We skipped the RAID component; let’s talk about it in a while.

Talking about the hardware, let’s start with the disks. The disks are just stores for the data with a bit of electronics glued on to transfer data in and out. Often, disks take a single Logical Block Address and return the appropriate block by maintaining an internal LBA-to-track/sector mapping. In the initial days, the disks had just minimal electronics, but later the Integrated Drive Electronics or IDE came into the picture. In the same way, the Small Computer System Interface (SCSI) came into existence. Though this is a dinosaur-age story, it might just amuse you a bit. Though SCSI is a common interface for the system, it’s mostly used for mass storage. Often, storage systems emulate SCSI, such as USB pen drives. Then came along the AT Attachment or ATA (AT coming from PC-AT), and then Serial ATA or SATA.

SATA, however, has its own protocol stack, though it supports a legacy mode in which the system can access a SATA subsystem as if it were PATA. As the SATA HCI can be operated both in a backward-compatibility mode with PATA and in an advanced HCI mode, the controller includes both standard IDE electronics and advanced HCI components.

Finally, the bus interfaces. Most of today’s systems run on top of PCIe, the Peripheral Component Interconnect Express bus. In a few words, it’s an advanced serial IO bus system that interconnects the CPU with the peripherals at very high speed. Please visit http://en.wikipedia.org/wiki/PCI_Express for a quick look. Some posts on it a bit later.

In the next post, let’s talk about blocks, RAID and stripes.

Storage: The first words – DAS, NAS & SAN

Let’s discuss storage for a while. No, I am not talking about a food store or cold storage. Let’s talk about electronic data storage.

As most of us know, data is stored in the computer in at least two places. One is a temporary store and the other is permanent. The temporary store holds the data and the program like a book opened for reading/writing. This resource is often limited in size, volatile and pretty costly. On the other hand, the permanent store is relatively cheap – the cost per MB of temporary store is roughly the cost per GB of permanent store. The permanent or secondary store is also abundant and non-volatile, but significantly slower.

Let’s call the temporary store ‘the RAM’, which collectively refers to the SDRAM, the SRAM, the DRAM, the L1/L2/L3 caches and such stores. And let’s call the permanent store ‘the disk’, which includes hard disks, optical disks and such devices.

The way information is stored in the RAM and on the disk is also significantly different. The RAM is often divided into pages and segments and addressed as a linear address space (by linear, I mean all of the logical, virtual and physical address spaces).

The way information is stored on the disk is a bit different and non-linear. The disk itself is organized into tracks and sectors to store the information physically. Logically, information or data is organized in files & directories. Though there are other tiny stores, they mostly hold metadata, like boot records or partition information.

The way to map the tracks and sectors into files and directories is called the filesystem. There is more than one way to deal with such mapping, hence the different filesystems – ext3, FAT, reiserFS, NTFS or lesser-known, proprietary filesystems such as WAFL.

The way to attach the tracks, sectors & disks to the computing system is also diverse: Direct Attached Storage, Network Attached Storage & Storage Area Networks. The best way to study the storage stack is to start off with Direct Attached Storage. Networked storage works by inserting the network either between the application and the filesystem (NAS) or between the filesystem and the disks (SAN).


DAS

Hence, let’s talk about DAS first. IMO, the easiest diagram is what we have above here. Though the diagram is just a few blocks, I’ll talk through it, and I intend to reuse it again and again, at least for a while.

The application never needs to know about the underlying storage architecture. The application needs the secondary store for at least two purposes: first, to get the application program instructions, such as the application itself or shared libraries; the other is for storing and retrieving the user data, such as databases, documents or spreadsheets.

The moment it starts execution, it might need program instructions, such as a shared library or some part of the executable file itself. For this part, the application does not even need to deal with the filesystem at all. Here the application simply calls the required routine and the operating system worries about talking to the filesystem, finding its way through the tracks-and-sectors maze.

However, for the user data part, the application needs to worry about the filesystem, though to a minimal extent. The application needs to ask the operating system to open a given file, read from it or write to it, delete it or create it, and close it when the job’s done. The operating system and the underlying filesystem stuff do the dirty job, just like a clerk at the filing cabinets. You ask for the required file and the clerk is happy to get it for you; if it is not available you may ask to add a new one; he opens the file, reads or modifies it as you ask, and returns it to the cabinet when done.

In a similar way, the application gets the file (the file handle, more precisely), reads or modifies the file and closes it when done. All the application worries about is whether the file at a given path exists, whether it can be accessed, and if so: open, read or modify, rinse and repeat till the job is done, and then close the file. It never needs to worry about what happens under the hood; it may not even worry about where the file is actually stored. At times, the application is not using data files directly but a database; then the application simply talks to the database program, which worries about storing and retrieving data from the filesystems.
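
The clerk analogy in code: a plain POSIX open/read/close sequence, which is the entire extent of the application’s involvement with storage. The path is just an example.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* The clerk at the filing cabinet: ask for the file, read it,
 * return it. No tracks, sectors or blocks in sight. */
int main(void)
{
    char buf[256];
    int fd = open("/tmp/example.txt", O_RDONLY);  /* example path */
    if (fd < 0) {
        perror("open");
        return 1;
    }
    ssize_t n = read(fd, buf, sizeof buf);  /* the fs does the dirty job */
    if (n >= 0)
        printf("read %zd bytes\n", n);
    close(fd);                              /* back to the cabinet */
    return 0;
}
```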

Databases, often the small-scale databases, store their data, records and tables in files on the disks. The large-scale databases, however, toss off the filesystem, take the tracks and sectors of the disk directly and manage them using their own filesystem. In either case, both have some sort of filesystem under the hood to manage the tracks and sectors.

The OS, with the help of filesystem code, often in the form of filesystem drivers, translates the application’s requests for files into tracks and sectors, retrieves or writes the required ones and thus manages them. How does the filesystem track and manage the tracks and sectors? How does the mapping happen between logical entities such as files & directories and physical tracks and sectors? Let’s bother about that in a while. For now, let’s continue to the next layer.

The RAID/HBA layer deals with translating the operating system’s request into a language that’s understood by the disks and disk interfaces. Often, they sit on top of stacked bus interfaces, such as SATA on a PCI-X bus. The HBA then talks to the disks in the language of the disks, such as LBA or Logical Block Addressing.

The Device Drivers – RTOS Vs. Linux


Typical RTOS device driver model

Let’s take a quick look at a typical RTOS device driver and throw it in the ring with a Linux driver.

The device driver’s philosophy
From a general perspective, the device driver is usually a set of routines enabling typical applications to talk to the hardware. In other words, device drivers are often translators, translating the application’s language into the hardware’s language. Device drivers also enable application programs to ignore hardware dependencies and focus on the actual job at hand.

The RTOS device driver
Device drivers for an RTOS, as shown in the diagram, are usually a bit different from GPOS or firmware device drivers. Device drivers can be broadly classified into three types: OS-less drivers, GPOS drivers and RTOS drivers.

OS-less drivers are usually just a bunch of routines enabling applications without an OS to do their job. Usually such applications are bootloaders, firmware boot-up routines or simple systems with round-robin applications.

The device drivers for a GPOS, for the most part, are also a bunch of device access routines. However, the key difference between drivers for OS-less systems and drivers for GPOS systems lies in the driver’s handling of concurrency and its entry & exit points. More often than not, GPOS drivers also have to deal with security.

RTOS drivers, however, make their home in an entirely different world. The drivers should follow the same basic principles and philosophy as the RTOS itself. The driver’s execution should be predictable, and it should also enable the whole system to be predictable and fault tolerant.

Let’s consider a scenario to picture those two phrases. Say we open a file from a DVD-ROM in Windows Explorer, and the read encounters some error, causing the hardware & software to enter a retry loop. From Explorer, under the normal scenario, one cannot end that loop by a keystroke or even from the task manager. Under Linux, the only way to end that loop – only the software loop – is to kill the process. However, the driver still continues to retry till the given number of retries is exhausted, and only then does it try to return an error code to the process, only to find that the process is long gone.

Under a typical RTOS scenario, the driver starts executing the reads and enters the failure loop. The difference starts here. Both the driver and the application start their operations with predefined timeout values. Hence, when a command is issued, say by the driver to the hardware, the driver waits for, say, the ISR, only up to a timeout period. If the driver does not get a response from the hardware within the given time, the driver can choose to return an error to the application, indicating that the hardware needs more time to do the job.

On the other hand, the drivers for most monolithic operating systems or GPOS’ are mostly a bunch of routines with data structures tied to the calling process or a file descriptor’s context. Hence, driver code executed by multiple processes contends for the hardware at the lowest level, and the first one wins while the others sleep on the hardware lock. Thus the calling process, if it did not get the lock, has to sleep/block. In some cases, the driver can also export a little framework to return immediately, registering the request and responding later with the data, e.g. async file IO. Though this allows the application process to be a bit more predictable in its execution, the application still lacks complete predictability. However, such a driver is a lot more scalable, with high performance on more capable hardware. Some other topic on that again.

The RTOS driver, on the other hand, for the most part runs its own thread managing the requests. A typical driver implementation is depicted by the picture above. Let’s see what makes this driver under an RTOS more predictable.

The driver thread maintains a message queue. The application thread sends a request message along with its own queue, which should receive the response. The request message contains the necessary data for the driver to execute the request, say a read location or a pixel/framebuffer write. Thus, once the application sends the request, it has just registered its request with the driver and is free to do other stuff till it gets its response back. The application can now sleep on the response queue with a timeout. Thus, the CPU is free to do other jobs instead of either continuously polling the queue or waiting indefinitely. If the application is back in execution and finds that it has not yet received the response from the driver, it may choose to do some other stuff, such as informing the user or taking other corrective actions.
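
Here’s a sketch of the application side of that exchange, using a made-up RTOS message queue API (msgq_send, msgq_recv with timeout); none of these names come from any specific RTOS. The key idea is that the request carries the application’s own reply queue, and the wait on that queue is bounded.

```c
#include <stdint.h>

/* Made-up RTOS primitives, declared only to keep the sketch honest. */
struct msgq;                              /* opaque message queue handle */
enum { OK = 0, READ_BLOCK = 1 };
#define TIMEOUT_MS(ms) (ms)
extern int msgq_send(struct msgq *q, const void *msg, unsigned len);
extern int msgq_recv(struct msgq *q, void *msg, unsigned len,
                     unsigned timeout_ms);

struct io_reply { int status; };

struct io_request {
    uint32_t     op;          /* e.g. READ_BLOCK */
    uint32_t     lba;
    void        *buf;
    struct msgq *reply_to;    /* where the driver should respond */
};

int app_read_block(struct msgq *drv_q, struct msgq *my_q,
                   uint32_t lba, void *buf)
{
    struct io_request req = { READ_BLOCK, lba, buf, my_q };
    struct io_reply   rsp;

    msgq_send(drv_q, &req, sizeof req);   /* request registered; free to do other work */
    if (msgq_recv(my_q, &rsp, sizeof rsp, TIMEOUT_MS(100)) != OK)
        return -1;    /* no response in time: inform the user, take action */
    return rsp.status;
}
```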

What makes the driver predictable is not much different from the above scenario. The driver picks up the first request from the message queue and starts executing the job. It issues the request to the hardware and waits for the response on a semaphore with a timeout, shared between the driver and the ISR. The hardware, after the job is done, responds by raising the interrupt, causing the CPU to start the ISR. The ISR in turn releases the semaphore, thus waking the driver process up again. The driver process then responds with the result to the application. If there was an error, the hardware does not respond back in time. The driver wakes up on the timeout error, figuring out that the hardware failed to do the job in time. Thus, the driver responds with an error to the application and issues an abort to the hardware for the current job. The application can then choose whether to retry or not by sending another request. However, for some real-time systems, the driver retries a couple of times itself before giving up. The application, upon deciding to abort the current operation, issues a high-priority message to the driver process, causing it to process that one immediately, like a backdoor, thus aborting the current operation.
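
And the driver side of the same sketch, continuing with the made-up API above plus a hypothetical semaphore and hardware hooks. The sem_take with a timeout is the piece that keeps the whole thing predictable.

```c
/* Continuing the sketch: driver thread, one request at a time.
 * sem_take with a timeout bounds the wait on the hardware. */
struct sem;                                /* opaque semaphore handle */
#define WAIT_FOREVER 0xffffffffu
extern int  sem_take(struct sem *s, unsigned timeout_ms);
extern void hw_issue(uint32_t op, uint32_t lba, void *buf);
extern int  hw_result(void);
extern void hw_abort(void);

void driver_thread(struct msgq *drv_q, struct sem *hw_done)
{
    struct io_request req;
    struct io_reply   rsp;

    for (;;) {
        msgq_recv(drv_q, &req, sizeof req, WAIT_FOREVER);
        hw_issue(req.op, req.lba, req.buf);   /* kick the hardware */

        if (sem_take(hw_done, TIMEOUT_MS(50)) == OK) {
            rsp.status = hw_result();         /* ISR released the semaphore */
        } else {
            hw_abort();                       /* hardware missed its deadline */
            rsp.status = -1;                  /* let the application decide */
        }
        msgq_send(req.reply_to, &rsp, sizeof rsp);
    }
}
```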

The facilities of priorities, such as task priorities, and of timeouts on sleeping contexts, such as semaphore waits, enable the designer to build a highly predictable system. Often, life-support systems are real-time systems with redundancies built in. That means if a piece of hardware fails to do its job in time, the driver immediately turns the device off, bringing the system to a known state, and picks the secondary hardware to do the job. The same is the case with mission-critical systems, be it nuclear control systems or avionics. As this shows, handling redundancies is another specialty of an RTOS device driver.

However, the message queue and the serialization of the code with priorities often tend to limit the scalability of such systems.

For high-performance hardware which may take in multiple commands at once and respond with results one by one, such as DMA controllers, the ISR, instead of releasing a semaphore, sends a message to the same driver queue with the results read from the hardware. The driver, when it gets a chance to process the response message, looks at the context stored by the ISR, figures out the originating message from the application, and responds with the appropriate result.
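
In the same made-up API, the ISR-side variant looks something like this: post a small completion record to the driver’s queue instead of giving a semaphore. hw_read_tag is hypothetical, standing for however the hardware identifies the finished command.

```c
/* ISR posts a completion record instead of giving a semaphore. */
extern uint32_t hw_read_tag(void);  /* hypothetical: which command finished */

struct completion {
    uint32_t tag;     /* matches a request the driver issued earlier */
    int      status;
};

void hw_isr(struct msgq *drv_q)
{
    struct completion c = { hw_read_tag(), hw_result() };
    msgq_send(drv_q, &c, sizeof c);  /* driver matches tag to its request */
}
```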

That closes the topic of RTOS drivers for now. I’ll try to throw some light on tasks and task priorities some time.

Hardware with RTOS & Device Drivers

Let’s look at how the hardware lives with an RTOS.

RTOS_HW

The hardware, at least from the RTOS perspective, can be broadly classified into two types. The first part is what is used by the RTOS itself; the second is what the RTOS supports the applications with.

The fundamental hardware required for a given RTOS is the CPU, memory and some boot store – the red blocks. Most RISC CPUs tend to have a small static memory on the die, about a few KB on board. Though a given RTOS can be implemented using just the internal memory, the RTOS still needs to manage the memory resources for itself, for housekeeping and of course for the applications. Hence, external memory is also a required hardware resource in practice, though not something the RTOS absolutely cannot do without.

The additional hardware, or the ‘application’ hardware – the blue blocks – is not used by the RTOS itself, but it should have drivers developed and running on top of the RTOS kernel to make up the BSP or Board Support Package. Unless all four blocks are effectively covered by the RTOS and the BSP, an application cannot use the complete system effectively.

Next, let’s look at the boot up process till we bring up the applications, which should help us cover all the hardware blocks.

typical_RTOS_boot

Let’s start with the Power On Self Test, a.k.a. POST, and the pre-boot phase. These routines must be stored in permanent non-volatile storage, usually ROM or flash. The POST and pre-boot phase normally begins with execution jumping to the reset vector. In other words, the POST and pre-boot stuff live at the reset vector. If you’ve got mail for them, you know where to deliver it now.

Alright, fun aside. The POST, as the name says, is a set of routines helping the processor verify that the coast is clear for it to boot. Once the processor gets a go from the POST, the pre-boot kicks in. Usually, the pre-boot routines are packaged and shipped along with the POST and most probably the boot loader itself.

The pre-boot routines set up a minimal environment for the boot loader to take over. At a minimum, they initialize the internal memory, if available. Otherwise, they have to enable the external memory controller and initialize the external memory itself, e.g. SDRAM. Once the memory is initialized, the pre-boot routines give way to the boot loader. Occasionally, the pre-boot routines relocate the boot loader from the permanent storage to RAM.

Ah, the big guy, the boot loader. Usually, the boot loader has more than one job to do. Boot loaders are often designed to do a bit more than just setting up the memory space, configuring the resources, loading the kernel and OS into memory and passing control to it. Often, boot loaders are also designed to download the OS images, verify the filesystems, uncompress the OS images or even give early access to board diagnostics. In any case, let’s just follow the regular boot path.

Once the boot loader is in control, it sets up a bit more of the hardware, configures the resources and address space, fiddles with caching and starts unpacking the OS image. In some cases the OS image may need a bit of uncompression; in others it’s good to go off the shelf. Usually, the OS image has its home in the flash store.

In some cases, there could be two flash stores: a bit of high-reliability parallel flash at the reset vector, and another high-capacity serial or parallel flash living somewhere else – maybe even a NAND flash or a hard disk. In such cases, the boot loader should also know a bit about how the images are stored on that secondary store, e.g. the filesystem, and have drivers to access the media.

If the boot loader store is separate, it could be for multiple reasons. One, a flash store large enough for the entire OS image could be really costly, because it would have to be the high-reliability NOR type. Two, OS upgrades may impact the stability of the boot loader as well. Another: in case of corruption by some OS write, the boot loader may be overwritten and corrupted. Also, a serial flash store such as SPI flash or even an SD card cannot provide parallel access during the POST phases.

On with the tour. Once the boot loader gets hold of the OS image, it may unpack it into RAM, uncompressing it if needed. Some smaller systems may run the OS directly off the flash. In such cases, the boot loader simply passes control to the OS image after checking for any pending OS update events. Otherwise, the loading into RAM goes first and then control is passed to the OS boot-up routines.

Once the OS gets control, the boot-up part of the kernel routines takes charge. They verify internal resources, set up address spaces etc. Occasionally, the uncompressing job is handled by the pre-kernel part itself. The pre-kernel may also set up the stack for the kernel, the heap and other minimal memory management stuff. Usually, the pre-kernel messes with only a minor part of the available RAM, leaving the rest for the big guys. Finally, the kernel kicks in.

The kernel does a bit of housekeeping work, such as reconfiguring the memory regions, and brings up the memory management and resource management stuff such as traps, task queues, timer queues etc. Once the kernel is done with the housekeeping, it finally enters the scheduling routine, from where it loads the task whose address was given to it at shipping time and starts it. A simple mechanism is a predefined function, with its resources also predefined. In a way, we may call this the init task.

Usually, timer routines are provided by the BSP to the kernel. Just before starting off the first task, the timer is kicked, the ISRs are set up and control is passed to the init task. This way, the RTOS can flex its time-slicing muscle at tasks that refuse to give the CPU up.

The init task, by design, spawns a bunch of tasks. The ones that go in first are the device driver tasks. If you are familiar with *nix, you may ask: what does that mean? Let’s look at the RTOS drivers a bit later to understand what they really are.

Once the driver tasks are up and running, the application tasks are brought in. These are the guys who do the final job – be it showing some video on the screen, issuing a defibrillating shock to the heart or even launching an atom bomb, depending upon the hardware resources and what they are programmed to do.

One thing to notice in the context of RTOS applications is that they work on the hardware directly, with the drivers just being a bunch of synchronized routines doing the dirty job of working with the hardware. In other words, RTOS applications, for the most part, should be aware of the hardware to do their job. There are a few exceptions, but most real-time tasks do their job, of course, both As Simple As Possible and As Quick As Possible.

Though some POSIX and other library implementations exist on a given RTOS, for the most part, at least for time-critical jobs, the task should know what it is dealing with. Abstractions are distractions.

Coming up next: A typical RTOS device driver.

Components of RTOS

(Or how an RTOS does what it does)

Let’s look at a typical Real Time system implementation.

typical-realtime-system

The RTOS component of the real-time system is just the microkernel part, the yellow region. The RTOS just exports a set of basic components as part of the OS: a priority-based task scheduler, a minimalistic memory management subsystem, a set of timers, some task synchronization facilities like semaphores and condition variables, and other such OS services. The timers and synchronization facilities trigger the scheduler almost every time. If time slicing is implemented, the timer also issues a regular tick which causes the currently running task to be preempted and the next ready task of the highest priority to be scheduled.

As one can notice, the memory space is linear, often a physical address space. The execution domain is single – no user/kernel modes of execution. The device drivers are just tasks or threads of execution running at higher priority, as can be seen above. From the RTOS perspective, tasks are often just threads of execution.

Let’s try to look at each of the subsystems and components.

The scheduler is usually a simple round-robin scheduler with priorities. Usually, the scheduler maintains a set of task queues, the number depending upon the number of priorities supported. If a system supports 16 task priorities, the scheduler usually maintains a multiple of 16 task queues, and each task is queued in one of its priority’s queues. The task queues might consist of, at least, ready-to-run and sleeping queues. Some relatively simple OS’ may choose to have a single queue for both ready-to-run and sleeping tasks. Even simpler systems, supporting very few priorities (e.g. 4), maintain a single queue for all the tasks in all states. However, such systems should have a very minimal number of tasks; otherwise the scheduler may end up eating all the CPU cycles.
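
A minimal pick-next sketch over per-priority ready queues, with 16 priorities as in the example above. The list plumbing and locking are elided; it’s a sketch of the idea, not of any particular RTOS.

```c
#include <stddef.h>

/* Minimal pick-next: one ready queue per priority, highest priority
 * first, round-robin within a priority (caller re-queues at the tail). */
#define NUM_PRIORITIES 16

struct task {
    struct task *next;
    /* ...context, stack pointer, state... */
};

static struct task *ready_q[NUM_PRIORITIES];  /* index 0 = highest */

struct task *pick_next_task(void)
{
    for (int p = 0; p < NUM_PRIORITIES; p++) {
        if (ready_q[p]) {
            struct task *t = ready_q[p];
            ready_q[p] = t->next;   /* pop the head of the queue */
            return t;
        }
    }
    return NULL;                    /* nothing ready: run the idle task */
}
```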

Almost every kernel subsystem invokes the scheduler. A timer event, a semaphore or mutex release, and in some cases even an ISR return can invoke the scheduler. The scheduler scans for the next highest-priority task to be run. If a new higher-priority task shows up in the run queue, the current task is put to sleep and the newer one starts executing. The scheduler can be locked out by tasks running critical code, but any use of such a facility should be reviewed and reviewed again with every caution. Otherwise, the system may enter deadlock, behavior may become unpredictable, or the system may fail to meet its requirements.

If time slicing is configured and enabled, even tasks running at the same priority may be preempted to run the next task in the same priority queue. At the timer event for time slicing, the current task is appended at the end of its priority’s run queue and the next task is picked up, only after ensuring no higher-priority tasks are ready to run.

Task priorities are one of the serious issues which tend to give nightmares to designers. However, a system with properly decided task priorities runs almost bulletproof. I have personally witnessed instabilities during the development of real-time systems without properly calculated/assigned priorities, till the task priorities achieved their perfect balance. From that point on, the development, testing and deployment were just a walk in the park.

The memory manager often tends to be a simple slab allocator and collector. It is designed to be as simple as possible. Most often, a typical RTOS tends to have a linear physical address space exported to all of its tasks. No segmentation, no logical or virtual addressing and no paging. Some RTOS’ do provide such facilities, though the tasks utilizing them may not be guaranteed to meet real-time performance goals. The address space or memory for the memory manager is allocated at compile time itself. Often, the stack gets its own space allocation and the heap its own. In some cases, the stack for a new task is also allocated from the heap, and in such cases the memory manager might need to do a little extra work.

Timers are an integral part of any given RTOS. The timers, in my view, are the heartbeat of the RTOS. Without timer events, achieving real-time performance is really hard, though not impossible.

The timer subsystem can be broadly divided into two parts: the timer event generation part and the timer queues. The timer event generation part usually depends on the timer device driver for the specific platform. Once the system is up and running, the timer events are generated as configured at compile time. Timer events, often with millisecond resolution, can either invoke the scheduler or the timer queue management part. If the timer queue management part is invoked, all the timer queues are scanned for the events that need to be shuffled, e.g. a task sleeping on a condition; the tasks whose timers have expired are added to their respective priority run queues and the scheduler is invoked. Often, different hardware timers are used for time slicing and for timer queue management.
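
One way the tick might drive the timer queue, as a simplified sketch: age every sleeper, move the expired ones to their run queues and invoke the scheduler. make_ready and schedule are hypothetical hooks, and a real kernel would keep a sorted queue instead of scanning every sleeper on every tick.

```c
#include <stdint.h>

/* Simplified tick handler. Real kernels keep the sleepers sorted by
 * expiry so only the head needs checking; this scans for clarity. */
struct ttask {
    struct ttask *next;
    uint32_t      ticks_left;   /* remaining sleep time, in ticks */
};

static struct ttask *sleep_q;

extern void make_ready(struct ttask *t);  /* hypothetical: to run queue */
extern void schedule(void);               /* hypothetical scheduler entry */

void timer_tick(void)                     /* called from the timer ISR */
{
    struct ttask **pp = &sleep_q;
    while (*pp) {
        struct ttask *t = *pp;
        if (t->ticks_left && --t->ticks_left == 0) {
            *pp = t->next;                /* unlink the expired sleeper */
            make_ready(t);
        } else {
            pp = &t->next;
        }
    }
    schedule();                           /* a woken task may preempt now */
}
```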

The timer subsystem can also post events to tasks that subscribe to them. A task can subscribe to timer events at a particular interval, as per the compile-time configuration, and receive timer events at the configured intervals. Software watchdog systems are a good example. A software watchdog can kill and restart a task, or even reset the system, if it did not receive a predefined event within a given time from the given task or subsystem.
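
A toy software watchdog along those lines, again with made-up names: every client must check in within its window, or the watchdog takes corrective action from its periodic timer event.

```c
#include <stdint.h>

/* Toy software watchdog driven by a periodic timer event. */
struct wdog_client {
    uint32_t window_ticks;    /* must check in within this many ticks */
    uint32_t since_checkin;   /* ticks since the last check-in */
};

extern void restart_task(int id);   /* hypothetical corrective action */

void wdog_checkin(struct wdog_client *c)     /* called by the monitored task */
{
    c->since_checkin = 0;
}

void wdog_tick(struct wdog_client *c, int n) /* run on the timer event */
{
    for (int i = 0; i < n; i++) {
        if (++c[i].since_checkin > c[i].window_ticks)
            restart_task(i);                 /* kill/restart the late task */
    }
}
```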

The RTOS may also export other subsystems such as system management, interrupt management, libraries for applications and even POSIX-compliant libraries. Some RTOS’ may even come with predefined tasks, such as a networking or TCP/IP stack and filesystem management tasks.

Let’s look at device drivers, hardware interaction and user task related stuff in the next post.

Real Time Operating Systems – A view from the shore

The Real Time Operating System from a common man’s perspective.

Computing systems can broadly be classified into generic computing systems, embedded computing systems and high-performance computing systems. The line between generic computing systems, such as desktop computing systems, and the embedded computing systems is getting blurred day by day.

An operating system functions as the mind of a given computing system. It creates and manages jobs, it deals with allocating and managing resources, and much more. Often, operating systems come bundled with user interfaces and programmer-friendly libraries.

While desktop operating systems can be complex monsters, embedded operating systems usually tend to be much simpler and lighter. As modern embedded systems have become much more powerful than earlier general-purpose computing systems, desktop operating systems such as Windows and Linux are finding their way into the embedded computing space, mostly in miniature forms.

Embedded operating systems can be classified into real-time (RTOS) and non-real-time systems. The non-real-time systems can be divided into two further classes: simple round-robin kernels and general-purpose OS’ (GPOS) for embedded systems.

The RTOS is the key component in implementing systems with predictable responses. Imagine a mobile processing system, which has very limited resources, such as a slow processor and often little RAM and storage. The resources are pretty small compared to the job they have to do: process the call data, process the control information, play a song or shoot a picture. Without a way to implement predictably behaved software on top of that hardware, the system simply cannot keep up.

However, an RTOS has its own set of limitations. First, an RTOS cannot, at least at its heart, enforce security boundaries such as user/kernel space. Hence, typical RTOS systems cannot run user programs or third-party applications. Though there are some exceptions, those run at low priority and consume extra resources. In either case, the application space is limited. Let’s look at that in some other post.

Besides, adding any more tasks can easily change the predictable behavior of the overall system. In most cases, only the tasks below the new task’s priority are impacted. However, since the system is designed as a specific-purpose system, complex interdependencies change their equations, changing the overall behavior of the system.

On the other hand, a General Purpose Operating System or GPOS does not have these or other such limitations which an RTOS has. However, a GPOS cannot guarantee a predictable execution environment with constrained resources.

Let’s consider the case of a slower PC with a Windows or Linux operating system. If you open a file or a browser while you’re playing a video or even a song in the background, the video or the song will be interrupted for brief moments. In order to run such multiple tasks smoothly, the PC needs a certain minimum of resources, while a mobile can handle such tasks with far less.

The advantage with an embedded GPOS is that it allows users to build and deploy third-party applications without worrying about security or other such issues. That might be the convincing reason behind the multiple processors often found in smart phones.

Let’s take a quick look at RTOS components in the next post.
