The Device Drivers – RTOS Vs. Linux


Typical RTOS device driver model

Let’s take a quick look at a typical RTOS device driver and put it in the ring with a Linux driver.

The device driver’s philosophy
From a general perspective, a device driver is usually a set of routines that enable typical applications to talk to the hardware. In other words, device drivers are often translators, translating the application’s language into the hardware’s language. Device drivers also enable application programs to ignore hardware dependencies and focus on the actual job at hand.

The RTOS device driver
The device drivers for an RTOS, as shown in the diagram, are usually a bit different from GPOS or firmware device drivers. Device drivers can be broadly classified into three types: OS-less drivers, GPOS drivers and RTOS drivers.

OS-less drivers are usually just a bunch of routines that enable applications without an OS to do their job. Such applications are usually bootloaders, firmware boot-up routines or simple systems running their applications round-robin.

The device drivers for a GPOS are, for the most part, also a bunch of device access routines. However, the key difference between drivers for OS-less systems and drivers for GPOS systems lies in the concurrency of driver operations and in the entry and exit points. More often than not, these drivers also have to deal with security.

However, RTOS drivers make their home in an entirely different world. The drivers should follow the same basic principles and philosophy as the RTOS itself. Driver execution should be predictable, and should also enable the whole system to be predictable and fault tolerant.

Let’s consider a scenario to illustrate those last two statements. Say we open a file from a DVD-ROM in Windows Explorer, and the read encounters an error, causing the hardware and software to enter a retry loop. From Explorer, under normal circumstances, one cannot end that loop with a keystroke or even from Task Manager. Under Linux, the only way to end the loop, and only the software loop at that, is to kill the process. However, the driver still continues to retry until the given number of retries is exhausted, and only then tries to return an error code to the process, only to find that the process is long gone.

Under a typical RTOS scenario, the driver starts executing the reads and enters the failure loop. The difference starts here. Both the driver and the application start their operations with predefined timeout values. Hence, when the driver issues a command to the hardware and waits for, say, the ISR, it waits only for a timeout period. If the driver does not get a response from the hardware within the given time, it can choose to return an error to the application, indicating that it needs more time to do the job.
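To make the bounded wait concrete, here is a minimal sketch using a POSIX semaphore as a stand-in for whatever timed semaphore the RTOS provides. The names `wait_for_hw`, `HW_OK`, `HW_TIMEOUT` and `HW_ERROR` are made up for illustration; a real RTOS would have its own call with its own tick-based timeout.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <semaphore.h>
#include <time.h>

enum hw_status { HW_OK = 0, HW_TIMEOUT = -1, HW_ERROR = -2 };

/* Block until the ISR posts `done`, but never longer than timeout_ms.
 * The driver gets control back either way, so it can report an error
 * to the application instead of hanging with the hardware. */
int wait_for_hw(sem_t *done, long timeout_ms)
{
    struct timespec deadline;

    assert(done != NULL && timeout_ms >= 0);

    /* sem_timedwait() takes an absolute CLOCK_REALTIME deadline. */
    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_sec  += timeout_ms / 1000;
    deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
    if (deadline.tv_nsec >= 1000000000L) {
        deadline.tv_sec  += 1;
        deadline.tv_nsec -= 1000000000L;
    }

    for (;;) {
        if (sem_timedwait(done, &deadline) == 0)
            return HW_OK;            /* the ISR released the semaphore */
        if (errno == EINTR)
            continue;                /* interrupted by a signal: wait again */
        return errno == ETIMEDOUT ? HW_TIMEOUT : HW_ERROR;
    }
}
```

The ISR side would simply call `sem_post()` on the same semaphore. The point is that the wait always has an upper bound, which is what lets the driver report "not done yet" rather than block forever.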

On the other hand, the drivers for most monolithic operating systems or GPOSes are mostly a bunch of routines with data structures tied to the context of the calling process or a file descriptor. Hence, when executed by multiple processes, the driver code contends for the hardware at the lowest level of the driver; the first one wins, while the others sleep on the hardware lock. Thus, a calling process that did not get the lock has to sleep/block. In some cases, the driver can also export a small framework that registers the request, returns immediately, and responds later with the data, e.g. async file IO. Though this allows the application process to be a bit more predictable in its execution, the application still lacks complete predictability. However, such a driver is a lot more scalable, with high performance on more capable hardware. More on that topic some other time.

The RTOS driver, on the other hand, for the most part runs its own thread to manage the requests. A typical driver implementation is depicted in the picture above. Let’s see what makes a driver under an RTOS more predictable.

The driver thread maintains a message queue. The application thread sends a request message along with its own queue, which should receive the response. The request message contains the data the driver needs to execute the request, say a read location or a pixel/framebuffer to write. Thus, once the application sends the request, it has merely registered its request with the driver and is free to do other stuff until it gets its response back. Now, the application can sleep on the response queue with a timeout. Thus, the CPU is free to do other jobs instead of either continuously polling the queue or waiting indefinitely. If the application is back in execution and finds that it has not received the response from the driver yet, it may choose to do something else, such as informing the user or taking other corrective action.
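The request/response exchange can be sketched roughly as follows. This is only an illustration built on POSIX mutexes and condition variables, not any particular RTOS API; `struct queue`, `struct message`, `queue_send`, `queue_recv` and `driver_step` are all hypothetical names, and the "hardware work" is faked.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

#define QUEUE_DEPTH 8

struct queue;                        /* forward declaration for reply_to */

struct message {
    int           opcode;            /* hypothetical: 1 = read */
    uint32_t      arg;               /* e.g. the read location */
    int           status;            /* filled in by the driver */
    struct queue *reply_to;          /* the sender's own response queue */
};

struct queue {
    pthread_mutex_t lock;
    pthread_cond_t  not_empty;
    struct message  slots[QUEUE_DEPTH];
    int             head, count;
};

void queue_init(struct queue *q)
{
    pthread_mutex_init(&q->lock, NULL);
    pthread_cond_init(&q->not_empty, NULL);
    q->head = q->count = 0;
}

/* Returns 0 on success, -1 if the queue is full. Never blocks. */
int queue_send(struct queue *q, const struct message *m)
{
    int rc = -1;
    pthread_mutex_lock(&q->lock);
    if (q->count < QUEUE_DEPTH) {
        q->slots[(q->head + q->count) % QUEUE_DEPTH] = *m;
        q->count++;
        pthread_cond_signal(&q->not_empty);
        rc = 0;
    }
    pthread_mutex_unlock(&q->lock);
    return rc;
}

/* Returns 0 on success, ETIMEDOUT if nothing arrived within timeout_ms. */
int queue_recv(struct queue *q, struct message *out, long timeout_ms)
{
    struct timespec deadline;
    int rc = 0;

    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_sec  += timeout_ms / 1000;
    deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
    if (deadline.tv_nsec >= 1000000000L) {
        deadline.tv_sec  += 1;
        deadline.tv_nsec -= 1000000000L;
    }

    pthread_mutex_lock(&q->lock);
    while (q->count == 0 && rc == 0)  /* sleep, do not poll */
        rc = pthread_cond_timedwait(&q->not_empty, &q->lock, &deadline);
    if (q->count > 0) {
        *out = q->slots[q->head];
        q->head = (q->head + 1) % QUEUE_DEPTH;
        q->count--;
        rc = 0;
    }
    pthread_mutex_unlock(&q->lock);
    return rc;
}

/* One pass of the driver thread's loop, pulled out so it can be run
 * without spawning a thread: take a request, do the job, answer on
 * the queue the sender supplied. */
int driver_step(struct queue *driver_q, long timeout_ms)
{
    struct message req;
    int rc = queue_recv(driver_q, &req, timeout_ms);
    if (rc != 0)
        return rc;
    assert(req.reply_to != NULL);
    req.status = 0;                  /* pretend the hardware succeeded */
    return queue_send(req.reply_to, &req);
}
```

An application would call `queue_send()` on the driver's queue with `reply_to` pointing at its own queue, then `queue_recv()` on that queue with whatever timeout it can afford. A return of `ETIMEDOUT` is its cue to inform the user or take corrective action.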

What makes the driver predictable is not much different from the above scenario. The driver picks up the first request from the message queue and starts executing the job. It issues the request to the hardware and waits for the response on a semaphore, with a timeout, shared between the driver and the ISR. The hardware, after the job is done, responds by raising the interrupt, causing the CPU to start the ISR. The ISR in turn releases the semaphore, thus waking the driver process up again. The driver process then responds to the application with the result. If there was an error and the hardware did not respond in time, the driver wakes up with a timeout error instead and figures out that the hardware failed to do the job in time. The driver then responds to the application with an error and issues an abort to the hardware for the current job. The application can then choose whether or not to retry by sending another request. In some real-time systems, however, the driver itself retries a couple of times before giving up. If the application decides to abort the current operation, it sends a high-priority message to the driver process, which processes that message immediately, like a backdoor, thus aborting the current operation.
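The issue/wait/abort/retry policy described above might look something like this. It is a sketch under assumptions: `struct hw_ops`, `driver_execute` and the fake device are invented for illustration, and the timed wait is hidden behind a function pointer. In a real driver, `wait_done` would be the timed wait on the semaphore the ISR releases.

```c
#include <assert.h>
#include <stddef.h>

enum { DRV_OK = 0, DRV_TIMEOUT = -1 };

/* Hypothetical hardware hooks. A real driver would poke registers in
 * start/abort_cmd and block on the ISR's semaphore in wait_done. */
struct hw_ops {
    void (*start)(void *ctx);                       /* issue the command     */
    int  (*wait_done)(void *ctx, long timeout_ms);  /* 0 = ISR fired in time */
    void (*abort_cmd)(void *ctx);                   /* cancel current job    */
};

/* Try the job up to max_attempts times. Every attempt is bounded by
 * timeout_ms, so the worst-case time spent here is known in advance. */
int driver_execute(const struct hw_ops *ops, void *ctx,
                   int max_attempts, long timeout_ms)
{
    assert(ops != NULL && max_attempts > 0);

    for (int attempt = 0; attempt < max_attempts; attempt++) {
        ops->start(ctx);
        if (ops->wait_done(ctx, timeout_ms) == 0)
            return DRV_OK;       /* hardware answered in time */
        ops->abort_cmd(ctx);     /* timed out: cancel before retrying */
    }
    return DRV_TIMEOUT;          /* report the failure to the application */
}

/* Simulated device that ignores the first `deaf_for` commands. */
struct fake_hw { int deaf_for, starts, aborts; };

static void fake_start(void *ctx) { ((struct fake_hw *)ctx)->starts++; }

static int fake_wait(void *ctx, long timeout_ms)
{
    struct fake_hw *hw = ctx;
    (void)timeout_ms;            /* the simulation does not really sleep */
    return hw->starts > hw->deaf_for ? 0 : -1;
}

static void fake_abort(void *ctx) { ((struct fake_hw *)ctx)->aborts++; }

const struct hw_ops fake_ops = { fake_start, fake_wait, fake_abort };
```

Setting `max_attempts` to 1 gives the plain behaviour where the application decides about retries. A larger value gives the variant where the driver itself retries a couple of times before giving up.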

Facilities such as task priorities and timeouts on sleeping contexts (a semaphore wait, for example) enable the designer to build a highly predictable system. Life support systems are often real-time systems with redundancies built in. That means that if a piece of hardware fails to do its job in time, the driver immediately turns the device off, bringing the system to a known state, and picks the secondary hardware to do the job. The same is the case with mission-critical systems, be they nuclear control systems or avionics. As this shows, handling redundancy is another speciality of an RTOS device driver.
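As a rough illustration of the failover idea, the sketch below tries the primary unit and, if it does not answer, marks it failed (standing in for "switched off") and moves on to the standby. Everything here is hypothetical: `struct unit`, `struct redundant_dev`, `run_job`, and the `responds` flag, which simply simulates whether the unit would answer before its timeout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NUM_UNITS 2                  /* primary + one secondary */

enum unit_state { UNIT_STANDBY, UNIT_ACTIVE, UNIT_FAILED };

struct unit {
    enum unit_state state;
    bool            responds;        /* simulated: answers before timeout? */
};

struct redundant_dev {
    struct unit unit[NUM_UNITS];     /* unit[0] is the primary */
};

/* Run one job on the first usable unit.
 * Returns the index of the unit that did the job, or -1 if none could. */
int run_job(struct redundant_dev *d)
{
    assert(d != NULL);

    for (int i = 0; i < NUM_UNITS; i++) {
        struct unit *u = &d->unit[i];

        if (u->state == UNIT_FAILED)
            continue;                /* already switched off earlier */

        u->state = UNIT_ACTIVE;
        if (u->responds)             /* stands in for "ISR fired in time" */
            return i;

        /* Timed out: turn the unit off so the system is in a known
         * state, then fall through to the next unit. */
        u->state = UNIT_FAILED;
    }
    return -1;                       /* no healthy hardware left */
}
```

Because a failed unit stays marked as failed, later jobs go straight to the secondary without paying for another timeout on the dead primary.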

However, the message queue and the serialization of the code with priorities often tend to limit the scalability of the system.

For high-performance hardware that can take in multiple commands at once and respond with results one by one, such as DMA controllers, the ISR, instead of releasing a semaphore, sends a message with the results read from the hardware to the same driver queue. The driver, when it gets a chance to process the response message, looks at the context stored by the ISR, figures out which application message it belongs to, and responds with the appropriate result.
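One way to picture this is a driver that keeps a small table of in-flight commands, indexed by a tag the hardware echoes back on completion. The sketch below is only an assumption about how such bookkeeping could be done; `struct msg`, `struct driver`, `driver_dispatch` and the lowest-free-tag allocation are made up for illustration, and `client_id` stands in for the application's reply queue.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_INFLIGHT 4               /* commands the hardware can hold */

enum msg_kind { MSG_REQUEST, MSG_COMPLETION };

/* Both applications and the ISR post to the same driver queue. */
struct msg {
    enum msg_kind kind;
    int tag;        /* COMPLETION: context the ISR read from hardware */
    int client_id;  /* REQUEST: who asked (must be >= 0)              */
    int result;     /* COMPLETION: status/data read by the ISR        */
};

struct reply { int client_id; int result; };

struct driver {
    int client_of[MAX_INFLIGHT];     /* tag -> waiting client, -1 = free */
};

void driver_init(struct driver *d)
{
    for (int tag = 0; tag < MAX_INFLIGHT; tag++)
        d->client_of[tag] = -1;
}

/* Process one message from the driver queue.
 * Returns 1 and fills *out when a reply is due to a client,
 *         0 when the request was accepted and is now in flight,
 *        -1 on error (no free slot, or a completion nobody waits for). */
int driver_dispatch(struct driver *d, const struct msg *m, struct reply *out)
{
    assert(d != NULL && m != NULL && out != NULL);

    if (m->kind == MSG_REQUEST) {
        for (int tag = 0; tag < MAX_INFLIGHT; tag++) {
            if (d->client_of[tag] < 0) {
                d->client_of[tag] = m->client_id;
                /* a real driver would hand (tag, command) to the
                 * hardware here and return without waiting */
                return 0;
            }
        }
        return -1;                   /* hardware queue is full */
    }

    /* MSG_COMPLETION, posted by the ISR */
    if (m->tag < 0 || m->tag >= MAX_INFLIGHT || d->client_of[m->tag] < 0)
        return -1;                   /* spurious or duplicate completion */

    out->client_id = d->client_of[m->tag];
    out->result    = m->result;
    d->client_of[m->tag] = -1;       /* the slot is free again */
    return 1;
}
```

Since completions are matched by tag rather than by arrival order, the hardware is free to finish commands out of order and each application still gets its own result.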

That closes the topic of RTOS drivers for now. I’ll try to shed some light on tasks and task priorities some time.
